In [None]:
#| include: false
%load_ext autoreload
%autoreload 2

In [None]:
#| include: false
from nbdev.showdoc import *

All `Downloaders` and `Submitters` support Google Cloud Storage (GCS).

__Credentials are detected automatically, in the following order:__
1. The environment variable `GOOGLE_APPLICATION_CREDENTIALS` is set and points to a valid `.json` file.

2. (Fallback 1) You have a valid Cloud SDK installation.

3. (Fallback 2) The machine running the code is a GCP machine.
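
Before relying on step 1, it can help to confirm that the environment variable really points to a readable JSON file. The snippet below is a minimal stdlib-only sketch of such a pre-flight check. The helper name `describe_gcs_credentials_env` and its return labels are hypothetical and not part of numerblox, and the check does not cover the two fallbacks or validate the key itself.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def describe_gcs_credentials_env(env: Optional[Mapping[str, str]] = None) -> str:
    """Report whether GOOGLE_APPLICATION_CREDENTIALS looks usable.

    Hypothetical helper for illustration only. It inspects the environment
    variable from step 1 and nothing else.
    """
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        # Step 1 does not apply, so detection would move on to the fallbacks.
        return "unset"
    path = Path(key_path)
    if not path.is_file():
        return "missing-file"
    try:
        # Only checks that the file parses as JSON, not that the key is valid.
        json.loads(path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return "invalid-json"
    return "ok"


print(describe_gcs_credentials_env())
```

A result of `"unset"` is not necessarily a problem: on a machine with a Cloud SDK installation or on a GCP machine, one of the fallbacks may still provide credentials.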

In [None]:
from nbdev import show_doc
from numerblox.download import NumeraiClassicDownloader, BaseIO

## Example usage

In order to use GCS you should:
1. Instantiate a `Downloader` or `Submitter`.

2a. For single files, call `.upload_file_to_gcs` or `.download_file_from_gcs`.

2b. For directories, call `.upload_directory_to_gcs` or `.download_directory_from_gcs`.

#### 1a. Downloading Numerai Classic inference data and uploading to GCS

In [None]:
# slow
# This should point to a valid GCS bucket within your Google Cloud environment.
bucket_name = "test"

# Get inference data for current round
downloader = NumeraiClassicDownloader("round_n")
downloader.download_inference_data("inference", version="4.1", int8=False)

2023-01-04 20:06:11,438 INFO numerapi.utils: starting download
round_n/inference/live.parquet: 4.51MB [00:00, 7.40MB/s]                            


All downloaded data can be uploaded to a GCS bucket with one line of code.

In [None]:
# Upload inference data for most recent round to GCS
# downloader.upload_directory_to_gcs(bucket_name=bucket_name, gcs_path="round_n")
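
To preview which local files a directory upload would cover, you can walk the base directory and pair each file with a candidate object name. This is a stdlib-only sketch: the helper `plan_directory_upload` is hypothetical, and the object-name layout (the `gcs_path` prefix followed by the file's path relative to the base directory) is an assumption for illustration, not a description of how numerblox names objects.

```python
from pathlib import Path, PurePosixPath
from typing import Iterator, Tuple


def plan_directory_upload(base_dir: str, gcs_path: str) -> Iterator[Tuple[Path, str]]:
    """Yield (local_file, object_name) pairs for every file under base_dir.

    Hypothetical helper: it only plans names and uploads nothing.
    """
    base = Path(base_dir)
    for local_file in sorted(base.rglob("*")):
        if local_file.is_file():
            relative = local_file.relative_to(base)
            # Object names use forward slashes regardless of the local OS.
            object_name = str(PurePosixPath(gcs_path, *relative.parts))
            yield local_file, object_name


# List what would be uploaded from the base directory used above.
for local_file, object_name in plan_directory_upload("round_n", "round_n"):
    print(f"{local_file} -> {object_name}")
```

Listing the pairs first is a cheap way to catch an unexpectedly large or empty directory before any network transfer starts.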

#### 1b. Downloading inference data from a GCS bucket

Conversely, a directory stored in a GCS bucket can be downloaded to your local machine. It is stored in the base directory you specified when instantiating `downloader`.

In [None]:
# Download data from bucket to local directory
# downloader.download_directory_from_gcs(bucket_name=bucket_name, gcs_path="round_n")

Hope you enjoyed this short example of how to work with GCS buckets for Downloaders and Submitters in this framework. The object handling all this logic under the hood is `BaseIO`.

In [None]:
#| echo: false
show_doc(BaseIO)

---

[source](https://github.com/crowdcent/numerblox/tree/master/blob/master/numerblox/download.py#LNone){target="_blank" style="float:right; font-size:smaller"}

### BaseIO

>      BaseIO (directory_path:str)

Basic functionality for IO (downloading and uploading).

:param directory_path: Base folder for IO. Will be created if it does not exist.

Your local environment can be cleaned up with one line of code. This is convenient when you are done with inference and want to delete the downloaded inference data automatically.

In [None]:
# Clean up environment
downloader.remove_base_directory()

------------------------------------