# SPAI storage


The SPAI library also includes a `storage` module with a set of storage classes for managing different types of storage. It is particularly useful in Earth Observation (EO) applications, where cloud buckets hold TIFF images, vector files, and other analytics files such as `.csv` files or DataFrames.

The `Storage` class is an interface for accessing different storage backends. It inherits from the `BaseStorage` class and dynamically sets up access to backends such as the local file system and Amazon S3, based on the environment variables provided. If a SPAI project has been created, the storage environment variables are generated automatically from the `spai.config.yaml` file. The environment variables therefore determine which backends are available to `Storage`, which keeps the configuration and initialization of the storage in use flexible.
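To make the idea of environment-driven backends concrete, here is a minimal sketch of how such a selection could work. It is not SPAI's implementation: the variable names `DEMO_STORAGE_LOCAL_PATH` and `DEMO_STORAGE_S3_BUCKET` and the function `discover_backends` are hypothetical and exist only for this illustration.

```python
import os

# Hypothetical variable names, used only for this illustration;
# they are not taken from SPAI.
LOCAL_VAR = "DEMO_STORAGE_LOCAL_PATH"
S3_VAR = "DEMO_STORAGE_S3_BUCKET"


def discover_backends(env=None):
    """Return {backend name: settings} for every backend the environment configures."""
    env = os.environ if env is None else env
    backends = {}
    if env.get(LOCAL_VAR):
        backends["local"] = {"path": env[LOCAL_VAR]}
    if env.get(S3_VAR):
        backends["s3"] = {"bucket": env[S3_VAR]}
    return backends


# Only the local variable is set in this example environment,
# so only the local backend is exposed.
configured = discover_backends({LOCAL_VAR: "data"})
print(configured)
```

Passing the environment as a plain dictionary keeps the sketch easy to test; calling `discover_backends()` with no argument reads the real process environment instead.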


First, install the storage dependencies if you have not already done so:

In [None]:
!pip install "spai[storage]"

## Local Storage


LocalStorage is a storage system that initializes a directory on the local filesystem. This storage class inherits from BaseStorage and is responsible for managing a local directory where data can be stored.


The following code initializes a LocalStorage object that points to a directory called `data` in the local filesystem.


In [None]:
from spai.storage.LocalStorage import LocalStorage

storage = LocalStorage("data")
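As a rough picture of what a directory-backed storage does at initialization, the sketch below creates its root directory on construction and resolves names relative to it. The class `DemoLocalStorage` and its `get_path` method are hypothetical stand-ins written with `pathlib`; whether the real `LocalStorage` creates the directory in the same way is an assumption.

```python
import tempfile
from pathlib import Path


class DemoLocalStorage:
    """Hypothetical stand-in, not the SPAI class: one root directory for all files."""

    def __init__(self, path):
        self.path = Path(path)
        # Create the directory (and any missing parents); do nothing if it exists.
        self.path.mkdir(parents=True, exist_ok=True)

    def get_path(self, name):
        """Resolve a stored name to a path inside the root directory."""
        return self.path / name


# Use a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    demo = DemoLocalStorage(Path(tmp) / "data")
    root_exists = demo.path.is_dir()
    resolved = demo.get_path("my_file.tif")
    resolved_parent = resolved.parent.name

print(root_exists, resolved_parent)
```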

## S3 Storage


S3Storage is a storage class that connects to and interacts with an S3-compatible object storage service. It inherits from BaseStorage and handles objects in remote buckets using the MinIO client. It requires the following connection settings and credentials:

- url: the URL of the S3-compatible service. Default is `storage.googleapis.com`.
- access: the access key for the S3-compatible service.
- secret: the secret key for the S3-compatible service.
- region: the region of the S3-compatible service. Default is `europe-west1`.
- bucket: the bucket name of the S3-compatible service.

Each storage in each project is assigned its own bucket name. You can look it up by running the following CLI command:


In [None]:
!spai list services <my_project_name>

Credentials such as the access and secret keys can also be obtained with the CLI:


In [None]:
!spai auth credentials

The following code initializes an `S3Storage` object that points to a bucket called `spai-data` in the S3-compatible service.


In [None]:
from spai.storage.S3Storage import S3Storage

storage = S3Storage(
    url="storage.googleapis.com",
    access="your-access-key",
    secret="your-secret-key",
    region="europe-west1",
    bucket="spai-data",
)
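Hard-coded keys are easy to leak through version control, so one common alternative is to collect the settings from environment variables. The sketch below is an illustration only: the `DEMO_S3_*` variable names and the `s3_settings` helper are hypothetical, while the defaults for `url` and `region` are the ones listed above.

```python
import os

REQUIRED = ("DEMO_S3_ACCESS", "DEMO_S3_SECRET", "DEMO_S3_BUCKET")


def s3_settings(env=None):
    """Build the S3 keyword arguments from environment variables.

    The variable names are hypothetical; pick whatever fits your deployment.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "url": env.get("DEMO_S3_URL", "storage.googleapis.com"),
        "access": env["DEMO_S3_ACCESS"],
        "secret": env["DEMO_S3_SECRET"],
        "region": env.get("DEMO_S3_REGION", "europe-west1"),
        "bucket": env["DEMO_S3_BUCKET"],
    }


# Example with an explicit dictionary instead of the real environment.
settings = s3_settings(
    {
        "DEMO_S3_ACCESS": "your-access-key",
        "DEMO_S3_SECRET": "your-secret-key",
        "DEMO_S3_BUCKET": "spai-data",
    }
)
print(settings["url"], settings["region"], settings["bucket"])
```

Because the dictionary keys match the keyword arguments used in the cell above, the result could then be passed on as `S3Storage(**s3_settings())`.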

## CRUD operations


Once a local or S3 storage is initialized, you can interact with it in the same way, because both classes share the `BaseStorage` superclass.
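The benefit of a shared superclass is that calling code does not need to know which backend it is talking to. The sketch below shows the pattern with a hypothetical `DemoBaseStorage` interface and a toy in-memory backend; the method names mirror the operations covered in this section, but the signatures are assumptions and SPAI's actual `BaseStorage` may differ.

```python
from abc import ABC, abstractmethod


class DemoBaseStorage(ABC):
    """Hypothetical interface for illustration; not SPAI's BaseStorage."""

    @abstractmethod
    def list(self):
        """Return the names of all stored entities."""

    @abstractmethod
    def create(self, data, name):
        """Store data under name, replacing any existing entry."""

    @abstractmethod
    def read(self, name):
        """Return the data stored under name."""


class MemoryStorage(DemoBaseStorage):
    """Toy backend that keeps everything in a dictionary."""

    def __init__(self):
        self._items = {}

    def list(self):
        return sorted(self._items)

    def create(self, data, name):
        self._items[name] = data
        return name

    def read(self, name):
        return self._items[name]


def copy_all(src, dst):
    """Copy every entity from src to dst using only the shared methods."""
    for name in src.list():
        dst.create(src.read(name), name=name)


source_store, target_store = MemoryStorage(), MemoryStorage()
source_store.create(b"raster bytes", name="my_file.tif")
copy_all(source_store, target_store)
print(target_store.list())
```

`copy_all` relies only on the abstract methods, so it would work unchanged with any other backend that implements the same interface.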


### List


Lists all entities in the storage that match the given pattern.


In [None]:
storage.list()  # List all files in the storage
storage.list("*.tif")  # List all TIFF files in the storage
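To show what a pattern such as `*.tif` selects, the sketch below filters a list of names with shell-style wildcards from Python's standard `fnmatch` module. That SPAI matches patterns with exactly these semantics is an assumption; the file names and the `match_names` helper are made up for the example.

```python
from fnmatch import fnmatchcase

# Made-up object names, as a storage listing might return them.
names = ["a.tif", "b.geojson", "sub/c.tif", "d.csv"]


def match_names(names, pattern="*"):
    """Keep the names that match a shell-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


everything = match_names(names)
tifs = match_names(names, "*.tif")
print(tifs)
```

Note that in `fnmatch`, `*` also matches `/`, so `sub/c.tif` is selected here; a backend that uses path-aware globbing could behave differently for nested names.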

### Create


Creates a new storage entity from the data provided. It can also update an existing storage object.


In [None]:
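# `data` and `gdf` are assumed to be defined already
# (for example, raster data and a GeoDataFrame)
storage.create(data, name="my_file.tif")
storage.create(gdf, name="my_vector_file.geojson")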

### Read


Reads a file from the storage. The data is returned directly as a `GeoDataFrame` for a vector file, a `DataFrame` for a CSV file, or a `GeoTiff` object for a raster. However, the best way to read a raster file is with the `read_raster` function, which is covered later.


In [None]:
storage.read("my_file.tif")
storage.read("my_vector_file.geojson")
storage.read("my_csv_file.csv")
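Returning a different type per file implies some form of dispatch on the file name. A minimal sketch of extension-based dispatch follows; the `KINDS` table and `kind_of` function are hypothetical, and how SPAI actually decides which reader to use is an assumption.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to the kind of object returned.
KINDS = {
    ".tif": "raster",
    ".tiff": "raster",
    ".geojson": "vector",
    ".csv": "table",
}


def kind_of(name):
    """Classify a stored file by its extension, ignoring case."""
    suffix = PurePosixPath(name).suffix.lower()
    if suffix not in KINDS:
        raise ValueError(f"unsupported file type: {name!r}")
    return KINDS[suffix]


for example in ("my_file.tif", "my_vector_file.geojson", "my_csv_file.csv"):
    print(example, "->", kind_of(example))
```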