## Accessing the JASMIN Datastore

This notebook is a record of my interactions with the clean air datastore. Most of it is taken from this __[guide notebook](https://github.com/ADAQ-AQI/clean-air-project/blob/master/src/clean_air/data/data_store_access.ipynb)__.

To get to this point, I first had to get JASMIN permissions and credentials set up:
1. Have a JASMIN login, with access to the caf-o object store etc. granted.
2. __[Create access key and secret credentials](https://help.jasmin.ac.uk/article/4847-using-the-jasmin-object-store)__. I did this by:
	1. Opening the object store UI in a browser: `ssh -AY tomw@nx-login1.jasmin.ac.uk firefox`
	2. Then going to `http://caf-o.s3.jc.rl.ac.uk:81/_admin/portal` (change the URL for the object store you want to access).
3. __[Set up](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html)__ an `~/.aws/credentials` file with your credentials (make sure the access key and the secret are not swapped).
4. You should now be able to upload your data to the bucket (though this may break in the future, as the codebase is still a work in progress).
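
Before moving on, it can help to confirm that the credentials file from step 3 is readable and contains both entries. The sketch below is my own addition, not part of the guide notebook. It assumes the standard AWS layout (a `[default]` profile with `aws_access_key_id` and `aws_secret_access_key`), and `check_aws_credentials` is a hypothetical helper name. It only checks that both values are present. It cannot tell whether they have been swapped.

```python
import configparser
from pathlib import Path


def check_aws_credentials(path, profile="default"):
    """Return a list of problems found in an AWS-style credentials file."""
    # Disable interpolation so that a '%' in a value is not misread
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        return [f"could not read {path}"]
    if not parser.has_section(profile):
        return [f"no [{profile}] section in {path}"]
    problems = []
    for key in ("aws_access_key_id", "aws_secret_access_key"):
        if not parser.get(profile, key, fallback="").strip():
            problems.append(f"missing or empty {key}")
    return problems


if __name__ == "__main__":
    # An empty list means both entries were found
    print(check_aws_credentials(Path.home() / ".aws" / "credentials"))
```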

## Downloading

In [1]:
from clean_air.data.storage import create_dataset_store, create_metadata_store

from shapely.geometry import Polygon
import tempfile
from pathlib import Path
from clean_air.models import Metadata, DataSet, Extent, TemporalExtent


Create instances of the stores with anonymous access.

In [3]:
metadata_store = create_metadata_store()
dataset_store = create_dataset_store()

See what's available in the bucket (using the default 'caf-data'), and loop through to download everything.

In [4]:
available_datasets = dataset_store.available_datasets()
print(f"dataset_store.available_datasets = {available_datasets}")

for dataset_id in available_datasets:
    try:
        # Download the dataset, then its metadata record
        dataset = dataset_store.get(dataset_id)
        print(str(dataset))

        metadata = metadata_store.get(dataset_id)
        print(str(metadata))
    except Exception as exc:
        # Report which dataset failed and why, then carry on with the rest
        print(f"error fetching {dataset_id}: {exc}")

/var/tmp/tmp1dmp78hp
dataset_store.available_datasets = ['m270']
DataSet(files=[], metadata=Metadata(title='M270', extent=Extent(spatial=<shapely.geometry.polygon.Polygon object at 0x7f19e0c145e0>, temporal=TemporalExtent(values=[], intervals=[])), crs=<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
, description='AQUM output and processed files corresponding to MOASA flight M270, uploaded for', keywords=[], data_type=<DataType.OTHER: 'other'>, contacts=[]))
Metadata(title='M270', extent=Extent(spatial=<shapely.geometry.polygon.Polygon object at 0x7f19e0be3bb0>, temporal=TemporalExtent(values=[], intervals=[])), crs=<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Ge
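
If the bucket grows beyond one dataset, it may be more useful to keep the results and the failures rather than just printing them. The following is a minimal sketch of that pattern, not something from the guide notebook. `fetch_all` is a hypothetical helper that takes any list of IDs and any getter function, so it makes no assumptions about the store classes themselves.

```python
def fetch_all(ids, getter):
    """Call getter(item_id) for each ID and return (results, errors).

    Both return values are dicts keyed by ID: `results` holds whatever the
    getter returned, `errors` holds the exception raised for that ID.
    """
    results, errors = {}, {}
    for item_id in ids:
        try:
            results[item_id] = getter(item_id)
        except Exception as exc:
            # Keep going, but remember which ID failed and why
            errors[item_id] = exc
    return results, errors
```

With the stores created earlier, this could be used as `datasets, failed = fetch_all(dataset_store.available_datasets(), dataset_store.get)`, and likewise with `metadata_store.get`.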

## Uploading

Create a new datastore instance, this time with write access. Your `~/.aws/credentials` file (or equivalent) needs to be set up first.

In [None]:
dataset_store_with_write_access = create_dataset_store(anon=False)

Configure the path to the files you're uploading.

In [None]:
data_dir_path = '/new-flight-plots/Data_Files/Model/M270'
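
To upload real files rather than a placeholder, the directory above has to be turned into a list of file paths, since `DataSet` is given a `files` list. A minimal sketch of one way to do that with `pathlib` is shown here. `collect_files` is a hypothetical helper of my own, and picking up every file recursively is an assumption; the guide notebook does not say which files belong in a dataset.

```python
from pathlib import Path


def collect_files(data_dir):
    """Return a sorted list of all regular files under data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # rglob("*") walks subdirectories too; directories themselves are skipped
    return sorted(p for p in root.rglob("*") if p.is_file())
```

The result could then be passed on as `DataSet(files=collect_files(data_dir_path), metadata=metadata)`.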

Upload your data and metadata.

In [None]:
# Create new dataset
# Note: this uses a throwaway temporary directory, not data_dir_path above
with tempfile.TemporaryDirectory() as tmp_dir:
    # Create some test data: an empty placeholder file inside the temporary
    # directory (the file name is arbitrary)
    test_datafile = Path(tmp_dir) / "test_datafile.txt"
    test_datafile.touch()
    metadata = Metadata(title="M270",
                        extent=Extent(Polygon([[-1, -1], [1, -1], [1, 1], [-1, 1]]), TemporalExtent()))
    test_dataset = DataSet(files=[test_datafile], metadata=metadata)

    # Upload it
    dataset_store_with_write_access.put(test_dataset)

metadata_store_with_write_access = create_metadata_store(anon=False)
# Update the metadata...
metadata.description = "AQUM output and processed files corresponding to MOASA flight M270, uploaded for testing"
# ...and upload it
metadata_store_with_write_access.put(metadata)