# Introduction

In this notebook we will see how to stage files, save them in the catalog, do both from a Prefect flow, and run a simulated S1L0 processing from a Prefect flow, among other things.
<mark>TODO: link to the existing documentation that already explains these concepts? Or copy/paste the documentation here?
Or write a simplified documentation here?</mark>

# Check your installation

## `rs-client-libraries` installation

The `rs-client-libraries` Python library is the preferred way to access the RS-Server services from your environment. It is automatically installed in this notebook.

In [None]:
import rs_client
import rs_common
import rs_workflows

## Environment

In [None]:
import os

# In local mode, all your services are running locally.
# In hybrid or cluster mode, we use the services deployed on the RS-Server website.
# This configuration is set in an environment variable.
local_mode = (os.getenv("RSPY_LOCAL_MODE") == "1")

# In local mode, print the service URLs
if local_mode:
    print("ADGS service: http://localhost:8001/docs")
    print("CADIP service: http://localhost:8002/docs")
    print("Catalog service: http://localhost:8003/api.html")
    print("Prefect dashboard: http://localhost:4200")
    print("Grafana dashboard: http://localhost:3000/explore")
    url = None  # not used

# In hybrid or cluster mode, the RS-Server website is set in an environment variable.
else:
    url = os.environ["RSPY_WEBSITE"]
    print(f"RS-Server website: {url}")
    print(f"Create an API key: {url}/docs#/API-Key%20Manager/create_api_key_apikeymanager_auth_api_key_new_get")
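The mode selection above can be wrapped in a small helper, so that a missing variable produces a readable error instead of a bare `KeyError`. This is only a minimal sketch and not part of `rs-client-libraries`: the function name `resolve_mode` and its return shape are assumptions made for illustration. Only the `RSPY_LOCAL_MODE` and `RSPY_WEBSITE` variable names come from this notebook.

```python
import os


def resolve_mode(env=None):
    """Return (local_mode, url) from a mapping of environment variables.

    Hypothetical helper, for illustration only.
    """
    env = os.environ if env is None else env

    # Local mode is enabled only when the variable is exactly "1"
    local_mode = env.get("RSPY_LOCAL_MODE") == "1"
    if local_mode:
        return True, None  # the website URL is not used in local mode

    # Hybrid or cluster mode: the website URL is mandatory
    url = env.get("RSPY_WEBSITE")
    if not url:
        raise RuntimeError("RSPY_WEBSITE must be set in hybrid or cluster mode")

    # Drop any trailing slash so that f"{url}/docs" does not contain "//"
    return False, url.rstrip("/")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try out without touching the real environment, e.g. `resolve_mode({"RSPY_LOCAL_MODE": "1"})`.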

---
**<mark>TO BE DISCUSSED</mark>**

In local mode, is it good advice to point end users to the Prefect and Grafana dashboards? Should we also give the links to the MinIO S3 bucket?

Same question in hybrid/cluster mode: should we give these links, and how should they be passed (environment variables?)

---

## API key

In hybrid and cluster modes, you need an API key to access the RS-Server services. You can create one from the link displayed by the previous cell, then enter it manually in the cell below.

If you prefer to load it automatically in all your notebooks, you can: 

  * From your JupyterHub workspace, open the text file `~/.rspy` <mark>(TODO: name to be defined)</mark>
  * Save your API key using this syntax:

    ```bash
    export RSPY_APIKEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx # replace by your value
    ```

  * Save and close the file.
  * <mark>TO BE CONFIRMED: Reload your JupyterHub session from Menu -> File -> Log Out / or just</mark>
  * <mark>Reload this notebook kernel from Menu -> Kernel -> Restart Kernel.</mark>
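If the variable is not picked up after a restart, the file contents can also be read by hand. The following is a minimal sketch under stated assumptions: it supposes that the file contains plain `export NAME=value` lines as shown above, and the helper name `parse_export_lines` is hypothetical (it is not provided by `rs-client-libraries`). It works on a text string, so it does not depend on the final file name.

```python
import shlex


def parse_export_lines(text):
    """Parse `export NAME=value` lines into a dict (hypothetical helper)."""
    values = {}
    for line in text.splitlines():
        # shlex drops shell-style "#" comments and strips quotes around values
        tokens = shlex.split(line, comments=True)
        if len(tokens) == 2 and tokens[0] == "export" and "=" in tokens[1]:
            name, _, value = tokens[1].partition("=")
            values[name] = value
    return values


# Sample contents, using the placeholder value from the instructions above
sample = "export RSPY_APIKEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx # replace by your value\n"
parsed = parse_export_lines(sample)
```

The parsed value could then be assigned to `apikey` instead of typing it at the prompt. Lines that are not of the `export NAME=value` form are silently ignored.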

In [None]:
apikey = os.getenv("RSPY_APIKEY")
if (not local_mode) and (not apikey):
    import getpass
    apikey = getpass.getpass("Enter your API key:")

# RsClient initialisation

We use Python `RsClient` instances to access the RS-Server services.

In [None]:
from rs_client.rs_client import RsClient
from rs_common.config import ECadipStation

# Init a generic RS-Client instance. Pass the:
#   - RS-Server website URL
#   - API key
#   - Logger (optional, a default one can be used)
generic_client = RsClient(url, apikey, logger=None)

# From this generic instance, get an Auxip client instance
auxip_client = generic_client.get_auxip_client()

# Or get a Cadip client instance. Pass the cadip station.
cadip_station = ECadipStation.CADIP
cadip_client = generic_client.get_cadip_client(cadip_station)

# Call services manually

In this section, we will see how to call these services manually:

  * Search Auxip and Cadip stations for new files
  * Stage these files
  * Check the staging status
  * Search Cadip sessions

In [None]:
from datetime import datetime
import json

from rs_common.config import EPlatform

# Define a search interval
start_date = datetime(2014, 1, 1, 12, 0, 0)
stop_date = datetime(2024, 1, 1, 12, 0, 0)

# Timeout in seconds for the endpoints
TIMEOUT = 30

## Search Auxip and Cadip stations

In [None]:
# Do this using the Auxip then the Cadip client
for client in [auxip_client, cadip_client]:

    # Search stations for new files in the date interval.
    files = client.search_stations(start_date, stop_date, TIMEOUT)

    file_count = len(files)
    assert file_count, f"We should have at least one {client.station_name} file"
    print(f"Found {file_count} {client.station_name} files\n")

    # Print the first file. It is in the STAC format.
    print(f"First {client.station_name} file:\n{json.dumps(files[0], indent=2)}\n")

    # By default, the files are returned with the most recent first.
    ids = "\n".join([f"{f['properties']['datetime']} - {f['id']}" for f in files[:10]])
    print(f"Most recent {client.station_name} IDs and datetimes:\n{ids}\n")

In [None]:
# We can sort by +/- any property, e.g. by datetime ascending = the oldest first.
for client in [auxip_client, cadip_client]:
    files = client.search_stations(start_date, stop_date, TIMEOUT, sortby="+datetime")
    ids = "\n".join([f"{f['properties']['datetime']} - {f['id']}" for f in files[:10]])
    print(f"Oldest {client.station_name} IDs and datetimes:\n{ids}\n")
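To make the `+`/`-` convention concrete, the same ordering can be reproduced locally on a few items. This is a sketch for illustration only: the helper `sort_items` is hypothetical, the items are made up (shaped like the STAC results printed above), and it says nothing about how the services implement `sortby` internally.

```python
def sort_items(items, sortby):
    """Sort STAC-like items by a property: "+name" ascending, "-name" descending."""
    descending = sortby.startswith("-")
    name = sortby.lstrip("+-")
    return sorted(items, key=lambda item: item["properties"][name], reverse=descending)


# Made-up items; ISO 8601 strings in the same format sort chronologically as text
items = [
    {"id": "file_b", "properties": {"datetime": "2020-05-01T00:00:00Z"}},
    {"id": "file_a", "properties": {"datetime": "2015-03-01T00:00:00Z"}},
    {"id": "file_c", "properties": {"datetime": "2023-11-01T00:00:00Z"}},
]

oldest_first = [item["id"] for item in sort_items(items, "+datetime")]
newest_first = [item["id"] for item in sort_items(items, "-datetime")]
```

Here `oldest_first` starts with `file_a` and `newest_first` starts with `file_c`.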

In [None]:
# We can also limit the number of returned results
limit = 10
for client in [auxip_client, cadip_client]:
    files = client.search_stations(start_date, stop_date, TIMEOUT, limit=limit)
    # Fewer files may be returned if fewer than `limit` match the interval
    assert len(files) <= limit

## Search Cadip sessions

In [None]:
# Search Cadip sessions by date interval and platforms
platforms = [EPlatform.S1A, EPlatform.S2B]
sessions = cadip_client.search_sessions(TIMEOUT, start_date=start_date, stop_date=stop_date, platforms=platforms)

session_count = len(sessions)
assert session_count, "We should have at least one Cadip session"
print(f"Found {session_count} Cadip sessions")

In [None]:
# Print the first cadip session. It is in the STAC format.
print(f"First Cadip session:\n{json.dumps(sessions[0], indent=2)}\n")

In [None]:
# Print all the Cadip session IDs
ids = "\n".join([s["id"] for s in sessions])
print(f"Cadip session IDs:\n{ids}")

In [None]:
# We can also search Cadip sessions by specific session IDs,
# rather than by date interval and platforms,
# e.g. get information for the Cadip sessions #2 and #3.
# Build the list from `sessions`: `ids` above is a single newline-joined string,
# so slicing it would return characters, not session IDs.
search_ids = [s["id"] for s in sessions[1:3]]
search_sessions = cadip_client.search_sessions(TIMEOUT, session_ids=search_ids)
print(f"Cadip sessions information:\n{json.dumps(search_sessions, indent=2)}\n")