# Generate RO Distributions

Author: Stephen Leroy (sleroy@aer.com) \
Date: 16 May 2024

This notebook plots distributions of radio occultation (RO) data hosted in the 
[AWS Registry of Open Data](https://registry.opendata.aws/gnss-ro-opendata/). It 
is derived from one of the tutorials associated with the Registry of Open Data; 
see [database_demonstration.ipynb](https://github.com/gnss-ro/aws-opendata/blob/master/tutorials/database_demonstration.ipynb). 


### Initialization

If you have not been through this procedure already, install the following 
prerequisites. 

- matplotlib
- cartopy 
- awsgnssroutils (install with `pip install awsgnssroutils`)

Once the above are installed, set the defaults for *awsgnssroutils*. 

In [None]:
from awsgnssroutils.database import setdefaults
import os

HOME = os.path.expanduser("~")
metadata_root = os.path.join( HOME, "local/rodatabase" )
data_root = os.path.join( HOME, "Data/rodata" )

setdefaults( metadata_root=metadata_root, data_root=data_root, version="v1.1" )

The *metadata_root* is the path to the directory where a collection of JSON 
files will be downloaded and stored. It must be an absolute path. The 
*data_root* is the default root directory for RO data downloads. It 
is optional, but it must also be an absolute path if it is specified. 
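
Since both directories must be absolute paths, it can help to check them before calling *setdefaults*. The following is a minimal sketch using only the standard library; the helper name `require_absolute` is hypothetical and is not part of *awsgnssroutils*.

```python
import os

def require_absolute(path, label):
    """Return the path with "~" expanded, or raise if it is not absolute.

    Hypothetical helper for illustration; not part of awsgnssroutils.
    """
    expanded = os.path.expanduser(path)
    if not os.path.isabs(expanded):
        raise ValueError(f"{label} must be an absolute path, got {path!r}")
    return expanded

# Same locations as in the cell above, written with "~" for brevity.
metadata_root = require_absolute("~/local/rodatabase", "metadata_root")
data_root = require_absolute("~/Data/rodata", "data_root")
```

A relative path such as "local/rodatabase" raises a `ValueError` here rather than being passed on to *setdefaults*.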

*Be certain you have enough storage available under ~/local/rodatabase. 
RO metadata requires ~50 GB of storage. If you don't 
have enough space allocated, edit the above command so that metadata_root 
points to a directory on a scratch volume, which typically has plenty of 
storage available for all RO metadata.*
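
One way to check the available space before downloading is sketched below with the standard library. The helper name `free_gigabytes` is hypothetical, and the 50 GB threshold is simply the approximate figure quoted above.

```python
import os
import shutil

def free_gigabytes(path):
    """Return the free space, in GB, on the volume that holds (or will hold) path."""
    probe = os.path.abspath(os.path.expanduser(path))
    # The directory may not exist yet, so walk up to the nearest existing parent.
    while not os.path.exists(probe):
        probe = os.path.dirname(probe)
    return shutil.disk_usage(probe).free / 1e9

required_gb = 50  # approximate size of the RO metadata quoted above
available_gb = free_gigabytes("~/local/rodatabase")

if available_gb < required_gb:
    print(f"Only {available_gb:.0f} GB free; consider a scratch volume for metadata_root.")
else:
    print(f"{available_gb:.0f} GB free; enough for the RO metadata.")
```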


### Distribution figure

The portal to RO data in the AWS Registry of Open Data is the Python API *RODatabaseClient*. We will use it 
simply to get geolocation information over a chosen range of time and then plot the 
distribution of soundings, color-coded by RO mission. 

In [None]:
from awsgnssroutils.database import RODatabaseClient

rodb = RODatabaseClient( update=False )
print( rodb )

Define a range of times over which to get RO geolocations as processed by a contributing RO processing center. 
Valid centers are "ucar", "jpl", and "romsaf". Here, valid RO soundings are those associated with successfully 
retrieved bending angle and refractivity profiles, which the query selects through *availablefiletypes*. 

In [None]:
processing_center = "ucar"
datetimerange = ( "2023-07-01", "2023-07-15" )
missions = None

availablefiletypes = f'{processing_center}_refractivityRetrieval'
occlist = rodb.query( missions=missions, datetimerange=datetimerange, availablefiletypes=availablefiletypes )
print( occlist )

from datetime import datetime, timedelta

#  Step the end date back one day for the title; this treats the upper bound as exclusive. 
trange = [ datetime.fromisoformat(dt) for dt in datetimerange ]
trange[1] = trange[1] - timedelta(days=1)

title = "RO Distribution for {:} through {:}".format( *[dt.strftime("%d %b %Y") for dt in trange ] )
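
The title arithmetic above can be isolated into a small function. This sketch assumes, as the one-day subtraction suggests, that the upper bound of *datetimerange* is exclusive; the function name `inclusive_title` is hypothetical.

```python
from datetime import datetime, timedelta

def inclusive_title(datetimerange):
    """Build the plot title, showing the last day covered by the range.

    Assumes the second element of datetimerange is an exclusive upper bound.
    """
    start, end = (datetime.fromisoformat(dt) for dt in datetimerange)
    last_day = end - timedelta(days=1)
    return "RO Distribution for {:} through {:}".format(
        start.strftime("%d %b %Y"), last_day.strftime("%d %b %Y"))

title = inclusive_title(("2023-07-01", "2023-07-15"))
print(title)
```

With the range used in this notebook, the title runs from 1 July through 14 July 2023.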


Get a list of all of the RO missions that contributed profiles over this time range. 

In [None]:
missions = occlist.info( "mission" )
print( "missions: " + ", ".join( missions ) )

Because the mission names are lower-case mnemonics, compose a list of presentation names for these missions. 
You will have to add to this list or edit it according to the missions of interest. 

In [None]:
mission_names = [ 
    { 'aws': "cosmic2", 'presentation': "COSMIC-2" }, 
    { 'aws': "kompsat5", 'presentation': "KompSat-5" }, 
    { 'aws': "metop", 'presentation': "Metop" }, 
    { 'aws': "paz", 'presentation': "rohp-PAZ" }, 
    { 'aws': "spire", 'presentation': "Spire" }, 
    { 'aws': "tdx", 'presentation': "TanDEM-X" }, 
    { 'aws': "tsx", 'presentation': "TerraSAR-X" }, 
]

for mission in missions: 
    if mission not in [ m['aws'] for m in mission_names ]: 
        print( f'Be sure to account for {mission} in mission_names.' )
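
A dictionary keyed by the AWS mnemonic is an alternative way to hold the same mapping, and it allows a fallback for missions that have no presentation name yet. This is a sketch only; the names `example_names`, `presentation_by_aws`, and `presentation_name`, and the mnemonic "newmission", are hypothetical.

```python
# A subset of the mission_names entries above, under a different variable name
# so that running this example does not overwrite the full list.
example_names = [
    {'aws': "cosmic2", 'presentation': "COSMIC-2"},
    {'aws': "metop", 'presentation': "Metop"},
    {'aws': "spire", 'presentation': "Spire"},
]

# Map each AWS mnemonic directly to its presentation name.
presentation_by_aws = {m['aws']: m['presentation'] for m in example_names}

def presentation_name(mission):
    """Return the presentation name, or the mnemonic itself if none is defined."""
    return presentation_by_aws.get(mission, mission)

print(presentation_name("cosmic2"))
print(presentation_name("newmission"))  # hypothetical mnemonic; falls back to itself
```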

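Plot the sounding locations on a global map, color-coded by mission. The gray rectangle outlines the 
box defined by *longituderange* and *latituderange*. 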

In [None]:
outputfile = 'distribution.eps'  #  Set this to None if you wish to see output in this notebook. 
# outputfile = None

import matplotlib.pyplot as plt
import cartopy.crs as ccrs

#  No plt.clf() here: with no open figure it would only create an extra, empty figure. 
cmap = plt.get_cmap( "nipy_spectral" )
fig = plt.figure( figsize=(6,4) )
ax = fig.add_axes( [0.01,0.01,0.98,0.98], projection=ccrs.PlateCarree() )
ax.coastlines( color="black" )
ax.set_title( title )

for imission, mission in enumerate(missions): 
    occlist_by_mission = occlist.filter( missions=mission )
    lons = occlist_by_mission.values( "longitude" )
    lats = occlist_by_mission.values( "latitude" )
    color = cmap( (imission+0.5)/len(missions) )
    #  Fall back to the AWS mnemonic if no presentation name is defined for this mission. 
    mission_name = next( ( m['presentation'] for m in mission_names if m['aws']==mission ), mission )
    ax.scatter( lons, lats, color=color, marker="o", s=0.2 , label=mission_name )
    
#  Outline a longitude-latitude box on the map. 
longituderange = ( 20, 55 )
latituderange = ( 40, 65 )

ax.plot( [ longituderange[0], longituderange[1], longituderange[1], longituderange[0], longituderange[0] ], 
       [ latituderange[0], latituderange[0], latituderange[1], latituderange[1], latituderange[0] ], 
       lw=0.1, color="#808080" )

ax.legend(loc="lower left", fontsize="x-small")

if outputfile is None: 
    plt.show()
else: 
    fmt = outputfile.split(".")[-1]
    print( f'Generating {outputfile}.' )
    plt.savefig( outputfile, format=fmt)
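
Two small pieces of the plotting cell can be checked on their own, without any plotting library: the evenly spaced colormap positions assigned to the missions, and the closed outline of the longitude-latitude box. The function names below are hypothetical.

```python
def colormap_positions(n):
    """Return n positions in (0, 1), each at the center of an equal-width bin."""
    return [(i + 0.5) / n for i in range(n)]

def box_outline(longituderange, latituderange):
    """Return (xs, ys) tracing the box counterclockwise and closing it."""
    west, east = longituderange
    south, north = latituderange
    xs = [west, east, east, west, west]
    ys = [south, south, north, north, south]
    return xs, ys

positions = colormap_positions(4)
xs, ys = box_outline((20, 55), (40, 65))
print(positions)
print(list(zip(xs, ys)))
```

Each position can be passed to the colormap to get one color per mission, and passing `xs` and `ys` to `ax.plot` draws the same rectangle as in the cell above.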