(C) Crown Copyright, Met Office. All rights reserved.

## vars_retrieved_20200626.ipynb

This notebook uses data from the DMT to count how many times each variable has been requested for retrieval from tape to disk at JASMIN by the PRIMAVERA project. A count of 1 means that the variable has been requested once, from a single model for a single experiment. A count of 7 could mean that the variable has been requested from all models for a single experiment, or that the same variable from one model and experiment has been requested by seven different people. Each name takes the form "var-name_table-name", where the table name includes the frequency of the variable. The variable's standard name is shown in the third column.



In [1]:
import datetime
print(f'Last run {datetime.datetime.utcnow()}')

Last run 2020-06-29 09:17:25.696678


In [2]:
from collections import OrderedDict

import django
django.setup()
from pdata_app.models import DataRequest, RetrievalRequest, VariableRequest

In [4]:
var_reqs = {}

# Ignore retrievals by Jon as these were for publication rather than analysis
for rr in RetrievalRequest.objects.exclude(requester__username='jseddon').order_by('id'):
    for dr in rr.data_request.all():
        if dr.variable_request not in var_reqs:
            var_reqs[dr.variable_request] = 1
        else:
            var_reqs[dr.variable_request] += 1
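
The tally in the cell above can also be written with `collections.Counter`. The following is a minimal sketch that runs without the DMT database: the `retrievals` list and its `(cmor_name, table_name)` tuples are hypothetical stand-ins for the `RetrievalRequest` and `VariableRequest` objects, not real DMT data.

```python
from collections import Counter

# Hypothetical stand-in data: each inner list holds the variables named in
# one retrieval request, written as (cmor_name, table_name) tuples.
retrievals = [
    [("pr", "Amon"), ("tas", "Amon")],
    [("pr", "Amon")],
    [("pr", "day"), ("pr", "Amon")],
]

# Counter treats missing keys as zero, so no "if key not in dict" branch
# is needed when incrementing.
var_counts = Counter(
    variable for request in retrievals for variable in request
)

for (cmor_name, table_name), count in sorted(var_counts.items()):
    label = f"{cmor_name}_{table_name}"
    print(f"{label:<20} {count:3}")
```

A `Counter` is a `dict` subclass, so the sorting and printing cells later in the notebook would work on it unchanged.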

In [5]:
var_reqs_ordered = OrderedDict(sorted(var_reqs.items(), key=lambda x: (x[0].frequency, x[0].table_name, x[0].cmor_name)))
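
The ordering above relies on Python comparing tuples element by element: frequency first, then table name, then CMOR name. A small sketch of the same sort key follows, assuming a hypothetical `Var` namedtuple in place of the Django `VariableRequest` model and using illustrative frequency strings.

```python
from collections import namedtuple

# Hypothetical stand-in for VariableRequest; only the three attributes used
# in the sort key are modelled, and the frequency strings are illustrative.
Var = namedtuple("Var", ["frequency", "table_name", "cmor_name"])

var_reqs = {
    Var("mon", "Amon", "tas"): 692,
    Var("3hr", "3hr", "pr"): 146,
    Var("mon", "Amon", "pr"): 1261,
    Var("1hr", "E1hr", "pr"): 28,
}


def sort_key(item):
    """Order by frequency, then table name, then CMOR name."""
    var = item[0]
    return (var.frequency, var.table_name, var.cmor_name)


# A plain dict preserves insertion order in Python 3.7+, so it keeps the
# sorted order just as OrderedDict does.
ordered = dict(sorted(var_reqs.items(), key=sort_key))
labels = [f"{var.cmor_name}_{var.table_name}" for var in ordered]
print(labels)
```

The frequencies here are compared as strings, so the order is lexical rather than by length of time interval.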

Look at the variables requested for retrieval.

The number shown is how many times the variable has been requested. A value of 1 means that the variable has been requested once, from a single model for a single experiment. A value of 7 could mean that the variable has been requested from all models for a single experiment, or that the same variable from one model and experiment has been requested by seven different people.
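
To illustrate why a single count is ambiguous, the sketch below builds two hypothetical scenarios that both give a total of 7 and then separates them by also counting distinct requesters and distinct models. The record layout, user names and model names are assumptions made for illustration only; the notebook itself reports only the total.

```python
from collections import defaultdict

# Hypothetical records: (requester, model, experiment, variable).
# Scenario 1: one person requests pr_Amon from seven models.
scenario_models = [
    ("alice", f"model-{i}", "exp-1", "pr_Amon") for i in range(7)
]
# Scenario 2: seven people each request pr_Amon from the same model.
scenario_people = [
    (f"user-{i}", "model-0", "exp-1", "pr_Amon") for i in range(7)
]


def summarise(records):
    """Return {variable: (total, distinct requesters, distinct models)}."""
    totals = defaultdict(int)
    requesters = defaultdict(set)
    models = defaultdict(set)
    for requester, model, _experiment, variable in records:
        totals[variable] += 1
        requesters[variable].add(requester)
        models[variable].add(model)
    return {
        var: (totals[var], len(requesters[var]), len(models[var]))
        for var in totals
    }


print(summarise(scenario_models))
print(summarise(scenario_people))
```

Both scenarios report the same total, and only the extra two numbers tell them apart.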

In [9]:
for vr in var_reqs_ordered:
    print('{:<20} {:3}   {}'.format(vr.cmor_name + '_' + vr.table_name, var_reqs[vr], vr.standard_name))

pr_E1hr               28   precipitation_flux
clt_Prim1hr            4   cloud_area_fraction_in_atmosphere_layer
rsds_Prim1hr           8   surface_downwelling_shortwave_flux_in_air
rsdsdiff_Prim1hr       2   surface_diffuse_downwelling_shortwave_flux_in_air
tas_Prim1hr           18   air_temperature
ua50m_Prim1hr          4   eastward_wind
uas_Prim1hr           29   eastward_wind
va50m_Prim1hr          4   northward_wind
vas_Prim1hr           24   northward_wind
clt_3hr                2   cloud_area_fraction
hfls_3hr              43   surface_upward_latent_heat_flux
hfss_3hr              34   surface_upward_sensible_heat_flux
huss_3hr              48   specific_humidity
mrro_3hr              22   runoff_flux
mrsos_3hr              8   moisture_content_of_soil_layer
pr_3hr               146   precipitation_flux
prc_3hr               11   convective_precipitation_flux
prsn_3hr               9   snowfall_flux
ps_3hr                18   surface_air_pressure
rlds_3hr              45   surf

Display them again in popularity order.

In [11]:
popularity_ordered = OrderedDict(sorted(var_reqs.items(), key=lambda x: x[1], reverse=True))
for vr in popularity_ordered:
    print('{:<20} {:3}   {}'.format(vr.cmor_name + '_' + vr.table_name, var_reqs[vr], vr.standard_name))

pr_Amon              1261   precipitation_flux
ts_Amon              834   surface_temperature
psl_Amon             828   air_pressure_at_sea_level
tas_Amon             692   air_temperature
zg_day               544   geopotential_height
ua_Amon              513   eastward_wind
thetao_Omon          472   sea_water_potential_temperature
vo_Omon              443   sea_water_y_velocity
va_Amon              417   northward_wind
pr_day               367   precipitation_flux
zg_Amon              361   geopotential_height
ta_Amon              354   air_temperature
hus_Amon             333   specific_humidity
evspsbl_Amon         329   water_evaporation_flux
huss_Amon            323   specific_humidity
ps_Amon              322   surface_air_pressure
uo_Omon              318   sea_water_x_velocity
clt_Amon             314   cloud_area_fraction
hfls_Amon            236   surface_upward_latent_heat_flux
hfss_Amon            234   surface_upward_sensible_heat_flux
tos_Omon             228   sea_sur
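
In the listing above, the count column is formatted with a width of 3, so the four-digit count for `pr_Amon` pushes its row out of alignment. One possible adjustment, sketched here with a few counts copied from that output, is to derive the column width from the largest count. `Counter.most_common()` returns the entries in descending order of count, which matches the popularity ordering.

```python
from collections import Counter

# A few counts copied from the popularity listing above.
counts = Counter(
    {"psl_Amon": 828, "pr_Amon": 1261, "zg_day": 544, "ts_Amon": 834}
)

# Size the count column to fit the largest value.
width = len(str(max(counts.values())))

lines = [
    f"{name:<20} {count:>{width}}" for name, count in counts.most_common()
]
for line in lines:
    print(line)
```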

Look at the variables that have never been retrieved.

Below is a list of the Stream 1 and 2 variables that have been uploaded by at least one model, but have never been requested for retrieval from tape to disk.
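
Conceptually, this list is a set difference: the variables that have data uploaded, minus the variables that appear in the retrieval counts. A minimal sketch using plain strings follows. The names are taken from the outputs in this notebook, but the two containers are hypothetical stand-ins for the database querysets.

```python
# Variables with at least one uploaded file (stand-in for the queryset).
uploaded = {"prc_E1hr", "pr_E1hr", "ps_Prim1hr", "tas_Prim1hr"}

# Variables that have been retrieved, with their request counts.
retrieved_counts = {"pr_E1hr": 28, "tas_Prim1hr": 18}

# Anything uploaded but absent from the retrieval counts has never been
# requested for retrieval.
never_retrieved = sorted(uploaded - set(retrieved_counts))
print(never_retrieved)
```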

In [10]:
drs = DataRequest.objects.filter(datafile__isnull=False).distinct()
vrs_ids = set(drs.values_list('variable_request', flat=True))
vrs = VariableRequest.objects.filter(id__in=vrs_ids)
for vr in vrs.distinct().order_by('frequency', 'table_name', 'cmor_name'):
    if vr not in var_reqs_ordered:
        print('{:<23} {}'.format(vr.cmor_name + '_' + vr.table_name, vr.standard_name))

prc_E1hr                convective_precipitation_flux
ps_Prim1hr              surface_air_pressure
rsdsdiffmax_Prim1hr     surface_diffuse_downwelling_shortwave_flux_in_air
rsdsdiffmin_Prim1hr     surface_diffuse_downwelling_shortwave_flux_in_air
rsdsmax_Prim1hr         surface_downwelling_shortwave_flux_in_air
rsdsmin_Prim1hr         surface_downwelling_shortwave_flux_in_air
sfcWindmax_Prim1hr      wind_speed
sfcWindmin_Prim1hr      wind_speed
wsgmax_Prim1hr          wind_speed_of_gust
rsdsdiff_3hr            surface_diffuse_downwelling_shortwave_flux_in_air
rsuscs_3hr              surface_upwelling_shortwave_flux_in_air_assuming_clear_sky
clivi_E3hr              atmosphere_cloud_ice_content
clwvi_E3hr              atmosphere_cloud_condensed_water_content
prcsh_E3hr              hallow_convective_precipitation_flux
prra_E3hr               rainfall_flux
rlut_E3hr               toa_outgoing_longwave_flux
rlutcs_E3hr             toa_outgoing_longwave_flux_assuming_clear_sky
rsut_E3hr    