(C) Crown Copyright, Met Office. All rights reserved.

## hadgem_volumes.ipynb

Calculate the data volumes generated by the Stream 1 and Stream 2 HadGEM3 hist-1950 coupled simulations, which ran for 65 years. Stream 1 included the full model output, whereas Stream 2 contained only the essential high-frequency variables and no variables from the CFday table. See https://doi.org/10.5281/zenodo.3607327 for more details of the Stream 2 data request.

In [1]:
import django
# Load the Django app registry before importing any models
django.setup()
from django.db.models import Sum
from pdata_app.models import DataFile, DataRequest
from pdata_app.utils.common import filter_hadgem_stream2

In [2]:
# (climate model, experiment, rip_code) for each simulation
models = [
    ('HadGEM3-GC31-LL', 'hist-1950', 'r1i1p1f1'),
    ('HadGEM3-GC31-MM', 'hist-1950', 'r1i1p1f1'),
    ('HadGEM3-GC31-HM', 'hist-1950', 'r1i1p1f1'),
    ('HadGEM3-GC31-HH', 'hist-1950', 'r1i1p1f1'),
]

In [3]:
def to_TiB(num_bytes):
    """
    Convert a size in bytes to a string giving the size in TiB

    :param int num_bytes: the number of bytes
    :returns: the number of bytes as a string in units of TiB to 1 decimal place
    :rtype: str
    """
    return f'{num_bytes / 1024**4:.1f}'
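
As a quick check of the conversion, the sketch below re-declares `to_TiB` so that it runs on its own and applies it to a few illustrative byte counts. The inputs are chosen here for illustration and are not taken from the database.

```python
def to_TiB(num_bytes):
    """Convert a size in bytes to a string in TiB, to 1 decimal place."""
    return f'{num_bytes / 1024**4:.1f}'


# Illustrative inputs only (assumptions, not values from the database)
one_tib = to_TiB(1024**4)              # exactly 1 TiB
one_and_half = to_TiB(1.5 * 1024**4)   # 1.5 TiB
fifty_gib = to_TiB(50 * 1024**3)       # 50 GiB, i.e. about 0.049 TiB

print(one_tib, one_and_half, fifty_gib)
```

Because the result is rounded to one decimal place, any volume below 0.05 TiB (about 51 GiB) is shown as `0.0`.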

In [4]:
print(f"All data volumes are in TiB (1 TiB = 1024**4 bytes = {1024**4} bytes)")
print()
print(f"{'Model':<16} {'Stream 1 Total':<14}   {'Stream 1 Year':<13}   "
      f"{'Stream 2 Total':<14}   {'Stream 2 Year':<13}")

NUM_YEARS = 65  # length of each hist-1950 simulation in years

for model in models:
    query = {
        'climate_model__short_name': model[0],
        'experiment__short_name': model[1],
        'rip_code': model[2]
    }
    datareqs = DataRequest.objects.filter(**query)
    # Stream 1 is every file; Stream 2 is the subset kept by filter_hadgem_stream2
    datafiles_s1 = DataFile.objects.filter(data_request__in=datareqs).distinct()
    datafiles_s2 = DataFile.objects.filter(data_request__in=filter_hadgem_stream2(datareqs)).distinct()
    # Sum() gives None when no files match, so fall back to 0 bytes
    s1_size = datafiles_s1.aggregate(Sum('size'))['size__sum'] or 0
    s2_size = datafiles_s2.aggregate(Sum('size'))['size__sum'] or 0
    print(f'{model[0]:<16} {to_TiB(s1_size):>14}   {to_TiB(s1_size / NUM_YEARS):>13}   '
          f'{to_TiB(s2_size):>14}   {to_TiB(s2_size / NUM_YEARS):>13}')

All data volumes are in TiB (1 TiB = 1024**4 bytes = 1099511627776 bytes)

Model            Stream 1 Total   Stream 1 Year   Stream 2 Total   Stream 2 Year
HadGEM3-GC31-LL             2.1             0.0              1.0             0.0
HadGEM3-GC31-MM            11.7             0.2              6.4             0.1
HadGEM3-GC31-HM            49.1             0.8             21.2             0.3
HadGEM3-GC31-HH            75.9             1.2             47.8             0.7
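
The per-year columns above round to `0.0` for the lowest resolution, so it can help to express them in GiB and to compare the two streams directly. The following is a minimal sketch that works only from the totals printed in the table. Those totals are already rounded to one decimal place, so the derived figures are approximate. The names `totals_tib`, `summarise` and `summary` are introduced here for illustration and are not part of `pdata_app`.

```python
# Totals in TiB copied from the table above (Stream 1, Stream 2).
# These are rounded values, so the results below are approximate.
totals_tib = {
    'HadGEM3-GC31-LL': (2.1, 1.0),
    'HadGEM3-GC31-MM': (11.7, 6.4),
    'HadGEM3-GC31-HM': (49.1, 21.2),
    'HadGEM3-GC31-HH': (75.9, 47.8),
}
NUM_YEARS = 65  # length of each simulation in years


def summarise(s1_tib, s2_tib, num_years=NUM_YEARS):
    """Return (Stream 1 GiB/year, Stream 2 GiB/year, Stream 2 as % of Stream 1)."""
    s1_gib_year = s1_tib * 1024 / num_years
    s2_gib_year = s2_tib * 1024 / num_years
    return s1_gib_year, s2_gib_year, 100 * s2_tib / s1_tib


summary = {name: summarise(*vals) for name, vals in totals_tib.items()}

print(f"{'Model':<16} {'S1 GiB/yr':>10} {'S2 GiB/yr':>10} {'S2 / S1':>8}")
for name, (s1_year, s2_year, percent) in summary.items():
    print(f'{name:<16} {s1_year:>10.0f} {s2_year:>10.0f} {percent:>7.0f}%')
```

Running the same calculation on the unrounded byte counts from the database would give more accurate figures than this table-based estimate.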
