# Fix calendar issues with COSMO-REA6 forcing data

COSMO reanalysis data uses a Gregorian calendar with leap years. Running long simulations (necessary for spin-up) with COSMO reanalysis data proved problematic, because the leap years in the cycled forcing data and in the model calendar don't align, and the model terminates prematurely after ~100 years.

NB! This notebook does not fix the problem of long simulations with Gregorian-calendar data. This is because the data, in my case COSMO reanalysis data for 1995-2018, doesn't cover a long enough time span to include [skipped leap years (e.g. 2100)](https://en.wikipedia.org/wiki/Gregorian_calendar). So in practice, the COSMO data follows the pattern of a [Julian calendar](https://en.wikipedia.org/wiki/Julian_calendar), where every 4th year is a leap year without exception. Cycling such data means that when the Gregorian calendar expects an exception to the 4-year rule, the data won't fit.
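
To make the mismatch concrete, here is a minimal sketch comparing the two leap-year rules with the Python standard library only. The helper `is_julian_leap` is a name made up for this illustration; `calendar.isleap` implements the Gregorian rule.

```python
import calendar


def is_julian_leap(year):
    """Julian rule: every 4th year is a leap year, without exception."""
    return year % 4 == 0


# Within the COSMO data period (1995-2018) both rules agree for every year ...
data_years = range(1995, 2019)
rules_agree = all(calendar.isleap(y) == is_julian_leap(y) for y in data_years)
print(rules_agree)  # True

# ... but they disagree in century years that are not divisible by 400.
mismatches = [y for y in range(1900, 2401)
              if calendar.isleap(y) != is_julian_leap(y)]
print(mismatches)  # [1900, 2100, 2200, 2300]
```

Which simulation year first runs into such a mismatch depends on the model start year and on how the forcing years are cycled, neither of which this sketch assumes.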

This notebook will...
1. read in the COSMO reanalysis data
2. change the calendar attribute for all files
3. save the modified data

In [1]:
# import libraries
import os
import netCDF4 as nc
import xarray as xr  # NetCDF data handling
import zipfile # for unzipping
import shutil # easiest whole-directory zipping
from pathlib import Path  # For easy path handling

Download the COSMO-REA data from the evalieungh/FATES_INCLINE repo if necessary:

In [None]:
%%bash
pwd
cd ../data
pwd
wget https://raw.githubusercontent.com/evalieungh/FATES_INCLINE/main/data/ALP4_cosmorea.zip

In [2]:
# set path to the data folder, which holds the original data and will also hold the modified (gregorian) version
cosmo_path = str(Path("C:/Users/evaler/OneDrive - Universitetet i Oslo/Eva/PHD/FATES_INCLINE/data"))


Unzip folder:

In [None]:
print("extracting ", cosmo_path + "/ALP4_cosmorea.zip")
with zipfile.ZipFile(cosmo_path + "/ALP4_cosmorea.zip", 'r') as zip_ref:
    zip_ref.extractall(cosmo_path + "/ALP4_cosmorea")

In file explorer, copy the ALP4_cosmorea folder and rename the new copy ALP4_cosmorea_gregorian. Keep the original files in ALP4_cosmorea untouched.
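
The manual copy could also be scripted. The following is a minimal sketch, assuming the unzipped `ALP4_cosmorea` folder sits directly inside the data directory; `copy_forcing_folder` is a hypothetical helper name. `shutil.copytree` refuses to write into an existing destination by default, so an earlier copy is not silently overwritten.

```python
import shutil
from pathlib import Path


def copy_forcing_folder(data_dir, src_name="ALP4_cosmorea",
                        dst_name="ALP4_cosmorea_gregorian"):
    """Copy the original folder to a new name, leaving the original untouched."""
    src = Path(data_dir) / src_name
    dst = Path(data_dir) / dst_name
    # Raises FileExistsError if the destination folder already exists
    shutil.copytree(src, dst)
    return dst
```

In this notebook it would be called as `copy_forcing_folder(cosmo_path)`.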
 
Set the path to the copied folder. The files in this copy will be modified in place, so the copies are overwritten while the originals stay intact:

In [7]:
# Define file paths
input_dir = str(Path(cosmo_path) / "ALP4_cosmorea_gregorian" / "datmdata")
print("Input stored here:", input_dir)

Input stored here: C:\Users\evaler\OneDrive - Universitetet i Oslo\Eva\PHD\FATES_INCLINE\data\ALP4_cosmorea_gregorian\datmdata


List the atmospheric forcing files and check how they are structured.

In [8]:
# Print all NetCDF files in the input directory
# os.listdir returns entries in arbitrary order, so sort for a stable listing
files = sorted(f for f in os.listdir(input_dir) if f.endswith('.nc'))
print(files)

['clm1pt_ALP4_1995-01.nc', 'clm1pt_ALP4_1995-02.nc', 'clm1pt_ALP4_1995-03.nc', 'clm1pt_ALP4_1995-04.nc', 'clm1pt_ALP4_1995-05.nc', 'clm1pt_ALP4_1995-06.nc', 'clm1pt_ALP4_1995-07.nc', 'clm1pt_ALP4_1995-08.nc', 'clm1pt_ALP4_1995-09.nc', 'clm1pt_ALP4_1995-10.nc', 'clm1pt_ALP4_1995-11.nc', 'clm1pt_ALP4_1995-12.nc', 'clm1pt_ALP4_1996-01.nc', 'clm1pt_ALP4_1996-02.nc', 'clm1pt_ALP4_1996-03.nc', 'clm1pt_ALP4_1996-04.nc', 'clm1pt_ALP4_1996-05.nc', 'clm1pt_ALP4_1996-06.nc', 'clm1pt_ALP4_1996-07.nc', 'clm1pt_ALP4_1996-08.nc', 'clm1pt_ALP4_1996-09.nc', 'clm1pt_ALP4_1996-10.nc', 'clm1pt_ALP4_1996-11.nc', 'clm1pt_ALP4_1996-12.nc', 'clm1pt_ALP4_1997-01.nc', 'clm1pt_ALP4_1997-02.nc', 'clm1pt_ALP4_1997-03.nc', 'clm1pt_ALP4_1997-04.nc', 'clm1pt_ALP4_1997-05.nc', 'clm1pt_ALP4_1997-06.nc', 'clm1pt_ALP4_1997-07.nc', 'clm1pt_ALP4_1997-08.nc', 'clm1pt_ALP4_1997-09.nc', 'clm1pt_ALP4_1997-10.nc', 'clm1pt_ALP4_1997-11.nc', 'clm1pt_ALP4_1997-12.nc', 'clm1pt_ALP4_1998-01.nc', 'clm1pt_ALP4_1998-02.nc', 'clm1pt_ALP
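
Because the printed list is long, a programmatic completeness check can be easier than reading it. This is a sketch that assumes one file per month for 1995 through 2018 and the `clm1pt_ALP4_YYYY-MM.nc` naming seen in the listing; `missing_months` is a hypothetical helper.

```python
def missing_months(filenames, first_year=1995, last_year=2018,
                   prefix="clm1pt_ALP4_"):
    """Return the expected monthly file names that are absent from filenames."""
    expected = [f"{prefix}{year}-{month:02d}.nc"
                for year in range(first_year, last_year + 1)
                for month in range(1, 13)]
    present = set(filenames)
    return [name for name in expected if name not in present]


# Example with a deliberately incomplete list for a single year
example = ["clm1pt_ALP4_1995-01.nc", "clm1pt_ALP4_1995-03.nc"]
print(missing_months(example, last_year=1995)[:2])
# ['clm1pt_ALP4_1995-02.nc', 'clm1pt_ALP4_1995-04.nc']
```

Under these assumptions, `missing_months(files)` returns an empty list when all 288 monthly files (24 years × 12 months) are present.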

In [9]:
# print variables in the first nc file
example_file = str(Path(input_dir) / "clm1pt_ALP4_1995-01.nc")
with nc.Dataset(example_file, 'r') as ds:
    # List all variables in the file
    print("Variables in the file:")
    print(ds.variables.keys())

Variables in the file:
dict_keys(['EDGEW', 'EDGEE', 'EDGES', 'EDGEN', 'LONGXY', 'LATIXY', 'SWDIFDS_RAD', 'SWDIRS_RAD', 'RAIN_CON', 'RAIN_GSP', 'SNOW_GSP', 'SNOW_CON', 'PRECTmms', 'TBOT', 'WIND', 'PSRF', 'SHUM', 'FLDS', 'time'])


In [10]:
# get more info on time variable
with nc.Dataset(example_file, 'r') as ds:
    # Access the "time" variable
    time_var = ds.variables['time']

    # Print the variable dimensions
    print("Variable dimensions:", time_var.dimensions)

    # Print the variable shape
    print("Variable shape:", time_var.shape)

    # Print the variable attributes
    print("Variable attributes:", time_var.ncattrs())

    # Print a specific attribute
    print("Variable units:", time_var.units)

    # Print the variable values
    print("Variable values:", time_var[:])

Variable dimensions: ('time',)
Variable shape: (248,)
Variable attributes: ['standard_name', 'units', 'calendar', 'axis']
Variable units: hours since 1995-1-1 01:00:00
Variable values: [  0.   3.   6.   9.  12.  15.  18.  21.  24.  27.  30.  33.  36.  39.
  42.  45.  48.  51.  54.  57.  60.  63.  66.  69.  72.  75.  78.  81.
  84.  87.  90.  93.  96.  99. 102. 105. 108. 111. 114. 117. 120. 123.
 126. 129. 132. 135. 138. 141. 144. 147. 150. 153. 156. 159. 162. 165.
 168. 171. 174. 177. 180. 183. 186. 189. 192. 195. 198. 201. 204. 207.
 210. 213. 216. 219. 222. 225. 228. 231. 234. 237. 240. 243. 246. 249.
 252. 255. 258. 261. 264. 267. 270. 273. 276. 279. 282. 285. 288. 291.
 294. 297. 300. 303. 306. 309. 312. 315. 318. 321. 324. 327. 330. 333.
 336. 339. 342. 345. 348. 351. 354. 357. 360. 363. 366. 369. 372. 375.
 378. 381. 384. 387. 390. 393. 396. 399. 402. 405. 408. 411. 414. 417.
 420. 423. 426. 429. 432. 435. 438. 441. 444. 447. 450. 453. 456. 459.
 462. 465. 468. 471. 474. 477. 480
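
To interpret these values, the hour offsets can be converted into timestamps. This is a minimal sketch using only the standard library. It assumes the 248 values continue in regular 3-hour steps, since the printed output is cut off. Python's `datetime` uses the proleptic Gregorian calendar, which gives the same dates as a standard/Gregorian calendar for this period.

```python
from datetime import datetime, timedelta

# Reference time taken from the units string "hours since 1995-1-1 01:00:00"
reference = datetime(1995, 1, 1, 1, 0, 0)

# Assumption: 248 offsets in regular 3-hour steps (0, 3, ..., 741)
offsets = [3 * i for i in range(248)]
timestamps = [reference + timedelta(hours=h) for h in offsets]

print(timestamps[0])   # 1995-01-01 01:00:00
print(timestamps[-1])  # 1995-01-31 22:00:00
```

So 248 steps of 3 hours span 744 hours, i.e. the 31 days of January 1995.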

## Change the value of the calendar attribute from 'standard' to 'gregorian':

In [11]:
%%bash
pwd
cd ../data/ALP4_cosmorea_gregorian/datmdata
# ncatted (NCO): overwrite (o) the character-type (c) attribute "calendar" of variable "time"
for f in clm1pt_ALP4*; do ncatted -O -a calendar,time,o,c,gregorian "$f"; done

/mnt/c/Users/evaler/OneDrive - Universitetet i Oslo/Eva/PHD/FATES_INCLINE/src


Check if everything looks fine by printing some info for some example files:

In [19]:
# check that the files look correct, unmodified by other scripts
example_file = str(Path(cosmo_path) / "ALP4_cosmorea_gregorian" / "datmdata" / "clm1pt_ALP4_2001-02.nc")
# get more info on time variable
with nc.Dataset(example_file, 'r') as ds:
    # Access the "time" variable
    time_var = ds.variables['time']

    # Print info
    print("Variable attributes:", time_var.ncattrs())
    print("Current calendar", time_var.calendar)
    print("Variable values:", time_var[:])

Variable attributes: ['standard_name', 'units', 'calendar', 'axis']
Current calendar gregorian
Variable values: [  0.   3.   6.   9.  12.  15.  18.  21.  24.  27.  30.  33.  36.  39.
  42.  45.  48.  51.  54.  57.  60.  63.  66.  69.  72.  75.  78.  81.
  84.  87.  90.  93.  96.  99. 102. 105. 108. 111. 114. 117. 120. 123.
 126. 129. 132. 135. 138. 141. 144. 147. 150. 153. 156. 159. 162. 165.
 168. 171. 174. 177. 180. 183. 186. 189. 192. 195. 198. 201. 204. 207.
 210. 213. 216. 219. 222. 225. 228. 231. 234. 237. 240. 243. 246. 249.
 252. 255. 258. 261. 264. 267. 270. 273. 276. 279. 282. 285. 288. 291.
 294. 297. 300. 303. 306. 309. 312. 315. 318. 321. 324. 327. 330. 333.
 336. 339. 342. 345. 348. 351. 354. 357. 360. 363. 366. 369. 372. 375.
 378. 381. 384. 387. 390. 393. 396. 399. 402. 405. 408. 411. 414. 417.
 420. 423. 426. 429. 432. 435. 438. 441. 444. 447. 450. 453. 456. 459.
 462. 465. 468. 471. 474. 477. 480. 483. 486. 489. 492. 495. 498. 501.
 504. 507. 510. 513. 516. 519. 522. 

Re-zip folder

In [20]:
shutil.make_archive(cosmo_path + "/ALP4_cosmorea_gregorian",
                    'zip', cosmo_path + "/ALP4_cosmorea_gregorian")
print("zipping complete")

zipping complete


Finally, commit and push the changes to the GitHub repository so the data can be downloaded from there.