# FATES_INCLINE_dataprep_surfacedata

By EL, heavily inspired by the NorESM-LSP notebooks and https://github.com/huitang-earth/MossLichen_testbed/blob/main/scripts/SeedClim_surfacedata_modification.ipynb

Data from the FunCaB project stored on OSF: 
Vandvik, V., Telford, R. J., Halbritter, A. H., Jaroszynska, F., Lynn, J. S., Geange, S. R., … Rüthers, J. (2022). FunCaB - The role of functional group interactions in mediating climate change impacts on the Carbon dynamics and biodiversity of alpine ecosystems. Retrieved from osf.io/4c5v2. DOI 10.17605/OSF.IO/4C5V2

Prerequisites:
1. get default input data from ...

This notebook:
- loads libraries and defines variables
- loads observation data from FunCaB for sites ALP1-4
- loads surface data for CLM-FATES
- modifies the surface data
- uploads the surface data to the INCLINE branch of Eva's LSP fork

In [None]:
# import libraries
import xarray as xr  # NetCDF data handling
import matplotlib.pyplot as plt  # Plotting
import time  # Keeping track of runtime
import json  # For reading data dictionaries stored in json format
import pandas as pd  # Tabular data analysis
import datetime as dt  # For workaround with long simulations (beyond year 2262)
from pathlib import Path  # For easy path handling
import netCDF4  # In-place editing of NetCDF surface data files

In [None]:
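# Site codes to process
site_codes = ["ALP1", "ALP2", "ALP3", "ALP4"]
# TODO: check whether a list of site codes works in the cells below

# TODO: set path for where to store OSF data

# Set path to default input data
# NOTE: case_id is not defined anywhere in this notebook; define it before running this cell
inputdata_path = Path(f"../../data/{case_id}/")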

# set path for where to store modified data


## 1. Load and view data

Refer to https://osf.io/4c5v2/ for data documentation. Download the data from OSF into a DataFrame.

In [None]:
# Download data directly from OSF storage
vcg_soil_temp_obs_df = pd.read_csv("https://osf.io/7tgxb/download", low_memory=False)

In [None]:
# Print first rows
vcg_soil_temp_obs_df.head()

Link the site names in the data to the site names in the LSP (requires the LSP file; see Hui's notebook for a manual alternative).

In [None]:
# Read dictionary for mapping the name of the NorESM platform site codes to the corresponding name in the VCG dataset
with open(Path("./dicts/vestland_climate_grid.json"), 'r', encoding='utf-8') as vcg_site_json:
    vcg_site_dict = json.load(vcg_site_json).get("sites")
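
If the LSP dictionary file is not available, a manual mapping could look like the sketch below. It only mirrors how `vcg_site_dict` is accessed in this notebook (site code, then `'osf_csv_name'`). The names `manual_site_dict` and `example_site_codes` are made up for illustration, and the site names are placeholders, not the actual `siteID` values in the OSF file.

```python
# Hypothetical manual mapping from LSP site codes to the site names used in
# the OSF CSV file. Replace the placeholder strings with the real siteID values.
manual_site_dict = {
    "ALP1": {"osf_csv_name": "PLACEHOLDER_SITE_1"},
    "ALP2": {"osf_csv_name": "PLACEHOLDER_SITE_2"},
    "ALP3": {"osf_csv_name": "PLACEHOLDER_SITE_3"},
    "ALP4": {"osf_csv_name": "PLACEHOLDER_SITE_4"},
}

example_site_codes = ["ALP1", "ALP2", "ALP3", "ALP4"]

# A dict cannot be indexed with a list, so look up each site code separately
osf_site_names = [manual_site_dict[code]["osf_csv_name"] for code in example_site_codes]
print(osf_site_names)
```

The resulting list can then be passed to `Series.isin()` to select the rows for all sites at once.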

In [None]:
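# Subset soil temperatures for selected sites
# site_codes is a list, so look up the OSF name of each site and filter with isin()
osf_site_names = [vcg_site_dict[code]['osf_csv_name'] for code in site_codes]
mysite_soil_temp_obs_df = vcg_soil_temp_obs_df[vcg_soil_temp_obs_df["siteID"].isin(osf_site_names)]
mysite_soil_temp_obs_df.head()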

In [None]:
# Print period for available measurements
print(f"From: {min(mysite_soil_temp_obs_df['date_time'])}")
print(f"To: {max(mysite_soil_temp_obs_df['date_time'])}")

In [None]:
# Calculate monthly means
monthly_mean_df = mysite_soil_temp_obs_df.groupby(
    pd.PeriodIndex(mysite_soil_temp_obs_df['date_time'], freq='M')
)['soiltemperature'].mean()

# Convert to DataFrame
monthly_mean_df = monthly_mean_df.to_frame()

In [None]:
monthly_mean_df

In [None]:
# Calculate yearly mean (to enable yearly comparison to model data)
monthly_mean_df['date_dt'] = pd.to_datetime(monthly_mean_df.index.to_timestamp())
# transform() returns a new Series, so store the result in its own column
monthly_mean_df['yearly_mean'] = monthly_mean_df.groupby(monthly_mean_df.date_dt.dt.year)['soiltemperature'].transform('mean')
monthly_mean_df = monthly_mean_df.reset_index()
# Add integer month column for easier data handling later on
monthly_mean_df['month_int'] = monthly_mean_df['date_dt'].dt.month
monthly_mean_df
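
As a small illustration of the yearly-mean step: `groupby(...).transform('mean')` returns a new Series aligned with the original rows and does not modify the DataFrame, so the result has to be assigned to be kept. The sketch below uses a toy DataFrame with made-up temperatures; `toy_df` and `yearly_mean` are hypothetical names.

```python
import pandas as pd

# Toy monthly data with invented values, for illustration only
toy_df = pd.DataFrame({
    "date_dt": pd.to_datetime(["2019-01-01", "2019-07-01", "2020-01-01", "2020-07-01"]),
    "soiltemperature": [-2.0, 10.0, -4.0, 10.0],
})

# Mean per calendar year, broadcast back to every row of that year
toy_df["yearly_mean"] = (
    toy_df.groupby(toy_df["date_dt"].dt.year)["soiltemperature"].transform("mean")
)

# Integer month taken directly from the datetime column
toy_df["month_int"] = toy_df["date_dt"].dt.month

print(toy_df)
```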

In [None]:
# Plot for quick visualization
import matplotlib.dates as mdates

fig, ax = plt.subplots(figsize=(6, 6))

ax.plot(monthly_mean_df['date_dt'],
        monthly_mean_df['soiltemperature']
       )
ax.set_title(f"{', '.join(site_codes)}: monthly mean soil temperatures")
ax.set_xlabel("Month")
ax.set_ylabel("Mean soil temperature [°C]")
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))

## 2. Load default model surface data

Note (14.11.2022): the code below is copied from Hui's script and still needs substantial changes.


In [None]:
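# Open surface data file to modify in place ('r+' = read/write)
# NOTE: copied from Hui's script; `sites` and `i` are not defined in this notebook, and the path is specific to Hui's system
surface_nc_data = netCDF4.Dataset('/home/huitang/saga/work/inputdata/lnd/clm2/surfdata_map/'+sites[i]+'/surfdata_'+sites[i]+'_simyr2000.nc', 'r+')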
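
One possible way to adapt the path handling to several sites is to build one path per site code with `pathlib`. This is only a sketch: the directory and file name pattern follow the line above, while `surfdata_root` is a placeholder that has to point to the actual input data directory, and `example_site_codes` and `surfdata_paths` are hypothetical names.

```python
from pathlib import Path

# Placeholder root directory; replace with the real location of the input data
surfdata_root = Path("/path/to/inputdata/lnd/clm2/surfdata_map")

example_site_codes = ["ALP1", "ALP2", "ALP3", "ALP4"]

# One surface data file per site: <root>/<site>/surfdata_<site>_simyr2000.nc
surfdata_paths = {
    code: surfdata_root / code / f"surfdata_{code}_simyr2000.nc"
    for code in example_site_codes
}

for code, path in surfdata_paths.items():
    print(code, path)
```

Each path could then be opened in turn inside a loop over the site codes instead of indexing `sites[i]`.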


In [None]:

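# Modify land cover
# NOTE: plant_cover_obs, org_obs, sd_obs and plant_height_obs must be derived from the observation data before running this cell
surface_nc_data['PCT_NAT_PFT'][0,:,:] = 100-plant_cover_obs # pay attention to the first index (PFT), 0: barren ground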
surface_nc_data['PCT_NAT_PFT'][1:12,:,:] = 0          
surface_nc_data['PCT_NAT_PFT'][12,:,:] = plant_cover_obs      # 12: grass
surface_nc_data['PCT_NAT_PFT'][13:15,:,:] = 0
surface_nc_data['PCT_NATVEG'][:,:] = 100
surface_nc_data['PCT_CROP'][:,:] = 0
surface_nc_data['PCT_CFT'][:,:,:] = 0
surface_nc_data['PCT_WETLAND'][:,:] = 0
surface_nc_data['PCT_LAKE'][:,:] = 0
surface_nc_data['PCT_GLACIER'][:,:] = 0
surface_nc_data['PCT_URBAN'][:,:,:] = 0
# Modify soil properties
surface_nc_data['ORGANIC'][0:3,:,:] = org_obs        # the layers of soil to modify depending on the availability of the data
#surface_nc_data['PCT_SAND'][:,:,:] = 0
#surface_nc_data['PCT_CLAY'][:,:,:] = 0
surface_nc_data['zbedrock'][:,:] = sd_obs/100       # Modify soil depth
# Modify topography
surface_nc_data['SLOPE'][:,:] = 20.0

# Modify satellite phenology (only if you use reduced complexity mode)
#surface_nc_data['MONTHLY_LAI'][:,:,:,:] = 0
#surface_nc_data['MONTHLY_SAI'][:,:,:,:] = 0
surface_nc_data['MONTHLY_HEIGHT_TOP'][:,12,:,:] = plant_height_obs/1000
surface_nc_data['MONTHLY_HEIGHT_BOT'][:,12,:,:] = 0.001/1000

surface_nc_data.close()
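
A quick sanity check for the land cover step is that the PFT percentages still sum to 100 after the modification. The sketch below reproduces the index assignments on a dummy NumPy array instead of the real file. The array shape, the number of PFTs (15, matching indices 0-14 used above) and the value of `plant_cover_obs` are assumptions for illustration only.

```python
import numpy as np

n_pft = 15              # assumed number of natural PFTs (indices 0-14)
plant_cover_obs = 80.0  # made-up plant cover in percent

# Dummy stand-in for PCT_NAT_PFT with dimensions (pft, lat, lon)
pct_nat_pft = np.full((n_pft, 1, 1), 5.0)

# Same assignments as for the surface data file
pct_nat_pft[0, :, :] = 100 - plant_cover_obs  # 0: barren ground
pct_nat_pft[1:12, :, :] = 0
pct_nat_pft[12, :, :] = plant_cover_obs       # 12: grass
pct_nat_pft[13:15, :, :] = 0

# The percentages over all PFTs should add up to 100 in every grid cell
pft_sum = pct_nat_pft.sum(axis=0)
assert np.allclose(pft_sum, 100.0), "PFT percentages do not sum to 100"
print(pft_sum)
```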