# NC to CSV

*Thomas Bissinger, Hack and Harvest KN, 29.06.24*

Turns an .nc file into a .csv file for the heat wave data set. Requires an API key from the climate data store (CDS) *https://cds.climate.copernicus.eu*.

The data set can be found under *https://cds.climate.copernicus.eu/cdsapp#!/dataset/sis-heat-and-cold-spells?tab=overview*

In [127]:
%pip install -q netCDF4
%pip install -q pandas
%pip install -q wheel
%pip install -q cdsapi
%pip install -q netCDF4

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


## Step 1: Download
Download the data from the CDS. **API key** needed. This make take a while.

In [None]:
import cdsapi
url = "https://cds.climate.copernicus.eu/api/v2"
key = "API KEY HERE" # insert your API key
c = cdsapi.Client()
c.retrieve( 
  'sis-heat-and-cold-spells',
  {
      'format': 'tgz',
      'variable': 'heat_wave_days',
      'definition': 'climatological_related',
      'experiment': [
          'rcp4_5', 'rcp8_5',
      ],
      'ensemble_statistic': [
          'ensemble_members_average', 'ensemble_members_standard_deviation',
      ],
  },
  'download.tar.gz')

2024-07-03 10:14:23,559 INFO Welcome to the CDS
2024-07-03 10:14:23,559 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/sis-heat-and-cold-spells
2024-07-03 10:14:23,806 INFO Request is queued
2024-07-03 10:14:24,878 INFO Request is running


## Step 2: Read a single .nc file
Get to know the values stored in a .nc file. You have to extract the archive downloaded in the previous step into some datafolder.
This function reads out a specific file and shows the data

In [129]:
import netCDF4 as nc

# Make your change here
datafolder='C:/Users/Thomas/OneDrive/Programming/2024_HackAndHarvestKN/data' 
filename='HWD_EU_climate_rcp85_mean_v1.0.nc'
filename='HWD_EU_climate_rcp85_stdev_v1.0.nc'
filepath = datafolder + '/' + filename
# Open NetCDF file
netcdf_file = nc.Dataset(filepath, 'r')


variable_names = list(netcdf_file.variables.keys())

# Print variable names
print("Variable names and dimensions the NetCDF file:")


print("{:<20} | {:>10} | {:>10} | {:>10}".format("variable name", "dim 1", "dim 2", "dim 3"))
print("-------------------------------------------------------------------")
for name in variable_names:
    data = netcdf_file.variables[name][:]
    try:
        dim_1 = len(data)
    except TypeError:
        dim_1 = 1
    try:
        dim_2 = len(data[0])
    except TypeError:
        dim_2 = 1
    except IndexError:
        dim_2 = 1
    try:
        dim_3 = len(data[0][0])
    except TypeError:
        dim_3 = 1
    except IndexError:
        dim_3 = 1
    print("{:<20} | {:>10} | {:>10} | {:>10}".format(name, dim_1, dim_2,dim_3))

#print("A closer look at the variables:")
#for name in variable_names:
#    data = netcdf_file.variables[name][:]
#    print("  ", name, data)


Variable names and dimensions the NetCDF file:
variable name        |      dim 1 |      dim 2 |      dim 3
-------------------------------------------------------------------
height               |          1 |          1 |          1
quantile             |          1 |          1 |          1
lat                  |        425 |          1 |          1
lon                  |        599 |          1 |          1
time                 |        100 |          1 |          1
HWD_EU_climate       |        100 |        425 |        599


# Step 3: Extract time series information about a single pixel
Create an auxiliary function to extract time series data for a single pixel, then a routine that realizes that

### Creating the routine
The routine gets a latitude and a longitude as input, the path to the nc file and the variable name that stores the relevant information. It computes the relevant index on the fly (so far, the step length has to be manually included as cells in the latitude-longitude grid are identified by their northeast corner, yet indexed with reference to the southwesternmost point in the domain

In [130]:
import netCDF4 as nc
def PixelToTimeSeries(latitude, longitude, ncpath, varname):
    steplength = 0.1
    netcdf_file = nc.Dataset(ncpath, 'r')
    lat_start = netcdf_file.variables["lat"][0] - steplength
    lon_start = netcdf_file.variables["lon"][0] - steplength
    lat_ind = int(( latitude - lat_start ) / steplength)
    lon_ind = int(( longitude - lon_start ) / steplength)
    return netcdf_file.variables[varname][:,lat_ind,lon_ind] # Assumes data format of varname to be time x lat x lon

### Extracting information
Now we take everything we found so far and extract two .csv files for a single city. Adjustments for the inclusion of additional pixels can be made.

In [132]:
import numpy as np
import csv

varname = 'HWD_EU_climate' 
latitude = 47.66 # For Konstanz
longitude = 9.17 # For Konstanz
datadir = 'C:/Users/Thomas/OneDrive/Programming/2024_HackAndHarvestKN/data' 
prefix = 'HWD_EU_climate'
mean_suffix = 'mean_v1.0.nc'
stdev_suffix = 'stdev_v1.0.nc'
for rcp_choice in [4.5, 8.5]:
    if rcp_choice == 4.5:
        modelname = 'rcp45'
        modelname_full = 'RCP4.5'
    elif rcp_choice == 8.5:
        modelname = 'rcp85'
        modelname_full = 'RCP8.5'
    mean_filepath = datadir + '/' + prefix + '_' + modelname + '_' + mean_suffix
    stdev_filepath = datadir + '/' + prefix + '_' + modelname + '_' + stdev_suffix
    csvfile_path = datadir + '/' + 'data_' + modelname + '.csv'
    start_year = 1986
    
    
    # Load the heat_days into a numpy array
    heat_days = PixelToTimeSeries(latitude, longitude, mean_filepath, varname)
    heat_days_std = PixelToTimeSeries(latitude, longitude, stdev_filepath, varname)
    date_vals = np.arange(start_year,start_year + heat_days.size)
    # Write to a CSV file
    with open(csvfile_path, mode='w', newline='') as file:
        writer = csv.writer(file)
        # Writing the header (optional)
        writer.writerow(['date', 'heat days', 'heat days std'])
        # Writing the heat_days rows
        for i in range(len(date_vals)):
            writer.writerow([date_vals[i], heat_days[i],heat_days_std[i]])

    print("Data written to " + csvfile_path)

Data written to C:/Users/Thomas/OneDrive/Programming/2024_HackAndHarvestKN/data/data_rcp45.csv
Data written to C:/Users/Thomas/OneDrive/Programming/2024_HackAndHarvestKN/data/data_rcp85.csv
