# P2. Editing the (meta)data and writing out the edited version to file

## Practical Notebook 2 (of 6) for *Intro to the NCAS CF Data Tools, cf-python and cf-plot*

**In this section we demonstrate how to change data that has been read in from file, both the data arrays and the metadata that describes them, and then how to write the result back out to a file with a chosen name, so that you can see how cf-python can be used to edit data or to create new data.**

***

<div class="alert alert-block alert-success">
<i>Practical instructions:</i> run all of the cells in this section to do the set up.
</div>

## Setting up

**In this short prelude we set up this Notebook, import the libraries and check the data we will work with, ready to use the libraries and the data (exactly as per the first Notebook setup but in one cell only for quick execution).**

In [1]:
# Set up for inline plots - only needed inside a Notebook environment - and to ignore some repeating warnings
%matplotlib inline

import warnings
warnings.filterwarnings("ignore")

# Import the two CF Data Tools libraries and inspect the versions
import cfplot as cfp
import cf
print("--- Version report: ---")
print("cf-python version is:", cf.__version__)
print("cf-plot version is:", cfp.__version__)
print("CF Conventions version is:", cf.CF())

# See what datasets we have to explore within the data directory we use throughout this course
print("--- Datasets available from the path '../ncas_data': ---")
# Note that in a Jupyter Notebook, '!' precedes a shell command - so this is a command, not Python
!ls ../ncas_data

--- Version report: ---
cf-python version is: 3.18.1
cf-plot version is: 3.4.0
CF Conventions version is: 1.12
--- Datasets available from the path '../ncas_data': ---
160by320griddata.nc			   precip_2010.nc
aaaaoa.pmh8dec.pp			   precip_DJF_means.nc
alpine_precip_DJF_means.nc		   qbo.nc
data1.nc				   regions.nc
data1-updated.nc			   rgp.nc
data2.nc				   sea_currents_backup.nc
data3.nc				   sea_currents.nc
data5.nc				   ta.nc
ggas2014121200_00-18.nc			   tripolar.nc
IPSL-CM5A-LR_r1i1p1_tas_n96_rcp45_mnth.nc  two_fields.nc
land.nc					   ua.nc
model_precip_DJF_means_low_res.nc	   u_n216.nc
model_precip_DJF_means.nc		   u_n96.nc
n2o_emissions.nc			   vaAMIPlcd_DJF.nc
POLCOMS_WAM_ZUV_01_16012006.nc		   va.nc
precip_1D_monthly.nc			   wapAMIPlcd_DJF.nc
precip_1D_yearly.nc


***

<div class="alert alert-block alert-success">
<i>Practical instructions:</i> now we can start the practical. We will follow the same sectioning as in the teaching notebook, so please consult the notes there in the matching section for guidance and you can also consult the cf-python and cf-plot documentation linked above.
</div>

## 2. Editing the (meta)data and writing out the edited version to file

### a) Changing the underlying data

Let's use the same field from section 1. Read in the field list from the dataset 'data1.nc' in the 'ncas_data' directory and take the first field from it to assign to the variable `field` which we will work with again throughout this section.

In [2]:
# Required from Step 1
fieldlist = cf.read("../ncas_data/data1.nc")
print("Field List is:", fieldlist)
field = fieldlist[0]
print("Field is:", field)

Field List is: [<CF Field: long_name=Potential vorticity(time(1), pressure(23), latitude(160), longitude(320)) K m**2 kg**-1 s**-1>,
 <CF Field: air_temperature(time(1), pressure(23), latitude(160), longitude(320)) K>,
 <CF Field: eastward_wind(time(1), pressure(23), latitude(160), longitude(320)) m s**-1>,
 <CF Field: northward_wind(time(1), pressure(23), latitude(160), longitude(320)) m s**-1>]
Field is: Field: long_name=Potential vorticity (ncvar%PV)
-----------------------------------------------
Data            : long_name=Potential vorticity(time(1), pressure(23), latitude(160), longitude(320)) K m**2 kg**-1 s**-1
Dimension coords: time(1) = [1964-01-21 00:00:00]
                : pressure(23) = [1000.0, ..., 1.0] mbar
                : latitude(160) = [89.14151763916016, ..., -89.14151763916016] degrees_north
                : longitude(320) = [0.0, ..., 358.875] degrees_east


**2.a.1)** Access the data (*not* the data array of the data) of the full field and assign it to a variable called 'data'.

In [3]:
data = field.data

**2.a.2)** Inspect the field data with medium detail, then use its `shape` attribute to see its shape.

In [4]:
print(data)
data.shape

[[[[1.3371172826737165e-06, ..., -0.0072057610377669334]]]] K m**2 kg**-1 s**-1


(1, 23, 160, 320)

**2.a.3)** Use its `size` attribute to see how many items (in this case, numbers) there are in it. Can you see how this relates to the `shape` above, and to the structure of the coordinates from the field inspection in (1.b.2)?

In [5]:
data.size

1177600
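One way to check the relationship asked about in (2.a.3) is to multiply the axis lengths together. The sketch below uses only the standard library and hard-codes the shape reported in (2.a.2); the variable names are illustrative.

```python
import math

# Shape reported for the field data in (2.a.2):
# (time, pressure, latitude, longitude)
shape = (1, 23, 160, 320)

# The number of items is the product of the axis lengths
size = math.prod(shape)
print(size)
```

Each axis length also matches the length of the corresponding dimension coordinate listed when the field was printed.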

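**2.a.4)** Index the data from (2.a.1) to extract the subarray for the first time (index `0` on the time axis and everything on the other axes), assign it to a variable called `first_time_subarray`, and inspect it.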

In [6]:
first_time_subarray = data[0, :, :, :]
first_time_subarray

<CF Data(1, 23, 160, 320): [[[[1.3371172826737165e-06, ..., -0.0072057610377669334]]]] K m**2 kg**-1 s**-1>

**2.a.5)** Change all of the values in the first time subarray to the value '-50.0'.

In [7]:
first_time_subarray[0] = -50
first_time_subarray

<CF Data(1, 23, 160, 320): [[[[-50.0, ..., -50.0]]]] K m**2 kg**-1 s**-1>

**2.a.6)** Access the index item `[0, :, 0, 0]` of the full data array from (2.a.1) and assign it to a variable called `subarray`. Then check what shape it is and try to understand the size that emerges for that sub-array given that specific index.

In [8]:
subarray = data[0, :, 0, 0]
subarray.shape

(1, 23, 1, 1)
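The shape above shows that the axes indexed with an integer are kept as size-one axes rather than dropped. For comparison, here is a minimal NumPy sketch (a hypothetical stand-in array, not the cf object itself) showing that plain NumPy integer indexing drops those axes, while length-one slices reproduce the shape seen above.

```python
import numpy as np

# Hypothetical stand-in array with the same shape as the field data
arr = np.zeros((1, 23, 160, 320))

# Plain NumPy drops every axis that is indexed with an integer
numpy_style_shape = arr[0, :, 0, 0].shape

# Length-one slices keep those axes, matching the shape shown above
kept_dims_shape = arr[0:1, :, 0:1, 0:1].shape

print(numpy_style_shape)
print(kept_dims_shape)
```

Keeping every axis means the subarray still has one axis per dimension coordinate of the field.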

**2.a.7)** Change the values for this sub-array item to all ones, i.e. `1.0`. To create an array filled with ones whose shape matches that of the subarray from (2.a.6), you can use `numpy.ones(<desired shape>)`. You will need to import `numpy` first; let's call the module `np`, i.e. use `import numpy as np`.

In [9]:
import numpy as np
# Assign an array of ones matching the (1, 23, 1, 1) shape found in (2.a.6)
data[0, :, 0, 0] = np.ones((1, 23, 1, 1))

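**2.a.8)** Inspect `subarray` again to check whether it has been set to all ones. In the output below it still holds its original values: the assignment in (2.a.7) was made on `data`, and `subarray` is a separate object that is not updated by it. Re-index `data` with `[0, :, 0, 0]` to see the new values.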

In [10]:
subarray

<CF Data(1, 23, 1, 1): [[[[1.3371172826737165e-06, ..., 0.010724794119596481]]]] K m**2 kg**-1 s**-1>
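The unchanged values above are consistent with `subarray` being independent of `data`. The following is a minimal sketch of that behaviour using NumPy, where an explicit `.copy()` stands in for the independence seen in the output; the array names and sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical small array standing in for the field data
arr = np.arange(6.0).reshape(1, 3, 2, 1)

# An explicit copy mimics a subarray that is independent of its parent
sub = arr[0:1, :, 0:1, 0:1].copy()

# Assign ones to the same region of the parent array
arr[0:1, :, 0:1, 0:1] = np.ones((1, 3, 1, 1))

# The copy keeps the original values, the parent region now holds ones
print(sub.ravel())
print(arr[0:1, :, 0:1, 0:1].ravel())
```

To confirm an edit, inspect the object that was assigned to rather than a subarray extracted before the assignment.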

### b) Changing some metadata

**2.b.1)** Redefine the `pressure` coordinate variable we created in (1.c) (`pressure = field.coordinate("pressure")`) and again `print` it to inspect it. Let's say we want to reverse the direction of this axis, so that the pressures increase in value along it rather than decrease. In `cf` we can do this using the `flip` method, which also flips the data so that it stays matched to the new direction of the pressure axis.

Reassign `pressure` to the result of calling its `flip` method (alternatively, you can call `flip` with the argument `inplace=True` and no re-assignment, because that argument tells `cf` to adjust the variable in-place).

In [11]:
pressure = field.coordinate("pressure")
pressure = pressure.flip()
# Or: pressure.flip(inplace=True)

**2.b.2)** Now inspect the pressure coordinate again with full detail to confirm that the flip of axis direction happened.

In [12]:
pressure.dump()

Dimension coordinate: pressure
    long_name = 'p'
    positive = 'down'
    standard_name = 'pressure'
    units = 'mbar'
    Data(23) = [1.0, ..., 1000.0] mbar
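Reversing an axis only makes sense if the values along it are reversed in the same way, as noted in (2.b.1). The sketch below illustrates that idea with plain NumPy arrays; it is not how `cf` implements `flip`, and the pressure levels and values are made up for the example.

```python
import numpy as np

# Hypothetical pressure levels in decreasing order, with one value per level
pressure = np.array([1000.0, 500.0, 100.0, 1.0])
values = np.array([10.0, 20.0, 30.0, 40.0])

# Reverse the coordinate and the values along the same axis
flipped_pressure = np.flip(pressure, axis=0)
flipped_values = np.flip(values, axis=0)

# Each value is still paired with the pressure level it belonged to
pairs_before = dict(zip(pressure.tolist(), values.tolist()))
pairs_after = dict(zip(flipped_pressure.tolist(), flipped_values.tolist()))

print(flipped_pressure)
print(pairs_before == pairs_after)
```

Flipping only one of the two arrays would attach each value to the wrong pressure level.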


**2.b.3)** Notice the `standard_name`, a CF attribute used to identify the physical quantity in question. Its value must be one of the thousands of controlled options available in the CF Standard Name table (see https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html; you are encouraged to spend a few minutes exploring this resource). At the moment the value in the field is not a valid standard name, and it is also not specific enough to identify the variable well.

Let's change it to the more descriptive valid CF Standard Name of 'air_pressure'. To do this, access the `standard_name` attribute of the pressure coordinate and re-set it to the string value desired.

In [13]:
pressure.standard_name = "air_pressure"

**2.b.4)** Once again view the pressure coordinate with maximal detail to see the two changes we made in this section. Also view the overall field in medium detail to see that these changes propagate through to the field object by Python object rules.

In [14]:
pressure.dump()

Dimension coordinate: air_pressure
    long_name = 'p'
    positive = 'down'
    standard_name = 'air_pressure'
    units = 'mbar'
    Data(23) = [1.0, ..., 1000.0] mbar


### c) Writing a (list of) fields out to a file

**2.c.1)** Let's write out a few fields to disk. We'll write our field from the previous sections as well as a new field all into the same file.

First let's get a new field. We could read in another file to access some new fields, but let's use one of the cf-python ready-made 'example fields', which are designed for exploring cf-python capability without the need for real (and large/memory-intensive) data. Run `cf.example_field()` with any integer from 0 to 11 inclusive to get an example field, assigning it to the variable called 'new_field'.

In [15]:
new_field = cf.example_field(1)  # Or: any integer from 0 to 11 works fine as the argument

**2.c.2)** Inspect this new field with medium level of detail.

In [16]:
print(new_field)

Field: air_temperature (ncvar%ta)
---------------------------------
Data            : air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) K
Cell methods    : grid_latitude(10): grid_longitude(9): mean where land (interval: 0.1 degrees) time(1): maximum
Field ancils    : air_temperature standard_error(grid_latitude(10), grid_longitude(9)) = [[0.76, ..., 0.32]] K
Dimension coords: atmosphere_hybrid_height_coordinate(1) = [20.0] m
                : grid_latitude(10) = [2.2, ..., -1.76] degrees
                : grid_longitude(9) = [-4.7, ..., -1.18] degrees
                : time(1) = [2019-01-01 00:00:00]
Auxiliary coords: latitude(grid_latitude(10), grid_longitude(9)) = [[53.941, ..., 50.225]] degrees_N
                : longitude(grid_longitude(9), grid_latitude(10)) = [[2.004, ..., 8.156]] degrees_E
                : long_name=Grid latitude name(grid_latitude(10)) = [--, ..., kappa]
Cell measures   : measure:area(grid_longitude(9), grid_latitu

**2.c.3)** Create a two-field FieldList called `new_fieldlist` containing the field from (1.a.3) that we have been exploring in the previous sections (1.a) and (1.b) along with this new `new_field` from (2.c.2). You can use the `cf` function `cf.FieldList`, passing the fields it should contain in a normal Python list as an argument, to set this up.

In [17]:
new_fieldlist = cf.FieldList([field, new_field])

**2.c.4)** Call `print` with this new fieldlist as an argument to view it in medium detail.

In [18]:
print(new_fieldlist)

[<CF Field: long_name=Potential vorticity(time(1), pressure(23), latitude(160), longitude(320)) K m**2 kg**-1 s**-1>,
 <CF Field: air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) K>]


**2.c.5)** Now that we have constructed our desired FieldList, we can write it out to file. Use `cf`'s `write` function to write it to a file called `two_fields.nc` in the `../ncas_data` directory.

In [19]:
cf.write(new_fieldlist, "../ncas_data/two_fields.nc")

**2.c.6)** Use the shell command `!ls ../ncas_data` to see the contents of that directory to confirm that the file was written there.

In [20]:
!ls ../ncas_data

160by320griddata.nc			   precip_2010.nc
aaaaoa.pmh8dec.pp			   precip_DJF_means.nc
alpine_precip_DJF_means.nc		   qbo.nc
data1.nc				   regions.nc
data1-updated.nc			   rgp.nc
data2.nc				   sea_currents_backup.nc
data3.nc				   sea_currents.nc
data5.nc				   ta.nc
ggas2014121200_00-18.nc			   tripolar.nc
IPSL-CM5A-LR_r1i1p1_tas_n96_rcp45_mnth.nc  two_fields.nc
land.nc					   ua.nc
model_precip_DJF_means_low_res.nc	   u_n216.nc
model_precip_DJF_means.nc		   u_n96.nc
n2o_emissions.nc			   vaAMIPlcd_DJF.nc
POLCOMS_WAM_ZUV_01_16012006.nc		   va.nc
precip_1D_monthly.nc			   wapAMIPlcd_DJF.nc
precip_1D_yearly.nc
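The same check can be made from Python rather than the shell, which is useful outside a Notebook. This is a minimal standard-library sketch: it uses a temporary directory as a hypothetical stand-in for `../ncas_data` and creates an empty placeholder file in place of the real write step.

```python
import tempfile
from pathlib import Path

# Hypothetical temporary directory standing in for '../ncas_data'
with tempfile.TemporaryDirectory() as tmpdir:
    target = Path(tmpdir) / "two_fields.nc"

    # Stand-in for the write step: create an empty file of that name
    target.touch()

    # Check for the file from Python instead of listing the directory
    exists = target.exists()
    names = sorted(p.name for p in Path(tmpdir).iterdir())

print(exists)
print(names)
```

In the practical itself, the path to test would be `Path("../ncas_data/two_fields.nc")`.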


<div class="alert alert-block alert-success">
<i>Practical instructions:</i> this is the end of the section. Please check your work, review the material and then move on to Practical 3 (see the Notebook 'cf_data_tools_practical_03.ipynb').
</div>

***