# Tips and tricks
This script uses the example datasets in the folder `test_data/`.

In [1]:
import pandas as pd
import numpy as np

import riversand

## 1. Validating input data
Use the function `rv.validate()` to validate the input data, e.g. the projection and resolution of geospatial data.

Use `rv.elevation`, `rv.shielding` and `rv.quartz` to display information about these datasets or 
`rv` to display a summary of all data uploaded to the project.
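
As a rough illustration of the kind of consistency check meant here, the following sketch compares the projection and resolution of several raster summaries. It is not the riversand implementation: the function `check_raster_consistency()` and the dictionary keys are hypothetical, and the assumption that resolution is compared along with the EPSG code is only inferred from the validation output in this notebook.

```python
import math


def check_raster_consistency(rasters, rel_tol=1e-6):
    """Return a list of conflicts between raster summaries.

    Each summary is a plain dict with the (hypothetical) keys
    'dtype', 'epsg' and 'res'; the first raster is the reference.
    """
    if not rasters:
        return ["No raster data defined"]
    problems = []
    ref = rasters[0]
    for r in rasters[1:]:
        # Same coordinate system?
        if r["epsg"] != ref["epsg"]:
            problems.append(f"{r['dtype']}: epsg {r['epsg']} != {ref['epsg']}")
        # Same cell size in x and y (within a small tolerance)?
        same_res = all(
            math.isclose(a, b, rel_tol=rel_tol)
            for a, b in zip(r["res"], ref["res"])
        )
        if not same_res:
            problems.append(f"{r['dtype']}: resolution {r['res']} != {ref['res']}")
    return problems


dem = {"dtype": "elevation", "epsg": 32632, "res": (35.0, 35.0)}
shielding_50m = {"dtype": "shielding", "epsg": 32632,
                 "res": (50.005889281507656, 49.98201438848921)}
# Resolution of the 35 m shielding raster is assumed to equal that of the DEM:
shielding_35m = {"dtype": "shielding", "epsg": 32632, "res": (35.0, 35.0)}

conflicts = check_raster_consistency([dem, shielding_50m])
no_conflicts = check_raster_consistency([dem, shielding_35m])
print(conflicts)
print(no_conflicts)
```

In this sketch the 50 m shielding raster is flagged because of its cell size even though its EPSG code matches that of the elevation raster.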

In [2]:
# Add raster data:
rv = riversand.Riversand("test_data")
rv.add_raster('dem_utm_35m.tif', dtype='elevation')
rv.add_raster('toposhielding_50m.tif', dtype='shielding') # optional 
rv.add_raster('quartz_35m.tif', dtype='quartz') # optional

In [3]:
# try and validate:
rv.validate()


Conflicting projections in raster data
No sample data defined
No catchment data defined


In [4]:
# display information about the shielding raster (~50 m resolution):
rv.shielding

dtype : shielding
fname : test_data/toposhielding_50m.tif
src   : <closed DatasetReader name='test_data/toposhielding_50m.tif' mode='r'>
epsg  : 32632
res   : (50.005889281507656, 49.98201438848921)

In [5]:
# upload a 35m-resolution shielding raster and repeat validation:
rv.add_raster('toposhielding_35m.tif', dtype='shielding') 
rv.validate()


Raster data valid
No sample data defined
No catchment data defined


### A word of caution

The raster data needs to be a **geotiff** with a valid **projection**, i.e. a **coordinate system** and a **geotransform**. 

In the following example, the file `toposhielding_35m_noproj1.tif` was generated with the TopoToolbox `toposhielding.m` function but without the Matlab Mapping Toolbox, so its georeferencing information is missing. Adding the raster raises a `NotGeoreferencedWarning`. The coordinate system is correctly identified as epsg: 32632 (UTM zone 32N), but the resolution is incorrectly set to the default (1.0, 1.0). Without a valid projection, these data cannot be used.
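
Because a warning is easy to overlook in a batch script, it may be useful to escalate it to an exception. The sketch below shows one way to do this with the standard `warnings` module. The helper `call_strict()`, the class `StandInGeorefWarning` and the function `add_raster_stand_in()` are hypothetical stand-ins so that the example runs without raster files. With real data one would presumably pass `rv.add_raster` and `rasterio.errors.NotGeoreferencedWarning` instead, assuming the warning propagates to the caller as the output of the next cell suggests.

```python
import warnings


def call_strict(func, *args, category=Warning, **kwargs):
    """Call func and raise an exception for any warning of the given category."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", category)
        return func(*args, **kwargs)


# Hypothetical stand-ins, only here to make the sketch self-contained:
class StandInGeorefWarning(UserWarning):
    pass


def add_raster_stand_in(fname):
    if "noproj" in fname:
        warnings.warn("Dataset has no geotransform", StandInGeorefWarning)
    return fname


try:
    call_strict(add_raster_stand_in, "toposhielding_35m_noproj1.tif",
                category=StandInGeorefWarning)
    outcome = "added"
except StandInGeorefWarning as exc:
    outcome = f"rejected: {exc}"
print(outcome)
```

The filter is restored when the `with` block exits, so warnings elsewhere in the script behave as before.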

In [6]:
rv = riversand.Riversand("test_data")
rv.add_raster('dem_utm_35m.tif', dtype='elevation')
rv.add_raster('toposhielding_35m_noproj1.tif', dtype='shielding') 

  dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)


In [7]:
rv

---------------
Raster data:

dtype : elevation
fname : test_data/dem_utm_35m.tif
src   : <closed DatasetReader name='test_data/dem_utm_35m.tif' mode='r'>
epsg  : 32632
res   : (35.0, 35.0)

dtype : shielding
fname : test_data/toposhielding_35m_noproj1.tif
src   : <closed DatasetReader name='test_data/toposhielding_35m_noproj1.tif' mode='r'>
epsg  : 32632
res   : (1.0, 1.0)

In [8]:
# remove a raster from the project
rv.shielding = None
rv

---------------
Raster data:

dtype : elevation
fname : test_data/dem_utm_35m.tif
src   : <closed DatasetReader name='test_data/dem_utm_35m.tif' mode='r'>
epsg  : 32632
res   : (35.0, 35.0)

## 2. Validating the catchment shapefile
The catchment shapefile must have the same projection as the raster data and can only be validated if raster data have been uploaded.

In [9]:
rv = riversand.Riversand("test_data")
rv.add_catchments('test_single_catchment.shp')
rv.validate()


No elevation raster defined
No sample data defined
No valid raster data, cannot validate shapefile projection


In [10]:
rv.add_raster('dem_WGS.tif', dtype='elevation')
rv.validate()


Raster data valid
No sample data defined
Shapefile projection (epsg=32632) does not match raster projection (epsg=4326)


In [11]:
rv.add_raster('dem_utm_35m.tif', dtype='elevation')
rv.validate()


Raster data valid
No sample data defined
Catchment data valid


### 2.1 Validating multi-catchment datasets
A shapefile with more than one polygon is considered a multi-catchment dataset. For such a dataset you must define which attribute field holds the catchment names (the "catchment identifier") with `rv.set_cid()`.

In [12]:
rv = riversand.Riversand("test_data")
rv.add_raster('dem_utm_35m.tif', dtype='elevation')
rv.add_catchments('test_multi_catchment.shp')
rv.validate()


Raster data valid
No sample data defined
No catchment identifier defined; use .set_cid()


In [13]:
# Available attribute fields are listed under 'attrs':
rv.catchments

fname : test_data/test_multi_catchment.shp
src   : <closed Collection 'test_data/test_multi_catchment.shp:test_multi_catchment', mode 'r' at 0x7f4051993ac0>
attrs : ['name', 'id', 'area_km2']
len   : 8
epsg  : 32632

In [14]:
# Set catchment identifier 'id':
rv.set_cid('id')
rv.validate()


Raster data valid
No sample data defined
Catchment data valid

Valid catchments / samples:
   No matches found


If sample data have been added to the project, validation identifies the "valid" catchments (i.e. sample data are available and the catchment name is not duplicated in the shapefile):
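
The matching rule described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the riversand code: the function `find_valid_catchments()` is hypothetical, unnamed polygons are assumed to appear as `None` or an empty string, and the example names are made up.

```python
from collections import Counter


def find_valid_catchments(catchment_names, sample_names):
    """Return catchment names that are named, unique and have sample data."""
    counts = Counter(catchment_names)
    samples = set(sample_names)
    return sorted(
        name for name, n in counts.items()
        if name            # skip unnamed polygons (None or '')
        and n == 1         # skip names that occur more than once
        and name in samples
    )


# Made-up polygon names: one duplicate, one unnamed, one without sample data
polygons = ["DB02", "DB03", "DB03", None, "DB04", "XX01"]
samples = ["DB01", "DB02", "DB03", "DB04"]
valid = find_valid_catchments(polygons, samples)
print(valid)
```

Here 'DB03' is dropped because it appears twice in the shapefile, and 'XX01' because no sample with that name exists.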

In [15]:
rv.add_samples('test_samples.ods')
rv.samples

Unnamed: 0,name,density,shielding,nuclide,N,delN,lat,long,elev
0,DB01,2.7,0.92,Be-10,12900,700,45.804,6.9653,1230
1,DB02,2.7,0.94,Be-10,10800,700,45.7167,7.1101,783
2,DB03,2.7,0.94,Be-10,23500,1400,45.6925,7.1935,699
3,DB04,2.7,0.94,Be-10,22000,1100,45.7003,7.2019,664
4,DB05,2.7,0.95,Be-10,20500,1000,45.7001,7.2337,638
5,DB06,2.7,0.95,Be-10,15400,800,45.5228,7.8375,251
6,DB07,2.7,0.95,Be-10,22500,2600,45.5962,7.7956,325
7,DB08,2.7,0.96,Be-10,48500,2100,45.6118,7.731,373
8,DB12,2.7,0.95,Be-10,12600,800,45.7183,7.2651,594
9,DB17,2.7,0.95,Be-10,27100,1300,45.7039,7.1622,689


In [16]:
rv.validate()


Raster data valid
Sample data valid
Catchment data valid

Valid catchments / samples:
   No matches found


Oops. Use `rv.catchments.get_names()` to show the names of all polygons in the shapefile including duplicates and unnamed catchments.<br>
Use `rv.get_valid_catchments()` to show the names of the "valid" catchments.

In the current example, the catchment names that match the sample names are stored in the attribute field 'name', but we mistakenly set the catchment identifier to 'id'.
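
When it is not obvious which attribute field holds the catchment names, one option is to count how many values of each field coincide with a sample name. The sketch below does this for a list of attribute records. The function `rank_identifier_fields()` and the example records are hypothetical; how the attribute table is read from the shapefile is left open.

```python
def rank_identifier_fields(attribute_rows, sample_names):
    """Rank attribute fields by the number of distinct values matching a sample name."""
    wanted = {str(n) for n in sample_names}
    matches = {}
    for row in attribute_rows:
        for field, value in row.items():
            hits = matches.setdefault(field, set())
            if str(value) in wanted:
                hits.add(str(value))
    # Best candidate first
    return sorted(
        ((field, len(hits)) for field, hits in matches.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Made-up attribute records with the fields listed under 'attrs'
rows = [
    {"name": "DB02", "id": 1, "area_km2": 12.5},
    {"name": "DB03", "id": 2, "area_km2": 48.0},
    {"name": "XX01", "id": 3, "area_km2": 7.3},
]
ranking = rank_identifier_fields(rows, ["DB01", "DB02", "DB03"])
print(ranking)
```

The field at the top of the ranking is the most plausible argument for `rv.set_cid()`.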

In [17]:
# Get all catchment names incl. duplicates and unnamed catchments:
rv.catchments.get_names()

['1', '2', '3', '4', '5', '6', '7', '8']

In [18]:
# Check which catchment identifier is currently set:
rv.cid

'id'

In [19]:
# Fix the problem:
rv.set_cid('name')
rv.validate()


Raster data valid
Sample data valid
Catchment data valid

Valid catchments / samples:
   Found 5 match(es)


In [20]:
# Show the valid catchments:
rv.get_valid_catchments()

['DB02', 'DB03', 'DB04', 'DB05', 'DB17']

## 3. Sample data
Columns that are recognized (processed) by the calculator are `name`, `press_flag`, `thickness`, `density`, `shielding`, `erate`, `year`, `nuclide`, `mineral`, `N`, `delN` and `standardization`<br>(see http://hess.ess.washington.edu/math/docs/v3/v3_input_explained.html).

Mandatory columns are:
- `name` : Can include letters, numbers and hyphens; avoid names that may be misinterpreted as numbers: use 'A2.1' or 'A2-1' instead of '2.1'.
- `N` and `delN` : Nuclide concentration and its uncertainty in atoms per gram of quartz.

Optional columns are:
- `press_flag` : 'std' or 'ant'; the default is 'std'.
- `density` : The substrate density in g/cm3; the default is 2.65.
- `year` : The year of sampling; the default is 2010.
- `nuclide` : 'Be-10' or 'Al-26'; the default is 'Be-10'.
- `shielding` : A catchmentwide shielding factor; the default is 1. The values are ignored if shielding is calculated from a raster dataset.

Default values (see `riversand.params.default_values`) are used if a column is missing. Columns `thickness`, `erate` and `mineral` (only valid value: 'quartz') are irrelevant for the calculation of catchmentwide erosion rates, but if they are present they must contain valid values. All additional columns are ignored.

The `name` is used to match samples to catchment polygons. If several samples were measured from the same location, or if both Al-26 and Be-10 were measured, the data must be in separate rows with the same sample name.
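
Because matching relies on the `name` column, it can help to check before upload whether a name could be read as a number. The following sketch uses Python's own `float()` conversion as the test. The function `looks_like_number()` is hypothetical, and a spreadsheet reader may apply slightly different rules.

```python
def looks_like_number(name):
    """Return True if the name would parse as a number (e.g. '2.1' or '1e5')."""
    try:
        float(str(name))
    except ValueError:
        return False
    return True


names = ["2.1", "A2.1", "A2-1", "1e5"]
risky = [n for n in names if looks_like_number(n)]
print(risky)
```

Note that `float()` also accepts strings such as 'nan' and 'inf', so these would be flagged as well.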

The calculator assumes **standardizations** of '07KNSTD' for Be-10 and 'KNSTD' for Al-26 data. If your samples have been measured against a different standard you can use the following correction factors to restandardize your data: http://hess.ess.washington.edu/math/docs/al_be_v22/AlBe_standardization_table.pdf (see also
http://hess.ess.washington.edu/math/docs/al_be_v22/standard_names.html)
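
A restandardization amounts to multiplying both the concentration and its uncertainty by the same correction factor. The sketch below assumes exactly that. The function `restandardize()` is hypothetical, and the factor in the example is a placeholder, not a value from the table linked above; look up the factor for your own standard.

```python
def restandardize(sample, factor, target):
    """Return a copy of the sample with N and delN scaled to the target standard."""
    converted = dict(sample)
    converted["N"] = sample["N"] * factor
    converted["delN"] = sample["delN"] * factor
    converted["standardization"] = target
    return converted


PLACEHOLDER_FACTOR = 1.05  # placeholder only, NOT a real correction factor

measured = {"N": 1000.0, "delN": 100.0, "nuclide": "Be-10"}
converted = restandardize(measured, PLACEHOLDER_FACTOR, "07KNSTD")
print(converted)
```

The original dictionary is left unchanged, so the measured values remain available for reference.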

Sample data can be added manually from a python dictionary or uploaded from a spreadsheet:

In [21]:
# Add from a python dictionary (mandatory keys 'N' and 'delN'):
rv.add_samples({'N': 1e5, 'delN': 1e3}) # default values for all other parameters
rv.samples

Unnamed: 0,name,press_flag,thickness,density,shielding,erate,year,nuclide,mineral,N,delN,standardization
0,Test,std,0,2.65,1.0,0,2010,Be-10,quartz,100000,1000,07KNSTD


In [22]:
# Add a sample to an existing dataset:
rv.add_samples({'N': 6e5, 'delN': 6e3, 'nuclide': 'Al-26'}, add=True)
rv.samples

Unnamed: 0,name,press_flag,thickness,density,shielding,erate,year,nuclide,mineral,N,delN,standardization
0,Test,std,0,2.65,1.0,0,2010,Be-10,quartz,100000,1000,07KNSTD
1,Test,std,0,2.65,1.0,0,2010,Al-26,quartz,600000,6000,KNSTD


There is some validation when sample data are added:

In [23]:
rv.add_samples({'N': 6e5, 'delN': 6e3, 'nuclide': 'Cl-36'}, add=True)
rv.samples

Error adding sample data from dictionary:
   Invalid sample data: Illegal nuclide


Unnamed: 0,name,press_flag,thickness,density,shielding,erate,year,nuclide,mineral,N,delN,standardization
0,Test,std,0,2.65,1.0,0,2010,Be-10,quartz,100000,1000,07KNSTD
1,Test,std,0,2.65,1.0,0,2010,Al-26,quartz,600000,6000,KNSTD


Remember that a re-standardization is not performed by the calculator:

In [24]:
rv.add_samples({'N': 95923, 'delN': 959, 'nuclide': 'Be-10', 'standardization': 'NIST_Certified'}, add=True)
rv.samples

Error adding sample data from dictionary:
   Invalid sample data: Illegal standardization


Unnamed: 0,name,press_flag,thickness,density,shielding,erate,year,nuclide,mineral,N,delN,standardization
0,Test,std,0,2.65,1.0,0,2010,Be-10,quartz,100000,1000,07KNSTD
1,Test,std,0,2.65,1.0,0,2010,Al-26,quartz,600000,6000,KNSTD


In [25]:
# Show which default values are used for parameters that are not specified:
riversand.params.default_values

{'name': 'Test',
 'press_flag': 'std',
 'thickness': 0,
 'density': 2.65,
 'shielding': 1,
 'erate': 0,
 'year': 2010,
 'nuclide': 'Be-10',
 'mineral': 'quartz'}
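
To make the role of the default values concrete, the following sketch completes a sample dictionary in plain Python. It is not the riversand implementation. The function `complete_sample()` is hypothetical; the defaults are copied from the output above, and the standardization per nuclide follows the assumption stated in section 3.

```python
DEFAULTS = {
    "name": "Test", "press_flag": "std", "thickness": 0, "density": 2.65,
    "shielding": 1, "erate": 0, "year": 2010, "nuclide": "Be-10",
    "mineral": "quartz",
}
STANDARDIZATION = {"Be-10": "07KNSTD", "Al-26": "KNSTD"}


def complete_sample(sample):
    """Fill missing keys with defaults and run two basic checks."""
    for key in ("N", "delN"):
        if key not in sample:
            raise ValueError(f"Missing mandatory key: {key}")
    record = {**DEFAULTS, **sample}  # user-supplied values take precedence
    if record["nuclide"] not in STANDARDIZATION:
        raise ValueError("Illegal nuclide")
    # Only set the standardization if the user did not supply one
    record.setdefault("standardization", STANDARDIZATION[record["nuclide"]])
    return record


al_sample = complete_sample({"N": 6e5, "delN": 6e3, "nuclide": "Al-26"})
print(al_sample)
```

Unlike `rv.add_samples()`, this sketch does not check whether a user-supplied standardization is legal.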