**Summary**

In this tutorial, we will enrich beluga whale (*Delphinapterus leucas*) observations with bathymetry data.

You will first prepare the dataset to be enriched.
It can be your own DarwinCore archive, or one downloaded from GBIF.
An enrichment file is then created; the environmental data is added to it later on.

Afterwards, you can pick a variable id from the [catalog file](geoenrich/catalog.csv). If the variable you want is not available, feel free to update the *catalog.csv* file with new rows (see [installation instructions](https://geoenrich.readthedocs.io/en/latest/install.html)).
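
To browse the available variable ids, the catalog can be read like any other CSV file. The sketch below runs on a tiny in-memory stand-in: the column names (`variable`, `source`, `url`), the rows, and the helper `list_variable_ids` are all assumptions made for illustration, so check the header of your own *catalog.csv* before reusing it.

```python
import csv
import io

# Hypothetical catalog content; the real catalog.csv may use different columns.
SAMPLE_CATALOG = """variable,source,url
bathymetry,example-source,https://example.org/bathymetry
sst,example-source,https://example.org/sst
"""

def list_variable_ids(csv_file, id_column="variable"):
    """Return the values of the id column from an open CSV file."""
    reader = csv.DictReader(csv_file)
    return [row[id_column] for row in reader]

variable_ids = list_variable_ids(io.StringIO(SAMPLE_CATALOG))
print(variable_ids)
```

To use it on a real file, pass an open file handle instead of the `io.StringIO` object and adjust `id_column` to match the actual header.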


# Download from GBIF

In [5]:
from geoenrich.biodiv import *

#### Get GBIF id for the taxon of interest

In [2]:
taxKey = get_taxon_key('Delphinapterus leucas')

Selected taxon: SPECIES: Delphinapterus leucas (Pallas, 1776)


#### Request an archive with all occurrences of this taxon

In [3]:
request_id = request_from_gbif(taxKey)

INFO:Your download key is 0201232-210914110416597


#### Download request

For large requests, the archive may take some time to be ready, so you may have to wait before running this step.

In [6]:
download_requested(request_key = request_id)

INFO:Download file size: 1621631 bytes
INFO:On disk at /media/Data/Data/biodiv/gbif/0201232-210914110416597.zip
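
Since a large archive may not be ready right away, one option is to retry the download after a pause. The helper below is a generic sketch and not part of geoenrich: `retry_until_ready` and its parameters are hypothetical, and it assumes that the download call raises an exception while the archive is still being prepared.

```python
import time

def retry_until_ready(action, attempts=5, wait_seconds=60, sleep=time.sleep):
    """Call `action` until it succeeds or `attempts` is exhausted.

    Assumes `action` raises an exception while the archive is not ready.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # broad on purpose: the failure type is unknown
            last_error = error
            if attempt < attempts:
                sleep(wait_seconds)
    raise RuntimeError(f"Not ready after {attempts} attempts") from last_error

# Stand-in for a download that only succeeds on the third call
calls = {"count": 0}

def fake_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("archive not ready")
    return "archive.zip"

result = retry_until_ready(fake_download, attempts=5, wait_seconds=0)
print(result, calls["count"])
```

In the notebook, the same helper could wrap the real call, for instance `retry_until_ready(lambda: download_requested(request_key = request_id))`, provided the assumption about exceptions holds.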


# Load occurrence data and create enrichment file

In [None]:
import os
import geoenrich
from geoenrich.biodiv import *
from geoenrich.enrichment import create_enrichment_file

#### If data was downloaded from GBIF

In [None]:
geodf = open_dwca(taxonKey = taxKey)

#### If you are using your own dataset (DarwinCore format)

In [None]:
# A DarwinCore archive is bundled into the package for user testing
example_path = os.path.split(geoenrich.__file__)[0] + '/data/AcDigitifera.zip'
geodf = open_dwca(path = example_path)

#### If you are using your own dataset (csv format)

In [None]:
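# NB: import_csv most likely needs arguments describing your file (e.g. its path);
# check the function's documentation before running this cell
geodf = import_csv()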

#### Create enrichment file

In [None]:
dataset_ref = 'your_dataset_name'
create_enrichment_file(geodf, dataset_ref)

# Enrich

In [4]:
from geoenrich.enrichment import enrich
from geoenrich.biodiv import get_taxon_key

#### Define enrichment scope

In [5]:
var_id = 'bathymetry'
dataset_ref = 'your_dataset_name'

geo_buff = 115       # kilometers
time_buff = (-7, 0)  # Download data from 7 days before occurrence date to occurrence date
                     # (for datasets where time is a dimension)
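
To get a feel for what these buffers cover, the sketch below converts them into a date window and an approximate bounding box. It is only an illustration, not how geoenrich builds its requests: `time_window` and `approx_bbox` are hypothetical helpers, the date and coordinates are arbitrary, the buffer is assumed to extend `geo_buff` kilometers on each side of the point, and the conversion assumes about 111 km per degree of latitude.

```python
import math
from datetime import date, timedelta

KM_PER_DEGREE_LAT = 111.0  # rough spherical-Earth approximation

def time_window(occurrence_date, time_buff):
    """Return (start, end) dates for a (days_before, days_after) buffer."""
    start = occurrence_date + timedelta(days=time_buff[0])
    end = occurrence_date + timedelta(days=time_buff[1])
    return start, end

def approx_bbox(lat, lon, geo_buff_km):
    """Return (lat_min, lat_max, lon_min, lon_max) around a point."""
    dlat = geo_buff_km / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude
    dlon = geo_buff_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon

# Arbitrary example occurrence: date and position are made up
start, end = time_window(date(2021, 6, 15), (-7, 0))
bbox = approx_bbox(60.0, -70.0, 115)
print(start, end)
print(bbox)
```

At high latitudes the longitude span grows quickly for a fixed distance, and this approximation breaks down close to the poles.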



#### Enrich

Only enrich a small slice first to check speed.

In [9]:
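# Keyword arguments must come after positional ones, so the buffers are passed by name
enrich(dataset_ref, var_id, geo_buff = geo_buff, time_buff = time_buff, slice = (0, 100))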

6632 occurrences were loaded from enrichment file


100%|█████████████████████████████████████████| 100/100 [02:02<00:00,  1.23s/it]

Enrichment over
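
The rate reported by the progress bar gives a rough idea of how long the full dataset would take. The sketch below extrapolates from the numbers of this run (1.23 s per occurrence, 6632 occurrences) and generates slice bounds for working through the dataset in batches. `estimate_seconds` and `slice_bounds` are hypothetical helpers, the batch size of 1000 is arbitrary, and the estimate assumes the rate stays constant.

```python
def estimate_seconds(n_occurrences, seconds_per_item):
    """Extrapolate the total run time, assuming a constant rate."""
    return n_occurrences * seconds_per_item

def slice_bounds(n_occurrences, batch_size):
    """Yield (start, end) pairs that together cover range(n_occurrences)."""
    for start in range(0, n_occurrences, batch_size):
        yield (start, min(start + batch_size, n_occurrences))

total = estimate_seconds(6632, 1.23)
print(f"Estimated total: {total / 3600:.1f} hours")

batches = list(slice_bounds(6632, 1000))
print(batches[0], batches[-1], len(batches))
```

Each pair could then be passed as the `slice` argument of `enrich`, assuming the end index is exclusive as in Python slicing.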





# Data retrieval

In [37]:
from geoenrich.enrichment import enrichment_status, read_ids, retrieve_data

dataset_ref = 'your_dataset_name'



#### Check the enrichment status of the dataset.

In [11]:
enrichment_status(dataset_ref)

| | bathymetry |
|---|---|
| Enriched | 100 |
| Not enriched | 6532 |
| Data not available | 0 |



#### Request data from local storage for the first row of our dataset.

In [8]:
ids = read_ids(dataset_ref)
output = retrieve_data(dataset_ref, ids[0], shape = 'rectangle')

#### Unpack and plot data

In [38]:
var_id = 'bathymetry'

data = output[var_id]['values']
unit = output[var_id]['unit']
coords = output[var_id]['coords']

In [42]:
from matplotlib import pyplot as plt
%matplotlib notebook

# Get latitude and longitude values for the requested data
lat_dim = [c[0] for c in coords].index('latitude')
lon_dim = [c[0] for c in coords].index('longitude')
lats = coords[lat_dim][1]
longs = coords[lon_dim][1]

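# Plot
# origin = 'lower' places the first row of data at lats[0], consistent with the extent
# (this assumes data is indexed as [latitude, longitude])
extent = [longs[0] , longs[-1], lats[0] , lats[-1]]
plt.imshow(data, extent = extent, origin = 'lower')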
plt.title(var_id + ' (' + unit + ')')
plt.colorbar()

# NB: If your data has time or depth dimensions, you will have to pick a slice of the data array to be able to plot it
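
For a variable with extra dimensions, one way to obtain a plottable 2D array is to select a single index along each of them. The sketch below uses a synthetic array and the same `coords` layout as above (a list of `[name, values]` pairs); the dimension names `time` and `depth` and the helper `select_2d` are assumptions made for illustration.

```python
import numpy as np

def select_2d(data, coords, keep=("latitude", "longitude"), index=0):
    """Take `index` along every dimension not listed in `keep`."""
    names = [c[0] for c in coords]
    # Walk the axes from last to first so earlier axis numbers stay valid
    for axis in reversed(range(len(names))):
        if names[axis] not in keep:
            data = np.take(data, index, axis=axis)
    return data

# Synthetic example: 3 time steps, 2 depths, 4 latitudes, 5 longitudes
coords = [["time", [0, 1, 2]], ["depth", [0, 10]],
          ["latitude", [60, 61, 62, 63]],
          ["longitude", [-70, -69, -68, -67, -66]]]
data = np.arange(3 * 2 * 4 * 5).reshape(3, 2, 4, 5)

first_slice = select_2d(data, coords)
print(first_slice.shape)
```

The resulting array has one row per latitude and one column per longitude, so it can be passed to `plt.imshow` in the same way as the bathymetry data.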
