**Summary**

First, prepare the dataset to be enriched.
It can be your own DarwinCore archive or one downloaded from GBIF.
This step creates an enrichment file, to which data will be added later on.

Next, pick a variable id from the *catalog.csv* file. If the variable you want is not listed, you can add new rows to *catalog.csv*.
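Before choosing a variable id, it can help to list the ids the catalog already contains. The snippet below is a minimal sketch using only the standard library. The column names (`variable`, `url`, `varname`) and the sample rows are assumptions made for illustration, and the helper `list_variable_ids` is hypothetical, not part of geoenrich. Check the header of your own *catalog.csv* before reusing it.

```python
import csv
import io

# Hypothetical catalog contents: the real catalog.csv may use
# different column names and will list different datasets.
SAMPLE_CATALOG = """variable,url,varname
sst,https://example.org/sst,analysed_sst
chlorophyll,https://example.org/chl,CHL
"""


def list_variable_ids(catalog_file, id_column="variable"):
    """Return the value of id_column for every row of an open CSV file."""
    reader = csv.DictReader(catalog_file)
    return [row[id_column] for row in reader]


# An in-memory file keeps the example self-contained; with a real
# catalog, pass open('catalog.csv', newline='') instead.
available = list_variable_ids(io.StringIO(SAMPLE_CATALOG))
print(available)
print("sst" in available)
```

If the id you need is missing, that is the cue to add a row to the catalog before moving on to the enrichment step.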


# Download from GBIF

In [None]:
from geoenrich.biodiv import *

Get the GBIF id of the taxon of interest.

In [2]:
taxkey = get_taxon_key('Physeter macrocephalus')

Selected taxon: SPECIES: Physeter macrocephalus Linnaeus, 1758


Request an archive with all occurrences of this taxon.

In [4]:
request_id = request_from_gbif(taxkey)

Request already made on 2022-03-18T07:38:46.054+00:00
Run again with override = True to force new request.
Request already made on 2022-03-17T11:08:58.123+00:00
Run again with override = True to force new request.


Download the requested archive. For large requests, you may have to wait until the archive is ready.

In [5]:
download_requested(request_key = request_id, path = biodiv_path + 'gbif')

Request already made on 2022-03-18T07:38:46.054+00:00
Run again with override = True to force new request.
Request already made on 2022-03-17T11:08:58.123+00:00
Run again with override = True to force new request.


INFO:Download file size: 14259372 bytes
INFO:On disk at /media/Data/Data/biodiv/gbif/0186865-210914110416597.zip
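Because large requests need some waiting time before the archive is ready, one option is to retry at fixed intervals rather than rerunning the cell by hand. The following is a generic polling sketch, not geoenrich's own mechanism: `wait_until_ready` and the stand-in `fake_check` are hypothetical names, and how you test readiness (for example, by attempting the download and catching a failure) is left as an assumption.

```python
import time


def wait_until_ready(is_ready, poll_seconds=60.0, max_attempts=30, sleep=time.sleep):
    """Call is_ready() until it returns True; return the number of calls made.

    Raises TimeoutError if max_attempts calls all return False.
    """
    for attempt in range(1, max_attempts + 1):
        if is_ready():
            return attempt
        if attempt < max_attempts:
            sleep(poll_seconds)  # pause before the next check
    raise TimeoutError(f"not ready after {max_attempts} attempts")


# Stand-in readiness check that succeeds on the third call.
# Replace it with a real check for your download.
calls = {"n": 0}


def fake_check():
    calls["n"] += 1
    return calls["n"] >= 3


attempts_used = wait_until_ready(fake_check, poll_seconds=0.0)
print(attempts_used)
```

The `sleep` parameter is injectable so that the loop can be exercised without real delays.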


Prepare enrichment file from the GBIF data.

Any DarwinCore archive may be used instead.

In [6]:
geodf = open_dwca(biodiv_path + 'gbif/' + str(taxkey) + '.zip')
save_file_for_enrichment(geodf, filename = 'gbif_' + str(taxkey))

Dropped 205 rows related to non-living occurrences
Dropped 1083 rows with missing event date
Selected 10000 random occurrences from the dataset
10000 occurrences were loaded.
File saved at /media/Data/Data/biodiv/gbif_8123917.csv


# Enrich

In [None]:
from geoenrich.enrichment import enrich

Define enrichment scope

In [2]:
taxkey = get_taxon_key('Istiompax indica')
var_id = 'sst'
dataset_ref = 'gbif_' + str(taxkey)

Selected taxon: SPECIES: Istiompax indica (Cuvier, 1832)


Enrich the dataset.

Start with a small slice to check the speed.

In [64]:
enrich(dataset_ref, var_id, slice = (0, 100))

710 occurrences were loaded from enrichment file


100%|█████████████████████████████████████████| 710/710 [06:36<00:00,  1.79it/s]


Enrichment over
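The progress bar above can be used to estimate how long a larger run would take. This is a back-of-the-envelope sketch that assumes throughput stays constant, which may not hold if download speed or server load varies. The helper names `parse_elapsed` and `estimate_seconds` are illustrative, and the 10,000-occurrence target is a hypothetical dataset size.

```python
def parse_elapsed(text):
    """Convert a 'MM:SS' or 'HH:MM:SS' string into a number of seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def estimate_seconds(done, elapsed, total):
    """Extrapolate the run time for `total` items from a finished sample."""
    rate = done / parse_elapsed(elapsed)  # items per second
    return total / rate


# Figures taken from the progress bar: 710 occurrences in 06:36.
rate = 710 / parse_elapsed("06:36")
full_run = estimate_seconds(710, "06:36", 10000)

print(f"{rate:.2f} occurrences per second")
print(f"about {full_run / 60:.0f} minutes for 10000 occurrences")
```

An estimate like this helps decide whether to enrich the whole dataset in one go or to process it in several slices.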


# Data retrieval

In [None]:
from geoenrich.enrichment import enrichment_status, read_ids, retrieve_data

Check the enrichment status of the dataset.

In [None]:
enrichment_status(dataset_ref)

Request data from local storage for the first row of our dataset.

In [None]:
ids = read_ids(dataset_ref)
output = retrieve_data(dataset_ref, ids[0])