# Basic manipulation of `pylipd.LiPD` object

## Authors

by [Deborah Khider](https://orcid.org/0000-0001-7501-8430)

## Preamble

### Goals:

* Extract a time series for data analysis
* Remove/pop LiPD datasets from an existing `LiPD` object

Reading Time: 5 minutes

### Keywords

LiPD; query

### Pre-requisites

None. This tutorial assumes basic knowledge of Python and Pandas. If you are not familiar with this coding language and the Pandas library, check out this tutorial: http://linked.earth/ec_workshops_py/.

### Relevant Packages

Pandas, pylipd

## Data Description

This notebook uses the following datasets, in LiPD format:

* Nurhati, I. S., Cobb, K. M., & Di Lorenzo, E. (2011). Decadal-scale SST and salinity variations in the central tropical Pacific: Signatures of natural and anthropogenic climate change. Journal of Climate, 24(13), 3294–3308. doi:10.1175/2011jcli3852.1

* Euro2k database: PAGES2k Consortium., Emile-Geay, J., McKay, N. et al. A global multiproxy database for temperature reconstructions of the Common Era. Sci Data 4, 170088 (2017). doi:10.1038/sdata.2017.88

In [1]:
from pylipd.lipd import LiPD

## Demonstration

### Extract time series data from LiPD formatted datasets

If you are familiar with the [R utilities](https://nickmckay.github.io/lipdR/), one useful function is the ability to expand timeseries data. This capability was also present in the previous iteration of the Python utilities, and `PyLiPD` retains this compatibility to ease the transition.

If you're unsure about what a time series is in the LiPD context, [read more](https://nickmckay.github.io/lipdR/#helptimeseries). 

#### Working with one dataset

First, let's load a single dataset:

In [2]:
data_path = '../data/Ocn-Palmyra.Nurhati.2011.lpd'
D = LiPD()
D.load(data_path)

Loading 1 LiPD files

100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 23.15it/s]

Loaded..

Now let's get all the timeseries for this dataset. Note that the [`get_timeseries`](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.get_timeseries) function requires you to pass the dataset names. This is useful if you want to expand only one dataset from your LiPD object. To expand all datasets, pass the output of the function [`get_all_dataset_names`](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.get_all_dataset_names) to the call:

In [3]:
ts_list = D.get_timeseries(D.get_all_dataset_names())

type(ts_list)

Extracting timeseries from dataset: Ocn-Palmyra.Nurhati.2011 ...




dict

Note that the above function returns a dictionary that organizes the extracted timeseries by dataset name:

In [4]:
ts_list.keys()

dict_keys(['Ocn-Palmyra.Nurhati.2011'])

The timeseries for each dataset are then stored in a list of dictionaries, each of which preserves the essential metadata for one time/depth and value pair:

In [5]:
type(ts_list['Ocn-Palmyra.Nurhati.2011'])

list
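
To make this nesting concrete, here is a minimal sketch that walks a hand-built structure of the same shape: a dictionary keyed by dataset name, each entry holding a list of per-variable dictionaries. The dataset name, the values, and the helper `variable_names` are invented for illustration; only the keys `paleoData_variableName` and `paleoData_values` come from this tutorial.

```python
# Toy stand-in for the object returned by get_timeseries (all values are invented).
toy_ts = {
    "Toy-Dataset.Author.2000": [
        {"paleoData_variableName": "d18O", "paleoData_values": [-5.4, -5.5]},
        {"paleoData_variableName": "Sr_Ca", "paleoData_values": [8.96, 8.90]},
    ]
}


def variable_names(ts_by_dataset):
    """Map each dataset name to the variable names of its timeseries (hypothetical helper)."""
    return {
        name: [ts.get("paleoData_variableName") for ts in ts_entries]
        for name, ts_entries in ts_by_dataset.items()
    }


names = variable_names(toy_ts)
print(names)
```

Even for this tiny example, answering a simple question requires looping over two levels of nesting, which is why a tabular view is more convenient.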

Although the information is present, it is not easy to navigate or query across the various lists. One simple way of doing so is to return the timeseries as a [`Pandas.DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html):

In [6]:
ts_list, df = D.get_timeseries(D.get_all_dataset_names(), to_dataframe=True)

df

Extracting timeseries from dataset: Ocn-Palmyra.Nurhati.2011 ...


Unnamed: 0,mode,time_id,hasUrl,createdBy,geo_meanLon,geo_meanLat,geo_meanElev,geo_type,geo_siteName,geo_pages2kRegion,...,paleoData_description,paleoData_dataType,paleoData_inferredVariableType,paleoData_notes,paleoData_interpretation,paleoData_qCCertification,paleoData_iso2kUI,paleoData_ocean2kID,paleoData_pages2kID,paleoData_inCompilation
0,paleoData,age,https://data.mint.isi.edu/files/lipd/Ocn-Palmy...,matlab,-162.13,5.87,-10.0,Feature,Palmyra,Ocean,...,,,,,,,,,,
1,paleoData,age,https://data.mint.isi.edu/files/lipd/Ocn-Palmy...,matlab,-162.13,5.87,-10.0,Feature,Palmyra,Ocean,...,Year AD,float,Year,,,,,,,
2,paleoData,age,https://data.mint.isi.edu/files/lipd/Ocn-Palmy...,matlab,-162.13,5.87,-10.0,Feature,Palmyra,Ocean,...,Year AD,float,Year,,,,,,,
3,paleoData,age,https://data.mint.isi.edu/files/lipd/Ocn-Palmy...,matlab,-162.13,5.87,-10.0,Feature,Palmyra,Ocean,...,,,,; climateInterpretation_seasonality changed - ...,[{'seasonality': 'not applicable (subannually ...,"MNE, NJA",CO11NUPM01B,,,
4,paleoData,age,https://data.mint.isi.edu/files/lipd/Ocn-Palmy...,matlab,-162.13,5.87,-10.0,Feature,Palmyra,Ocean,...,,,,; paleoData_variableName changed - was origina...,"[{'seasonality': 'N/A (subannually resolved)',...","MNE, NJA",CO11NUPM01BT1,PacificNurhati2011,Ocn_129,Ocean2k_v1.0.0


You can now use all the pandas functionalities for filtering and querying dataframes. First, let's have a look at the available properties, which correspond to the column headers:

In [7]:
df.columns

Index(['mode', 'time_id', 'hasUrl', 'createdBy', 'geo_meanLon', 'geo_meanLat',
       'geo_meanElev', 'geo_type', 'geo_siteName', 'geo_pages2kRegion',
       'geo_ocean', 'pub1_author', 'pub1_dataUrl', 'pub1_citeKey', 'pub1_link',
       'pub1_journal', 'pub1_issue', 'pub1_publisher', 'pub1_volume',
       'pub1_doi', 'pub1_title', 'pub1_pages', 'pub1_year', 'pub1_DOI',
       'pub2_author', 'pub2_citeKey', 'pub2_title', 'pub2_url',
       'pub2_institution', 'pub2_urldate', 'pub2_DOI', 'pub3_author',
       'pub3_volume', 'pub3_journal', 'pub3_year', 'pub3_citeKey',
       'pub3_dataUrl', 'pub3_title', 'pub3_publisher', 'pub3_link',
       'pub3_pages', 'pub3_doi', 'pub3_DOI', 'googleMetadataWorksheet',
       'googleDataURL', 'dataContributor', 'lipdVersion', 'dataSetName',
       'googleSpreadSheetKey', 'studyName', 'originalDataURL', 'archiveType',
       'tableType', 'paleoData_paleoDataTableName', 'paleoData_filename',
       'paleoData_googleWorkSheetKey', 'paleoData_measurement

Let's have a look at the `paleoData_variableName` column to see what's available:

In [8]:
df['paleoData_variableName']

0     d18O
1     year
2     year
3     d18O
4    Sr_Ca
Name: paleoData_variableName, dtype: object

All columns get extracted, which is why `year` appears as a paleo variable, with its associated values stored in `paleoData_values`. Notice that there are also two variables named `d18O`. Since this is a coral record, it stands to reason that one corresponds to the measured $\delta^{18}O$ of the coral and the other to the $\delta^{18}O$ of the seawater. Let's have a look at the `qCnotes` field:

In [9]:
df[['paleoData_variableName','paleoData_qCnotes']]

| | paleoData_variableName | paleoData_qCnotes |
|---|---|---|
| 0 | d18O | d18Osw (residuals calculated from coupled SrCa... |
| 1 | year | |
| 2 | year | |
| 3 | d18O | Duplicate of modern d18O record presented in C... |
| 4 | Sr_Ca | |


In fact, one is for the measurement on the coral and the other for seawater. Querying on this small dataset is not necessary; however, it can become useful when looking at a collection of files as shown in the next example (working with multiple datasets). 
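
On a larger collection, the same distinction can be made programmatically rather than by eye. The sketch below is one possible approach, run on a toy dataframe with shortened, invented notes: it combines a match on the variable name with a text search in `paleoData_qCnotes`. The search term `d18Osw` is taken from the note displayed above; whether other datasets use the same wording is an assumption you would need to check.

```python
import pandas as pd

# Toy frame mimicking the two columns shown above (notes are invented and shortened).
toy = pd.DataFrame({
    "paleoData_variableName": ["d18O", "year", "d18O", "Sr_Ca"],
    "paleoData_qCnotes": ["d18Osw residuals", None, "coral d18O record", None],
})

# Rows named d18O whose notes mention seawater (d18Osw).
# na=False treats missing notes as "no match" instead of propagating NaN.
is_d18o = toy["paleoData_variableName"] == "d18O"
mentions_sw = toy["paleoData_qCnotes"].str.contains("d18Osw", na=False)
seawater = toy[is_d18o & mentions_sw]

print(seawater.index.tolist())
```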

To extract by row index (here the first row, which holds one of the `d18O` timeseries):

In [10]:
df_cut = df.iloc[0,:]

df_cut

mode                                                                 paleoData
time_id                                                                    age
hasUrl                       https://data.mint.isi.edu/files/lipd/Ocn-Palmy...
createdBy                                                               matlab
geo_meanLon                                                            -162.13
                                                   ...                        
paleoData_qCCertification                                                  NaN
paleoData_iso2kUI                                                          NaN
paleoData_ocean2kID                                                        NaN
paleoData_pages2kID                                                        NaN
paleoData_inCompilation                                                    NaN
Name: 0, Length: 93, dtype: object

In [11]:
df_cut['paleoData_variableName']

'd18O'

This can be very useful when working with the [`Pyleoclim`](https://pyleoclim-util.readthedocs.io/en/master/) software since a `Pyleoclim.Series` can be initialized from the information contained in `df_cut`. Working with `PyLiPD` and `Pyleoclim` is a subject of [several tutorials](http://linked.earth/PyleoTutorials/intro.html).

Working with such a large dataframe can be overwhelming and is not needed in some cases. Therefore, `PyLiPD` has a nifty function called `get_timeseries_essentials` that grabs information about the dataset, its geographical location, the time/depth values, and the variable information, including archive and proxy:

In [12]:
df_essential = D.get_timeseries_essentials()

df_essential

Unnamed: 0,dataSetName,archiveType,geo_meanLat,geo_meanLon,geo_meanElev,paleoData_variableName,paleoData_values,paleoData_units,paleoData_proxy,paleoData_proxyGeneral,time_variableName,time_values,time_units,depth_variableName,depth_values,depth_units
0,Ocn-Palmyra.Nurhati.2011,coral,5.87,-162.13,-10.0,year,"[1998.21, 1998.13, 1998.04, 1997.96, 1997.88, ...",AD,,,year,"[1998.21, 1998.13, 1998.04, 1997.96, 1997.88, ...",AD,,,
1,Ocn-Palmyra.Nurhati.2011,coral,5.87,-162.13,-10.0,year,"[1998.29, 1998.21, 1998.13, 1998.04, 1997.96, ...",AD,,,year,"[1998.29, 1998.21, 1998.13, 1998.04, 1997.96, ...",AD,,,
2,Ocn-Palmyra.Nurhati.2011,coral,5.87,-162.13,-10.0,d18O,"[-5.41, -5.47, -5.49, -5.43, -5.48, -5.53, -5....",permil,d18O,,year,"[1998.29, 1998.21, 1998.13, 1998.04, 1997.96, ...",AD,,,
3,Ocn-Palmyra.Nurhati.2011,coral,5.87,-162.13,-10.0,Sr_Ca,"[8.96, 8.9, 8.91, 8.94, 8.92, 8.89, 8.87, 8.81...",mmol/mol,Sr/Ca,,year,"[1998.29, 1998.21, 1998.13, 1998.04, 1997.96, ...",AD,,,
4,Ocn-Palmyra.Nurhati.2011,coral,5.87,-162.13,-10.0,d18O,"[0.39, 0.35, 0.35, 0.35, 0.36, 0.22, 0.33, 0.3...",permil,d18O,,year,"[1998.21, 1998.13, 1998.04, 1997.96, 1997.88, ...",AD,,,


The metadata (i.e., the column names) available through this function will always remain the same and are as follows:

In [13]:
df_essential.columns

Index(['dataSetName', 'archiveType', 'geo_meanLat', 'geo_meanLon',
       'geo_meanElev', 'paleoData_variableName', 'paleoData_values',
       'paleoData_units', 'paleoData_proxy', 'paleoData_proxyGeneral',
       'time_variableName', 'time_values', 'time_units', 'depth_variableName',
       'depth_values', 'depth_units'],
      dtype='object')
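
Because each row stores its time axis and its values as two separate lists, a common next step is to pair them up. The following is a minimal sketch on a toy dataframe that reuses the column names `time_values` and `paleoData_values` listed above; the numbers are invented, and dropping missing values and sorting in ascending time are choices made for this example rather than something `PyLiPD` does for you.

```python
import pandas as pd

# Toy dataframe with list-valued cells, using the essential column names (values invented).
toy = pd.DataFrame({
    "paleoData_variableName": ["d18O", "Sr_Ca"],
    "paleoData_values": [[-5.41, float("nan"), -5.49], [8.96, 8.90, 8.91]],
    "time_values": [[1998.21, 1998.13, 1998.04], [1998.21, 1998.13, 1998.04]],
})

# Select one timeseries by position, as done earlier with iloc.
row = toy.iloc[0]

# Pair time and values, drop missing values, and sort in ascending time.
series = (
    pd.DataFrame({"time": row["time_values"], "value": row["paleoData_values"]})
    .dropna()
    .sort_values("time")
    .reset_index(drop=True)
)

print(series)
```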

#### Working with multiple datasets

In [14]:
path = '../data/Euro2k/'

D_dir = LiPD()
D_dir.load_from_dir(path)

Loading 31 LiPD files

100%|█████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 49.98it/s]

Loaded..

Let's expand into our essential dataframe:

In [15]:
df_dir = D_dir.get_timeseries_essentials()

Let's have a look at the dataframe:

In [16]:
df_dir.head()

Unnamed: 0,dataSetName,archiveType,geo_meanLat,geo_meanLon,geo_meanElev,paleoData_variableName,paleoData_values,paleoData_units,paleoData_proxy,paleoData_proxyGeneral,time_variableName,time_values,time_units,depth_variableName,depth_values,depth_units
0,Ocn-RedSea.Felis.2000,coral,27.85,34.32,-6.0,d18O,"[-4.12, -3.82, -3.05, -3.02, -3.62, -3.96, -3....",permil,d18O,,year,"[1995.583, 1995.417, 1995.25, 1995.083, 1994.9...",AD,,,
1,Ocn-RedSea.Felis.2000,coral,27.85,34.32,-6.0,year,"[1995.583, 1995.417, 1995.25, 1995.083, 1994.9...",AD,,,year,"[1995.583, 1995.417, 1995.25, 1995.083, 1994.9...",AD,,,
2,Arc-Forfjorddalen.McCarroll.2013,tree,68.73,15.73,200.0,year,"[1100.0, 1101.0, 1102.0, 1103.0, 1104.0, 1105....",AD,,,year,"[1100, 1101, 1102, 1103, 1104, 1105, 1106, 110...",AD,,,
3,Arc-Forfjorddalen.McCarroll.2013,tree,68.73,15.73,200.0,MXD,"[-0.6724083323, -0.9420877372, -0.9158317899, ...",,MXD,,year,"[1100, 1101, 1102, 1103, 1104, 1105, 1106, 110...",AD,,,
4,Eur-Tallinn.Tarand.2001,documents,59.4,24.75,10.0,temperature,"[-7.1, nan, -6.3, -4.5, -6.4, -7.8, -5.7, -5.3...",degC,historic,,year,"[1500, 1501, 1502, 1503, 1504, 1505, 1506, 150...",AD,,,


The size of this dataframe is:

In [17]:
df_dir.shape

(83, 16)

So we expanded into 83 timeseries.

Let's have a look at the available variables:

In [18]:
df_dir['paleoData_variableName'].unique()

array(['d18O', 'year', 'MXD', 'temperature', 'JulianDay', 'trsgi',
       'sampleID', 'uncertainty_temperature', 'age', 'density', 'd13C',
       'sampleDensity', 'Na', 'thickness'], dtype=object)
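
If you also want to know how many timeseries carry each name, `value_counts` gives a quick tally. The sketch below uses a toy column with invented entries, so the counts say nothing about the Euro2k data.

```python
import pandas as pd

# Toy column of variable names (invented; not the Euro2k counts).
toy = pd.DataFrame({
    "paleoData_variableName": ["d18O", "year", "year", "MXD", "year", "temperature"]
})

# Number of timeseries per variable name, most frequent first.
counts = toy["paleoData_variableName"].value_counts()

print(counts)
```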

Let's assume we are only interested in the temperature data:

In [19]:
df_temp = df_dir[df_dir['paleoData_variableName']=='temperature']
df_temp.head()

Unnamed: 0,dataSetName,archiveType,geo_meanLat,geo_meanLon,geo_meanElev,paleoData_variableName,paleoData_values,paleoData_units,paleoData_proxy,paleoData_proxyGeneral,time_variableName,time_values,time_units,depth_variableName,depth_values,depth_units
4,Eur-Tallinn.Tarand.2001,documents,59.4,24.75,10.0,temperature,"[-7.1, nan, -6.3, -4.5, -6.4, -7.8, -5.7, -5.3...",degC,historic,,year,"[1500, 1501, 1502, 1503, 1504, 1505, 1506, 150...",AD,,,
8,Eur-CentralEurope.Dobrovolný.2009,documents,49.0,13.0,,temperature,"[0.199, -1.321, 1.842, 1.667, 1.997, -1.101, 0...",degC,Documentary,,year,"[1500, 1501, 1502, 1503, 1504, 1505, 1506, 150...",AD,,,
15,Eur-CentralandEasternPyrenees.Pla.2004,lake sediment,42.5,0.75,2280.0,temperature,"[0.0, 0.09114, -0.19458, 0.07387, -0.42006, -0...",degC,chrysophyte,,year,"[1994.0, 1984.0, 1963.0, 1943.0, 1932.0, 1916....",AD,,,
16,Eur-CentralandEasternPyrenees.Pla.2004,lake sediment,42.5,0.75,2280.0,temperature,"[0.0, 0.09114, -0.19458, 0.07387, -0.42006, -0...",degC,chrysophyte,,age,"[-44.0, -34.0, -13.0, 7.0, 18.0, 34.0, 45.0, 5...",BP,,,
35,Eur-LakeSilvaplana.Trachsel.2010,lake sediment,46.5,9.8,1791.0,temperature,"[0.181707222, 0.111082797, 0.001382129, -0.008...",degC,reflectance,,year,"[1175, 1176, 1177, 1178, 1179, 1180, 1181, 118...",AD,,,


In [20]:
df_temp.shape

(13, 16)

This leaves us with 13 timeseries.
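
Notice in the output above that `Eur-CentralandEasternPyrenees.Pla.2004` appears twice (rows 15 and 16) with the same temperature values, once against `year` and once against `age`. If you want a single entry per record, one option is to combine two boolean masks. This is a sketch on toy data; keeping the `year` axis rather than the `age` axis is an arbitrary choice for illustration.

```python
import pandas as pd

# Toy frame: dataset "B" is expanded against two time axes, as in rows 15 and 16 above.
toy = pd.DataFrame({
    "dataSetName": ["A", "B", "B", "C"],
    "paleoData_variableName": ["temperature", "temperature", "temperature", "MXD"],
    "time_variableName": ["year", "year", "age", "year"],
})

# Combine masks with & (element-wise AND); each condition is a boolean Series.
is_temp = toy["paleoData_variableName"] == "temperature"
on_year_axis = toy["time_variableName"] == "year"
toy_temp = toy[is_temp & on_year_axis]

print(toy_temp)
```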

Let's assume that you want everything that is not related to time, sampleID, and uncertainty:

In [21]:
df_filt = df_dir.query("paleoData_variableName in ('temperature','MXD','density','d18O','trsgi')")
df_filt.head()

Unnamed: 0,dataSetName,archiveType,geo_meanLat,geo_meanLon,geo_meanElev,paleoData_variableName,paleoData_values,paleoData_units,paleoData_proxy,paleoData_proxyGeneral,time_variableName,time_values,time_units,depth_variableName,depth_values,depth_units
0,Ocn-RedSea.Felis.2000,coral,27.85,34.32,-6.0,d18O,"[-4.12, -3.82, -3.05, -3.02, -3.62, -3.96, -3....",permil,d18O,,year,"[1995.583, 1995.417, 1995.25, 1995.083, 1994.9...",AD,,,
3,Arc-Forfjorddalen.McCarroll.2013,tree,68.73,15.73,200.0,MXD,"[-0.6724083323, -0.9420877372, -0.9158317899, ...",,MXD,,year,"[1100, 1101, 1102, 1103, 1104, 1105, 1106, 110...",AD,,,
4,Eur-Tallinn.Tarand.2001,documents,59.4,24.75,10.0,temperature,"[-7.1, nan, -6.3, -4.5, -6.4, -7.8, -5.7, -5.3...",degC,historic,,year,"[1500, 1501, 1502, 1503, 1504, 1505, 1506, 150...",AD,,,
8,Eur-CentralEurope.Dobrovolný.2009,documents,49.0,13.0,,temperature,"[0.199, -1.321, 1.842, 1.667, 1.997, -1.101, 0...",degC,Documentary,,year,"[1500, 1501, 1502, 1503, 1504, 1505, 1506, 150...",AD,,,
10,Eur-EuropeanAlps.Büntgen.2011,tree,47.0,10.7,2050.0,trsgi,"[0.405, 0.395, 1.209, 1.244, -0.101, 0.658, 0....",,TRW,,year,"[-500, -499, -498, -497, -496, -495, -494, -49...",AD,,,


The filter above relies on the [`DataFrame.query`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html) function available in Pandas to keep only the rows that are relevant to our problem. The size of the filtered dataframe is:

In [22]:
df_filt.shape

(34, 16)

This leaves us with 34 timeseries.
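
The query above lists the variables to keep. The same idea can be written the other way around, by listing what to drop. The sketch below runs on toy data, and the `excluded` list is my own guess at what counts as time, sample ID, and uncertainty. Note that on the Euro2k data an exclusion list like this one would also keep variables such as `d13C`, which the inclusion list above leaves out, so the two filters are not interchangeable.

```python
import pandas as pd

# Toy column of variable names (a subset of those listed earlier).
toy = pd.DataFrame({
    "paleoData_variableName": [
        "d18O", "year", "sampleID", "uncertainty_temperature", "age", "d13C"
    ]
})

# Hypothetical exclusion list; adjust it to your own needs.
excluded = ["year", "age", "JulianDay", "sampleID", "uncertainty_temperature"]

# query can reference a local Python variable with the @ prefix.
kept_query = toy.query("paleoData_variableName not in @excluded")

# Equivalent boolean-mask form using isin and negation.
kept_mask = toy[~toy["paleoData_variableName"].isin(excluded)]

print(kept_query)
```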

### Removing and popping datasets out of a LiPD object

You can also remove (i.e., delete the corresponding dataset from the `LiPD` object) or pop (i.e., delete the corresponding dataset from the `LiPD` object **and** return the dataset) datasets from a `LiPD` object. Note that these functionalities behave similarly to the methods with the same names on Python lists. These functions underpin more advanced filtering and querying capabilities that we will discuss in later tutorials.
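
As a quick refresher on the built-in behavior being referred to, here is a small sketch using plain Python containers rather than `LiPD` objects. One difference to keep in mind: `list.pop` takes a position, whereas the examples in this section pass a dataset name, which is closer to how `dict.pop` takes a key.

```python
# remove deletes in place and returns nothing.
names = ["A", "B", "C"]
result_remove = names.remove("B")

# pop deletes in place and hands back the deleted element (by position for lists).
result_pop = names.pop(0)

# dict.pop works by key, which resembles popping a dataset by name.
registry = {"A": 1, "B": 2}
value = registry.pop("A")

print(result_remove, result_pop, names, value, registry)
```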

First let's make a [copy](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.copy) of `D_dir`:

In [23]:
D_test = D_dir.copy()
print(D_test.get_all_dataset_names())

['Ocn-RedSea.Felis.2000', 'Arc-Forfjorddalen.McCarroll.2013', 'Eur-Tallinn.Tarand.2001', 'Eur-CentralEurope.Dobrovoln_.2009', 'Eur-EuropeanAlps.B_ntgen.2011', 'Eur-CentralandEasternPyrenees.Pla.2004', 'Arc-Tjeggelvas.Bjorklund.2012', 'Arc-Indigirka.Hughes.1999', 'Eur-SpannagelCave.Mangini.2005', 'Ocn-AqabaJordanAQ19.Heiss.1999', 'Arc-Jamtland.Wilson.2016', 'Eur-RAPiD-17-5P.Moffa-Sanchez.2014', 'Eur-LakeSilvaplana.Trachsel.2010', 'Eur-NorthernSpain.Mart_n-Chivelet.2011', 'Eur-MaritimeFrenchAlps.B_ntgen.2012', 'Ocn-AqabaJordanAQ18.Heiss.1999', 'Arc-Tornetrask.Melvin.2012', 'Eur-EasternCarpathianMountains.Popa.2008', 'Arc-PolarUrals.Wilson.2015', 'Eur-LakeSilvaplana.Larocque-Tobler.2010', 'Eur-CoastofPortugal.Abrantes.2011', 'Eur-TatraMountains.B_ntgen.2013', 'Eur-SpanishPyrenees.Dorado-Linan.2012', 'Eur-FinnishLakelands.Helama.2014', 'Eur-Seebergsee.Larocque-Tobler.2012', 'Eur-NorthernScandinavia.Esper.2012', 'Arc-GulfofAlaska.Wilson.2014', 'Arc-Kittelfjall.Bjorklund.2012', 'Eur-L_tschen

And let's [remove](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.remove) `Arc-AkademiiNaukIceCap.Opel.2013`, which corresponds to the last entry in the list above:

In [24]:
D_test.remove('Arc-AkademiiNaukIceCap.Opel.2013')

print(D_test.get_all_dataset_names())

['Ocn-RedSea.Felis.2000', 'Arc-Forfjorddalen.McCarroll.2013', 'Eur-Tallinn.Tarand.2001', 'Eur-CentralEurope.Dobrovoln_.2009', 'Eur-EuropeanAlps.B_ntgen.2011', 'Eur-CentralandEasternPyrenees.Pla.2004', 'Arc-Tjeggelvas.Bjorklund.2012', 'Arc-Indigirka.Hughes.1999', 'Eur-SpannagelCave.Mangini.2005', 'Ocn-AqabaJordanAQ19.Heiss.1999', 'Arc-Jamtland.Wilson.2016', 'Eur-RAPiD-17-5P.Moffa-Sanchez.2014', 'Eur-LakeSilvaplana.Trachsel.2010', 'Eur-NorthernSpain.Mart_n-Chivelet.2011', 'Eur-MaritimeFrenchAlps.B_ntgen.2012', 'Ocn-AqabaJordanAQ18.Heiss.1999', 'Arc-Tornetrask.Melvin.2012', 'Eur-EasternCarpathianMountains.Popa.2008', 'Arc-PolarUrals.Wilson.2015', 'Eur-LakeSilvaplana.Larocque-Tobler.2010', 'Eur-CoastofPortugal.Abrantes.2011', 'Eur-TatraMountains.B_ntgen.2013', 'Eur-SpanishPyrenees.Dorado-Linan.2012', 'Eur-FinnishLakelands.Helama.2014', 'Eur-Seebergsee.Larocque-Tobler.2012', 'Eur-NorthernScandinavia.Esper.2012', 'Arc-GulfofAlaska.Wilson.2014', 'Arc-Kittelfjall.Bjorklund.2012', 'Eur-L_tschen

Now let's [pop](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.pop) `Eur-Stockholm.Leijonhufvud.2009` from `D_test`:

In [25]:
d_eur = D_test.pop('Eur-Stockholm.Leijonhufvud.2009')

Now let's have a look at `d_eur`:

In [26]:
print(d_eur.get_all_dataset_names())

['Eur-Stockholm.Leijonhufvud.2009']


It contains the dataset we are expecting. Let's have a look at `D_test`:

In [27]:
print(D_test.get_all_dataset_names())

['Ocn-RedSea.Felis.2000', 'Arc-Forfjorddalen.McCarroll.2013', 'Eur-Tallinn.Tarand.2001', 'Eur-CentralEurope.Dobrovoln_.2009', 'Eur-EuropeanAlps.B_ntgen.2011', 'Eur-CentralandEasternPyrenees.Pla.2004', 'Arc-Tjeggelvas.Bjorklund.2012', 'Arc-Indigirka.Hughes.1999', 'Eur-SpannagelCave.Mangini.2005', 'Ocn-AqabaJordanAQ19.Heiss.1999', 'Arc-Jamtland.Wilson.2016', 'Eur-RAPiD-17-5P.Moffa-Sanchez.2014', 'Eur-LakeSilvaplana.Trachsel.2010', 'Eur-NorthernSpain.Mart_n-Chivelet.2011', 'Eur-MaritimeFrenchAlps.B_ntgen.2012', 'Ocn-AqabaJordanAQ18.Heiss.1999', 'Arc-Tornetrask.Melvin.2012', 'Eur-EasternCarpathianMountains.Popa.2008', 'Arc-PolarUrals.Wilson.2015', 'Eur-LakeSilvaplana.Larocque-Tobler.2010', 'Eur-CoastofPortugal.Abrantes.2011', 'Eur-TatraMountains.B_ntgen.2013', 'Eur-SpanishPyrenees.Dorado-Linan.2012', 'Eur-FinnishLakelands.Helama.2014', 'Eur-Seebergsee.Larocque-Tobler.2012', 'Eur-NorthernScandinavia.Esper.2012', 'Arc-GulfofAlaska.Wilson.2014', 'Arc-Kittelfjall.Bjorklund.2012', 'Eur-L_tschen


<div class="alert alert-warning">
    The dataset was removed from `D_test` in the process. Hence, it's always prudent to make a copy of the original object when using the `remove` and `pop` functionalities. 
</div>
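
The copy-first pattern can be illustrated with a plain dictionary standing in for the `LiPD` object. This is only an analogy: the tutorial uses the `copy` method of the `LiPD` object, while the sketch uses `copy.deepcopy` from the standard library, and the names and values are invented.

```python
import copy

# Toy stand-in for a collection of datasets (names and values are invented).
original = {"Dataset-A": [1, 2], "Dataset-B": [3, 4]}

# Work on an independent copy so the original stays intact.
working = copy.deepcopy(original)
taken = working.pop("Dataset-A")

# Changing the popped item does not affect the original either.
taken.append(99)

print(sorted(original), sorted(working), taken)
```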


You can also remove/pop more than one dataset at a time:

In [28]:
rem = ['Ocn-RedSea.Felis.2000','Arc-Forfjorddalen.McCarroll.2013']

D_test.remove(rem)
print(D_test.get_all_dataset_names())

['Eur-Tallinn.Tarand.2001', 'Eur-CentralEurope.Dobrovoln_.2009', 'Eur-EuropeanAlps.B_ntgen.2011', 'Eur-CentralandEasternPyrenees.Pla.2004', 'Arc-Tjeggelvas.Bjorklund.2012', 'Arc-Indigirka.Hughes.1999', 'Eur-SpannagelCave.Mangini.2005', 'Ocn-AqabaJordanAQ19.Heiss.1999', 'Arc-Jamtland.Wilson.2016', 'Eur-RAPiD-17-5P.Moffa-Sanchez.2014', 'Eur-LakeSilvaplana.Trachsel.2010', 'Eur-NorthernSpain.Mart_n-Chivelet.2011', 'Eur-MaritimeFrenchAlps.B_ntgen.2012', 'Ocn-AqabaJordanAQ18.Heiss.1999', 'Arc-Tornetrask.Melvin.2012', 'Eur-EasternCarpathianMountains.Popa.2008', 'Arc-PolarUrals.Wilson.2015', 'Eur-LakeSilvaplana.Larocque-Tobler.2010', 'Eur-CoastofPortugal.Abrantes.2011', 'Eur-TatraMountains.B_ntgen.2013', 'Eur-SpanishPyrenees.Dorado-Linan.2012', 'Eur-FinnishLakelands.Helama.2014', 'Eur-Seebergsee.Larocque-Tobler.2012', 'Eur-NorthernScandinavia.Esper.2012', 'Arc-GulfofAlaska.Wilson.2014', 'Arc-Kittelfjall.Bjorklund.2012', 'Eur-L_tschental.B_ntgen.2006']
