# Basic manipulation of `pylipd.LiPD` object

## Authors

Deborah Khider, Varun Ratnakar

Information Sciences Institute, University of Southern California

Author1 = {"name": "Deborah Khider", "affiliation": "Information Sciences Institute, University of Southern California", "email": "khider@usc.edu", "orcid": "0000-0001-7501-8430"}

Author2 = {"name": "Varun Ratnakar", "affiliation": "Information Sciences Institute, University of Southern California", "email": "varunr@isi.edu"}

## Preamble

### Goals:

* Extract a time series for data analysis
* Remove/pop LiPD datasets from an existing `LiPD` object

Reading Time: 5 minutes

### Keywords

LiPD; query

### Pre-requisites

None. This tutorial assumes basic knowledge of Python and Pandas. If you are not familiar with this coding language and the Pandas library, check out this tutorial: http://linked.earth/ec_workshops_py/.

### Relevant Packages

Pandas, pylipd

## Data Description

This notebook uses the following datasets, in LiPD format:

* Nurhati, I. S., Cobb, K. M., & Di Lorenzo, E. (2011). Decadal-scale SST and salinity variations in the central tropical Pacific: Signatures of natural and anthropogenic climate change. Journal of Climate, 24(13), 3294–3308. doi:10.1175/2011jcli3852.1

* Moses, C. S., Swart, P. K., and Rosenheim, B. E. (2006), Evidence of multidecadal salinity variability in the eastern tropical North Atlantic, Paleoceanography, 21, PA3010, doi:10.1029/2005PA001257.

* Euro2k database: PAGES2k Consortium., Emile-Geay, J., McKay, N. et al. A global multiproxy database for temperature reconstructions of the Common Era. Sci Data 4, 170088 (2017). doi:10.1038/sdata.2017.88

* Stott, L., Timmermann, A., & Thunell, R. (2007). Southern Hemisphere and deep-sea warming led deglacial atmospheric CO2 rise and tropical warming. Science (New York, N.Y.), 318(5849), 435–438. doi:10.1126/science.1143791

* Tudhope, A. W., Chilcott, C. P., McCulloch, M. T., Cook, E. R., Chappell, J., Ellam, R. M., et al. (2001). Variability in the El Niño-Southern Oscillation through a glacial-interglacial cycle. Science, 291(1511), 1511-1517. doi:10.1126/science.1057969

* Tierney, J. E., Abram, N. J., Anchukaitis, K. J., Evans, M. N., Giry, C., Kilbourne, K. H., et al. (2015). Tropical sea surface temperatures for the past four centuries reconstructed from coral archives. Paleoceanography, 30(3), 226–252. doi:10.1002/2014pa002717

* Orsi, A. J., Cornuelle, B. D., and Severinghaus, J. P. (2012), Little Ice Age cold interval in West Antarctica: Evidence from borehole temperature at the West Antarctic Ice Sheet (WAIS) Divide, Geophys. Res. Lett., 39, L09710, doi:10.1029/2012GL051260.

In [1]:
from pylipd.lipd import LiPD

## Demonstration

### Extract time series data from LiPD formatted datasets

If you are familiar with the [R utilities](https://nickmckay.github.io/lipdR/), one useful function is the ability to expand timeseries data. This capability was also present in the previous iteration of the Python utilities, and `pylipd` retains this compatibility to ease the transition.

If you're unsure about what a time series is in the LiPD context, [read more](https://nickmckay.github.io/lipdR/#helptimeseries). 

#### Working with one dataset

First, let's load a single dataset:

In [2]:
data_path = '../data/Ocn-Palmyra.Nurhati.2011.lpd'
D = LiPD()
D.load(data_path)

Loading 1 LiPD files
Conversion to RDF done..
Loading RDF into graph
Loaded..


Now let's get all the timeseries for this dataset. Note that the [`get_timeseries`](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.get_timeseries) function requires the dataset names to be passed as an argument. This is useful if you want to expand only some of the datasets in your `LiPD` object. To expand all of them, you can pass the output of [`get_all_dataset_names`](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.get_all_dataset_names) in the call:

In [3]:
ts_list = D.get_timeseries(D.get_all_dataset_names())

type(ts_list)

Extracting timeseries from dataset: Ocn-Palmyra.Nurhati.2011 ...


dict

Note that the above function returns a dictionary that organizes the extracted timeseries by dataset name:

In [4]:
ts_list.keys()

dict_keys(['Ocn-Palmyra.Nurhati.2011'])

Each timeseries is then stored in a list of dictionaries that preserve essential metadata for each time/depth and value pair:

In [5]:
type(ts_list['Ocn-Palmyra.Nurhati.2011'])

list
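
To make the nesting concrete, here is a minimal sketch that walks a structure shaped like `ts_list`. The dictionary `ts_mock` is a hand-written stand-in that carries only two of the keys seen in this notebook, with values truncated to three points; it is not actual output from `get_timeseries`.

```python
# Hypothetical stand-in for the object returned by get_timeseries:
# {dataset name: [one dictionary per extracted variable, ...]}
ts_mock = {
    "Ocn-Palmyra.Nurhati.2011": [
        {"paleoData_variableName": "Sr_Ca", "paleoData_values": [8.96, 8.9, 8.91]},
        {"paleoData_variableName": "d18O", "paleoData_values": [-5.41, -5.47, -5.49]},
    ]
}

# Walk the two levels: datasets first, then the timeseries within each dataset.
summary = []
for dataset_name, series_list in ts_mock.items():
    for series in series_list:
        summary.append(
            (
                dataset_name,
                series["paleoData_variableName"],
                len(series["paleoData_values"]),
            )
        )

for entry in summary:
    print(entry)
```

Looping like this works, but it quickly becomes tedious once several datasets and dozens of metadata keys are involved.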

Although the information is present, it is not easy to navigate or query across the various lists. One simple way of doing so is to return the timeseries as a [`pandas.DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html):

In [6]:
ts_list, df = D.get_timeseries(D.get_all_dataset_names(), to_dataframe=True)

df

Extracting timeseries from dataset: Ocn-Palmyra.Nurhati.2011 ...


Unnamed: 0,mode,time_id,googleMetadataWorksheet,pub1_author,pub1_citeKey,pub1_title,pub1_url,pub1_institution,pub1_urldate,pub1_DOI,...,paleoData_sensorSpecies,paleoData_variableName,paleoData_pages2kID,paleoData_notes,paleoData_variableType,paleoData_values,paleoData_qCnotes,paleoData_dataType,paleoData_description,paleoData_inferredVariableType
0,paleoData,age,oax0htr,I.S. Nurhati,nurhati2010httpswwwncdcnoaagovpaleostudy8609Da...,World Data Center for Paleoclimatology,https://www.ncdc.noaa.gov/paleo/study/8609,World Data Center for Paleoclimatology,2010.0,hello,...,lutea,Sr_Ca,Ocn_129,; paleoData_variableName changed - was origina...,measured,"[8.96, 8.9, 8.91, 8.94, 8.92, 8.89, 8.87, 8.81...",,,,
1,paleoData,age,oax0htr,I.S. Nurhati,nurhati2010httpswwwncdcnoaagovpaleostudy8609Da...,World Data Center for Paleoclimatology,https://www.ncdc.noaa.gov/paleo/study/8609,World Data Center for Paleoclimatology,2010.0,hello,...,lutea,d18O,,; climateInterpretation_seasonality changed - ...,measured,"[-5.41, -5.47, -5.49, -5.43, -5.48, -5.53, -5....",Duplicate of modern d18O record presented in C...,,,
2,paleoData,age,oax0htr,I.S. Nurhati,nurhati2010httpswwwncdcnoaagovpaleostudy8609Da...,World Data Center for Paleoclimatology,https://www.ncdc.noaa.gov/paleo/study/8609,World Data Center for Paleoclimatology,2010.0,hello,...,,year,,,inferred,"[1998.29, 1998.21, 1998.13, 1998.04, 1997.96, ...",,float,Year AD,Year
3,paleoData,age,oax0htr,I.S. Nurhati,nurhati2010httpswwwncdcnoaagovpaleostudy8609Da...,World Data Center for Paleoclimatology,https://www.ncdc.noaa.gov/paleo/study/8609,World Data Center for Paleoclimatology,2010.0,hello,...,,year,,,inferred,"[1998.21, 1998.13, 1998.04, 1997.96, 1997.88, ...",,float,Year AD,Year
4,paleoData,age,oax0htr,I.S. Nurhati,nurhati2010httpswwwncdcnoaagovpaleostudy8609Da...,World Data Center for Paleoclimatology,https://www.ncdc.noaa.gov/paleo/study/8609,World Data Center for Paleoclimatology,2010.0,hello,...,lutea,d18O,,,measured,"[0.39, 0.35, 0.35, 0.35, 0.36, 0.22, 0.33, 0.3...",d18Osw (residuals calculated from coupled SrCa...,,,


You can now use all the pandas functionalities for filtering and querying dataframes. First, let's have a look at the available properties, which correspond to the column headers:

In [7]:
df.columns

Index(['mode', 'time_id', 'googleMetadataWorksheet', 'pub1_author',
       'pub1_citeKey', 'pub1_title', 'pub1_url', 'pub1_institution',
       'pub1_urldate', 'pub1_DOI', 'pub2_author', 'pub2_publisher',
       'pub2_link', 'pub2_citeKey', 'pub2_year', 'pub2_pages', 'pub2_volume',
       'pub2_doi', 'pub2_title', 'pub2_journal', 'pub2_dataUrl', 'pub2_DOI',
       'pub3_author', 'pub3_volume', 'pub3_publisher', 'pub3_doi', 'pub3_year',
       'pub3_issue', 'pub3_link', 'pub3_title', 'pub3_citeKey', 'pub3_pages',
       'pub3_journal', 'pub3_dataUrl', 'pub3_DOI', 'createdBy',
       'originalDataURL', 'hasUrl', 'geo_meanLon', 'geo_meanLat',
       'geo_meanElev', 'geo_type', 'geo_ocean', 'geo_pages2kRegion',
       'geo_siteName', 'dataSetName', 'dataContributor',
       'googleSpreadSheetKey', 'studyName', 'lipdVersion', 'googleDataURL',
       'archiveType', 'tableType', 'paleoData_filename',
       'paleoData_googleWorkSheetKey', 'paleoData_paleoDataTableName',
       'paleoData_meas

Let's have a look at the `paleoData_variableName` column to see what's available:

In [8]:
df['paleoData_variableName']

0    Sr_Ca
1     d18O
2     year
3     year
4     d18O
Name: paleoData_variableName, dtype: object

All columns get extracted, which is why `year` appears as a paleo variable, with its associated values stored in `paleoData_values`. Notice that there are also two variables named `d18O`. Since this is a coral record, it stands to reason that one corresponds to the measured $\delta^{18}O$ of the coral and the other to the $\delta^{18}O$ of the seawater. Let's have a look at the `qCnotes` field:

In [37]:
df[['paleoData_variableName','paleoData_qCnotes']]

Unnamed: 0,paleoData_variableName,paleoData_qCnotes
0,Sr_Ca,
1,d18O,Duplicate of modern d18O record presented in C...
2,year,
3,year,
4,d18O,d18Osw (residuals calculated from coupled SrCa...


In fact, one is for the measurement on the coral and the other for seawater. Querying on this small dataset is not necessary; however, it can become useful when looking at a collection of files as shown in the next example (working with multiple datasets). 
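
As a rough illustration of how such a query could look, the sketch below separates the two `d18O` rows with boolean masks. The frame `df_toy` is a toy stand-in with shortened placeholder notes, and matching on the substring `d18Osw` is an assumption based on the note shown above rather than a general LiPD convention.

```python
import pandas as pd

# Toy frame with the two columns displayed above; notes are shortened placeholders.
df_toy = pd.DataFrame(
    {
        "paleoData_variableName": ["Sr_Ca", "d18O", "year", "year", "d18O"],
        "paleoData_qCnotes": [
            None,
            "Duplicate of modern d18O record",
            None,
            None,
            "d18Osw (residuals)",
        ],
    }
)

# Mask 1: rows whose variable name is d18O.
is_d18o = df_toy["paleoData_variableName"] == "d18O"
# Mask 2: rows whose note mentions seawater d18O; missing notes count as False.
is_seawater = df_toy["paleoData_qCnotes"].str.contains("d18Osw", na=False)

df_coral = df_toy[is_d18o & ~is_seawater]
df_seawater = df_toy[is_d18o & is_seawater]

print(df_coral.index.tolist(), df_seawater.index.tolist())
```

Selecting by content rather than by position keeps the query valid even if the row order changes between files.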

To extract by row index (here extracting for `Sr_Ca`):

In [27]:
df_cut = df.iloc[0,:]

df_cut

mode                                                                      paleoData
time_id                                                                         age
googleMetadataWorksheet                                                     oax0htr
pub1_author                                                            I.S. Nurhati
pub1_citeKey                      nurhati2010httpswwwncdcnoaagovpaleostudy8609Da...
                                                        ...                        
paleoData_values                  [8.96, 8.9, 8.91, 8.94, 8.92, 8.89, 8.87, 8.81...
paleoData_qCnotes                                                               NaN
paleoData_dataType                                                              NaN
paleoData_description                                                           NaN
paleoData_inferredVariableType                                                  NaN
Name: 0, Length: 93, dtype: object

In [29]:
df_cut['paleoData_variableName']

'Sr_Ca'

This can be very useful when working with the [`Pyleoclim`](https://pyleoclim-util.readthedocs.io/en/master/) software since a `Pyleoclim.Series` can be initialized from the information contained in `df_cut`. Working with `pylipd` and `Pyleoclim` will be the subject of a later tutorial.
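
Until then, the sketch below shows one way a single row might be turned into a tidy time/value table using pandas alone. The `row` object is a toy stand-in for `df_cut`, truncated to four points; storing the time axis under a `year` key is an assumption made for this illustration, so check `df.columns` for the actual name in your data.

```python
import pandas as pd

# Toy row mimicking df.iloc[0, :]. The `year` key is an assumption about where
# the time axis is stored; both lists are truncated to four points.
row = pd.Series(
    {
        "paleoData_variableName": "Sr_Ca",
        "paleoData_values": [8.96, 8.9, 8.91, 8.94],
        "year": [1998.29, 1998.21, 1998.13, 1998.04],
    }
)

# Build a two-column table named after the variable, sorted so time increases.
series_df = pd.DataFrame(
    {
        "year": row["year"],
        row["paleoData_variableName"]: row["paleoData_values"],
    }
).sort_values("year", ignore_index=True)

print(series_df)
```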

#### Working with multiple datasets

In [30]:
path = '../data/Euro2k/'

D_dir = LiPD()
D_dir.load_from_dir(path)

Loading 31 LiPD files
Conversion to RDF done..
Loading RDF into graph
Loaded..


Let's start by expanding all datasets using `get_all_dataset_names()`:

In [31]:
ts_list_dir, df_dir = D_dir.get_timeseries(D_dir.get_all_dataset_names(), to_dataframe=True)

Extracting timeseries from dataset: Ocn-RedSea.Felis.2000 ...
Extracting timeseries from dataset: Arc-Forfjorddalen.McCarroll.2013 ...
Extracting timeseries from dataset: Eur-Tallinn.Tarand.2001 ...
Extracting timeseries from dataset: Eur-CentralEurope.Dobrovoln_.2009 ...
Extracting timeseries from dataset: Eur-EuropeanAlps.B_ntgen.2011 ...
Extracting timeseries from dataset: Eur-CentralandEasternPyrenees.Pla.2004 ...
Extracting timeseries from dataset: Arc-Tjeggelvas.Bjorklund.2012 ...
Extracting timeseries from dataset: Arc-Indigirka.Hughes.1999 ...
Extracting timeseries from dataset: Eur-SpannagelCave.Mangini.2005 ...
Extracting timeseries from dataset: Ocn-AqabaJordanAQ19.Heiss.1999 ...
Extracting timeseries from dataset: Arc-Jamtland.Wilson.2016 ...
Extracting timeseries from dataset: Eur-RAPiD-17-5P.Moffa-Sanchez.2014 ...
Extracting timeseries from dataset: Eur-LakeSilvaplana.Trachsel.2010 ...
Extracting timeseries from dataset: Eur-NorthernSpain.Mart_n-Chivelet.2011 ...
Extracti

Let's have a look at the dataframe:

In [32]:
df_dir.head()

Unnamed: 0,mode,time_id,dataSetName,hasUrl,dataContributor,pub1_author,pub1_url,pub1_citeKey,pub1_institution,pub1_title,...,pub2_urldate,pub2_url,paleoData_calibration,paleoData_uncertainty,age,ageUnits,pub2_institution,investigator,data,pub1_abstract
0,paleoData,age,Ocn-RedSea.Felis.2000,https://data.mint.isi.edu/files/lipd/Ocn-RedSe...,{'name': 'JZ'},T. Felis,https://www.ncdc.noaa.gov/paleo/study/1861,felis2000httpswwwncdcnoaagovpaleostudy1861Data...,World Data Center for Paleoclimatology,World Data Center for Paleoclimatology,...,,,,,,,,,,
1,paleoData,age,Ocn-RedSea.Felis.2000,https://data.mint.isi.edu/files/lipd/Ocn-RedSe...,{'name': 'JZ'},T. Felis,https://www.ncdc.noaa.gov/paleo/study/1861,felis2000httpswwwncdcnoaagovpaleostudy1861Data...,World Data Center for Paleoclimatology,World Data Center for Paleoclimatology,...,,,,,,,,,,
2,paleoData,age,Arc-Forfjorddalen.McCarroll.2013,https://data.mint.isi.edu/files/lipd/Arc-Forfj...,,D. McCarroll,this study,mccarroll0thisstudyDataCitation,,This study,...,,,,,,,,,,
3,paleoData,age,Arc-Forfjorddalen.McCarroll.2013,https://data.mint.isi.edu/files/lipd/Arc-Forfj...,,D. McCarroll,this study,mccarroll0thisstudyDataCitation,,This study,...,,,,,,,,,,
4,paleoData,age,Eur-Tallinn.Tarand.2001,https://data.mint.isi.edu/files/lipd/Eur-Talli...,,A. Tarand;P.Ø. Nordli,,tarand2001thetallinntemperatureseri,,The Tallinn temperature series reconstructed b...,...,0.0,this study,"[{'uncertaintyType': 'Maximum error', 'uncerta...",ice break-up: 1 deg C; max. 3 deg C; rye harve...,,,,,,


The size of this dataframe is:

In [33]:
df_dir.shape

(73, 120)

73 rows and 120 columns. The available properties are:

In [34]:
print([item for item in df_dir.columns])

['mode', 'time_id', 'dataSetName', 'hasUrl', 'dataContributor', 'pub1_author', 'pub1_url', 'pub1_citeKey', 'pub1_institution', 'pub1_title', 'pub1_urldate', 'pub1_DOI', 'pub2_author', 'pub2_citeKey', 'pub2_year', 'pub2_title', 'pub2_link', 'pub2_doi', 'pub2_pages', 'pub2_publisher', 'pub2_issue', 'pub2_dataUrl', 'pub2_volume', 'pub2_journal', 'pub2_DOI', 'pub3_author', 'pub3_dataUrl', 'pub3_citeKey', 'pub3_pages', 'pub3_volume', 'pub3_issue', 'pub3_doi', 'pub3_journal', 'pub3_link', 'pub3_publisher', 'pub3_title', 'pub3_year', 'pub3_DOI', 'createdBy', 'googleDataURL', 'studyName', 'googleMetadataWorksheet', 'geo_meanLon', 'geo_meanLat', 'geo_meanElev', 'geo_type', 'geo_pages2kRegion', 'geo_siteName', 'geo_ocean', 'originalDataURL', 'googleSpreadSheetKey', 'lipdVersion', 'archiveType', 'tableType', 'paleoData_filename', 'paleoData_paleoDataTableName', 'paleoData_googleWorkSheetKey', 'paleoData_measurementTableMD5', 'paleoData_measurementTableName', 'year', 'yearUnits', 'paleoData_sensor

Let's have a look at the available variables:

In [39]:
df_dir['paleoData_variableName'].unique()

array(['d18O', 'year', 'MXD', 'temperature', 'JulianDay', 'trsgi',
       'uncertainty_temperature', 'sampleID', 'age', 'density', 'd13C',
       'sampleDensity', 'thickness', 'Na'], dtype=object)

Let's assume we are only interested in the temperature data:

In [40]:
df_temp = df_dir[df_dir['paleoData_variableName']=='temperature']
df_temp.head()

Unnamed: 0,mode,time_id,dataSetName,hasUrl,dataContributor,pub1_author,pub1_url,pub1_citeKey,pub1_institution,pub1_title,...,pub2_urldate,pub2_url,paleoData_calibration,paleoData_uncertainty,age,ageUnits,pub2_institution,investigator,data,pub1_abstract
4,paleoData,age,Eur-Tallinn.Tarand.2001,https://data.mint.isi.edu/files/lipd/Eur-Talli...,,A. Tarand;P.Ø. Nordli,,tarand2001thetallinntemperatureseri,,The Tallinn temperature series reconstructed b...,...,0.0,this study,"[{'uncertaintyType': 'Maximum error', 'uncerta...",ice break-up: 1 deg C; max. 3 deg C; rye harve...,,,,,,
7,paleoData,age,Eur-CentralEurope.Dobrovolný.2009,https://data.mint.isi.edu/files/lipd/Eur-Centr...,,P. Dobrovolný,https://www.ncdc.noaa.gov/paleo/study/9970,dobrovolny2010httpswwwncdcnoaagovpaleostudy997...,World Data Center for Paleoclimatology,World Data Center for Paleoclimatology,...,,,"[{'uncertainty': 0.34, 'uncertaintyType': 'RMS...",,,,,,,
14,paleoData,age,Eur-CentralandEasternPyrenees.Pla.2004,https://data.mint.isi.edu/files/lipd/Eur-Centr...,,Jordi Catalan;Sergi Pla,,pla2004chrysophytecystsfromlakes,,Chrysophyte cysts from lake sediments reveal t...,...,0.0,this study,"[{'uncertainty': 0.7, 'uncertaintyType': 'Lowe...",,"[-44.0, -34.0, -13.0, 7.0, 18.0, 34.0, 45.0, 5...",BP,,,,
29,paleoData,age,Eur-LakeSilvaplana.Trachsel.2010,https://data.mint.isi.edu/files/lipd/Eur-LakeS...,,M. Trachsel,http://www.ncdc.noaa.gov/paleo/study/13016,trachsel2010httpwwwncdcnoaagovpaleostudy13016D...,World Data Center for Paleoclimatology,World Data Center for Paleoclimatology,...,,,[{'uncertaintyType': 'Root-mean-squared error ...,,,,,,,
38,paleoData,age,Arc-Tornetrask.Melvin.2012,https://data.mint.isi.edu/files/lipd/Arc-Torne...,,H. Grudd;K. R. Briffa;T. M. Melvin,,melvin2012potentialbiasinupdatingtr,,Potential bias in updating tree-ring chronolog...,...,2013.0,https://crudata.uea.ac.uk/cru/papers/melvin201...,[{'uncertaintyType': 'calibrationRMSE'}],,,,Climatic Research Unit,,,


In [41]:
df_temp.shape

(12, 120)

which leaves us with 12 timeseries.
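
The same pattern extends to several variables at once. The sketch below uses `isin` on a toy stand-in for `df_dir` that holds only two of its columns; the rows are illustrative and do not reproduce the actual Euro2k content.

```python
import pandas as pd

# Toy stand-in for df_dir with two of its columns; rows are illustrative only.
df_toy = pd.DataFrame(
    {
        "dataSetName": [
            "Ocn-RedSea.Felis.2000",
            "Ocn-RedSea.Felis.2000",
            "Eur-Tallinn.Tarand.2001",
            "Eur-Tallinn.Tarand.2001",
        ],
        "paleoData_variableName": ["d18O", "year", "temperature", "year"],
    }
)

# Keep rows whose variable name is any of the requested ones.
wanted = ["temperature", "d18O"]
df_sel = df_toy[df_toy["paleoData_variableName"].isin(wanted)]

# Count how many matching timeseries each dataset contributes.
counts = df_sel["dataSetName"].value_counts().to_dict()
print(counts)
```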

If you are only interested in `Ocn-RedSea.Felis.2000` and `Arc-PolarUrals.Wilson.2015`, you can pass these names directly to the function:

In [35]:
ts_list_short, df_short = D_dir.get_timeseries(['Ocn-RedSea.Felis.2000', 'Arc-PolarUrals.Wilson.2015'], to_dataframe=True)

Extracting timeseries from dataset: Ocn-RedSea.Felis.2000 ...
Extracting timeseries from dataset: Arc-PolarUrals.Wilson.2015 ...


In [15]:
ts_list_short.keys()

dict_keys(['Ocn-RedSea.Felis.2000', 'Arc-PolarUrals.Wilson.2015'])

### Removing and popping datasets out of a LiPD object

You can also remove (i.e., delete the corresponding dataset from the `LiPD` object) or pop (i.e., delete the corresponding dataset from the `LiPD` object **and** return the dataset) datasets from a `LiPD` object. Note that these functionalities behave similarly to the functions with the same names on Python lists. These functions underpin more advanced filtering and querying capabilities that we will discuss in later tutorials.
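
For readers who want a refresher on the list analogy, here is a minimal sketch using a plain Python list of dataset names. It only illustrates the built-in list behavior; one difference worth keeping in mind is that `list.pop` takes a position, whereas the `LiPD` version used in this notebook takes a dataset name.

```python
names = [
    "Ocn-RedSea.Felis.2000",
    "Eur-Tallinn.Tarand.2001",
    "Arc-PolarUrals.Wilson.2015",
]

# Work on a copy so the original list is left untouched.
working = names.copy()

# remove() deletes the matching element in place and returns None.
removed = working.remove("Eur-Tallinn.Tarand.2001")

# pop() deletes the element at the given position and hands it back.
popped = working.pop(0)

print(removed, popped, working, len(names))
```

Both calls modify the list in place, which is why the copy is taken first.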

First let's make a [copy](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.copy) of `D_dir`:

In [42]:
D_test = D_dir.copy()
print(D_test.get_all_dataset_names())

['Ocn-RedSea.Felis.2000', 'Arc-Forfjorddalen.McCarroll.2013', 'Eur-Tallinn.Tarand.2001', 'Eur-CentralEurope.Dobrovoln_.2009', 'Eur-EuropeanAlps.B_ntgen.2011', 'Eur-CentralandEasternPyrenees.Pla.2004', 'Arc-Tjeggelvas.Bjorklund.2012', 'Arc-Indigirka.Hughes.1999', 'Eur-SpannagelCave.Mangini.2005', 'Ocn-AqabaJordanAQ19.Heiss.1999', 'Arc-Jamtland.Wilson.2016', 'Eur-RAPiD-17-5P.Moffa-Sanchez.2014', 'Eur-LakeSilvaplana.Trachsel.2010', 'Eur-NorthernSpain.Mart_n-Chivelet.2011', 'Eur-MaritimeFrenchAlps.B_ntgen.2012', 'Ocn-AqabaJordanAQ18.Heiss.1999', 'Arc-Tornetrask.Melvin.2012', 'Eur-EasternCarpathianMountains.Popa.2008', 'Arc-PolarUrals.Wilson.2015', 'Eur-LakeSilvaplana.Larocque-Tobler.2010', 'Eur-CoastofPortugal.Abrantes.2011', 'Eur-TatraMountains.B_ntgen.2013', 'Eur-SpanishPyrenees.Dorado-Linan.2012', 'Eur-FinnishLakelands.Helama.2014', 'Eur-Seebergsee.Larocque-Tobler.2012', 'Eur-NorthernScandinavia.Esper.2012', 'Arc-GulfofAlaska.Wilson.2014', 'Arc-Kittelfjall.Bjorklund.2012', 'Eur-L_tschen

And let's [remove](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.remove) `Arc-AkademiiNaukIceCap.Opel.2013`, which corresponds to the last entry in the list above:

In [43]:
D_test.remove('Arc-AkademiiNaukIceCap.Opel.2013')

print(D_test.get_all_dataset_names())

['Ocn-RedSea.Felis.2000', 'Arc-Forfjorddalen.McCarroll.2013', 'Eur-Tallinn.Tarand.2001', 'Eur-CentralEurope.Dobrovoln_.2009', 'Eur-EuropeanAlps.B_ntgen.2011', 'Eur-CentralandEasternPyrenees.Pla.2004', 'Arc-Tjeggelvas.Bjorklund.2012', 'Arc-Indigirka.Hughes.1999', 'Eur-SpannagelCave.Mangini.2005', 'Ocn-AqabaJordanAQ19.Heiss.1999', 'Arc-Jamtland.Wilson.2016', 'Eur-RAPiD-17-5P.Moffa-Sanchez.2014', 'Eur-LakeSilvaplana.Trachsel.2010', 'Eur-NorthernSpain.Mart_n-Chivelet.2011', 'Eur-MaritimeFrenchAlps.B_ntgen.2012', 'Ocn-AqabaJordanAQ18.Heiss.1999', 'Arc-Tornetrask.Melvin.2012', 'Eur-EasternCarpathianMountains.Popa.2008', 'Arc-PolarUrals.Wilson.2015', 'Eur-LakeSilvaplana.Larocque-Tobler.2010', 'Eur-CoastofPortugal.Abrantes.2011', 'Eur-TatraMountains.B_ntgen.2013', 'Eur-SpanishPyrenees.Dorado-Linan.2012', 'Eur-FinnishLakelands.Helama.2014', 'Eur-Seebergsee.Larocque-Tobler.2012', 'Eur-NorthernScandinavia.Esper.2012', 'Arc-GulfofAlaska.Wilson.2014', 'Arc-Kittelfjall.Bjorklund.2012', 'Eur-L_tschen

Now let's [pop](https://pylipd.readthedocs.io/en/latest/source/pylipd.html#pylipd.lipd.LiPD.pop) `Eur-Stockholm.Leijonhufvud.2009` from `D_test`:

In [44]:
d_eur = D_test.pop('Eur-Stockholm.Leijonhufvud.2009')

Now let's have a look at `d_eur`:

In [45]:
print(d_eur.get_all_dataset_names())

['Eur-Stockholm.Leijonhufvud.2009']


It contains the dataset we are expecting. Let's have a look at `D_test`:

In [46]:
print(D_test.get_all_dataset_names())

['Ocn-RedSea.Felis.2000', 'Arc-Forfjorddalen.McCarroll.2013', 'Eur-Tallinn.Tarand.2001', 'Eur-CentralEurope.Dobrovoln_.2009', 'Eur-EuropeanAlps.B_ntgen.2011', 'Eur-CentralandEasternPyrenees.Pla.2004', 'Arc-Tjeggelvas.Bjorklund.2012', 'Arc-Indigirka.Hughes.1999', 'Eur-SpannagelCave.Mangini.2005', 'Ocn-AqabaJordanAQ19.Heiss.1999', 'Arc-Jamtland.Wilson.2016', 'Eur-RAPiD-17-5P.Moffa-Sanchez.2014', 'Eur-LakeSilvaplana.Trachsel.2010', 'Eur-NorthernSpain.Mart_n-Chivelet.2011', 'Eur-MaritimeFrenchAlps.B_ntgen.2012', 'Ocn-AqabaJordanAQ18.Heiss.1999', 'Arc-Tornetrask.Melvin.2012', 'Eur-EasternCarpathianMountains.Popa.2008', 'Arc-PolarUrals.Wilson.2015', 'Eur-LakeSilvaplana.Larocque-Tobler.2010', 'Eur-CoastofPortugal.Abrantes.2011', 'Eur-TatraMountains.B_ntgen.2013', 'Eur-SpanishPyrenees.Dorado-Linan.2012', 'Eur-FinnishLakelands.Helama.2014', 'Eur-Seebergsee.Larocque-Tobler.2012', 'Eur-NorthernScandinavia.Esper.2012', 'Arc-GulfofAlaska.Wilson.2014', 'Arc-Kittelfjall.Bjorklund.2012', 'Eur-L_tschen


<div class="alert alert-warning">
    The dataset was removed from `D_test` in the process. Hence, it's always prudent to make a copy of the original object when using the `remove` and `pop` functionalities. 
</div>


You can also remove/pop more than one dataset at a time:

In [47]:
rem = ['Ocn-RedSea.Felis.2000','Arc-Forfjorddalen.McCarroll.2013']

D_test.remove(rem)
print(D_test.get_all_dataset_names())

['Eur-Tallinn.Tarand.2001', 'Eur-CentralEurope.Dobrovoln_.2009', 'Eur-EuropeanAlps.B_ntgen.2011', 'Eur-CentralandEasternPyrenees.Pla.2004', 'Arc-Tjeggelvas.Bjorklund.2012', 'Arc-Indigirka.Hughes.1999', 'Eur-SpannagelCave.Mangini.2005', 'Ocn-AqabaJordanAQ19.Heiss.1999', 'Arc-Jamtland.Wilson.2016', 'Eur-RAPiD-17-5P.Moffa-Sanchez.2014', 'Eur-LakeSilvaplana.Trachsel.2010', 'Eur-NorthernSpain.Mart_n-Chivelet.2011', 'Eur-MaritimeFrenchAlps.B_ntgen.2012', 'Ocn-AqabaJordanAQ18.Heiss.1999', 'Arc-Tornetrask.Melvin.2012', 'Eur-EasternCarpathianMountains.Popa.2008', 'Arc-PolarUrals.Wilson.2015', 'Eur-LakeSilvaplana.Larocque-Tobler.2010', 'Eur-CoastofPortugal.Abrantes.2011', 'Eur-TatraMountains.B_ntgen.2013', 'Eur-SpanishPyrenees.Dorado-Linan.2012', 'Eur-FinnishLakelands.Helama.2014', 'Eur-Seebergsee.Larocque-Tobler.2012', 'Eur-NorthernScandinavia.Esper.2012', 'Arc-GulfofAlaska.Wilson.2014', 'Arc-Kittelfjall.Bjorklund.2012', 'Eur-L_tschental.B_ntgen.2006']
