# Extraction of temperature data out of the Temperature12K LiPD files

This notebook extracts all temperature series from the Temperature12K database and exports them to a tab-separated file. To get the latest version of the data, click on *Cell* &rarr; *Run all*. When the notebook has finished (i.e. when you see a table [at the bottom](#final)), you can download the temperature data [here](../data/binned-temperature-data.tsv).

This notebook downloads the database from http://lipdverse.org/globalHolocene/current_version, based on the version you specify in the `db_version` variable (see below). It was developed by Philipp Sommer (philipp.sommer@unil.ch). Please do not hesitate to get in touch if you run into any problems.

**Things you might want to adapt:**

- the database version string (`db_version`, see [here](#db_version))
- the `binwidth` (by default 100 years, see [here](#binwidth)). Each time series is averaged over bins of this width so that all series share the same time axis and can be merged
- the records you want to download (see [here](#filter))
- the meta data for the temperature export (see [here](#meta_cols))

In [1]:
import pandas as pd
import lipd
import numpy as np
import contextlib
import os
import os.path as osp
from urllib import request
import zipfile

<a id=binwidth></a>We have to specify the bin width in years. Each time series will be averaged into bins of this width so that all series can be merged.

In [2]:
binwidth = 100

<a id=db_version></a>Read in the LiPD data from http://lipdverse.org/globalHolocene/current_version

Set the latest version here manually. The string must match the version in the file name of the zip archive, `globalHolocene{db_version}.zip`.

In [3]:
db_version = '0_30_1'

In [4]:
%%time
# create the data directory if it does not exist yet
os.makedirs('../data', exist_ok=True)
zipped = f'globalHolocene{db_version}.zip'
uri = f'http://lipdverse.org/globalHolocene/current_version/{zipped}'
target = osp.join('../data', zipped)
print('downloading ' + uri)
request.urlretrieve(uri, target)
# unpack the LiPD files next to the archive
with zipfile.ZipFile(target) as f:
    f.extractall('../data')

downloading http://lipdverse.org/globalHolocene/current_version/globalHolocene0_30_1.zip
CPU times: user 289 ms, sys: 317 ms, total: 607 ms
Wall time: 5.58 s
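
Every run of the cell above downloads the archive again. A minimal sketch of how the download could be skipped when the zip file is already on disk; `fetch_if_missing` and its `retrieve` argument are hypothetical names introduced here and are not part of the original notebook:

```python
import os
import os.path as osp
from urllib import request


def fetch_if_missing(uri, target, retrieve=request.urlretrieve):
    """Download `uri` to `target` unless `target` already exists.

    Hypothetical helper for this sketch. Returns True if a download
    was triggered and False if the existing file was kept.
    """
    if osp.exists(target):
        return False
    # create the parent directory on demand (no error if it already exists)
    os.makedirs(osp.dirname(target) or '.', exist_ok=True)
    retrieve(uri, target)
    return True
```

In the download cell this would stand in for the direct `request.urlretrieve(uri, target)` call, i.e. `fetch_if_missing(uri, target)`. Deleting the zip file forces a fresh download.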


In [5]:
@contextlib.contextmanager
def remember_cwd():
    """Context manager that restores the current working directory on exit

    Usage::

        with remember_cwd():
            os.chdir('test')
            print(os.getcwd())  # .../test
        print(os.getcwd())      # back in the original directory"""
    curdir = os.getcwd()
    try:
        yield
    finally:
        # always switch back, even if the body raised an exception
        os.chdir(curdir)
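
On Python 3.11 and later, the standard library provides `contextlib.chdir`, which covers the same use case as `remember_cwd`. A minimal sketch that uses it when available and falls back to a hand-written version otherwise; the name `working_directory` is chosen for this example only:

```python
import contextlib
import os

if hasattr(contextlib, 'chdir'):
    # Python >= 3.11: use the standard-library context manager
    working_directory = contextlib.chdir
else:
    @contextlib.contextmanager
    def working_directory(path):
        """Change into `path` and return to the previous directory on exit."""
        previous = os.getcwd()
        os.chdir(path)
        try:
            yield
        finally:
            os.chdir(previous)
```

Unlike `remember_cwd`, this variant takes the target directory as an argument, so a call would read `with working_directory('../data/'):`.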

In [6]:
%%time
with remember_cwd():
    os.chdir('../data/')
    data = lipd.readLipd('.')

Disclaimer: LiPD files may be updated and modified to adhere to standards

Found: 943 LiPD file(s)
reading: Hams.Bennett.1987.lpd
reading: MV99-GC31.Barron.2012.lpd
reading: EagleTarn.Rees.2010.lpd
reading: Sonk11D.Lauterbach.2014.lpd
reading: HjortSo.Wagner.2008.lpd
reading: Alley.GISP2.2000.lpd
reading: Victoria.Berke.2012.lpd
reading: GeoB58044.Lamy.2003.lpd
reading: Frozen.Rosenberg.2004.lpd
reading: GeoB9307_3.lpd
reading: Composit_MD012421,KR0206.lpd
reading: FloatingIsland.Baker.1976.lpd
reading: LakeSix.Liu.1990.lpd
reading: ScreamingLynxLake.Clegg.2011.lpd
reading: LonarLake.Prasad.2014.lpd
reading: Malawi.Johnson.2016.lpd
reading: WCA3B_GumboLimboFarTail.Willard.2001.lpd
reading: Coghill.King.1986.lpd
reading: Rogers.Marsicek.2013.lpd
reading: LacduSommet.Hausmann.2011.lpd
reading: BanksIsland74MS15.Gajewski.2000.lpd
reading: Mohawk_Pond.Marsicek.2013.lpd
reading: Sagistalsee.Pollen.Switzerland.lpd
reading: MD03_2611.lpd
reading: ODP1084B.lpd
reading: GiK18515_3.lpd
reading: 

reading: GeoB6211_2.lpd
reading: MooseLake.Clegg.2010.lpd
reading: penny.Fisher.1998.lpd
reading: DonggeCave.Wang.2005.lpd
reading: MD03_2699.lpd
reading: YakYakumo.Leipe.2013.lpd
reading: LakeLyadhej-To.Andreev.2005.lpd
reading: igaliku.Massa.2012.lpd
reading: LochantSuidhe.Edwards.2007.lpd
reading: Khoe.Leipe.2015.lpd
reading: igelsjon..2003.lpd
reading: GeoB12605_3.lpd
reading: LacColin.Mott.1977.lpd
reading: Decoy.Szeicz.1991.lpd
reading: PC_01.lpd
reading: SouthRhodyBog.Booth.2012.lpd
reading: POS362_2_33.lpd
reading: Little.Whitlock.1995.lpd
reading: Yanahaizi.Chen.2003.lpd
reading: Egelsee.Larocque.2010.lpd
reading: Baldegg.Wirth.2013.lpd
reading: Camp11.Brubaker.1975.lpd
reading: MD99-2322.Jennings.2011.lpd
reading: Duck_Pond.Marsicek.2013.lpd
reading: soylegrotta.Lauritzen.1999.lpd
reading: OregonCaves.Ersek.2012.lpd
reading: GeoB10235.Kim.2002.lpd
reading: PerchLake_Manitoba.Ritchie.NA.lpd
reading: LV29_114_3.lpd
reading: MD84_527.lpd
reading: GeoB9508-5.Collins.2013.lpd
read

reading: tsuolbmajavri.Korhola.2009.lpd
reading: Hinterburgsee.Heiri.2003.lpd
reading: Bjornfjell.Brooks.2006.lpd
reading: renland.Vinther.2009.lpd
reading: BerryPondWest1979.lpd
reading: Asia-LianhuaCave.lpd
reading: BegbieLake.Brown.2019.lpd
reading: WCA3B_GumboLimboWetHead.Willard.2001.lpd
reading: Hani.Zheng.2017.lpd
reading: MD97_2151.lpd
reading: Gass.Webb.1983.lpd
reading: DavisLake.Whitlock.1981.lpd
reading: ME005A43JC.lpd
reading: CuevadeAsiul.Smith.2016.lpd
reading: rainbow.Clegg.2011.lpd
reading: V12-107.Schmidt.2016.lpd
reading: k2.Fallu.2005.lpd
reading: Gammelheimvatnet.Seppa.2009.lpd
reading: PlatypusTarn.Rees.2010.lpd
reading: Castor.Nelson.2011.lpd
reading: LakeLB1.Gajewski.1993.lpd
reading: CottonwoodLake.Grimm.1987.lpd
reading: DANA12_11_2_GC01.lpd
reading: Midden.Cluster1.2019.lpd
reading: LacAurelie.Bajolle.2018.lpd
reading: EDML.Stenni.2009.lpd
reading: CottonwoodPassPond.Fall.1997.lpd
reading: PP10-07.Mary.2016.lpd
reading: SouthAfricanSummerRainfallZone.Climate-

reading: Steel.Shuman.2016.lpd
reading: GeoB59012.Kim.2004.lpd
reading: ForestPond1.Lynch.1998.lpd
reading: MD97-2151.Yamamoto.2013.lpd
reading: Whatever.Wolfe.1996.lpd
reading: Topptjonna.Paus.2011.lpd
reading: TR163_22.lpd
reading: MB01.Pollen.Canada.lpd
reading: Mo05LakeMondsee.Swierczynski.2013.lpd
reading: Hulun.Chen.2008.lpd
reading: GeoB3129.Weldeab.2006.lpd
reading: P1B3.deVernal.2005.lpd
reading: ClearLake_Indiana.Bailey.1972.lpd
reading: Jenny.Larson.2016.lpd
reading: century.Vinther.2009.lpd
reading: LacNoir.Pollen.Canada.lpd
reading: Kollioksak.Anderson.NA.lpd
reading: yarnyshnoe.Seppa.2008.lpd
reading: Glattalp.Wirth.2013.lpd
reading: Telmen.Chen.2008.lpd
reading: LoughMeenachrinna,CountyDonegal.Taylor.2018.lpd
reading: quartz.Wooller.2012.lpd
reading: Gerzensee.Pollen.Switzerland.lpd
reading: SK168_GC_1.lpd
reading: KinderlinskayaCave.Baker.2017.lpd
reading: Colo.Baker.1990.lpd
reading: LacMarcotte.Labelle.1981.lpd
reading: SS8.Anderson.2012.lpd
reading: Furnival.McAndrew

Extract all the paleoData series from the data

In [7]:
%%time
all_series = lipd.extractTs(data)

extracting paleoData...
extracting: Hams.Bennett.1987
extracting: MV99-GC31.Barron.2012
extracting: EagleTarn.Rees.2010
extracting: Sonk11D.Lauterbach.2014
extracting: HjortSo.Wagner.2008
extracting: Alley.GISP2.2000
extracting: Victoria.Berke.2012
extracting: GeoB58044.Lamy.2003
extracting: Frozen.Rosenberg.2004
extracting: GeoB9307_3
extracting: Composit_MD012421,KR0206
extracting: FloatingIsland.Baker.1976
extracting: LakeSix.Liu.1990
extracting: ScreamingLynxLake.Clegg.2011
extracting: LonarLake.Prasad.2014
extracting: Malawi.Johnson.2016
extracting: WCA3B_GumboLimboFarTail.Willard.2001
extracting: Coghill.King.1986
extracting: Rogers.Marsicek.2013
extracting: LacduSommet.Hausmann.2011
extracting: BanksIsland74MS15.Gajewski.2000
extracting: Mohawk_Pond.Marsicek.2013
extracting: Sagistalsee.Pollen.Switzerland
extracting: MD03_2611
extracting: ODP1084B
extracting: GiK18515_3
extracting: DolgoeOzero.Pollen.Russia
extracting: MD06_3075
extracting: gunnarsfjorden.Allen.2007
extracting: 

extracting: Duck_Pond.Marsicek.2013
extracting: soylegrotta.Lauritzen.1999
extracting: OregonCaves.Ersek.2012
extracting: GeoB10235.Kim.2002
extracting: PerchLake_Manitoba.Ritchie.NA
extracting: LV29_114_3
extracting: MD84_527
extracting: GeoB9508-5.Collins.2013
extracting: MD02_2575.Ziegler.2008
extracting: ODP1098B
extracting: Moraine.Hansen.1985
extracting: MD99-2284.Risebrobakken.2009
extracting: Chaohu.Pollen.China
extracting: KirmanLake.MacDonald.2016
extracting: MoonLake.Shuman.2016
extracting: hallet.McKay.2009
extracting: LacBrule.Pollen.Canada
extracting: LakeStowell.Lemmen&Lacourse.2018
extracting: klotjarnen.Seppa.2009
extracting: HodellMiragoane1991
extracting: BS7938
extracting: DSDP480.Barron.2004
extracting: JR51-GC35.Bendle.2007
extracting: SaintCalixte.Larouche.NA
extracting: MD982181
extracting: LakeLR3.Gajewski.1992
extracting: vikjordvatnet.Balascio.2012
extracting: WCA3B_GumboLimboMarsh.Willard.2001
extracting: GEOFAR_KF16_MgCa.Repschlager.2016
extracting: EagleLa

extracting: Cummins.McAndrews.1984
extracting: LakeXimencuo.Pollen.China
extracting: lake850.Shemesh.2001
extracting: Kharinei.Jones.2011
extracting: LakeCF8.Axford.2011
extracting: Kurupa.Boldt.2015
extracting: LawDome.Dahl-Jensen.1999
extracting: HulunLake.Wen.2009
extracting: HomesteadScarp.McGlone.2010
extracting: CangoCave.Climate-12k-Site-Data-Entry-Form-v3.Cango
extracting: DevonIslandGlacier.McAndrews.1984
extracting: SaintGabriel.Larouche.1977
extracting: Edward.Russell.2003
extracting: Cadagno.Wirth.2013
extracting: NP04KH3KH4.Tierney.2008
extracting: Hudson.Bailey.1972
extracting: LagoMoreno.Pollen.Argentina
extracting: LostLake_MT.Whitlock.1989
extracting: ODP1019D
extracting: LittleBass.Swain.NA
extracting: arapisto.Sarmaja-Korjonen.2007
extracting: DonggeCave.Dyoski.2005
extracting: Hypkana.Hajkova.2016
extracting: LakeKupal'noe.Ilyashuk.2013
extracting: MSM05-723..2012
extracting: Pingwang.Pollen.China
extracting: SO139-74KL.Luckge.2008
extracting: LakeVan.Chen.2008
extr

extracting: LS009.Ledu.2010
extracting: MD99-2266.Moossen.2015
extracting: Mackenzie.Woltering.2014
extracting: Reiarsdalvatnet.Seppa.2009
extracting: ESM-1.Mackay.2012
extracting: andy.Szeicz.1995
extracting: MD95-2011.Calvo.2002
extracting: Tourbiere_deLanorie_CoteauJaune.Comtois.1982
extracting: NorthGRIP.Gkinis.2014
extracting: hjort.Schmidt.2011
extracting: MD952043
extracting: Eldora.Maher.1969
extracting: Buckbean.Baker.1976
extracting: Turkana.Berke.2012
extracting: Sellevollmyra.Vorren.2007
extracting: SchellingsBog.Barron.2004
extracting: MD99-2256.Jennings.2015
extracting: Pixie.Brown.2002
extracting: Leviathan.Lachniet.2014
extracting: FiddlersPond.White.1982
extracting: GISP2.Kobashi.2012
extracting: Boone.White.1986
extracting: LoneFoxLake.MacDonald.1985
extracting: Minnie.McAndrews.1992
extracting: OroLake.Michels.2007
extracting: A7.Sun.2005
extracting: 74KL_TEX86.Huguet.2006
extracting: Cub.Rasmussen.1982
extracting: bjornfjelltjorn.Brooks.2006
extracting: TTR17_434G.R

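<a id='filter'></a>Keep only the series that are part of the Temp12k compilation, are flagged for use in the global temperature analysis, and have units of degrees Celsius. Adapt these filters to change which records are exported.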

In [8]:
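# keep series of the Temp12k compilation that are flagged for the global
# temperature analysis and whose values are given in degrees Celsius
filtered_ts_temp12k = lipd.filterTs(all_series, 'paleoData_inCompilation == Temp12k')
filtered_ts_useinglobal = lipd.filterTs(filtered_ts_temp12k, 'paleoData_useInGlobalTemperatureAnalysis == TRUE')
temperatures = lipd.filterTs(filtered_ts_useinglobal, 'paleoData_units == degC')
# list the fields available for the first series
sorted(temperatures[0])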

Found 1085 matches from 6396 columns
Found 548 matches from 1085 columns
Found 509 matches from 548 columns


['@context',
 'age',
 'ageUnits',
 'agesPerKyr',
 'archiveType',
 'archiveTypeOriginal',
 'createdBy',
 'dataSetName',
 'depth',
 'depthUnits',
 'geo_countryOcean',
 'geo_location',
 'geo_meanElev',
 'geo_meanLat',
 'geo_meanLon',
 'geo_notes',
 'geo_siteName',
 'investigator',
 'lastUpdated',
 'lipdVersion',
 'lipdverseLink',
 'maxYear',
 'minYear',
 'mode',
 'nUniqueAges',
 'neotomaCoreID',
 'paleoData_QCCertification',
 'paleoData_QCLastUpdated',
 'paleoData_TSid',
 'paleoData_calibration_method',
 'paleoData_calibration_methodDetail',
 'paleoData_calibration_uncertainty',
 'paleoData_calibration_uncertaintyType',
 'paleoData_datum',
 'paleoData_filename',
 'paleoData_hasMaxValue',
 'paleoData_hasMeanValue',
 'paleoData_hasMedianValue',
 'paleoData_hasMinValue',
 'paleoData_hasResolution_hasMaxValue',
 'paleoData_hasResolution_hasMeanValue',
 'paleoData_hasResolution_hasMedianValue',
 'paleoData_hasResolution_hasMinValue',
 'paleoData_inCompilation',
 'paleoData_interpretation',
 'p
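
The three `lipd.filterTs` calls in the cell above narrow the series down step by step. As an illustration of the selection logic only (this is not how `lipd` implements it, and its exact matching rules may differ), the same criteria can be written as a list comprehension over the series dictionaries. `select_series` and the toy records are assumptions made for this sketch:

```python
def select_series(all_series, criteria):
    """Keep the entries whose fields equal every value in `criteria`.

    Plain-Python illustration of chained equality filters; entries that
    lack one of the keys are dropped.
    """
    return [ts for ts in all_series
            if all(ts.get(key) == value for key, value in criteria.items())]


criteria = {
    'paleoData_inCompilation': 'Temp12k',
    'paleoData_useInGlobalTemperatureAnalysis': 'TRUE',
    'paleoData_units': 'degC',
}

# toy records, made up for this example
toy = [
    {'dataSetName': 'A', 'paleoData_inCompilation': 'Temp12k',
     'paleoData_useInGlobalTemperatureAnalysis': 'TRUE',
     'paleoData_units': 'degC'},
    {'dataSetName': 'B', 'paleoData_inCompilation': 'Temp12k',
     'paleoData_useInGlobalTemperatureAnalysis': 'TRUE',
     'paleoData_units': 'kelvin'},
    {'dataSetName': 'C', 'paleoData_units': 'degC'},
]
selected = select_series(toy, criteria)
print([ts['dataSetName'] for ts in selected])
```

Only record `A` satisfies all three conditions; `B` fails on the units and `C` lacks the compilation fields.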

<a id='meta_cols'></a>Transform the temperatures into pandas series. `meta_cols` lists the meta data fields that should be available as columns in the data frame; each entry has to match one of the keys in the previous list. `meta_names` specifies the name under which the corresponding field from `meta_cols` appears in the final data frame.

In [9]:
meta_cols = ['geo_meanLon', 'geo_meanLat', 'dataSetName', 'paleoData_variableName']
meta_names = ['lon', 'lat', 'dataSetName', 'variableName']

# one series per record, indexed by age; records without an age are skipped
series = [
    pd.Series(
        np.asarray(d['paleoData_values'], dtype=float),
        index=np.asarray(d['age'], dtype=float),
        name=tuple(d.get(name, np.nan) for name in meta_cols))
    for d in temperatures if 'age' in d]
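
To see what this cell produces, here is a minimal sketch with a single made-up record (the values are invented for illustration and do not come from the database). The meta data tuple becomes the `name` of the series, and fields that are missing from the record fall back to `NaN`:

```python
import numpy as np
import pandas as pd

meta_cols = ['geo_meanLon', 'geo_meanLat', 'dataSetName', 'paleoData_variableName']

# invented record; the real records come from lipd.filterTs
record = {
    'age': ['50', '150', '175'],
    'paleoData_values': ['8.0', '7.5', 'nan'],
    'geo_meanLon': 10.0,
    'geo_meanLat': 50.0,
    'dataSetName': 'Toy.Example.2000',
}

s = pd.Series(
    np.asarray(record['paleoData_values'], dtype=float),
    index=np.asarray(record['age'], dtype=float),
    name=tuple(record.get(name, np.nan) for name in meta_cols))

print(s.name)
```

The toy record has no `paleoData_variableName`, so the last element of the name tuple is `NaN`.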

Now bin the series with the chosen `binwidth` (centennial by default) and merge them into a single data frame.

In [10]:
def age_grouper(age):
    """Return the lower edge of the bin (of width `binwidth`) that contains `age`"""
    return age - (age % binwidth)

In [11]:
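# average each series within its bins
binned = [s.groupby(age_grouper).mean() for s in series]

# start with the first series and name the column levels after the meta data
merged = binned[0].to_frame()
merged.columns.names = meta_names

# outer-join the remaining series on the binned age
for s in binned[1:]:
    merged = merged.merge(s.to_frame(), left_index=True, right_index=True,
                          how='outer')

# one row per series, one column per age bin
merged = merged.T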
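
Merging the series one by one copies the growing frame on every iteration. A possible alternative, sketched here under the assumption that each binned series has a unique age index (which holds after `groupby(...).mean()`), is a single `pd.concat` call. The helper name `combine` and the toy series are made up for this example:

```python
import pandas as pd

meta_names = ['lon', 'lat', 'dataSetName', 'variableName']


def combine(binned, meta_names):
    """Align all binned series on their age index, one row per series."""
    # temporary integer names keep the column labels simple during concat
    frame = pd.concat([s.rename(i) for i, s in enumerate(binned)], axis=1)
    frame = frame.sort_index()
    # restore the meta data tuples as a multi-level column index
    frame.columns = pd.MultiIndex.from_tuples(
        [s.name for s in binned], names=meta_names)
    return frame.T


binned = [
    pd.Series([1.0, 2.0], index=[0.0, 100.0],
              name=(10.0, 50.0, 'Toy.A', 'temperature')),
    pd.Series([3.0], index=[200.0],
              name=(20.0, 60.0, 'Toy.B', 'temperature')),
]
merged = combine(binned, meta_names)
print(merged)
```

Bins that a series does not cover are filled with `NaN`, as with the outer merge.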

Now save the results

In [12]:
# pass the separator by keyword; recent pandas versions no longer accept it positionally
merged.to_csv('../data/binned-temperature-data.tsv', sep='\t')
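
To load the exported file again, the four meta data columns can be restored as a row index. A minimal sketch that uses a small invented frame and a temporary file in place of `../data/binned-temperature-data.tsv`:

```python
import os
import tempfile

import pandas as pd

# invented stand-in for `merged`
index = pd.MultiIndex.from_tuples(
    [(10.0, 50.0, 'Toy.A', 'temperature'),
     (20.0, 60.0, 'Toy.B', 'temperature')],
    names=['lon', 'lat', 'dataSetName', 'variableName'])
toy = pd.DataFrame([[1.0, None], [None, 3.0]],
                   index=index, columns=[0.0, 100.0])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'binned-temperature-data.tsv')
    toy.to_csv(path, sep='\t')
    # the first four columns hold the meta data
    loaded = pd.read_csv(path, sep='\t', index_col=[0, 1, 2, 3])

# column labels are read back as strings, so convert them to numeric ages
loaded.columns = [float(c) for c in loaded.columns]
print(loaded)
```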

<a id='final'></a>That's it! If the notebook has finished, you can download the temperature data [here](../data/binned-temperature-data.tsv) and the corresponding meta data [here](../data/meta.tsv) as tab-separated files.

So let's have a look at the final temperature data:

In [13]:
merged

| lon | lat | dataSetName | variableName | -300.0 | -200.0 | -100.0 | 0.0 | 100.0 | 200.0 | 300.0 | 400.0 | 500.0 | 600.0 | ... | 1237500.0 | 1240500.0 | 1241800.0 | 1243100.0 | 1261400.0 | 1263300.0 | 1264800.0 | 1266200.0 | 1267900.0 | 1277300.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| -80.41 | 43.2400 | Hams.Bennett.1987 | temperature | | | 8.015000 | 8.005000 | 7.798000 | | 8.595000 | | 8.849000 | 8.500000 | ... | | | | | | | | | | |
| 146.5914 | -42.6799 | EagleTarn.Rees.2010 | temperature | | | | 9.618000 | | | 10.489000 | 8.589000 | 7.566500 | 7.509000 | ... | | | | | | | | | | |
| -38.5 | 72.6000 | Alley.GISP2.2000 | temperature | | | | -37.785174 | | | | | | | ... | | | | | | | | | | |
| 33.1983 | -1.2317 | Victoria.Berke.2012 | temperature | | | | | | | | | | | ... | | | | | | | | | | |
| 37.38 | -18.5650 | GeoB9307_3 | temperature | | | | | | 25.300000 | | | | 25.900000 | ... | | | | | | | | | | |
| 141.8 | 36.0000 | Composit_MD012421,KR0206 | temperature | | | 17.465000 | 17.236000 | 17.115000 | 17.500000 | 17.330000 | 17.542000 | 17.978000 | 17.810000 | ... | | | | | | | | | | |
| -107.47 | 44.5500 | FloatingIsland.Baker.1976 | temperature | | | | 2.167000 | | | | | | 2.808000 | ... | | | | | | | | | | |
| -81.32 | 48.4000 | LakeSix.Liu.1990 | temperature | | | | 2.175000 | | 2.321000 | | 2.175000 | | | ... | | | | | | | | | | |
| 34.4372 | -11.2939 | Malawi.Johnson.2016 | temperatureCorrected | | | | | | | 24.115000 | 21.910000 | 23.190000 | | ... | 23.34 | 20.62 | 17.26 | 17.46 | 24.2 | 19.51 | 22.02 | 17.9 | 17.65 | 21.32 |
| -80.5056 | 25.7747 | WCA3B_GumboLimboFarTail.Willard.2001 | temperature | | | 23.636000 | 23.425000 | 23.580000 | | 23.527000 | | | | ... | | | | | | | | | | |


In [14]:
# show only the series that have a value in the first age bin column
merged.loc[merged.iloc[:, 0].notnull()]

| lon | lat | dataSetName | variableName | -300.0 | -200.0 | -100.0 | 0.0 | 100.0 | 200.0 | 300.0 | 400.0 | 500.0 | 600.0 | ... | 1237500.0 | 1240500.0 | 1241800.0 | 1243100.0 | 1261400.0 | 1263300.0 | 1264800.0 | 1266200.0 | 1267900.0 | 1277300.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| -120.82 | 49.54 | BCTempComp.Gavin.2013 | temperature | -0.6715 | -0.5005 | | -1.126667 | -0.414 | 0.6675 | 0.28 | 0.3125 | -0.4055 | | ... | | | | | | | | | | |
| -105.54 | 40.08 | Redrock.Maher.1972 | temperature | 2.295 | | 1.665 | | 2.173 | | | 2.102 | | 2.739 | ... | | | | | | | | | | |
