<div class="clearfix" style="padding: 10px; padding-left: 0px; padding-top: 40px">
<img src="https://www.dropbox.com/s/y8dd1z3sl4uofep/lipd_logo.png?raw=1" width="700px" class="pull-right" style="display: inline-block; margin: 0px;">
</div>

## Welcome to the LiPD Quickstart Notebook!

This Notebook was created to help you get familiar with the commands you can use in LiPD. Follow each step and experiment as much as you'd like! Each step will help prepare you for the next step. 

For this tutorial we will be using the example files found in the GitHub repository's [Examples folder](https://github.com/nickmckay/LiPD-utilities/tree/master/Examples).


### Table of Contents
  * [Install Package](#install)
  * [Import Package](#lipdimport)
  * [Reading Files](#lipdread)
  * [Excel Converter](#excel)
  * [NOAA Converter](#noaa)
  * [DOI Updater](#doi)
  * [Writing Files](#writelipds)
  * [Pickling Data](#pickle)
  * [Other Functions](#other)
  * [Show Data](#showdata)
  * [Get Data](#getdata)
  * [TimeSeries](#timeseries)
  * [TimeSeries Object](#tso)
  * [Pandas Dataframes](#pandas)
  * [Removing LiPDs](#removelipds)
  * [Glossary](#glossary)


## Install Package <a id="install"></a>

This guide assumes that you have already followed the [installation steps](https://nickmckay.github.io/LiPD-utilities/) and the LiPD package is installed on your computer. 

##  Import Package <a id="lipdimport"></a>


Import the LiPD package into your Python environment.


In [None]:
# Import the LiPD package
import lipd

##  Reading Files<a id="lipdread"></a>

There are three valid file formats that you may import: LiPD (.lpd), Excel (.xls, .xlsx), and NOAA (.txt).
Use the appropriate function for the file type that you would like to read.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p> All "read" functions accept a file path argument. If the path argument is left empty, a GUI window will appear and allow you to choose a file or choose a directory.
</p><br>
<p>
If you want to select multiple files, but not a whole directory, use any function under "# Read File" without a path argument. Use the GUI to select the files you want.  
</p>
</div>
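
As a rough illustration of "use the appropriate function for the file type", the sketch below maps a file extension to the name of the matching read function listed in this Notebook. The `READERS` table and the `pick_reader` helper are hypothetical and are not part of the LiPD package; the file names are placeholders.

```python
import os

# Extension -> name of the read function listed in this Notebook.
# This mapping is an illustration only, not part of the LiPD package.
READERS = {
    ".lpd": "readLipd",
    ".xls": "readExcel",
    ".xlsx": "readExcel",
    ".txt": "readNoaa",
}


def pick_reader(path):
    """Return the name of the read function that matches the file extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in READERS:
        raise ValueError("Unsupported file type: " + repr(ext))
    return READERS[ext]


# Placeholder file names, used only to show the lookup
for name in ["ODP1098B12.lpd", "my_data.XLSX", "noaa_record.txt"]:
    print(name, "->", pick_reader(name))
```

The lookup is case-insensitive, so `.XLSX` and `.xlsx` are treated the same way.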


In [None]:
# Read File - GUI
lipd.readLipd()
lipd.readExcel()
lipd.readNoaa()

# Read Directory - GUI
lipd.readLipds()
lipd.readExcels()
lipd.readNoaas()

# Read with path argument - No GUI
lipd.readLipd("/path/to/file.lpd")
lipd.readLipds("/path/to/dir/")


####  Excel Spreadsheet Converter <a id="excel"></a>

-----

Microsoft Excel spreadsheets must be converted to LiPD before any LiPD functions can be used. Use the [Excel template](https://github.com/nickmckay/LiPD-utilities/raw/master/Examples/excel_lipd_v1.2.xlsx) to create an Excel file with your data.  Make sure to follow the formatting guidelines and the hints noted throughout the spreadsheet.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p> The Excel converter runs on every `.xls` and `.xlsx` file currently loaded.
</p>
</div>

In [None]:
lipd.excel()

####  NOAA Converter <a id="noaa"></a>

-----

National Oceanic and Atmospheric Administration (NOAA) text files must be converted to LiPD before any LiPD functions can be used. The converter is designed to parse data from the [NOAA template](https://github.com/nickmckay/LiPD-utilities/raw/master/Examples/noaa_v2.0.txt). Please insert your data in this template format to ensure a complete and accurate conversion to LiPD. Use the example file as a reference for correct formatting.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p> The NOAA converter runs on every `.txt` file currently loaded.

</p>
</div>

In [None]:
lipd.noaa()

####  DOI Updater <a id="doi"></a>

-----

The DOI updater takes your LiPD files and updates them with the most recent information provided by [doi.org](https://doi.org). The updater runs once per LiPD file and skips any LiPD files that were updated previously.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p> `doi()` will update the LiPD files stored on your computer. It will not update LiPD files currently loaded.
</p>
</div>
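
For readers unfamiliar with DOIs, the sketch below shows how a DOI string relates to a doi.org link. It is a minimal illustration, not what `lipd.doi()` does internally: the `doi_to_url` helper, the loose pattern, and the DOI `10.1234/example.5678` are all made up for this example.

```python
import re

# Loose pattern for a DOI: "10.", a registrant code of 4-9 digits, "/", a suffix.
# This is an approximation for illustration, not a full validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Strip common prefixes and return an https://doi.org/ link for the DOI."""
    cleaned = doi.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError("Not a DOI: " + repr(doi))
    return "https://doi.org/" + cleaned


# Made-up DOI, used only to show the conversion
print(doi_to_url("doi:10.1234/example.5678"))
```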

In [None]:
lipd.doi()

##  Writing Files <a id="writelipds"></a>

-----
Save all datasets currently loaded to LiPD files. 


In [None]:
# Write Files - GUI
lipd.writeLipds()

# Write with path argument - No GUI
lipd.writeLipds("/path/to/dir/")

##  Pickling Data <a id="pickle"></a>

-----
The `pickle` module is part of the Python standard library and lets us share LiPD data with Python 2.7 users. Pickled data won't have the support or functions of LiPD Utilities, but it will give access to the data.

A pickle file (.pklz) is a gzip-compressed archive file. Its small size makes sharing easy.
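
To give a feel for the size difference, the sketch below pickles a small dictionary in memory and compares the byte count before and after gzip compression. The data is made up and deliberately repetitive, so the savings on real datasets will differ.

```python
import gzip
import pickle

# Made-up data standing in for a dataset; real LiPD data will differ.
data = {"depth": list(range(1000)), "sst": [12.5] * 1000}

# Protocol 2 is the newest pickle protocol that Python 2.7 can read.
raw = pickle.dumps(data, protocol=2)
compressed = gzip.compress(raw)

print("pickled bytes:   ", len(raw))
print("compressed bytes:", len(compressed))

# Decompressing and unpickling gives back an equal object.
restored = pickle.loads(gzip.decompress(compressed))
print("round trip ok:", restored == data)
```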

In [None]:
import pickle
import gzip

# Write a pickle file
# protocol=2 is the newest pickle protocol that Python 2.7 can read
yourData = {'a': 'blah', 'b': list(range(10))}
with gzip.open('filename.pklz', 'wb') as f:
    pickle.dump(yourData, f, protocol=2)

# Read a pickle file
with gzip.open('filename.pklz', 'rb') as f:
    newData = pickle.load(f)

##  Other Functions<a id="other"></a>

-----
The functions below are not critical to the use of LiPD Utilities, but are included for convenience as helper functions that may make your workflow easier. 

####  SHOW data  <a id="showdata"></a>

----
`Show` functions are useful for printing data to the console.

`showLipds()`<br>
* Show the names of the LiPD files in the current LiPD Library.

`showMetadata(filename)`<br>
* Show metadata for a specific dataset

`showCsv(filename)`<br>
* Show CSV data for a specific dataset

`showDfs(dataframe_dictionary)`<br>
* Show a list of dataframes in a dataframe dictionary

`showTso(tso)`<br>
* Show the keys found in the current time series object.
* Note: only available after using `extractTs()`


####  GET data  <a id="getdata"></a>
----

`Get` functions are useful for retrieving data and placing it in the workspace as a variable.

`getCsv(filename)`<br>
* Returns: dictionary
* Get the values for the specific dataset.
<br>

`getMetadata(filename)`<br>
* Returns: dictionary
* Get the metadata for the specific dataset.
<br>


In [None]:
odp_csv = lipd.getCsv("ODP1098B12.lpd")

In [None]:
odp_metadata = lipd.getMetadata("ODP1098B12.lpd")

### TimeSeries  <a id="timeseries"></a>

----

TimeSeries functions are useful for creating, filtering, and exporting TimeSeries from the LiPD data in the workspace.

`extractTs()`<br>
* Returns: dictionary
* Creates a time series from the data in the workspace.
<br>

`collapseTs(time_series)`<br>
* Puts time series data back into the workspace data. 
<br>

`find(expression, time_series)`<br>
* Returns: Names of matching records(array), Filtered time series (dictionary)
* Find all time series objects that match certain criteria.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**WARNING**</p>
    <p> `extractTs()` will NOT create time series objects for age/year/depth variables. All age/year/depth data is included with other time series objects (when available) to allow for comparison calculations.</p>
    <br>
    <p>`collapseTs()` will overwrite the contents of the LiPD Library.</p>
</div>
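
The idea in the warning above can be sketched with plain dictionaries. This toy example is not how `extractTs()` is implemented: the `toy_extract` helper, the `dataSetName`, `variableName`, and `values` keys, and all data values are assumptions made for illustration. It shows each measured variable becoming its own entry, while the age and depth columns travel along with every entry instead of getting entries of their own.

```python
# Toy stand-in for one measurement table; column names and values are made up.
table = {
    "depth": [0.5, 1.5, 2.5],
    "age": [100, 250, 400],
    "sst": [1.2, 1.4, 0.9],
    "d18o": [3.1, 3.3, 3.0],
}

# Variables that describe position in time or depth rather than a measurement.
AXIS_KEYS = ("age", "year", "depth")


def toy_extract(dataset_name, table):
    """Return one flat entry per measured variable, each carrying the axis columns."""
    axes = {key: table[key] for key in AXIS_KEYS if key in table}
    entries = {}
    for column, values in table.items():
        if column in AXIS_KEYS:
            continue  # age/year/depth get no entry of their own
        entry = {"dataSetName": dataset_name, "variableName": column, "values": values}
        entry.update(axes)  # attach the available axis columns to this entry
        entries[dataset_name + "_" + column] = entry
    return entries


series = toy_extract("ToySite", table)
print(sorted(series))                 # names of the entries that were created
print(sorted(series["ToySite_sst"]))  # keys carried by one entry
```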

### TimeSeries Object (example)  <a id="tso"></a>

----

A time series holds many time series objects. Below is an example of what the contents of a time series object look like. 


In [16]:
%%html
<img src="./tso1.png" />
<img src="./tso2.png" />

In [None]:
time_series = lipd.extractTs()

In [None]:
new_time_series = lipd.find("archiveType is marine sediment", time_series)

In [None]:
new_time_series = lipd.find("geo_meanElev <= -1000 && geo_meanElev > -1100", time_series)
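
The two `find` expressions above read like ordinary boolean conditions on each time series object. The sketch below applies the same two conditions to a hand-written dictionary using plain Python. The `toy_filter` helper and the three entries are hypothetical, and the expression parsing that `find` performs is not reproduced here.

```python
# Hypothetical, hand-written entries; real time series objects carry many more keys.
toy_series = {
    "SiteA_sst": {"archiveType": "marine sediment", "geo_meanElev": -1010},
    "SiteB_d18o": {"archiveType": "marine sediment", "geo_meanElev": -2500},
    "SiteC_width": {"archiveType": "tree", "geo_meanElev": 1200},
}


def toy_filter(series, predicate):
    """Return (matching names, filtered dictionary) for entries that satisfy predicate."""
    matches = {name: entry for name, entry in series.items() if predicate(entry)}
    return sorted(matches), matches


# Same condition as "geo_meanElev <= -1000 && geo_meanElev > -1100"
names, filtered = toy_filter(toy_series, lambda e: -1100 < e["geo_meanElev"] <= -1000)
print(names)

# Same condition as "archiveType is marine sediment"
names, filtered = toy_filter(toy_series, lambda e: e["archiveType"] == "marine sediment")
print(names)
```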

In [None]:
lipd.collapseTs(time_series)

### Pandas  Dataframes<a id="pandas"></a>

----

`ensToDf(arrays)`
* Returns: data frame (obj)
* Create an ensemble data frame from some given nested numpy arrays

`lipdToDf(filename)`
* Returns: data frame(s) (dictionary)
* Creates a collection of pandas data frames from LiPD data

`tsToDf(time_series, filename)`
* Returns: data frame(s) (dictionary)
* Creates a collection of pandas data frames from a TimeSeries object. The CSV data frame will include depth, age, and year columns when available.


<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
    <p>After creating the data frames, calling a specific data frame variable will display the formatted data frame. </p>
</div>

In [None]:
dfs_lipd = lipd.lipdToDf("ODP1098B12.lpd")

In [None]:
lipd.showDfs(dfs_lipd)

In [None]:
dfs_lipd["metadata"]

In [None]:
dfs_lipd["paleoData"]["ODP1098B12.Paleo1.measurementTable1.csv"]

In [None]:
dfs_lipd["chronData"]["ODP1098B12.Chron1.measurementTable1.csv"]
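
The lookups above suggest a nested layout: a `metadata` entry at the top level, plus `paleoData` and `chronData` entries that are keyed by table file name. The sketch below walks a toy dictionary with that assumed shape. Plain lists stand in for pandas data frames, and the `list_frames` helper and the `ToySite` table names are hypothetical.

```python
# Plain lists stand in for pandas data frames; the table file names are made up.
toy_dfs = {
    "metadata": ["<metadata frame>"],
    "paleoData": {"ToySite.Paleo1.measurementTable1.csv": ["<paleo frame>"]},
    "chronData": {"ToySite.Chron1.measurementTable1.csv": ["<chron frame>"]},
}


def list_frames(dfs):
    """Return the key path to every frame in a nested dictionary of frames."""
    paths = []
    for section, content in dfs.items():
        if isinstance(content, dict):
            # One path per table stored under this section
            paths.extend((section, table) for table in content)
        else:
            # The section itself is a single frame
            paths.append((section,))
    return paths


for path in list_frames(toy_dfs):
    print(" -> ".join(path))
```

Each printed path is the sequence of keys needed to reach one frame, for example `toy_dfs["paleoData"]["ToySite.Paleo1.measurementTable1.csv"]`.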

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
    <p>Filename autocomplete does not work for TimeSeries objects. Be sure to run `showTsos()` and copy and paste the name of the TimeSeries object of interest.</p>
</div>

In [None]:
dfs_ts = lipd.tsToDf(time_series, "ODP1098B12_data_SST")

In [None]:
lipd.showDfs(dfs_ts)

In [None]:
dfs_ts["metadata"]

In [None]:
dfs_ts["paleoData"]

In [None]:
dfs_ts["chronData"]["ODP1098B12"]

####  Removing LiPDs <a id="removelipds"></a>
----

`removeLipd(filename)`
* Remove one dataset from the LiPD Library

`removeLipds()`
* Remove all datasets from the LiPD Library

In [None]:
lipd.removeLipds()

## Glossary <a id="glossary"></a>

----

<div style="background-color: #f2f2f2">
<dl style="padding-top: 20px">

<dt style="margin-left: 0.5em"> DOI </dt><br>
<dd style= "margin-left: 2em"> A Digital Object Identifier is a unique alphanumeric string assigned by a registration agency to identify content and provide a persistent link to its location on the Internet. The publisher assigns a DOI when your article is published and made available electronically. </dd><br>

<dt style="margin-left: 0.5em"> Environment / Workspace </dt><br>
<dd style= "margin-left: 2em"> The current state of the Notebook. Variables and modules in the Notebook are constantly changing, and all of these contribute to the state of the workspace. </dd><br>

<dt style="margin-left: 0.5em"> LiPD </dt><br>
<dd style= "margin-left: 2em"> Refers to the LiPD package or LiPD files, depending on the context.  </dd><br>

<dt style="margin-left: 0.5em"> LiPD Library </dt><br>
<dd style= "margin-left: 2em"> A collection of LiPD file data. </dd><br>

<dt style="margin-left: 0.5em"> Module </dt><br>
<dd style= "margin-left: 2em"> A set of related functions that is imported for use in the Notebook.</dd><br>

<dt style="margin-left: 0.5em"> NOAA </dt><br>
<dd style= "margin-left: 2em"> National Oceanic and Atmospheric Administration. This document uses NOAA to signify a specific text file format used by the organization. </dd><br>

<dt style="margin-left: 0.5em"> Notebook </dt><br>
<dd style= "margin-left: 2em"> Jupyter uses Notebooks as a way to save a single session of workflow and scientific computations. This Quickstart Notebook is for learning and documentation, though you may later create your own Notebooks with graphs, functions, and various datasets. </dd><br>

<dt style="margin-left: 0.5em"> Magic Commands </dt><br>
<dd style= "margin-left: 2em"> Special built-in Jupyter commands that provide common useful functions that "magically" work. </dd><br>

<dt style="margin-left: 0.5em"> TimeSeries Library </dt><br>
<dd style= "margin-left: 2em"> A collection of TimeSeries data and objects. </dd><br>

<dt style="margin-left: 0.5em"> TimeSeries Object </dt><br>
<dd style= "margin-left: 2em"> An extracted piece of the TimeSeries from a LiPD file. </dd><br>

</dl>
</div>