# ODM2 Example 2: Load Data into ODM2 from an Excel Template File

This example shows how to load data from an ODM2 Specimen Excel Template into an ODM2 SQLite database instance using the ODM2 YODA Tools library and the ODM2 Python application programming interface (API). This example uses SQLite for the database because it doesn't require a server. However, the process for creating ODM2 databases using other relational database management systems is very similar. The ODM2 Python API and YODA Tools demonstrated here can be used with ODM2 databases implemented in:
* Microsoft SQL Server
* MySQL
* PostgreSQL
* SQLite
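
Later in this example the SQLite database is addressed with a connection string of the form `"sqlite:///" + path`. As a rough sketch of how the equivalent strings for the other backends might look, the hypothetical helper below builds SQLAlchemy-style URLs. The function name, the host, user, password, and database values are all placeholders, and it is an assumption that YODA Tools accepts these URLs unchanged for server-based backends (SQL Server in particular often needs extra driver parameters).

```python
# Hypothetical helper: builds SQLAlchemy-style connection URLs.
# Every host, user, password, and database name used here is a placeholder.

def build_connection_string(engine, database, user=None, password=None,
                            host=None, port=None):
    """Return a connection URL for one of the four backends listed above."""
    if engine == "sqlite":
        # SQLite needs only a file path: no server, user, or password.
        return "sqlite:///" + database
    if engine not in ("postgresql", "mysql", "mssql"):
        raise ValueError("Unsupported engine: {}".format(engine))
    netloc = "{}:{}@{}".format(user, password, host)
    if port is not None:
        netloc += ":{}".format(port)
    return "{}://{}/{}".format(engine, netloc, database)


print(build_connection_string("sqlite", "ODM2_Example2.sqlite"))
print(build_connection_string("postgresql", "odm2", "user", "secret",
                              "localhost", 5432))
```

Only the SQLite form is exercised in this notebook; the server-based forms are shown for orientation and have not been run against the tools used here.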

Details of the Specimen Excel Template (and others) for ODM2 can be found at: https://github.com/ODM2/YODA-File/tree/master/excel_templates. These Excel Templates were designed for investigators to enter their observations and metadata for parsing into an ODM2 database. We designed these templates under the premise that most scientists have and use Excel and can load their data into the template.

Details of the YODA Tools libraries can be found at: https://github.com/ODM2/YODA-Tools. YODA Tools is a code base for working with ODM2 related files, loading them into ODM2 instances, and exporting from ODM2 to files.

Details of the ODM2 Python API can be found at: https://github.com/ODM2/ODM2PythonAPI. The ODM2 Python API is an application programming interface for ODM2 databases that is cross platform and cross database compatible.

In [1]:
import os
import sqlite3
import sys
import shutil

from yodatools.converter.Inputs.excelInput import ExcelInput
from yodatools.converter.Outputs.dbOutput import dbOutput
from yodatools.converter.Outputs.yamlOutput import yamlOutput

import odm2api.ODM2.models as odm2models

from IPython.display import display, HTML

You really should upgrade to SQLAlchemy=>0.6 to get the full bootalchemy experience


In [2]:
# JUST FOR CHECKING PACKAGE VERSION (USE DURING DEV ONLY)
import sqlalchemy
import yodatools

print("Package versions: sqlalchemy {}, yodatools {}".format(sqlalchemy.__version__, 
                                                             yodatools.__version__))

Package versions: sqlalchemy 1.1.7, yodatools 0.2.0-alpha


### Create a New ODM2 Database to Load Data Into

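Set up the path to a new ODM2 SQLite database into which we can load data. As in Example 1, the file lives in the "data" directory. The database file itself is created further below by copying the one produced in Example 1.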

In [3]:
# Assign directory paths and SQLite file name
dpth = os.getcwd()
dbname_sqlite = "ODM2_Example2.sqlite"

sqlite_pth = os.path.join(dpth, os.path.pardir, "data", dbname_sqlite)

### Read the Excel Template Data File

The HydroShare resource containing this notebook also contains an ODM2 Excel Template file in the "data" directory. The file holds data derived from water quality samples collected at monitoring sites that are part of our iUTAH Gradients Along Mountain to Urban Transitions (GAMUT) monitoring network. This code opens the Excel template file and parses it using YODA Tools and the ODM2 Python API. Once the Excel Template file has been parsed, all of the data in the Excel file are available in the ODM2 Python API objects and can be accessed via code.

**NOTE:  This Excel template file contains a fairly large number of samples, so it takes a few seconds to parse.**

In [4]:
yodaxls_dbname = 'YODA_iUTAH_Specimen_Example.xlsx'

yoda_pth = os.path.join(dpth, os.path.pardir, "data", yodaxls_dbname)
print(yoda_pth)

excel = ExcelInput()

/usr/mayorgadat/workmain/proposals/MyProposals-Fellowships/2013_NSF_BiGCZ_SSI/ProjectWork/ProjectMeetings/2017_11_UserWorkshop/wshp2017_tutorial_content/notebooks/../data/YODA_iUTAH_Specimen_Example.xlsx


**Known issue: when the `parse` method below is applied to the `YODA_iUTAH_Specimen_Example_small.xlsx` file, it fails on the first run but succeeds on the second run. The cause has not been identified.**

In [5]:
excel.parse(yoda_pth)

48.5601329803


True
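
Given the first-run failure reported for `parse`, one possible workaround is to retry the call once. The sketch below is a generic helper, not part of YODA Tools: the name `call_with_retry` is hypothetical, and it assumes the failure surfaces as a Python exception, which the source does not confirm. A simulated flaky function stands in for the real parser.

```python
def call_with_retry(func, *args, **kwargs):
    """Call func once; if it raises, call it exactly one more time."""
    try:
        return func(*args, **kwargs)
    except Exception as first_error:
        print("First attempt failed ({}); retrying once.".format(first_error))
        return func(*args, **kwargs)


# Stand-in for a call that fails on its first run only.
attempts = {"count": 0}

def flaky_parse(path):
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("simulated first-run failure")
    return True


result = call_with_retry(flaky_parse, "example.xlsx")
```

In this notebook the equivalent call would be `call_with_retry(excel.parse, yoda_pth)`, under the same assumption about how the failure is reported.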

In [6]:
session = excel.sendODM2Session()
print("Done parsing Excel file!")

Done parsing Excel file!


### Get Data from the Current API Session

At this point, the data from the Excel Template file now exist in memory in the ODM2 Python API objects. We can do several things with the data now, including manipulating it or using it for visualization or analysis. We could write the data out to an operational ODM2 database, or we can write the data out to a YODA file.

The following is a quick example of a simple query to the current session where the data are now held in memory.

In [7]:
# Get all of the Methods that were loaded from the Excel file
methods = session.query(odm2models.Methods).all()
# Print some of the attributes of the methods
for x in methods:
    print("MethodCode: " + x.MethodCode + ", MethodName: " + x.MethodName + ", MethodTypeCV: " + x.MethodTypeCV)

MethodCode: Ast_TN, MethodName: Astoria Total Nitrogen, MethodTypeCV: Specimen analysis
MethodCode: Ast_TP, MethodName: Astoria Total Phosphorus, MethodTypeCV: Specimen analysis
MethodCode: Ast_EPA350.1, MethodName: Astoria EPA 350.1, MethodTypeCV: Specimen analysis
MethodCode: Ast_EPA353.2, MethodName: Astoria EPA 353.2, MethodTypeCV: Specimen analysis
MethodCode: Ast_EPA365.1, MethodName: Astoria EPA 365.1, MethodTypeCV: Specimen analysis
MethodCode: Pic_WaterISO, MethodName: Picarro Water Isotopes, MethodTypeCV: Specimen analysis
MethodCode: IDEXX_EC&TC, MethodName: IDEXX Ecoli & Total Coliform, MethodTypeCV: Specimen analysis
MethodCode: HIDS_WtrISO, MethodName: HIDS Water Isotopes, MethodTypeCV: Specimen analysis
MethodCode: Null_WtrTemp, MethodName: WaterTemp, MethodTypeCV: Unknown
MethodCode: Shim_DOC, MethodName: Shimadzu Dissolved Organic Carbon, MethodTypeCV: Specimen analysis
MethodCode: Shim_TDN, MethodName: Shimadzu Total Dissolved Nitrogen, MethodTypeCV: Specimen analysis

### Write the Data to the ODM2 Database

Now that the Excel template file has been parsed, all of the data exist in the API objects. The following code creates a fresh ODM2 SQLite database by copying the one from Example 1 and then writes the data to it. Although this example uses SQLite to avoid needing a separate database server, this functionality will also work with Microsoft SQL Server, MySQL, and PostgreSQL.

**NOTE: This Excel template file contains a fairly large number of samples, so it will take a bit to write it all to the SQLite database.**

You can download the ODM2 SQLite file that has been populated with the data from the Excel Template file using the link that is printed when you run this code.

In [8]:
# Create new ODM2 SQLite database by copying the one created in Example 1

# First check to see if the ODM2 SQLite file already exists from previous runs of this example. 
# If so, delete it.
if os.path.isfile(sqlite_pth):
    os.remove(sqlite_pth)

shutil.copy(os.path.join(dpth, os.path.pardir, "data", "ODM2_Example1.sqlite"), 
            sqlite_pth)
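
The copy step above depends on `ODM2_Example1.sqlite` already existing, which is only true if Example 1 has been run. A minimal sketch of a more defensive version follows. The helper name `fresh_copy` is hypothetical, and the demonstration uses throwaway files in a temporary directory rather than the real ODM2 databases.

```python
import os
import shutil
import tempfile


def fresh_copy(template_path, target_path):
    """Replace target_path with a fresh copy of template_path."""
    if not os.path.isfile(template_path):
        # Fail early with a clear message instead of a bare copy error.
        raise IOError("Template database not found: {}".format(template_path))
    if os.path.isfile(target_path):
        os.remove(target_path)
    shutil.copy(template_path, target_path)
    return target_path


# Demonstration with placeholder files.
workdir = tempfile.mkdtemp()
template = os.path.join(workdir, "blank.sqlite")
target = os.path.join(workdir, "copy.sqlite")
with open(template, "wb") as f:
    f.write(b"placeholder")

copied = fresh_copy(template, target)
```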

In [9]:
# Write the data to the SQLite database, using the connection string defined on the next line
dbconn_str = "sqlite:///" + sqlite_pth
do = dbOutput()
do.save(session, dbconn_str)

# Provide a link to the ODM2 SQLite file that the data were written to
print("\nYou can download the ODM2 SQLite database populated with data using the following link:")

# This is hard-wiring a path expectation. 
# Which is fine if we know what file path jupyter was started under

sqlite_relpth = os.path.join(os.path.pardir, "data", dbname_sqlite)
display(HTML('<a href="%s" target="_blank">%s</a>' % (sqlite_relpth, dbname_sqlite)))
# display(HTML('<a href=%s target="_blank">%s<a>' % ('data/%s' % dbname_sqlite, dbname_sqlite)))


You can download the ODM2 SQLite database populated with data using the following link:
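
To check that the write succeeded, the populated file can be inspected with Python's built-in `sqlite3` module, independently of the ODM2 Python API. The sketch below counts rows in every non-empty table. The helper name `table_row_counts` is hypothetical, and the demonstration builds a small stand-in database, so the table names shown are illustrative and not the output of this notebook.

```python
import os
import sqlite3
import tempfile


def table_row_counts(db_path):
    """Return {table_name: row_count} for every non-empty table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            # Names are read from the database itself and double-quoted as identifiers.
            n = conn.execute('SELECT COUNT(*) FROM "{}"'.format(name)).fetchone()[0]
            if n > 0:
                counts[name] = n
        return counts
    finally:
        conn.close()


# Demonstration on a small stand-in database (not the real ODM2 file).
demo_path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
demo_conn = sqlite3.connect(demo_path)
demo_conn.execute("CREATE TABLE Methods (MethodCode TEXT)")
demo_conn.execute("CREATE TABLE Placeholder (Value TEXT)")
demo_conn.executemany("INSERT INTO Methods VALUES (?)", [("Ast_TN",), ("Ast_TP",)])
demo_conn.commit()
demo_conn.close()

demo_counts = table_row_counts(demo_path)
print(demo_counts)
```

In this notebook the call would be `table_row_counts(sqlite_pth)`.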


### Write the Data to a YODA File
The data contained in the API objects can also be written out to a YAML Observations Data Archive (YODA) file.  After running the following code, you can download and examine the YAML file using the link that is printed.

In [10]:
# Write the output to a YODA file
yodaname = "ODM2_Example2.yaml"

yoda_relpth = os.path.join(os.path.pardir, "data", yodaname)

yo = yamlOutput()
yo.save(session, yoda_relpth)

# Provide a link to download the YODA file created
print("\nYou can download the populated YODA file using the following link:")

display(HTML('<a href="%s" target="_blank">%s</a>' % (yoda_relpth, yodaname)))
#display(HTML('<a href=%s target="_blank">%s<a>' % ('data/%s' % dbname, dbname)))


You can download the populated YODA file using the following link:
