# Evaluation, analysis, and reporting on LTER metadata from DataONE

The first step is to extract every node that contains text, whether element or attribute, into a csv that flattens the xml while retaining all information except the order of elements (the XSL has a parameter to extract that information if you're interested). Each row also records the collection and record that the xpath and content come from, which makes it easy to combine them with other years at the site.

Second, we create a version of the data that contains only the xpaths from the LTER recommendation. To do this I've used the EML xpaths that correspond with the EML Best Practices for LTER Sites recommendation and, in some cases, the element name. This instantiation of the recommendation does not go all the way into the child elements necessary for the recommendation, but it is applied in a way that captures all the child elements used. The result therefore contains all of the metadata that a site used to add context to the concepts the recommendation contains.

Next, these csv files are analyzed for counts and occurrence and combined by site.

Finally, to compare directly the child elements each site uses, we pivot the data to contain the highest-occurring child element at each site for every year the site contributed metadata, and visualize the completeness. This process is repeated for each year across LTER and for each version of the Best Practices.

Combined reports comparing each site over time, and a report for LTER over time, are then created.

A final cell is available that can generate detailed reports, including record content, for any site__year combination. The evaluated data could also be used with more in-depth tests that look at the content of each element, and the quality with which it fulfills a community's information needs could be visualized in the same fashion as the completeness of collections for a community's recommendation. When a repository can apply tests like these to the records in its catalog, it can ensure the curation of its content, even when the repository is an interdisciplinary one like DataONE. This allows users to trust that the repository includes all the concepts a potential reuser would need to identify, discover, evaluate, access, and integrate a dataset into their own scientific research, as LTER has done.

This is the NbMeta metadata record for this notebook:

[Create_a_metadataset_nbmeta.json](../metadata/Evaluate_Analyze_and_Report_metadata_nbmeta.json)

## Prepare the notebook

* import modules
* define variables
* define recommendations

In [1]:
import sys
import os
import pandas as pd
import gzip
import shutil
import subprocess
import tarfile

#import local python module
sys.path.append(os.path.join(os.path.dirname(sys.path[0]),'../scripts'))
import EARmd as md

'''
If you want to place your reports into a specific folder in a Google Drive,
copy the part of the folder's url after "folders/" and paste it into the
MyfolderID variable below; otherwise set it to None.
To make the md.WriteToGoogle function work, you'll need to update
"scripts/client_secrets.json" and settings.yaml with your own client id
and client secret. If you don't want to go through this process,
you can simply generate the Excel spreadsheets or click on the links
in the nbviewer version of this notebook to view the Google Sheet versions
of the reports.
'''
MyfolderID = 'yourFolderIDhere'

# The list of LTER sites to evaluate, analyze and report on
Sites = ['and','arc','bes','bnz','cce',
         'cdr','cap','cwt','fce','gce',
         'hfr','hbr','jrn','kbs','knz',
         'luq','mcm', 'mcr','nwt','ntl',
         'pal','pie','sbc','sev','sgs','vcr']


# Every year we have metadata from LTER sites
YearsInvestigated = ['2005','2006','2007','2008','2009',
                     '2010','2011','2012','2013','2014',
                     '2015','2016','2017','2018'
                    ]
os.makedirs("../data/LTER", exist_ok=True)

# create a list of each collection's name
collectionsToProcess = [name for name in os.listdir("../collections/LTER") if not name.startswith('.') ]

### Define a recommendation for each version of the EML Best Practices for LTER Sites

We need to be able to look at the recommendation specific elements and attributes used in each version. To create analyses and reports for this subset of the evaluated data, we will need to build a few lists and a dictionary for each version.
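
Because several of these structures are ordered lists, it can help to confirm they line up before running the reports. The sketch below is an illustration only: it assumes the reporting functions pair the level, concept, and element lists by position, and `check_recommendation` is a hypothetical helper, not part of the notebook's module. The three-item lists are toy stand-ins for the full definitions.

```python
def check_recommendation(rec_dict, level_order, concept_order, element_order):
    """Return a list of problems found; an empty list means the pieces line up."""
    problems = []
    # Assumption: the three order lists are matched position by position
    if not (len(level_order) == len(concept_order) == len(element_order)):
        problems.append("order lists differ in length")
    # Every element name the dictionary maps to should have a place in the table order
    missing = set(rec_dict.values()) - set(element_order)
    if missing:
        problems.append("elements missing from element order: " + ", ".join(sorted(missing)))
    return problems


# Toy stand-ins for RecDict2004, LevelOrder2004, ConceptOrder2004, ElementOrder2004
toy_dict = {"/eml:eml/@packageId": "packageId",
            "/eml:eml/dataset/title": "title",
            "Number of Records": "Number of Records"}
toy_levels = ["Number of Records", "Identification", "Identification"]
toy_concepts = ["Number of Records", "Resource Identifier", "Resource Title"]
toy_elements = ["Number of Records", "packageId", "title"]

toy_problems = check_recommendation(toy_dict, toy_levels, toy_concepts, toy_elements)
```

Running the same check against the real lists for each version would flag a list that gained or lost an entry during editing.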

#### 2004 EML Best Practices for LTER Sites

Representation of the first version of LTER's EML Best Practices for LTER Sites

<p>Adapted from: </p>
    
[https://lternet.edu/wp-content/uploads/2010/12/emlbestpractices_oct2004_final.pdf](https://lternet.edu/wp-content/uploads/2010/12/emlbestpractices_oct2004_final.pdf)

In [2]:
# create a pattern to look for elements used in fulfilling the community's stated information needs
LTERrecV2004Elements = ['/eml:eml/@packageId',
 '/eml:eml/dataset/title',
 '/eml:eml/dataset/creator',
 '/eml:eml/dataset/metadataProvider',
 '/eml:eml/dataset/associatedParty',
 '/eml:eml/dataset/publisher',
 '/eml:eml/dataset/pubDate',
 '/eml:eml/dataset/contact',
 '/eml:eml/dataset/abstract',
 '/eml:eml/dataset/keywordSet/keyword',
 '/eml:eml/dataset/distribution',
 '/eml:eml/dataset/coverage/geographicCoverage',
 '/eml:eml/dataset/coverage/taxonomicCoverage',
 '/eml:eml/dataset/coverage/temporalCoverage',
 '/eml:eml/dataset/maintenance',
 '/eml:eml/dataset/intellectualRights',
 '/eml:eml/dataset/methods',
 '/eml:eml/dataset/project',
 '/eml:eml/dataset/dataTable/entityDescription',
 '/eml:eml/dataset/dataTable/attributeList/attribute/attributeDefinition',
 '/eml:eml/access',
 '/eml:eml/dataset/dataTable/physical/dataFormat',
 '/eml:eml/dataset/dataTable/attributeList',
 '/eml:eml/dataset/dataTable/constraint',
 '/eml:eml/dataset/methods/qualityControl']

# a dictionary containing the recommendation xpaths and the relevant sub element
RecDict2004 = {"/eml:eml/@packageId": "packageId",
           "/eml:eml/dataset/title": "title",
           "/eml:eml/dataset/creator": "creator",
           "/eml:eml/dataset/metadataProvider": "metadataProvider",
           "/eml:eml/dataset/associatedParty": "associatedParty",
           "/eml:eml/dataset/publisher": "publisher",
           "/eml:eml/dataset/pubDate": "pubDate",
           "/eml:eml/dataset/contact": "contact",
           "/eml:eml/dataset/abstract": "abstract",
           "/eml:eml/dataset/keywordSet/keyword": "keyword",
           "/eml:eml/dataset/distribution": "distribution",
           "/eml:eml/dataset/coverage/geographicCoverage": "geographicCoverage",
           "/eml:eml/dataset/coverage/taxonomicCoverage": "taxonomicCoverage",
           "/eml:eml/dataset/coverage/temporalCoverage": "temporalCoverage",
           "/eml:eml/dataset/maintenance": "maintenance",
           "/eml:eml/dataset/intellectualRights": "intellectualRights",
           "/eml:eml/dataset/methods/qualityControl": "qualityControl",
           "/eml:eml/dataset/methods": "methods",
           "/eml:eml/dataset/project": "project",
           "/eml:eml/dataset/dataTable/entityDescription": "entityDescription",
           "/eml:eml/dataset/dataTable/attributeList/attribute/attributeDefinition": "attributeDefinition",
           "/eml:eml/access": "access",
           "/eml:eml/dataset/dataTable/physical/dataFormat": "dataFormat",
           "/eml:eml/dataset/dataTable/attributeList": "attributeList",
           "/eml:eml/dataset/dataTable/constraint": "constraint",
           "Number of Records": "Number of Records"
          }
# define a list of element recommendation level
LevelOrder2004 = ["Number of Records",
    'Identification',"Identification","Identification","Identification","Identification","Identification","Identification","Identification","Identification","Identification","Identification",
    "Discovery","Discovery","Discovery","Discovery",
    "Evaluation","Evaluation","Evaluation","Evaluation","Evaluation",
    "Access","Access",
    "Integration","Integration","Integration",]

# create a list to order the table that corresponds with the order of the LTER recommendation levels. 
ElementOrder2004 = ["Number of Records","packageId","title","creator","metadataProvider","associatedParty","publisher","pubDate","contact","abstract","keyword","distribution",
                "geographicCoverage","taxonomicCoverage","temporalCoverage","maintenance",
                "intellectualRights","methods","project","entityDescription","attributeDefinition",
                "access","dataFormat",
                "attributeList","constraint","qualityControl"]
# Used to order a dataframe in the order of the recommendation
ConceptOrder2004 = ['Number of Records',
                'Resource Identifier',
                "Resource Title",
                "Author / Originator",
                "Metadata Contact",
                "Contributor Name",
                "Publisher",
                "Publication Date",
                "Resource Contact",
                "Abstract",
                "Keyword",
                "Resource Distribution",
                "Spatial Extent",
                "Taxonomic Extent",
                "Temporal Extent",
                "Maintenance",
                "Resource Use Constraints",
                "Process Step",
                "Project Description",
                "Entity Type Definition",
                "Attribute Definition",
                "Resource Access Constraints",
                "Resource Format",
                "Attribute List",
                "Attribute Constraints",
                "Resource Quality Description"]

#### 2011 EML Best Practices for LTER Sites
<p>Representation of the EML Best Practices for LTER Sites Version 2 from 2011</p>
<p>Adapted from: </p>
    
[https://lternet.edu/wp-content/uploads/2011/08/emlbestpractices-2.0-FINAL-20110802.pdf](https://lternet.edu/wp-content/uploads/2011/08/emlbestpractices-2.0-FINAL-20110802.pdf)

In [3]:
# create a pattern to look for elements used in fulfilling the community's stated information needs
LTERrecV2011Elements = [
    '/eml:eml/@xsi:schemaLocation',# recommended
    '/eml:eml/@packageId',
    '/eml:eml/@system',# optional
    '/eml:eml/access',# optional
    '/eml:eml/dataset/alternateIdentifier',
    '/eml:eml/dataset/title', 
    '/eml:eml/dataset/creator',
    '/eml:eml/dataset/contact',# required
    '/eml:eml/dataset/metadataProvider',
    '/eml:eml/dataset/associatedParty',
    '/eml:eml/dataset/publisher',
    '/eml:eml/dataset/pubDate',
    '/eml:eml/dataset/abstract',
    '/eml:eml/dataset/project/abstract',
    '/eml:eml/dataset/keywordSet',
    '/eml:eml/dataset/project/keywordSet',
    '/eml:eml/dataset/intellectualRights',
    '/eml:eml/dataset/distribution',
    '/eml:eml/dataset/coverage',
    '/eml:eml/dataset/maintenance',
    '/eml:eml/dataset/methods',
    '/eml:eml/dataset/project',
    '/eml:eml/dataset/dataTable',
    '/eml:eml/dataset/spatialRaster',
    '/eml:eml/dataset/spatialVector',
    '/eml:eml/dataset/storedProcedure',
    '/eml:eml/dataset/view',
    '/eml:eml/dataset/otherEntity',
    "/eml:eml/dataset/dataTable/attributeList",
    "/eml:eml/dataset/spatialRaster/attributeList",
    "/eml:eml/dataset/spatialVector/attributeList",
    "/eml:eml/dataset/storedProcedure/attributeList",
    "/eml:eml/dataset/view/attributeList",
    "/eml:eml/dataset/otherEntity/attributeList",
    "/eml:eml/dataset/dataTable/constraint",
    "/eml:eml/dataset/spatialRaster/constraint",
    "/eml:eml/dataset/spatialVector/constraint",
    "/eml:eml/dataset/storedProcedure/constraint",
    "/eml:eml/dataset/view/constraint",
    "/eml:eml/dataset/otherEntity/constraint",
    '/eml:eml/additionalMetadata']

# A dictionary containing the recommendation xpaths and the relevant sub element.
RecDict2011 = {'/eml:eml/@xsi:schemaLocation': "xsi:schemaLocation",
            "/eml:eml/@packageId": "packageId",
            '/eml:eml/@system': 'system',
            "/eml:eml/access": "access",
            '/eml:eml/dataset/alternateIdentifier': "alternateIdentifier",
            "/eml:eml/dataset/title": "title",
            "/eml:eml/dataset/creator": "creator",
            "/eml:eml/dataset/contact": "contact",
            "/eml:eml/dataset/metadataProvider": "metadataProvider",
            "/eml:eml/dataset/associatedParty": "associatedParty",
            "/eml:eml/dataset/publisher": "publisher",
            "/eml:eml/dataset/pubDate": "pubDate",
            "/eml:eml/dataset/abstract": "abstract",
            '/eml:eml/dataset/project/abstract': "abstract",
            "/eml:eml/dataset/keywordSet": "keywordSet",
            "/eml:eml/dataset/project/keywordSet": "keywordSet",
            "/eml:eml/dataset/intellectualRights": "intellectualRights",
            "/eml:eml/dataset/distribution": "distribution",
            "/eml:eml/dataset/coverage": "coverage",
            "/eml:eml/dataset/maintenance": "maintenance",
            "/eml:eml/dataset/methods": "methods",
            "/eml:eml/dataset/project": "project",
            "/eml:eml/dataset/dataTable/attributeList": "attributeList",
            "/eml:eml/dataset/spatialRaster/attributeList": "attributeList",
            "/eml:eml/dataset/spatialVector/attributeList": "attributeList",
            "/eml:eml/dataset/storedProcedure/attributeList": "attributeList",
            "/eml:eml/dataset/view/attributeList": "attributeList",
            "/eml:eml/dataset/otherEntity/attributeList": "attributeList",
            "/eml:eml/dataset/dataTable/constraint": "constraint",
            "/eml:eml/dataset/spatialRaster/constraint": "constraint",
            "/eml:eml/dataset/spatialVector/constraint": "constraint",
            "/eml:eml/dataset/storedProcedure/constraint": "constraint",
            "/eml:eml/dataset/view/constraint": "constraint",
            "/eml:eml/dataset/otherEntity/constraint": "constraint",
            "/eml:eml/dataset/dataTable": "[entity]",
            "/eml:eml/dataset/spatialRaster": "[entity]",
            "/eml:eml/dataset/spatialVector": "[entity]",
            "/eml:eml/dataset/storedProcedure": "[entity]",
            "/eml:eml/dataset/view": "[entity]",
            "/eml:eml/dataset/otherEntity": "[entity]",
            "/eml:eml/dataset/project": "project",

            '/eml:eml/additionalMetadata': 'additionalMetadata',
            "Number of Records": "Number of Records"
           }
# define a list of element recommendation level
LevelOrder2011 = ["Number of Records",'','','','','','','','','','','','','','','','','','','','','','','','']

# create a list to order the table that corresponds with the order of the LTER recommendation levels. 
ElementOrder2011 = ["Number of Records",
                 'xsi:schemaLocation',
                 'packageId',
                 'system',# optional
                 'access',# optional
                 'alternateIdentifier',
                 'title', 
                 'creator',
                 'contact',
                 'metadataProvider',
                 'associatedParty',
                 'publisher',
                 'pubDate',
                 'abstract',
                 'keywordSet',
                 'intellectualRights',
                 'distribution',
                 'coverage',
                 'maintenance',
                 'methods',
                 'project',
                 '[entity]',
                 'attributeList',
                 'constraint',
                 'additionalMetadata']
# Used to order a dataframe in the order of the recommendation
ConceptOrder2011 = ['Number of Records','','','','','','','','','','','','','','','','','','','','','','','','']

## Evaluation using the AllNodes.xsl transform

This XSL is standards agnostic. AllNodes will work with any number of valid XML records, regardless of their standards compliance or creativity.
The transform flattens the XML of every record in a directory into a csv. For each element or attribute that has text, the XSL writes a row that contains the directory name, file name, text content, and xpath.
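
To make the idea of flattening concrete, here is a minimal Python sketch of the same kind of output. It is not the AllNodes.xsl transform and does not reproduce its exact columns or namespace handling; `flatten_xml` is a hypothetical helper, and the sample record, package id, and file name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def flatten_xml(xml_text, collection, record):
    """Return one (collection, record, xpath, content) row per text-bearing node."""
    root = ET.fromstring(xml_text)
    rows = []

    def walk(element, path):
        # Attributes become their own rows, addressed as path/@name
        for name, value in element.attrib.items():
            rows.append((collection, record, path + "/@" + name, value))
        # Elements only produce a row when they hold non-whitespace text
        text = (element.text or "").strip()
        if text:
            rows.append((collection, record, path, text))
        for child in element:
            walk(child, path + "/" + child.tag)

    walk(root, "/" + root.tag)
    return rows


# Made-up, namespace-free record used only to show the shape of the rows
sample = ('<eml packageId="knb-lter-and.1.1"><dataset>'
          '<title>Stream chemistry</title></dataset></eml>')
flat_rows = flatten_xml(sample, "AND__2005", "record1.xml")
```

Each row keeps the collection and record it came from, which is what lets the later cells combine collections without losing track of where a value originated.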


In [4]:

# use the list of collections to run the evaluation for each collection
for collection in collectionsToProcess:

    # Build the command that runs the evaluation XSL.
    # You'll need Java installed; the first item in the cmd list is its location.
    xpath_eval_file = "../data/LTER/" + str(collection) + "_XpathEvaluated.csv"
    cmd = ["/usr/bin/java",
           '-jar', "../scripts/saxon-b-9.0.jar",
           '-xsl:' + "../scripts/AllNodes.xsl",
           '-s:' + "../scripts/dummy.xml",
           '-o:' + xpath_eval_file,
           'recordSetPath=' + "../collections/LTER/" + str(collection) + "/"]
    # run the transform; passing the list directly avoids shell quoting problems
    subprocess.run(cmd, check=True)

    # compress the csv, then remove the uncompressed copy once both files are closed
    gzxpath_eval_file = xpath_eval_file + '.gz'
    with open(xpath_eval_file, 'rb') as f:
        with gzip.open(gzxpath_eval_file, 'wb') as gzf:
            shutil.copyfileobj(f, gzf)
    os.remove(xpath_eval_file)

## Analysis using the EARmd.py module
The module has already been used to retrieve the records with the Requests module. Now we use the flat structure of the evaluated metadataset to analyze the metadata with pandas for elements in the LTER recommendation. This process yields three versions of the dataset: the complete output of the evaluation, plus one subset for each of the two recommendation patterns. Each version is organized differently. All three have an analysis applied called XpathOccurrence, which returns information about the occurrence of each xpath used in the collection's records. The most important figure is the percentage of records that contain each element.



In [5]:
for collection in collectionsToProcess:
    # places for all the evaluated and analyzed data
    XpathEvaluated = os.path.join("../data/LTER/", collection + "_XpathEvaluated.csv.gz")
    XpathOccurrence = os.path.join("../data/LTER/", collection +'_XpathOccurrence.csv')

    # Read in the evaluated metadata
    EvaluatedDF = pd.read_csv(XpathEvaluated)

    # Use the above dataframe and apply the XpathOccurrence function from the md module
    md.XpathOccurrence(EvaluatedDF, collection, XpathOccurrence)
    
    # Apply the recommendation to the collection
    md.applyRecommendation(LTERrecV2004Elements, 'BestPractices2004', collection)
    
    md.applyRecommendation(LTERrecV2011Elements, 'BestPractices2011', collection)
    


## Create reports 

#### LTER EML All Elements Usage and visualization
* The first row is the number of records. Use the *RecordCount* column
* Rows are Xpath in any record throughout the collection
* Columns are XpathCount, RecordCount, AverageOccurrencePerRecord, CollectionOccurrence%
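
The columns listed above can be approximated with a few lines of pandas. This is a sketch under stated assumptions, not the module's XpathOccurrence function: the input column names `Record` and `XPath` are hypothetical, the data is made up, and `AverageOccurrencePerRecord` is computed here as the count divided by the number of records that contain the xpath, which is only one plausible reading of that column.

```python
import pandas as pd

# Made-up evaluated data: one row per text-bearing node
evaluated = pd.DataFrame({
    "Record": ["a.xml", "a.xml", "a.xml", "b.xml"],
    "XPath": ["/eml:eml/dataset/title",
              "/eml:eml/dataset/creator",
              "/eml:eml/dataset/creator",
              "/eml:eml/dataset/title"],
})

n_records = evaluated["Record"].nunique()

# Count every use of an xpath, and the number of distinct records that use it
occurrence = (
    evaluated.groupby("XPath")
    .agg(XpathCount=("Record", "size"), RecordCount=("Record", "nunique"))
    .reset_index()
)
# Assumption: average is taken over the records that contain the xpath
occurrence["AverageOccurrencePerRecord"] = occurrence["XpathCount"] / occurrence["RecordCount"]
# Share of the collection's records that contain the xpath at least once
occurrence["CollectionOccurrence%"] = 100 * occurrence["RecordCount"] / n_records
```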

#### Recommendation Elements Usage
* Same as the All Elements Usage analysis, but limited to elements and their children that occur in the conceptual recommendation.

We will first apply a list of xpaths from a "50 thousand foot view". That is, instead of explicitly looking for each child element of /eml:eml/dataset/contact, we look for any xpath that contains /eml:eml/dataset/contact. This gives us a version of the evaluation that contains the elements important to fulfilling specific recommendation needs. It also offers additional insight into how element choices shift over time.
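
A small sketch of this kind of matching follows. It is not the module's applyRecommendation function; the column name `XPath`, the helper `in_recommendation`, and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Two parent xpaths standing in for a full recommendation list
recommendation_xpaths = ["/eml:eml/dataset/contact", "/eml:eml/dataset/title"]

occurrence = pd.DataFrame({
    "XPath": [
        "/eml:eml/dataset/contact/individualName/surName",
        "/eml:eml/dataset/contact/electronicMailAddress",
        "/eml:eml/dataset/title",
        "/eml:eml/dataset/language",
    ]
})


def in_recommendation(xpath):
    # Keep an xpath when it contains any recommendation xpath,
    # so the parent and all of its children are retained
    return any(pattern in xpath for pattern in recommendation_xpaths)


subset = occurrence[occurrence["XPath"].map(in_recommendation)]
```

Matching on the parent keeps whichever children a site actually used, so the subset shows how each site chose to fill in a concept rather than only whether it did.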

#### Recommendation Concepts Usage
* Take the occurrence percentage from the most used child element for each recommendation level parent element, and assign it to the element to get a high level view of recommendation compliance over time.
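
One way to picture this roll-up is sketched below. The percentages are made up, `concept_for` is a hypothetical helper, and choosing the longest matching parent is an assumption made so that nested parents (for example methods and methods/qualityControl) resolve to the more specific one; the module may resolve them differently.

```python
import pandas as pd

# A two-entry stand-in for a recommendation dictionary such as RecDict2004
rec_dict = {"/eml:eml/dataset/contact": "contact",
            "/eml:eml/dataset/title": "title"}

# Made-up occurrence percentages for one collection
occurrence = pd.DataFrame({
    "XPath": ["/eml:eml/dataset/contact/individualName/surName",
              "/eml:eml/dataset/contact/electronicMailAddress",
              "/eml:eml/dataset/title"],
    "CollectionOccurrence%": [80.0, 95.0, 100.0],
})


def concept_for(xpath):
    # Assumption: the longest matching parent xpath names the element
    matches = [parent for parent in rec_dict if xpath.startswith(parent)]
    return rec_dict[max(matches, key=len)] if matches else None


occurrence["Element"] = occurrence["XPath"].map(concept_for)
# The most used child stands in for the parent element
concept_usage = occurrence.groupby("Element")["CollectionOccurrence%"].max()
```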

Use the analyzed data to create reports for each LTER site through time, and LTER the organization through time. All reports are created as Excel spreadsheets then uploaded and translated into Google Sheets, and shared as a viewable link. To manipulate any of the data yourself copy the Sheet or download as an Excel file.

#### Completeness Through Time
* Visualize the completeness percentage under each Best Practices version for each LTER site for the years 2005-2018, and take the mean of each year to visualize LTER's average completeness through time as a way to estimate the likelihood that a catalog will address the information needs of LTER data users and producers.
<p>Gordon, S 2019 Is your metadata catalog in shape?. Zenodo. https://doi.org/10.5281/zenodo.2558631</p>
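
The yearly mean described above can be sketched with a pivot. The table layout, column names, and numbers here are assumptions for illustration and do not come from the LTER results.

```python
import pandas as pd

# Made-up completeness percentages, one row per site and year
completeness = pd.DataFrame({
    "Site": ["AND", "AND", "ARC", "ARC"],
    "Year": ["2005", "2006", "2005", "2006"],
    "Completeness%": [40.0, 60.0, 50.0, 90.0],
})

# One row per site, one column per year
by_site = completeness.pivot(index="Site", columns="Year", values="Completeness%")

# Mean across sites for each year gives the organization-level trend
lter_mean = by_site.mean(axis=0)
```

By default the mean skips missing values, so a site that contributed no metadata in a given year does not pull that year's average down.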

### Combine collection analyses then create reports for each site through time and for LTER as an organization

In [6]:
os.makedirs("../reports/LTER", exist_ok=True)

for Site in Sites:
    # places for all the combined data and SiteThroughTimeReport
    DataDestination = os.path.join('../reports/LTER/', Site.upper() + ".xlsx")
    XpathOccurrence = os.path.join("../data/LTER/", Site.upper() + '_XpathOccurrence.csv')
    BestPractices2004Occurrence = os.path.join("../data/LTER/", Site.upper() + '_BestPractices2004Occurrence.csv')
    BestPractices2004Concept = os.path.join('..','data','LTER', Site.upper() + '_BestPractices2004Completeness.csv')
    BestPractices2004Graph = os.path.join('..','data','LTER', Site.upper() + '_BestPractices2004_.png')
    BestPractices2011Occurrence = os.path.join("../data/LTER/", Site.upper() + '_BestPractices2011Occurrence.csv')
    BestPractices2011Concept = os.path.join('..','data','LTER', Site.upper() + '_BestPractices2011Completeness.csv')
    BestPractices2011Graph = os.path.join('..','data','LTER', Site.upper() + '_BestPractices2011_.png')
    
    # combine the absolute occurrence analysis for a site through time
    XpathOccurrenceToCombine = [os.path.join("../data/LTER", name) for name in os.listdir("../data/LTER") if name.startswith(Site.upper() + "__") and name.endswith('_XpathOccurrence.csv') ]
    md.CombineXPathOccurrence(XpathOccurrenceToCombine,
                              XpathOccurrence, to_csv=True)
    
    # Build lists of recommendation specific occurrence analysis for a site through time  
    BestPractices2004OccurrenceToCombine = [os.path.join("../data/LTER", name) for name in os.listdir("../data/LTER") if name.startswith(Site.upper() + "__") and name.endswith('_BestPractices2004Occurrence.csv') ]
    BestPractices2011OccurrenceToCombine = [os.path.join("../data/LTER", name) for name in os.listdir("../data/LTER") if name.startswith(Site.upper() + "__") and name.endswith('_BestPractices2011Occurrence.csv') ]
    
    # utilize function to combine the recommendation specific analyses 
    md.CombineAppliedRecommendation(Site, LTERrecV2004Elements, 'BestPractices2004', BestPractices2004OccurrenceToCombine)
    md.CombineAppliedRecommendation(Site, LTERrecV2011Elements, 'BestPractices2011', BestPractices2011OccurrenceToCombine)
    
    # create recommendation pivot tables and radar graphs to assess the parent elements' usage through time
    md.Site_ttConceptAnalysis(Site, 'BestPractices2004', RecDict2004, LevelOrder2004, ConceptOrder2004, ElementOrder2004, YearsInvestigated, folderID=MyfolderID)
    md.Site_ttConceptAnalysis(Site, 'BestPractices2011', RecDict2011, LevelOrder2011, ConceptOrder2011, ElementOrder2011, YearsInvestigated, folderID=MyfolderID)
    
    #write full quality image to Google Drive and get a link to insert next to the lower-quality picture in the google sheet
    BestPractices2004GraphLink = md.WriteToGoogle(
        os.path.join('..','data','LTER',Site.upper() + '_BestPractices2004_.png'), folderID=MyfolderID, Convert=None, Link=True)
    BestPractices2011GraphLink = md.WriteToGoogle(
        os.path.join('..','data','LTER',Site.upper() + '_BestPractices2011_.png'), folderID=MyfolderID, Convert=None, Link=True)
                                       
    #create Excel report on all analyses, write additional functions on data to provide some collection analytics
    md.CombinationSpreadsheet(XpathOccurrence, BestPractices2004Occurrence,
                              BestPractices2004Concept, BestPractices2004Graph,
                              BestPractices2004GraphLink, DataDestination,
                              recommendationOccurrence2=BestPractices2011Occurrence,
                              RecommendationConcept2=BestPractices2011Concept,
                              RecommendationGraph2=BestPractices2011Graph,
                              RecGraphLink2=BestPractices2011GraphLink
                             )
    # write the spreadsheet to Google Drive, convert to Sheet
    md.WriteToGoogle(DataDestination, folderID=MyfolderID, Convert=True)

### Create a report on the entire organization through time

It may be valuable to compare site__year collections directly to see whether trends occur in the recommendation elements across the sites in certain years. For example, do sites that start participating later go through similar improvement processes, or do early adopters' experiences inform the structure (shape) of their metadata in observable ways?
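
As a starting point for that kind of comparison, collections can be grouped by year. The sketch assumes collection names follow the SITE__YEAR pattern that the file filters in this notebook rely on; `split_collection` and the names in the list are hypothetical.

```python
def split_collection(name):
    """Split a collection name such as 'AND__2005' into (site, year)."""
    site, _, year = name.partition("__")
    return site, year


# Hypothetical collection names following the SITE__YEAR pattern
collections = ["AND__2005", "ARC__2005", "AND__2006"]

# Group sites by the year they contributed metadata
sites_by_year = {}
for name in collections:
    site, year = split_collection(name)
    sites_by_year.setdefault(year, []).append(site)
```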

In [7]:
# place for the report for the entire organization
DataDestination = os.path.join('../reports/LTER/', "All_LTER.xlsx")

#create variables for functions
BestPractices2004Occurrence = os.path.join("../data/LTER/", 'LTER_BestPractices2004Occurrence.csv')
BestPractices2004Concept = os.path.join('..','data','LTER', 'LTER_BestPractices2004Completeness.csv')
BestPractices2004Graph = os.path.join('..','data','LTER', 'LTER_BestPractices2004_.png')
BestPractices2011Occurrence = os.path.join("../data/LTER/", 'LTER_BestPractices2011Occurrence.csv')
BestPractices2011Concept = os.path.join('..','data','LTER', 'LTER_BestPractices2011Completeness.csv')
BestPractices2011Graph = os.path.join('..','data','LTER', 'LTER_BestPractices2011_.png')       
XpathOccurrence = os.path.join('..','data','LTER','LTER_XpathOccurrence.csv')

# combine individual analyses from each site__year
XpathOccurrenceToCombine = [os.path.join('..','data','LTER', name) for name in os.listdir("../data/LTER") if "__" in name and name.endswith('_XpathOccurrence.csv') ]

md.CombineXPathOccurrence(XpathOccurrenceToCombine,
                              XpathOccurrence, to_csv=True)

# Lists of recommendation occurrence analyses to combine
BestPractices2004OccurrenceToCombine = [os.path.join('..','data','LTER', name) for name in os.listdir("../data/LTER") if "__" in name and name.endswith('_BestPractices2004Occurrence.csv') ]
BestPractices2011OccurrenceToCombine = [os.path.join('..','data','LTER', name) for name in os.listdir("../data/LTER") if "__" in name and name.endswith('_BestPractices2011Occurrence.csv') ]

# utilize function to combine the recommendation specific analyses 
md.CombineAppliedRecommendation('LTER', LTERrecV2004Elements, 'BestPractices2004', BestPractices2004OccurrenceToCombine)
md.CombineAppliedRecommendation('LTER', LTERrecV2011Elements, 'BestPractices2011', BestPractices2011OccurrenceToCombine)

# create recommendation pivot tables and radar graphs
md.Organization_ttConceptAnalysis(Sites, 'BestPractices2004', RecDict2004, LevelOrder2004,
                          ConceptOrder2004, ElementOrder2004,
                          YearsInvestigated, MyfolderID
                         )
md.Organization_ttConceptAnalysis(Sites, 'BestPractices2011', RecDict2011, LevelOrder2011,
                          ConceptOrder2011, ElementOrder2011,
                          YearsInvestigated, MyfolderID
                         )

#write full quality image to Google Drive and get a link to insert next to the picture in the google sheet
BestPractices2004GraphLink = md.WriteToGoogle(
    os.path.join('..','data','LTER', 'LTER_BestPractices2004_.png'), folderID=MyfolderID, Convert=None, Link=True)
BestPractices2011GraphLink = md.WriteToGoogle(
    os.path.join('..','data','LTER','LTER_BestPractices2011_.png'), folderID=MyfolderID, Convert=None, Link=True)

#create Excel report on all analyses, write additional functions on data to provide some collection analytics
md.CombinationSpreadsheet(XpathOccurrence, BestPractices2004Occurrence,
                          BestPractices2004Concept, BestPractices2004Graph,
                          BestPractices2004GraphLink, DataDestination,
                          recommendationOccurrence2=BestPractices2011Occurrence,
                          RecommendationConcept2=BestPractices2011Concept,
                          RecommendationGraph2=BestPractices2011Graph,
                          RecGraphLink2=BestPractices2011GraphLink,
                         )
# Create a linked Google sheet to share
md.WriteToGoogle(DataDestination, folderID=MyfolderID, Convert=True)

### [Prepare the data and code for release so they can be cited in publication](CleanRepository_UploadToZenodo.ipynb)