### Notebook Goals
* Create a CSV with recommendation concept content for the collection
* Translate concept names into the schema.org vocabulary
* Create valid JSON-LD for a record
* Use Google's Structured Data Testing Tool to test the results

In [1]:
# pandas is used to refine the dataframe and create the record JSON
import pandas as pd
# MDeval creates the dataframe structure that contains the records' content
import MDeval as md

#### Describe the metadata
* What organization created the records? (Organization)
* What collection are the records from? (Collection)
* What dialect are the records written in? (Dialect)

In [2]:
# variables for function arguments, fill these out
Organization = 'ESIP'
Collection = 'MILES'
Dialect = 'ISO'

#### Read in the metadata's recommendation evaluated csv

In [3]:
# Read in the recommendation evaluated csv defined by the above variables
RecommendationEvaluatedDF = pd.read_csv(
    './data/'+Organization+'/'+Collection+'_'+Dialect+'_RecommendationEvaluated.csv'
)
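The path above is assembled by string concatenation from the three variables. As a minimal sketch of the same naming convention, the path can also be built with `pathlib`. The helper name `evaluated_csv_path` and its `kind` and `data_dir` parameters are hypothetical and are not part of MDeval.

```python
from pathlib import Path


def evaluated_csv_path(organization, collection, dialect,
                       kind="RecommendationEvaluated", data_dir="./data"):
    """Build the path to an evaluated CSV (hypothetical helper, not part of MDeval)."""
    # File names follow the pattern <Collection>_<Dialect>_<kind>.csv
    filename = f"{collection}_{dialect}_{kind}.csv"
    return Path(data_dir) / organization / filename


csv_path = evaluated_csv_path("ESIP", "MILES", "ISO")
print(csv_path.as_posix())
```

The resulting `Path` can be passed directly to `pd.read_csv`, and `csv_path.exists()` gives a quick check before reading.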

#### Record Concept Content Function
* Rows are records
* Columns are concepts

In [19]:
# Read in the recommendation evaluated csv created in the last Notebook
RecommendationEvaluatedDF = pd.read_csv(
    './data/'+Organization+'/'+Collection+'_'+Dialect+'_RecommendationEvaluated.csv'
)
# recordConceptContent requires a dataframe with concepts. It creates a
# vertical view of concept content for each record in the collection,
# which is useful when creating JSON.
recordDF = md.recordConceptContent(RecommendationEvaluatedDF)
# organize the table (readability)
RecRecordDF = recordDF[['Collection', 'Record', 'Resource Title', 'Abstract',
                        'Online Resource', 'Keyword',
                        'Author / Originator', 'Distribution Format',
                        'Resource Type', 'Resource Version',
                        'Temporal Extent', 'Spatial Extent',
                        'Resource Citation']]
# display the dataframe
RecRecordDF

Unnamed: 0,Collection,Record,Resource Title,Abstract,Online Resource,Keyword,Author / Originator,Distribution Format,Resource Type,Resource Version,Temporal Extent,Spatial Extent,Resource Citation
1,MILES,10.1016.j.ecoinf.2017.09.005.xml,The influence of community recommendations on ...,0.0,-1.0,0.0,0.0,0.0,Text,,-1.0,0.0,-1.0
2,MILES,10.1016.j.ecoinf.2017.09.006.xml,Application of open source tools for biodivers...,0.0,-1.0,0.0,0.0,0.0,Text,,-1.0,0.0,-1.0


#### Choose a record to translate

In [20]:
# Set RecordChoice variable
RecordChoice = '10.1016.j.ecoinf.2017.09.005.xml'

In [21]:
# Select record row
RecRecordDF = RecRecordDF[RecRecordDF['Record'] == RecordChoice]
# Drop the Collection and Record columns
RecRecordDF = RecRecordDF.drop(columns=['Collection', 'Record'])
# Display the chosen record's content
RecRecordDF

Unnamed: 0,Resource Title,Abstract,Online Resource,Keyword,Author / Originator,Distribution Format,Resource Type,Resource Version,Temporal Extent,Spatial Extent,Resource Citation
1,The influence of community recommendations on ...,0.0,-1.0,0.0,0.0,0.0,Text,,-1.0,0.0,-1.0
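The selection step silently yields an empty dataframe if `RecordChoice` is misspelled. The sketch below shows one way to guard against that, using a toy dataframe rather than the real collection. The function name `select_record` and the toy values are illustrative assumptions.

```python
import pandas as pd

# Toy stand-in for RecRecordDF; the values are illustrative, not real output.
toy_df = pd.DataFrame({
    "Collection": ["MILES", "MILES"],
    "Record": ["a.xml", "b.xml"],
    "Resource Title": ["First title", "Second title"],
})


def select_record(df, record_name):
    """Return the single row for record_name without the bookkeeping columns."""
    match = df[df["Record"] == record_name]
    # Fail loudly if the record is missing or duplicated
    if len(match) != 1:
        raise ValueError(
            f"expected exactly one row for {record_name!r}, found {len(match)}"
        )
    return match.drop(columns=["Collection", "Record"])


chosen = select_record(toy_df, "a.xml")
print(chosen)
```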


#### Translate concepts to schema.org vocabulary

In [22]:
# Map each concept name to its schema.org property
RecRecordDF = RecRecordDF.rename({
    'Resource Title': 'name',
    'Abstract': 'description',
    'Online Resource': 'url',
    'Keyword': 'keywords',
    'Author / Originator': 'creator',
    'Distribution Format': 'distribution',
    'Resource Type': '@type',
    'Resource Version': 'version',
    'Temporal Extent': 'temporalCoverage',
    'Spatial Extent': 'spatialCoverage',
    'Resource Citation': 'citation',
}, axis='columns')
RecRecordDF

Unnamed: 0,name,description,url,keywords,creator,distribution,@type,version,temporalCoverage,spatialCoverage,citation
1,The influence of community recommendations on ...,0.0,-1.0,0.0,0.0,0.0,Text,,-1.0,0.0,-1.0
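By default, `DataFrame.rename` ignores mapping keys that are not present in the dataframe, so a misspelled concept name passes unnoticed. As a sketch, the same mapping can be kept in a dictionary and applied to a plain record dict while reporting concepts that have no translation. The names `CONCEPT_TO_SCHEMA` and `translate_concepts` are hypothetical. The mapping itself is copied from the rename call above.

```python
# Concept-to-schema.org mapping copied from the rename call in this notebook.
CONCEPT_TO_SCHEMA = {
    "Resource Title": "name",
    "Abstract": "description",
    "Online Resource": "url",
    "Keyword": "keywords",
    "Author / Originator": "creator",
    "Distribution Format": "distribution",
    "Resource Type": "@type",
    "Resource Version": "version",
    "Temporal Extent": "temporalCoverage",
    "Spatial Extent": "spatialCoverage",
    "Resource Citation": "citation",
}


def translate_concepts(record):
    """Rename known concept keys and collect any that have no mapping."""
    translated, unmapped = {}, []
    for concept, value in record.items():
        if concept in CONCEPT_TO_SCHEMA:
            translated[CONCEPT_TO_SCHEMA[concept]] = value
        else:
            unmapped.append(concept)
    return translated, unmapped


translated, unmapped = translate_concepts(
    {"Resource Title": "Example title", "Resource Type": "Text", "Bogus": 1}
)
print(translated, unmapped)
```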


#### Add the required context

In [23]:
RecRecordDF.insert(2, '@context', 'http://schema.org/')
RecRecordDF

Unnamed: 0,name,description,@context,url,keywords,creator,distribution,@type,version,temporalCoverage,spatialCoverage,citation
1,The influence of community recommendations on ...,0.0,http://schema.org/,-1.0,0.0,0.0,0.0,Text,,-1.0,0.0,-1.0


#### Create JSON-LD String

In [24]:
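# to_json(orient='records') returns a JSON array string with one object per row
recordDict = RecRecordDF.to_json(orient='records')
# strip the enclosing [ ] to leave a single JSON object, then wrap it in a script tag
RecordJSONld = '<script type="application/ld+json">' + recordDict[1:-1] + '</script>'
RecordJSONld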

'<script type="application/ld+json">{"name":"The influence of community recommendations on metadata completeness","description":0.0,"@context":"http:\\/\\/schema.org\\/","url":-1.0,"keywords":0.0,"creator":0.0,"distribution":0.0,"@type":"Text","version":"nan","temporalCoverage":-1.0,"spatialCoverage":0.0,"citation":-1.0}</script>'
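The string above contains escaped slashes (`\/`, which is legal JSON) and a `"nan"` value for `version`. As an alternative sketch, the script tag can be built from a plain dict with the standard `json` module, which leaves slashes unescaped, while skipping missing values. Treating `None`, float NaN, and the string `"nan"` as missing is an assumption made here. The helper names are hypothetical.

```python
import json
import math


def is_missing(value):
    """Assumption: None, float NaN, and the string 'nan' all mean 'no content'."""
    if value is None or value == "nan":
        return True
    return isinstance(value, float) and math.isnan(value)


def to_jsonld_script(record):
    """Wrap a record dict in a JSON-LD script tag, skipping missing values."""
    cleaned = {key: value for key, value in record.items() if not is_missing(value)}
    body = json.dumps(cleaned, ensure_ascii=False)
    return '<script type="application/ld+json">' + body + '</script>'


example = {
    "@context": "http://schema.org/",
    "@type": "Text",
    "name": "The influence of community recommendations on metadata completeness",
    "version": float("nan"),
}
snippet = to_jsonld_script(example)
print(snippet)
```

A dict for one record can be obtained with `RecRecordDF.to_dict(orient='records')[0]`.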

#### Test JSON-LD for validity
* Copy the string produced by the cell above, without the surrounding quotes (`print(RecordJSONld)` avoids the doubled backslashes shown in the cell output)
* Go to [Google's Structured Data Testing Tool](https://search.google.com/structured-data/testing-tool#new-test)
* Select the "Code Snippet" tab
* Paste the string and click "Run Test"
* Click on errors to highlight the portion of the string that needs improvement
* Rerun the test with the play button at the bottom middle of the screen
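
Before pasting into the testing tool, a quick local pre-check can catch basic problems. The sketch below only confirms that the snippet parses as JSON, has `@context` and `@type`, and flags numeric values. It is not a substitute for the tool. Flagging numbers rests on the assumption that values such as `0.0` and `-1.0` are evaluation placeholders rather than real content. The function name `precheck_jsonld` is hypothetical.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = '</script>'


def precheck_jsonld(snippet):
    """Return a list of problems found in a JSON-LD script snippet (empty if none)."""
    if not (snippet.startswith(SCRIPT_OPEN) and snippet.endswith(SCRIPT_CLOSE)):
        return ["snippet is not wrapped in a JSON-LD script tag"]
    body = snippet[len(SCRIPT_OPEN):-len(SCRIPT_CLOSE)]
    try:
        data = json.loads(body)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing {key}" for key in ("@context", "@type") if key not in data]
    for key, value in data.items():
        # Assumption: a bare number is an evaluation placeholder, not content
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            problems.append(f"{key} holds a number ({value}), possibly a placeholder")
    return problems


good = SCRIPT_OPEN + '{"@context":"http:\\/\\/schema.org\\/","@type":"Text","name":"x"}' + SCRIPT_CLOSE
bad = SCRIPT_OPEN + '{"@type":"Text","description":0.0}' + SCRIPT_CLOSE
print(precheck_jsonld(good))
print(precheck_jsonld(bad))
```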

[Next Notebook: Create a dialect specific translation using the recordXpathContent MDeval function](./03.CreateXpathJSON-LD.ipynb)