# Ingest and Prepare a Metadata Collection for Evaluation
### Notebook Goals
* Use JupyterLab's GUI to upload a metadata record or a zip of many records and move the metadata to a directory
* Download a metadata collection from a repository or other URL
* Normalize namespace locations so concepts can be read accurately by the Metadata Evaluation Web Service

In [None]:
# create directories
import os
# compress metadata collection
import zipfile
# download records from a repository and prepare for evaluation
import MDeval as md

## Describe the metadata
* What organization created the records? (Organization)
* What collection are the records from? (Collection)
* What dialect are the records written in? (Dialect)

In [None]:
# variables for function arguments, fill these out
Organization = 'LTER'
Collection = 'MILES'
Dialect = 'EML'

# variable created from other variables, defining where to put the metadata
MetadataLocation = './metadata/' + Organization + '/' + Collection
# creates a directory
os.makedirs(MetadataLocation, exist_ok=True)

## Choose a method of metadata ingest

#### 1. Upload from your computer through the graphical user interface:
* Use the file explorer on the left of your screen to navigate to the MILES directory. You'll see a metadata directory. Navigate to the *MetadataLocation* you created.
* Just above the directory listing and below the Lab toolbar is an arrow pointing up over a horizontal line. Click it and use the file dialog to select your metadata.
* To upload many records at once, upload a zip file called *metadata.zip* to the MILES directory, then unzip it to the *MetadataLocation* after switching the cell type of the following cell from Raw to Code.
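
The unzip step in the last bullet can be sketched with Python's standard `zipfile` module. This is a minimal sketch, not the notebook's original cell: `unzip_collection` is a hypothetical helper, and the example call assumes *metadata.zip* sits in the current working directory and that `MetadataLocation` is defined as above.

```python
import zipfile
from pathlib import Path


def unzip_collection(zip_path, destination):
    """Extract every member of zip_path into destination; return the member names."""
    destination = Path(destination)
    # create the target directory (and any parents) if it does not exist yet
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(destination)
        return archive.namelist()


# Example call (assumes metadata.zip was uploaded to the working directory):
# unzip_collection('metadata.zip', MetadataLocation)
```

If the archive contains a top-level folder, the records land inside that folder under the destination, so you may need to move them up one level afterwards.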

#### 2. Use records from the shared/resources/samples/metadata directory

* Use the Terminal to copy a directory to your instance:
    * cp -r shared/resources/samples/metadata/BCO-DMO/GeoTraces metadata/BCO-DMO/GeoTraces

* Copy/paste and drag-and-drop are possible, but not recommended

* Because the example collections have been cleaned up, it should be possible to refer directly to the metadata directory in shared/resources/samples/metadata: change the start of the MetadataLocation string from './metadata' to 'shared/resources/samples/metadata' (untested).
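
If you would rather stay in the notebook than open the Terminal, the `cp -r` step can be approximated with the standard library. This is a sketch under assumptions: `copy_collection` is a hypothetical helper, and `dirs_exist_ok` requires Python 3.8 or later. Unlike `cp -r`, `shutil.copytree` creates missing parent directories and merges into an existing destination rather than nesting the source inside it.

```python
import shutil
from pathlib import Path


def copy_collection(source, destination):
    """Recursively copy source into destination; return the copied top-level names."""
    destination = Path(destination)
    # dirs_exist_ok=True lets the copy merge into a directory that already exists
    shutil.copytree(Path(source), destination, dirs_exist_ok=True)
    return sorted(p.name for p in destination.iterdir())


# Example call, using the paths from the Terminal command above:
# copy_collection('shared/resources/samples/metadata/BCO-DMO/GeoTraces',
#                 'metadata/BCO-DMO/GeoTraces')
```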


#### 3. Download records from a repository
* Create a list of record URLs and a list of names for the records
* Use the lists to supply pairs of arguments to the MDeval function *get_records*. The items are paired in order: the first of *urls* with the first of *xml_files*, and so on
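
Because the two lists are paired by position, a mismatch in length silently misnames or drops records. The sketch below shows the pairing and a length check you could run before downloading. `pair_records` is a hypothetical helper for illustration, not part of MDeval, and the two example file names assume the `MetadataLocation` built earlier.

```python
def pair_records(urls, xml_files):
    """Pair each URL with the file name at the same position in xml_files."""
    if len(urls) != len(xml_files):
        raise ValueError(
            f"{len(urls)} urls but {len(xml_files)} file names; the lists must match"
        )
    return list(zip(urls, xml_files))


example_urls = [
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.118.6",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.46.4",
]
example_files = [
    "./metadata/LTER/MILES/1.xml",
    "./metadata/LTER/MILES/2.xml",
]

# each tuple is (where the record comes from, where it will be saved)
for url, path in pair_records(example_urls, example_files):
    print(url, "->", path)
```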

In [None]:
# variables for function arguments

# locations of metadata
urls = [
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.118.6",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.46.4",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.110.5",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.62.5",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.101.8",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.98.7",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-pie.69.7",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-pie%2F429%2F1",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-pie%2F430%2F1",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-pal%2F98%2F2",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-pal%2F101%2F2",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-pal%2F97%2F5",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-pal%2F92%2F5",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-nwt.102.10",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-nwt.98.10",
    "https://cn.dataone.org/cn/v2/resolve/doi%3A10.6073%2FAA%2Fknb-lter-nwt.401.5",
    "https://cn.dataone.org/cn/v2/resolve/doi%3A10.6073%2FAA%2Fknb-lter-nwt.73.2",
    "https://cn.dataone.org/cn/v2/resolve/knb-lter-nwt.34.8",
    "https://cn.dataone.org/cn/v2/resolve/doi%3A10.6073%2FAA%2Fknb-lter-nwt.54.2",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-nwt%2F927%2F2",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-nwt%2F928%2F3",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-nwt%2F31%2F14",
    "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-nwt%2F411%2F8"
    ]

# names you want to give the records
# one numbered file name per URL: 1.xml, 2.xml, ...
xml_files = [MetadataLocation + '/' + str(n) + '.xml'
             for n in range(1, len(urls) + 1)]
# MDeval function to retrieve records
md.get_records(urls, xml_files, well_formed=False)

## Ensure namespace conformance
A note on namespaces: the transform identifies a record's dialect from its default or explicit schema location. If, for example, records reference DataCite kernel 4 instead of kernel 3, the conceptual content of the records will not be recognized, so the records must be altered to point to kernel 3. The [Namespace prefix locations](Namespace_prefix_Location.md) document lists the locations that the web service's evaluation uses to identify each dialect. Use the same location in your records so they are recognized when you run the MDeval.XMLeval function in the next notebook.
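
To make the idea concrete, here is a minimal sketch of the kind of find-and-replace involved. It is not MDeval's implementation: `replace_in_xml_files` is a hypothetical helper, and it assumes the records are UTF-8 encoded `.xml` files sitting directly in one directory.

```python
from pathlib import Path


def replace_in_xml_files(directory, old, new):
    """Replace old with new in every .xml file in directory; return changed file names."""
    changed = []
    for path in sorted(Path(directory).glob('*.xml')):
        text = path.read_text(encoding='utf-8')
        if old in text:
            # plain string substitution; the XML is not parsed or validated
            path.write_text(text.replace(old, new), encoding='utf-8')
            changed.append(path.name)
    return changed
```

Returning the list of changed files makes it easy to confirm how many records actually referenced the old location.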

In [None]:
# variables for function arguments
oldNamespaceLocation = 'xmlns:eml="eml://ecoinformatics.org/eml-2.0.1"'
newNamespaceLocation = 'xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"'
# MDeval function to find and replace the old with the new.
md.normalizeNamespace(MetadataLocation, newNamespaceLocation, oldNamespaceLocation)

#### Now you are ready to evaluate, analyze, and create reports on your own metadata!
[Next Notebook: Create Recommendation Report for a Metadata Collection](./01.CreateRecReport.ipynb)