<a href="https://colab.research.google.com/github/skybristol/experiments/blob/dev/USGS_GeoChem_lab_public_data_distribution_interface.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project Prospectus
Exploring methods and techniques for interfacing with geochemistry Laboratory Information Management Systems to streamline published data distribution processing

# Background and Goal
The USGS Energy and Minerals Mission area supports and operates both the [Petroleum Geochemistry Research Laboratory](https://www.usgs.gov/centers/cersc/labs/petroleum-geochemistry-research-laboratory) and the Analytical Geochemistry Laboratory. Both of these facilities have different internal management systems and processes in place for conducting operations, managing samples and analytical results, and producing the data that are intended for publishing. While those internal systems also need some attention and major modernization in at least one case, there is an opportunity to combine forces and advance some new thinking in the public data distribution part of the overall system. This project is intended to help introduce some new thinking into how the distribution end of the process might work, putting some cloud-based components in place to provide new data capabilities and demonstrate some architectural patterns useful in this case and others throughout the Mission Area.

# Approach
The project will pursue several components as part of the overall cyberinfrastructure between the two internal LIMS and a public-facing interface. We will focus entirely on data that are already or can be made publicly available. The project will produce what will essentially be middleware that will not seek to replace but rather augment and streamline the processes behind any existing public distribution points, web applications, and other tools (e.g., the [National Geochemical Database](https://www.usgs.gov/energy-and-minerals/mineral-resources-program/science/national-geochemical-database)).

## LIMS Interfaces
Many information systems across different domains share characteristics with these two labs: each has an internal management system that handles the organization's regular operations. These can be highly specialized toolsets with a mix of off-the-shelf and custom components whose architectural decisions are driven by the core competencies within the operational context. Those systems then need to interface outward in some way, often using a "push out" construct that maintains the security boundary of the internal system. That interface can be built directly into the system or sit adjacent to it, with security constraints ensuring that only appropriate information is pushed out to whatever public interfaces are needed.

In the project we will treat this component of the system as a significant part of what we examine and develop. Indications are that the Analytical Geochemistry Laboratory is operating on a legacy system where the interface is currently cumbersome and requires significant human intervention. The overall goal will be to put an automated process in place that packages the data appropriate for distribution and sends those packages to some other place where they can be picked up and processed further for use. Data interface protocols are reportedly ODBC or JDBC to a relational database, which generally means that the interface needs to "sit" within a fairly narrow part of the security boundary that has the appropriate access. We may need a mix of operations that occur within the database itself and operations run by code connecting to the database. We will likely need to examine some form of service account/credentials that can be assigned to the automated tool that handles the work.
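As a rough illustration of the "package and push out" step, the sketch below selects only rows flagged for release and writes them to CSV text that could then be dropped somewhere outside the security boundary. Everything here is an assumption for illustration: an in-memory SQLite database stands in for the real ODBC/JDBC connection, and the `samples` table, the `public_release` column, and the sample rows are made up.

```python
import csv
import io
import sqlite3


def export_public_rows(conn, table, release_flag_column="public_release"):
    """Return CSV text for rows flagged as releasable in a (hypothetical) LIMS table."""
    # Table and column names cannot be bound as SQL parameters, so restrict
    # them to plain identifiers before building the query string.
    for name in (table, release_flag_column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    cursor = conn.execute(f"SELECT * FROM {table} WHERE {release_flag_column} = 1")
    columns = [description[0] for description in cursor.description]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(cursor.fetchall())
    return buffer.getvalue()


# Stand-in database with made-up rows; a real run would connect to the LIMS
# using a service account with read-only access.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (lab_id TEXT, al_pct REAL, public_release INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [("C-000001", 7.2, 1), ("C-000002", 3.4, 0)],
)
package = export_public_rows(conn, "samples")
print(package)
```

Keeping the release filter inside the query means unreleased rows never leave the database connection, which fits the narrow security boundary described above.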

We will also look carefully at the provenance trace that, from the standpoint of the public record, begins at the point of the interfaces to the LIMS. There is an internal provenance trace that is part of the internal operations of the given lab. For our purposes in the project, we will assume that all of the information associated with internal processing of samples and analyses is already curated and captured in some way within the data that will be released. Our efforts will focus on ensuring that the provenance trace picks up from the point of that interface and is carried through to the final distributed results. As we work to automate these processes with code, that provides 

Once the distribution data can be mobilized in whatever form from the internal LIMS and placed somewhere onto a network, many different things can take shape from that point. Work has already been done from the Petroleum Geochemistry Lab to pull data into the cloud-based relational database management system (Amazon RDS) that is our "out of the box" option under the USGS Cloud Hosting Solutions (CHS). This may work out well, but we may also look at options involving optimized indexing capabilities or other forms of the data depending on how we define the interface needs.

Defining the interface will come down to what purposes we want to put these data to. We already have existing outlets for some or all of the data we'll be working with in this project. Can this process help make those products better or ease any pain points currently experienced in getting data into those products? Is there something new that we've been wanting to do with these data where a more nimble process like the one we are pursuing here might be able to help?

Technically, we will take an "API-first" approach to this, following the principle of admitting at the outset that we can't anticipate all good uses of what we put out. Building an Application Programming Interface as our means of interacting with these data will help make that possible. Within that, we'll also look toward an API approach that is fairly wide open, like a GraphQL interface, such that nearly any possible query can be run against the entire data structure without needing individual development to add new features. We may then optimize certain API routes for different purposes, but we'll have a foundational interface that can serve any purpose.
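To make the "wide open" idea a little more concrete, here is a minimal sketch in which the caller, not the developer, decides which fields come back and which filters apply. It is not GraphQL and not a proposed implementation; the `query` function and the sample records are hypothetical, and only the field names (`Lab_ID`, `Sample_Type`, `State`) come from the reference data dictionary.

```python
def query(records, fields=None, where=None):
    """Filter records by exact-match conditions and project the requested fields."""
    where = where or {}
    matched = [
        record for record in records
        if all(record.get(key) == value for key, value in where.items())
    ]
    if fields is None:
        return matched
    # Project only the fields the caller asked for; missing fields come back as None.
    return [{field: record.get(field) for field in fields} for record in matched]


# Made-up records for illustration only
records = [
    {"Lab_ID": "C-000001", "Sample_Type": "rock", "State": "AK"},
    {"Lab_ID": "C-000002", "Sample_Type": "soil", "State": "AK"},
    {"Lab_ID": "C-000003", "Sample_Type": "rock", "State": "NV"},
]

rocks_in_alaska = query(records, fields=["Lab_ID"], where={"Sample_Type": "rock", "State": "AK"})
print(rocks_in_alaska)
```

A new question about the data then becomes a new set of arguments rather than a new API route, which is the property we would want from the foundational interface.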

# Project Team
We are deliberately working to develop a small project team with the majority of the resources contributed through a dedicated pilot effort funded by the Energy and Minerals Mission Area (EMMA). The project will be conducted fully in the open with regular opportunities for input welcome from anyone with an interest.

* Architect: Sky Bristol (on detail with EMMA designing a future USGS Geoscience Data System)
* Developer: Jay Shah (contractor from the USGS Cloud Hosting Solutions group, dedicated to working on EMMA development tasks)
* Subject Matter Experts: Analytical Geochemistry Lab - Steve Smith, Maggie Goldman; Petroleum Geochem Lab - Greg Gunther, Rob Miller, Steve Groves; MRData - Peter Schweitzer; NGGDPP - Mikki Johnson, Lindsay Powers
* Student Interns: Interns from the Geo-Launchpad Program at UNAVCO will be working through a summer program and will assist with the effort through code development and other data work where activities can be aligned with learning objectives

# Timeline and Project Process
Some groundwork has been laid for the project with a number of cloud components already in place for the Petroleum Geochem Lab. This may allow us to jumpstart parts of the solution. We anticipate that the minimum viable products can be fully completed in 45 days or less from project start. All specific tasks, issues, and work of the project will be conducted within an open and transparent code project so that any stakeholder can check in on work in progress and weigh in on issues in need of subject matter expertise. The small project team will check in on video at least once weekly to work through any outstanding issues.

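In [None]:
# xmltodict may not be preinstalled in every notebook environment (e.g., Colab);
# if this import fails with ModuleNotFoundError, run `pip install xmltodict` first.
import pandas as pd
import requests
import xmltodict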

# Property Registry
Similar to another project we are pursuing with the USGS National Minerals Information Center and a new distribution method for their DS-140 commodity statistics data, we need to evolve a registry of the properties found in the data that we can use as a systematic reference. The legacy or historic NGDB already has this concept well established through the MRData platform. Individual sample records that share data all have pointers to field definitions from associated data tables, descriptions of analytical methods used, and many other details. Those are all called up dynamically through a web application and we may be able to build on the overall design pattern.

Given a reported break in time when lab methods and practices as well as data packaging methodology have been changing, we are starting here with a recent [data release](https://doi.org/10.5066/P9WHRLXH) of samples specifically associated with the EarthMRI program. That data package provides detailed entity and attribute information in its metadata as well as a separate data dictionary table that makes for a fairly simple source to build upon dynamically. The following code consults this reference point as a "design pattern" and pulls the data dictionary in dynamically so we can start thinking about where this information should go.

The basic design pattern set in MRData is incredibly sound - documentation details stored in a reasonable data model, effective persistent and resolvable identifiers for each reference "element," and a simple web interface for human readability and navigation. The next step in the evolution is to incorporate some additional explicit semantics in the model such that automated processes can be built to operate with the property registry as a reasoning foundation. This will come along with some additional interfaces tuned for those reasoners to operate against but will retain all the same features from what MRData started. Under the hood, we may also diverge slightly from a wholly relational database store toward something a bit more distributed and heterogeneous where different parts of the overall model are stored in optimal management environments for how they need to operate, all hooked together through APIs as an abstraction.
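One way to picture a registry entry is as a small record that carries the documentation details plus an identifier that could be made persistent and resolvable. The sketch below is only a thinking aid: `PropertyEntry`, `entry_from_row`, and the `https://example.org/property/` base are hypothetical, while the column names (`Tblname`, `AttributeLabel`, `AttributeDescription`, `AttributeUnit`) are those of the EarthMRI data dictionary.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical base URL; a real registry would need a persistent namespace.
REGISTRY_BASE = "https://example.org/property/"


@dataclass(frozen=True)
class PropertyEntry:
    table: str
    label: str
    description: str
    unit: Optional[str] = None

    @property
    def identifier(self) -> str:
        """Build an identifier from the table name and attribute label."""
        return f"{REGISTRY_BASE}{self.table}/{self.label}"


def entry_from_row(row: dict) -> PropertyEntry:
    """Convert one data dictionary row (as a plain dict) into a registry entry."""
    unit = row.get("AttributeUnit")
    # pandas represents blank cells as NaN, and NaN is the only value not equal to itself.
    if unit is None or unit != unit or unit == "":
        unit = None
    return PropertyEntry(row["Tblname"], row["AttributeLabel"], row["AttributeDescription"], unit)


# Row values copied from the data dictionary (description truncated as displayed there)
entry = entry_from_row({
    "Tblname": "QAQC_Values",
    "AttributeLabel": "F_pct_ISE",
    "AttributeDescription": "Fluoride anion (F-), in weight percent, determ...",
    "AttributeUnit": "weight percent (% or pct)",
})
print(entry.identifier)
```

With a DataFrame like the data dictionary loaded in this notebook, rows could be fed in through `df.to_dict("records")`.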

In [None]:
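# ScienceBase item ID for the EarthMRI data release used as the design pattern
pattern_item = "601963c6d34edf5c66f0d0e5"

# Request only the file listing for the item
r_pattern_item = requests.get(f"https://www.sciencebase.gov/catalog/item/{pattern_item}?format=json&fields=files").json()

# Pick out the data dictionary CSV and the file flagged as the original metadata record
f_pattern_data_dict = next((f for f in r_pattern_item["files"] if f["name"] == 'EMRI_DataDictionary.csv'), None)
f_metadata = next((f for f in r_pattern_item["files"] if f.get("originalMetadata")), None)

# Load the data dictionary straight from its download URL
df_pattern_data_dict = pd.read_csv(f_pattern_data_dict["url"])
df_pattern_data_dict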

Unnamed: 0,Tblname,SortOrder,AttributeLabel,AttributeDescription,AttributeUnit
0,EMRI_Data,1,Lab_ID,Unique identifier assigned to each submitted s...,
1,EMRI_Data,2,Field_ID,The identifier originally assigned to a sample...,
2,EMRI_Data,3,Prev_Lab_ID,The unique identifier previously assigned to a...,
3,EMRI_Data,4,IGSN,Unique International Geo Sample Number (IGSN; ...,
4,EMRI_Data,5,Parent_IGSN,The IGSN identifier assigned to the parent sam...,
...,...,...,...,...,...
319,QAQC_Values,531,SrO_pct_WDX,"Strontium, reported as strontium oxide or stro...",weight percent (% or pct)
320,QAQC_Values,532,TiO2_pct_WDX,"Titanium, reported as titanium dioxide or tita...",weight percent (% or pct)
321,QAQC_Values,533,V2O5_pct_WDX,"Vanadium, reported as vanadium pentoxide or va...",weight percent (% or pct)
322,QAQC_Values,534,F_pct_ISE,"Fluoride anion (F-), in weight percent, determ...",weight percent (% or pct)


## Additional Depth to the Property Registry
The metadata for the EarthMRI reference dataset provides a whole lot of richness that we will need to incorporate into the property registry. We'll need to develop from what serves as documentation for a given dataset pulled from the Analytical Geochem Lab LIMS to an externalized registry that "controls" all data flowing from that system to the notional distribution point we are exploring. For instance, enumerated values in this case may not include every value the entire data system may eventually contain.

The following codeblocks explore this a little bit further by pulling metadata XML from the same reference data release item, parsing it to a dictionary for ease of use (at least for me...I hate working with XML), and then working through some initial notes.

In [None]:
r_metadata = requests.get(f_metadata["url"])
d_metadata = xmltodict.parse(r_metadata.text, dict_constructor=dict)
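One wrinkle with dictionaries built from XML this way: an element that appears once comes back as a single dict, while a repeated element comes back as a list. Code that indexes `["detailed"][0]` therefore depends on the record having more than one `detailed` section. A small normalizing helper, sketched here with hypothetical names (`as_list`, `iter_attributes`) and a made-up miniature record, avoids that assumption.

```python
def as_list(value):
    """Wrap a single parsed element in a list; pass lists through; map None to []."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def iter_attributes(metadata):
    """Yield every attribute from every detailed entity, however many there are."""
    eainfo = metadata.get("metadata", {}).get("eainfo", {})
    for entity in as_list(eainfo.get("detailed")):
        for attribute in as_list(entity.get("attr")):
            yield attribute


# Made-up miniature record with a single 'detailed' entity (a dict, not a list)
sample = {
    "metadata": {
        "eainfo": {
            "detailed": {
                "attr": [
                    {"attrlabl": "Lab_ID", "attrdomv": {"udom": "text"}},
                    {"attrlabl": "State", "attrdomv": {"codesetd": {"codesetn": "USPS 2-letter state code"}}},
                ]
            }
        }
    }
}
labels = [attribute["attrlabl"] for attribute in iter_attributes(sample)]
print(labels)
```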

### Enumerated Values to Vocabularies
Similar to what MRData has already done (and building directly on that if possible), we need to externalize the sets of terms and definitions used for some of the properties into their own system.

Wherever possible, it would be great to move this outside of even our own domain to leverage external vocabularies or more robust semantic sources maintained by recognized authorities or community groups. This gets at the core principle of not rolling our own if we don't have to. If someone has already put the time and energy into figuring out a reasonable set of definitions for some concept we include in our data, let's see if we can use it. We may not be able to for a variety of reasons, but we should at least examine the possibility. It would also be a perfectly acceptable approach to build our own dynamic vocabularies made up of some terms from a third party with other terms of our own design; we just make everything explicit in either case. For instance, are there IUGS-CGI vocabularies that we could be leveraging in part or in full for any of these concepts?
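As a sketch of what "externalizing" could look like, the function below flattens the enumerated domain entries from the metadata into standalone vocabulary terms, each with a slot for a link to an external authority. The `vocabulary_terms` name and the `external_uri` field are hypothetical; the `edom`, `edomv`, `edomvd`, and `edomvds` keys and the example values are taken from the reference metadata.

```python
def vocabulary_terms(attrlabl, attrdomv):
    """Turn an attribute's enumerated domain values into standalone vocabulary terms."""
    items = attrdomv if isinstance(attrdomv, list) else [attrdomv]
    terms = []
    for item in items:
        edom = item.get("edom")
        if edom is None:
            continue  # skip unrepresentable, range, and codeset domains
        terms.append({
            "property": attrlabl,
            "term": edom["edomv"],
            "definition": edom["edomvd"],
            "source": edom["edomvds"],
            "external_uri": None,  # hypothetical slot for a third-party term, if one is adopted
        })
    return terms


soil_drainage = [
    {"edom": {"edomv": "well drained",
              "edomvd": "The soil sample site is well drained",
              "edomvds": "U.S. Geological Survey"}},
    {"edom": {"edomv": "poorly drained",
              "edomvd": "The soil sample site is poorly drained",
              "edomvds": "U.S. Geological Survey"}},
]
terms = vocabulary_terms("Soil_Drainage", soil_drainage)
print([term["term"] for term in terms])
```

Keeping `source` and `external_uri` side by side is one way to make a mixed vocabulary explicit about which terms are our own and which are borrowed.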

One of the really vital concepts in the enumerated values is the set of analytical methods. The historic NGDB records point to somewhat more robust documentation of the methods that incorporates a link to a relevant publication, which would seem to be something we'd want to bring forward here. Other communities have started leveraging a platform like protocols.io for this type of reference purpose, and it might be worth looking into whether we could take advantage of something like that as a rich, external source for curating the references for protocols/methods in a way that provides persistent, resolvable identifiers, API access, and other useful capabilities.

For anything in the property registry that represents an identifier of some kind, we will want to incorporate details about whether or not that identifier is something persistent and resolvable and how to resolve it if applicable. Anything that can't be parameterized this way will be treated more as an unresolvable, unverifiable reference to some third party offline system.
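A minimal sketch of how that parameterization might be recorded follows. The `IdentifierScheme` class is hypothetical, the resolver URL is a placeholder rather than a real service, the sample identifier value is made up, and whether a given property is actually persistent is an open question that the flags here only illustrate.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass(frozen=True)
class IdentifierScheme:
    property_label: str
    persistent: bool
    resolver_template: Optional[str] = None  # must contain "{value}" when set

    def resolve(self, value) -> Optional[str]:
        """Return a resolvable URL for the value, or None if the scheme cannot resolve it."""
        if not self.resolver_template or not value:
            return None
        return self.resolver_template.format(value=quote(str(value), safe=""))


# Placeholder resolver; flags are illustrative assumptions, not findings
igsn = IdentifierScheme("IGSN", persistent=True,
                        resolver_template="https://resolver.example.org/igsn/{value}")
field_id = IdentifierScheme("Field_ID", persistent=False)

print(igsn.resolve("ABC123"))
print(field_id.resolve("ABC123"))
```

A property whose scheme returns `None` falls into the unresolvable, unverifiable category described above.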

Date properties should all be relatively straightforward to validate and transform, but things will get a bit complicated as we look at all the variations on how date, date/time, duration, range, and other temporal dynamics can be represented in data. We fundamentally have to interpret what can be messy string parameters in metadata into something that software code can operate against.
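The reference metadata documents at least three date layouts: `dd-mmm-yy`, `mm/dd/yyyy`, and a bare `yyyy`. A minimal sketch of interpreting them, assuming those are the only layouts and recording how precise each parsed value is, might look like this (the `parse_date` name and the precision labels are made up for illustration):

```python
from datetime import date, datetime

# (strptime format, precision label) pairs, tried in order
DATE_FORMATS = (
    ("%d-%b-%y", "day"),   # e.g. 16-Oct-20; %b depends on an English locale
    ("%m/%d/%Y", "day"),   # e.g. 12/9/2020
    ("%Y", "year"),        # e.g. 1967
)


def parse_date(text):
    """Return (date, precision) for a recognized layout, or (None, None)."""
    cleaned = text.strip()
    for fmt, precision in DATE_FORMATS:
        try:
            parsed = datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
        return parsed, precision
    return None, None


for raw in ("16-Oct-20", "6/20/1967", "1967", "unknown"):
    print(raw, parse_date(raw))
```

Two caveats are worth carrying into the registry: two-digit years are ambiguous (Python's `%y` maps 69 through 99 to the 1900s and 00 through 68 to the 2000s), and a year-only value is returned as January 1 of that year, which is why the precision label travels with it.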

One of the things I always wonder about when looking at data like this test case is whether political unit information like states and counties in a dataset with explicit geospatial footprint information were actually derived from something like point coordinates and included as text values for ease of use. That dynamic of one property leading to another property through some type of processing logic is something we may also need to account for. We may also want to make certain properties like political unit names more robust than we find here by incorporating validation against reference sources.
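Validation against a reference source could start as simply as the sketch below. The `validate_state` function is hypothetical and the reference set is a deliberately tiny, illustrative subset of USPS two-letter codes; a real check would load the full code set named in the metadata.

```python
# Illustrative subset only; not the full USPS code list
REFERENCE_STATE_CODES = {"AK", "AZ", "CO", "NV"}


def validate_state(value, reference=REFERENCE_STATE_CODES):
    """Normalize a state value and report whether it appears in the reference set."""
    code = (value or "").strip().upper()
    return {"input": value, "normalized": code or None, "valid": code in reference}


print(validate_state(" ak "))
print(validate_state("Alaska"))
```

The same report structure could later carry a note on whether the value was entered directly or derived from coordinates, once that logic is known.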

For numeric/float properties that indicate an upper and lower bounding constraint, we will need to determine the full range possible beyond what may be representative in this particular reference dataset.
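One way to start separating "observed in this dataset" from "possible in principle" is to compare each range domain against limits implied by its unit. In the sketch below the `check_range` function and the `UNIT_LIMITS` table are assumptions for illustration; the unit strings and the example values are those reported for `Ca_pct_ICP60` in the reference metadata.

```python
# Physical limits implied by the unit alone (an assumption for illustration)
UNIT_LIMITS = {
    "weight percent (% or pct)": (0.0, 100.0),
    "parts per million by weight (ppm)": (0.0, 1_000_000.0),
}


def check_range(rdom):
    """Compare an observed range domain with the limits implied by its unit."""
    observed_min = float(rdom["rdommin"])
    observed_max = float(rdom["rdommax"])
    limits = UNIT_LIMITS.get(rdom.get("attrunit"))
    result = {
        "observed": (observed_min, observed_max),
        "limits": limits,
        "below_lower_limit": False,
        "above_upper_limit": False,
    }
    if limits is not None:
        result["below_lower_limit"] = observed_min < limits[0]
        result["above_upper_limit"] = observed_max > limits[1]
    return result


ca = check_range({"attrunit": "weight percent (% or pct)", "rdommax": "48.9", "rdommin": "-0.01"})
print(ca)
```

Several minimums in the reference metadata are negative, which falls outside what the unit alone would allow. How those values should be interpreted is not stated here, so the sketch only flags them; the registry would need to record whatever convention applies.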

In [None]:
for attribute in d_metadata["metadata"]["eainfo"]["detailed"][0]["attr"]:
    if isinstance(attribute["attrdomv"], list):
        print(attribute["attrlabl"])
        display(attribute["attrdomv"])
        print("======")

for attribute in d_metadata["metadata"]["eainfo"]["detailed"][0]["attr"]:
    if isinstance(attribute["attrdomv"], dict):
        print(attribute["attrlabl"])
        display(attribute["attrdomv"])
        print("======")

Ref_Flag


[{'edom': {'edomv': 'Yes',
   'edomvd': 'The sample is identified as a Geologic Reference Material (GRM) sample used for Quality Control',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'No',
   'edomvd': 'The sample is not identified as a Geologic Reference Material (GRM) sample',
   'edomvds': 'U.S. Geological Survey'}}]

QAQC_Sample


[{'udom': 'Alphanumeric text field for QAQC sample names'},
 {'edom': {'edomv': 'Original',
   'edomvd': 'The original sample that was randomly chosen to be split into an analytical duplicate pair.',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'Duplicate',
   'edomvd': 'The duplicate or second split of a randomly chosen sample used as an analytical duplicate pair.',
   'edomvds': 'U.S. Geological Survey'}}]

Orig_Datum


[{'edom': {'edomv': 'NAD27',
   'edomvd': 'North American Datum of 1927 based on the Clarke Ellipsoid of 1866',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'NAD83',
   'edomvd': 'North American Datum of 1983 based on the Geodetic Reference System of 1980',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'WGS84',
   'edomvd': 'World Geodetic System of 1984',
   'edomvds': 'U.S. Geological Survey'}}]

Sample_Type


[{'edom': {'edomv': 'rock',
   'edomvd': "A sample of 'rock' collected from the source identified in the Sample_Source field",
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'sediment',
   'edomvd': "A sample of 'sediment' collected from the source identified in the Sample_Source field",
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'soil',
   'edomvd': "A sample of 'soil' collected from the source identified in the Sample_Source field",
   'edomvds': 'U.S. Geological Survey'}}]

Method_Collected


[{'edom': {'edomv': 'single/grab',
   'edomvd': 'Sample is collected without any attempt to composite material for representivity; may or may not be representative of material found at the site and is often used for collecting unusual, one-of-a-kind, or un-composited samples',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'composite',
   'edomvd': 'Sample is collected with the intent to homogenize material and to be representative of a larger body (rock unit, local soil, etc.) at the site; consist of multiple subsamples or "grabs" and may be collected from multiple locations at the site',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'channel',
   'edomvd': 'Sample is a composite sample collected over a continuous distance (e.g., chip samples of rock collected across the width of a bed, a sample collected by integrating media across the entire width of the stream)',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'other',
   'edomvd': "Samp

Sample_Source


[{'edom': {'edomv': 'artificial exposure',
   'edomvd': 'sample collected from an artificial exposure such as a road cut, excavation, trench, or tunnel (but not from a mine)',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'drill core',
   'edomvd': 'sample collected from drill core',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'dune/loess',
   'edomvd': 'sample collected from eolian dune or loess deposit',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'float/colluvium/talus',
   'edomvd': 'sample collected from float, colluvium, or talus implying that the sample may have moved from the original source',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'mine dump/waste rock',
   'edomvd': 'sample collected from unprocessed mine material from a mine dump or waste rock pile',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'mine/mill tailings',
   'edomvd': 'sample collected from mine material that has undergone 

Rock_Type


[{'edom': {'edomv': 'igneous',
   'edomvd': 'rock formed through the cooling and solidification of magma or lava, may be plutonic or volcanic',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'sedimentary',
   'edomvd': "rock formed by the deposition of material at the Earth's surface",
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'metamorphic',
   'edomvd': 'rock formed by igneous or sedimentary rocks being subjected to heat and pressure',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'tectonite',
   'edomvd': 'rock formed by shearing or tectonic forces',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'ore',
   'edomvd': 'a rock category that includes many types of mineralized rock or veins; may include sub-economic ores',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'miscellaneous',
   'edomvd': 'a rock that does not fit into normal rock classifications; examples include calcrete, concretion, gossan, limon

Igneous_Form


[{'edom': {'edomv': 'breccia/agglomerate',
   'edomvd': 'an igneous breccia or agglomerate',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'dike/sill/laccolith',
   'edomvd': 'a dike, sill, or laccolith',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'extrusive',
   'edomvd': 'an igneous extrusive body, not further defined',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'intrusive',
   'edomvd': 'an igneous intrusive body, not further defined',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'lava flow',
   'edomvd': 'a lava flow',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'plug/pipe',
   'edomvd': 'an igneous plug or pipe',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'pluton/stock',
   'edomvd': 'a pluton or stock',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'pyroclastic',
   'edomvd': 'a pyroclastic flow',
   'edomvds': 'U.S. Geological Survey'}}]

Depositional_Env


[{'edom': {'edomv': 'marine',
   'edomvd': 'originally deposited in a marine environment',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'continental',
   'edomvd': 'originally deposited in a continental environment',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'transitional',
   'edomvd': 'originally deposited in an environment transitional between marine and continental',
   'edomvds': 'U.S. Geological Survey'}}]

Metamorphism


[{'edom': {'edomv': 'regional',
   'edomvd': 'regional metamorphism',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'contact',
   'edomvd': 'contact metamorphism',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'hydrothermal',
   'edomvd': 'hydrothermal metamorphism',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'shear',
   'edomvd': 'metamorphism by shearing',
   'edomvds': 'U.S. Geological Survey'}}]

Meta_SourceRk


[{'edom': {'edomv': 'igneous',
   'edomvd': 'a meta-igneous rock',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'sedimentary',
   'edomvd': 'a meta-sedimentary rock',
   'edomvds': 'U.S. Geological Survey'}}]

Soil_Drainage


[{'edom': {'edomv': 'well drained',
   'edomvd': 'The soil sample site is well drained',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'poorly drained',
   'edomvd': 'The soil sample site is poorly drained',
   'edomvds': 'U.S. Geological Survey'}}]

Soil_Salinity


[{'edom': {'edomv': 'saline',
   'edomvd': 'The soil at the sample site is saline; contains or is impregnated with salt',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'non-saline',
   'edomvd': 'The soil at the sample site is not saline',
   'edomvds': 'U.S. Geological Survey'}}]

Ferritic_Char


[{'edom': {'edomv': 'ferritic',
   'edomvd': 'The soil at the sample site is ferritic; contains visible iron oxides',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'non-ferritic',
   'edomvd': 'The soil at the sample site is not ferritic',
   'edomvds': 'U.S. Geological Survey'}}]

Analytic_Mthds


[{'edom': {'edomv': 'C_AQUA_REGIA',
   'edomvd': 'The sample is digested with aqua regia and then analyzed by ICP-OES and ICP-MS for 51 elements (Al, Ca, Fe, K, Mg, Na, S, Ti, Ag, As, Au, B, Ba, Be, Bi, Cd, Ce, Co, Cr, Cs, Cu, Ga, Ge, Hf, Hg, In, La, Li, Mn, Mo, Nb, Ni, P, Pb, Rb, Re, Sb, Sc, Se, Sn, Sr, Ta, Te, Th, Tl, U, V, W, Y, Zn,  Zr). A full method description is included within these metadata in the Sample Analysis and Methods process step section.',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'C_FA_AU-PD-PT',
   'edomvd': 'Gold, Palladium, and Platinum are determined by lead fusion and ICP-OES and ICP-MS after collection by fire assay. A full method description is included within these metadata in the Sample Analysis and Methods process step section.',
   'edomvds': 'U.S. Geological Survey'}},
 {'edom': {'edomv': 'C_FA_ICPMS-AU',
   'edomvd': 'Gold is determined by lead fusion and ICP-OES and ICP-MS after collection by fire assay. A full method description is

Lab_ID


{'udom': 'Alphanumeric text field for unique sample identification labels'}

Field_ID


{'udom': 'Alphanumeric text field for field identification labels'}

Prev_Lab_ID


{'udom': 'Alphanumeric text field for unique sample identification labels'}

IGSN


{'udom': 'Alphanumeric text field for sample identification labels'}

Parent_IGSN


{'udom': 'Alphanumeric text field for sample identification labels'}

Job_ID


{'udom': 'Alphanumeric text field for unique job identification labels'}

EMRI_JobNo


{'udom': 'Alphanumeric text field for Earth MRI job identification labels'}

PID


{'udom': 'Alphanumeric text field for Earth MRI job identification labels'}

Proj_Name


{'udom': 'Comment field'}

Affiliation


{'udom': 'Comment field'}

Date_Submitted


{'rdom': {'attrunit': 'dd-mmm-yy',
  'rdommax': '16-Oct-20',
  'rdommin': '24-Feb-20'}}

Date_Approved


{'rdom': {'attrunit': 'dd-mmm-yy',
  'rdommax': '31-Dec-20',
  'rdommin': '14-Apr-20'}}

Lat_WGS84


{'rdom': {'attrunit': 'Decimal degrees',
  'rdommax': '65.9488',
  'rdommin': '32.01906'}}

Long_WGS84


{'rdom': {'attrunit': 'Decimal degrees',
  'rdommax': '-77.4901',
  'rdommin': '-150.4759'}}

Orig_Lat


{'rdom': {'attrunit': 'Decimal degrees',
  'rdommax': '65.9488',
  'rdommin': '32.01884'}}

Orig_Long


{'rdom': {'attrunit': 'Decimal degrees',
  'rdommax': '-77.4901',
  'rdommin': '-150.4759'}}

Country


{'udom': 'Alpha character field with full country name'}

State


{'codesetd': {'codesetn': 'USPS 2-letter state code',
  'codesets': 'https://about.usps.com/who-we-are/postal-history/state-abbreviations.htm'}}

Location_Desc


{'udom': 'Uncontrolled comment field'}

Sample_Desc


{'udom': 'Uncontrolled comment field'}

Date_Collected


{'rdom': {'attrunit': 'mm/dd/yyyy or yyyy',
  'rdommax': '12/9/2020',
  'rdommin': '6/20/1967'}}

Sample_Depth


{'udom': 'Depth or range of depths at which the sample was collected'}

Rock_Name


{'udom': 'Alpha character field containing a rock name'}

Geologic_Age


{'udom': 'Alpha character field containing a geologic age or age range'}

Stratigraphy


{'udom': 'Alphanumeric character field with a stratigraphic name'}

Facies_Grade


{'udom': 'Alpha character field used to describe metamorphic facies'}

Mineralization


{'udom': 'Alpha character field used to describe mineralization type'}

Alteration


{'udom': 'Alpha character field used to describe alteration type'}

Land_Cover


{'udom': 'Alpha character field used to describe sample environment'}

Soil_Parent


{'udom': 'Alpha character field used to describe soil parent material'}

Soil_Horizon


{'udom': 'Alpha character field used to describe the soil horizon that was sampled'}

Horizon_Char


{'udom': 'A characterization of how well the sampled soil horizon was defined'}

Organic_Conc


{'udom': 'A general observation on the amount of organic material found in the sampled soil'}

Soil_Moisture


{'udom': 'A general observation on the amount of moisture observed at the soil sample site'}

Sample_Prep


{'udom': 'Alpha character field used to describe the sample preparation methods'}

Al_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '19.3',
  'rdommin': '0.01'}}

Ca_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '48.9',
  'rdommin': '-0.01'}}

Fe_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '42.5',
  'rdommin': '0.02'}}

K_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '12.2',
  'rdommin': '-0.01'}}

Mg_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '13.7',
  'rdommin': '-0.01'}}

P_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '17.7',
  'rdommin': '-0.01'}}

S_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '30.4',
  'rdommin': '-0.1'}}

Si_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '45.1',
  'rdommin': '0.01'}}

Ti_pct_ICP60


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '2.71',
  'rdommin': '-0.01'}}

Ag_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '893',
  'rdommin': '-1'}}

As_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '89500',
  'rdommin': '-5'}}

B_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '607',
  'rdommin': '-10'}}

Ba_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '101000',
  'rdommin': '4.1'}}

Be_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '190',
  'rdommin': '-5'}}

Bi_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '118',
  'rdommin': '-0.1'}}

Cd_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '466',
  'rdommin': '-0.2'}}

Ce_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '30400',
  'rdommin': '1.1'}}

Co_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '297',
  'rdommin': '-0.5'}}

Cr_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '3910',
  'rdommin': '-10'}}

Cs_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '213',
  'rdommin': '-0.1'}}

Cu_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '256000',
  'rdommin': '-5'}}

Dy_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '183',
  'rdommin': '0.11'}}

Er_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '76.1',
  'rdommin': '-0.05'}}

Eu_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '212',
  'rdommin': '-0.05'}}

Ga_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '180',
  'rdommin': '-0.01'}}

Gd_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '601',
  'rdommin': '0.19'}}

Ge_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '66',
  'rdommin': '-1'}}

Hf_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '47',
  'rdommin': '-1'}}

Ho_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '29.9',
  'rdommin': '-0.05'}}

In_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '11.8',
  'rdommin': '-0.2'}}

La_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '17300',
  'rdommin': '0.5'}}

Li_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '967',
  'rdommin': '-10'}}

Lu_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '7.24',
  'rdommin': '-0.05'}}

Mn_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '425000',
  'rdommin': '-10'}}

Mo_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '331',
  'rdommin': '-2'}}

Nb_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1450',
  'rdommin': '-0.1'}}

Nd_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '7820',
  'rdommin': '1'}}

Ni_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1270',
  'rdommin': '-5'}}

Pb_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '77000',
  'rdommin': '-5'}}

Pr_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '2890',
  'rdommin': '0.18'}}

Rb_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1130',
  'rdommin': '-0.2'}}

Re_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '0.33',
  'rdommin': '-0.02'}}

Sb_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '3060',
  'rdommin': '-0.1'}}

Sc_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '51',
  'rdommin': '-5'}}

Se_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '142',
  'rdommin': '-5'}}

Sm_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '856',
  'rdommin': '0.2'}}

Sn_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '246',
  'rdommin': '-1'}}

Sr_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '49100',
  'rdommin': '6.2'}}

Ta_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '28.7',
  'rdommin': '-0.5'}}

Tb_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '66.2',
  'rdommin': '-0.05'}}

Te_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '21.5',
  'rdommin': '-0.5'}}

Th_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '575',
  'rdommin': '-0.1'}}

Tl_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '30.7',
  'rdommin': '-0.5'}}

Tm_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '9.45',
  'rdommin': '-0.05'}}

U_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '182',
  'rdommin': '-0.05'}}

V_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1340',
  'rdommin': '-5'}}

W_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '682',
  'rdommin': '-1'}}

Y_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1150',
  'rdommin': '0.8'}}

Yb_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '59.5',
  'rdommin': '-0.1'}}

Zn_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '92500',
  'rdommin': '-5'}}

Zr_ppm_ICP60


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1800',
  'rdommin': '1.3'}}

SiO2_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '93',
  'rdommin': '0.04'}}

TiO2_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '4.53',
  'rdommin': '-0.01'}}

Al2O3_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '36.7',
  'rdommin': '0.03'}}

Fe2O3_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '62.7',
  'rdommin': '0.06'}}

MnO_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '51.1',
  'rdommin': '-0.01'}}

MgO_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '23.2',
  'rdommin': '-0.01'}}

CaO_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '65',
  'rdommin': '0.01'}}

Na2O_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '11.4',
  'rdommin': '-0.01'}}

K2O_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '14.7',
  'rdommin': '-0.01'}}

P2O5_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '37.1',
  'rdommin': '-0.01'}}

BaO_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '48',
  'rdommin': '-0.01'}}

Cr2O3_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '0.19',
  'rdommin': '-0.01'}}

SrO_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '6.19',
  'rdommin': '-0.01'}}

V2O5_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '0.17',
  'rdommin': '-0.01'}}

LOI_pct_WDX


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '56.3',
  'rdommin': '0.19'}}

F_pct_ISE


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '36.4',
  'rdommin': '0.01'}}

S_pct_IR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '7.92',
  'rdommin': '-0.005'}}

Au_ppb_FA


{'rdom': {'attrunit': 'parts per billion by weight (ppb)',
  'rdommax': '59467',
  'rdommin': '-1'}}

Pd_ppb_FA


{'rdom': {'attrunit': 'parts per billion by weight (ppb)',
  'rdommax': '495',
  'rdommin': '-1'}}

Pt_ppb_FA


{'rdom': {'attrunit': 'parts per billion by weight (ppb)',
  'rdommax': '170',
  'rdommin': '-5'}}

Al_pct_AR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '3.45',
  'rdommin': '-0.01'}}

Ca_pct_AR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '19.9',
  'rdommin': '-0.01'}}

Fe_pct_AR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '14.5',
  'rdommin': '-0.01'}}

K_pct_AR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '0.87',
  'rdommin': '-0.01'}}

Mg_pct_AR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '10.8',
  'rdommin': '-0.01'}}

Na_pct_AR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '0.15',
  'rdommin': '-0.01'}}

S_pct_AR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '8.89',
  'rdommin': '-0.01'}}

Ti_pct_AR


{'rdom': {'attrunit': 'weight percent (% or pct)',
  'rdommax': '0.69',
  'rdommin': '-0.01'}}

Ag_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '93',
  'rdommin': '-0.01'}}

As_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '88300',
  'rdommin': '0.8'}}

Au_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '51.5',
  'rdommin': '-0.005'}}

B_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '93',
  'rdommin': '-5'}}

Ba_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '2810',
  'rdommin': '-1'}}

Be_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '23',
  'rdommin': '-0.05'}}

Bi_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '39.2',
  'rdommin': '-0.01'}}

Cd_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '30.4',
  'rdommin': '-0.01'}}

Ce_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '319',
  'rdommin': '4.88'}}

Co_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '143',
  'rdommin': '0.5'}}

Cr_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '508',
  'rdommin': '-0.5'}}

Cs_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '175',
  'rdommin': '0.09'}}

Cu_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1240',
  'rdommin': '-0.5'}}

Ga_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '24.9',
  'rdommin': '0.35'}}

Ge_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1.28',
  'rdommin': '-0.05'}}

Hf_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '3.72',
  'rdommin': '-0.02'}}

Hg_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '26.6',
  'rdommin': '-0.01'}}

In_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '12.7',
  'rdommin': '-0.005'}}

La_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '149',
  'rdommin': '-0.1'}}

Li_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '244',
  'rdommin': '-0.1'}}

Mn_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '15300',
  'rdommin': '-1'}}

Mo_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '77',
  'rdommin': '-0.05'}}

Nb_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '14',
  'rdommin': '-0.05'}}

Ni_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1160',
  'rdommin': '-0.5'}}

P_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '3290',
  'rdommin': '-10'}}

Pb_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '4180',
  'rdommin': '0.2'}}

Rb_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '251',
  'rdommin': '1.3'}}

Re_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '0.042',
  'rdommin': '-0.001'}}

Sb_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '2030',
  'rdommin': '-0.05'}}

Sc_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '9.8',
  'rdommin': '-0.1'}}

Se_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '26.6',
  'rdommin': '-0.2'}}

Sn_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '9.2',
  'rdommin': '-0.2'}}

Sr_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '1630',
  'rdommin': '-0.2'}}

Ta_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '0.46',
  'rdommin': '-0.01'}}

Te_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '11.9',
  'rdommin': '-0.01'}}

Th_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '63.5',
  'rdommin': '0.3'}}

Tl_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '11.6',
  'rdommin': '-0.01'}}

U_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '48.4',
  'rdommin': '0.07'}}

V_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '681',
  'rdommin': '-0.5'}}

W_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '708',
  'rdommin': '-0.05'}}

Y_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '115',
  'rdommin': '0.91'}}

Zn_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '4080',
  'rdommin': '-0.5'}}

Zr_ppm_AR


{'rdom': {'attrunit': 'parts per million by weight (ppm)',
  'rdommax': '42.5',
  'rdommin': '-0.5'}}



# Peeking at the Data
Part of our goal in this effort is to write a nominal LIMS interface component that automates as much as possible of the process behind creating logical data release products like the EarthMRI example. We'll read source data that are specified or laid out for release from the internal system, then send those data elsewhere for further processing and eventual distribution. To get a peek at what the end product might look like, the following code block uses the three table references from the data dictionary and retrieves the first few rows of each into dataframes.

As we get into this, some of what we may go back and encode into the property registry is the information needed to run automated checks against the actual data coming into our system. These would be data integrity checks at the point of packaging and distribution, in addition to QA/QC information that has come forward from the LIMS itself. Examples include verifying that data values fall within the expected range for a given unit or within detection limits, or that text values align with some type of vocabulary source. Some of that detail appears to be in the XML form of the metadata in the EarthMRI data release example (e.g., enumerated values not present in the data dictionary), so we'll work to incorporate those details into our property registry model. Ideally, our property registry should contain enough to support at least a baseline check of every data element it is "controlling."
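As a rough illustration of the kind of baseline check described above, the sketch below tests a list of values against one of the range domains (`rdommin`/`rdommax`) printed earlier. The function name `check_range` and the sample values are hypothetical, and skipping missing values rather than failing them is an assumption about how we would want the check to behave.

```python
import math

def check_range(values, rdom):
    """Return (position, value) pairs that fall outside the rdom min/max range."""
    lo = float(rdom["rdommin"])  # the metadata stores the bounds as strings
    hi = float(rdom["rdommax"])
    failures = []
    for i, v in enumerate(values):
        # Assumption: missing values (None or NaN) are skipped, not reported
        if v is None or (isinstance(v, float) and math.isnan(v)):
            continue
        if not (lo <= v <= hi):
            failures.append((i, v))
    return failures

# Range domain copied from the Be_ppm_ICP60 entry in the output above
be_rdom = {
    "attrunit": "parts per million by weight (ppm)",
    "rdommax": "190",
    "rdommin": "-5",
}

# Made-up values for illustration only
sample_values = [1.2, -5.0, None, 250.0]
print(check_range(sample_values, be_rdom))
```

The same function could be applied column by column, pairing each data column with its registry entry, though how the registry will actually store these bounds is still to be worked out.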

In addition to validation, such information can also drive on-demand transformations, from valid unit conversions to adding definitions in place of or in addition to coded values. We will also look for obvious linkage points that our data content should be able to make, something else that the MRData system has done very well. Person and organization information is inherent in many data but very rarely linked explicitly to persistent, resolvable identifiers, even though it can often be linked to those identifiers with low uncertainty. Putting a little attention on such links can be a powerful augmentation as we bring data into a higher-level context beyond an individual data release.
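A unit conversion of the kind mentioned above might look something like the following sketch. It assumes that analytical column names follow the `constituent_unit_method` pattern seen in the data dictionary (e.g., `Au_ppb_FA`), and the `PPB_PER_UNIT` table and both function names are hypothetical stand-ins for what a property registry might provide.

```python
# Hypothetical lookup: how many ppb one of each unit represents.
# Integer factors keep the arithmetic simple for pct, ppm, and ppb.
PPB_PER_UNIT = {"pct": 10_000_000, "ppm": 1_000, "ppb": 1}

def parse_column(name):
    """Split a column name like 'Au_ppb_FA' into its parts, or return None."""
    parts = name.split("_")
    # Assumption: only three-part names with a known unit are analytical columns
    if len(parts) != 3 or parts[1] not in PPB_PER_UNIT:
        return None
    constituent, unit, method = parts
    return {"constituent": constituent, "unit": unit, "method": method}

def convert(value, from_unit, to_unit):
    """Convert a concentration between weight-based units via ppb."""
    return value * PPB_PER_UNIT[from_unit] / PPB_PER_UNIT[to_unit]

info = parse_column("Au_ppb_FA")
print(info)
print(convert(730.0, info["unit"], "ppm"))  # an illustrative 730 ppb value in ppm
print(parse_column("Lab_ID"))               # not an analytical column
```

Deriving the unit from the column name is only a convenience here; a registry entry carrying an explicit unit (like the `attrunit` values shown earlier) would be a more reliable source.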

In [None]:
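# Loop over each distinct table name referenced in the data dictionary
for table in df_pattern_data_dict.Tblname.unique():
    # Find the download URL of the CSV file attached to the item that matches this table
    table_url = next((f["url"] for f in r_pattern_item["files"] if f["name"] == f'{table}.csv'), None)
    if table_url is not None:
        print(table)
        # Read only the first five rows to preview the structure
        display(pd.read_csv(table_url, nrows=5))
        print("==============")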

EMRI_Data


Unnamed: 0,Lab_ID,Field_ID,Prev_Lab_ID,IGSN,Parent_IGSN,Ref_Flag,QAQC_Sample,Job_ID,EMRI_JobNo,PID,Proj_Name,Affiliation,Date_Submitted,Date_Approved,Lat_WGS84,Long_WGS84,Orig_Lat,Orig_Long,Orig_Datum,Country,State,Location_Desc,Sample_Type,Sample_Desc,Date_Collected,Method_Collected,Sample_Source,Sample_Depth,Rock_Type,Rock_Name,Geologic_Age,Stratigraphy,Igneous_Form,Depositional_Env,Metamorphism,Facies_Grade,Meta_SourceRk,Mineralization,Alteration,Land_Cover,...,B_ppm_AR,Ba_ppm_AR,Be_ppm_AR,Bi_ppm_AR,Cd_ppm_AR,Ce_ppm_AR,Co_ppm_AR,Cr_ppm_AR,Cs_ppm_AR,Cu_ppm_AR,Ga_ppm_AR,Ge_ppm_AR,Hf_ppm_AR,Hg_ppm_AR,In_ppm_AR,La_ppm_AR,Li_ppm_AR,Mn_ppm_AR,Mo_ppm_AR,Nb_ppm_AR,Ni_ppm_AR,P_ppm_AR,Pb_ppm_AR,Rb_ppm_AR,Re_ppm_AR,Sb_ppm_AR,Sc_ppm_AR,Se_ppm_AR,Sn_ppm_AR,Sr_ppm_AR,Ta_ppm_AR,Te_ppm_AR,Th_ppm_AR,Tl_ppm_AR,U_ppm_AR,V_ppm_AR,W_ppm_AR,Y_ppm_AR,Zn_ppm_AR,Zr_ppm_AR
0,C-505148,2019-52-2,,,,No,,MRP-18627,AR20-001,20008,Marine phosphate deposits of Arkansas,Arkansas Geological Survey,24-Feb-20,14-Apr-20,35.886,-91.853,35.886,-91.853,WGS84,United States,AR,Love Hollow Quarry SE highwall,rock,,9/24/2019,single/grab,mine/quarry/prospect pit,surface,sedimentary,siltstone,ordovician,Cason Formation,,marine,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,C-505149,2019-53,,,,No,,MRP-18627,AR20-001,20008,Marine phosphate deposits of Arkansas,Arkansas Geological Survey,24-Feb-20,14-Apr-20,35.886,-91.853,35.886,-91.853,WGS84,United States,AR,Love Hollow Quarry SE highwall,rock,,9/24/2019,single/grab,mine/quarry/prospect pit,surface,sedimentary,siltstone,ordovician,Cason Formation,,marine,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,C-505150,2019-64,,,,No,Original,MRP-18627,AR20-001,20008,Marine phosphate deposits of Arkansas,Arkansas Geological Survey,24-Feb-20,14-Apr-20,35.875,-91.822,35.875,-91.822,WGS84,United States,AR,along road,rock,,10/9/2019,single/grab,natural exposure/outcrop,surface,sedimentary,siltstone,ordovician,Cason Formation,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,C-505151,2019-131,,,,No,,MRP-18627,AR20-001,20008,Marine phosphate deposits of Arkansas,Arkansas Geological Survey,24-Feb-20,14-Apr-20,35.907,-91.765,35.907,-91.765,WGS84,United States,AR,old manganese prospect,rock,,10/29/2019,single/grab,float/colluvium/talus,surface,sedimentary,sandstone,ordovician,Cason Formation,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,C-505152,2019-132,,,,No,,MRP-18627,AR20-001,20008,Marine phosphate deposits of Arkansas,Arkansas Geological Survey,24-Feb-20,14-Apr-20,35.911,-91.763,35.911,-91.763,WGS84,United States,AR,old manganese prospect,rock,,10/29/2019,single/grab,float/colluvium/talus,surface,sedimentary,sandstone,ordovician,Cason Formation,,,,,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


Limits_AnalyticalMethods


Unnamed: 0,SortOrder,Analytic_Mthds,Constituent,LowerDetectLimit,UpperDetectLimit,Unit,Instrument
0,1,C_ICPOES_MS-60,Al,0.01,25,%,ICP-OES
1,2,C_ICPOES_MS-60,Ca,0.01,35,%,ICP-OES
2,3,C_ICPOES_MS-60,Fe,0.01,30,%,ICP-OES
3,4,C_ICPOES_MS-60,K,0.01,25,%,ICP-OES
4,5,C_ICPOES_MS-60,Mg,0.01,30,%,ICP-OES


QAQC_Values


Unnamed: 0,SortOrder,GRM,ValueType,Al_pct_ICP60,Ca_pct_ICP60,Fe_pct_ICP60,K_pct_ICP60,Mg_pct_ICP60,P_pct_ICP60,S_pct_ICP60,Si_pct_ICP60,Ti_pct_ICP60,Ag_ppm_ICP60,As_ppm_ICP60,B_ppm_ICP60,Ba_ppm_ICP60,Be_ppm_ICP60,Bi_ppm_ICP60,Cd_ppm_ICP60,Ce_ppm_ICP60,Co_ppm_ICP60,Cr_ppm_ICP60,Cs_ppm_ICP60,Cu_ppm_ICP60,Dy_ppm_ICP60,Er_ppm_ICP60,Eu_ppm_ICP60,Ga_ppm_ICP60,Gd_ppm_ICP60,Ge_ppm_ICP60,Hf_ppm_ICP60,Ho_ppm_ICP60,In_ppm_ICP60,La_ppm_ICP60,Li_ppm_ICP60,Lu_ppm_ICP60,Mn_ppm_ICP60,Mo_ppm_ICP60,Nb_ppm_ICP60,Nd_ppm_ICP60,...,Ni_ppm_AR,P_ppm_AR,Pb_ppm_AR,Rb_ppm_AR,Re_ppm_AR,Sb_ppm_AR,Sc_ppm_AR,Se_ppm_AR,Sn_ppm_AR,Sr_ppm_AR,Ta_ppm_AR,Te_ppm_AR,Th_ppm_AR,Tl_ppm_AR,U_ppm_AR,V_ppm_AR,W_ppm_AR,Y_ppm_AR,Zn_ppm_AR,Zr_ppm_AR,Au_ppb_FA,Pd_ppb_FA,Pt_ppb_FA,Al2O3_pct_WDX,BaO_pct_WDX,CaO_pct_WDX,Cr2O3_pct_WDX,Fe2O3_pct_WDX,K2O_pct_WDX,LOI_pct_WDX,MgO_pct_WDX,MnO_pct_WDX,Na2O_pct_WDX,P2O5_pct_WDX,SiO2_pct_WDX,SrO_pct_WDX,TiO2_pct_WDX,V2O5_pct_WDX,F_pct_ISE,S_pct_IR
0,1,AGV-1,Preferred,9.08,3.53,4.47,2.42,0.92,0.22,<0.01,27.5,0.63,<1,<5,<10,1230,<5,<0.1,<0.2,,15.0,10.0,1.3,60.0,3.6,1.7,1.6,20.0,5.0,1.3,5.1,,<0.2,38.0,12.0,0.27,710.0,2.7,15.0,33.0,...,,,,,,,,,,,,,,,,,,,,,0.6,,,17.2,0.14,4.94,<0.01,6.77,2.92,1.2,1.53,0.09,4.26,0.5,58.8,0.08,1.05,0.02,0.04,
1,2,AGV-1,Mean,9.15,3.42,4.72,2.4,0.88,0.21,<0.01,27.86,0.6,<1,<30,<10,1188,<5,<0.1,<0.2,69.2,15.1,12.8,1.32,57.4,3.63,1.86,1.7,20.2,4.88,1.0,5.4,0.69,<0.2,38.2,10.3,0.26,722.0,3.4,13.8,31.7,...,,,,,,,,,,,,,,,,,,,,,,,,17.2,,4.98,<0.01,6.84,2.95,1.92,1.53,0.09,4.2,0.5,59.4,,1.09,,0.05,
2,11,CBT-QCM-1,Preferred,0.23,33.4,0.29,0.11,2.99,0.03,0.33,0.98,0.01,<1,5.97,<10,16000,<5,<0.1,<0.2,3900.0,1.7,<10,0.3,8.0,3.8,1.0,9.3,,18.3,<1,<1,0.56,<0.2,2800.0,<10,0.11,387.0,<2,6.0,884.0,...,,,,,,,,,,,,,,,,,,,,,,,,0.41,1.79,46.7,<0.01,0.41,0.13,41.0,4.95,0.05,0.09,0.07,2.1,0.18,0.02,<0.01,,
3,12,CBT-QCM-1,Mean,0.23,32.7,0.28,0.15,2.93,0.03,0.41,0.95,0.02,<1,<30,<10,>10000,<5,<0.1,0.2,3781.0,1.7,<10,0.33,7.6,3.73,0.93,9.51,,17.6,<1,<1,0.53,<0.2,2640.0,<10,0.09,,<2,7.0,898.0,...,,,,,,,,,,,,,,,,,,,,,,,,0.41,1.83,46.6,<0.01,0.42,0.13,41.9,4.87,0.05,0.06,0.07,2.23,,0.03,,,
4,21,DGPM-1,Preferred,4.82,0.17,1.36,2.23,0.32,0.04,0.31,>30,0.33,<1,180,93.6,1272,<5,0.10,0.34,90.5,1.36,120,8.92,13.7,3.22,2.08,0.78,11.3,3.9,2.0,9.8,0.67,<0.2,51.7,41.9,0.32,28.0,13.6,9.8,30.5,...,,,,,,,,,,,,,,,,,,,,,730.0,12.0,9.4,,,,,,,,,,,,,,,,0.09,
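
The QAQC_Values preview above mixes plain numbers with censored strings such as `<0.01` and `>10000`, so any automated check would first need to separate the qualifier from the number. The sketch below shows one way that might be done; the function name `parse_censored` and the choice of `"="` to mark an uncensored value are assumptions for illustration.

```python
def parse_censored(raw):
    """Split a raw cell value into (number, qualifier).

    The qualifier is '<' or '>' for censored values and '=' otherwise.
    Empty or missing cells return (None, None).
    """
    if raw is None:
        return (None, None)
    text = str(raw).strip()
    if not text:
        return (None, None)
    if text[0] in "<>":
        # Keep the reported limit as the number and the sign as the qualifier
        return (float(text[1:]), text[0])
    return (float(text), "=")

# Strings taken from the QAQC_Values preview above
for cell in ["<0.01", ">10000", "9.08", ""]:
    print(repr(cell), parse_censored(cell))
```

Keeping the qualifier alongside the number lets a range or detection-limit check treat censored values differently from measured ones instead of silently dropping them.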


