%matplotlib widget
import numpy as np
import matplotlib.pyplot as plt
from util import render_audio_sample

<img src="https://github.com/morganjwilliams/gs2020-diggingdeeper/raw/develop/img/ipyvolume.png" style="display:inline; float: right; margin: 0px 15px 15px 0px;" width="25%"/>


# Digging Into Deep Time and Deep Cover


<a href="https://doi.org/10.5281/zenodo.3875779"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3875779.svg" align="right" alt="doi: 10.5281/zenodo.3875779" style="padding: 0px 10px 10px 0px"></a>
<a href="https://github.com/morganjwilliams/gs2020-diggingdeeper/blob/master/LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg" align="right" alt="License: MIT" style="padding: 0px 10px 10px 0px"></a>


<span id='authors'><b>Morgan Williams <a class="fa fa-twitter" aria-hidden="true" href="https://twitter.com/metasomite" title="@metasomite"></a></b>, Jens Klump, Steve Barnes and Fang Huang; </span>
<span id='affiliation'><em>CSIRO Mineral Resources</em></span>



### Contents

| [**Abstract**](./00_overview.ipynb) | **Introduction**                                                    | [**Examples**](./00_overview.ipynb#Examples)            | [**Tools**](./00_overview.ipynb#Leveraging-the-Scientific-Python-Ecosystem) | [**Insights**](./00_overview.ipynb#Insights) |
|:-----|:-----|:-----|:-----|:-----|
|  | [Minerals Exploration](./00_overview.ipynb#An-Evolving-Role-of-Geochemistry-in-Mineral-Exploration)  | [Classification](./011_classification.ipynb) |  |  |
|  | [Data Driven Geochem](./00_overview.ipynb#Data-Driven-Geochemistry) | [Data Visualization](./012_datavis.ipynb) |  |  |

<details class='alert alert-success'>
    <summary><b>About this Presentation</b></summary>
 
This is a Binder-enabled repository to accompany the abstract
<a href="https://goldschmidt.info/2020/abstracts/abstractView?id=2020003649">"Digging into Deep Time & Deep Cover"</a> for <a href="https://goldschmidt.info/2020/program/programViewThemes#period_472_4730_12338">Goldschmidt 2020 Session 06h</a> (Development of Big Data Geochemical Networks and new Analysis and Visualization
tools: Innovative approaches for 21st Century Multidimensional and Transdisciplinary
Science; 13:45-14:45 Thursday June 25 AWST / 19:45-20:45 Wednesday June 24 HST). To view the un-rendered notebooks, have a look at the <a href="https://github.com/morganjwilliams/gs2020_diggingdeeper">repository on GitHub</a>.

This presentation has been constructed using <a href="https://jupyter.org">Jupyter notebooks</a> and <a href="https://voila.readthedocs.io">Voilà</a> as an experiment in blending what would typically be found in a conference presentation with live-rendered interactive elements in a reproducible notebook.    
</details>

## Abstract 
Increasing volumes of open data and improved data quality allow geochemists to use data-driven approaches to address large-scale geological problems. At the same time, exploration for both base and critical metals is moving into under-explored areas with deeper cover, as a significant fraction of readily identifiable near-surface resources have likely already been discovered. With the cost of discovery increasing, predictive mineral systems science looks to better integrate and utilise both new and existing data to constrain the subsurface environment. Where we can use this data to restrict viable exploration spaces, exploration efforts may be focused to reduce cost and potentially time-to-discovery.

We will demonstrate a series of data-driven and machine learning approaches to classical geological problems, principally using data from global geochemical data repositories. We’ll use multivariate whole-rock geochemistry to distinguish tectonic environments, examine shifts in global basaltic geochemistry through time, use dimensional reduction and network techniques to visualize and better understand the relationship between samples and endmembers, and use multi-modal drill core data for predictive geochemistry. These examples illustrate some of the common challenges encountered while working with geochemical data:  working across different scales, and linking geochemistry to spatiotemporal domains. We focus these towards extending existing methods through the use of multivariate statistics and visualization methods, addressing model uncertainties, and acknowledging the potential impacts of common confounding effects (such as evolution, alteration and deformation). 

We highlight where we can readily gain useful insight,  where we may be able to transfer methods and learning to new problems and scales, and how we can use data to drive geological knowledge, extract latent features, and perhaps identify some 'unknown unknowns'. Finally, we demonstrate some tools which can make these methods more approachable for geochemists, such that the methods can be better integrated into established geochemical scientific workflows.

-------

## Introduction & Context

This presentation largely focuses on one question:

> #### *How can we get the most from our geochemical data?*

As we discuss below and in other sections of this presentation, this is partly about asking the right questions, but also how we value, use and enrich our data. Below we demonstrate the construction and assessment of simple machine learning models based on whole-rock geochemistry, and discuss some aspects of how modern data analysis and visualisation techniques can further contribute to our understanding of geological processes. Note that this presentation will remain available after the conference, and is archived on Zenodo for reference. Small changes and additions may be made to this repository over time to keep it up to date and fix any issues.

For those here principally for the examples, feel free to skip ahead!

### Data Driven Geochemistry

Similar to most scientific fields, geochemistry has seen a relatively rapid growth in the volume and variety of data produced over the past few decades. Consistent efforts to develop a few domain-specific public repositories for geochemical data (including EarthChem and GEOROC) now allow large-scale statistical analysis and comparison of geochemistry on a global scale. Significant databases have also been generated by national and state geological surveys, in addition to those generated by the minerals industry through both exploration and resource characterization. Together these large repositories have made data-driven approaches to large-scale geochemical and geological problems more feasible and accessible (where understanding is principally derived from the data itself, rather than our idea of what it represents).

When it comes to data analysis, the best place to start is typically a good question. Questions around 'why' and 'how' typically require complicated reasoning, and are more often the domain of simulation and modeling - here we're largely focusing on questions addressing the 'what, where and when' of geological processes. For example, there are many practical questions to ask of an exploration dataset:

* Is this more enriched/depleted in Y than expected?
* What rocks are most similar to X?
* Which chemical signatures are associated with mineralization?
* Which features provide the most valuable information for prediction? At what scale?
* What setting did this form in?

### An Evolving Role of Geochemistry in Mineral Exploration

At the same time as we're adopting increasingly digital approaches to research and amassing large collections of data, we're encountering the growing challenges of sustainably extracting the resources used to facilitate the technology. A steady demand for 'tech metals' along with a shift towards renewable energy (and together with it, increasing demand for battery metals) coupled with limited supply (either due to geographical heterogeneity, or simply that many of these metals are produced as by-products of major commodities) renders many of these resources 'critical metals'. To ensure future metal supply, continued exploration for both well known and potentially novel mineral deposits will be required. However, like many sectors, mineral exploration teams are expected to do more with lower budgets. They're also often working in relatively data-poor scenarios (at least relative to the scale they're working at). Below we discuss some of the challenges faced in exploration, and some approaches which might be useful in 'putting our data to work' to mitigate some of the risk of mineral exploration, potentially reduce time to discoveries, and provide some more certainty for the inputs to critical minerals pipelines.

The rate of discovery for major deposits (those needed to fulfill future demand) has declined over the last few decades. While there certainly remain many undiscovered deposits which are likely amenable to discovery following traditional exploration approaches targeting near-surface deposits, many of the deposits that have been discovered recently are of lower quality or volume than some of the large high-grade deposits which global resources companies have been built on. As a result, the search space for mineral exploration is expanding to include areas with more significant cover, and with it approaches to exploration are changing. As exploration pursues deeper targets, one of the principal challenges is the relative lack of geological information, especially considering classical approaches are geared towards finding deposits with readily-identifiable surface expressions (e.g. deposits you can 'kick'). Exploration geochemistry still has a key role to play in these scenarios, but increasingly the integration of data from various sources will be key to identifying signatures of buried deposits (e.g. including geophysics, remote sensing, groundwater and regolith chemistry and detrital indicators; all of these provide information over a variety of scales!). The shallow subsurface is an accessible frontier, and typically remains under-explored.

Despite the evolving challenges of mineral exploration, budgets for exploration are increasingly tight and continue to be at least partly tied to resources cycles. Using existing data to enrich exploration processes and reduce exploration search spaces is one way in which exploration teams can adapt to 'do more with less'. Further, using data to make decisions around sampling activities and targeting in near-real time (i.e. active sampling) could allow adaptive exploration campaigns which return higher information value, reduce search spaces faster with lower costs and potentially lower risk. Mineral exploration is the beginning of the resource value chain, with a relatively long lead time to resource development and extraction. While there are no guarantees, honing approaches to mineral exploration to better cover search spaces and making the most of both new and existing data should decrease average times to discovery, and solidify the longer-term viability of growing and emergent technologies dependent on sustained supply of critical metals.

While it's difficult to provide useful constraints on questions along the lines of 'What is there left to discover?', useful constraints on 'Where could it be?' are attainable. From a high level, the use of geochemistry to understand geological reservoirs and processes can provide first-order constraints to reduce the exploration search space. Particularly in older terranes, geochemistry has a larger role in providing geological context which has been lost through deformation, alteration and passage of time. However, the certainty with which we can use geochemistry to provide information about geological environments into deep time is limited, and increasingly so the further back we go. The magnitude of evolution from the early Earth to modern environments (those we can directly observe) necessitates that we consider what some of these limits may be, and how appropriate our models based on modern geological systems are in deep time. In exploration these problems typically have a strong spatial component, but here we focus largely on the geochemistry. The remainder of this presentation focuses on data-driven approaches to provide these first-order constraints, and how we can adapt our approach to classical geological problems with modern data analysis and visualization techniques.

<!--
Prospectivity and Fertility - While mineral deposits are often considered to be 'unique', the environments we find them in exhibit similarities we can exploit on larger scales. Even where an area may be prospective, the system may not be 'fertile' for forming mineral deposits
-->


### Adopting a Programmatic Approach

While there exists a range of data analysis software one could use to help derive insight from datasets, there are a number of advantages to adopting a programmatic approach ("writing code to ask questions of our data"):

* We can **avoid making simple errors** (e.g. see "One in five genetics papers contains errors thanks to Microsoft Excel" [doi: 10.1186/s13059-016-1044-7](https://doi.org/10.1186/s13059-016-1044-7)).
* We can **make our analysis repeatable** - it's not dependent on a particular sequence of mouse clicks and potentially unrecorded data analysis/reduction options. By recording the environment under which it was done (e.g. software versions, platform etc), you can also make sure that someone else can get the same results and come to the same conclusions, **making the analysis reproducible**.
* Using version control together with a code-based approach to data analysis means we can effectively **version the process** (and potentially tie versions of data analysis pipelines to versions of datasets). This allows our data analysis to evolve without fear of lost history, and potentially identify where errors may have been present after the fact.
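As a minimal sketch of what "recording the environment" might look like in practice, the snippet below collects the Python version, platform and installed versions of a few packages using only the standard library. The function name `describe_environment` and the package list are illustrative assumptions, not part of this repository; dedicated tools (e.g. lock files or environment exports) would typically do this more thoroughly.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Return a dictionary describing the current analysis environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly rather than failing
            versions[name] = None
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


# Hypothetical package list; adjust to whatever the analysis depends on
environment = describe_environment(["numpy", "pandas", "matplotlib"])
for key, value in environment.items():
    print(key, value)
```

Saving a record like this next to the outputs of an analysis (and committing it with the code) is one simple way to tie results to the software that produced them.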

Beyond this, the flexibility of a programmatic approach also lends itself towards developing new tools which you can integrate into your own workflows, automating repetitive work, using analysis for decision support, and potentially productivity gains (although, this is never guaranteed; 'better science' is largely our goal here).

However, perhaps one of the key benefits of a programmatic approach for data-driven geochemistry is that it **changes our perspective** (e.g. beyond 2D and 3D diagrams to multidimensional analysis), and changes the questions we ask. Particularly, it allows us to more easily quantify or estimate uncertainties, investigate how well our data support our models and better supports iterative testing and model development.    

Finally, community adoption of open science practices and open source software will contribute to the socialization of data, ideas, methods, code and analyses. It is also a viable pathway to developing consensus on **best practice** in a new era of data-driven geoscience, and developing community-driven research software considering interoperability and flexibility (open data formats, common standards).

## Examples

<img src="https://github.com/morganjwilliams/gs2020-diggingdeeper/raw/develop/img/ipyvolume.png" style="display:inline; float: right; margin: 0px 15px 15px 0px;" width="30%"/>


We've chosen to provide a few practical examples to illustrate how we might adapt our approaches to common problems in geochemistry using modern data analysis and machine learning, and included links to these below. For brevity, these don't quite cover everything we had originally planned to present at an in-person event. Instead we've included a few notes below on the aspects which we have excluded, as the concepts and ideas fit well within the theme of 'data driven geochemistry'. 

<div class='alert alert-success'> <b>Note:</b> the links below will open separate notebooks, and take a short while to execute and load.</div>

The [**first notebook**](./011_classification.ipynb) presents an example of multivariate geochemical classification applied to the tectonic discrimination of basalts. While it has minerals exploration relevance, it is generally relevant to the classification of rock and mineral chemistry where natural (*or artificial*) groupings and segmentations are known (supervised classification). Some of the latter sections of this example could also be directly applied to understand some of the relationships between samples for unsupervised classification/clustering.

The [**second notebook**](./012_datavis.ipynb) follows from the first, and presents a few data visualization examples which can be used for geochemical data exploration, and as one way of communicating the results of classification models and their uncertainties.

### Other Opportunities in Data Integration and Exploration

We've included a few notes below explaining some of the key aspects of what was originally intended to be covered in examples around multivariate regressions (including space and time, and using multi-modal data), and using graph/network analysis in geochemistry. Both of these present interesting challenges and opportunities for data-driven investigation of geochemical, mineralogical and petrological datasets.

Using geochemical data within a spatiotemporal domain presents a number of challenges, some of which we can begin to address, and some of which (at least, as of now) have no simple solutions. In the spatial domain, geostatistics provides a number of methods to deal with spatial continuity, but requires some assumptions about stationarity which are typically only valid on limited spatial scales. Further, even simple problems like dealing with spatial similarity on a global scale can be troublesome when the fact that we live on an oblate spheroid is ignored (consider two samples from Russia at 66°04'48.0"N 179°54'29.7"W and 66°05'49.4"N 179°00'32.7"E - they're very close, but depending on your metric, they could be almost 360° apart).
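To make the antimeridian example concrete, the sketch below compares the naive difference in longitude with a great-circle distance from the haversine formula for the two locations quoted above. It assumes a spherical Earth with a mean radius of 6371 km (a simplification of the oblate spheroid), and the helper names are illustrative only.

```python
import math


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance on a sphere (assumed mean Earth radius)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * radius_km * math.asin(math.sqrt(a))


lat1, lon1 = dms_to_decimal(66, 4, 48.0, "N"), dms_to_decimal(179, 54, 29.7, "W")
lat2, lon2 = dms_to_decimal(66, 5, 49.4, "N"), dms_to_decimal(179, 0, 32.7, "E")

naive_lon_gap = abs(lon2 - lon1)  # difference in raw longitude values
distance_km = haversine_km(lat1, lon1, lat2, lon2)

print(f"Naive longitude difference: {naive_lon_gap:.2f} degrees")
print(f"Great-circle distance: {distance_km:.1f} km")
```

The raw longitudes differ by nearly 360°, while the great-circle distance is only some tens of kilometres; a distance metric which respects the geometry of the globe avoids treating these neighbouring samples as distant.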

Especially when it comes to dealing with geological time, there are biases induced by geological processes themselves (destruction and preservation of crust, and specific lithologies within it). We also have biases induced by our sampling - choosing black rocks, only being able to access certain areas and the upper (typically exposed) sections of the crust, and collecting rocks within a reasonable distance of civilization. While we can adjust our sampling to explore new areas, or maybe even drill deeper, we have less choice when it comes to sampling in geological time. Methods like weighted bootstrap resampling can address the uneven distribution of data, but can't address concepts such as 'representativeness' into geological time, and we'll always need to make some assumptions there (but less so as we integrate more datasets and interlink deep time records, including with tectonic models!). Regardless of the magnitude of our efforts, spatial and time domains in large-scale geochemical problems will remain undersampled for quite some time (consider how little of the ocean floor we've explored, and how much less we've sampled!).
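The idea behind weighted bootstrap resampling can be sketched with a small synthetic example. Here each sample is weighted by the inverse of the number of samples in its age bin, so that densely sampled intervals don't dominate the resampled statistics. The data, bin width and function names are assumptions chosen for illustration; published approaches typically use more careful weighting schemes (e.g. accounting for both spatial and temporal proximity).

```python
import random
from collections import Counter


def inverse_density_weights(ages, bin_width):
    """Weight each sample by 1 / (number of samples in its age bin)."""
    bins = [int(age // bin_width) for age in ages]
    counts = Counter(bins)
    return [1.0 / counts[b] for b in bins]


def weighted_bootstrap_means(values, weights, n_resamples=1000, seed=0):
    """Means of weighted resamples (with replacement) of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(values, weights=weights, k=n)
        means.append(sum(resample) / n)
    return means


# Synthetic data: 90 young samples and 10 old samples (ages in Ma),
# with an arbitrary geochemical variable which differs between the two
ages = [50.0] * 90 + [2500.0] * 10
values = [1.0] * 90 + [3.0] * 10

weights = inverse_density_weights(ages, bin_width=500.0)
means = weighted_bootstrap_means(values, weights)

raw_mean = sum(values) / len(values)
weighted_mean = sum(means) / len(means)
print(f"Unweighted mean: {raw_mean:.2f}")
print(f"Weighted bootstrap mean: {weighted_mean:.2f}")
```

The unweighted mean is dominated by the abundant young samples, while the weighted resamples give each age bin a similar influence. The spread of the resampled means also provides a simple estimate of uncertainty, although it says nothing about whether the preserved samples are representative in the first place.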

Combining space, time and variables of interest surely doubles down on some of these complexities, and how best to consider and deal with these problems remains an active area of research.

#### The Concept of Rosetta: Predictive Geochemistry from Multi-modal Drillcore Data

[Rosetta] is a platform developed by CSIRO Mineral Resources to provide probabilistic chemistry and mineralogy predictions from hyperspectral images of drill core. It's an interesting example of a problem which can be approached with multivariate regression integrating at least one spatial dimension, and of increasing the value of data and analytics processes through data integration. While Rosetta is currently applied in a resource characterization framework, there are potential lessons to learn regarding integrating spatial information, point analyses and imagery which could readily translate to geochemical research which utilizes large numbers or volumes of samples (potentially over a range of scales; e.g. mapping thin sections for petrological studies, scanning core from scientific drilling).

One key challenge this addresses is the contrast between the sampling resolution of geochemical analyses (e.g. XRF, XRD and less frequent bulk-rock assays of small sections of round/half-round core) and the magnitude of ore deposits - together with the uncertainties and variabilities which result. With the advent of routine automated core scanning using hyperspectral imagery, Rosetta provides a method of predicting the geochemistry and mineralogy of unsampled core using a model trained on existing databases. These constraints provide useful operational inputs for further drilling, targeted follow-up analyses (and model improvement), in addition to fundamental metallurgical data with uncertainty estimates which can be used in forecasting and planning. The probabilistic outputs and model uncertainties are perhaps one of the most valuable features (rather than a secondary output), as uncertainty and unforeseen variability can be costly; estimation and management of uncertainties (e.g. through targeted sampling) is a key part of the operational use of the tool.

The platform puts existing data to work through adding context to increase the overall value and maximize the "information content" of new data, and uses lower-cost high-coverage data (i.e. hyperspectral imaging and XRF measurements) to provide a scaffold for prediction of higher-cost geochemical assays. 

Additionally, the concept involves an interesting crossover in terms of spatial domains and scales, where high-resolution imaging is integrated with low-resolution data and point analyses. Spatial context is typically under-utilized, but in these scenarios may be exploited to some extent to provide 'super-resolution' prediction of undersampled variables.

[Rosetta]: https://www.csiro.au/en/Research/MRF/Areas/Resourceful-magazine/Issue-13/A-rosetta-stone-for-ore "A Rosetta Stone for ore"

#### Networks: Making Use of Data Relationships and Integrating Qualitative and Categorical Information

While we have a wealth of methods for applying data analysis to understanding the relationships between arrays of continuous numeric variables, where that data is non-continuous, non-numeric, boolean or qualitative, we have far fewer options.

Graph or network analysis is an emerging method in geochemistry which provides some powerful tools for working with these kinds of data, which may otherwise be relegated to the appendix. These methods can provide constraints on relationships between samples, experiments or locations. Network analysis has increasingly been used in ecological studies, and also applied to similar geological domains (e.g. geobiology and the fossil record; [Muscente et al., 2018]). Over the past five years, network analysis has also been applied to investigate mineralogical associations and mineralogical diversity [Morrison et al. (2017)], and for community detection. [Huang et al . (2019)] provide an interesting example of community detection using [Louvain](https://en.wikipedia.org/wiki/Louvain_modularity) modularity (a specific measure of division or 'separateness' within a network; [Blondel, 2008]) to investigate the systematics of a wide range of serpentinization experiments (with different experimental parameters and conditions) to deconvolve the significant factors for the production of hydrogen and hydrocarbons during serpentinization.
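To illustrate what modularity measures, the sketch below computes it for a given partition of a small unweighted, undirected network using only the standard library. It evaluates the quantity which the Louvain method seeks to maximise, but is not an implementation of the Louvain algorithm itself (network analysis libraries provide that). The node labels and edges are made up for illustration. For $m$ edges, with $L_c$ edges inside community $c$ and $d_c$ the summed degree of its nodes, the modularity is $Q = \sum_c \left[ L_c / m - (d_c / 2m)^2 \right]$.

```python
from collections import defaultdict


def modularity(edges, communities):
    """Modularity Q of a partition of an unweighted, undirected graph."""
    m = len(edges)
    membership = {
        node: index for index, nodes in enumerate(communities) for node in nodes
    }
    internal = defaultdict(int)    # edges with both ends in the same community
    degree_sum = defaultdict(int)  # summed node degree for each community
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical network: two tightly linked groups joined by a single edge
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),  # first group
    ("d", "e"), ("e", "f"), ("d", "f"),  # second group
    ("c", "d"),                          # bridge between the groups
]

q_split = modularity(edges, [{"a", "b", "c"}, {"d", "e", "f"}])
q_single = modularity(edges, [{"a", "b", "c", "d", "e", "f"}])
print(f"Two communities: Q = {q_split:.3f}")
print(f"One community:   Q = {q_single:.3f}")
```

Splitting the network at the bridge gives a positive modularity, while lumping everything into a single community gives zero; community detection methods search for the partition which makes this value as large as possible.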


[Morrison et al. (2017)]: https://doi.org/10.2138/am-2017-6104CCBYNCND "Morrison, S.M., Liu, C., Eleish, A., Prabhu, A., Li, C., Ralph, J., Downs, R.T., Golden, J.J., Fox, P., Hummer, D.R., Meyer, M.B., Hazen, R.M., 2017. Network analysis of mineralogical systems. American Mineralogist 102, 1588. doi: 10.2138/am-2017-6104CCBYNCND"

[Muscente et al., 2018]: https://doi.org/10.1073/pnas.1719976115 "Muscente, A.D., Prabhu, A., Zhong, H., Eleish, A., Meyer, M.B., Fox, P., Hazen, R.M., Knoll, A.H., 2018. Quantifying ecological impacts of mass extinctions with network analysis of fossil communities. PNAS 201719976. doi: 10.1073/pnas.1719976115"

[Huang et al . (2019)]: https://agu2019fallmeeting-agu.ipostersessions.com/Default.aspx?s=44-54-89-D3-59-E5-B5-85-69-91-A2-9C-54-4F-60-A9 "Huang et al., 2019. A Review of H₂ and Hydrocarbon Formation in Serpentinization Experiments Using Network Analysis. AGU Fall Meeting 2019 Abstracts."

[Blondel, 2008]: https://doi.org/10.1088/1742-5468/2008/10/P10008 "Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E., 2008. Fast unfolding of communities in large networks. J. Stat. Mech. 2008, P10008. doi: 10.1088/1742-5468/2008/10/P10008"

<img src="https://www.python.org/static/community_logos/python-logo-master-v3-TM.png" style="display:inline; float: right; margin: 0px 15px 15px 0px;" width="40%"/>

## Leveraging the Scientific Python Ecosystem

The scientific Python ecosystem is large and relatively mature; a wide variety of numerical, visualization and utility packages exist which make for a solid foundation for data science projects, and from which to build more specialized tools. 

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/38/Jupyter_logo.svg/518px-Jupyter_logo.svg.png" style="display:inline; float: right; margin: 0px 15px 15px 0px;" width="15%"/>

Some of the foundational packages include `numpy` and `pandas` for working with array-based and tabular data ([Oliphant, 2006]; [van der Walt, 2011]; [McKinney, 2010]), and `matplotlib` for (principally) static visualization ([Hunter, 2007]). Another key Python package for machine learning is `scikit-learn` ([Pedregosa, 2011]).


Beyond this, there is an ecosystem of tools which capitalize on the cross-language capabilities of [Jupyter](https://jupyter.org/), which provides notebooks and interactive environments for development, visualization and communication (e.g. these notebooks are running via <a href="https://voila.readthedocs.io">Voilà</a> in a Jupyterhub generously hosted by [mybinder](https://mybinder.org/)).



### [pyrolite]: Python for Geochemistry
<img src="https://pyrolite.readthedocs.io/en/master/_static/icon.png" style="display:inline; float: right; margin: 0px 15px 15px 0px;" width="25%"/>

pyrolite is a Python package developed specifically for working with geochemical data, and has recently been [published in the Journal of Open Source Software]. The package includes functions to work with compositional data, to transform geochemical variables (e.g. elements to oxides), functions for common plotting tasks (e.g. spiderplots, ternary diagrams, bivariate and ternary density diagrams), and numerous auxiliary utilities.

``pyrolite`` and related tools are built upon already-existing and widely used tools for working with tabular data (``pandas``) and visualization (``matplotlib``). As a result, it generally follows their conventions and syntax, and also exposes their API such that it can be readily accessed. In particular, the API makes use of dataframe accessor classes provided by ``pandas`` to add additional dataframe 'namespaces' (e.g. accessing the ``pyrolite`` spiderplot method via `df.pyroplot.spider()`). This approach allows ``pyrolite`` to use more familiar syntax, helping geochemists new to Python to hit the ground running, and encouraging development of transferable knowledge and skills.
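For those curious how dataframe accessors work, the sketch below registers a small custom namespace using the `pandas` extension API (`pandas.api.extensions.register_dataframe_accessor`). The accessor name `geochem`, its `renormalise` method and the example compositions are hypothetical and chosen purely for illustration; this is not pyrolite's implementation.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("geochem")  # hypothetical namespace
class GeochemAccessor:
    """Adds a `df.geochem` namespace to every DataFrame."""

    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def renormalise(self, total=100.0):
        """Scale each row so that its components sum to `total`."""
        row_sums = self._obj.sum(axis=1)
        return self._obj.div(row_sums, axis=0) * total


# Made-up partial compositions (wt%), for illustration only
df = pd.DataFrame(
    {"SiO2": [50.0, 45.0], "MgO": [8.0, 30.0], "CaO": [10.0, 5.0]}
)

closed = df.geochem.renormalise()
print(closed.sum(axis=1))
```

Once the accessor is registered, the new methods are available on any dataframe through the chosen namespace, alongside the usual `pandas` methods.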

<div class='alert alert-success'>
<b>See Also:</b> For more information and interactive demonstration of pyrolite and a few other software tools developed for geochemistry,  have a look at <a href="https://mybinder.org/v2/gh/morganjwilliams/gs2020-python4geochem/master?urlpath=/voila/render/00_overview.ipynb">our other Goldschmidt 2020 presentation</a>, or alternatively check out the <a href="https://github.com/morganjwilliams/gs2020-python4geochem/">relevant repository</a>.
</div>

[Oliphant, 2006]: https://numpy.org/ "Oliphant, T.E., 2006. A guide to NumPy. Trelgol Publishing USA."

[van der Walt, 2011]: https://numpy.org/ "van der Walt, S., Colbert, S.C., Varoquaux, G., 2011. The NumPy Array: A Structure for Efficient Numerical Computation. Computing in Science Engineering 13, 22–30. https://doi.org/10.1109/MCSE.2011.37"

[Hunter, 2007]: https://matplotlib.org/ "Hunter, J.D., 2007. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9, 90–95. https://doi.org/10.1109/MCSE.2007.55"

[McKinney, 2010]: https://pandas.pydata.org/pandas-docs/stable/ "McKinney, W., 2010. Data structures for statistical computing in python, in: van der Walt, S., Millman, J. (Eds.), Proceedings of the 9th Python in Science Conference. pp. 51–56."

[Pedregosa, 2011]: https://scikit-learn.org/ "Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, É., 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12, 2825−2830."

[pyrolite]: https://pyrolite.rtfd.io "pyrolite documentation website"

[published in the Journal of Open Source Software]: https://joss.theoj.org/papers/10.21105/joss.02314 "Williams et al., (2020). pyrolite: Python for geochemistry. Journal of Open Source Software, 5(50), 2314, doi: 10.21105/joss.02314"

<!--
## Insights, Perspectives and Conclusions

### Knowledge from Data

We have a variety of formal and informal models for how we understand the workings of our (and other) planets, but with new data we should continually revisit some of these to see whether we can improve them or learn something new

### Transferring Methods

From other fields to geochemistry

From other scales - higher dimensional imagery, micro-GIS etc

Predictive geochemistry and properties - best 'bang for buck' under analyses, transforming point-analyses into spatial/image predictions

### "Unknown Unknowns" - the ones we don't know we don't know.

What does this mean?

How do you find them?

* In practical sense, this largely relates to discovering unknown structure and relationships in data

* Links between timescales, space and geochemistry
-->
