# Python Packages and Resources for Geologists

Python is a widely used programming language, and developers working in Earth Science have created many libraries and packages to solve geological problems. For example, the <a href="https://github.com/softwareunderground/awesome-open-geoscience">Awesome Open Geoscience repository</a> and a <a href="https://agu-h3s.org/2021/03/29/resources-for-programming-in-hydrology/">blog post</a> by the American Geophysical Union Hydrology Section Student Subcommittee (AGU-H3S) list many Python projects, grouped by application field. The book "Pythonic Geodynamics" (Morra, 2018) presents several examples of Python programming applied to geodynamic modeling. Finally, you should look at <a href="https://pangeo.io/">Pangeo</a>, a community of researchers who work together to develop a platform for Big Data geoscience.

In the following, I list Python libraries and resources that have been developed for Earth Scientists. I am aware that the list is partial and that it will start aging the day after its publication. Still, I think it can be a starting point for novices. Note that I took each description from the library's own documentation and included a link to each project.

## Python libraries for Geologists

<h4 style="margin-bottom:4px">ArcGIS API for Python</h4> 
<a href="https://developers.arcgis.com/python">ArcGIS API</a> for Python is a powerful Python library for mapping, spatial analysis, data science, geospatial AI and automation.

<h4 style="margin-bottom:4px">APSG</h4> 
<a href="https://apsg.readthedocs.io">APSG</a> defines several new Python classes to easily manage, analyze, and visualize orientational structural geology data. 

<h4 style="margin-bottom:4px">Badlands</h4> 
<a href="https://badlands.readthedocs.io">Badlands</a>    is an open-source Python-based code that can be used to simulate Basin and Landscape Dynamics. 

<h4 style="margin-bottom:4px">BERT</h4>
Boundless Electrical Resistivity Tomography (<a href="http://resistivity.net/bert">BERT</a>) is a software package for modelling and inversion of ERT data. It was originally written as C++ apps based on the pyGIMLi core library, plus bash scripts for the command line, but it increasingly uses Python through pyGIMLi and pybert, not only for visualization but also for computing.

<h4 style="margin-bottom:4px">Cartopy</h4> 
<a href="https://scitools.org.uk/cartopy">Cartopy</a> is a Python package designed for geospatial data processing in order to produce maps and other geospatial data analyses.

<h4 style="margin-bottom:4px">cbsyst</h4> 
<a href="https://github.com/oscarbranson/cbsyst">cbsyst</a> is a Python module for calculating seawater carbon and boron chemistry.

<h4 style="margin-bottom:4px">cf-python</h4>
The Python <a href="https://github.com/NCAS-CMS/cf-python">cf package</a> is an Earth Science data analysis library that is built on a complete implementation of the CF data model.

<h4 style="margin-bottom:4px">Devito</h4> 
<a href="http://www.devitoproject.org">Devito</a> is a Python package to implement optimized stencil computation (e.g., finite differences, image processing, machine learning) from high-level symbolic problem definitions. Devito builds on SymPy and uses automated code generation and just-in-time compilation to execute optimized computational kernels on several computer platforms, including CPUs, GPUs, and clusters thereof.

<h4 style="margin-bottom:4px">DensityX</h4> 
<a href="https://github.com/kaylai/DensityX">DensityX</a> is a Python script that takes an excel spreadsheet containing major oxide data, T, and P for a silicate melt and outputs the density of each sample as a new excel spreadsheet.

<h4 style="margin-bottom:4px">detritalPy</h4> 
<a href="https://github.com/grsharman/detritalPy">detritalPy</a> is a Python module for visualizing and analyzing detrital geo-thermochronologic data.

<h4 style="margin-bottom:4px">diffusion_chronometry</h4> 
<a href="https://github.com/jlubbersgeo/diffusion_chronometry">diffusion_chronometry</a> is a repository by <a href="https://twitter.com/caldera_curator">Jordan Lubbers</a> for all things pertaining to the modelling of diffusive equilibration of trace elements in minerals. Rather than relying on a set of fancy functions, the Jupyter notebooks are built "from scratch" so that as much of the model building as possible stays transparent.

<h4 style="margin-bottom:4px">EQcorrscan</h4> 
<a href="https://github.com/eqcorrscan/EQcorrscan">EQcorrscan</a> is a Python package for the detection and analysis of repeating and near-repeating seismicity.

<h4 style="margin-bottom:4px">Fastscape</h4> 
<a href="https://github.com/fastscape-lem/fastscape">Fastscape</a> is a Python package that provides many small model components (i.e., processes) to use with the xarray-simlab modeling framework. Those components can readily be combined to create custom Landscape Evolution Models (LEMs).

<h4 style="margin-bottom:4px">Fatiando a Terra</h4> 
<a href="https://www.fatiando.org">Fatiando a Terra</a> develops and maintains Python packages for geophysical data processing and modeling, such as Verde (spatial data processing and interpolation using Green's functions), Harmonica (processing and modeling gravity and magnetic data), and Boule (reference ellipsoids for geodesy and geophysics).

<h4 style="margin-bottom:4px">FloPy</h4>
<a href="https://github.com/modflowpy/flopy">FloPy</a> is a Python package for creating, running, and post-processing MODFLOW-Based models.

<h4 style="margin-bottom:4px">geemap</h4> 
<a href="https://geemap.org">geemap</a> is a Python package for interactive mapping with Google Earth Engine (GEE).

<h4 style="margin-bottom:4px">GemGIS</h4> 
The aim of <a href="https://github.com/cgre-aachen/gemgis">GemGIS</a> is to become a bridge between conventional geoinformation systems (GIS) such as ArcGIS and QGIS, and geomodeling tools such as GemPy, allowing simpler and more automated workflows from one environment to the other.

<h4 style="margin-bottom:4px">GemPy</h4> 
<a href="https://www.gempy.org">GemPy</a> is a tool for generating three-dimensional structural geological models in Python. It allows the user to create complex combinations of stratigraphical and structural features such as folds, faults, and unconformities. It was furthermore designed to enable probabilistic modeling to address parameter and model uncertainties.

<h4 style="margin-bottom:4px">GeoPandas</h4> 
<a href="https://geopandas.org">GeoPandas</a> is an open-source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types.

<h4 style="margin-bottom:4px">GeostatsPy</h4>
<a href="https://github.com/GeostatsGuy/GeostatsPy">GeostatsPy</a> brings GSLIB: Geostatistical Library functions to Python. GSLIB is a practical and extremely robust set of code for building spatial modeling workflows.

<h4 style="margin-bottom:4px">gprMax</h4> 
<a href="https://www.gprmax.com">gprMax</a>   is open-source software that simulates electromagnetic wave propagation. It solves Maxwell’s equations in three dimensions by using the finite-difference time-domain method. gprMax was designed for modeling ground-penetrating radar but can also be used to model electromagnetic wave propagation for many other applications.
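
To give a feel for the finite-difference time-domain idea mentioned above, here is a minimal one-dimensional sketch in plain Python. It is not gprMax code and does not use the gprMax API: the function name `run_fdtd_1d`, the grid size, the Gaussian source, and the normalized units (Courant number 0.5, free space) are all assumptions chosen for illustration, whereas gprMax itself works in three dimensions with real material properties.

```python
import math


def run_fdtd_1d(n_cells=200, n_steps=140, source_cell=100, t0=40.0, spread=12.0):
    """Propagate a Gaussian pulse on a 1D free-space grid (normalized units)."""
    ez = [0.0] * n_cells  # electric field samples
    hy = [0.0] * n_cells  # magnetic field samples, staggered by half a cell
    for step in range(n_steps):
        # Update E from the spatial difference of H (Courant number 0.5).
        for k in range(1, n_cells):
            ez[k] += 0.5 * (hy[k - 1] - hy[k])
        # Soft source: add a Gaussian pulse in time at a single cell.
        ez[source_cell] += math.exp(-0.5 * ((step - t0) / spread) ** 2)
        # Update H from the spatial difference of E.
        for k in range(n_cells - 1):
            hy[k] += 0.5 * (ez[k] - ez[k + 1])
    return ez


ez = run_fdtd_1d()
# Locate the right-going pulse: it should have moved away from the source cell.
right_half = ez[100:]
peak_cell = 100 + right_half.index(max(right_half))
print("right-going pulse peak near cell", peak_cell)
```

The two inner loops are the whole method in miniature: the electric and magnetic fields are updated alternately from each other's spatial differences, and the pulse travels outward from the source in both directions.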

<h4 style="margin-bottom:4px">HyVR</h4>
The Hydrogeological Virtual Reality simulation package (<a href="https://github.com/driftingtides/hyvr">HyVR</a>) is a Python module that helps researchers and practitioners generate subsurface models with multiple scales of heterogeneity that are based on geological concepts. The simulation outputs can then be used to explore groundwater flow and solute transport behavior. This is facilitated by HyVR outputs in the input formats of common flow simulation packages. Given that each site is unique, HyVR has been designed for users to take the code and extend it to suit their particular simulation needs.

<h4 style="margin-bottom:4px">Lasio</h4> 
<a href="https://github.com/kinverarity1/lasio">Lasio</a>   is a Python package to read and write Log ASCII Standard (LAS) files, which are used for borehole data such as geophysical, geological, or petrophysical logs. It is compatible with versions 1.2 and 2.0 of the LAS file specification, published by the Canadian Well Logging Society. Support for LAS 3 is ongoing. In principle, it is designed to read as many types of LAS files as possible, including those containing common errors or non-compliant formatting.
Sometimes we want a higher-level object, for example, to contain methods that have nothing to do with LAS files. We may want to handle other well data, such as deviation surveys, tops (aka picks), engineering data, striplogs, synthetics, and so on. This is where welly comes in. 
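
To make the LAS layout concrete, the following standard-library sketch reads the curve names and the data section of a tiny, made-up LAS 2.0 snippet. It is only an illustration of the file structure, not how lasio works internally; the sample content, the function name `read_las_curves`, and the simplified parsing rules are assumptions. For real files, which are often wrapped or non-compliant, lasio is the better choice.

```python
# A made-up, minimal LAS 2.0 snippet (hypothetical well data).
SAMPLE_LAS = """\
~Version
VERS. 2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
WRAP. NO  : ONE LINE PER DEPTH STEP
~Well
NULL. -999.25 : NULL VALUE
~Curve
DEPT.M    : Depth
GR  .GAPI : Gamma ray
~ASCII
1000.0  45.2
1000.5  -999.25
1001.0  51.7
"""


def read_las_curves(text, null_value=-999.25):
    """Return {mnemonic: [values]} built from the ~Curve and ~ASCII sections."""
    section = None
    names = []
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("~"):
            section = line[1].upper()  # sections are identified by their first letter
            continue
        if section == "C":
            # The curve mnemonic is everything before the first dot.
            names.append(line.split(".", 1)[0].strip())
        elif section == "A":
            values = [float(token) for token in line.split()]
            # Replace the null marker with None so gaps are explicit.
            rows.append([None if v == null_value else v for v in values])
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


curves = read_las_curves(SAMPLE_LAS)
print(curves["DEPT"])
print(curves["GR"])
```

Each curve ends up as a list aligned with the depth column, which is roughly the "one array per log" view that higher-level tools build on.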

<h4 style="margin-bottom:4px">Landlab</h4>
<a href="https://github.com/landlab/landlab">Landlab</a>   is an open-source Python package for numerical modeling of Earth surface dynamics. It contains (1) a gridding engine that represents the model domain and that supports regular and irregular grids; (2) a library of process components, each of which represents a physical process (e.g., generation of rain, erosion by flowing water); (3) utilities that support general numerical methods, file input and output, and visualization. In addition Landlab contains a set of Jupyter notebook tutorials that introduce core concepts and give examples of use.

<h4 style="margin-bottom:4px">LakePy</h4> 
<a href="https://github.com/ESIPFed/LakePy">LakePy</a> is the pythonic user-centered front-end to the Global Lake Level Database. This package can instantly deliver lake water levels for more than 2000 lakes scattered across the globe.

<h4 style="margin-bottom:4px">latools</h4> 
Laser Ablation Tools (<a href="https://latools.readthedocs.io">latools</a>) is a Python toolbox for processing Laser Ablation Mass Spectrometry (LA-MS) data.

<h4 style="margin-bottom:4px">litholog</h4> 
<a href="https://litholog.readthedocs.io">litholog</a>    is focused on providing a framework to digitize, store, plot, and analyze sedimentary graphic logs.

<h4 style="margin-bottom:4px">Loop</h4> 
<a href="https://loop3d.github.io">Loop</a> is an open-source 3D probabilistic geological and geophysical modelling platform, initiated by Geoscience Australia and the OneGeology consortium. The project is funded by Australian Territory, State and Federal Geological Surveys, the Australian Research Council and the MinEx Collaborative Research Centre. It includes the LoopStructural and map2loop packages, intended for 3D geological modelling.

<h4 style="margin-bottom:4px">LSDTopoTools</h4>
<a href="https://lsdtopotools.github.io">LSDTopoTools</a> is a software package for analysing topography. Applications of these analyses span hydrology, geomorphology, soil science, ecology, and cognate fields. The serious number crunching in LSDTopoTools is done in C++ code, but the output needs to be visualised with either a GIS or Python.

<h4 style="margin-bottom:4px">MetPy</h4> 
<a href="https://unidata.github.io/MetPy">MetPy</a>   is a collection of tools in Python for reading, visualizing, and performing calculations with weather data. 

<h4 style="margin-bottom:4px">MIMiC</h4> 
Melt inclusion modification corrections (<a href="https://github.com/DJRgeoscience/MIMiC">MIMiC</a>) is a program that corrects melt inclusions for post-entrapment crystallization/melting (PEC/PEM), with optional corrections for Fe-Mg exchange with the host and vapor bubble growth.

<h4 style="margin-bottom:4px">MintPy</h4> 
The Miami INsar Time-series software in PYthon (<a href="https://github.com/insarlab/MintPy">MintPy</a>) is an open-source package for Interferometric Synthetic Aperture Radar (InSAR) time series analysis. 

<h4 style="margin-bottom:4px">MSNoise</h4> 
<a href="https://github.com/ROBelgium/MSNoise">MSNoise</a> is the first complete software package for computing and monitoring relative velocity variations using ambient seismic noise. MSNoise is a fully-integrated solution that automatically scans data archives and determines which jobs need to be done whenever the scheduled task is executed.

<h4 style="margin-bottom:4px">MTpy</h4>
<a href="https://github.com/MTgeophysics/mtpy">MTpy</a> is a Python Toolbox for Magnetotelluric (MT) Data Processing, Analysis, Modelling and Visualization.

<h4 style="margin-bottom:4px">OceanSpy</h4> 
<a href="https://oceanspy.readthedocs.io">OceanSpy</a>   is an open-source and user-friendly Python package that enables scientists and interested amateurs to analyze and visualize ocean model datasets.

<h4 style="margin-bottom:4px">ObsPy</h4>
<a href="https://github.com/obspy/obspy/wiki">ObsPy</a>    is an open-source project dedicated to providing a Python framework for processing seismological data. It provides parsers for common file formats, clients to access data centers, and seismological signal-processing routines that enable the manipulation of seismological time series.

<h4 style="margin-bottom:4px">PCRaster</h4>
<a href="https://pcraster.geo.uu.nl">PCRaster</a>  is a collection of software targeted at the development and deployment of spatio-temporal environmental models.  Scripting languages supported include PCRcalc and Python.

<h4 style="margin-bottom:4px">PetroPy</h4> 
<a href="https://github.com/toddheitmann/PetroPy">PetroPy</a> is a Python petrophysics package for scientific computing of conventional and unconventional formation evaluation. It uses lasio to read LAS files and includes a petrophysical workflow and a log viewer based on XML templates.

<h4 style="margin-bottom:4px">PmagPy</h4> 
<a href="https://github.com/PmagPy">The PmagPy project</a>  is a set of tools written in Python for the analysis of paleomagnetic data.

<h4 style="margin-bottom:4px">PVGeo</h4> 
<a href="https://pvgeo.org">PVGeo</a> is an open-source Python package for geoscientific visualization and analysis harnessing an already powerful software platform: the Visualization Toolkit (VTK) and its front-end application, ParaView. 

<h4 style="margin-bottom:4px">PyDGS</h4>
<a href="https://github.com/dbuscombe-usgs/pyDGS">PyDGS</a> is an open-source project dedicated to providing a Python framework to compute estimates of grain size distribution using the continuous wavelet transform method.

<h4 style="margin-bottom:4px">PyFLOWGO</h4> 
<a href="https://github.com/pyflowgo/pyflowgo">PyFLOWGO</a> is an open-source platform for simulation of channelized lava thermo-rheological properties.

<h4 style="margin-bottom:4px">pyGIMLi</h4> 
<a href="https://www.pygimli.org">pyGIMLi</a> is an open-source library for modeling and inversion in geophysics. The object-oriented library provides management for structured and unstructured meshes in two and three dimensions, finite-element and finite-volume solvers, various geophysical forward operators, as well as Gauss-Newton-based frameworks for constrained, joint, and fully coupled inversions with flexible regularization.

<h4 style="margin-bottom:4px">pyGeoPressure</h4> 
<a href="https://github.com/whimian/pyGeoPressure">pyGeoPressure</a> is an open-source Python package designed for pore-pressure prediction from both well log data and seismic velocity data. Though lightweight, pyGeoPressure performs the entire workflow, from data management to pressure prediction. Its main features are that (1) it calculates overburden (or lithostatic) pressure; (2) it uses Eaton’s method and parameter optimization; (3) it uses Bowers’ method and parameter optimization; and (4) it implements a multivariate method with parameter optimization.
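
The first two features can be sketched in a few lines of plain Python. This is not the pyGeoPressure API: the function names, the trapezoidal integration, the constant-density assumption above the first sample, the commonly quoted default exponent of 3 for the velocity form of Eaton's equation, and all the numbers are illustrative assumptions.

```python
G = 9.81  # gravitational acceleration, m/s^2


def overburden_pressure(depths_m, densities_kg_m3):
    """Lithostatic pressure (Pa) at each depth by trapezoidal integration of rho * g.

    Assumes the first density also applies from the surface to the first depth.
    """
    pressures = [densities_kg_m3[0] * G * depths_m[0]]
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        mean_rho = 0.5 * (densities_kg_m3[i] + densities_kg_m3[i - 1])
        pressures.append(pressures[-1] + mean_rho * G * dz)
    return pressures


def eaton_pore_pressure(overburden, hydrostatic, velocity, normal_velocity, exponent=3.0):
    """Eaton's velocity form: P = S - (S - P_hydro) * (V / V_normal) ** n."""
    normal_effective_stress = overburden - hydrostatic
    return overburden - normal_effective_stress * (velocity / normal_velocity) ** exponent


# Hypothetical three-point density profile.
lithostatic = overburden_pressure([0.0, 1000.0, 2000.0], [2000.0, 2200.0, 2400.0])

# Hypothetical values at 2000 m: observed velocity slower than the normal trend.
pore = eaton_pore_pressure(
    overburden=lithostatic[-1],
    hydrostatic=1000.0 * G * 2000.0,
    velocity=2700.0,
    normal_velocity=3000.0,
)
print(f"overburden: {lithostatic[-1] / 1e6:.2f} MPa, pore pressure: {pore / 1e6:.2f} MPa")
```

When the observed velocity equals the normal-compaction velocity, the function returns the hydrostatic pressure; slower velocities give pore pressures above hydrostatic, which is the overpressure signal the method looks for.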

<h4 style="margin-bottom:4px">PyGMT</h4> 
<a href="https://www.pygmt.org">PyGMT</a> is a Python wrapper for the Generic Mapping Tools (GMT), a command-line program widely used in the Earth Sciences. It provides capabilities for processing spatial data (gridding, filtering, masking, FFTs, etc) and making high quality plots and maps.

<h4 style="margin-bottom:4px">Pyleoclim</h4> 
<a href="https://pyleoclim-util.readthedocs.io">Pyleoclim</a> is a Python package designed for the analysis of paleoclimate data. Pyleoclim leverages various data science libraries (numpy, pandas, scikit-learn) for time series analysis, as well as Matplotlib and Cartopy for the creation of publication-quality figures.

<h4 style="margin-bottom:4px">Pyrocko</h4>
<a href="https://git.pyrocko.org/pyrocko/pyrocko">Pyrocko</a> is an open-source seismology toolbox and library. Most of Pyrocko is coded in the Python programming language, with a few parts coded in C.

<h4 style="margin-bottom:4px">pyrolite</h4>
<a href="https://pyrolite.readthedocs.io/en/master">pyrolite</a> is a set of tools to handle and visualize geochemical data.
The Python package includes functions to work with compositional data and to transform geochemical variables (e.g., elements to oxides), functions for common plotting tasks (e.g., spiderplots, ternary diagrams, bivariate and ternary density diagrams), and numerous auxiliary utilities.
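
As an illustration of the element-to-oxide transformation mentioned above, here is a small standard-library sketch of the underlying arithmetic. It does not use pyrolite's functions; the dictionaries, function names, rounded atomic masses, and the sample value are assumptions made for the example.

```python
# Rounded standard atomic masses (g/mol).
ATOMIC_MASS = {"O": 15.999, "Si": 28.0855, "Al": 26.9815, "Fe": 55.845}

# Oxide formula -> (cation, number of cations, number of oxygens).
OXIDES = {
    "SiO2": ("Si", 1, 2),
    "Al2O3": ("Al", 2, 3),
    "FeO": ("Fe", 1, 1),
}


def element_to_oxide_factor(oxide):
    """Mass of oxide per unit mass of its cation element."""
    cation, n_cation, n_oxygen = OXIDES[oxide]
    cation_mass = n_cation * ATOMIC_MASS[cation]
    oxide_mass = cation_mass + n_oxygen * ATOMIC_MASS["O"]
    return oxide_mass / cation_mass


def element_to_oxide_wt(element_wt_percent, oxide):
    """Convert an element concentration (wt%) to the equivalent oxide wt%."""
    return element_wt_percent * element_to_oxide_factor(oxide)


for name in OXIDES:
    print(name, round(element_to_oxide_factor(name), 4))

# Hypothetical sample containing 30 wt% Si, expressed as SiO2.
sio2_wt = element_to_oxide_wt(30.0, "SiO2")
print(round(sio2_wt, 2))
```

The conversion factor is simply the ratio of the oxide's molar mass to the mass of the cations it contains, so the same two functions cover any oxide added to the `OXIDES` table.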

<h4 style="margin-bottom:4px">PySAL</h4>
<a href="https://pysal.org">PySAL</a> is an open-source project designed to support spatial data science. A compilation of notebooks demonstrating the functionality of PySAL is <a href="http://pysal.org/notebooks">available online</a>.

<h4 style="margin-bottom:4px">PySAT</h4>
<a href="https://github.com/pysathq/pysat">PySAT</a>    is a Python toolkit, which aims at providing a simple and unified interface to a number of state-of-art Boolean satisfiability (SAT) solvers as well as to a variety of cardinality and pseudo-Boolean encodings. 

<h4 style="margin-bottom:4px">PyWEED</h4> 
<a href="https://github.com/iris-edu/pyweed">PyWEED</a> is an application for retrieving event-based seismic data.

<h4 style="margin-bottom:4px">python-geospatial</h4> 
<a href="https://github.com/giswqs/python-geospatial">python-geospatial</a> is a collection of Python packages for geospatial analysis with binder-ready notebook examples. 

<h4 style="margin-bottom:4px">PyVista</h4> 
<a href="https://docs.pyvista.org">PyVista</a> (formerly vtki) is a helper module for the Visualization Toolkit (VTK) that takes a different approach on interfacing with VTK through NumPy and direct array access. This package provides a Pythonic, well-documented interface exposing VTK’s powerful visualization backend to facilitate rapid prototyping, analysis, and visual integration of spatially referenced datasets.

<h4 style="margin-bottom:4px">QuakeMigrate</h4>
<a href="https://github.com/QuakeMigrate/QuakeMigrate">QuakeMigrate</a>    is a Python package for automatic earthquake detection and location using waveform migration and stacking. It can be used to produce catalogues of earthquakes, including hypocentres, origin times, phase arrival picks, and local magnitude estimates, as well as rigorous estimates of the associated uncertainties.

<h4 style="margin-bottom:4px">REDPy</h4> 
 Repeating Earthquake Detector in Python (<a href="https://github.com/ahotovec/REDPy">REDPy</a>) is a tool for automated detection and analysis of repeating earthquakes in continuous data. It works without any previous assumptions of what repeating seismicity looks like (that is, does not require a template event).

<h4 style="margin-bottom:4px">RfPy</h4> 
<a href="https://github.com/paudetseis/RfPy">RfPy</a> is software for calculating single event-station receiver functions using the spectral deconvolution technique.

<h4 style="margin-bottom:4px">SediNet</h4> 
<a href="https://github.com/MARDAScience/SediNet">SediNet</a> is a configurable machine-learning framework for estimating either (or both) continuous and categorical variables from a photographic image of clastic sediment.

<h4 style="margin-bottom:4px">Segyio</h4> 
<a href="https://github.com/equinor/segyio">Segyio</a>    is a small LGPL licensed C library for easy interaction with SEG-Y and Seismic Unix formatted seismic data, with language bindings for Python and Matlab. Segyio is an attempt to create an easy-to-use, embeddable, community-oriented library for seismic applications. Features are added as they are needed; suggestions and contributions of all kinds are very welcome.

<h4 style="margin-bottom:4px">SHTOOLS</h4> 
<a href="https://github.com/SHTOOLS/SHTOOLS">SHTOOLS</a>  is a Fortran-95/Python library that can be used to perform spherical harmonic transforms, multitaper spectral analyses, expansions of functions into Slepian bases, and standard operations on global gravitational and magnetic field data.
 
<h4 style="margin-bottom:4px">SimPEG</h4> 
Simulation and Parameter Estimation in Geophysics (<a href="https://github.com/simpeg/simpeg">SimPEG</a>) is a Python package for simulation and gradient-based parameter estimation in the context of geophysical applications.

<h4 style="margin-bottom:4px">SplitPy</h4> 
<a href="https://paudetseis.github.io/SplitPy">SplitPy</a> is a teleseismic shear-wave (SKS) splitting toolbox based on the MATLAB tool SplitLab, developed by Wüstefeld et al. (2008).

<h4 style="margin-bottom:4px">tdmtpy</h4> 
Time Domain Moment Tensor Inversion in Python (<a href="https://github.com/LLNL/mttime">tdmtpy</a>) is a Python package developed for time domain inversion of complete seismic waveform data to obtain the seismic moment tensor. It supports deviatoric and full moment tensor inversions, and 1-D and 3-D basis Green's functions.

<h4 style="margin-bottom:4px">Thermobar</h4> 
<a href="https://github.com/PennyWieser/Thermobar">Thermobar</a> is a Mineral-Melt Equilibrium tool written in the open-source language Python3. Thermobar allows pressures, temperatures and melt water contents to be easily calculated using more than 100 popular thermobarometers. We also provide computationally-fast functions for calculating pressures and temperatures for all possible pairs of phases in equilibrium from a given sample/volcanic center (e.g., cpx-liquid, opx-liquid, two-pyroxene, two-feldspar matching).

<h4 style="margin-bottom:4px">VESIcal</h4>
<a href="https://github.com/kaylai/VESIcal">VESIcal</a> is a generalized Python library for calculating and plotting various things related to mixed volatile ($H_2O-CO_2$) solubility in silicate melts.
 
<h4 style="margin-bottom:4px">Welly</h4> 
<a href="https://github.com/agile-geoscience/welly">Welly</a> is a family of classes to facilitate the loading, processing, and analysis of subsurface wells and well data, such as striplogs, formation tops, well log curves, and synthetic seismograms.
Welly uses lasio for data input and output but hides much of it from the user. I recommend that you look at both projects before deciding if you need the "well-level" functionality that welly provides.

<h4 style="margin-bottom:4px">xarray</h4>
<a href="https://github.com/pydata/xarray">xarray</a> is an evolution of an internal tool developed at The Climate Corporation. It was originally written by Climate Corp researchers Stephan Hoyer, Alex Kleeman and Eugene Brevdo and was released as open source in May 2014. The project was renamed from "xray" in January 2016. Xarray became a fiscally sponsored project of NumFOCUS in August 2018. Xarray makes working with labelled multi-dimensional arrays simple, efficient, and fun!


## Python learning resources for Geologists

Geologists can find many excellent online resources to improve their knowledge of and skills in Python programming. For example, the <a href="https://earthlab.colorado.edu">Earth Lab</a> at the University of Colorado (Boulder) provides many tutorials and course lessons on applying Python methods and techniques to Earth Science problems.

Many other universities and research institutes (e.g., <a href="https://handbook.unimelb.edu.au/2017/subjects/erth90051">The University of Melbourne</a>, <a href="https://www.uib.no/en/course/GEOV302">The University of Bergen</a>, <a href="https://bit.ly/3xeLAAQ">The Max Planck Institute for Meteorology</a>, <a href="https://programsandcourses.anu.edu.au/2019/course/emsc8033">The Australian National University</a>, and <a href="https://www.unipg.it/personale/maurizio.petrelli/en/teaching">The University of Perugia</a>, to cite a few) have active courses (as of May 2021) teaching Python programming applied to the Earth Sciences.

Also, various active researchers regularly share excellent material on applying advanced statistical and computational techniques in Python to geological problems. Examples are the resources on Geostatistics and Machine Learning by <a href="https://github.com/GeostatsGuy">Michael Pyrcz</a>, Associate Professor at the University of Texas at Austin, and on Geospatial Data Analysis with Google Earth Engine by <a href="https://github.com/giswqs">Qiusheng Wu</a>, Assistant Professor at the Department of Geography at the University of Tennessee (Knoxville).