patmejia/soil_data_research

Soil Data Research

A Global Good Project

We are gathering data concerning the Earth System and Planetary Boundary.

This series of articles explores a path from data collection to visualization and machine learning from the human-planet perspective. Please contribute to helping this project grow.

Soil and prisma Image Created with DALL·E

My background is in operations, people, analysis, and mathematics. The health of the soil system is of particular interest to me.

Coincidentally, Hacker News led me to an article about Soil.Spectroscopy 4 Global Good: open soil database, open source software and more. This database was a great starting point for getting familiar with the field.



Contents



Quick dataset download from Kaggle

Alternatively, here is a link to the dataset on Kaggle. This dataset is a subset of the Open Soil Spectral Library (OSSL).

Project 1: Load Soil Site Data with JavaScript

Example image generated by a range query against the SQLite database, with the results partitioned into cells by sample count:

image of soil site data

Install and Run Tests

Try out the first project, which generates GeoJSON from a local SQLite database. Install it with Node.js and run the tests from the main directory. The project loads, parses, and visualizes the data on a local d3.js globe image. The data are partitioned into cells with the h3-js library. As a first step, we load only the ID and the latitude/longitude of each sample. See src/js/database.js and src/js/globe.js for more details.

npm install
npm test

Loader

The core piece of code is the loader for the OSSL CSV exported from the soilsite collection. It looks like this:

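// Assumed imports; the repository may import these differently.
import { createReadStream } from "fs";
import { parse } from "csv-parse";

// `db` is assumed to be an open SQLite database handle created elsewhere
// (see src/js/database.js).
async function load(dataToLoad) {
  // Stream the CSV from disk and pipe it through the CSV parser.
  const inputStream = createReadStream(dataToLoad, "utf8");
  const parser = inputStream.pipe(parse());

  const createSql = "CREATE TABLE soilsite (id TEXT, lat REAL, lon REAL)";
  db.run(createSql);
  const insertSql = "INSERT INTO soilsite VALUES (?, ?, ?)";
  const statement = db.prepare(insertSql);
  let count = 0;

  // Insert every record except the first one (presumably the CSV header row).
  for await (const record of parser) {
    if (count !== 0) {
      // Column 0 is stored as the ID; columns 10 and 11 as lat and lon.
      statement.run(record[0], record[10], record[11]);
    }
    count++;
  }
  statement.finalize();
}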
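
As a rough illustration of what the loop does with each parsed row, the sketch below pulls out the same three columns (index 0 for the ID, 10 and 11 for the coordinates) and rejects rows whose coordinates are not usable numbers. `toSiteRow` is a hypothetical helper, not part of the repository, and the sample record is made up.

```javascript
// Hypothetical helper: turn one parsed CSV record (an array of strings)
// into an { id, lat, lon } row, or null if the coordinates are unusable.
function toSiteRow(record) {
  const id = record[0];
  const lat = Number.parseFloat(record[10]);
  const lon = Number.parseFloat(record[11]);
  const valid =
    Number.isFinite(lat) &&
    Number.isFinite(lon) &&
    Math.abs(lat) <= 90 &&
    Math.abs(lon) <= 180;
  return valid ? { id, lat, lon } : null;
}

// Made-up record with 12 columns; only columns 0, 10 and 11 matter here.
const record = ["site-1", "", "", "", "", "", "", "", "", "", "45.5", "-122.6"];
console.log(toSiteRow(record)); // { id: 'site-1', lat: 45.5, lon: -122.6 }
```

A check like this could run before the insert so that malformed rows never reach the database; whether the real export needs it is an open question.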

Bin to H3 Cells and Convert to GeoJSON

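// Count how many points fall into each H3 cell at the given resolution.
// pointToH3Cell is a helper defined elsewhere in the project.
function pointsToH3Cells(points, resolution = 7) {
  const cells = {};
  for (const point of Object.values(points)) {
    const cell = pointToH3Cell(point[0], point[1], resolution);
    cells[cell] = (cells[cell] ?? 0) + 1;
  }
  return cells;
}

// Convert the cell counts to a GeoJSON FeatureCollection, storing each
// cell's sample count in the `value` property of its feature.
function cellsToGeoJson(cells) {
  return h3SetToFeatureCollection(Object.keys(cells), (hex) => ({
    value: cells[hex],
  }));
}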
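
To show the shape of the data these two functions pass around without installing h3-js, here is a self-contained sketch of the same bin-then-convert idea. It uses a square latitude/longitude grid as a stand-in for H3, so the cell keys and polygons are not real H3 cells; `pointToGridCell`, `countByCell`, and `gridCellsToGeoJson` are hypothetical names, and the points are made up.

```javascript
// Stand-in for an H3 index: snap a point to a square grid cell of `size`
// degrees and key it by its row and column. This is NOT real H3.
function pointToGridCell(lat, lon, size = 1) {
  const row = Math.floor(lat / size);
  const col = Math.floor(lon / size);
  return `${row}:${col}`;
}

// Count how many [lat, lon] points fall into each grid cell.
function countByCell(points, size = 1) {
  const cells = {};
  for (const [lat, lon] of points) {
    const key = pointToGridCell(lat, lon, size);
    cells[key] = (cells[key] ?? 0) + 1;
  }
  return cells;
}

// Build a GeoJSON FeatureCollection with one square polygon per cell and
// the sample count stored in properties.value.
function gridCellsToGeoJson(cells, size = 1) {
  const features = Object.entries(cells).map(([key, value]) => {
    const [row, col] = key.split(":").map(Number);
    const south = row * size;
    const west = col * size;
    const north = south + size;
    const east = west + size;
    return {
      type: "Feature",
      properties: { value },
      geometry: {
        type: "Polygon",
        // GeoJSON positions are [longitude, latitude]; the ring is closed.
        coordinates: [[[west, south], [east, south], [east, north], [west, north], [west, south]]],
      },
    };
  });
  return { type: "FeatureCollection", features };
}

const points = [[45.2, -122.7], [45.9, -122.1], [10.5, 20.5]];
const cells = countByCell(points);
console.log(cells); // { '45:-123': 2, '10:20': 1 }
console.log(gridCellsToGeoJson(cells).features.length); // 2
```

The count map keyed by cell ID and the `value` property on each feature mirror the structure used by pointsToH3Cells and cellsToGeoJson; only the cell geometry differs.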

Soil Research

Spectroscopy

Spectroscopy matters because every element in the periodic table has a unique light spectrum.

Soil spectroscopy is the measurement of light absorption when light in the visible, near-infrared, or mid-infrared (Vis–NIR–MIR) regions of the electromagnetic spectrum is applied to a soil surface:

Spectroradiometer Image Source

The reflected infrared radiation is converted to electrical energy and fed to a computer for interpretation. Each major organic component of the soil absorbs and reflects light differently. By measuring these reflectance characteristics, the spectroradiometer and the computer can estimate the composition of the soil sample.

A typical soil spectrum in the (A) visible, (B) near-infrared, and (C) mid-infrared portion of the Electromagnetic Spectrum:

Explorer Image Source: Advances in Agronomy

Light absorption in the VIS region is due to the excitation of electrons. For longer wavelengths, NIR-MIR, the absorption is due to vibrations in the chemical bonds within molecules: symmetrical stretch, asymmetrical stretch, and bending vibrations.

The spectra will show overtones and combinations of these vibrations, mainly in the NIR region.
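
Spectra in these regions are reported sometimes by wavelength and sometimes by wavenumber, so a small conversion helper can be handy. The sketch below converts wavenumbers (cm^-1) to wavelengths (nm) and labels the region. The region limits are commonly quoted approximations that vary between sources; they are an assumption here, not values taken from the OSSL, and the function names are hypothetical.

```javascript
// Convert a wavenumber in cm^-1 to a wavelength in nm (1 cm = 1e7 nm).
function wavenumberToWavelengthNm(wavenumberPerCm) {
  return 1e7 / wavenumberPerCm;
}

// Approximate region limits in nm (assumed; conventions differ by source).
const REGIONS = [
  { name: "VIS", min: 400, max: 700 },
  { name: "NIR", min: 700, max: 2500 },
  { name: "MIR", min: 2500, max: 25000 },
];

// Return the name of the region containing the given wavelength.
function spectralRegion(wavelengthNm) {
  const region = REGIONS.find(
    (r) => wavelengthNm >= r.min && wavelengthNm < r.max
  );
  return region ? region.name : "outside Vis-NIR-MIR";
}

console.log(spectralRegion(550)); // VIS
console.log(spectralRegion(wavenumberToWavelengthNm(5000))); // 2000 nm, NIR
console.log(spectralRegion(wavenumberToWavelengthNm(1000))); // 10000 nm, MIR
```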

These vibrations, combinations, and overtones can be modeled mathematically with polynomial functions:

Overtone_polynomials Image Source
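
One common textbook way to see where such polynomials come from is the anharmonic oscillator. It is offered here as an assumption and may not be the model shown in the figure. With a harmonic wavenumber $\tilde{\omega}_e$ and an anharmonicity constant $x_e$ (both symbols introduced here for illustration), the vibrational term values and the band positions measured from the ground state are:

```latex
\begin{align}
G(v) &= \tilde{\omega}_e \left(v + \tfrac{1}{2}\right)
        - \tilde{\omega}_e x_e \left(v + \tfrac{1}{2}\right)^2,
        \qquad v = 0, 1, 2, \dots \\
\tilde{\nu}_{0 \to v} &= G(v) - G(0)
        = \tilde{\omega}_e\, v - \tilde{\omega}_e x_e\, v\,(v + 1)
\end{align}
```

Here $v = 1$ gives the fundamental band and $v = 2, 3, \dots$ give the first and second overtones. Because the second expression is a quadratic polynomial in $v$, each overtone falls slightly below the corresponding integer multiple of the fundamental.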

Water has unique soil spectral features (Absorbance vs. Wavelength):

Spectra_overtones_water Image Source

Soil

Soil is a living system working as a life-sustaining resource. It teams up with billions of bacteria, fungi, and other microbes to create an abundant soil community filled with diverse soil biota.

Soils have 4 essential components:

  • Mineral particles: sand, silt, and clay
  • Organic matter
  • Water
  • Air

Organism abundance, diversity, and activity are not randomly distributed in the soil but vary in a patchy fashion both horizontally across a landscape and vertically through the soil profile:

Soil_horizons Image Source: Soil Horizons

Most soils evolve slowly over centuries through the weathering of underlying rocks and the decomposition of organic matter. Other soils are formed from deposits laid down by rivers, seas, or wind forces.

A sample of typical topsoil contains:

  • ~50% pore space filled with varying proportions of air and water,
  • ~50% of mineral particles and organic matter

By contrast, the soils of marshes, bogs, and swamps contain 30-100% organic matter.

Minerals

Soil minerals give soil different texture attributes and colors. Minerals are classified by size.

Minerals Classified by Size
Type | Size (mm) | Texture | Characteristics
Sand | 2.0 - 0.05 | Gritty | Quite visible; consists of small particles with low surface area; significant drainage
Silt | 0.05 - 0.002 | Buttery | Not visible; increases the water-holding capacity
Clay | < 0.002 | Sticky | High water-holding capacity; smallest pores; large charged surfaces attract and retain nutrients
Soil Minerals Image Source
Table Source
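
The size classes in the table translate directly into a small lookup. The following is a minimal sketch; `classifyParticle` is a hypothetical name, and how values exactly on a boundary (0.05 mm or 0.002 mm) are assigned is an assumption, since the table does not say.

```javascript
// Classify a mineral particle by its diameter in millimetres, using the
// size ranges from the table above. A boundary value is assigned to the
// coarser class (an assumption).
function classifyParticle(diameterMm) {
  if (!(diameterMm > 0)) {
    throw new RangeError("diameter must be a positive number");
  }
  if (diameterMm < 0.002) return "clay";
  if (diameterMm < 0.05) return "silt";
  if (diameterMm <= 2.0) return "sand";
  return "coarser than sand";
}

console.log(classifyParticle(0.001)); // clay
console.log(classifyParticle(0.01)); // silt
console.log(classifyParticle(0.5)); // sand
```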

The most common mineral in soils is quartz, which is not very reactive. Clay, on the other hand, is very reactive. Clay particles can form strongly protected structures that store soil carbon (C) for long periods.

These protected structures made with clay ensure good water-holding capacity and provide a good source of plant nutrients.

Organic Matter (SOM)

Soil organic matter (SOM) is composed mainly of carbon, hydrogen, and oxygen. In addition, it contains small amounts of other elements, such as nitrogen, phosphorus, sulfur, potassium, calcium, and magnesium, in organic residues. It is divided into 'living' and 'dead' components and can range from very recent inputs, such as stubble, to decayed materials that might be many hundreds of years old. About 10% of below-ground SOM, such as roots, fauna, and microorganisms, is living.

SOM exists as four distinct fractions which vary widely in size, turnover time, and composition in the soil:

  • dissolved organic matter
  • particulate organic matter
  • stable organic matter or humus
  • resistant organic matter

SOM Image Source

Structure

Soil structure refers to the proportions of solids and voids. A key aspect of soil structure is the aggregation of individual mineral and organic particles into larger units.

Aggregates are separated into size classes: macroaggregates (250 μm–2 mm) and microaggregates (53–250 μm).

Macroaggregates are formed when fungi and bacteria decompose fresh plant residue or, technically speaking, light fraction SOM.

Bacteria secrete high-molecular-weight, sugar-based polymers known as extracellular polymeric substances (EPSs). These EPSs and fungal hyphae serve as nucleation cores to accrete larger masses of slightly decayed SOM that become macroaggregates. These macroaggregates are constantly weathering in the soil to produce microaggregates within SOM:

Macro_micro_aggregates Image Source: American Society of Microbiology

Organic Carbon (SOC)

Soil organic carbon (SOC) refers to the carbon component of organic compounds. Soil organic matter (SOM) is challenging to measure directly, so laboratories tend to measure and report SOC instead. SOC is a measurable component of SOM that contributes to nutrient retention and turnover, soil structure, moisture retention and availability, degradation of pollutants, and carbon sequestration. It has been identified as a global indicator for monitoring soil health and productivity.

Visual Assessment of Soils

Soil color is usually due to 3 primary pigments:

  • black—from organic matter
  • red—from iron and aluminum oxides
  • white—from silicates and salt.

Soil_color Image Source

Soil texture can also be assessed by estimating the size of the observable particles:

Sample_triangle Image Source

Soil color and observable soil texture are valuable indicators of the chemical processes beneath the surface and can offer some quick soil management information:

Sample_full
Image Source: Queensland Government

Soil Health

The basic principles of soil health:

  • ✅ Presence of Living Roots

  • ✅ Soil Cover

  • ✅ Biodiversity

  • ⛔️ Disturbance

Soil Health Image Source: USDA

The Open Soil Spectral Library (OSSL)

The Open Soil Spectral Library (OSSL) is a global good project which serves collections of soil properties derived from spectral data. OSSL is also a network that delivers robust statistical models, calibration and prediction models, research tools, and opportunities to collaborate across borders.

The initiative received a funding award through the National Institute of Food and Agriculture (NIFA), part of the USDA. NIFA has invested over $7 million in Big Data, Artificial Intelligence, and Other Cyberinformatics Research.

Among other valuable resources, the OSSL project offers beautifully developed software:

Explorer

And the user manual, which is open for contributions:


Start

Connecting to the OSSL database

The OSSL manual describes two ways to access the data. The first method uses MongoDB via R; however, it fails with a certificate error. See the image below:

cert_error

As an alternative, we tried to connect directly with JavaScript through Node.js, but we ran into another certificate error:

/Users/dev/code/soil_data_research/node_modules/mongodb/lib/utils.js:419
                    throw error;
                    ^

MongoServerSelectionError: certificate has expired
    at Timeout._onTimeout (/Users/dev/code/soil_data_research/node_modules/mongodb/lib/sdam/topology.js:293:38)
    at listOnTimeout (node:internal/timers:564:17)
    at process.processTimers (node:internal/timers:507:7) {
  reason: TopologyDescription {
    type: 'Unknown',

Lastly, we used the second method from the OSSL manual, accessing the data with Studio 3T and entering the following parameters:

  • Connection Name: soilspec4gg
  • Server: api.soilspectroscopy.org
  • Authentication DB: soilspec4gg
  • User name: soilspec4gg
  • Password: soilspec4gg
  • Use SSL: true
  • Accept any SSL certificates: true

To see full details of this step go to: OSSL connect.
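
The Studio 3T parameters above map roughly onto a standard MongoDB connection string. The sketch below assembles one with plain string handling, so it runs without any driver installed. `buildMongoUri` is a hypothetical helper, the default port is assumed, and mapping "Accept any SSL certificates" to the `tlsAllowInvalidCertificates` option is our reading of the settings, not something the OSSL manual states. It has not been tested against the server.

```javascript
// Assemble a MongoDB connection string from the connection parameters.
// URLSearchParams is a global in Node.js and in browsers.
function buildMongoUri({ user, password, host, authDb, allowInvalidCerts }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const params = new URLSearchParams({ authSource: authDb, tls: "true" });
  if (allowInvalidCerts) {
    // Skips certificate validation, which is why an expired certificate
    // would no longer block the connection.
    params.set("tlsAllowInvalidCertificates", "true");
  }
  return `mongodb://${credentials}@${host}/${authDb}?${params}`;
}

const uri = buildMongoUri({
  user: "soilspec4gg",
  password: "soilspec4gg",
  host: "api.soilspectroscopy.org",
  authDb: "soilspec4gg",
  allowInvalidCerts: true,
});
console.log(uri);
```

Accepting invalid certificates removes the protection TLS normally provides, so it is best kept to read-only access to public data like this.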

Exporting a Soil Site Sample from the OSSL as CSV

To see full details of this step go to: database download.

OSSL_download

End

Return to Contents

About

  • Currently working with 👩‍💻 JavaScript + Python + HTML + CSS 📐 + automated testing + continuous integration.
  • Developing data pipelines with a team ✨ focused on tech-for-good.

Updates

  • Available for contract + consult + full-time.
  • Collaborating with Chromatic Systems 🌈 on Software Engineering + Visualizations + Data Science + Web Dev.
  • One of our current projects: Soil Data Research 🔬.
  • Future posts will explore Machine Learning models.
  • Welcoming collaboration ideas + learning ✌️ together initiatives.

Thank you for reading!

Connect