Work in progress.
The goal of this GROBID module is to recognize any expression of measurement (e.g. pressure, temperature, etc.) in textual documents, to parse and normalize it, and finally to convert the measurement into SI units. We focus our work on technical and scientific articles (text, XML and PDF input) and patents (text and XML input).
As part of this task we support the recognition of different value representations: numerical, alphabetical, exponential and date/time expressions.
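To make the value representations above concrete, here is a minimal sketch of how numerical, alphabetical and exponential values could each be reduced to a plain number. It is only an illustration, not the module's implementation (which relies on machine learning rather than hand-written rules). The class `ValueParsingSketch`, its tiny word table and the regular expression are all assumptions made for this example, and date/time expressions are left out.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of grobid-quantities.
public class ValueParsingSketch {

    // Illustrative lookup for a few spelled-out numbers; real coverage would be far larger.
    private static final Map<String, Double> WORDS = Map.of(
            "one", 1.0, "two", 2.0, "three", 3.0, "ten", 10.0, "twelve", 12.0);

    // Matches forms such as "1.5 x 10^3" (the multiplication sign U+00D7 is also accepted).
    private static final Pattern POWER_OF_TEN = Pattern.compile(
            "([-+]?\\d+(?:\\.\\d+)?)\\s*[x\u00d7]\\s*10\\^([-+]?\\d+)");

    public static double parse(String raw) {
        String text = raw.trim().toLowerCase(Locale.ROOT);

        // Alphabetical representation, e.g. "twelve".
        Double word = WORDS.get(text);
        if (word != null) {
            return word;
        }

        // Exponential representation written as a power of ten.
        Matcher m = POWER_OF_TEN.matcher(text);
        if (m.matches()) {
            double mantissa = Double.parseDouble(m.group(1));
            int exponent = Integer.parseInt(m.group(2));
            return mantissa * Math.pow(10, exponent);
        }

        // Numerical representation, including E-notation such as "3e8".
        return Double.parseDouble(text);
    }
}
```

Calling `ValueParsingSketch.parse("1.5 x 10^3")` and `ValueParsingSketch.parse("twelve")` both yield ordinary `double` values. Input that matches none of the three forms makes `Double.parseDouble` throw a `NumberFormatException`.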
Finally we support the identification of the "quantified" substance related to the measure, e.g. silicon nitride powder in
Like the other GROBID models, the module relies only on machine learning and uses linear-chain CRF. Normalisation is handled by the Java library Units of Measurement.
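As a rough illustration of what normalisation to SI involves, the sketch below converts a value with a simple linear rule (`value * factor + offset`). It deliberately does not use the Units of Measurement library, so that it stays self-contained. The class `SiNormalizationSketch`, the unit keys (such as `degC`) and the small rule table are assumptions made for this example, not the module's API.

```java
import java.util.Map;

// Hypothetical helper, not part of grobid-quantities or the Units of Measurement library.
public class SiNormalizationSketch {

    // A linear conversion rule: siValue = value * factor + offset.
    private static final class Rule {
        final double factor;
        final double offset;
        final String siUnit;

        Rule(double factor, double offset, String siUnit) {
            this.factor = factor;
            this.offset = offset;
            this.siUnit = siUnit;
        }
    }

    // Illustrative table with a handful of units; keys are made-up identifiers.
    private static final Map<String, Rule> RULES = Map.of(
            "degC", new Rule(1.0, 273.15, "K"),
            "bar", new Rule(1.0e5, 0.0, "Pa"),
            "km", new Rule(1000.0, 0.0, "m"),
            "min", new Rule(60.0, 0.0, "s"));

    private static Rule ruleFor(String unit) {
        Rule rule = RULES.get(unit);
        if (rule == null) {
            throw new IllegalArgumentException("Unknown unit: " + unit);
        }
        return rule;
    }

    // Converts a value expressed in the given unit to the corresponding SI unit.
    public static double toSi(double value, String unit) {
        Rule rule = ruleFor(unit);
        return value * rule.factor + rule.offset;
    }

    // Returns the symbol of the SI unit that the given unit normalises to.
    public static String siUnit(String unit) {
        return ruleFor(unit).siUnit;
    }
}
```

For instance, `SiNormalizationSketch.toSi(2.0, "bar")` gives the pressure in `SiNormalizationSketch.siUnit("bar")`, i.e. pascals. A dedicated library is preferable in practice because it also covers derived units, prefixes and dimensional checks that a flat table like this one ignores.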
You can find the latest documentation here.
GROBID and grobid-quantities are distributed under the Apache 2.0 license.