GROBID extension for identifying and normalizing physical quantities.

grobid-quantities

Work in progress.

The goal of this GROBID module is to recognize any expression of measurement (e.g. pressure, temperature, etc.) in textual documents, to parse and normalize it, and finally to convert it into SI units. We focus our work on technical and scientific articles (text, XML and PDF input) and patents (text and XML input).

GROBID Quantity Demo
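
To make the final conversion step concrete, here is a minimal Python sketch of normalizing an already-parsed measurement to SI units. It is an illustration only, not the module's implementation: the `SI_CONVERSIONS` table and the `to_si` function are hypothetical names introduced for this example, and the table covers just a few units.

```python
# Illustrative sketch only; this is NOT how grobid-quantities is implemented.
# SI_CONVERSIONS and to_si are hypothetical names made up for this example.
# Each entry maps a unit symbol to (SI unit, scale factor, offset), so that
# si_value = value * factor + offset.
SI_CONVERSIONS = {
    "°C": ("K", 1.0, 273.15),   # degrees Celsius to kelvin
    "bar": ("Pa", 1e5, 0.0),    # bar to pascal
    "km": ("m", 1000.0, 0.0),   # kilometre to metre
    "g": ("kg", 0.001, 0.0),    # gram to kilogram
}


def to_si(value, unit):
    """Return (si_value, si_unit) for a parsed value and its unit symbol.

    Raises KeyError if the unit is not in the small example table.
    """
    si_unit, factor, offset = SI_CONVERSIONS[unit]
    return value * factor + offset, si_unit


if __name__ == "__main__":
    print(to_si(2, "bar"))
    print(to_si(25, "°C"))
```

An affine form (factor plus offset) is used so that temperature scales such as Celsius fit the same table as purely multiplicative units.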

As part of this task, we support the recognition of different value representations: numerical, alphabetical, exponential and date/time expressions.

GROBID Quantity Demo
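
As a rough illustration of what these representations look like once parsed, the following Python sketch maps numerical, alphabetical and exponential value strings to a float. It is rule-based and purely illustrative, whereas the module itself relies on machine learning. The names `NUMBER_WORDS`, `EXPONENT` and `parse_value` are hypothetical, the word list is deliberately tiny, and date/time expressions are not covered.

```python
import re

# Illustrative sketch only; grobid-quantities uses machine learning, not these rules.
# NUMBER_WORDS, EXPONENT and parse_value are hypothetical names for this example.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10, "twenty": 20,
}

# Matches forms such as "1.5 x 10^3" or "2 × 10^-4".
EXPONENT = re.compile(
    r"^([-+]?\d+(?:\.\d+)?)\s*[x×]\s*10\s*\^\s*([-+]?\d+)$"
)


def parse_value(text):
    """Parse a numerical, alphabetical or exponential value string to a float.

    Raises ValueError if the string matches none of the supported forms.
    """
    text = text.strip().lower()
    match = EXPONENT.match(text)
    if match:
        mantissa = float(match.group(1))
        exponent = int(match.group(2))
        return mantissa * 10 ** exponent
    if text in NUMBER_WORDS:
        return float(NUMBER_WORDS[text])
    return float(text)  # plain numbers, including forms like "3e2"


if __name__ == "__main__":
    for sample in ["42", "twenty", "1.5 x 10^3"]:
        print(sample, "->", parse_value(sample))
```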

Finally, we support the identification of the "quantified" substance related to the measurement, e.g. silicon nitride powder in the following example:

GROBID Quantity Demo

Like the other GROBID models, the module relies only on machine learning and uses linear-chain CRF. Normalisation is handled by the Java library Units of Measurement.

Documentation

You can find the latest documentation here.

License

GROBID and grobid-quantities are distributed under the Apache 2.0 license.

Contact: Patrice Lopez (patrice.lopez@science-miner.com), Luca Foppiano (luca.foppiano@inria.fr)