Example PDF parser application using GROBID

This project was developed for the Open Science and Artificial Intelligence in Research Software Engineering class.
It demonstrates how the PDF parser GROBID can be combined with different XML parsers by extracting the following information from scientific papers provided by the user:

Keyword cloud of the abstract
Number of figures
List of links

## How to start

Make sure you have GROBID installed and running

For example, with Docker, pull the image to install GROBID:
docker pull lfoppiano/grobid:0.7.2
Then start GROBID:
docker run -t --rm -p 8070:8070 lfoppiano/grobid:0.7.2
For further information, see the documentation.
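Before running the parser, it can help to confirm that the GROBID container is actually reachable. The following is a minimal sketch, not part of this repository: the function name `is_grobid_alive` is hypothetical, and the base URL assumes the port mapping from the `docker run` command above. It queries GROBID's `/api/isalive` health-check endpoint using only the standard library.

```python
import urllib.request


def is_grobid_alive(base_url="http://localhost:8070", timeout=5.0):
    """Return True if a GROBID server answers its health-check endpoint.

    base_url is an assumption matching `-p 8070:8070` in the Docker command.
    """
    url = base_url.rstrip("/") + "/api/isalive"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # GROBID replies with the plain-text body "true" once it is ready.
            body = response.read().decode("utf-8").strip()
            return response.status == 200 and body == "true"
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or malformed URL:
        # treat all of these as "not running".
        return False


if __name__ == "__main__":
    print("GROBID reachable:", is_grobid_alive())
```

If this prints `GROBID reachable: False`, check that the container has finished starting and that port 8070 is not used by another process.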

Set up your environment

Create a conda environment by running conda create --name <env> --file requirements.txt

Provide your PDFs and run

Move your PDFs to the resources directory, or choose your own directory by changing the RESOURCE_DIRECTORY constant in xml_parser.py
Start the application by running python xml_parser.py
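To illustrate what a resource directory lookup might involve, here is a small sketch. It is an assumption about the general approach, not the code in xml_parser.py: the helper `find_pdfs` is hypothetical, and the default value of `RESOURCE_DIRECTORY` simply mirrors the directory name mentioned above.

```python
from pathlib import Path

# Assumed default for illustration; the real constant lives in xml_parser.py.
RESOURCE_DIRECTORY = "resources"


def find_pdfs(directory=RESOURCE_DIRECTORY):
    """Return the PDF files directly inside `directory`, sorted by path."""
    folder = Path(directory)
    if not folder.is_dir():
        raise FileNotFoundError(f"Resource directory not found: {folder}")
    # Match the extension case-insensitively so "paper.PDF" is included too.
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

A quick check such as `print(find_pdfs())` before starting the application shows which files would be picked up.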

XML Mode

If you want to parse existing XML files, set the MODE constant in xml_parser.py to "XML", put your XML files in a directory, and assign that directory to the XML_RESSOURCE_DIRECTORY constant in xml_parser.py

Then start the feature extraction for the XML files by running python xml_parser.py
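The three features listed at the top can be sketched on a single TEI XML document as follows. This is an illustrative approximation, not the implementation in xml_parser.py: the function `extract_features` and the sample document are hypothetical, plain word frequencies stand in for the keyword cloud, and the sketch assumes that GROBID marks tables as `<figure type="table">` and stores URLs in `target` attributes.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# TEI namespace used by GROBID's XML output.
TEI = {"tei": "http://www.tei-c.org/ns/1.0"}


def extract_features(tei_xml, top_n=5):
    """Return keyword counts, figure count, and links for one TEI document."""
    root = ET.fromstring(tei_xml)

    # Keywords: count words of four or more letters in the abstract.
    abstract = root.find(".//tei:abstract", TEI)
    text = " ".join(abstract.itertext()) if abstract is not None else ""
    words = re.findall(r"[a-z]{4,}", text.lower())
    keywords = Counter(words).most_common(top_n)

    # Figures: skip <figure type="table"> elements (assumed table markup).
    figures = [
        fig for fig in root.iterfind(".//tei:figure", TEI)
        if fig.get("type") != "table"
    ]

    # Links: any element whose target attribute looks like a web URL.
    links = sorted({
        el.get("target") for el in root.iter()
        if el.get("target", "").startswith("http")
    })
    return {"keywords": keywords, "figure_count": len(figures), "links": links}


# Hand-written sample for illustration only, not real GROBID output.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><profileDesc><abstract>
    <p>Parsing papers with parsing tools.</p>
  </abstract></profileDesc></teiHeader>
  <text><body>
    <p>See <ref type="url" target="https://example.org/data">the data</ref>.</p>
    <figure xml:id="fig_0"><head>Figure 1</head></figure>
    <figure type="table" xml:id="tab_0"><head>Table 1</head></figure>
  </body></text>
</TEI>"""

if __name__ == "__main__":
    print(extract_features(SAMPLE))
```

The keyword counts could then be passed to a plotting or word cloud library of your choice; the simple regular expression does not remove stop words, so a real keyword cloud would likely need additional filtering.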

Documentation

See full documentation at https://os-ai-cd.readthedocs.io/en/latest/
