Data Engineering

gamechanger-parser is an isolated feature of the gamechanger-data ingestion pipeline. It was created as a standalone, shareable tool that can be set up as a Docker container, a Python venv, or an importable package across Advana Databricks.

To see all repositories: gamechanger

The aim of this repository is to provide an effective tool for extracting text and metadata from documents, whether for general text extraction, gamechanger-policy specific usage, or as a foundation for other platforms, with the ability to use your own Machine Learning model.

In the example script/notebook you will find an example run of our parser. It shows how to define your data input and decide whether to use the:

general text extractor: writer
GAMECHANGER Policy's parse + additional embeddings: policy_text_pipeline

After setting up your environment using one of the two setup options below, you can use the example.py or example.ipynb files to run an example parsing job. You can also view the inputs and outputs of this job in the folders example_input and example_output respectively.
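
To check what an example run consumed and produced, the two folders can be listed side by side. The following is a minimal sketch using only the Python standard library; it does not call the parser itself, and the helper names `list_files` and `summarize_run` are hypothetical, introduced here for illustration only. It assumes the script is run from the repository root, where example_input and example_output live.

```python
from pathlib import Path


def list_files(folder):
    """Return sorted relative paths of every file under `folder`.

    Returns an empty list if the folder does not exist.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def summarize_run(input_dir="example_input", output_dir="example_output"):
    """Collect file listings for the input and output folders of a run.

    The default folder names come from this README; the function itself
    is a hypothetical helper, not part of the parser.
    """
    return {"inputs": list_files(input_dir), "outputs": list_files(output_dir)}


if __name__ == "__main__":
    for label, files in summarize_run().items():
        print(f"{label}: {len(files)} file(s)")
        for name in files:
            print(f"  {name}")
```

Running this before and after example.py gives a quick view of which files the job added to example_output.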

Docker setup

Follow config/dockerConf/README.md

Conda env setup

Follow config/venvSetup.md

Unfinished:

Getting the gc-parser wheel uploaded to Advana's Mirror and providing an example of how to import the parser on Databricks

License

See LICENSE.md



dod-advana/gamechanger-parser
