open-redact

open-redact is an open source API that helps anonymize PDFs. It can be used to redact names and other identifying information from resumes before review, supporting a more equitable hiring process.

Today open-redact supports the following redactions:
* People's names
* Email addresses
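To make the email redaction concrete, here is a minimal sketch of how email-like strings could be replaced in already extracted text. It is an illustration only, not the project's actual implementation: the function name `redact_emails`, the placeholder text, and the simplified regular expression are all assumptions. Name redaction is not shown here, since it relies on the named entity recognition model described under Configuration.

```python
import re

# Simplified email pattern for illustration; it is not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, placeholder: str = "[REDACTED EMAIL]") -> str:
    """Replace every email-like substring in text with a placeholder.

    Hypothetical helper: the name and placeholder are not taken from open-redact.
    """
    return EMAIL_RE.sub(placeholder, text)


if __name__ == "__main__":
    print(redact_emails("Contact: jane.doe@example.com"))
```

A pattern-based approach like this suits emails because they have a predictable shape, whereas names need a statistical model to be recognized in context.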

Configuration

This package uses a spaCy (spacy.io) named entity recognition model. By default it is set to the en_core_web_lg model, but you can choose a smaller model to make development easier, or a more accurate model where accuracy matters more than speed. You can also select models for other languages. Check out https://spacy.io/models/en for English options.

Two places need to be edited to use a different model. In the Dockerfile, edit the following line to install the model of your choice:

RUN python -m spacy download en_core_web_lg

In app/main/sensitive_text_check.py, edit the following line to load the same model:

nlp = spacy.load("en_core_web_lg")
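If you switch models often, one option is to read the model name from an environment variable so the Python side only needs changing once. The sketch below is an assumption, not something the repository currently does: the `SPACY_MODEL` variable and the `model_name` and `load_model` helpers are hypothetical names. The Dockerfile line would still need to download the matching model.

```python
import os

# The project's documented default model.
DEFAULT_MODEL = "en_core_web_lg"


def model_name() -> str:
    """Return the spaCy model name, overridable via the hypothetical SPACY_MODEL variable."""
    return os.environ.get("SPACY_MODEL", DEFAULT_MODEL)


def load_model():
    """Load the configured model.

    The model must already be installed, for example with
    `python -m spacy download <model name>`.
    """
    import spacy  # imported lazily so model_name() works without spaCy installed

    return spacy.load(model_name())
```

With this in place, `nlp = load_model()` would replace the hard-coded `spacy.load` call, and a smaller model could be selected for development by exporting `SPACY_MODEL=en_core_web_sm` before starting the service.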

Installation

Clone the repository and build an image using the included Dockerfile:

 docker build --tag openredact:python .

If you are not using the image, be sure to manually install your named entity recognition model with the following command:

python -m spacy download en_core_web_lg

Usage

Once the service is up and running, it automatically generates Swagger documentation, which can be viewed at http://127.0.0.1:8000/docs#/ (update the address and port for your deployment).
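As a quick way to confirm that a deployment is reachable, the following sketch builds the documentation URL for a given host and port and checks whether it responds. The helper names `docs_url` and `docs_available` are hypothetical, and the check only assumes that the documentation page is served over plain HTTP at the `/docs` path shown above (the trailing `#/` is a browser-side fragment and is not sent to the server).

```python
from urllib.error import URLError
from urllib.request import urlopen


def docs_url(host: str = "127.0.0.1", port: int = 8000) -> str:
    """Build the Swagger documentation URL for a deployment (hypothetical helper)."""
    return f"http://{host}:{port}/docs"


def docs_available(url: str, timeout: float = 2.0) -> bool:
    """Return True if the documentation page answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    url = docs_url()
    print(url, "reachable" if docs_available(url) else "not reachable")
```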

Testing

From the repository root, run the following command to execute all unit tests:

python -m pytest .

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT
