This is the back end of an annotation tool for labelling text documents with threat intelligence vocabulary. The annotations are saved into a dataset, which is later used to train a named-entity recognition (NER) model. The trained model tags documents to extract high-level threat intelligence indicators such as Actor, Targeted Application, Targeted Location, and TTPs.
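To make the idea of an annotated dataset more concrete, here is a minimal sketch of how token-level annotations could be grouped back into indicators. Everything in it is an assumption for illustration: the BIO-style tag names, the `(token, tag)` layout, the sample sentence, and the helper `extract_entities` are hypothetical and do not describe the format this tool actually stores.

```python
# Hypothetical example: the tag names and the (token, tag) layout below are
# illustrative assumptions, not the schema used by this project.
annotated = [
    ("ExampleGroup", "B-ACTOR"),
    ("targeted", "O"),
    ("Microsoft", "B-TARGETED_APPLICATION"),
    ("Office", "I-TARGETED_APPLICATION"),
    ("users", "O"),
    ("in", "O"),
    ("Ukraine", "B-TARGETED_LOCATION"),
    (".", "O"),
]


def extract_entities(tagged_tokens):
    """Group BIO-tagged tokens into (label, text) pairs."""
    entities = []
    current_label, current_tokens = None, []
    for token, tag in tagged_tokens:
        if tag.startswith("B-"):
            # A new entity starts; flush the one in progress, if any.
            if current_tokens:
                entities.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity currently being built.
            current_tokens.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag: close any open entity.
            if current_tokens:
                entities.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_tokens:
        entities.append((current_label, " ".join(current_tokens)))
    return entities


print(extract_entities(annotated))
```

A trained NER model would predict tags like these for unseen documents, and a grouping step of this kind would turn them into a list of indicators.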
Repo for front-end:
git clone https://github.com/yghazi/g4ti-tator.git
Clone this repo:
git clone https://github.com/yghazi/g4ti-nlp-processor.git
- Python 3.X
For Windows, you will also need the following:
- .NET framework
- Visual Studio build tools
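The requirements above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function name `missing_prerequisites` is hypothetical, and because the .NET framework and Visual Studio build tools cannot be detected reliably from Python, the sketch only prints a reminder on Windows.

```python
import platform
import sys


def missing_prerequisites(version_info=sys.version_info, system=None):
    """Return human-readable notes about prerequisites that need attention."""
    system = platform.system() if system is None else system
    notes = []
    if version_info[0] < 3:
        notes.append("Python 3.X is required")
    if system == "Windows":
        # Not detected automatically; this is only a reminder.
        notes.append(
            "Windows: make sure the .NET framework and "
            "Visual Studio build tools are installed"
        )
    return notes


for note in missing_prerequisites():
    print(note)
```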
#!bash
cd g4ti-nlp-processor # navigate to the g4ti-nlp-processor folder
pip install -r requirements.txt
#!bash
python -m nltk.downloader punkt
python -m nltk.downloader averaged_perceptron_tagger
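The same two NLTK resources can also be fetched from Python, which is handy in a setup script. This is a sketch under assumptions: the helper `ensure_nltk_resources` and the `RESOURCES` mapping are hypothetical names, and the lookup paths are the locations where NLTK conventionally stores these two packages. It relies on `nltk.data.find` raising `LookupError` for a missing resource and on `nltk.download` to fetch it.

```python
# Resource name (as passed to the downloader) -> path used to look it up.
RESOURCES = {
    "punkt": "tokenizers/punkt",
    "averaged_perceptron_tagger": "taggers/averaged_perceptron_tagger",
}


def ensure_nltk_resources(resources=RESOURCES):
    """Download any listed NLTK resource that is not installed yet.

    Returns the names of the resources that had to be downloaded.
    """
    import nltk  # imported here so the module loads even before nltk is installed

    downloaded = []
    for name, path in resources.items():
        try:
            nltk.data.find(path)
        except LookupError:
            nltk.download(name)
            downloaded.append(name)
    return downloaded
```

Calling `ensure_nltk_resources()` once after `pip install -r requirements.txt` should leave the environment in the same state as the two downloader commands above.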
#!bash
python tator.py
If your front end is also running, you should now be able to use the NLP processor through your browser.