nlp_tools

This repository contains Python scripts for Natural Language Processing (NLP) tasks, built on the Hugging Face Transformers library. The tasks are emotion detection from text, keyword extraction, and text summarization.

Prerequisites

Windows, macOS, or Linux machine with Python 3.8 or later installed.
Local Administrator rights (Windows) or sudo access (macOS/Linux).
Internet connection (for the initial model downloads).
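
The Python requirement can be checked before going further. This is a minimal sketch; the helper name `meets_minimum` is illustrative and is not part of this repository:

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 8)):
    """Return True when the interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old, 3.8 or later is required"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```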

Clone Repo

You can clone this repo into any location you like and run it from anywhere. Run these commands in your terminal:

cd /path_of_your_choice
git clone https://github.com/AznIronMan/nlp_tools.git

Set up virtual Python environment

We recommend using Anaconda for your virtual Python environment:

cd /path_of_your_choice
conda create -n nlp_tools
conda activate nlp_tools

Confirm that the environment name appears before your prompt (e.g. (nlp_tools) /filepath/ # )
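
If the prompt is ambiguous, the active environment can also be read from inside Python. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation; the function name `active_conda_env` is hypothetical:

```python
import os


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if none is set."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    name = active_conda_env()
    if name == "nlp_tools":
        print("nlp_tools environment is active")
    else:
        print(f"Active environment: {name!r}; run 'conda activate nlp_tools'")
```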

Install Dependencies

You will need an NVIDIA GPU for this. Install the appropriate PyTorch CUDA package from https://pytorch.org/

This is the bash command we used:

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
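
After the install finishes, it may help to confirm that PyTorch can see the GPU. A small sketch, not part of this repository; `describe_device` is a hypothetical helper, and the import is deferred so the check also runs when PyTorch is missing:

```python
def describe_device():
    """Report whether PyTorch is installed and can see a CUDA device."""
    try:
        import torch  # imported lazily so the check works before installation
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed, but no CUDA device is visible"


if __name__ == "__main__":
    print(describe_device())
```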

Then install the required Python libraries for this project from the requirements.txt file with pip:

cd /path_of_your_choice/nlp_tools
pip install -r requirements.txt
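
To see whether the install succeeded, the installed distributions can be queried with the standard library. This is a sketch under the assumption that checking a few core packages is enough; `missing_packages` is an illustrative name, not a function from this repository:

```python
from importlib import metadata


def missing_packages(names):
    """Return the distribution names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # A few core libraries from the Dependencies section; extend as needed.
    core = ["transformers", "torch", "sentencepiece", "tokenizers"]
    print("Missing:", missing_packages(core) or "none")
```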

Scripts

This repository contains the following Python scripts:

emotions.py - Uses the Transformers library to classify the emotion of a given text.
keywords.py - Extracts keywords from the provided text using a T5 model.
summarizer.py - Summarizes the given text using the BART model.

Each script can be run from the command line with additional arguments to specify the input text and other parameters.

Usage

Emotion Detection (emotions.py)

python emotions.py "<Your Text Here>"
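
For orientation, emotion classification with the Transformers `pipeline` API might look roughly like the sketch below. The model name is an assumption (a public emotion classifier on the Hugging Face Hub); this README does not say which model emotions.py loads, and the function names are hypothetical:

```python
import sys

# Assumption: a public emotion model from the Hugging Face Hub. The model
# actually used by emotions.py is not documented in this README.
MODEL_NAME = "j-hartmann/emotion-english-distilroberta-base"


def format_prediction(prediction):
    """Format one {'label': ..., 'score': ...} dict as 'label (score)'."""
    return f"{prediction['label']} ({prediction['score']:.2f})"


def classify_emotion(text, model_name=MODEL_NAME):
    """Return the top prediction for `text`; downloads the model on first use."""
    from transformers import pipeline  # lazy import: heavy dependency

    classifier = pipeline("text-classification", model=model_name)
    # For a single string, the pipeline returns a one-element list of dicts.
    return classifier(text)[0]


if __name__ == "__main__" and len(sys.argv) > 1:
    print(format_prediction(classify_emotion(sys.argv[1])))
```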

Keyword Extraction (keywords.py)

# num_beams: how many words you want per 'keyword' (optional, default: 3)
# max_tokens: the maximum number of tokens in the response (optional, default: 4096)
python keywords.py "<Your Text Here>" [num_beams] [max_tokens]
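
The optional positional arguments above could be parsed as in the following sketch. It uses `argparse` with the documented defaults (3 and 4096); whether keywords.py parses its arguments this way is an assumption, and `build_parser` is an illustrative name:

```python
import argparse


def build_parser():
    """Build a parser for: text [num_beams] [max_tokens]."""
    parser = argparse.ArgumentParser(description="Extract keywords from text.")
    parser.add_argument("text", help="input text to extract keywords from")
    # nargs="?" makes a positional argument optional, falling back to default.
    parser.add_argument("num_beams", nargs="?", type=int, default=3)
    parser.add_argument("max_tokens", nargs="?", type=int, default=4096)
    return parser


if __name__ == "__main__":
    # Parse an explicit example instead of sys.argv to keep the demo fixed.
    args = build_parser().parse_args(["Some example text", "5"])
    print(args.text, args.num_beams, args.max_tokens)
```

Because the arguments are positional, max_tokens can only be supplied when num_beams is given as well.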

Text Summarization (summarizer.py)

# threshold: minimum word count for summarization; text with fewer words is not summarized (optional, default: 32)
# curve: decimal multiplied by the word count to derive max_length and min_length when those values are invalid (optional, default: 0.1)
# max_length and min_length: both optional, default None; when they are None, curve is used instead
python summarizer.py "<Your Text Here>" [max_length] [min_length] [threshold] [curve]
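
How threshold and curve interact can be illustrated with a short sketch. The threshold check follows the description above; the formula that turns curve into length bounds is an assumption made for illustration (minimum = word count x curve, maximum = twice that), since the exact calculation in summarizer.py is not documented here. `plan_summary` is a hypothetical name:

```python
def plan_summary(text, max_length=None, min_length=None, threshold=32, curve=0.1):
    """Return (max_length, min_length), or None when the text is too short."""
    word_count = len(text.split())
    if word_count < threshold:
        return None  # under the threshold: the text is not summarized

    valid = (
        max_length is not None
        and min_length is not None
        and 0 < min_length < max_length
    )
    if not valid:
        # Assumption: derive both bounds from the word count and curve.
        # The real formula in summarizer.py may differ.
        min_length = max(1, round(word_count * curve))
        max_length = max(min_length + 1, round(word_count * curve * 2))
    return max_length, min_length


if __name__ == "__main__":
    sample = "word " * 100
    print(plan_summary(sample))          # bounds derived from curve
    print(plan_summary(sample, 50, 20))  # explicit bounds kept as given
    print(plan_summary("too short"))     # below threshold
```

Note that summarization models usually measure length in tokens rather than words, so bounds derived from a word count are only approximate.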

Note: Replace <Your Text Here> with your input text.
Note: For keywords.py and summarizer.py, the additional arguments are optional and fall back to their default values if not specified.

Dependencies

Here is a list of Python libraries used in this project:

Transformers
Torch
NumPy
tqdm
requests
regex
urllib3
PyYAML
typing
certifi
charset-normalizer
colorama
filelock
fsspec
huggingface-hub
idna
Jinja2
MarkupSafe
mpmath
mypy-extensions
networkx
packaging
Pillow
pyre-extensions
sentencepiece
sympy
tokenizers
torchaudio
torchvision
typing-inspect
typing_extensions
xformers

License

This project is licensed under the MIT License.

Contact

Discord: AznIronMan
E-Mail: geoff at clark tribe games dot com (no spaces and replace at with @ and dot with .)
