The repository contains a series of natural language processing (NLP) tasks tackled using Transformer models and standard open-source datasets.
The implementation uses PyTorch and relies heavily on the HuggingFace Transformers and PEFT libraries.
The intention is to have a codebase for experimenting with different models on various standard datasets, so the repository serves a demonstrative and experimental purpose rather than a production one.
Tasks included are:
- [✅] Text Classification
- [✅] Named Entity Recognition
- [✅] Question Answering
- [✅] Summarization
- [✅] Causal LLM
- [] Instruction Fine-Tuning LLM
Initial setup
1.1 Create a new conda environment to install the dependencies, and activate it:
conda create -n nlp-tasks-env python=3.11 -y
conda activate nlp-tasks-env
1.2 Install the dependencies:
pip install git+https://github.com/sarapiscitelli/nlp-tasks/
1.3 Clone the repository to get the scripts:
git clone https://github.com/sarapiscitelli/nlp-tasks.git
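After the setup steps above, a quick sanity check of the environment can save a failed training run. The following is a minimal sketch, not part of the repository: it assumes the libraries named in this README are importable as `torch`, `transformers`, and `peft`, and the helper names `missing_packages` and `python_version_ok` are hypothetical.

```python
import importlib.util
import sys

# Import names are assumptions based on the libraries this README mentions:
# PyTorch -> "torch", HuggingFace Transformers -> "transformers", PEFT -> "peft".
REQUIRED_PACKAGES = ("torch", "transformers", "peft")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names from `names` that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_version_ok(minimum=(3, 11)):
    """Return True if the running interpreter is at least version `minimum`."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python >= 3.11:", python_version_ok())
    print("Missing packages:", missing_packages() or "none")
```

Running this inside the activated `nlp-tasks-env` environment should report no missing packages if the `pip install` step succeeded.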
Run the experiments
Training:
python scripts/train/<task name>.py
Evaluation:
python scripts/evaluate/<task name>.py --model_name_or_checkpoin_path <path_to_model>
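To run several tasks in sequence, the two commands above can be assembled programmatically. This is a hedged sketch rather than a tool shipped with the repository: the helper names `train_command` and `evaluate_command` are hypothetical, the task name used in the usage note is an assumed example, and the evaluation flag is copied verbatim from the command shown above.

```python
import sys
from pathlib import Path


def train_command(task_name):
    """Build the argument list for scripts/train/<task name>.py."""
    script = Path("scripts") / "train" / f"{task_name}.py"
    return [sys.executable, script.as_posix()]


def evaluate_command(task_name, model_path):
    """Build the argument list for scripts/evaluate/<task name>.py."""
    script = Path("scripts") / "evaluate" / f"{task_name}.py"
    # The flag name is reproduced exactly as written in this README.
    return [
        sys.executable,
        script.as_posix(),
        "--model_name_or_checkpoin_path",
        str(model_path),
    ]
```

From the repository root, a list returned by either helper can be passed to `subprocess.run(cmd, check=True)`, for example `train_command("text_classification")`, where `text_classification` is an assumed script name that should be checked against the contents of `scripts/train/`.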