This repository contains programs for detecting COVID-19 related tweets and classifying them into scientific claim categories. This work follows the annotation framework described by Hafid et al.
The repository is divided into two directories: one for Llama 2 models and one for GPT models. The Llama 2 directory contains the following prompting techniques:
- Few Shot Prompting
- Few Shot Prompting with Guidelines
- Few Shot Prompting with Guidelines and Emotional Stimuli
- Chain of Thought
- Clue and Reasoning Prompting
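As an illustration of the first two techniques above, the sketch below assembles a few-shot prompt, optionally prefixed with guidelines. The categories, guideline text, and example tweets are placeholders, not the repository's actual annotation scheme:

```python
# Sketch: building a few-shot prompt, optionally with guidelines.
# GUIDELINES and EXAMPLES are illustrative placeholders only.
GUIDELINES = (
    "Classify each tweet into exactly one category. "
    "Base the decision only on the tweet text."
)

EXAMPLES = [
    ("Masks reduce transmission, a new study finds.", "scientific claim"),
    ("Feeling tired of lockdowns today.", "not a claim"),
]

def build_few_shot_prompt(tweet: str, with_guidelines: bool = True) -> str:
    """Concatenate guidelines, labeled examples, and the target tweet."""
    parts = []
    if with_guidelines:
        parts.append(GUIDELINES)
    for text, label in EXAMPLES:
        parts.append(f"Tweet: {text}\nCategory: {label}")
    # The unlabeled target tweet goes last; the model completes the label.
    parts.append(f"Tweet: {tweet}\nCategory:")
    return "\n\n".join(parts)
```

The other techniques extend this pattern: emotional stimuli append a motivating sentence to the guidelines, while chain-of-thought examples include a short reasoning trace before each label.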
Run the following command to install the packages needed to run the standalone Python program for `GPT` models:

```shell
# first, activate your virtual environment
pip install openai pandas scikit-learn
```
Similarly, for Llama 2 models:

```shell
pip install requests together langchain pandas scikit-learn
```
- To run the Python program for GPT models, first set the environment variable named `OPENAI_API_KEY`:

```shell
setx OPENAI_API_KEY "<yourkey>"                        # for Windows
echo "export OPENAI_API_KEY='yourkey'" >> ~/.bashrc    # for Linux
echo "export OPENAI_API_KEY='yourkey'" >> ~/.zshrc     # for macOS
```
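Note that `setx` only affects shells opened afterwards, and the `echo` lines take effect once you open a new terminal or source the file. A small sketch of how the program can read the key and fail fast when it is missing (the helper name is illustrative, not from the repository):

```python
import os

def get_openai_key() -> str:
    """Read the key set in the previous step; fail fast if it is missing."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the program."
        )
    return key
```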
- The Jupyter Notebook for Llama 2 calls TogetherAI's API, a third-party service that hosts several large language models, including Llama 2, and provides free credits. You can set the TogetherAI API key likewise, named `TOGETHERAI_API_KEY`:

```shell
setx TOGETHERAI_API_KEY "<yourkey>"                        # for Windows
echo "export TOGETHERAI_API_KEY='yourkey'" >> ~/.bashrc    # for Linux
echo "export TOGETHERAI_API_KEY='yourkey'" >> ~/.zshrc     # for mac
```
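A minimal sketch of calling TogetherAI over HTTP with the key set above. The endpoint URL and model name below are assumptions (TogetherAI exposes an OpenAI-compatible chat completions API at the time of writing); check TogetherAI's documentation for the current values:

```python
import os

# Assumed endpoint and model name; verify against TogetherAI's docs.
TOGETHER_URL = "https://api.together.xyz/v1/chat/completions"
MODEL = "meta-llama/Llama-2-7b-chat-hf"

def build_request(prompt: str) -> tuple[dict, dict]:
    """Return (headers, payload) for a TogetherAI chat completion call."""
    headers = {"Authorization": f"Bearer {os.environ['TOGETHERAI_API_KEY']}"}
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return headers, payload

# To actually send the request (requires the `requests` package):
# headers, payload = build_request("Classify this tweet: ...")
# response = requests.post(TOGETHER_URL, headers=headers, json=payload, timeout=60)
```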
- To run the Llama 2 notebook on a GPU cluster, such as Colorado State University's Falcon HPC cluster, you need to craft a shell script with the required configuration parameters. The Falcon cluster uses the Slurm scheduler to schedule jobs. Once the job is submitted to the cluster, enable port forwarding to interact with the notebook:

```shell
# Forward the specified port of the HPC cluster to the same port
# of the machine that was used to submit the job
ssh -N -f -R $port:localhost:$port falcon

# Further forward that port from the submitting machine to your local machine
ssh -N -f -L localhost:$port:localhost:$port <username>@<machine_name>.<domain>
```
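A minimal Slurm submission script might look like the following. The partition name, resource limits, and port are assumptions and will differ on your cluster; consult Falcon's documentation for the correct values:

```shell
#!/bin/bash
#SBATCH --job-name=llama2-notebook
#SBATCH --partition=gpu          # assumed partition name; check your cluster
#SBATCH --gres=gpu:1             # request one GPU
#SBATCH --time=02:00:00
#SBATCH --output=notebook_%j.log

port=8888                        # must match $port in the ssh commands above
jupyter notebook --no-browser --port=$port
```

Submit it with `sbatch <script>.sh`, then run the two `ssh` commands above with the same port to reach the notebook from your local browser.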
A truncated version of the dataset is available in `.csv` format.