This repo contains scripts to run LLM models on a device. Currently it supports only Linux-based OSes and is intended for internal purposes only.
- `models_list.csv`: Contains the list of available models that can be downloaded via the `model.sh` script. Update this file to add other models.
- `scripts/model.sh`: Script to download an LLM model's weights. The available models are listed in `models_list.csv`.
- `setup.sh`: Script to install `llama.cpp`, which is used to run the downloaded models.
- `scripts/run.sh`: Script to run an LLM model on a device and log the benchmarks. Execute this after completing the setup steps below.
- `scripts/run_all.sh`: Script to run all the downloaded models (located in `models/`) and log the benchmarks.
- `scripts/t_run.sh`: Script to benchmark the execution time of a selected model with a varying number of processor threads (from 1 to the maximum available).
- `scripts/prompt_run.sh`: Script to benchmark the execution time of a selected model with prompts of varying length, as specified in the `prompt.txt` file.
- `scripts/plotter.py`: Python script to plot the benchmark results, including prompt eval time, eval time, memory usage, and execution time with varying thread counts and prompt lengths.
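To plot benchmarks, the raw timings first have to be extracted from the run logs. The sketch below shows one way such parsing could work, assuming the logs contain `llama_print_timings:` lines as printed by `llama.cpp`; the exact line layout varies between `llama.cpp` versions, and the `parse_timings` helper is a hypothetical illustration rather than the actual code in `plotter.py`.

```python
import re

# Matches timing lines such as:
#   llama_print_timings:        eval time =  5000.25 ms /   100 runs
# The metric name ("load", "prompt eval", "eval", ...) precedes " time".
TIMING_RE = re.compile(
    r"llama_print_timings:\s*(?P<name>[\w ]+?) time\s*=\s*(?P<ms>[\d.]+) ms"
)

def parse_timings(log_text: str) -> dict:
    """Return a {metric_name: milliseconds} dict from a benchmark log."""
    timings = {}
    for line in log_text.splitlines():
        m = TIMING_RE.search(line)
        if m:
            timings[m.group("name").strip()] = float(m.group("ms"))
    return timings

sample = """\
llama_print_timings:     load time =   350.00 ms
llama_print_timings: prompt eval time =  1200.50 ms /    14 tokens
llama_print_timings:        eval time =  5000.25 ms /   100 runs
"""
print(parse_timings(sample))
# → {'load': 350.0, 'prompt eval': 1200.5, 'eval': 5000.25}
```

The resulting dict can then be fed directly to a plotting library, one entry per benchmarked run.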
- Clone this repo: `git clone https://github.com/VIS-WA/LLM-replication.git`
- `cd` into the directory: `cd LLM-replication`
- Make the setup file executable: `chmod +x setup.sh`
- Set up `llama.cpp` and make the other scripts executable by running the setup script: `./setup.sh`
- Set the device and model parameters in `config.txt`
- `cd` into the scripts directory: `cd scripts`
- Download the required models: `./model.sh`
- Run the script to select and execute an LLM model and save the benchmarks: `./run.sh`
- Run the other scripts to log the remaining benchmarks: `./run_all.sh`, `./t_run.sh`, `./prompt_run.sh`
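The steps above can be sketched as a single script. This is an illustrative consolidation, not part of the repo: it assumes the layout described in this README, and by default it only previews the command sequence (`DRY_RUN=1`) so it can be inspected before anything is executed.

```shell
#!/usr/bin/env bash
# Consolidated sketch of the setup steps from this README.
# With DRY_RUN=1, each command is echoed instead of executed.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_steps() {
  run git clone https://github.com/VIS-WA/LLM-replication.git
  run cd LLM-replication
  run chmod +x setup.sh
  run ./setup.sh        # installs llama.cpp, makes the other scripts executable
  # Edit config.txt at this point to set device and model parameters.
  run cd scripts
  run ./model.sh        # download the required model weights
  run ./run.sh          # run a model and log the benchmarks
}

# Preview the full sequence without touching the system:
DRY_RUN=1 setup_steps
```

Running with `DRY_RUN=0 setup_steps` would execute the commands for real; the preview mode exists purely so the order of operations is easy to verify.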
- Create script to run a model and log the benchmarks
- Merge individual scripts into a single master script
- Create supporting scripts for Windows OS (Latte Panda)
- Create scripts for downloading the models and setting up the environment
- Optimise the scripts with modular components