- Built using Python.
- This project is from Google Summer of Code 2024.
You may need to update the environment variables set in `BE/.env` and `FE/.env`.
- In the `.env` file in the `FE/` directory, you will find the server URL, which points to localhost by default:
```
VITE_SERVER_URL = http://127.0.0.1:5000/
```
- In the `.env` file in the `BE/` directory, you will also find both `HOST` and `PORT`, which are configured for localhost by default (see the sketch after this list for how they are typically read):
```
FLASK_RUN_HOST = 0.0.0.0
FLASK_RUN_PORT = 5000
```
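For reference, here is a minimal sketch of how a Flask entry point started with `python app.py` can pick these variables up. It assumes `python-dotenv` is used; the actual `BE/app.py` may wire this up differently.

```python
# Hypothetical sketch of the BE startup, not the project's exact code.
import os

from dotenv import load_dotenv  # assumes python-dotenv is installed
from flask import Flask

load_dotenv()  # reads BE/.env
app = Flask(__name__)

if __name__ == "__main__":
    # Fall back to localhost:5000 if the variables are not set.
    app.run(
        host=os.getenv("FLASK_RUN_HOST", "127.0.0.1"),
        port=int(os.getenv("FLASK_RUN_PORT", "5000")),
    )
```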
Open a new terminal in the project directory.
- You need to install Node.js first.
- Install all required packages:
```
cd ./FE
npm install
```
- Start the server:
```
npm run dev
```
- You will get a message that the server is running at http://localhost:5173/
- Install the needed packages:
```
cd ./BE
python3 -m venv .
source ./bin/activate
pip install -r ./requirements.txt
```
- Start the server:
```
python app.py
```
- Note: the first time you run the BE server, it will download the model locally to your machine (about 6 GB). This happens only on the first run (a quick smoke test is sketched below).
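Once the BE server is running, a quick request can confirm that it responds. The route and payload below are assumptions for illustration only; check the Flask routes in `BE/` (and the requests made by the FE) for the actual endpoint.

```python
# Hypothetical smoke test; the endpoint path and JSON fields are assumptions.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/chat",  # assumed route
    json={"question": "How do I create a Jenkins pipeline?"},  # assumed payload
    timeout=300,  # the first response can be slow while the model loads
)
print(resp.status_code)
print(resp.text)
```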
You can fine-tune your own version and upload it to Hugging Face using the following steps:
- We fine-tune Llama 2 using Colab's free T4 GPU with 16 GB VRAM.
- We provide `./src/Fine-Tuning.ipynb` for this.
- We clone our repository to access the dataset provided for training:
```
git clone https://github.com/nouralmulhem/Enhancing-LLM-with-Jenkins-Knowledge.git
```
- Google Drive is used to store the checkpoints to ensure their persistence in case the Colab environment crashes. You can change the Drive path the model is saved to by editing the `new_model_path` variable.
- You can also set the number of epochs used to fine-tune the model by updating the `num_train_epochs` variable (see the sketch after this list).
- Once fine-tuning is done, open `./src/Upload_Model.ipynb` to merge the LoRA weights with the model, upload your own model to Hugging Face, and start using it.
- At this stage you need to update the `new_model_path` variable to the correct path on your Drive.
- As a final step, update the `repo_id` variable to match your repo on Hugging Face.
- VOILA! You have your own model.
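The notebooks are the source of truth, but in outline the pieces driven by these variables look roughly like the following. This is a sketch assuming the usual `transformers`/`peft` setup for Llama 2 on a T4; every name other than `new_model_path`, `num_train_epochs`, and `repo_id` is an illustrative assumption.

```python
# Sketch only: the real code lives in ./src/Fine-Tuning.ipynb and ./src/Upload_Model.ipynb.
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import PeftModel

new_model_path = "/content/drive/MyDrive/jenkins-llama2"  # Drive path for checkpoints (assumed)
num_train_epochs = 3                                      # number of fine-tuning epochs

training_args = TrainingArguments(
    output_dir=new_model_path,         # checkpoints persist on Drive if Colab crashes
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=4,
    fp16=True,                         # the T4 does not support bf16
    save_strategy="epoch",
)
# ... build the trainer with the Jenkins dataset and call trainer.train(),
#     as done in Fine-Tuning.ipynb ...

# Upload_Model.ipynb stage: merge the LoRA adapter into the base model and push it.
base_name = "meta-llama/Llama-2-7b-chat-hf"   # base model name is an assumption
base = AutoModelForCausalLM.from_pretrained(base_name)
merged = PeftModel.from_pretrained(base, new_model_path).merge_and_unload()

repo_id = "username/your-jenkins-llama2"      # your Hugging Face repo
merged.push_to_hub(repo_id)
AutoTokenizer.from_pretrained(base_name).push_to_hub(repo_id)
```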
You can load this full model onto the GPU and run it like any other Hugging Face model, but we are here to take it to the next level: running this model on the CPU.
We are using llama.cpp, so first of all we need to clone the repo:
```
git clone https://github.com/ggerganov/llama.cpp.git
```
llama.cpp has a script called `convert_hf_to_gguf.py` that is used to convert models to the binary GGUF format that can be loaded and run on CPU:
```
python convert_hf_to_gguf.py path/to/fine-tuned/model/ --outtype f16 --outfile path/to/binary/model.bin
```
This should output a 13 GB binary file at the specified `path/to/binary/model.bin` that is ready to run on CPU with the same code that we started with!
Part of the appeal of the GGML library is being able to quantize this 13 GB model into smaller models that can be run even faster. There is a tool called `llama-quantize` in the llama.cpp repo that can be used to convert the model to different quantization levels.
First you need to build the tools in the llama.cpp repository:
```
cd llama.cpp
cmake -B build
cmake --build build --config Release
```
This will create the tools in the `build/bin` directory. You can now use the `llama-quantize` tool to shrink the model to q8_0 by running:
```
cd build/bin/Release
./llama-quantize.exe path/to/binary/model.bin path/to/binary/merged-q8_0.bin q8_0
```
Now we have a 6.7 GB model at `path/to/binary/merged-q8_0.bin`.
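To sanity-check the converted files locally, you can load either the f16 or the q8_0 GGUF file with `llama-cpp-python` and run a quick completion on CPU. This library is an assumption on our side (it is not part of this repo's requirements); install it with `pip install llama-cpp-python`.

```python
# Sketch: run the converted model on CPU with llama-cpp-python.
# The file produced by convert_hf_to_gguf.py / llama-quantize is in GGUF format,
# which Llama() loads directly regardless of the .bin extension.
from llama_cpp import Llama

llm = Llama(model_path="path/to/binary/merged-q8_0.bin", n_ctx=2048)

out = llm(
    "How do I configure a Jenkins pipeline to build on every push?",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```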
To upload the local quantized model to Hugging Face:
```
huggingface-cli upload username/repo_id path/to/binary/quantized/model.bin model.bin
```
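The same upload can also be done from Python with `huggingface_hub` (a sketch, assuming you are already authenticated, e.g. via `huggingface-cli login`):

```python
# Sketch: upload the quantized GGUF file through the huggingface_hub Python API.
from huggingface_hub import HfApi

api = HfApi()
api.upload_file(
    path_or_fileobj="path/to/binary/quantized/model.bin",
    path_in_repo="model.bin",
    repo_id="username/repo_id",
)
```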
Nour Almulhem

Note: This software is licensed under the MIT License. See License for more information. © nouralmulhem