LLM Sentinel (NeMo Guardrails + LangChain + Ollama + Gradio) for the NVIDIA GenAI Contest
- nemoguardrails==0.9.0
- langchain-community==0.0.38
- ollama==0.2.1
- gradio==4.42.0
- python-dotenv==1.0.1
Tested on Python 3.10 and above on macOS and Ubuntu Linux.
-
Hardware Requirements: Ensure you have access to an NVIDIA GPU, ideally an A100 with 80 GB VRAM, to run the model (llama3:70b) efficiently. In my case I rent an A100 GPU from DigitalOcean Paperspace; please see the screenshot.
OS: Ubuntu 22.04
Disk size: at least 200 GB (llama3:70b needs about 40 GB, llama3:8b about 5 GB)
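If you are unsure which model your GPU can serve, the sizing figures above can be sketched as a small helper. The VRAM cutoff below is my own assumption for illustration, not official guidance; the disk figures come from this README.

```python
def pick_model_for_gpu(vram_gb: float) -> str:
    """Rough heuristic: llama3:70b benefits from an 80 GB A100, so fall back
    to llama3:8b on smaller GPUs. The 48 GB cutoff is an assumption."""
    return "llama3:70b" if vram_gb >= 48 else "llama3:8b"

def disk_needed_gb(model: str) -> int:
    """Approximate on-disk size of the model weights (figures from this README)."""
    sizes = {"llama3:70b": 40, "llama3:8b": 5}
    return sizes[model]
```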
ssh paperspace@XXX.XXX.XXX.XXX
-
First git clone the repository
cd ~
git clone https://github.com/aidatatools/LLM_Sentinel.git
cd LLM_Sentinel
-
venv:
Ensure you have Python 3.10 or later installed.
cd ~/LLM_Sentinel
python3.10 -m venv venv
source venv/bin/activate
-
Install the dependencies from requirements.txt:
pip install -r requirements.txt
-
Check that the backend Ollama service is running and that the model llama3:8b (for DEV) or llama3:70b (for Production) exists. If you are not familiar with Ollama, please visit https://ollama.com
ollama list
curl http://127.0.0.1:11434
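The same check can be done programmatically against Ollama's REST API, whose `/api/tags` endpoint lists the locally pulled models. A minimal sketch (the helper names are mine; the endpoint and base URL are Ollama's defaults):

```python
import json
from urllib.request import urlopen

def model_available(tags_json: dict, name: str) -> bool:
    """Return True if a model whose name starts with `name` appears in the
    response of Ollama's GET /api/tags endpoint."""
    return any(m["name"].startswith(name) for m in tags_json.get("models", []))

def check_ollama(base_url: str = "http://127.0.0.1:11434",
                 model: str = "llama3:8b") -> bool:
    """Query the local Ollama service and check that `model` is pulled."""
    with urlopen(f"{base_url}/api/tags") as resp:
        tags = json.load(resp)
    return model_available(tags, model)
```

If `check_ollama()` raises a connection error, the Ollama service is not running; if it returns False, run `ollama pull llama3:8b` (or `llama3:70b`) first.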
-
Copy .env.example to .env and set the variable ENV_PROD to True (Production) or False (DEV):
echo 'ENV_PROD=False' > .env
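The app reads this flag via python-dotenv. How chatbot3.py actually uses it is not shown here, but the intent can be sketched with a minimal parser and a hypothetical model-selection helper (both function names are my own, for illustration):

```python
def parse_env(text: str) -> dict:
    """Minimal .env parser: KEY=VALUE lines, skipping blanks and comments.
    (python-dotenv does this, and more, in the real app.)"""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def pick_model(env: dict) -> str:
    """Hypothetical mapping: Production serves llama3:70b, DEV serves llama3:8b."""
    return "llama3:70b" if env.get("ENV_PROD", "False") == "True" else "llama3:8b"
```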
-
Start the WebUI in terminal:
python chatbot3.py
-
Open a browser and visit the address printed in the terminal (Gradio defaults to http://127.0.0.1:7860 unless chatbot3.py overrides the port).