## Install Docker

___src: https://docs.docker.com/engine/install/ubuntu/___

1- Uninstall old versions:

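The unofficial packages to uninstall are listed below. The command also removes `containerd` and `runc`, which can conflict with the versions bundled with Docker Engine: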

docker.io</br>
docker-compose</br>
docker-compose-v2</br>
docker-doc</br>
podman-docker</br>

    for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done

2- Install from a package:

If you can't use Docker's apt repository to install Docker Engine, you can download the deb file for your release and install it manually. You need to download a new file each time you want to upgrade Docker Engine.

Go to `https://download.docker.com/linux/ubuntu/dists/`.

Select your Ubuntu version in the list.

Go to pool/stable/ and select the applicable architecture (amd64, armhf, arm64, or s390x).

Download the following deb files for the Docker Engine, CLI, containerd, and Docker Compose packages:

    containerd.io_<version>_<arch>.deb
    docker-ce_<version>_<arch>.deb
    docker-ce-cli_<version>_<arch>.deb
    docker-buildx-plugin_<version>_<arch>.deb
    docker-compose-plugin_<version>_<arch>.deb
    
Install the .deb packages. Update the paths in the following example to where you downloaded the Docker packages.

    sudo dpkg -i ./containerd.io_<version>_<arch>.deb \
      ./docker-ce_<version>_<arch>.deb \
      ./docker-ce-cli_<version>_<arch>.deb \
      ./docker-buildx-plugin_<version>_<arch>.deb \
      ./docker-compose-plugin_<version>_<arch>.deb
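
As a rough sketch, the `dpkg -i` command above can be assembled from whatever files are in the download directory. The helper name `build_dpkg_command` is hypothetical, and picking the last file name in sorted order is only an approximation of "newest version".

```python
from pathlib import Path

# Package name prefixes taken from the download list above.
PACKAGES = [
    "containerd.io",
    "docker-ce",
    "docker-ce-cli",
    "docker-buildx-plugin",
    "docker-compose-plugin",
]


def build_dpkg_command(download_dir):
    """Return the `dpkg -i` argument list for the .deb files in download_dir."""
    directory = Path(download_dir)
    debs = []
    for name in PACKAGES:
        # The trailing underscore keeps "docker-ce" from matching "docker-ce-cli".
        matches = sorted(directory.glob(f"{name}_*.deb"))
        if not matches:
            raise FileNotFoundError(f"no .deb file for {name} in {directory}")
        # Assumption: the last name in sorted order is the one to install.
        debs.append(str(matches[-1]))
    return ["sudo", "dpkg", "-i", *debs]
```

The returned list can be passed to `subprocess.run` once you have checked that it names the files you expect.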
  
The Docker daemon starts automatically.

Verify that the Docker Engine installation is successful by running the hello-world image.

     sudo service docker start
     sudo docker run hello-world


## Install CUDA

Install CUDA to use the GPU.

1- sudo apt update</br>
2- sudo apt upgrade</br>
3- sudo apt install ubuntu-drivers-common</br>
4- sudo ubuntu-drivers devices</br>
5- The previous command lists the available drivers and marks one as recommended, here the NVIDIA driver 535:</br>

	driver   : nvidia-driver-535 - distro non-free recommended
	
6- sudo apt install nvidia-driver-535</br>
7- Reboot</br>
8- Click the NVIDIA icon in the top bar, choose "Switch to: NVIDIA (Performance Mode)", and then log out.</br>
9- nvidia-smi</br>

You should see a table. The top of the table shows the driver version and the CUDA version supported by the driver API:

	NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2
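
If you want to read these two values in a script, a minimal sketch is to match them with a regular expression. The function name `parse_nvidia_smi_header` is hypothetical, and the pattern assumes the banner keeps the `Driver Version:` and `CUDA Version:` labels shown above.

```python
import re


def parse_nvidia_smi_header(line):
    """Extract the driver and CUDA versions from the nvidia-smi banner line."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line
    )
    if match is None:
        return None  # the line is not the banner, or the layout changed
    return {"driver": match.group(1), "cuda": match.group(2)}


header = "NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2"
print(parse_nvidia_smi_header(header))
# {'driver': '535.86.05', 'cuda': '12.2'}
```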

10- sudo apt install gcc</br>
11- Install CUDA toolkit Ubuntu</br>

src:https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_network

	wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
	sudo dpkg -i cuda-keyring_1.1-1_all.deb
	sudo apt-get update  # for this step you must use a VPN such as Windscribe
	sudo apt-get -y install cuda-toolkit-12-3  # for this step you must use a VPN such as Windscribe
	
If you encounter dependency errors during the installation, try running `sudo apt --fix-broken install` to fix them. Apt will suggest running it if needed.

12- Reboot</br>
13- Environment setup</br>

We will now proceed to update the environment variables as recommended by the NVIDIA documentation.
Open your `.bashrc` file with `nano ~/.bashrc` and paste the following lines at the end of the file.

	export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
	export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Save the file.
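
The `${PATH:+:${PATH}}` part of the `export` line can look cryptic. It expands to `:$PATH` only when `PATH` is already set and non-empty, so no trailing colon is added to an empty variable (an empty entry in `PATH` means the current directory). A minimal Python sketch of the same logic, using the hypothetical helper `prepend_path`:

```python
def prepend_path(new_dir, current):
    """Mimic the shell expansion NEW_DIR${VAR:+:${VAR}}."""
    # Append ":<current>" only when the variable already has a value.
    if current:
        return f"{new_dir}:{current}"
    return new_dir


print(prepend_path("/usr/local/cuda/bin", "/usr/bin:/bin"))
# /usr/local/cuda/bin:/usr/bin:/bin
print(prepend_path("/usr/local/cuda/bin", ""))
# /usr/local/cuda/bin
```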

14- Reboot
	
15- Test the CUDA toolkit</br>
	
    nvcc -V
	
You should see output similar to:

	nvcc: NVIDIA (R) Cuda compiler driver
	Copyright (c) 2005-2023 NVIDIA Corporation
	Built on Fri_Nov__3_17:16:49_PDT_2023
	Cuda compilation tools, release 12.3, V12.3.103
	Build cuda_12.3.r12.3/compiler.33492891_0

src: https://www.cherryservers.com/blog/install-cuda-ubuntu
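
To check the toolkit version from a script instead of reading the `nvcc -V` output by eye, something like the following sketch could work. The function names are hypothetical, and the pattern assumes the `release X.Y` wording shown in the output above.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return the toolkit release as (major, minor), or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run `nvcc -V` if nvcc is on PATH and parse its release number."""
    if shutil.which("nvcc") is None:
        return None  # PATH is probably missing /usr/local/cuda/bin
    result = subprocess.run(
        ["nvcc", "-V"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)
```

A `None` result from `installed_cuda_release()` usually points back to the environment setup step rather than to a failed installation.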

## Install Ollama

src: https://github.com/jmorganca/ollama/blob/main/docs/linux.md

1- Create a user for Ollama:

    sudo useradd -r -s /bin/false -m -d /usr/share/ollama ollama

2- Create a service file in `/etc/systemd/system/ollama.service`:

    [Unit]
    Description=Ollama Service
    After=network-online.target
    
    [Service]
    ExecStart=/usr/bin/ollama serve
    User=ollama
    Group=ollama
    Restart=always
    RestartSec=3
    
    [Install]
    WantedBy=default.target

3- Reload systemd and enable the service so that it starts at boot:

    sudo systemctl daemon-reload
    sudo systemctl enable ollama

4- Start Ollama using systemd:

    sudo systemctl start ollama

5- Install or update Ollama by downloading the `ollama` binary (on a fresh machine, do this before starting the service in step 4):

    sudo curl -L https://ollama.ai/download/ollama-linux-amd64 -o /usr/bin/ollama
    sudo chmod +x /usr/bin/ollama

6- To view logs of Ollama running as a startup service, run:

    journalctl -u ollama
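
Before copying the unit file from step 2 into `/etc/systemd/system/`, it can be sanity-checked with a short script. This is only a sketch: it relies on the unit file being INI-like enough for Python's `configparser` (real unit files may repeat keys, which this parser rejects), and the checks in the hypothetical `check_ollama_unit` cover only the keys used above.

```python
import configparser

UNIT_TEXT = """\
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3

[Install]
WantedBy=default.target
"""


def parse_unit(text):
    """Parse unit file text into {section: {key: value}}."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. "ExecStart"
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def check_ollama_unit(unit):
    """Return a list of problems found in the parsed unit (empty if none)."""
    problems = []
    service = unit.get("Service", {})
    # This sketch expects the command to be given as an absolute path.
    if not service.get("ExecStart", "").startswith("/"):
        problems.append("ExecStart should be an absolute path")
    for key in ("User", "Group"):
        if key not in service:
            problems.append(f"missing {key}= in [Service]")
    if "WantedBy" not in unit.get("Install", {}):
        problems.append("missing WantedBy= in [Install]")
    return problems


print(check_ollama_unit(parse_unit(UNIT_TEXT)))
# []
```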

## Ollama Docker image

src:https://hub.docker.com/r/ollama/ollama

1- docker pull ollama/ollama

2- CPU only:

    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

3- Nvidia GPU with Apt

3-1- Configure the repository:

    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
        | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
    curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
        | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
        | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
    sudo apt-get update

3-2- Install the NVIDIA Container Toolkit packages

    sudo apt-get install -y nvidia-container-toolkit

3-3- Configure Docker to use Nvidia driver:

    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker

3-4- Start the container:

    docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

3-5- Run model locally

Now you can run a model:

    docker exec -it ollama ollama run llama2 

The first time you run this command, it downloads the requested model. See [Ollama Models](https://ollama.ai/library) for the available models.
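
Besides the interactive prompt, the container can be queried over HTTP on the published port 11434. The sketch below assumes Ollama's `/api/generate` endpoint with a non-streaming request and a `response` field in the reply. The function names are hypothetical, and `llama2` must already have been pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # port published by `-p 11434:11434`


def build_generate_request(model, prompt, base_url=OLLAMA_URL):
    """Build a POST request for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=120):
    """Send the request and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the reply is one JSON object with a "response" field.
        return json.loads(response.read())["response"]
```

With the container running, `generate("llama2", "Why is the sky blue?")` should return the answer as a string. Generation on CPU can be slow, hence the long default timeout.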

4- To start the container again later, first stop and remove the existing container with the commands below, then repeat step 2 (CPU only) or step 3-4 (GPU):

    sudo docker ps -a
    sudo docker stop ollama
    sudo docker rm ollama

# Latest Method For Ollama Installation with Langchain

[Build your own RAG and run it locally: Langchain + Ollama + Streamlit](https://blog.duy-huynh.com/build-your-own-rag-and-run-them-locally/)


[Install Ollama](https://github.com/jmorganca/ollama/blob/main/docs/linux.md#manual-install)

1- Install Ollama in linux:

	sudo curl -L https://ollama.ai/download/ollama-linux-amd64 -o /usr/bin/ollama
	sudo chmod +x /usr/bin/ollama

2- Create a user for Ollama:

	sudo useradd -r -s /bin/false -m -d /usr/share/ollama ollama

3- Create a service file in /etc/systemd/system/ollama.service:

	[Unit]
	Description=Ollama Service
	After=network-online.target

	[Service]
	ExecStart=/usr/bin/ollama serve
	User=ollama
	Group=ollama
	Restart=always
	RestartSec=3

	[Install]
	WantedBy=default.target

4- Reload systemd and enable the service so that it starts at boot:

	sudo systemctl daemon-reload
	sudo systemctl enable ollama
	
5- Start Ollama using systemd:

	sudo systemctl start ollama


## The Above Steps Did Not Work on Ubuntu

Use the Ollama Docker image instead (src:https://hub.docker.com/r/ollama/ollama)

1- docker pull ollama/ollama

2- CPU only

	docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
	
3- In my setup, using Ollama with the GPU needed at least four 4 GB GPUs (16 GB in total), so I can only use the CPU.

4- Run model locally

	docker exec -it ollama ollama run llama2

6- Create a new environment with Python < 3.12:

	micromamba activate base
	micromamba create -n ollama python=3.11
	micromamba activate ollama
	
7- Build the RAG pipeline (src:https://blog.duy-huynh.com/build-your-own-rag-and-run-them-locally/)

	pip install langchain==0.0.343
	pip install streamlit==1.29.0
	pip install streamlit-chat==0.1.1
	pip install pypdf==3.17.1
	pip install fastembed==0.1.1
	pip install openai==1.3.6
	pip install langchainhub==0.1.14
	pip install chromadb==0.4.18
	pip install watchdog==3.0.0
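
Because these versions are pinned, it can help to confirm that the active environment really has them. The sketch below uses the standard-library `importlib.metadata`. The helper name `find_mismatches` is hypothetical and the `PINS` table simply repeats the list above.

```python
from importlib import metadata

# Pinned versions copied from the pip commands above.
PINS = {
    "langchain": "0.0.343",
    "streamlit": "1.29.0",
    "streamlit-chat": "0.1.1",
    "pypdf": "3.17.1",
    "fastembed": "0.1.1",
    "openai": "1.3.6",
    "langchainhub": "0.1.14",
    "chromadb": "0.4.18",
    "watchdog": "3.0.0",
}


def find_mismatches(pins):
    """Return {package: installed version or None} for each unmet pin."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches
```

Running `find_mismatches(PINS)` inside the `ollama` environment should return an empty dictionary when every pin is satisfied.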

8- Some Commands with Ollama

    sudo docker ps -a # List all containers (running and stopped)
    sudo docker stop ollama # Stop the ollama container
    sudo docker rm ollama # Remove the ollama container

9- How do I clean the memory cache?

    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
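
Writing `3` to `drop_caches` asks the kernel to free the page cache as well as dentries and inodes. To see the effect, you can compare the `Cached` value in `/proc/meminfo` before and after. A minimal sketch with hypothetical helper names:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: value}, sizes being in kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def read_cached_kb(path="/proc/meminfo"):
    """Return the current page cache size in kB (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read()).get("Cached")
```

Call `read_cached_kb()` once before and once after the command above; the second value should be noticeably smaller.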

# Using Streamlit with Ollama and Langchain

Save the code below in a file named `myapp.py`:

In [None]:
# Import Required Libraries
import os
import sys
import tempfile
import textwrap
import streamlit as st
from datetime import datetime
from streamlit_chat import message
from langchain.llms import Ollama #Cohere
from langchain.vectorstores import Chroma, FAISS
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA, LLMChain
from langchain.embeddings import HuggingFaceEmbeddings #CohereEmbeddings
from langchain.memory import ConversationBufferMemory
from langchain.document_loaders import PyMuPDFLoader, DirectoryLoader
from langchain.memory.chat_message_histories import StreamlitChatMessageHistory
from langchain.text_splitter import CharacterTextSplitter,RecursiveCharacterTextSplitter

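# Swap the standard sqlite3 module for pysqlite3, since Chroma needs a recent SQLite
# (requires `pip install pysqlite3-binary`)
__import__("pysqlite3")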
sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")

#--------------------------------------------------------------#
# Setting Up Streamlit Page
st.set_page_config(page_title="Ollama Chatbot", page_icon= "💬")

#--------------------------------------------------------------#
with st.sidebar:
    st.title('💬 OLLAMA Chatbot')
    
    #st.divider()
    # Select the model
    selected_model = st.selectbox('Choose a model', ['Mistral', 'Llama2', 'Code Llama', 'Zephyr'], key='selected_model')
    
    if selected_model == "Mistral":
        llm_model = "mistral"
        st.caption("""
                   The Mistral 7B model released by Mistral AI.
                   """) 
    elif selected_model == "Llama2":
        llm_model = "llama2"
        st.caption("""
                   Llama 2 is released by Meta Platforms, Inc.
                   """)
    elif selected_model == "Zephyr":
        llm_model = "zephyr"
        st.caption("""
                   Zephyr is a 7 billion parameter model, fine-tuned on Mistral to achieve results similar to 
                   Llama 2 70B Chat in several benchmarks.
                   """)
    else:
        llm_model = "codellama"
        st.caption("""
                   Code Llama is a model for generating and discussing code, built on top of Llama 2.
                   """) 
    #st.divider()
    temp_r = st.slider("Temperature", 0.0, 0.9, 0.0, 0.1)
    chunk_size = st.slider("Chunk Size for Splitting Document ", 256, 1024, 400, 10)
    chunk_overlap = st.slider("Chunk Overlap ", 0, 100, 20, 5)
    clear_button = st.button("Clear Conversation", key="clear")

#-----------------------Functions-------------------------------#
# function for loading the embedding model
def load_embedding_model(model_path, normalize_embedding=True):
    return HuggingFaceEmbeddings(
        model_name=model_path,
        #model_kwargs={'device': 'cuda'}, #  you can set model_kwargs={'device': 'cuda:0'} for the first GPU, model_kwargs={'device': 'cuda:1'} for the second GPU, and so on.(src:https://github.com/langchain-ai/langchain/issues/10436)
        model_kwargs={'device':'cpu'}, # here we will run the model with CPU only
        encode_kwargs = {
            'normalize_embeddings': normalize_embedding # keep True to compute cosine similarity
        }
    )

#--------------------------------------------------------------#
# Function for creating embeddings using FAISS
def create_embeddings(chunks, embedding_model, storing_path="vectorstore"):
    # Creating the embeddings using FAISS
    vectorstore = FAISS.from_documents(chunks, embedding_model)
    
    # Saving the model in current directory
    vectorstore.save_local(storing_path)
    
    # returning the vectorstore
    return vectorstore

#--------------------------------------------------------------#
# Creating the chain for Question Answering
def load_qa_chain(retriever, llm, prompt):
    return RetrievalQA.from_chain_type(
        llm=llm,
        retriever=retriever, # here we are using the vectorstore as a retriever
        chain_type="stuff",
        return_source_documents=True, # including source documents in output
        chain_type_kwargs={'prompt': prompt} # customizing the prompt
    )

#--------------------------------------------------------------#
# tabs
tab1, tab2, tab3, tab4 = st.tabs(["💬 Chatbot", "🖹 ChatPDFs", "📈 ChatPandas", "🌍 ChatMaps"])

#--------------------------------------------------------------#
# Chatbot Tab
# with tab1():

#--------------------------------------------------------------#
# ChatPDFs Tab
with tab2:
    # Upload PDF files
    uploaded_PDF_files = st.file_uploader("Upload multiple files", accept_multiple_files=True, type="pdf")

if uploaded_PDF_files:
    with tempfile.TemporaryDirectory() as tmpdir:
        for uploaded_file in uploaded_PDF_files:
            file_name = uploaded_file.name
            file_content = uploaded_file.read()
            st.write("Filename: ", file_name)

            # Write the content of the PDF files to a temporary directory
            with open(os.path.join(tmpdir, file_name), "wb") as file:
                file.write(file_content)

        # Load the PDF files from the temporary directory
        loader = DirectoryLoader(tmpdir, glob="**/*.pdf", loader_cls=PyMuPDFLoader)
        documents = loader.load()

        # Split the PDF files into smaller chunks of text
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
        documents = text_splitter.split_documents(documents)
        embeddings = load_embedding_model(model_path="all-MiniLM-L6-v2")
        vectorstore = Chroma.from_documents(documents, embeddings)
        #vectorstore.persist()
        retriever = vectorstore.as_retriever()

        prompt_template = """ 
        System Prompt:
        You are an AI chatbot that helps users chat with PDF documents. How may I help you today?

        {context}

        {question}
        """
        PROMPT = PromptTemplate(
            template=prompt_template, input_variables=["context", "question"]
        )
        chain_type_kwargs = {"prompt": PROMPT}

        chain = RetrievalQA.from_chain_type(
            llm=Ollama(model=llm_model, temperature=temp_r),
            chain_type="stuff",
            retriever=retriever,
            chain_type_kwargs=chain_type_kwargs,
        )
        # Question-Answer
        # Get the user question
        query = st.text_input("Ask a question:")

        if query:
            response = chain({'query': query})
            # Wrap the answer text at 100 characters per line for display
            wrapped_text = textwrap.fill(response['result'], width=100)
            # Display the question and the answer
            st.markdown(f"**Q:** {query}")
            st.markdown(f"**A:** {wrapped_text}")

#--------------------------------------------------------------#
# ChatPandas Tab
# with tab3():

#--------------------------------------------------------------#
# ChatMaps Tab
# with tab4():


After starting the Ollama Docker container, run the app:

    streamlit run myapp.py
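
The "Chunk Size" and "Chunk Overlap" sliders in the sidebar control how the PDFs are cut up before embedding. The sketch below is a deliberately simplified fixed-window splitter, not the algorithm `RecursiveCharacterTextSplitter` uses (that one prefers to break on separators such as paragraphs and sentences). It only illustrates how the two numbers interact.

```python
def split_with_overlap(text, chunk_size=400, chunk_overlap=20):
    """Cut text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

A larger overlap repeats more text between neighbouring chunks, which can help when an answer spans a chunk boundary, at the cost of more chunks to embed.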