# Loading the Vector Database
1. First, download the DB from [Google Drive](https://drive.google.com/file/d/1L-ejvE8oTeuREYTKLbcDnNX9w0bMOgYZ/view?usp=drive_link)
2. Upload it to Colab
3. Unzip it using `zipfile`
4. Download required libraries
5. Define the query function with OpenAIEmbeddings



### Unzipping the DB into Current Colab Session Storage

In [10]:
import zipfile

with zipfile.ZipFile("chroma_db.zip", "r") as zip_ref:
    zip_ref.extractall("./")
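
The query function defined later expects the database at `./chroma_db`, so it can help to confirm that the archive actually produced that folder. The following is a minimal sketch, assuming the archive contains a top-level `chroma_db/` folder; the helper name `extract_db` and its parameters are hypothetical and not part of the original notebook.

```python
import zipfile
from pathlib import Path


def extract_db(zip_path, target_dir=".", expected_folder="chroma_db"):
    """Extract the archive and confirm the expected DB folder exists.

    `expected_folder` is an assumption about the archive layout.
    """
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(target_dir)

    db_path = Path(target_dir) / expected_folder
    if not db_path.is_dir():
        raise FileNotFoundError(f"{db_path} not found after extracting {zip_path}")
    return db_path
```

In this notebook the call would be `extract_db("chroma_db.zip")`, which fails early with a clear message if the upload or the archive layout is not what the later cells expect.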

### Download Required Libraries

In [16]:
!pip install langchain_community
!pip install langchain_openai
!pip install chromadb
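
A failed install is easy to miss in long pip output. As a quick sanity check, the sketch below uses the standard library to report which of the three packages cannot be imported; the helper name `missing_packages` is hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["langchain_community", "langchain_openai", "chromadb"]
print("Missing:", missing_packages(required))
```

An empty list means all three packages are importable in the current session.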


### Define the Query Function with **OpenAIEmbeddings**
The query function retrieves the top four results and keeps only those with a relevance score of at least 0.7.

In [34]:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings


def get_embedding_function():
    embeddings = OpenAIEmbeddings(openai_api_key="...") # add your openai_api_key here
    return embeddings


def query_db(query):
    # Open the persisted Chroma DB extracted earlier
    db = Chroma(
        persist_directory="./chroma_db",
        embedding_function=get_embedding_function(),
    )
    # Each result is a (Document, relevance_score) pair
    result = db.similarity_search_with_relevance_scores(query, k=4)

    content = ""

    for i, (doc, score) in enumerate(result):
        # Keep only hits with a relevance score of at least 0.7
        if score >= 0.7:
            content += " " + str(i) + "- " + str(doc).split("page_content=")[1]

    return content
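
The threshold logic in `query_db` can be isolated from Chroma and checked on its own. The sketch below assumes each hit is a `(document, score)` pair and reads the text from the document's `page_content` attribute instead of splitting its string representation, so its output omits the quotes and metadata that the string-splitting version carries along. `FakeDoc` and `format_hits` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document; only `page_content` is used."""
    page_content: str


def format_hits(hits, threshold=0.7):
    """Join the text of every hit whose relevance score meets the threshold."""
    parts = []
    for i, (doc, score) in enumerate(hits):
        if score >= threshold:
            parts.append(f" {i}- {doc.page_content}")
    return "".join(parts)


hits = [(FakeDoc("Article (42)"), 0.82), (FakeDoc("Article (7)"), 0.55)]
print(format_hits(hits))
```

Only the first hit passes the 0.7 threshold here. If no hit passes, the function returns an empty string, which matches how `query_db` behaves when nothing relevant is found.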

# Load the Fine-tuned Model


1. Install [unsloth](https://github.com/unslothai/unsloth)
  * Unsloth is an open-source tool that aims to simplify the process of fine-tuning language models
2. Load the model from [Hugging Face](https://huggingface.co/obadabaq/ai_lawyer)
3. Define the prompt structure




### Installing Unsloth

In [27]:
%%capture
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft bitsandbytes

### Loading the Model

In [28]:
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "obadabaq/ai_lawyer",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.


==((====))==  Unsloth: Fast Llama patching release 2024.7
   \\   /|    GPU: Tesla T4. Max memory: 14.748 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.3.1+cu121. CUDA = 7.5. CUDA Toolkit = 12.1.
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.26.post1. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


StevenChen16/llama3-8b-Lawyer does not have a padding token! Will use pad_token = <|reserved_special_token_250|>.


Unsloth 2024.7 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.


### Define the Prompt Structure

In [29]:
prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""
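
The three `{}` placeholders are filled positionally by `str.format`: instruction, input, then response. The sketch below uses a shortened copy of the template to show how a question and retrieved context slot in, with the response left empty so the model completes it. The names `template` and `build_prompt` and the sample strings are hypothetical.

```python
template = """### Instruction:
{}

### Input:
{}

### Response:
{}"""


def build_prompt(instruction, context):
    """Fill the template, leaving the response slot empty for generation."""
    return template.format(instruction, context, "")


example = build_prompt("What is Article 42 about?", "Article (42) Cases of Termination")
print(example)
```

Because the response slot is an empty string, the filled prompt ends right after the `### Response:` line, which is where generation picks up.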

# Trying Our Fine-Tuned Model with RAG and a Contextual Prompt
1. Get the user question
2. Query the DB to get the laws most relevant to this question
3. Use the fine-tuned model to get an answer
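
The three steps above can be sketched as a single function. This is an illustration under stated assumptions, not the notebook's own code: `answer_question` is a hypothetical name, and the retriever and generator are stubbed so the flow can run without the DB or a GPU. In the notebook, `retrieve` would be `query_db` and `generate` would wrap the model call.

```python
def answer_question(question, retrieve, generate, template):
    """Take a question, fetch related context, and generate an answer."""
    context = retrieve(question)  # step 2: query the DB
    full_prompt = template.format(question, context, "")
    return generate(full_prompt)  # step 3: call the model


# Stubs standing in for query_db and the fine-tuned model (assumptions).
def fake_retrieve(question):
    return " 0- some law text"


def fake_generate(full_prompt):
    return full_prompt + "stub answer"


result = answer_question("q?", fake_retrieve, fake_generate, "{}|{}|{}")
print(result)
```

Passing the retriever and generator as arguments keeps each stage replaceable, which makes it easier to test the retrieval step separately from the model.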

### Enter a Legal Question
You can enter a question regarding one of the following topics:
* economy and business
* family and community
* finance and banking
* industry and technical standardisation
* justice and judiciary
* labour, residency and liberal professions
* security and safety
* tax


In [31]:
question = input("Enter your legal question\n")
# for example: what are the rights and obligations of an employer and an employee in the event of termination of an employment contract?

Enter your legal question
what are the rights and obligations of an employer and an employee in the event of termination of an employment contract?


### Querying the DB to Get the Laws Most Relevant to Your Question

In [32]:
content = query_db(question)

### Generate a Response Using the Model and the Vector DB Result

In [35]:
from transformers import TextStreamer

FastLanguageModel.for_inference(model)  # switch Unsloth to its inference mode

# Fill the prompt: instruction = question, input = retrieved laws
inputs = tokenizer(
    [
        prompt.format(
            question,
            content,
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors="pt",
).to("cuda")

# Stream tokens to the cell output as they are generated
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=128)

<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
what are the rights and obligations of an employer and an employee in the event of termination of an employment contract?

### Input:
 0- 'Law Category: labour, residency and leberal professions, Law Name: Federal Decree by Law No. (33) of 2021 Concerning Regulating Labour Relations, Law Content: Article (42) 
Cases of Termination of the Employment Contract  
The employment contract shall be terminated in any of the following cases:  
1. If the parties agree in writing to terminate it.  
2. Upon the expiry of the period specified in the contract unless it is extended or renewed 
in accordance with the provisions of this Decree by law.  
3. At the request of one of the parties, provided that the provisions of this Decree by law 
upon in the contract are abided by.  
4. The death of the Employer if