# Comparing Fine Tuned Models using Ollama - Locally

## I used three fine-tuned models for this analysis. The models have also been deployed on Hugging Face

- ### [Phi3](https://huggingface.co/abhi7991/promptFineTuning)
- ### [Llama3 - 8b](https://huggingface.co/abhi7991/promptfinetuning-llama3)
- ### [Llama3 - 8b (Developed by Neo4j)](https://huggingface.co/collections/tomasonjo/llama3-text2cypher-demo-6647a9eae51e5310c9cfddcf)

I fine-tuned Phi3 and Llama3 (8b) to convert natural language questions into Cypher queries, which can be used to query Neo4j knowledge graphs.
Combined with Neo4jGraph, the generated Cypher queries can be executed against the graph and the results returned in natural language.

**Refer to the previous notebooks to see how the models were fine-tuned, for deeper insight into the process, and for the challenges faced along the way.**

# My Fine tuned model

This model is Llama3 fine-tuned on my custom dataset so that it produces more accurate Cypher queries.

In [1]:

import os
import time

from dotenv import load_dotenv
from langchain_community.chat_models import ChatOllama
from langchain_community.graphs import Neo4jGraph
from langchain_core.prompts import ChatPromptTemplate

# Load the NEO4J_* settings from a local .env file before reading them.
load_dotenv()

DEMO_URL = os.getenv('NEO4J_URI')
DATABASE = os.getenv('NEO4J_DATABASE')
user = os.getenv('NEO4J_USER')
# Read the password from the environment instead of hard-coding it
# (NEO4J_PASSWORD is an assumed variable name; use whatever your .env defines).
password = os.getenv('NEO4J_PASSWORD')

graph = Neo4jGraph(
    url=DEMO_URL,
    database=DATABASE,
    username=user,
    password=password,
    enhanced_schema=True,
    sanitize=True,
)
llm = ChatOllama(model="text2cypher-llama3", request_timeout=60.0)
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Given an input question, convert it to a Cypher query. No pre-amble.",
        ),
        (
            "human",
            (
                "Based on the Neo4j graph schema below, write a Cypher query that would answer the user's question: "
                "\n{schema} \nQuestion: {question} \nCypher query:"
            ),
        ),
    ]
)
chain = prompt | llm


questions = [
    "What are the various countries products get shipped to?",
    "How many different Brands are there in the supply chain",
    "Give me the count of products handled by each office?",
]
for i, question in enumerate(questions, start=1):
    # Time the full round trip: prompt construction, generation and printing.
    t0 = time.time()
    response = chain.invoke({"question": question, "schema": graph.schema})
    print(response.content)
    total = time.time() - t0
    print("Response Time for Question " + str(i) + ": ", total)

MATCH (p:Product)-[w:WEIGHT]-(c:Country)
RETURN c.name AS Country, COUNT(w) AS ProductCount
ORDER BY ProductCount DESC;
Response Time for Question 1:  380.12241530418396
MATCH (p:Product)-[:BRAND]->(b:Brand)
RETURN b.name, count(b.name) AS BrandCount
GROUP BY b.name
ORDER BY BrandCount DESC;
Response Time for Question 2:  304.29602813720703
MATCH (o:Office)-[:MANAGED_BY]->(p:Product)
RETURN o.name, count(p) AS product_count
ORDER BY product_count DESC;
Response Time for Question 3:  262.7514851093292
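
The queries printed above are plain text; to get actual results they still have to be run against the graph. The following is a minimal sketch of that step, assuming the graph object exposes a `query(cypher)` method the way `Neo4jGraph` does. `clean_cypher` and `run_generated_query` are hypothetical helper names, not part of LangChain, and the clean-up rules (dropping Markdown fences and a trailing semicolon) are assumptions about what a model might emit.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker a model may wrap its answer in


def clean_cypher(text: str) -> str:
    """Strip Markdown fences, surrounding whitespace and trailing semicolons."""
    text = text.strip()
    # Drop an opening fence line (optionally tagged, e.g. "cypher").
    text = re.sub(r"^" + re.escape(FENCE) + r"[a-zA-Z]*\s*\n", "", text)
    # Drop a closing fence line.
    text = re.sub(r"\n" + re.escape(FENCE) + r"\s*$", "", text)
    return text.strip().rstrip(";").strip()


def run_generated_query(graph, generated: str):
    """Clean the model output and execute it via graph.query()."""
    cypher = clean_cypher(generated)
    return graph.query(cypher)
```

In the notebook this would be called as `run_generated_query(graph, response.content)` inside the loop. A query that is not valid Cypher will still raise an error from the database, so wrapping the call in a `try`/`except` is worth considering.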


# Neo4j Data Scientist Fine tuned model

The data scientists at Neo4j have fine-tuned their own Llama3 model on a larger dataset. The cell below runs the same three questions against it.


In [None]:

import os
import time

from dotenv import load_dotenv
from langchain_community.chat_models import ChatOllama
from langchain_community.graphs import Neo4jGraph
from langchain_core.prompts import ChatPromptTemplate

# Load the NEO4J_* settings from a local .env file before reading them.
load_dotenv()

DEMO_URL = os.getenv('NEO4J_URI')
DATABASE = os.getenv('NEO4J_DATABASE')
user = os.getenv('NEO4J_USER')
# Read the password from the environment instead of hard-coding it
# (NEO4J_PASSWORD is an assumed variable name; use whatever your .env defines).
password = os.getenv('NEO4J_PASSWORD')

graph = Neo4jGraph(
    url=DEMO_URL,
    database=DATABASE,
    username=user,
    password=password,
    enhanced_schema=True,
    sanitize=True,
)
llm = ChatOllama(model="tomasonjo/llama3-text2cypher-demo", request_timeout=60.0)
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Given an input question, convert it to a Cypher query. No pre-amble.",
        ),
        (
            "human",
            (
                "Based on the Neo4j graph schema below, write a Cypher query that would answer the user's question: "
                "\n{schema} \nQuestion: {question} \nCypher query:"
            ),
        ),
    ]
)
chain = prompt | llm

questions = [
    "What are the various countries products get shipped to?",
    "How many different Brands are there in the supply chain",
    "Give me the count of products handled by each office?",
]
for i, question in enumerate(questions, start=1):
    # Time the full round trip: prompt construction, generation and printing.
    t0 = time.time()
    response = chain.invoke({"question": question, "schema": graph.schema})
    print(response.content)
    total = time.time() - t0
    print("Response Time for Question " + str(i) + ": ", total)

# Final Thoughts

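Looking at the model's output and latency, the generated queries were largely plausible, but each one took several minutes to produce (roughly 260 to 380 seconds per question). Note that the second query uses `GROUP BY`, which is not a Cypher clause, so it would need correcting before it could run.

It's essential to understand the factors that influence Ollama's performance: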

- **Hardware capabilities (CPU, RAM, GPU)**
- **Model size and complexity**
- **Quantization level**
- **Context window size**
- **System configuration and settings**
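
To make the model-size and quantization factors concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone occupy at different bit widths. The function name `estimate_weights_gib` is hypothetical, the 8e9 parameter count is the nominal size of an 8b model, and the estimate ignores the KV cache, activations and runtime overhead, so real usage will be higher.

```python
def estimate_weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


# Nominal 8b model at common precision / quantization levels.
for bits in (16, 8, 4):
    size = estimate_weights_gib(8e9, bits)
    print(f"{bits:>2}-bit weights: {size:.1f} GiB")
```

Even at 4 bits per weight the weights come to roughly 3.7 GiB, which is close to the total memory of a 4 GiB card before any context is allocated.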

## Key Takeaways 
- Model size is an important parameter to consider. I used Llama3, which is fairly large, and this likely contributed to the latency.
  
- A suitable GPU was not configured in my local environment; one would have sped up the process. The `nvidia-smi` output for my machine was as follows:

```
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.76                 Driver Version: 560.76         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1650      WDDM  |   00000000:01:00.0  On |                  N/A |
| N/A   59C    P0             15W /   50W |    3578MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
```
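
As a small illustration of how to read the table above, the sketch below pulls the used and total memory out of the `Memory-Usage` column and computes what is left. `parse_memory_usage` is a hypothetical helper, and the regular expression assumes the `usedMiB / totalMiB` layout shown in this particular `nvidia-smi` output.

```python
import re


def parse_memory_usage(smi_row: str) -> tuple[int, int]:
    """Return (used_mib, total_mib) from an nvidia-smi table row."""
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_row)
    if match is None:
        raise ValueError("no 'usedMiB / totalMiB' field found")
    return int(match.group(1)), int(match.group(2))


# Row copied from the nvidia-smi output above.
row = "| N/A   59C    P0             15W /   50W |    3578MiB /   4096MiB |      0%      Default |"
used, total = parse_memory_usage(row)
print(f"free VRAM: {total - used} MiB of {total} MiB")
```

With only about 518 MiB free and 0% GPU utilization in this snapshot, there was little room for an 8b model on the card, which is consistent with the slow response times, although a single snapshot does not prove where inference actually ran.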