<img width="8%" alt="LangChain.png" src="https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/.github/assets/logos/LangChain.png" style="border-radius: 15%">

# QA - Comparative Analysis

**Tags:** #neo4j #abi #knowledgegraph

**Author:** [Florent Ravenel](https://www.linkedin.com/in/florent-ravenel)

**Last update:** 2024-04-08 (Created: 2024-03-25)

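**Description:** This notebook runs a set of test questions against three approaches (a Naas chat assistant driven by its system prompt, vector search over Neo4j, and a Cypher QA chain over the ABI Knowledge Graph in Neo4j) and stores the answers side by side for comparison.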

## Input

### Import libraries

In [1]:
import naas
import pandas as pd
from naas_drivers import gsheet
import time
import os
import requests
import json
try:
    from langchain.graphs import Neo4jGraph
    from langchain.vectorstores.neo4j_vector import Neo4jVector
    from langchain.chains import RetrievalQA, GraphCypherQAChain
    from langchain.agents import initialize_agent, Tool
    from langchain.agents import AgentType
except ImportError:
    !pip install langchain==0.1.13 --user
    from langchain.graphs import Neo4jGraph
    from langchain.vectorstores.neo4j_vector import Neo4jVector
    from langchain.chains import RetrievalQA, GraphCypherQAChain
    from langchain.agents import initialize_agent, Tool
    from langchain.agents import AgentType
try:
    from langchain_openai import OpenAIEmbeddings
    from langchain_openai import ChatOpenAI
except ImportError:
    !pip install langchain-openai==0.1.1 --user
    from langchain_openai import OpenAIEmbeddings
    from langchain_openai import ChatOpenAI

### Setup variables

In [2]:
# Inputs
spreadsheet_url = pload(os.path.join(naas_data_product.OUTPUTS_PATH, "entities", entity_index), "abi_spreadsheet") or ""
sheet_posts = "DATASET_POSTS"
sheet_qa = "QA"
api_key = os.environ.get("NAAS_API_TOKEN") or naas.secret.get('NAAS_API_TOKEN')
os.environ['OPENAI_API_KEY'] = naas.secret.get("OPENAI_API_KEY")
url = "neo4j+s://d3eeda32.databases.neo4j.io:7687"
username = naas.secret.get("NEO4J_USERNAME")
password = naas.secret.get("NEO4J_PASSWORD")
slug = "@content-assistant/jérémy-ravenel-test"
plugin_id = "9c90b51d-7745-4746-8d0e-b4b152700d59"

# Outputs
sheet_result = "QA_RESULT"

## Model

### Get QA

In [3]:
df_qa = gsheet.connect(spreadsheet_url).get(sheet_name=sheet_qa)
if not isinstance(df_qa, pd.DataFrame):
    df_qa = pd.DataFrame()
print("- QA:", len(df_qa))
question_test = df_qa.loc[0, "QUESTION"]
print("-> Question (test):", question_test)
df_qa.head(1)

- QA: 50
-> Question (test): What was the last post title and url?


| | DATABASE | ASSISTANT | SCOPE | CATEGORY | QUESTION | ANSWER |
|---|---|---|---|---|---|---|
| 0 | POSTS | content-assistant | Last post | Retrieval | What was the last post title and url? | The last post was about ___ |


### Create Chat completion

In [4]:
def create_chat_completion(
    api_key,
    chat_id,
    model_id,
    message,
    temperature,
    plugin_id
):
    url = f"https://api.naas.ai/chat/{chat_id}/completion"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    data = json.dumps(
        {
            "id": chat_id,
            "model_id": model_id,
            "plugin_id": plugin_id,
            "payload": json.dumps(
                {
                    "prompt": message,
                    "temperature": temperature,
                }
            )
        }
    )
    response = requests.post(url, headers=headers, data=data)
    return response.json()

def create_chat(
    api_key,
    chat_name,
):
    url = "https://api.naas.ai/chat"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    data = {
        "name": chat_name,
    }
    response = requests.post(url, headers=headers, json=data)
    return response.json()

chat_name = "Test Content Assistant"
result = create_chat(api_key, chat_name)
chat_id = result.get("chat").get("id")

# Test system prompt
prompt_result = ""
try:
    model_id = "c6f0d70f-faa4-492f-81b7-4b6aba79e227"
    message = f"{slug} {question_test}"  # slug already starts with "@content-assistant/"
    temperature = 0.5
    result = create_chat_completion(api_key, chat_id, model_id, message, temperature, plugin_id)
    if len(result.get("completion")) > 0:
        prompt_result = result.get("completion").get("messages")[1].get("message")
except Exception as e:
    prompt_result = e
print("Prompt result:", prompt_result)

Prompt result: Hey there! The last post by Jérémy Ravenel was titled:

"AI agentic workflows will drive massive AI progress" said Andrew NG in Sequoia Capital's AI Ascent event last week. So today, I wanted to document how we approach things at Naas to make this work for everyday business.

You can check it out at the following URL: [Jérémy's LinkedIn Post](https://www.linkedin.com/feed/update/urn:li:activity:7181006169149190144)

If you need more insights or a detailed analysis of Jérémy's content strategy, feel free to ask!
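
The cell above reads the answer with `result.get("completion").get("messages")[1].get("message")`, which raises if any level is missing and relies on the surrounding `try/except` to recover. A minimal sketch of a more defensive reader follows. The helper name `extract_completion_message` is hypothetical, and the response shape is only inferred from how the notebook indexes it, not taken from the API documentation.

```python
from typing import Any, Optional


def extract_completion_message(result: Any, index: int = 1) -> Optional[str]:
    """Return result["completion"]["messages"][index]["message"], or None if any level is missing."""
    completion = result.get("completion") if isinstance(result, dict) else None
    if not isinstance(completion, dict):
        return None
    messages = completion.get("messages")
    if not isinstance(messages, list) or len(messages) <= index:
        return None
    entry = messages[index]
    return entry.get("message") if isinstance(entry, dict) else None


# Hand-written payloads mirroring the shape the notebook reads (assumption, not a real response).
sample = {"completion": {"messages": [{"message": "question"}, {"message": "answer"}]}}
print(extract_completion_message(sample))
print(extract_completion_message({"error": "unauthorized"}))
```

Returning `None` instead of raising keeps "no answer" distinguishable from a real exception when the results are later written to the spreadsheet.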


### Create Vector Index & Retrieval QA

In [5]:
df_posts = gsheet.connect(spreadsheet_url).get(sheet_name=sheet_posts)
df_posts.head(1)

vector_index = Neo4jVector.from_existing_graph(
    OpenAIEmbeddings(),
    url=url,
    username=username,
    password=password,
    index_name='content',
    node_label="Content",
    text_node_properties=list(df_posts.columns.str.lower()),
    embedding_node_property='embedding',
)

vector_qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=vector_index.as_retriever()
)

# Test vector search result
vector_result = ""
try:
    vector_qa_result = vector_qa.invoke(question_test)
    if len(vector_qa_result) > 0:
        vector_result = vector_qa_result.get("result")
except Exception as e:
    vector_result = e
print("Vector Result:", vector_result)

Vector Result: The last post title was "The best part of AI conferences in Paris is probably the beautiful building ceilings 😅" and the URL is https://www.linkedin.com/feed/update/urn:li:activity:7183130611279134721.
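
The retriever returns the content whose embedding is closest to the question, and nothing in that ranking looks at the publication date. This may be why a question about the "last" post can surface a post that is not the most recent one. The toy sketch below illustrates the effect with hand-made 2-D vectors. The posts, dates, and embeddings are hypothetical, and this is not how `Neo4jVector` is implemented internally.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(question_vec, docs, k=1):
    """Rank documents by similarity to the question only; dates are ignored."""
    ranked = sorted(docs, key=lambda d: cosine(question_vec, d["embedding"]), reverse=True)
    return ranked[:k]


# Hypothetical posts with made-up embeddings.
posts = [
    {"title": "older post", "published_date": "2024-04-01", "embedding": [0.9, 0.1]},
    {"title": "newer post", "published_date": "2024-04-20", "embedding": [0.2, 0.8]},
]
question_embedding = [1.0, 0.0]

best = top_k(question_embedding, posts)[0]
latest = max(posts, key=lambda d: d["published_date"])
print("Most similar:", best["title"])
print("Most recent:", latest["title"])
```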


### GraphCypherQAChain

In [6]:
graph = Neo4jGraph(
    url=url, 
    username=username, 
    password=password
)

graph.refresh_schema()

cypher_chain = GraphCypherQAChain.from_llm(
    cypher_llm=ChatOpenAI(temperature=0, model_name='gpt-4'),
    qa_llm=ChatOpenAI(temperature=0),
    graph=graph,
    verbose=True,
)

# Test KG result
kg_result = ""
try:
    cypher_chain_result = cypher_chain.invoke(question_test)
    if len(cypher_chain_result) > 0:
        kg_result = cypher_chain_result.get("result")
except Exception as e:
    kg_result = e
print("KG Result:", kg_result)



> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (c:Content) RETURN c.title, c.url ORDER BY c.published_date DESC LIMIT 1
Full Context:
[{'c.title': 'How could Knowledge Graph and Vector Search work together? ', 'c.url': 'https://www.linkedin.com/feed/update/urn:li:activity:7189726422926524416'}]

> Finished chain.
KG Result: The last post title was "How could Knowledge Graph and Vector Search work together?" and the URL is https://www.linkedin.com/feed/update/urn:li:activity:7189726422926524416.
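
With `verbose=True` the generated Cypher is only printed, so it is lost once the cell output is cleared. The sketch below assumes the chain is instead built with `return_intermediate_steps=True` and that the result then carries an `"intermediate_steps"` list of dicts holding `"query"` and `"context"` entries. That layout is an assumption worth checking against the installed LangChain version, and the helper name `extract_generated_cypher` is hypothetical.

```python
from typing import Any, Dict, Optional


def extract_generated_cypher(chain_result: Dict[str, Any]) -> Optional[str]:
    """Return the first 'query' entry found in chain_result['intermediate_steps'], if any."""
    for step in chain_result.get("intermediate_steps") or []:
        if isinstance(step, dict) and "query" in step:
            return step["query"]
    return None


# Hand-written stand-in for a chain result; the "intermediate_steps" layout is an assumption.
fake_result = {
    "result": "The last post title was ...",
    "intermediate_steps": [
        {"query": "MATCH (c:Content) RETURN c.title, c.url ORDER BY c.published_date DESC LIMIT 1"},
        {"context": [{"c.title": "...", "c.url": "..."}]},
    ],
}
print(extract_generated_cypher(fake_result))
```

Keeping the query next to the answer makes it easier to tell whether a wrong answer comes from the generated Cypher or from the QA step that summarizes the context.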


## Output

### Test Question

In [7]:
df_output = df_qa.copy()

for index, row in df_output.iterrows():
    question = row["QUESTION"]
    print("Question:", question)
    
    # Test system prompt
    prompt_result = ""
    try:
        model_id = "c6f0d70f-faa4-492f-81b7-4b6aba79e227"
        message = f"{slug} {question}"
        temperature = 0.5
        result = create_chat_completion(api_key, chat_id, model_id, message, temperature, plugin_id)
        if len(result.get("completion")) > 0:
            prompt_result = result.get("completion").get("messages")[1].get("message")
        time.sleep(3)
    except Exception as e:
        prompt_result = e
    print("Prompt Result:", prompt_result)
    print()
    
    # Test vector search result
    vector_result = ""
    try:
        vector_qa_result = vector_qa.invoke(question)
        if len(vector_qa_result) > 0:
            vector_result = vector_qa_result.get("result")
    except Exception as e:
        vector_result = e
    print("Vector Result:", vector_result)
    print()
    
    # Test KG result
    kg_result = ""
    try:
        cypher_chain_result = cypher_chain.invoke(question)
        if len(cypher_chain_result) > 0:
            kg_result = cypher_chain_result.get("result")
    except Exception as e:
        kg_result = e
    print("KG Result:", kg_result)  
    print()

    df_output.loc[index, "SYSTEM_PROMPT"] = prompt_result
    df_output.loc[index, "VECTOR_SEARCH"] = vector_result
    df_output.loc[index, "KNOWLEDGE_GRAPH"] = kg_result
    print()
    
df_output

Question: What was the last post title and url?
Prompt Result: Hello! The latest post by Jérémy Ravenel is titled:

"AI agentic workflows will drive massive AI progress" said Andrew NG in Sequoia Capital's AI Ascent event last week. So today, I wanted to document how we approach things at Naas to make this work for everyday business.

You can view the post at the following URL: [Jérémy's LinkedIn Post](https://www.linkedin.com/feed/update/urn:li:activity:7181006169149190144)

Ready for more insights or any other assistance with the content? Just let me know!

Vector Result: The last post title was: "The best part of AI conferences in Paris is probably the beautiful building ceilings 😅" and the URL is: [https://www.linkedin.com/feed/update/urn:li:activity:7183130611279134721](https://www.linkedin.com/feed/update/urn:li:activity:7183130611279134721)



> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (c:Content) RETURN c.title, c.url ORDER BY c.publ

| | DATABASE | ASSISTANT | SCOPE | CATEGORY | QUESTION | ANSWER | SYSTEM_PROMPT | VECTOR_SEARCH | KNOWLEDGE_GRAPH |
|---|---|---|---|---|---|---|---|---|---|
| 0 | POSTS | content-assistant | Last post | Retrieval | What was the last post title and url? | The last post was about ___ | Hello! The latest post by Jérémy Ravenel is ti... | The last post title was: "The best part of AI ... | The last post title was "How could Knowledge G... |
| 1 | POSTS | content-assistant | Last post | Retrieval | When was the last post published ? | The last post was published on ___ | Jérémy Ravenel's last post was published on Ap... | The last post was published on April 19, 2024,... | The last post was published on April 26, 2024 ... |
| 2 | POSTS | content-assistant | Last post | Performance | How many views did the last post? | The last post received ___ views. | The last post by Jérémy Ravenel had 0 views. I... | The last post by Jérémy Ravenel, published on ... | The last post titled "How could Knowledge Grap... |
| 3 | POSTS | content-assistant | Last post | Performance | How many likes did the last post? | The last post received ___ likes. | The last post by Jérémy Ravenel received 53 li... | The last post by Jérémy Ravenel received 135 l... | The last post received 133 likes. |
| 4 | POSTS | content-assistant | Last post | Performance | How many comments did the last post? | The last post received ___ comments. | Jérémy Ravenel's last post received 6 comments... | The last post by Jérémy Ravenel had 27 comments. | The last post received 27 comments. |
| 5 | POSTS | content-assistant | Last post | Performance | How many shares did the last post? | The last post received ___ shares. | The last post by Jérémy Ravenel had 5 shares. ... | The last post by Jérémy Ravenel, titled "Let’s... | The last post had 5 shares. |
| 6 | POSTS | content-assistant | Last post | Performance | How many engagements did the last post? | The last post received ___ engagements. | Jérémy Ravenel's last post received a total of... | The last post by Jérémy Ravenel received 112 e... | The last post had 165 engagements. |
| 7 | POSTS | content-assistant | Last post | Performance | What is the engagement score of the last post? | The engagement score of the last post is _____... | The engagement score of Jérémy Ravenel's last ... | The engagement score of the last post by Jérém... | The engagement score of the last post is 0.0145. |
| 8 | POSTS | content-assistant | Last post | Performance | What is the contribution of likes, comments, a... | The contributions of likes, comments, and shar... | For Jérémy Ravenel's last post, here's how lik... | In the last post by Jérémy Ravenel, the engage... | Likes: 133, Comments: 27, Shares: 5, Total Eng... |
| 9 | POSTS | content-assistant | Last post | Analysis | What were the concepts discussed in the last p... | The topics or concepts discussed in the last p... | In Jérémy Ravenel's last post, the discussed c... | The concepts discussed in the last post by Jér... | Knowledge Graph and Vector Search were the con... |
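
The loop in this cell repeats the same `try/except` pattern for each chain and stores the raw exception object on failure, which may not serialize cleanly when the DataFrame is sent to the spreadsheet. One possible refactoring is a small wrapper that always returns a string. The name `safe_answer` and the `ERROR:` prefix are illustrative choices, not part of the original notebook.

```python
from typing import Any, Callable


def safe_answer(invoke: Callable[[str], Any], question: str, key: str = "result") -> str:
    """Call invoke(question) and return output[key] as text, or an error string on failure."""
    try:
        output = invoke(question)
    except Exception as error:
        return f"ERROR: {error}"
    if isinstance(output, dict):
        return str(output.get(key, ""))
    return ""


# Stand-in callables so the sketch runs without Neo4j or OpenAI access.
def fake_chain(question: str) -> dict:
    return {"query": question, "result": f"answer to: {question}"}


def failing_chain(question: str) -> dict:
    raise RuntimeError("connection refused")


print(safe_answer(fake_chain, "What was the last post title and url?"))
print(safe_answer(failing_chain, "What was the last post title and url?"))
```

In the notebook this would be called as `safe_answer(vector_qa.invoke, question)` and `safe_answer(cypher_chain.invoke, question)`.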


### Send data to spreadsheet

In [8]:
gsheet.connect(spreadsheet_url).send(data=df_output, sheet_name=sheet_result, append=False)

{'insertedRow': 50}
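
Once the three answers sit side by side, a quick first check is whether they point at the same post. The sketch below pulls the LinkedIn activity IDs out of each answer and compares them, using the URLs from the first test question. The helper names are hypothetical, and the check only applies to answers that contain a post URL.

```python
import re

# Matches the numeric ID in LinkedIn URNs such as "urn:li:activity:7181006169149190144".
ACTIVITY_RE = re.compile(r"urn:li:activity:(\d+)")


def activity_ids(text) -> set:
    """Return the set of LinkedIn activity IDs mentioned in an answer."""
    return set(ACTIVITY_RE.findall(str(text)))


def methods_agree(*answers) -> bool:
    """True only if every answer cites at least one post and all cite the same set of posts."""
    ids = [activity_ids(answer) for answer in answers]
    return all(ids) and all(found == ids[0] for found in ids)


# URLs copied from the outputs shown for the first question.
system_prompt = "https://www.linkedin.com/feed/update/urn:li:activity:7181006169149190144"
vector_search = "https://www.linkedin.com/feed/update/urn:li:activity:7183130611279134721"
knowledge_graph = "https://www.linkedin.com/feed/update/urn:li:activity:7189726422926524416"

print(methods_agree(system_prompt, vector_search, knowledge_graph))
```

For the first question the three approaches cite three different posts, so the check reports a disagreement.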