# Import Libraries

In [1]:
import os
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI

# Set up the Environment Variable

In [2]:
load_dotenv()

api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-proj-') and len(api_key) > 10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key. Please check the key")


API key looks good so far


In [3]:
MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'

In [16]:
question = """
Please explain what this code does, and why:
with driver.session() as session:
    for index, row in data.iterrows():
        # Merge Person node with relationships to category and year
        session.run(
            '''
            MERGE (p:Person {name: $name})
            ON CREATE SET p.startYear = $startYear
            ''',
            name=row['subject'],
            startYear=row['startYear']
        )
Respond in markdown
"""

In [18]:
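openai = OpenAI()

def gpt_model(question):
    """Stream the model's answer into a Markdown display that updates as chunks arrive."""
    stream = openai.chat.completions.create(
        model=MODEL_GPT,
        messages=[
            {"role": "user", "content": question}
        ],
        stream=True
    )
    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        # Skip chunks that carry no choices; a delta without text contributes ''
        if not chunk.choices:
            continue
        response += chunk.choices[0].delta.content or ''
        # Strip code fences and the word "markdown" so a fence-wrapped reply renders as Markdown
        response = response.replace("```","").replace("markdown", "")
        update_display(Markdown(response), display_id=display_handle.display_id)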
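
The function above removes every code fence and every occurrence of the word "markdown" from the reply, so fences around code samples inside the answer disappear as well. A narrower alternative is sketched below, assuming the model wraps its whole reply in a single fence labelled `markdown`. The helper name `strip_outer_fence` and the sample text are hypothetical, not part of the notebook.

```python
FENCE = "`" * 3  # three backticks, built this way so the example is easy to embed

def strip_outer_fence(text: str) -> str:
    """Remove one markdown-labelled fence that wraps the entire reply, if present."""
    lines = text.strip().splitlines()
    if lines and lines[0].strip().lower() in (FENCE + "markdown", FENCE + "md"):
        lines = lines[1:]
        # Drop the closing fence only when it is the final line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
    return "\n".join(lines)

# Made-up reply: an outer markdown fence around text that contains a Python fence
wrapped = "\n".join([
    FENCE + "markdown",
    "Respond in markdown:",
    FENCE + "python",
    "print('hi')",
    FENCE,
    FENCE,
])
print(strip_outer_fence(wrapped))
```

Only the wrapper is removed here, so the inner Python fence and the word "markdown" in the text survive. The helper is meant for the complete reply; applied to a partial stream it may briefly drop an inner closing fence until more text arrives.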

In [19]:
gpt_model(question)

Certainly! Let's break down the provided code snippet, which appears to be part of a Python script using the Neo4j graph database through the `neo4j` driver.

### Explanation of the Code

python
with driver.session() as session:
    for index, row in data.iterrows():
        # Merge Person node with relationships to category and year
        session.run(
            """
            MERGE (p:Person {name: $name})
            ON CREATE SET p.startYear = $startYear
            """,
            name=row['subject'],
            startYear=row['startYear']
        )


#### Key Elements of the Code

1. **Driver Session Initialization**:
   - `with driver.session() as session:`: This line initializes a session with the Neo4j database. The `with` statement ensures that the session is properly closed after exiting the block, handling resource management automatically.

2. **Iterate Over Data**:
   - `for index, row in data.iterrows():`: This loop iterates over each row of a DataFrame called `data`. The `iterrows()` method returns both the index and the data of each row.

3. **Merge Operation**:
   - `session.run(...)`: This command is used to execute a Cypher query against the Neo4j database.

4. **Cypher Query**:
   - Inside `session.run(...)`, the Cypher query performs the following actions:
     - `MERGE (p:Person {name: $name})`: This part of the query attempts to find a node with the label `Person` and the property `name` equal to the value of `$name`. If such a node does not exist, it will create one.
     - `ON CREATE SET p.startYear = $startYear`: If a new `Person` node is created (i.e., it did not exist before), this clause sets the `startYear` property of that node to the value of `$startYear`.

5. **Parameters**:
   - `name=row['subject']`: This is a parameter mapping where the `name` variable is set to the value of the 'subject' column of the current row.
   - `startYear=row['startYear']`: Similarly, this sets the `startYear` variable to the value of the 'startYear' column of the current row.

### Purpose of the Code

The purpose of this code is to ensure that a `Person` node corresponding to each entry (or row) in the DataFrame `data` is either created or found in the Neo4j database. If a `Person` node with the specified name already exists, no new node is created; instead, it simply uses the existing one. If the node is created in that iteration, the `startYear` property is assigned immediately.

This approach is useful in graph databases to avoid duplicate nodes and ensure that each person has relevant data associated with them. It can be applied in scenarios like importing data, managing relationships, or setting up a knowledge graph.

### Summary
In summary, this code efficiently handles the merging of `Person` nodes into a Neo4j graph database, ensuring nodes are created only if they do not exist while assigning related properties at the same time.
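
The match-or-create behaviour described above can be illustrated without a database. The sketch below is only an in-memory stand-in, assuming a plain dict keyed by name plays the role of the `Person` nodes. The helper `merge_person` and the sample rows are made up for illustration and are not part of the Neo4j driver.

```python
def merge_person(graph: dict, name: str, start_year: int) -> bool:
    """Mimic MERGE (p:Person {name: $name}) ON CREATE SET p.startYear = $startYear.

    Returns True when a node was created and False when an existing one matched.
    """
    if name in graph:
        # Match: ON CREATE does not fire, so existing properties are left alone
        return False
    # No match: create the node and set startYear at creation time
    graph[name] = {"name": name, "startYear": start_year}
    return True

graph = {}  # hypothetical stand-in for the database
rows = [("Ada", 1990), ("Grace", 1985), ("Ada", 2001)]  # made-up sample rows
created = [merge_person(graph, name, year) for name, year in rows]

print(created)                     # [True, True, False]
print(graph["Ada"]["startYear"])   # 1990
```

The third row matches the existing node, so no duplicate is created and `startYear` keeps its first value. Updating the property on a match would need an additional `ON MATCH SET` clause in the Cypher query.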

# Using Llama

In [30]:
# prompts

system_prompt = "You are a helpful technical tutor who answers questions about python code, software engineering, data science and LLMs"
user_prompt = "Please give a detailed explanation to the following question: " + question


In [31]:
# messages

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]

In [32]:
payload = {
    "model": MODEL_LLAMA, 
    "messages": messages, 
    "stream": True
}
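
The payload above asks for a streamed reply. A minimal sketch of how streamed chunks could be joined into one string follows, assuming each chunk can be indexed as `chunk["message"]["content"]`. The helper name `collect_stream` and the stand-in chunks are hypothetical and let the sketch run without a local Ollama server.

```python
from typing import Any, Iterable

def collect_stream(chunks: Iterable[Any]) -> str:
    """Concatenate the text fragments of a streamed chat reply."""
    reply = ""
    for chunk in chunks:
        # Each chunk is assumed to expose its text at chunk["message"]["content"]
        reply += chunk["message"]["content"] or ""
    return reply

# Stand-in chunks shaped like the assumed stream format
fake_chunks = [
    {"message": {"content": "Hello"}},
    {"message": {"content": ", "}},
    {"message": {"content": "world"}},
    {"message": {"content": ""}, "done": True},
]
print(collect_stream(fake_chunks))  # Hello, world
```

With a running Ollama server, the same function could be fed `ollama.chat(model=MODEL_LLAMA, messages=messages, stream=True)`, which yields chunks one at a time instead of returning a single response.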

In [35]:
import ollama

# Non-streaming call: ollama.chat returns the complete reply in one response
response = ollama.chat(model=MODEL_LLAMA, messages=messages)

# The generated text sits under message -> content
reply = response['message']['content']
display(Markdown(reply))



**Code Explanation**
=====================

This code snippet appears to be written in Cypher, a query language used for graph databases such as Neo4j. It is designed to merge data from a pandas DataFrame (`data`) into a Neo4j database using the Neo4j Python driver.

**Breakdown of the Code**
-------------------------

Here's a step-by-step explanation:

1. **Session Creation**: The code starts by creating a Neo4j session using `driver.session()`. This session is used to execute Cypher queries against the graph database.
2. **Iteration over DataFrame Rows**: The `data.iterrows()` method iterates over each row in the pandas DataFrame. Each row represents a single data point, which will be merged into the graph database.

**Cypher Query Explanation**
-----------------------------

 Inside the loop, a Cypher query is executed using `session.run()`. This query performs a **merge operation**, which creates a new node (Person) if it doesn't already exist in the database. If it does exist, Neo4j will update its properties instead.

The merge query has several components:

* `p:Person {name: $name}`: Creates a new Person node with the name from the DataFrame row.
* `ON CREATE SET p.startYear = $startYear`: If the node is created (i.e., it didn't exist in the database), sets its start year property to the value from the DataFrame row.

The query also passes two variables:

* `$name` and `$startYear`, which are placeholders for the actual values from the DataFrame rows. These values will be replaced with the corresponding values when the query is executed.

**Example Use Case**
--------------------

This code snippet could be used in a data integration pipeline, where it merges data from an external source (e.g., a CSV file or another database) into a Neo4j graph database. The resulting graph database can then be queried and analyzed using Cypher queries.

**Why this Code?**
-----------------

The use of graph databases like Neo4j allows for efficient storage and querying of complex relationships between data entities. This code snippet demonstrates how to merge data from an external source into a graph database, creating new nodes and relationships as needed. The choice of Cypher as the query language provides a powerful way to define the merge operation and integrate with the graph database.

In [28]:
response

