# Chatbot
In this notebook, we'll build a chatbot. The chatbot interface uses Gradio, and underlying the chatbot is the Neo4j knowledge graph we built in previous labs. The chatbot uses generative AI and LangChain: we take a natural language question, convert it to Neo4j Cypher using generative AI, and run that query against the database. We then use generative AI again to convert the database response back to natural language before presenting it to the user.
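
The flow described above can be sketched as three steps wired together. This is a minimal illustration, not the chain used later in the notebook: `answer_question`, the three stand-in functions, and the `Manager` label with its `name` property are all hypothetical, chosen so the sketch runs without an LLM or a database.

```python
from typing import Any, Callable, Dict, List

Rows = List[Dict[str, Any]]


def answer_question(
    question: str,
    generate_cypher: Callable[[str], str],
    run_query: Callable[[str], Rows],
    summarize: Callable[[str, Rows], str],
) -> str:
    """Natural language -> Cypher -> database rows -> natural language."""
    cypher = generate_cypher(question)  # first LLM call: question to Cypher
    rows = run_query(cypher)            # run the Cypher against the graph
    return summarize(question, rows)    # second LLM call: rows to prose


# Hypothetical stand-ins for the LLM and Neo4j, for illustration only.
def fake_generate_cypher(question: str) -> str:
    return "MATCH (m:Manager) RETURN m.name AS name LIMIT 1"


def fake_run_query(cypher: str) -> Rows:
    return [{"name": "Example Manager"}]


def fake_summarize(question: str, rows: Rows) -> str:
    names = ", ".join(str(row["name"]) for row in rows)
    return f"Found: {names}"


pipeline_answer = answer_question(
    "Which managers are in the graph?",
    fake_generate_cypher,
    fake_run_query,
    fake_summarize,
)
print(pipeline_answer)
```

Passing the three steps in as callables keeps the control flow visible and lets each step be swapped for a real LLM or database call.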

## Base Example Without Grounding
Before grounding with Neo4j, let's set up a baseline that just uses an LLM to answer questions.

In [None]:
from langchain.llms import VertexAI

base_chain = VertexAI(model_name='text-bison', max_output_tokens=2048, temperature=0)

We can now ask a simple finance question.

In [None]:
base_response = base_chain("""What are the top 10 investments for Rempart?""")
print(f"Final answer: {base_response}")

While this answer looks reasonable, we have no real way to know how the LLM came up with it or where it was sourced from.

Here is a more complicated example where we expect the LLM to understand some more specific terminology.

In [None]:
base_response = base_chain("""Which managers own FAANG stocks?""")
print(f"Final answer: {base_response}")

In [None]:
import gradio as gr
import typing_extensions
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key = "chat_history", return_messages = True)
agent_chain = base_chain

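def chat_response(input_text, history):
    # gr.ChatInterface passes the running chat history as the second argument;
    # the ungrounded baseline ignores it and answers each message on its own.
    try:
        return agent_chain(input_text)
    except Exception:
        # a bit of protection against exposed error messages
        # we could log these situations in the backend to revisit later in development
        return "I'm sorry, there was an error retrieving the information you requested."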

interface = gr.ChatInterface(fn = chat_response,
                             title = "Ungrounded Chatbot",
                             description = "NOT powered by Neo4j",
                             theme = "soft",
                             chatbot = gr.Chatbot(height=500),
                             undo_btn = None,
                             clear_btn = "\U0001F5D1 Clear chat",
                             examples = ["What are the top 10 investments for Rempart?",
                                         "Which manager owns FAANG stocks?",
                                         "What are other top investments for fund managers investing in Exxon?",
                                         "What are Rempart's top investments by value for 2023?",
                                         "Who are the common investors between Tesla and Costco?",
                                         "Who are Tesla's top investors in last 3 months?"])

interface.launch(share=True)

## Fine Tuning for Cypher Generation
We encourage you to use Vertex AI Codey family models with a schema, few-shot examples, and precise prompt engineering for Cypher generation. However, if that still doesn't provide an appropriate level of quality, or you need your LLM to improve accuracy on a more specific task area, you can try fine-tuning.
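
As a rough illustration of the schema-plus-few-shot prompting mentioned above, the sketch below assembles a Cypher generation prompt as a plain string. The schema text, the example pair, and the `build_cypher_prompt` helper are hypothetical placeholders, not the schema or prompt from the earlier labs.

```python
from typing import List, Tuple

# Hypothetical schema description; replace with the real graph schema.
SCHEMA = "(:Manager {name})-[:OWNS {value}]->(:Company {name})"

# Hypothetical few-shot pairs of (question, Cypher).
FEW_SHOT: List[Tuple[str, str]] = [
    (
        "Which companies does Example Manager own?",
        "MATCH (m:Manager {name: 'Example Manager'})-[:OWNS]->(c:Company) "
        "RETURN c.name",
    ),
]


def build_cypher_prompt(
    question: str,
    schema: str = SCHEMA,
    examples: List[Tuple[str, str]] = FEW_SHOT,
) -> str:
    """Combine instructions, schema, and examples into a single prompt."""
    lines = [
        "You translate questions into Neo4j Cypher.",
        "Use only the labels, relationship types, and properties in the schema.",
        "Return the Cypher query and nothing else.",
        "",
        "Schema:",
        schema,
        "",
    ]
    for example_question, example_cypher in examples:
        lines += [f"Question: {example_question}", f"Cypher: {example_cypher}", ""]
    # End with an open "Cypher:" slot for the model to complete.
    lines += [f"Question: {question}", "Cypher:"]
    return "\n".join(lines)


cypher_prompt = build_cypher_prompt("Who owns Example Corp?")
print(cypher_prompt)
```

The resulting string is what would be sent to the code generation model. Fine-tuning becomes worth considering only when prompts of this kind still produce poor queries.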

Fine-tuning is the process of taking a foundational model and making precise changes to improve its performance on a specific, narrower task. It takes training data containing many examples of that task and uses it to update existing parameters, or add new ones, in a new version of the model.

Training generally takes more than an hour. The tuned adapter model stays within your tenant, and your training data will not be used to train the base model, which is frozen. Tuning runs on GCP's TPU infrastructure, which is optimized for ML workloads.

The training data should be structured as a supervised training set, where each row contains the input text and the desired Cypher query. Vertex AI expects the following format in a `jsonl` file.

```
{"input_text": "MY_INPUT_PROMPT", "output_text": "CYPHER_QUERY"}
```
You can find more about fine-tuning in the [Vertex AI documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/models/tune-models).
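
A training file in this format can be produced with the standard `json` module. This is a minimal sketch: the question and Cypher pairs, the `write_tuning_file` helper, and the `cypher_tuning.jsonl` file name are made up for illustration, and a real tuning set would need many more rows.

```python
import json
import os
import tempfile

# Hypothetical (question, Cypher) pairs; replace with your own examples.
training_pairs = [
    (
        "Which companies does Example Manager own?",
        "MATCH (m:Manager {name: 'Example Manager'})-[:OWNS]->(c:Company) "
        "RETURN c.name",
    ),
    (
        "How many managers are in the graph?",
        "MATCH (m:Manager) RETURN count(m)",
    ),
]


def write_tuning_file(pairs, path):
    """Write one JSON object per line with input_text and output_text keys."""
    with open(path, "w", encoding="utf-8") as f:
        for input_text, output_text in pairs:
            record = {"input_text": input_text, "output_text": output_text}
            f.write(json.dumps(record) + "\n")
    return len(pairs)


# Write to a temporary directory so the sketch leaves nothing behind in the repo.
tuning_path = os.path.join(tempfile.mkdtemp(), "cypher_tuning.jsonl")
rows_written = write_tuning_file(training_pairs, tuning_path)

# Read the file back to confirm that every line parses as JSON.
with open(tuning_path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f]
print(rows_written, records[0]["input_text"])
```

Using `json.dumps` for each row handles quotes and other special characters inside the Cypher, which hand-built strings tend to get wrong.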


## Conclusion
In this notebook, we went through the steps of connecting a LangChain agent to a Neo4j database and using it to generate Cypher queries in response to user requests via LLMs on Vertex AI, thus grounding the LLM with a knowledge graph.

While we used the `code-bison` model here, this approach can be generalized to other Vertex AI LLMs.  This process can also be augmented with additional steps around the generation chain to customize the experience for specific use cases.  

The critical takeaway is the importance of a Neo4j knowledge graph as a grounding database to:
* Anchor your chatbot to reality as it generates responses
* Enable your LLM to provide answers enriched with relevant enterprise data