<h1 align="center"> Generative AI Hackathon</h1>
<table align="center">
    <td>
        <a href="https://colab.research.google.com/github/teamdatatonic/gen-ai-hackathon/blob/main/hackathon.ipynb">
            <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo">
            <span style="vertical-align: middle;">Run in Colab</span>
        </a>
    </td>
    <td>
        <a href="https://github.com/teamdatatonic/gen-ai-hackathon/blob/main/hackathon.ipynb">
            <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
            <span style="vertical-align: middle;">View on GitHub</span>
        </a>
    </td>
    <td>
        <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/teamdatatonic/gen-ai-hackathon/main/hackathon.ipynb">
            <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"> 
            <span style="vertical-align: middle;">Open in Vertex AI Workbench</span>
        </a>
    </td>
</table>
<hr>

**➡️ Your task:** Learn about Generative AI by building your own Knowledge Worker using Python and LangChain!

**❗ Note:** This workshop has been designed to be run in Google Colab. Support for running the workshop locally or using Vertex AI Workbench is provided, but we strongly recommend Colab for the best experience.


# Introduction

This notebook walks you through the challenge of implementing a **Knowledge Worker** for your organisation using **Generative AI**!

**Why a knowledge worker?** Data decentralised across internal and external databases wastes time, as the workforce has to find the required information and transform it into insights. A knowledge worker can consolidate this information, then answer queries in natural language, providing summaries and sources.

![Q&A Chain intro](https://github.com/teamdatatonic/gen-ai-hackathon/blob/bca8120f1408be1895309517a7a4d693035b940b/assets/qa_intro.png?raw=true)

➡️ **Your task:** Implement a knowledge worker to enable users in your company to perform Q&A, in natural language, upon a knowledge base.
In this way, you'll centralise company data for easy access in a user-friendly manner, boosting productivity.
As such, you'll create a knowledge worker tailored to your data domain.
This app will only have access to specific knowledge, such as public data from your company's website and unstructured documents (web pages, PDF, Word, text, etc.).

While solving the tasks as instructed in this notebook, you'll familiarise yourself with common concepts and tools for Generative AI including:

- The Open-Source tool LangChain
- Large Language Models (LLMs)
- Vertex AI Search
- Prompts and Prompt Engineering
- Text Embeddings and Vector Databases


Ultimately, this notebook details how to get started with LangChain, and walks through setting up a knowledge worker on Google Cloud and Vertex AI.

## Implementing a knowledge worker

When creating a knowledge worker, you'll recall that Large Language Models (LLMs) can be tuned for a variety of tasks such as text summarization, answering questions, and generating new content (and many more!).
When it comes to tuning approaches, you'll have the choice between:

**A) Zero-shot learning:** Use LLMs directly without providing additional data or fine-tuning the model.

**B) Few-shot learning:** Provide a small number of input examples when using the LLM to improve the quality of outputs.

**C) Model Fine-tuning:** Fine-tune certain (or additional) layers in the LLM by training the model on provided training data.

Instead of training LLMs using your own data (i.e. fine-tuning), it is far easier and more effective to adapt the LLM to your use case by prompt engineering only (i.e. tuning).
Thus, methods A) and B) are more applicable for creating your first knowledge worker.
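
To make the difference between approaches A) and B) concrete, here is a minimal sketch in plain Python. The helper name `build_prompt` and the example question/answer pairs are invented for illustration, and no LLM is called: the sketch only shows how the text sent to the model differs.

```python
def build_prompt(question, examples=()):
    """Build a zero-shot prompt (no examples) or a few-shot prompt (with examples)."""
    lines = ["Answer the question in one short sentence.", ""]
    # Few-shot: each worked example shows the model the expected format.
    for example_question, example_answer in examples:
        lines.append(f"Q: {example_question}")
        lines.append(f"A: {example_answer}")
        lines.append("")
    # The actual question always comes last, with the answer left open.
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


# Zero-shot: the model only sees the instruction and the question.
zero_shot = build_prompt("Which team owns the billing service?")

# Few-shot: the same question, preceded by two hypothetical worked examples.
few_shot = build_prompt(
    "Which team owns the billing service?",
    examples=[
        ("Which team owns the login page?", "The identity team owns the login page."),
        ("Which team owns the data warehouse?", "The analytics team owns the data warehouse."),
    ],
)
print(few_shot)
```

In both cases the model itself is unchanged; only the prompt grows.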

A knowledge worker can be approached in two stages:

1. Embedding knowledge from diverse sources.
    * Load our dataset.
    * Shard our documents (e.g.: by paragraph, per 1000 tokens, etc.)
    * Embed the documents in Vertex AI Search or a vector store.
2. Querying an LLM which is aware of your relevant knowledge to answer questions.
    * Locating relevant documents from Vertex AI Search or a vector store.
    * Asking the LLM our query, providing relevant knowledge as context to generate an answer.

![Q&A Chain Flow](https://github.com/teamdatatonic/gen-ai-hackathon/blob/bca8120f1408be1895309517a7a4d693035b940b/assets/qa_flow.jpeg?raw=true)

First, documents (websites, Word documents, databases, PowerPoint presentations, PDFs, etc.) are loaded and split into chunks. Fragmenting is important for three reasons:

1. There are technical restrictions on how much data (tokens) can be fed into an LLM at once, meaning the context + system prompt + chat history + user prompt must fit within the token limit.
2. Most LLM APIs operate on a per-token pricing model, meaning it is cost-effective to limit the size / amount of data provided to the LLM per query.
3. Contextual information should be relevant to the user query, meaning it is optimal to provide only relevant snippets from documents, making the answer more relevant whilst saving costs as per (1) and (2).
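
The budgeting described in the three points above can be sketched as follows. This is an illustration only: `estimate_tokens` uses a rough four-characters-per-token rule of thumb (an assumption, not the model's real tokenizer), and the limit of 250 tokens is arbitrary.

```python
def estimate_tokens(text):
    """Very rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4) if text else 0


def select_snippets(snippets, system_prompt, chat_history, user_prompt, token_limit):
    """Keep snippets, in order, while the whole request still fits the limit."""
    # The fixed parts of the request are always sent, so count them first.
    used = sum(
        estimate_tokens(part) for part in (system_prompt, chat_history, user_prompt)
    )
    kept = []
    for snippet in snippets:
        cost = estimate_tokens(snippet)
        if used + cost > token_limit:
            break  # adding this snippet would exceed the budget
        kept.append(snippet)
        used += cost
    return kept, used


snippets = ["a" * 400, "b" * 400, "c" * 400]  # roughly 100 tokens each
kept, used = select_snippets(
    snippets,
    system_prompt="You are a helpful chatbot.",
    chat_history="",
    user_prompt="What is our refund policy?",
    token_limit=250,
)
print(f"Kept {len(kept)} of {len(snippets)} snippets, using about {used} tokens.")
```

Because pricing is per token, the same count that keeps the request under the limit also gives a first estimate of its cost.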

Next, these document shards are embedded within a vector store. Embedding a document means mapping it to a point in a multi-dimensional space, which can then be searched according to user queries to find relevant documents. Document relevancy scoring can be as simple as a k-nearest-neighbours search, since documents that are similar (as perceived by the embedding model) will lie close together within the search space.
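
As a toy illustration of that search, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document names are hand-written placeholders; a real embedding model produces vectors with hundreds of dimensions, and a vector store would use a more efficient index than a full sort.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_nearest(query_vector, embedded_docs, k=2):
    """Return the k document names whose vectors are most similar to the query."""
    ranked = sorted(
        embedded_docs.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hand-written stand-ins for embedded document shards.
embedded_docs = {
    "holiday-policy": [0.9, 0.1, 0.0],
    "expenses-policy": [0.7, 0.3, 0.1],
    "office-map": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.0, 0.0]  # stands in for an embedded user question
nearest = k_nearest(query_vector, embedded_docs, k=2)
print(nearest)
```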

![A typical ingestion chain](https://github.com/teamdatatonic/gen-ai-hackathon/blob/7f37d477b18ace5912d34b0574512559d7a457ed/assets/typical-ingestion-chain.png?raw=true)

Once the vector store is created, a user can query the knowledge base using natural language questions. Relevant documents are found by embedding the user query and retrieving the documents nearest to it in the vector store. These snippets of text are provided to the LLM (alongside the user query, chat history, prompt engineering, etc.), which parses the information to generate an answer.
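
The query flow can be sketched end to end with stand-ins for each component. Everything here is hypothetical: `retrieve` ranks by shared words instead of embeddings, `fake_llm` only reports the prompt length, and the prompt layout is one possible arrangement rather than the one LangChain uses.

```python
def retrieve(query, documents, k=2):
    """Toy retriever: rank documents by how many query words they share."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]


def answer(query, documents, llm, chat_history=""):
    """Retrieve snippets, assemble the prompt, and ask the LLM."""
    snippets = retrieve(query, documents)
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    prompt = (
        f"Chat history:\n{chat_history}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
    return llm(prompt), snippets


def fake_llm(prompt):
    """Stand-in for a real LLM call."""
    return f"(reply to a {len(prompt)}-character prompt)"


documents = [
    "The office opens at 9am on weekdays",
    "Expenses must be filed within 30 days",
    "The office is closed on public holidays",
]
reply, snippets = answer("When does the office open", documents, fake_llm)
print(snippets)
```

Only the two snippets judged relevant reach the model; the unrelated expenses document is left out of the prompt.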

![A typical query chain](https://github.com/teamdatatonic/gen-ai-hackathon/blob/7f37d477b18ace5912d34b0574512559d7a457ed/assets/typical-query-chain.png?raw=true)

## Running this workshop
* Execute each code snippet sequentially. This lab is designed so certain steps (like defining functions or importing modules) are performed once, so each new code cell builds upon previous cells.
* If the notebook crashes or you want to restart the notebook later, make sure you execute all cells prior to where you left off.
* Executing cells out of order can lead to errors, so your first step when debugging should be to ensure all previous code cells have been run.
* This workshop can be completed independently, but Datatonic workshop leaders are available to discuss tasks, debug issues, or have a chat about generative AI!

## Prerequisites
### Install Python dependencies
We've developed a Python module specifically for this workshop. Installing this one package also installs other dependencies, such as LangChain (an LLM framework) and Gradio (a web UI framework).

In [None]:
%pip install --quiet "git+https://github.com/teamdatatonic/gen-ai-hackathon.git@feat/add-vertex-ai-search#subdirectory=dt_gen_ai_hackathon_helper"

**❗ Restart the Python kernel:** Ensure that your environment can access the newly installed dependencies. After the restart, continue from the `Accessing the Vertex AI Endpoint` step.

In [None]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

**❗ Note:** If your kernel doesn't restart automatically, click the "Restart Runtime" button above your notebook.
If you don't see a restart button, go to the "Runtime" toolbar tab then "Restart Runtime". After restarting, continue executing the project from below this cell.

## Accessing the Vertex AI Endpoint

Currently, Vertex AI LLMs are accessible via Google Cloud projects. We will access the Vertex AI endpoint via a service account.

1. Upload the Google Application Credentials `.json` file sent to your email to the notebook filesystem.
2. Set the variable `GOOGLE_APPLICATION_CREDENTIALS` with the filepath (**❗ Note:** the `/content/` folder is where uploaded files are stored by default).

In [None]:
import os

# @title Set project credentials. { run: "auto", display-mode: "form" }
# @markdown Set the filepath to the `.json` credentials file.
GOOGLE_APPLICATION_CREDENTIALS = "secrets/credentials.json"  # @param {type:"string"}
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = GOOGLE_APPLICATION_CREDENTIALS

!gcloud auth activate-service-account --key-file={GOOGLE_APPLICATION_CREDENTIALS}

In [1]:
# @markdown Set the Google Cloud project ID.
PROJECT_ID = "dt-gen-ai-hackathon-dev"  # @param {type:"string"}

!gcloud config set project {PROJECT_ID}

Updated property [core/project].


### Configure notebook environment

In [2]:
MODEL = "text-bison@001"

# Task 1: Implementing a knowledge worker on Vertex AI Search

Creating a custom knowledge worker is similar to your first step when learning a new programming language.
As such, your first challenge is to create a “Hello World” program, but adapted to LLMs, which is far more exciting!

With a few lines of code, you'll:
- Load documents with information about your company
- Store documents in Vertex AI Search
- Use an LLM to answer queries about your company knowledge

**❗ All of these steps can be achieved in a few lines of Python.**

## 1.1 Introduction to LangChain

LangChain is a Python framework for developing applications using language models.
It abstracts the connection between applications and LLMs, allowing a loose coupling between code and specific providers like Google PaLM.

LangChain supports [numerous methods](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html) for loading documents.

We will be using the `DirectoryLoader` and `UnstructuredHTMLLoader` in order to load a pre-compiled archive of your website. This method is similar to the [`RecursiveUrlLoader`](https://python.langchain.com/docs/modules/data_connection/document_loaders/integrations/recursive_url_loader), which searches for subpages of a website and loads each page's content as a document. Additionally, if we only wanted to download a list of URLs without searching for subpages, we could use an [`UnstructuredURLLoader`](https://python.langchain.com/docs/modules/data_connection/document_loaders/integrations/url).

**❗ Note:** We're using pre-compiled archives to avoid hitting rate limits during this session. However, this content can also be gathered programmatically on a regular basis to collect new blog posts and press releases.

**➡️ Your task:** Read the linked resources in the `Introduction to LangChain` step and study the following code cells as they provide reusable LangChain code for your knowledge worker.

## 1.2 Setting up Vertex AI Search

### Create a Vertex AI Search data store with documents
Create a Vertex AI Search data store to host unstructured data (PDF documents).

1 - On the Search & Conversation page, click `NEW DATA STORE`.


![Select NEW DATA STORE](https://github.com/teamdatatonic/gen-ai-hackathon/blob/ba86ffb203a0f6963958cf41244a2af7a3187684/assets/new_data_store.png?raw=true)

2 - Select a data source of type Cloud Storage. 

3 - Add the folder path `dt-gen-ai-hackathon-pdf-datasets/alphabet_investor_pdfs_2004_2021` and select the data kind `Unstructured documents`.

![Select kind of data](https://github.com/teamdatatonic/gen-ai-hackathon/blob/b0d89b56d4e6063d7a7de8c590ca6e886c9a0883/assets/select_kind_of_data.png?raw=true)

4 - Location should be `global (Global)` and the data store name `dt-gen-ai-hackathon-<TEAM NUMBER>`, e.g. `dt-gen-ai-hackathon-t1`.


### Create a Vertex AI Search application
Create a Vertex AI Search application to host your knowledge worker.

1 - On the Search & Conversation page, click `NEW APP`.

2 - Type: Select app type Search.

3 - Configuration: Keep the default values, with location `global (Global)` and both features activated. Add the app name `app_<TEAM NUMBER>`, e.g. `app_t1`, and click `CONTINUE`.

![App configuration](https://github.com/teamdatatonic/gen-ai-hackathon/blob/b0d89b56d4e6063d7a7de8c590ca6e886c9a0883/assets/select_kind_of_data.png?raw=true)

4 - Data: Select the data store created in the previous step and click `CREATE`.



## 1.3 LangChain retrieval Q&A chains
We will demonstrate the use of three types of LangChain retrieval Q&A chains:

- RetrievalQA
- RetrievalQAWithSourcesChain
- ConversationalRetrievalChain

First, we initialize a Vertex AI Language Model (LLM) and a LangChain 'retriever' to fetch documents from our Vertex AI Search data store (the retriever class still carries the product's former name, Enterprise Search).

In the case of Q&A chains, our retriever is directly passed to the chain, enabling it to function automatically without requiring any additional configuration.

Behind the scenes, the search query is initially passed to the retriever. The retriever performs a search and returns relevant document snippets. These snippets are then used as context for the prompt executed by the LLM.

In [3]:
from langchain.llms import VertexAI
from langchain.retrievers import GoogleCloudEnterpriseSearchRetriever

# Get this value from the Vertex AI Search UI
data_store_id = "dt-gen-ai-hackathon-t1_1696592331863"

llm = VertexAI(model_name=MODEL, temperature=0.0)

retriever = GoogleCloudEnterpriseSearchRetriever(
    project_id=PROJECT_ID, search_engine_id=data_store_id
)

### RetrievalQA Chain
This is the simplest document Q&A chain offered by LangChain.

Several different chain types are available, as listed [here](https://docs.langchain.com/docs/components/chains/index_related_chains).

In these examples, we use the 'stuff' type, which simply inserts all the document snippets into the prompt. This approach has the advantage of requiring only a single LLM call, making it faster and more cost-efficient.

However, this method comes with a drawback: if we have a large number of search results, we run the risk of exceeding the token limit for our prompt or truncating useful information.

Other chain types, such as 'map_reduce' and 'refine,' employ an iterative process. These types make multiple LLM calls, taking individual document snippets one at a time and refining the answer iteratively.
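
The trade-off in the number of LLM calls can be sketched with a stand-in model that simply counts how often it is called. The prompt wording below is invented for illustration and is much simpler than the prompts LangChain actually uses; only the call pattern is the point.

```python
class CountingLLM:
    """Stand-in for a real LLM: records each call and returns a placeholder reply."""

    def __init__(self):
        self.calls = 0

    def __call__(self, prompt):
        self.calls += 1
        return f"[reply {self.calls} to a {len(prompt)}-character prompt]"


def stuff(llm, question, snippets):
    """'stuff': put every snippet into one prompt and call the LLM once."""
    context = "\n\n".join(snippets)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")


def map_reduce(llm, question, snippets):
    """'map_reduce': one call per snippet, then one call to combine the results."""
    partial_answers = [
        llm(f"Context:\n{snippet}\n\nQuestion: {question}") for snippet in snippets
    ]
    combined = "\n".join(partial_answers)
    return llm(f"Partial answers:\n{combined}\n\nQuestion: {question}")


def refine(llm, question, snippets):
    """'refine': one call per snippet, each improving the previous answer."""
    current_answer = ""
    for snippet in snippets:
        current_answer = llm(
            f"Existing answer: {current_answer}\n"
            f"New context:\n{snippet}\n\nQuestion: {question}"
        )
    return current_answer


snippets = ["Snippet one.", "Snippet two.", "Snippet three."]
stuff_llm, map_reduce_llm, refine_llm = CountingLLM(), CountingLLM(), CountingLLM()
stuff(stuff_llm, "What happened?", snippets)
map_reduce(map_reduce_llm, "What happened?", snippets)
refine(refine_llm, "What happened?", snippets)
print(stuff_llm.calls, map_reduce_llm.calls, refine_llm.calls)
```

With three snippets, the single-prompt approach makes one call while the iterative approaches make several, which is where the extra latency and cost come from.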

In [4]:
from langchain.chains import RetrievalQA

search_query = "Give me the name of the CEO of DeepMind?"

retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)
retrieval_qa.run(search_query)

'The CEO of DeepMind is Demis Hassabis.'

If we set `return_source_documents=True` as an optional parameter when constructing the chain, we can examine the document snippets returned by the retriever. This feature is particularly useful for debugging, as the relevance of these snippets to the answer may not always be immediately obvious.
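
As a rough picture of what comes back, the chain's output is a dictionary holding the answer under `result` and the retrieved snippets under `source_documents`. The sketch below walks such a dictionary using a stand-in document type. `FakeDocument`, `summarise_results`, the example values, and the `source` metadata key are assumptions for illustration; this is not the implementation of the workshop's `formatter_helper`.

```python
from collections import namedtuple

# Stand-in for LangChain's Document, which exposes `page_content` and `metadata`.
FakeDocument = namedtuple("FakeDocument", ["page_content", "metadata"])


def summarise_results(results, preview_chars=60):
    """Return a printable summary of an answer and the snippets behind it."""
    lines = [f"Answer: {results['result']}"]
    documents = results.get("source_documents", [])
    lines.append(f"Used {len(documents)} relevant documents.")
    for number, document in enumerate(documents, start=1):
        # Assumption: the origin of a snippet is stored under the "source" key.
        source = document.metadata.get("source", "unknown")
        preview = document.page_content[:preview_chars].replace("\n", " ")
        lines.append(f"Document {number} ({source}): {preview}")
    return "\n".join(lines)


# Hypothetical output, shaped like the dictionary a chain returns.
results = {
    "query": "Who leads the platform team?",
    "result": "(answer text from the LLM)",
    "source_documents": [
        FakeDocument("First snippet of text.", {"source": "gs://example-bucket/a.pdf"}),
        FakeDocument("Second snippet\nof text.", {}),
    ],
}
summary = summarise_results(results)
print(summary)
```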

In [6]:
from dt_gen_ai_hackathon_helper.formatter_helper import formatter_helper

retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True
)

results = retrieval_qa({"query": search_query})
formatter_helper.format_results(results)

*******************************************************************************
Answer: The CEO of DeepMind is Demis Hassabis.
Used 5 relevant documents.
*******************************************************************************
-------------------------------------------------------------------------------
Document 1
-------------------------------------------------------------------------------
Source of content: gs://dt-gen-ai-hackathon-pdf-datasets/alphabet_investor_pdfs_2004_2021/2015_google_annual_report.pdf
-------------------------------------------------------------------------------
to implement segment reporting for our Q4 results,
where Google financials will be provided separately
than those for the rest of Alphabet businesses as
a whole.
This new structure will allow us to keep tremendous
focus on the extraordinary opportunities we have
inside of Google. A key part of this is Sundar Pichai.
Sundar has been saying the things I would have
said (and sometimes better!) f

### RetrievalQAWithSourcesChain
This variant delivers both the answer to the question and the source documents used for generating that answer, doing so in a simpler manner than using `return_source_documents=True`.

In [None]:
from langchain.chains import RetrievalQAWithSourcesChain

retrieval_qa_with_sources = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)

retrieval_qa_with_sources({"question": search_query}, return_only_outputs=True)

### ConversationalRetrievalChain
The `ConversationalRetrievalChain` remembers and uses previous questions to enable a chat-like discovery process. To use this chain, we need to provide a memory class that stores the previous messages and passes them to the LLM as context. For this purpose, we use the `ConversationBufferMemory` class that comes with LangChain.
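
Conceptually, a buffer memory is just an ordered list of past turns that is rendered into the next prompt, so that a follow-up such as "What about costs?" can be rewritten as a standalone question before retrieval. The sketch below is a simplified stand-in: `BufferMemory`, `build_condense_prompt`, and the prompt wording are invented for illustration and are not LangChain's classes or prompts.

```python
class BufferMemory:
    """Toy conversation memory: keeps every (question, answer) pair in order."""

    def __init__(self):
        self.turns = []

    def save(self, question, answer):
        self.turns.append((question, answer))

    def as_text(self):
        return "\n".join(
            f"Human: {question}\nAssistant: {answer}" for question, answer in self.turns
        )


def build_condense_prompt(memory, follow_up):
    """Ask the LLM to rewrite a follow-up as a standalone question."""
    return (
        "Given the conversation below, rephrase the follow-up question "
        "so that it can be understood on its own.\n\n"
        f"Chat History:\n{memory.as_text()}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


memory = BufferMemory()
# The answer is a placeholder; a real chain would store the LLM's reply here.
memory.save("What were Alphabet revenues in 2021?", "(answer from the chain)")
condense_prompt = build_condense_prompt(memory, "What about costs and expenses?")
print(condense_prompt)
```

Without the history, the follow-up question alone would give the retriever nothing to indicate which company or year is meant.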

In [None]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
conversational_retrieval = ConversationalRetrievalChain.from_llm(
    llm=llm, retriever=retriever, memory=memory
)

search_query = "What were alphabet revenues in 2021?"

result = conversational_retrieval({"question": search_query})
print(result["answer"])

In [None]:
new_query = "What about costs and expenses?"
result = conversational_retrieval({"question": new_query})
print(result["answer"])

In [None]:
new_query = "Is this more than in 2020?"

result = conversational_retrieval({"question": new_query})
print(result["answer"])

## 1.4 Prompt engineering

![Q&A Chain](https://github.com/teamdatatonic/gen-ai-hackathon/blob/bca8120f1408be1895309517a7a4d693035b940b/assets/stuff-chain.jpeg?raw=true)

As outlined before, the creation of prompts is essential to adapt LLMs for your given use case.
**Prompt engineering** adapts a large language model to a task through its input alone, without changing the model's weights.
When an LLM is prompted with contextual information about its purpose, it can simulate a variety of roles, such as a customer assistant chatbot, a document summariser, or a translator.

In this use case, we prompt our model to respond as a conversational Q&A chatbot.
Prompt engineering can be especially useful for introducing guard rails to an application. In this template we tell the model not to respond to queries it lacks the information to answer: users will trust the application to provide factual replies, so rejecting a query is preferable to outputting false information.

You can use the prompt and code cells below for your knowledge worker.

**➡️ Your task:** Execute and study the code cell below. Pay attention to the prompt being defined.
What elements do you notice in the prompt?
How is the prompt used in the chain?

In [None]:
TASK_01_BASIC_TEMPLATE = """\
You are a helpful chatbot designed to perform Q&A on a set of documents.
Always respond to users with friendly and helpful messages.
Your goal is to answer user questions using relevant sources.

You were developed by Datatonic, and are powered by Google's PaLM-2 model.

In addition to your implicit model world knowledge, you have access to the following data sources:
- Company documentation.

If a user query is too vague, ask for more information.
If insufficient information exists to answer a query, respond with "I don't know".
NEVER make up information.
"""

TASK_01_A_TEMPLATE = """\
The answer should consist of:
Short answer: 1 word. Just a yes or no answer. Prefixed with Short answer:.
Details: 1 sentence. A short explanation of the answer. Prefixed with Details:.

Context: {context}

Question: {question}
Helpful Answer:
"""

Once we have connected our Vertex AI Search data store and prompt, we can define our LangChain chain.

**➡️ Your task:** Execute and study the following code cells - they provide reusable LangChain code for your knowledge worker.

In all of the previous examples we used the default prompt that comes with LangChain.

We can inspect our chain object to discover the wording of the prompt template being used.

We may find that this is not suitable for our purposes, and we may wish to customise the prompt, for example to present our results in a different format, or to specify additional constraints.
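
At its core, a prompt template is a string with named placeholders that are filled in for every query. The sketch below uses plain `str.format` to show the idea. The names `TEMPLATE` and `render_prompt` and the example snippets are invented; LangChain's `PromptTemplate` adds validation and other features on top of this.

```python
TEMPLATE = """\
Answer using only the context below. If the context is insufficient, say "I don't know".

Context: {context}

Question: {question}
Helpful Answer:"""


def render_prompt(template, snippets, question):
    """Fill the {context} and {question} placeholders of a prompt template."""
    context = "\n\n".join(snippets)
    # Only the template is parsed for placeholders, so braces inside the
    # snippets or the question are inserted unchanged.
    return template.format(context=context, question=question)


prompt = render_prompt(
    TEMPLATE,
    snippets=["Refunds are issued within 14 days.", "Use {curly} braces freely."],
    question="How long do refunds take?",
)
print(prompt)
```

Changing the wording of the template changes the model's behaviour without touching any other part of the chain.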

In [None]:
qa = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever, return_source_documents=True
)

print(qa.combine_documents_chain.llm_chain.prompt.template)

Let's modify the prompt to return a short yes/no answer followed by a one-sentence explanation. We will constrain the LLM to say 'I don't know' if it cannot answer.

We build a new `PromptTemplate` from the combined template string (the `template` argument) and pass it to the chain using the `prompt` argument.

In [None]:
from langchain.prompts import PromptTemplate

prompt_a = PromptTemplate(
    template=TASK_01_BASIC_TEMPLATE + "\n" + TASK_01_A_TEMPLATE,
    input_variables=["context", "question"],
)
qa_chain = RetrievalQA.from_llm(
    llm=llm, prompt=prompt_a, retriever=retriever, return_source_documents=True
)

In [None]:
print(qa_chain.combine_documents_chain.llm_chain.prompt.template)

In [None]:
search_query = "Were 2020 EMEA revenues higher than 2020 APAC revenues?"

results = qa_chain({"query": search_query})
formatter_helper.format_results(results)

## 1.5 Ingesting documents to Vertex AI Search

The data store created in step 1.2 only contains the Alphabet investor PDFs from 2004 to 2021, so a question about 2022 should not be answerable yet. Run the query below, ingest the 2022 to 2023 documents, then run the same query again and compare the answers.

In [None]:
search_query = "Were 2022 EMEA revenues higher than 2022 APAC revenues?"

results = qa_chain({"query": search_query})
formatter_helper.format_results(results)

In [None]:
from dt_gen_ai_hackathon_helper.vertex_ai_search import vertex_ai_search

location = "global"
gcs_uri = "gs://dt-gen-ai-hackathon-pdf-datasets/alphabet_investor_pdfs_2022_2023/*.pdf"

vertex_ai_search.ingest_documents(
    project_id=PROJECT_ID, location=location, data_store_id=data_store_id,
    gcs_uri=gcs_uri, data_schema="content",
)

In [None]:
search_query = "Were 2022 EMEA revenues higher than 2022 APAC revenues?"

results = qa_chain({"query": search_query})
formatter_helper.format_results(results)

## 1.6 Building the user interface

As building a UI is outside of the scope for this hackathon, a templated GUI using [Gradio](https://gradio.app/) is provided for your knowledge worker.

**➡️ Your task:** Execute the cells below to launch the user interface.

**❗ Note:** The following cell will run until manually stopped. Remember to halt it before moving onto the next task.

In [None]:
TASK_01_B_TEMPLATE = """\
Chat History:
{chat_history}
Question: {question}
"""
prompt_b = PromptTemplate.from_template(
    template=TASK_01_BASIC_TEMPLATE + "\n" + TASK_01_B_TEMPLATE
)

In [None]:
from langchain.chains import ConversationalRetrievalChain
from langchain.llms import VertexAI


def create_qa_vertex_ai_search_chain(condense_question_prompt, k=4, temperature=0.0):
    """Create a Q&A conversation chain using the VertexAI LLM.

    Arguments:
        condense_question_prompt (PromptTemplate): The prompt template used to prompt engineer our LLM to respond in a certain tone, etc.
        k (int): the maximum number of source documents the retriever returns per query.
        temperature (float): the degree of randomness introduced into the LLM response.
    """
    # 'k' limits the number of search results, so it belongs to the retriever
    # (via `max_documents`) rather than to the LLM.
    retriever = GoogleCloudEnterpriseSearchRetriever(
        project_id=PROJECT_ID, search_engine_id=data_store_id, max_documents=k
    )

    # The selected Google model uses the retrieved documents related to the query
    # It parses these documents in order to answer the user question.
    # We use the VertexAI LLM, however other models can be substituted here
    llm = VertexAI(model_name=MODEL, temperature=temperature)

    # A conversation retrieval chain keeps a history of Q&A / conversation
    # This allows for contextual questions such as "give an example of that (previous response)".
    # The chain is also set to return the source documents used in generating an output
    # This allows for explainability behind model output.
    conversational_retrieval = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        condense_question_prompt=condense_question_prompt,
        return_source_documents=True,
    )
    return conversational_retrieval

In [None]:
import dt_gen_ai_hackathon_helper.view.view as demo_view

qa_chain = create_qa_vertex_ai_search_chain(prompt_b)
demo = demo_view.View(qa_chain=qa_chain)
demo.launch_interface()

# Task 0: Preview a knowledge worker

Before we look at the underlying code supporting the knowledge worker demo, let's look at a pre-built example using the Datatonic website as a knowledge source.

**➡️ Your task:** Download the pre-embedded vector store and explore the capabilities of a knowledge worker.

In [None]:
# @title Set the demo bucket name { run: "auto", display-mode: "form" }
# @markdown This variable can be left as default for this task.
DEMO_BUCKET = "dt-gen-ai-hackathon-demo"  # @param {type:"string"}

In [None]:
import dt_gen_ai_hackathon_helper.widgets.widgets as widgets

DEMO_DROPDOWN = widgets.gcp_bucket_dropdown(
    DEMO_BUCKET, "Select a demo from this dropdown: "
)
display(DEMO_DROPDOWN)

In [None]:
!gsutil cp gs://{DEMO_BUCKET}/{DEMO_DROPDOWN.value}.tar.gz . && tar -xzf {DEMO_DROPDOWN.value}.tar.gz

In [None]:
import dt_gen_ai_hackathon_helper.view.view as demo_view
import dt_gen_ai_hackathon_helper.embeddings.embeddings as demo_embeddings

demo_vector_store = demo_embeddings.load_embeddings(
    DEMO_DROPDOWN.value
)  # loads the vector DB from the local file system.

demo = demo_view.View(vector_store=demo_vector_store)
demo.launch_interface()

**❗ Note:** The cell above will run until manually stopped. Remember to halt it before moving onto the next task.

# Task 1: Implementing a knowledge worker
Here, you'll implement a knowledge worker for your company using a local vector store.

We will:
- Load documents with information about your company
- Create text embeddings from documents
- Store the embeddings in a local database
- Use an LLM to answer queries about your company knowledge

## Collecting documents
First, we need to collect our data. To get started quickly, we've already downloaded some sample website data. Let's copy the website data from a public Cloud Storage bucket to your local file system.

**❗ Note:** Although PaLM supports multiple languages, text embeddings currently work best with English documents.

**➡️ Your task:** Select a pre-compiled website from our list and download it from the public bucket.

In [None]:
# @title Select a webarchive { run: "auto", display-mode: "form" }
# @markdown This variable can be left as default for this task.
BUCKET = "dt-gen-ai-hackathon-webarchive"  # @param {type:"string"}

In [None]:
LOCAL_FOLDER_DROPDOWN = widgets.gcp_bucket_dropdown(
    BUCKET,
    "Choose any of these web archives as the base knowledge of your worker: ",
)
display(LOCAL_FOLDER_DROPDOWN)

In [None]:
!gsutil cp gs://{BUCKET}/{LOCAL_FOLDER_DROPDOWN.value}.tar.gz . && tar -xzf {LOCAL_FOLDER_DROPDOWN.value}.tar.gz

🎉 Congratulations! 🎉 You've downloaded your website data.

If we browse the files we just downloaded, we can see the file structure contains folders and `HTML` files. This is because it is a local replica of the target website, meaning the file paths correlate with real webpages.

**➡️ Your task:** Run the following cells to *create* the text embeddings based on your downloaded data.

Now, let's embed these files so we can use them in our knowledge worker.

In the code cell below, we parse the directory to find `HTML` files and load their contents using `UnstructuredHTMLLoader`.

In [None]:
from langchain.document_loaders import DirectoryLoader, UnstructuredHTMLLoader


def load_documents(source_dir):
    # Load the documentation using a HTML parser
    loader = DirectoryLoader(
        source_dir,
        glob="**/*.html",
        loader_cls=UnstructuredHTMLLoader,
        show_progress=True,
    )
    documents = loader.load()

    print(f"Loaded: {len(documents)} documents from {source_dir}.")

    return documents

### Creating or loading embeddings

Creating embeddings each time we use our app is time-consuming and expensive.
By persisting the vector store database after embedding, we can load the saved embeddings for use in another session.

**➡️ Your task:** Study and execute the following code cells. Note that after the documents have been loaded, they are split into shards using the `RecursiveCharacterTextSplitter` class.
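
Before running the cells, it may help to see the idea behind recursive splitting in isolation. The function below is a simplified sketch, not LangChain's implementation: it ignores `chunk_overlap`, the name `split_text` and the separator list are chosen for illustration, and sizes are measured in characters (which is also what `RecursiveCharacterTextSplitter` counts by default).

```python
def split_text(text, chunk_size=1000, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most chunk_size characters.

    Tries the coarsest separator first (paragraphs) and only falls back to
    finer separators (lines, then words) for pieces that are still too long.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: cut hard at chunk_size characters.
        return [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]

    separator, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = f"{current}{separator}{piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces together
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: split it with the finer separators.
            chunks.extend(split_text(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


# Five synthetic paragraphs of about 250 characters each.
document = "\n\n".join(
    f"Paragraph {number}. " + "lorem ipsum " * 20 for number in range(1, 6)
)
chunks = split_text(document, chunk_size=300)
print([len(chunk) for chunk in chunks])
```

Splitting on paragraph boundaries first keeps related sentences together, which tends to give the retriever more coherent snippets than cutting at a fixed offset.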

In [None]:
# @title Set the name of your vectorstore { run: "auto", display-mode: "form" }
# @markdown This variable can be left as default for this task.
PERSIST_DIR = "chromadb"  # @param {type:"string"}

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings import VertexAIEmbeddings


def create_embeddings(source_dir):
    documents = load_documents(source_dir=source_dir)

    # We use Google embeddings model, however other models can be substituted here
    embedding = VertexAIEmbeddings()

    # Individual documents will often exceed the token limit.
    # We split documents into chunks of at most 1000 characters (the splitter
    # counts characters by default, not tokens), so that these chunks
    # fit into the token limit alongside the user prompt.
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    texts = text_splitter.split_documents(documents)

    vector_store = Chroma.from_documents(
        documents=texts, embedding=embedding, persist_directory=PERSIST_DIR
    )

    # Persist the ChromaDB locally, so we can reload the script without expensively re-embedding the database
    vector_store.persist()

In [None]:
create_embeddings(
    source_dir=LOCAL_FOLDER_DROPDOWN.value
)  # creates the vector DB and saves it locally.
print(f"Created new vectorstore in dir {PERSIST_DIR}.")

**➡️ Your task:** Run the following cells to *load* the text embeddings.

In [None]:
def load_embeddings(persist_directory):
    # We use VertexAI embeddings model, however other models can be substituted here
    embeddings = VertexAIEmbeddings()

    # Creating embeddings with each re-run is highly inefficient and costly.
    # We instead aim to embed once, then load these embeddings from storage.
    vector_store = Chroma(
        embedding_function=embeddings,
        persist_directory=persist_directory,
    )

    return vector_store

In [None]:
vector_store = load_embeddings(
    PERSIST_DIR
)  # loads the vector DB from the local file system.
print(f"Loaded {PERSIST_DIR} as vectorstore.")

**🎉 Congratulations! 🎉** You've created text embeddings from your company data and stored them successfully in a local vector database.
Now, you'll shift your focus to implementing the actual LLM by creating a chain using LangChain.

## Creating the Conversational Q&A Chain

In this section, you'll create a chain which will be able to provide an answer given a question from a user.
To understand the purpose of chains, you can read about chains in the [LangChain documentation](https://docs.langchain.com/docs/).


## Building the user interface

As building a UI is outside of the scope for this hackathon, a templated GUI using [Gradio](https://gradio.app/) is provided for your knowledge worker.

**➡️ Your task:** Execute the cells below to launch the user interface.

**❗ Note:** The following cell will run until manually stopped. Remember to halt it before moving onto the next task.

In [None]:
qa_chain = create_qa_chain(
    vector_store=vector_store, condense_question_prompt=TASK_01_PROMPT
)

demo = demo_view.View(qa_chain=qa_chain)
demo.launch_interface()

**➡️ Your task:** Use the user interface above (which you can also open in a separate tab given the shareable link above) to query your knowledge base.

Try out a few questions from this example Q&A (using the Datatonic web archive):

> 👩‍💻: What is Datatonic?
> 
> 🦜: Datatonic is a data consultancy enabling companies to make better business decisions with the power of Modern Data Stack and MLOps.
> 
> 👩‍💻: Summarise the web article on Greentonic.
> 
> 🦜: Greentonic is Datatonic's sustainability initiative.
> 
> 👩‍💻: How is Datatonic being sustainable?
> 
> 🦜: Datatonic is committed to sustainability and has a number of initiatives in place to reduce its environmental impact. These include:
>    * Using renewable energy sources
>    * Reducing our carbon footprint
>    * Promoting sustainable practices in our supply chain
>    * Supporting environmental charities
>
> We believe that sustainability is essential for the future of our planet and we are committed to doing our part to make a difference.

Since we used a `ConversationalRetrievalChain`, we can also correct the model when it gives the wrong response and prompt it to fix its mistake, or ask for further detail on a previous response.

**🎉 Congratulations! 🎉** You've created your first chain using LangChain which you can query for general questions in a user interface.
Let's continue extending your knowledge worker in the next task.


# Task 2: Extending the knowledge base

As mentioned, it is possible to extend the knowledge base with additional documents.
This is useful for updating a knowledge base with new information without having to re-embed established knowledge from scratch.

If you wanted to build a knowledge worker with another document type, for instance [Microsoft Word](https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/microsoft_word.html) documents, you would update the `load_documents()` function according to the documentation for that document type loader.
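
One way to support several document types is to dispatch on the file extension. The sketch below uses only the standard library; `load_text_file`, `load_csv_file`, `LOADERS` and `load_any_document` are hypothetical names, and in the notebook each entry would instead wrap the matching LangChain document loader.

```python
import csv
import tempfile
from pathlib import Path


def load_text_file(path: Path) -> list[str]:
    # Hypothetical loader: the whole file becomes one document.
    return [path.read_text(encoding="utf-8")]


def load_csv_file(path: Path) -> list[str]:
    # Hypothetical loader: each CSV row becomes one document.
    with path.open(newline="", encoding="utf-8") as f:
        return [", ".join(row) for row in csv.reader(f)]


# Map file extensions to loader functions; extend this to add new types.
LOADERS = {".txt": load_text_file, ".csv": load_csv_file}


def load_any_document(filepath: str) -> list[str]:
    path = Path(filepath)
    loader = LOADERS.get(path.suffix.lower())
    if loader is None:
        raise ValueError(f"No loader registered for '{path.suffix}' files")
    return loader(path)


example_path = Path(tempfile.mkdtemp()) / "notes.csv"
example_path.write_text("alpha,beta\ngamma,delta\n", encoding="utf-8")
documents = load_any_document(str(example_path))
print(documents)
```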

**❗ Note:** If you're looking to deploy a knowledge worker with several knowledge bases, consider an [Embedding Router Chain](https://python.langchain.com/docs/modules/chains/foundational/router#embeddingrouterchain), which combines several knowledge workers with discrete knowledge bases into a single chain that selects the best worker for each query.
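
The routing idea can be sketched without LangChain: describe each knowledge base in a sentence, then send the query to the knowledge base whose description is most similar. Everything here is illustrative: `word_overlap` is a crude stand-in for embedding similarity, and the knowledge base names and descriptions are made up.

```python
def word_overlap(a: str, b: str) -> float:
    # Jaccard similarity over lowercase words, standing in for embedding similarity.
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


# Hypothetical knowledge bases, each described by a short sentence.
KNOWLEDGE_BASES = {
    "hr": "holiday policy benefits and onboarding for employees",
    "engineering": "cloud architecture deployment pipelines and code standards",
}


def route(query: str) -> str:
    # Pick the knowledge base whose description best matches the query.
    return max(
        KNOWLEDGE_BASES,
        key=lambda name: word_overlap(query, KNOWLEDGE_BASES[name]),
    )


print(route("what is the holiday policy"))
```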

**➡️ Your task:** Extend the knowledge worker with new documents.
1. Load the new documents.
2. Add new documents to the existing vector store using the `.add_documents(documents=...)` method.

See the following example for loading Word documents:


In [None]:
from langchain.document_loaders import Docx2txtLoader


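def load_docx_documents(filepath):
    if not filepath:
        # Fail early with a clear message instead of returning None,
        # which would break `vector_store.add_documents()` later on.
        raise ValueError("Please provide the path to a .docx file.")

    # Load the documentation using a Microsoft Word parser
    loader = Docx2txtLoader(filepath)
    documents = loader.load()

    return documents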

In [None]:
# load the word document by filepath
word_documents = load_docx_documents(
    filepath=None
)  # ❗ TODO: update this function to your own file type + file path

# add this document(s) to the vector store
vector_store.add_documents(word_documents)

# "save" the new vector store back to the file system
vector_store.persist()

print("Finished adding new documents to the vectorstore.")

In [None]:
# ❗ TODO: replicate the code above to add more documents to the vector store.

**➡️ Your task:** Rerun the app and try asking questions using knowledge from your newly added documents.

**❗ Note:** The following cell will run until manually stopped. Remember to halt it before moving onto the next task.

In [None]:
qa_chain = create_qa_chain(
    vector_store=vector_store, condense_question_prompt=TASK_01_PROMPT
)

demo = demo_view.View(qa_chain=qa_chain)
demo.launch_interface()

**🎉 Congratulations! 🎉** You've learned to create text embeddings from a variety of sources, whether it's public data from your company's website or unstructured documents!

# Task 3: Generating text over a vector index

We can utilise our embedded documents for more than just Q&A.
In tasks 1 and 2, we used the embedded documents as context for answering user queries, but in this task we will use them to generate original content, with the knowledge base as a source of information and style.

The concept of this use case is to generate ideas for new blogs, utilising knowledge and style information contained in the existing company website data.
We can use Generative AI for creative ideation, too!
Let's demonstrate the possibilities for human-computer interaction (HCI) apps in this task.
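
Temperature, which the next cell lets you set, is easiest to understand on a toy example. The sketch below applies temperature to a softmax over three made-up token scores. It illustrates the general sampling idea and is not a description of how the Vertex AI model is implemented internally. A temperature of 0 is a special case (always pick the top token) and is excluded here to avoid dividing by zero.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    # Divide scores by the temperature before normalising; requires temperature > 0.
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
low = softmax_with_temperature(toy_logits, 0.2)
high = softmax_with_temperature(toy_logits, 1.0)
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

A low temperature concentrates probability on the highest-scoring token, while a higher one spreads it across the alternatives, which is why higher settings tend to read as more creative.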

In [None]:
# @title Set LLM temperature { run: "auto", display-mode: "form" }
# @markdown Temperature controls the degree of creativity / randomness introduced into the LLM.
temperature = 0.7  # @param {type:"slider", min:0, max:1, step:0.01}

In [None]:
TASK_03_TEMPLATE = """\
Using the provided context, write the outline of a company blog post.
Include a bullet-point list of the main talking points, and a brief summary of the overall blog.

Context: {context}
Topic: {topic}
"""

TASK_03_PROMPT = PromptTemplate.from_template(TASK_03_TEMPLATE)

In [None]:
from langchain.chains import LLMChain

model = VertexAI(temperature=temperature)

chain = LLMChain(llm=model, prompt=TASK_03_PROMPT)

In [None]:
def generate_blog_outline(topic: str, k: int):
    # search for 'k' nearest documents related to our topic.
    docs = vector_store.similarity_search(topic, k=k)

    # associate topic with the content of each document to generate inputs
    inputs = [{"context": doc.page_content, "topic": topic} for doc in docs]

    # generate blog outline
    output = chain.apply(inputs)

    return output

**➡️ Your task:** Create ideas for a new blog post.
Try adjusting the title of the post to generate new ideas!

In [None]:
# @title Set blog prompt / title { run: "auto", display-mode: "form" }
# @markdown Be descriptive, as the LLM will collect semantically similar sources as inspiration.
BLOG_TITLE = "How we're making our business more sustainable"  # @param {type:"string"}

In [None]:
from IPython.display import display, Markdown

# generate variations of blog posts on the topic provided, based on the 4 most relevant documents
output = generate_blog_outline(BLOG_TITLE, k=4)
markdown = ""

for i, blog in enumerate(output):
    markdown += f"# #{i} {BLOG_TITLE}\n{blog['text']}\n\n"

display(Markdown(markdown))

**➡️ Your task:** Update the `TASK_03_PROMPT`, `temperature` and `BLOG_TITLE` variables to create new types of content.
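
For instance, a different template changes the type of content produced. The sketch below uses plain `str.format` so that it runs on its own; `SOCIAL_POST_TEMPLATE` and the sample context are made up for illustration. In the notebook you would pass the template to `PromptTemplate.from_template` and rebuild the `LLMChain` with the new prompt.

```python
# Hypothetical alternative template: a short social media post instead of a blog outline.
SOCIAL_POST_TEMPLATE = """\
Using the provided context, write a social media post of at most 50 words.
Match the tone of the context and end with a call to action.

Context: {context}
Topic: {topic}
"""

# Fill in the same two input keys that the original template uses.
sample_prompt = SOCIAL_POST_TEMPLATE.format(
    context="We cut office energy use by switching to renewable suppliers.",
    topic="How we're making our business more sustainable",
)
print(sample_prompt)
```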

**🎉 Congratulations! 🎉** You've completed task 3 and generated ideas for future blog posts!
Continue with the next section to explore more possibilities and ideas using LangChain.

# Task 4: Extending the chain

So far you've created two types of chains:

### LLMChain

The `LLMChain` is a simple chain that adds some functionality around language models.
It is used widely throughout LangChain, including in other chains and agents.

An LLMChain consists of a **PromptTemplate** and a **language model** (either an LLM or chat model).
It formats the prompt template using the input key values provided (and also memory key values, if available), passes the formatted string to the LLM, and returns the LLM output.

```python
chain = LLMChain(llm=model, prompt=PROMPT)
```
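
Conceptually, the chain does two things: fill in the template, then call the model. The following sketch mimics that with a hypothetical `ToyChain` class and a fake model function, so it runs without credentials. It illustrates the idea and is not LangChain's implementation.

```python
from typing import Callable


class ToyChain:
    """Hypothetical stand-in for an LLM chain: format the prompt, then call the model."""

    def __init__(self, llm: Callable[[str], str], template: str) -> None:
        self.llm = llm
        self.template = template

    def run(self, **inputs: str) -> str:
        prompt = self.template.format(**inputs)  # fill the input keys into the template
        return self.llm(prompt)  # pass the formatted string to the model


def fake_llm(prompt: str) -> str:
    # Fake model for illustration: it just reports the prompt it received.
    return f"[model saw {len(prompt)} characters] {prompt}"


toy_chain = ToyChain(llm=fake_llm, template="Summarise in one line: {text}")
result = toy_chain.run(text="LangChain composes prompts and models.")
print(result)
```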

### ConversationalRetrievalChain

The `ConversationalRetrievalChain` builds on the `RetrievalQA` chain to provide a chat history component.

It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question answering chain to return a response.

To create one, you will need a retriever.
In the below example, we will create one from a vector store, which can be created from embeddings.

```python
retriever = vector_store.as_retriever(search_kwargs={"k": k})
model = VertexAI(temperature=temperature)
chain = ConversationalRetrievalChain.from_llm(
    llm=model,
    retriever=retriever,
    return_source_documents=True,
    condense_question_prompt=TASK_01_PROMPT,
)
```
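
The three steps described above (condense, retrieve, answer) can be sketched in plain Python. All three helper functions are hypothetical stubs: in the real chain, condensing and answering are LLM calls, and retrieval queries the vector store.

```python
def condense_question(chat_history: list[tuple[str, str]], question: str) -> str:
    # Stub: a real chain asks the LLM to rewrite the follow-up as a standalone question.
    if not chat_history:
        return question
    previous_question, _ = chat_history[-1]
    return f"{question} in the context of {previous_question}"


def retrieve(standalone_question: str, documents: list[str], k: int = 2) -> list[str]:
    # Stub: rank documents by how many words they share with the question.
    words = set(standalone_question.lower().split())
    ranked = sorted(
        documents,
        key=lambda d: len(words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def answer(standalone_question: str, sources: list[str]) -> dict:
    # Stub: a real chain passes the question and sources to a question-answering prompt.
    top = sources[0] if sources else "no sources found"
    return {"answer": f"Based on {len(sources)} source(s): {top}", "source_documents": sources}


def conversational_qa(question: str, chat_history: list[tuple[str, str]], documents: list[str]) -> dict:
    standalone = condense_question(chat_history, question)
    sources = retrieve(standalone, documents)
    return answer(standalone, sources)


docs = ["datatonic is a data consultancy", "greentonic is a sustainability initiative"]
history = [("what is greentonic", "Greentonic is a sustainability initiative.")]
response = conversational_qa("who runs it", history, docs)
print(response["answer"])
```

Without the condensing step, the follow-up "who runs it" would share almost no words with the relevant document; folding in the previous question is what lets retrieval find it.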

### Explore more chains

**➡️ Your task:** Firm up your knowledge about the two chains used in this notebook [here](https://python.langchain.com/docs/modules/chains/foundational/llm_chain) and [here](https://python.langchain.com/docs/modules/chains/popular/chat_vector_db).
In which scenarios should you apply either of them?
What are their limitations?

*The LLMChain is useful when ...*

*Its limitations are ...*

*The ConversationalRetrievalChain is useful when ...*

*Its limitations are ...*

**➡️ Your task:** Read about more types of chains in the [official LangChain documentation](https://python.langchain.com/docs/modules/chains/additional/).
We recommend the **Sequential chain** and **Self-critique chain with constitutional AI**.
How can you extend your conversational knowledge worker which is currently based on the `ConversationalRetrievalChain`?
Summarise your idea using either pseudo code or actual code if you have time!
Overall, we would like you to consider:

**Idea + idea description:**

- *The idea is ...*
- *What it is ...*

**Problem it solves + impact:**

- *It would solve the following challenge ...*
- *The volume or value of the impact would be ...*

**Approach + Next steps:**

- *Next steps would be ...*

**❗ Note:** Do you have any other ideas (even outside of implementing a knowledge worker)?
Feel free to ideate about another use case which is relevant to your industry or company!
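
As a starting point for your pseudo code, the idea of a sequential chain with a self-critique step can be sketched in plain Python. Every function below is a hypothetical stub standing in for an LLM call; the sketch only shows how the output of one step becomes the input of the next.

```python
from typing import Callable

Step = Callable[[str], str]


def draft_answer(question: str) -> str:
    # Stub for the knowledge worker's first answer.
    return f"Draft answer to '{question}'"


def critique(draft: str) -> str:
    # Stub for a self-critique step that checks the draft against a principle.
    return f"{draft} | critique: cite a source"


def revise(critiqued: str) -> str:
    # Stub for the revision step that applies the critique.
    return f"{critiqued} | revised: source added"


def run_sequential(steps: list[Step], initial_input: str) -> str:
    # Feed each step's output into the next step, in order.
    value = initial_input
    for step in steps:
        value = step(value)
    return value


final = run_sequential([draft_answer, critique, revise], "What is Greentonic?")
print(final)
```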

In [None]:
# ❗ TODO: create pseudo code

**🎉 Congratulations! 🎉** You've completed the last task of this hackathon! Now let's get a sneak peek of Gen App Builder and Google Enterprise Search.

## Bonus Track - Using Gen App Builder Within your Knowledge Worker
Google Cloud has released a tool called Enterprise Search within the Gen App Builder service. Using Enterprise Search, you can ingest websites and internal structured and unstructured data into a search engine, and then use a Knowledge Worker to retrieve information in natural language. This is analogous to an internal Google search engine for your documents.

Using a custom implementation of a Knowledge Worker, you can combine the power of Gen App Builder with the customisation ability of a Knowledge Worker and build applications more simply, faster and more robustly.

Let's have a sneak peek at how to do this.

**❗ Note:** This is not publicly available, so you will need a whitelisted Google Cloud project. We have already created an Enterprise Search for you, so you can go ahead and try it out.

**➡️ Your task:** Obtain the credentials from one of the workshop leaders for the `dt-vertex-gen-ai-dev` project, in order to access a preview of Enterprise Search.

In [None]:
# @title Set project credentials. { run: "auto", display-mode: "form" }
# @markdown Set the filepath to the `.json` credentials file.

# @markdown **❗ Note** : This overwrites your previous credentials. If you want to return to previous tasks, reset the credentials filepath using the original `GOOGLE_APPLICATION_CREDENTIALS` assignment.
GOOGLE_APPLICATION_CREDENTIALS = "/content/es_credentials.json"  # @param {type:"string"}
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = GOOGLE_APPLICATION_CREDENTIALS

# @markdown These values can be kept as default for this task.
SEARCH_ENGINE_PROJECT_ID = "dt-vertex-gen-ai-dev"  # @param {type:"string"}
SEARCH_ENGINE_ID = "wpp-genai-day_1689017718091"  # @param {type:"string"}

**➡️ Your task:** Run the code cell below to update the Q&A Chain to use the new retriever.

**❗ Note how the only code snippet we need to update is the `retriever` variable. Enterprise Search integrates directly into our existing app with minimal code changes.**
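
The reason the swap is so small is that the chain depends only on the retriever's interface, not on where the documents come from. Here is a minimal sketch of that design in plain Python; the `Retriever` protocol and both toy retriever classes are hypothetical and do not reflect the actual Enterprise Search or LangChain classes.

```python
from typing import Protocol


class Retriever(Protocol):
    def get_relevant_documents(self, query: str) -> list[str]: ...


class LocalListRetriever:
    """Toy retriever over an in-memory list (stands in for the local vector store)."""

    def __init__(self, documents: list[str]) -> None:
        self.documents = documents

    def get_relevant_documents(self, query: str) -> list[str]:
        return [d for d in self.documents if query.lower() in d.lower()]


class RemoteSearchRetriever:
    """Toy retriever that would call a hosted search service; here it returns a canned result."""

    def get_relevant_documents(self, query: str) -> list[str]:
        return [f"search result for '{query}'"]


def answer_question(retriever: Retriever, query: str) -> str:
    # The "chain" only relies on get_relevant_documents, so retrievers are interchangeable.
    sources = retriever.get_relevant_documents(query)
    return f"{len(sources)} source(s) found for '{query}'"


local = LocalListRetriever(["Greentonic overview", "MLOps guide"])
print(answer_question(local, "greentonic"))
print(answer_question(RemoteSearchRetriever(), "greentonic"))
```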

In [None]:
import dt_gen_ai_hackathon_helper.enterprise_search.enterprise_search as enterprise_search


def create_qa_chain(condense_question_prompt, k=4, temperature=0.0):
    # Using Enterprise Search as a retriever
    retriever = enterprise_search.EnterpriseSearchRetriever(
        project_id=SEARCH_ENGINE_PROJECT_ID, search_engine_id=SEARCH_ENGINE_ID, k=k
    )

    # The selected Google model uses embedded documents related to the query
    # It parses these documents in order to answer the user question.
    # We use the Google LLM, however other models can be substituted here
    model = VertexAI(temperature=temperature)

    # A conversation retrieval chain keeps a history of Q&A / conversation
    # This allows for contextual questions such as "give an example of that (previous response)".
    # The chain is also set to return the source documents used in generating an output
    # This allows for explainability behind model output.
    chain = ConversationalRetrievalChain.from_llm(
        llm=model,
        retriever=retriever,
        return_source_documents=True,
        condense_question_prompt=condense_question_prompt,
    )

    return chain

**➡️ Your task:** Rerun the app using the new retriever.

**❗ Note:** The following cell will run until manually stopped. Remember to halt it before moving onto the next task.

In [None]:
qa_chain = create_qa_chain(condense_question_prompt=TASK_01_PROMPT)

demo = demo_view.View(qa_chain=qa_chain)
demo.launch_interface()

Try asking the same questions you used to query your knowledge worker. Does it answer the question in the same way? Does it give more or less detail? What are the immediate differences between solutions?

**🎉 Congratulations! 🎉** You've gotten started with Enterprise Search. 

# Conclusion

## What have we built?

In this session, we have built a knowledge worker use-case for accessing your complex information using Generative AI.
This concept can be extended into a fully-fledged tool that can unlock the value of your data for customers or internal use.

## Going further

This workshop has introduced all the LangChain knowledge required to create a knowledge worker.
The next steps for moving this project from development to production are discussed below.

## Decoupling LangChain from Gradio

It is not necessary to run LangChain within a Gradio app.
Decoupling LangChain into a separate API has several benefits:
1. Scalability - the backend can be deployed on scalable servers / Docker containers.
2. Simplified code - a loosely coupled frontend and backend can lead to simpler code, which is easier to update and maintain.
3. Flexible frontends - if a more professional user interface is needed, such as a native React app, replacing Gradio is a straightforward process. FastAPI can be called from JavaScript, etc., allowing you to move beyond Python frontend frameworks.

An example of this separation can be found on the GitHub repository, using FastAPI to create a simple LangChain API server and Poetry to manage separate server environments.
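
To illustrate the separation without extra dependencies, the sketch below exposes a question-answering function over HTTP using only Python's standard library. It is not the FastAPI server from the repository: `answer_question`, the `/ask` route and the JSON field names are assumptions made for this example, and the answer function is a stub where the chain call would go.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def answer_question(question: str) -> str:
    # Stub: in a real backend this would call the Q&A chain.
    return f"You asked: {question}"


def handle_payload(payload: dict) -> tuple[int, dict]:
    # Pure request logic, kept separate from HTTP so it is easy to test.
    question = payload.get("question")
    if not isinstance(question, str) or not question.strip():
        return 400, {"error": "'question' must be a non-empty string"}
    return 200, {"answer": answer_question(question)}


class AskHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/ask":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = {}
        status, body = handle_payload(payload if isinstance(payload, dict) else {})
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    # Blocks until interrupted; any frontend can now POST JSON to /ask.
    HTTPServer(("127.0.0.1", port), AskHandler).serve_forever()


print(handle_payload({"question": "What is Datatonic?"}))
```

Calling `serve()` starts the server. Because the frontend only needs to send JSON over HTTP, it can be written in any language or framework.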

## Deploying on Google Cloud

Once we have decoupled our frontend and backend code, we can deploy the project onto Google Cloud.

These reference architecture diagrams mirror the flow diagrams we first introduced in the workshop introduction. Using Google Cloud, we can create production pipelines for creating / updating vector databases, and deploy a knowledge worker API (which can be connected to a web UI, Slack bot, etc.).

**Example architecture: Ingestion**
![A typical ingestion chain](https://github.com/teamdatatonic/gen-ai-hackathon/blob/7f37d477b18ace5912d34b0574512559d7a457ed/assets/knowledge-worker-gcp-ingestion-pipeline.png?raw=true)

By creating a pipeline for data ingestion, we can continue to extend the knowledge base of our knowledge worker as new documents and documentation are produced.
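
A key detail of such a pipeline is skipping documents that are already in the knowledge base, so that only new content is embedded. Below is a minimal sketch of that check using content hashes; the function names are hypothetical, and the `ingest` callback stands in for the chunk, embed and store steps.

```python
import hashlib
from typing import Callable


def content_hash(text: str) -> str:
    # A stable fingerprint of the document's content.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ingest_new_documents(
    documents: list[str], seen_hashes: set[str], ingest: Callable[[str], None]
) -> int:
    # Ingest only documents whose hash has not been seen; return how many were added.
    added = 0
    for document in documents:
        digest = content_hash(document)
        if digest in seen_hashes:
            continue  # already in the knowledge base, skip re-embedding
        ingest(document)
        seen_hashes.add(digest)
        added += 1
    return added


knowledge_base: list[str] = []
seen: set[str] = set()
first_run = ingest_new_documents(["doc A", "doc B"], seen, knowledge_base.append)
second_run = ingest_new_documents(["doc B", "doc C"], seen, knowledge_base.append)
print(first_run, second_run, knowledge_base)
```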

**Example architecture: Inference**
![A typical inference chain](https://github.com/teamdatatonic/gen-ai-hackathon/blob/7f37d477b18ace5912d34b0574512559d7a457ed/assets/knowledge-worker-gcp-inference-pipeline.png?raw=true)

By creating a pipeline for inference, we can leverage the power of Google Cloud to provide a highly reliable and scalable API that can power a variety of applications.

**🎉 Congratulations! 🎉** You've completed this notebook!
Now it's time to embark on your Generative AI journey and ideate about use cases which can benefit your company in conjunction with, or in addition to, your first knowledge worker.