# Building a Live RAG Pipeline over Google Drive Files

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/ingestion/ingestion_gdrive.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this guide we show you how to build a "live" RAG pipeline over Google Drive files.

This pipeline indexes Google Drive files and loads them into a Redis vector store. Each time you rerun the ingestion pipeline afterwards, it propagates **incremental updates**: only changed documents are updated in the vector store, so unchanged documents are not re-indexed.

We use the following [data source](https://drive.google.com/drive/folders/1RFhr3-KmOZCR5rtp4dlOMNl3LKe1kOA5?usp=sharing) - you will need to copy these files and upload them to your own Google Drive directory!

**NOTE**: You will also need to set up a service account and a `credentials.json` file. See our LlamaHub page for the Google Drive loader for more details: https://llamahub.ai/l/readers/llama-index-readers-google?from=readers
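
Before passing a service account key to the loader, it can help to confirm that the key file is complete. The helper below is a minimal sketch and not part of LlamaIndex: `load_service_account_key` and `REQUIRED_FIELDS` are hypothetical names, and the field list is an assumed subset of what a standard service account key file contains.

```python
import json
from typing import Any, Dict

# Assumed subset of the fields found in a service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def load_service_account_key(path: str) -> Dict[str, Any]:
    """Read a service account key file and check that it looks complete."""
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    missing = [name for name in REQUIRED_FIELDS if name not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError("expected a key of type 'service_account'")
    return key
```

Keeping the key in a file that is excluded from version control, and loading it at runtime like this, avoids pasting the private key into the notebook itself.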



## Setup

We install required packages and launch the Redis Docker image.

In [None]:
%pip install llama_index

In [None]:
%pip install llama-index-storage-docstore-redis
%pip install llama-index-vector-stores-redis
%pip install llama-index-embeddings-huggingface
%pip install llama-index-readers-google

In [3]:
# if creating a new container
!docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
# if starting an existing container
# !docker start -a redis-stack



In [248]:
import os

os.environ["OPENAI_API_KEY"] = "sk-..."  # replace with your own key; never commit a real key

## Define Ingestion Pipeline

Here we define the ingestion pipeline. Given a set of documents, we will run sentence splitting/embedding transformations, and then load them into a Redis docstore/vector store.

The vector store indexes the data and stores the embeddings, while the docstore tracks which documents have already been ingested so that duplicates can be detected.
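
To make the role of the docstore concrete, here is a simplified sketch of hash-based change detection. It illustrates the general idea only and is not LlamaIndex's actual implementation: `content_hash`, `select_docs_to_process`, and the in-memory `store` dict (standing in for the Redis docstore) are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_docs_to_process(
    docs: List[Tuple[str, str]], seen_hashes: Dict[str, str]
) -> List[str]:
    """Return IDs of new or changed docs and record their hashes.

    `docs` is a list of (doc_id, text) pairs; `seen_hashes` stands in for
    the docstore and maps doc_id -> hash from the previous run.
    """
    to_process = []
    for doc_id, text in docs:
        new_hash = content_hash(text)
        if seen_hashes.get(doc_id) != new_hash:
            to_process.append(doc_id)
            seen_hashes[doc_id] = new_hash
    return to_process


store: Dict[str, str] = {}
first = select_docs_to_process([("a.docx", "hello"), ("b.docx", "world")], store)
second = select_docs_to_process([("a.docx", "hello"), ("b.docx", "world!")], store)
print(first)   # ['a.docx', 'b.docx']
print(second)  # ['b.docx']
```

On the first run every document is new, so both are processed. On the second run only `b.docx` has changed, so it is the only one that would need to be re-chunked and re-embedded. This only works if document IDs stay the same between runs.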

In [73]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.ingestion import (
    DocstoreStrategy,
    IngestionPipeline,
    IngestionCache,
)
from llama_index.storage.kvstore.redis import RedisKVStore as RedisCache
from llama_index.storage.docstore.redis import RedisDocumentStore
from llama_index.core.node_parser import SentenceSplitter
from llama_index.vector_stores.redis import RedisVectorStore

from redisvl.schema import IndexSchema

In [74]:
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")



In [101]:
# Define the Redis index schema
custom_schema = IndexSchema.from_dict(
    {
        "index": {"name": "gdrive", "prefix": "doc"},
        # customize fields that are indexed
        "fields": [
            # required fields for llamaindex
            {"type": "tag", "name": "id"},
            {"type": "tag", "name": "doc_id"},
            {"type": "text", "name": "text"},
            # custom vector field for bge-small-en-v1.5 embeddings
            {
                "type": "vector",
                "name": "vector",
                "attrs": {
                    "dims": 384,
                    "algorithm": "hnsw",
                    "distance_metric": "cosine",
                },
            },
        ],
    }
)

# Define the vector store using the schema above
vector_store = RedisVectorStore(
    schema=custom_schema,
    redis_url="redis://localhost:6379",
)
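
The `dims` value in the schema has to match the length of the vectors the embedding model produces (384 here). The sketch below shows one way to check this early. `vector_dims` and `check_embedding_fits` are hypothetical helpers, and `schema_dict` is a trimmed-down stand-in for the dictionary passed to `IndexSchema.from_dict` above.

```python
from typing import Any, Dict, Sequence


def vector_dims(schema: Dict[str, Any]) -> int:
    """Return the `dims` of the first vector field in a schema dict."""
    for field in schema["fields"]:
        if field["type"] == "vector":
            return field["attrs"]["dims"]
    raise ValueError("schema has no vector field")


def check_embedding_fits(schema: Dict[str, Any], embedding: Sequence[float]) -> None:
    """Raise if the embedding length differs from the schema's vector dims."""
    expected = vector_dims(schema)
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} dims, schema expects {expected}"
        )


# Trimmed-down stand-in for the schema dictionary used in this guide.
schema_dict = {
    "fields": [
        {"type": "tag", "name": "id"},
        {"type": "vector", "name": "vector", "attrs": {"dims": 384}},
    ]
}
check_embedding_fits(schema_dict, [0.0] * 384)  # passes silently
```

In the notebook, you could pass the result of `embed_model.get_text_embedding("test")` in place of the dummy vector to check the real model against the schema before creating the index.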

In [148]:
# Optional: clear vector store if exists
if vector_store.index_exists():
    vector_store.delete_index()
vector_store.create_index()

In [108]:
# Set up the ingestion cache layer
cache = IngestionCache(
    cache=RedisCache.from_host_and_port("localhost", 6379),
    collection="redis_cache",
)

In [109]:
docstore = RedisDocumentStore.from_host_and_port(
    "localhost", 6379, namespace="document_store"
)

In [105]:
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),
        embed_model,
    ],
    docstore=docstore,
    vector_store=vector_store,
    cache=cache,
    docstore_strategy=DocstoreStrategy.UPSERTS,
)

### Define our Vector Store Index

We define our index to wrap the underlying vector store.

In [80]:
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_vector_store(
    pipeline.vector_store, embed_model=embed_model
)

## Load Initial Data

Here we load data from our [Google Drive Loader](https://llamahub.ai/l/readers/llama-index-readers-google?from=readers) on LlamaHub.

The loaded docs are the header sections of our [Use Cases from our documentation](https://docs.llamaindex.ai/en/latest/use_cases/q_and_a/root.html).

In [89]:
from llama_index.readers.google import GoogleDriveReader


In [None]:
!pip install docx2txt

In [143]:
import json

# Load the service account key from a local file rather than pasting it into the notebook.
# "service_account_key.json" is a placeholder path; point it at your own key file
# and keep that file out of version control.
with open("service_account_key.json") as f:
    service_account_key = json.load(f)

loader = GoogleDriveReader(service_account_key=service_account_key)


def load_data(folder_id: str):
    docs = loader.load_data(folder_id=folder_id)
    print(len(docs), "docs")
    return docs


# Replace with the ID of your own Google Drive folder
docs = load_data(folder_id="1UBvqhlESL6r8fLzjcAHuAxgH4vbPvT7K")
# docs = load_data(folder_id="1RFhr3-KmOZCR5rtp4dlOMNl3LKe1kOA5")  # original shared folder

5 docs


In [163]:
from llama_index.core import SimpleDirectoryReader

# Alternative: load files from a local "data" directory instead of Google Drive.
# Separate names keep the Google Drive `loader` and `load_data` defined above usable.
local_loader = SimpleDirectoryReader("data")


def load_local_data():
    docs = local_loader.load_data()
    print(len(docs), "docs")
    return docs


docs = load_local_data()

38 docs


In [164]:
for doc in docs:
    # Use the file name as a stable document ID so that reruns can detect changes.
    doc.id_ = doc.metadata["file_name"]

In [165]:
nodes = pipeline.run(documents=docs)
print(f"Ingested {len(nodes)} Nodes")

Ingested 1 Nodes


Since this is our first time starting up the vector store, we see that we've transformed/ingested all the documents into it (by chunking, and then by embedding).

### Ask Questions over Initial Data

In [179]:
from llama_index.llms.openai import OpenAI
llm = OpenAI(temperature=0.1, model="gpt-4o")
query_engine = index.as_query_engine(llm=llm)

In [184]:
response = query_engine.query("What is this documentation about?")
print(response)

This documentation appears to be about the Ursina game engine. It includes an HTML template for generating an API reference page, which lists various components and modules of the engine. These components are categorized into groups such as Basics, Core Modules, Graphics, Procedural Models, Animation, Math, Gameplay, Collision, Prefabs, UI, Editor, Scripts, Assets, and Shaders. Additionally, there are references to tutorials like an introduction and a platformer tutorial, which are likely aimed at helping users learn how to use the Ursina engine effectively.


In [199]:
## Attempt to use Pydantic Output response formats
# from typing import List
# from pydantic import BaseModel

# class Program(BaseModel):
#     """Data model for python program."""

#     name: str
#     python_code : str
#     explanation : str

# query_engine = index.as_query_engine(llm=llm, response_mode="refine", output_cls=Program)

In [201]:
%pip install langchain


In [208]:
from llama_index.core.output_parsers import LangchainOutputParser
from llama_index.llms.openai import OpenAI
from langchain.output_parsers import StructuredOutputParser, ResponseSchema

# define output schema
response_schemas = [
    ResponseSchema(
        name="Python code",
        description="Write python code that correspond to the query",
    ),
    ResponseSchema(
        name="Explanation",
        description="Describe what you've done",
    ),
]

# define output parser
lc_output_parser = StructuredOutputParser.from_response_schemas(
    response_schemas
)
output_parser = LangchainOutputParser(lc_output_parser)
print("Output parser:", output_parser)

llm = OpenAI(temperature=0.1, model="gpt-4o", output_parser=output_parser)
# obtain a structured response
query_engine = index.as_query_engine(llm=llm)
# response = query_engine.query(
#     "What are a few things the author did growing up?",
# )
# print(str(response))

In [209]:
response = query_engine.query("Write a tic tac toe game")
print(response)

{'Python code': "from ursina import *\n\nif __name__ == '__main__':\n    app = Ursina()\n\ncamera.orthographic = True\ncamera.fov = 4\ncamera.position = (1, 1)\nText.default_resolution *= 2\n\nplayer = Entity(name='o', color=color.azure)\ncursor = Tooltip(player.name, color=player.color, origin=(0,0), scale=4, enabled=True)\ncursor.background.color = color.clear\nbg = Entity(parent=scene, model='quad', texture='shore', scale=(16,8), z=10, color=color.light_gray)\nmouse.visible = False\n\n# create a matrix to store the buttons in. makes it easier to check for victory\nboard = [[None for x in range(3)] for y in range(3)]\n\nfor y in range(3):\n    for x in range(3):\n        b = Button(parent=scene, position=(x,y))\n        board[x][y] = b\n\n        def on_click(b=b):\n            b.text = player.name\n            b.color = player.color\n            b.collision = False\n            check_for_victory()\n\n            if player.name == 'o':\n                player.name = 'x'\n            

In [243]:
print(type(response))
response

<class 'llama_index.core.base.response.schema.Response'>


Response(response='{\'Python code\': "from ursina import *\\n\\nif __name__ == \'__main__\':\\n    app = Ursina()\\n\\ncamera.orthographic = True\\ncamera.fov = 4\\ncamera.position = (1, 1)\\nText.default_resolution *= 2\\n\\nplayer = Entity(name=\'o\', color=color.azure)\\ncursor = Tooltip(player.name, color=player.color, origin=(0,0), scale=4, enabled=True)\\ncursor.background.color = color.clear\\nbg = Entity(parent=scene, model=\'quad\', texture=\'shore\', scale=(16,8), z=10, color=color.light_gray)\\nmouse.visible = False\\n\\n# create a matrix to store the buttons in. makes it easier to check for victory\\nboard = [[None for x in range(3)] for y in range(3)]\\n\\nfor y in range(3):\\n    for x in range(3):\\n        b = Button(parent=scene, position=(x,y))\\n        board[x][y] = b\\n\\n        def on_click(b=b):\\n            b.text = player.name\\n            b.color = player.color\\n            b.collision = False\\n            check_for_victory()\\n\\n            if player.na

In [245]:
import ast

# `response` is the Response object returned by the query above
response_str = response.response

# Parse the string representation of the dictionary
response_dict = ast.literal_eval(response_str)

# Extract 'Python code' and 'Explanation'
python_code = response_dict.get('Python code')
explanation = response_dict.get('Explanation')

print("Python Code:\n", python_code)
print("\nExplanation:\n", explanation)

Python Code:
 from ursina import *

if __name__ == '__main__':
    app = Ursina()

camera.orthographic = True
camera.fov = 4
camera.position = (1, 1)
Text.default_resolution *= 2

player = Entity(name='o', color=color.azure)
cursor = Tooltip(player.name, color=player.color, origin=(0,0), scale=4, enabled=True)
cursor.background.color = color.clear
bg = Entity(parent=scene, model='quad', texture='shore', scale=(16,8), z=10, color=color.light_gray)
mouse.visible = False

# create a matrix to store the buttons in. makes it easier to check for victory
board = [[None for x in range(3)] for y in range(3)]

for y in range(3):
    for x in range(3):
        b = Button(parent=scene, position=(x,y))
        board[x][y] = b

        def on_click(b=b):
            b.text = player.name
            b.color = player.color
            b.collision = False
            check_for_victory()

            if player.name == 'o':
                player.name = 'x'
                player.color = color.orange
   

In [247]:
query_engine.get_prompts().keys()

dict_keys(['response_synthesizer:text_qa_template', 'response_synthesizer:refine_template'])


In [192]:
response = query_engine.query("What are the examples provided here?")
print(response)

The examples provided are:
1. Tic Tac Toe
2. Inventory
3. Pong
4. Minecraft Clone
5. Rubik's Cube
6. Clicker Game
7. Platformer
8. FPS
9. Particle System
10. Column Graph


In [193]:
response = query_engine.query("What can I build with this engine?")
print(response)

With this engine, you can build various types of games and interactive applications. You can start with simple projects like creating a window and rendering a colored cube with basic controls. As you progress, you can create more complex scenes such as a solar system using spheres, or a landscape with moving planes and textured cubes. The engine also supports advanced features like texturing, texture animations, alpha blending, and mouse collisions, allowing you to develop more sophisticated and interactive environments.


In [194]:
response = query_engine.query("Write a tic tac toe game")
print(response)

To create a Tic Tac Toe game using the Ursina engine, you can use the following code:

```python
from ursina import *

if __name__ == '__main__':
    app = Ursina()

camera.orthographic = True
camera.fov = 4
camera.position = (1, 1)
Text.default_resolution *= 2

player = Entity(name='o', color=color.azure)
cursor = Tooltip(player.name, color=player.color, origin=(0,0), scale=4, enabled=True)
cursor.background.color = color.clear
bg = Entity(parent=scene, model='quad', texture='shore', scale=(16,8), z=10, color=color.light_gray)
mouse.visible = False

# create a matrix to store the buttons in. makes it easier to check for victory
board = [[None for x in range(3)] for y in range(3)]

for y in range(3):
    for x in range(3):
        b = Button(parent=scene, position=(x,y))
        board[x][y] = b

        def on_click(b=b):
            b.text = player.name
            b.color = player.color
            b.collision = False
            check_for_victory()

            if player.name == 'o'
```

## Modify and Reload the Data

Let's try modifying our ingested data!

We modify the "Q&A" doc to include an extra "structured analytics" block of text. See our [updated document](https://docs.google.com/document/d/1QQMKNAgyplv2IUOKNClEBymOFaASwmsZFoLmO_IeSTw/edit?usp=sharing) as a reference.

Now let's rerun the ingestion pipeline.

In [None]:
docs = load_data(folder_id="1RFhr3-KmOZCR5rtp4dlOMNl3LKe1kOA5")
nodes = pipeline.run(documents=docs)
print(f"Ingested {len(nodes)} Nodes")

Notice how only one node is ingested. This is because only one document changed, while the other documents stayed the same. This means that we only need to re-transform and re-embed one document!

### Ask Questions over New Data

In [None]:
query_engine = index.as_query_engine()

In [None]:
response = query_engine.query("What are the sub-types of question answering?")

In [None]:
print(str(response))

The sub-types of question answering mentioned in the context are semantic search, summarization, and structured analytics.
