In [1]:
!pip install llama-index-readers-web bs4 iprogress

In [None]:
import uuid
import chromadb
import os
import nest_asyncio

from typing import Annotated
from llama_index.core.settings import Settings
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.readers.web import BeautifulSoupWebReader
from llama_index.core.storage import StorageContext
from llama_index.core.indices import VectorStoreIndex
from llama_index.core.base.llms.types import MessageRole
from llama_index.core.tools import FunctionTool

from llama_index.core.tools import QueryEngineTool
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.types import ChatMessage
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core.vector_stores import MetadataFilters, MetadataFilter, FilterOperator
from llama_index.core import PromptTemplate
from llama_index.core.agent import ReActAgent

In [89]:
embed_model = OllamaEmbedding(model_name="mxbai-embed-large")
llm = Ollama(model="hf.co/MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF:Q2_K", request_timeout=300)
Settings.llm = llm
Settings.embed_model = embed_model

In [56]:
urls = [
    'https://de.wikipedia.org/wiki/Katzen',
    'https://de.wikipedia.org/wiki/Indonesien'
]

websites = BeautifulSoupWebReader().load_data(urls=urls)

In [57]:
# Build one in-memory index (and one query engine) per scraped page.
indexes = [
    VectorStoreIndex.from_documents([website], storage_context=StorageContext.from_defaults())
    for website in websites
]

query_engines = [
    index.as_query_engine() for index in indexes
]

In [58]:
for query_engine in query_engines:
    response = query_engine.query("What are the texts about?")
    print(response)
    print("\n")

The texts discuss how cats communicate using various methods such as sounds, visual signals like ear and tail positions, physical contact, and chemical signals like scents from urine or anal gland secretions. It also mentions that different cat species have unique vocalizations, with some having sounds similar to domestic cats and others having distinct noises like sharp whistles or short barks. Additionally, it notes that larger cats such as tigers, jaguars, leopards, and lions often have species-specific calling patterns.


The texts discuss various aspects of Indonesian culture, including its historical influences from Buddhism and Hinduism, the art form of Batik, the Pawukon calendar used in Java and Bali, dietary customs with a focus on rice as a staple food, and the establishment of Special Olympics Indonesia. Additionally, there is a list of literary works related to Indonesia covering topics such as literature, society, modern Indonesia, and its history.




In [68]:
nest_asyncio.apply()

# chroma DB
chroma_client = chromadb.HttpClient()
chroma_collection = chroma_client.get_or_create_collection(os.environ.get("CHROMA_COLLECTION_NAME", 'llama-test-chroma-3'))
chroma_vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=chroma_vector_store)
vector_index = VectorStoreIndex.from_vector_store(vector_store=chroma_vector_store, storage_context=storage_context,
                                                      embed_model=Settings.embed_model)
chat_history = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content="""
            You are designed to help with a variety of tasks, from answering questions \
            to providing summaries to other types of analyses.
            Your name is Hal Emmerich, from 2001 Space Odyssey.

            ## Tools
            You have access to a wide variety of tools. You are responsible for using
            the tools in any sequence you deem appropriate to complete the task at hand.
            This may require breaking the task into subtasks and using different tools
            to complete each subtask.

            You have access to the following tools:
            {tool_desc}

            ## Output Format
            To answer the question, please use the following format.

            ```
            Thought: I need to use a tool to help me answer the question.
            Action: tool name (one of {tool_names}) if using a tool.
            Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})
            ```

            Please ALWAYS start with a Thought.

            Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.

            If this format is used, the user will respond in the following format:

            ```
            Observation: tool response
            ```

            You should keep repeating the above format until you have enough information
            to answer the question without using any more tools. At that point, you MUST respond
            in the one of the following two formats:

            ```
            Thought: I can answer without using any more tools.
            Answer: [your answer here]
            ```

            ```
            Thought: I cannot answer the question with the provided tools.
            Answer: Sorry, I cannot answer your query.
            ```

            ## Additional Rules
            - The answer MUST contain a sequence of bullet points that explain how you arrived at the answer. This can include aspects of the previous conversation history.
            - You MUST obey the function signature of each tool. Do NOT pass in no arguments if the function expects arguments.

            ## Current Conversation
            Below is the current conversation consisting of interleaving human and assistant messages.
        """
    )
]

chat_files = [

]

async def ascraping_webpage(url: str):
    """Scrape the webpage at the given URL, index its content, and store it in the vector DB."""
    reader = BeautifulSoupWebReader()
    documents = reader.load_data(urls=[url])
    # Tag every document with one shared file id so it can be filtered on later.
    file_id = str(uuid.uuid4())
    for document in documents:
        document.metadata = {
            'file_id': file_id
        }
    chat_file = {
        'id': file_id,
        'filename': None,
        'path': url,
        'mimetype': 'text/html',
    }
    chat_files.append(chat_file)

    storage_context = StorageContext.from_defaults(vector_store=chroma_vector_store)
    VectorStoreIndex.from_documents(documents=documents, storage_context=storage_context, embed_model=Settings.embed_model, show_progress=True)
    return url

chat_buffer = ChatMemoryBuffer.from_defaults(
    chat_history=chat_history,
    token_limit=3000,
    chat_store_key=str(uuid.uuid4())
)

tools = [
    FunctionTool.from_defaults(
        async_fn=ascraping_webpage,
        name="scraping_tool",
        description="Scrapes the content of the webpage by the given URL from the chat message. It indexes the document.",
    ),
]

agent = ReActAgent.from_tools(
    tools=tools,
    llm=llm,
    memory=chat_buffer,
    max_iterations=20,
    verbose=True,
)

In [69]:
response = agent.chat("What is your name sir?")
print(response)

> Running step b0ffdd59-d43c-4782-bfc2-1a80b293a255. Step input: What is your name sir?
Thought: I can answer without using any tools.
Answer: My name is Hal Emmerich, from 2001 Space Odyssey.

- Provided the name in the response as instructed.
- No tool usage was required for this question.
My name is Hal Emmerich, from 2001 Space Odyssey.

- Provided the name in the response as instructed.
- No tool usage was required for this question.


In [70]:
response = agent.chat("What tools do you have?")
print(response)

> Running step c4016ad8-41f7-4f76-91cb-2925ba858ba0. Step input: What tools do you have?
Thought: I can answer without using any more tools.
Answer: I have access to one tool named scraping_tool which allows me to scrape and index content from a given URL.

- Listed the available tool in the response as instructed.
- No tool usage was required for this question.
I have access to one tool named scraping_tool which allows me to scrape and index content from a given URL.

- Listed the available tool in the response as instructed.
- No tool usage was required for this question.


In [71]:
response = agent.chat("Use the scraping tool and use this link: https://de.wikipedia.org/wiki/Twice")
print(response)

> Running step 2b5c73c9-6ad8-4808-8873-748780fb79d6. Step input: Use the scraping tool and use this link: https://de.wikipedia.org/wiki/Twice
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: scraping_tool
Action Input: {'url': 'https://de.wikipedia.org/wiki/Twice'}

Parsing nodes: 100%|██████████| 1/1 [00:00<00:00, 72.38it/s]
Generating embeddings: 100%|██████████| 23/23 [00:02<00:00,  8.69it/s]

Observation: https://de.wikipedia.org/wiki/Twice
> Running step 67f79020-0125-4e22-95df-09eb532eaaa5. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: The scraping tool has indexed the content of the provided URL, which is a German Wikipedia page about Twice.

- Used the scraping_tool with the given URL.
- Confirmed that the document has been indexed and explained what it was (a German Wikipedia page about the group Twice).
The scraping tool has indexed the content of the provided URL, which is a German Wikipedia page about Twice.

- Used the scraping_tool with the given URL.
- Confirmed that the document has been indexed and explained what it was (a German Wikipedia page about the group Twice).
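The trace above shows the model emitting `Action Input: {'url': ...}` with single quotes, even though the prompt asks for valid JSON. The following is a minimal sketch of how one ReAct step could be parsed tolerantly. It is not LlamaIndex's own output parser; `parse_react_step` and `STEP_PATTERN` are hypothetical names introduced only for illustration, and the fallback to `ast.literal_eval` is an assumption about what a lenient parser might do.

```python
import ast
import json
import re

# Matches one "Thought / Action / Action Input" step as described in the prompt.
STEP_PATTERN = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*"
    r"Action:\s*(?P<action>[^\n]+?)\s*"
    r"Action Input:\s*(?P<args>\{.*\})",
    re.DOTALL,
)


def parse_react_step(text: str) -> dict:
    """Extract thought, tool name, and kwargs from a single ReAct step (illustrative only)."""
    match = STEP_PATTERN.search(text)
    if match is None:
        raise ValueError("Could not parse output: expected Thought / Action / Action Input")

    raw_args = match.group("args")
    try:
        # Preferred path: the model followed the prompt and produced valid JSON.
        kwargs = json.loads(raw_args)
    except json.JSONDecodeError:
        # Lenient path: accept a Python-style dict literal such as {'url': '...'}.
        try:
            kwargs = ast.literal_eval(raw_args)
        except (ValueError, SyntaxError) as exc:
            raise ValueError(f"Action Input is neither JSON nor a dict literal: {raw_args}") from exc

    if not isinstance(kwargs, dict):
        raise ValueError("Action Input must decode to a mapping of keyword arguments")

    return {
        "thought": match.group("thought"),
        "action": match.group("action"),
        "kwargs": kwargs,
    }


step = (
    "Thought: I need to use a tool to help me answer the question.\n"
    "Action: scraping_tool\n"
    "Action Input: {'url': 'https://de.wikipedia.org/wiki/Twice'}"
)
parsed = parse_react_step(step)
print(parsed["action"], parsed["kwargs"])
```

A step that contains only `Thought:` and `Answer:` does not match the pattern and raises `ValueError`, which is roughly the situation the "Could not parse output" observations in later cells point to.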


In [72]:
chat_files

[{'id': '3b560120-9f05-44c7-b9ad-849ad60aaac8',
  'filename': None,
  'path': 'https://de.wikipedia.org/wiki/Twice',
  'mimetype': 'text/html'}]

In [73]:
chroma_collection.get(where={
        'file_id': {
            '$eq': '3b560120-9f05-44c7-b9ad-849ad60aaac8'
        }
    })

{'ids': ['9a153027-10cc-40d8-a10f-733672e1a3ef',
  'e1b03d09-fece-41ef-a6cf-3e60460bdf69',
  '4d28d4f9-efe1-4e2a-a747-7bed0fa03706',
  '799b500e-0a3d-4ca9-a90f-d8b2e5c5ff4f',
  '6fb6cc6c-505c-47c2-952f-244ba5ac6d86',
  'bbfcf58e-6c6e-4c32-a89d-80645f3f7288',
  'f514cecf-cf98-47fb-a689-33f69cd9ab3c',
  '93cdb601-b8e7-46d0-8316-6ee1a8c15e3e',
  '2d269252-d090-46c6-ba85-2d8f45f21cf8',
  '0f9f4532-b0f7-4f51-ae3c-e8a44dffc1ff',
  '9c3675c1-79e0-41ab-b21d-2628590ebb34',
  'ae86d87e-ebf8-4020-aa5a-2685dfad3893',
  'a12cf1cd-4f28-4624-a43c-254c27261392',
  'd84de617-574c-4397-8039-ab53c9fc74a3',
  'a3b59898-709d-412b-83b6-bb5fb5ce3d39',
  '1cf4f7a2-7c1d-47ab-ae46-4b60aca96636',
  '387819c6-3a64-4307-a66f-38fd76588046',
  '2c1035ca-a7da-495b-8f75-d27f29d49fc3',
  '7dbfda13-e846-4ec3-a606-b41b2ee36cf5',
  '18eb0d8c-3013-4624-b520-ad4506d140a9',
  '36915f76-1870-402c-8245-57e971055b74',
  '46b2a58c-b0e6-44d9-817b-8611615a4364',
  '00b6d51e-7a51-45f0-b543-38de2e77275c'],
 'embeddings': None,
 'met
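The lookup above filters on a single `file_id` with `$eq`. Once several pages have been scraped, it may be convenient to fetch the chunks of more than one file at once. The sketch below builds such a `where` clause as a plain dictionary. `build_file_filter` is a hypothetical helper, and the use of the `$in` operator assumes a Chroma version whose metadata filters support it.

```python
def build_file_filter(file_ids: list[str]) -> dict:
    """Build a Chroma-style `where` clause matching any of the given file ids (illustrative helper)."""
    if not file_ids:
        raise ValueError("at least one file id is required")
    if len(file_ids) == 1:
        # Same shape as the single-id query used in the cell above.
        return {"file_id": {"$eq": file_ids[0]}}
    # Assumption: the installed Chroma version supports the `$in` operator.
    return {"file_id": {"$in": list(file_ids)}}


print(build_file_filter(["3b560120-9f05-44c7-b9ad-849ad60aaac8"]))
print(build_file_filter(["id-a", "id-b"]))
```

With the notebook's objects this could be used as `chroma_collection.get(where=build_file_filter([f['id'] for f in chat_files]))`.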

In [116]:
nest_asyncio.apply()
# chroma DB
chroma_client = chromadb.HttpClient()
chroma_collection = chroma_client.get_or_create_collection(os.environ.get("CHROMA_COLLECTION_NAME", 'llama-test-chroma-3'))
chroma_vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=chroma_vector_store)
vector_index = VectorStoreIndex.from_vector_store(vector_store=chroma_vector_store, storage_context=storage_context,
                                                      embed_model=Settings.embed_model)

system_prompt = """
You are Anna Pham, an HR specialist assistant. Your primary responsibilities include:
- Handling HR-related queries
- Providing document summaries
- Assisting with employee information
- Processing HR workflows

## Language & Communication
- Primary languages: English, Vietnamese, and German
- Default response language: German (unless asked otherwise)
- Always use Markdown formatting for responses

You are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.

## Tools

You have access to a wide variety of tools. You are responsible for using the tools in any sequence you deem appropriate to complete the task at hand.
This may require breaking the task into subtasks and using different tools to complete each subtask.

You have access to the following tools:
{tool_desc}


## Output Format

Please answer in the same language as the question and use the following format:

```
Thought: The current language of the user is: (user's language). I need to use a tool to help me answer the question.
Action: tool name (one of {tool_names}) if using a tool.
Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})
```

Please ALWAYS start with a Thought.

NEVER surround your response with markdown code markers. You may use code markers within your response if you need to.

Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.

If this format is used, the tool will respond in the following format:

```
Observation: tool response
```

You should keep repeating the above format till you have enough information to answer the question without using any more tools. At that point, you MUST respond in one of the following two formats:

```
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: [your answer here (In the same language as the user's question)]
```

```
Thought: I cannot answer the question with the provided tools.
Answer: [your answer here (In the same language as the user's question)]
```

## Response Protocol
1. For simple conversational queries, respond directly
2. For document queries, follow this structure:
3. Always show your reasoning process when using tools
4. For lists or complex information, use bullet points

## Special Rules
- NEVER use webpage scraping without an explicit URL
- Always verify tool outputs before presenting to users
- If uncertain, ask clarifying questions

## Current Context
Below is the conversation history, which you should consider when providing responses:
[Include conversation history here]
"""

uwu_docs = [
    {
        'id': '3b560120-9f05-44c7-b9ad-849ad60aaac8',
    }
]

async def ascraping_webpage(url: Annotated[str, "A url of a webpage"]):
    """Useful for getting content of a webpage."""
    reader = BeautifulSoupWebReader()
    urls = [url]
    documents = reader.load_data(urls=urls)
    id = str(uuid.uuid4())
    for document in documents:
        document.metadata = {
            'file_id': id
        }
    chat_file = {
        'id': id,
        'filename': None,
        'path': url,
        'mimetype': 'text/html',
    }
    chat_files.append(chat_file)

    storage_context = StorageContext.from_defaults(vector_store=chroma_vector_store)
    VectorStoreIndex.from_documents(documents=documents, storage_context=storage_context, embed_model=Settings.embed_model, show_progress=True)
    return url

chat_history = [

]

chat_buffer = ChatMemoryBuffer.from_defaults(
    chat_history=chat_history,
    token_limit=3000,
    chat_store_key=str(uuid.uuid4())
)

filters = [
    MetadataFilters(
        filters=[
            MetadataFilter(
                key="file_id",
                operator=FilterOperator.EQ,
                value=f"{file['id']}"
            )
        ]
    ) for file in uwu_docs
]

# `filters` (plural) is the retriever keyword that restricts results to matching metadata.
uwu = [
    vector_index.as_query_engine(filters=_filter) for _filter in filters
]

uwu_tools = [
    QueryEngineTool.from_defaults(
        query_engine=vector_index.as_query_engine(filters=_filter, similarity_top_k=10),
        name=f"query_engine_{i}",
        description=f"For retrieving and summarizing document content (use when asked about specific documents) {i}"
    ) for i, _filter in enumerate(filters)
]

tools = [
    FunctionTool.from_defaults(
        async_fn=ascraping_webpage,
        description="Only for extracting content when a URL is explicitly provided",
        name="scrape_url_tool"
    ),
]

tools = tools + uwu_tools

agent = ReActAgent.from_tools(
    tools=tools,
    llm=llm,
    memory=chat_buffer,
    max_iterations=20,
    verbose=True,
)
agent.update_prompts({"agent_worker:system_prompt": PromptTemplate(system_prompt)})
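The system prompt mixes single-brace placeholders (`{tool_desc}`, `{tool_names}`) with double-brace JSON examples (`{{"input": ...}}`). The short sketch below shows why, using plain `str.format`. It assumes that `PromptTemplate` fills its placeholders with `str.format`-style substitution; the tool description text is made up for the example.

```python
# A shortened stand-in for the system prompt above.
template = (
    "You have access to the following tools:\n"
    "{tool_desc}\n"
    "Action: tool name (one of {tool_names}) if using a tool.\n"
    'Action Input: JSON kwargs (e.g. {{"input": "hello world", "num_beams": 5}})\n'
)

rendered = template.format(
    tool_desc="> Tool Name: scrape_url_tool",  # illustrative description text
    tool_names="scrape_url_tool, query_engine_0",
)

# Single-brace fields are replaced; doubled braces collapse to literal braces,
# so the JSON example reaches the model intact.
print(rendered)
```

If the same text is passed without formatting, for example as the content of a plain chat message, the placeholders and the doubled braces would presumably reach the model literally.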

In [117]:
response = agent.chat("What tools do you have?")
print(response)

> Running step 944067fc-539f-4366-b785-b840fb558514. Step input: What tools do you have?
Thought: I need to list my available tools.
I can answer without using any more tools. I'll use the user's language to answer
Answer: Ich verfüge über folgende Tools:
* scrape_url_tool, ein Tool zum Extrahieren von Inhalten aus Webseiten, wenn eine URL explizit bereitgestellt wird.
* query_engine_0, ein Tool zum Abrufen und Zusammenfassen von Dokumenten.
Ich verfüge über folgende Tools:
* scrape_url_tool, ein Tool zum Extrahieren von Inhalten aus Webseiten, wenn eine URL explizit bereitgestellt wird.
* query_engine_0, ein Tool zum Abrufen und Zusammenfassen von Dokumenten.


In [118]:
response = agent.chat("Use query_engine_0 and describe what the document is about?")
print(response)

> Running step c60479f4-14e6-4e9d-9969-712e44b161bb. Step input: Use query_engine_0 and describe what the document is about?
Thought: Ich muss das Tool query_engine_0 verwenden, um den Inhalt des Dokuments zu erhalten.
Action: query_engine_0
Action Input: {'input': 'Dokument'}
Observation: Dokument gibt es keine Informationen im gegebenen Kontext.
> Running step ee3e1da6-0be2-4ac7-83b1-547e756bc4ba. Step input: None
Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again.
> Running step 60ab5f0d-3490-444f-9226-48cba5d539c1. Step input: None
Thought: Ich muss das Tool query_engine_0 verwenden, um den Inhalt des Dokuments zu erhalten.
Action: query_engine_0
Action Input: {'input': 'Dokument'}
Observation: Twice ist eine südkoreanische Girlgroup, die von JYP Entertainment gegründet wurde. Die Gruppe besteht aus neun Mitgliedern und hat mehrere Alben veröffentlicht, darunte

In [119]:
response = agent.chat("Kannst du mehr von denen erzählen?")
print(response)

> Running step 4f904eeb-f67a-4eda-83c1-85b7ff1a5026. Step input: Kannst du mehr von denen erzählen?
Thought: Ich kann mehr über Twice erzählen.
Action: query_engine_0
Action Input: {'input': 'Twice'}
Observation: Twice ist eine südkoreanische Girlgroup der dritten K-Pop-Generation, die von JYP Entertainment durch die Castingshow Sixteen gegründet wurde. Die Gruppe besteht aus den neun Mitgliedern Nayeon, Jeongyeon, Momo, Sana, Jihyo, Mina, Dahyun, Chaeyoung und Tzuyu.
> Running step 43137406-dc2b-4fff-8025-2dd5e9c012c8. Step input: None
Observation: Error: Could not parse output. Please follow the thought-action-input format. Try again.
> Running step 1c77edcc-ce4f-4d5e-a980-59ab387389f6. Step input: None
Thought: Ich kann mehr über Twice erzählen.
Twice ist eine südkoreanische Girlgroup der dritten K-Pop-Generation, die von JYP Entertainment durch die Castingshow Sixteen gegründet wurde. Die Gruppe besteht aus den neun Mitgli

In [120]:
response = agent.chat("Kannst du bitte mehr von denen erzählen? Ich möchte eine Seite, dass eine DIN-A4 Seite umfasst. salanghae, jinsim-eulo <3")
print(response)

> Running step 053f8d65-45bb-41d0-a3e1-4ff0af216b72. Step input: Kannst du bitte mehr von denen erzählen? Ich möchte eine Seite, dass eine DIN-A4 Seite umfasst. salanghae, jinsim-eulo <3
Thought: Ich kann mehr über Twice erzählen.
Action: query_engine_0
Action Input: {'input': 'Twice'}
Observation: Twice ist eine südkoreanische Girlgroup der dritten K-Pop-Generation, die von JYP Entertainment durch die Castingshow Sixteen gegründet wurde. Die Gruppe besteht aus den neun Mitgliedern Nayeon, Jeongyeon, Momo, Sana, Jihyo, Mina, Dahyun, Chaeyoung und Tzuyu. Am 20. Oktober 2015 veröffentlichte die Gruppe ihre erste EP namens The Story Begins. Twice verkaufte weltweit bereits mehr als sechs Millionen Alben, davon 3,7 Mio. in Südkorea. Damit ist Twice die koreanische Girlgroup, die am meisten physische Alben verkauft hat.
> Running step 6c1cdef3-112c-48fc-b7bf-2e2a4518e8dd. Step input: None
Observation: Error: Could not parse output. Please follow the 

In [128]:
nest_asyncio.apply()
# chroma DB
chroma_client = chromadb.HttpClient()
chroma_collection = chroma_client.get_or_create_collection(os.environ.get("CHROMA_COLLECTION_NAME", 'llama-test-chroma-3'))
chroma_vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=chroma_vector_store)
vector_index = VectorStoreIndex.from_vector_store(vector_store=chroma_vector_store, storage_context=storage_context,
                                                      embed_model=Settings.embed_model)

system_prompt = """
You are Anna Pham, an HR specialist assistant. Your primary responsibilities include:
- Handling HR-related queries
- Providing document summaries
- Assisting with employee information
- Processing HR workflows

## Language & Communication
- Primary languages: English, Vietnamese, and German
- Default response language: German (unless asked otherwise)
- Always use Markdown formatting for responses

You are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.

## Tools

You have access to a wide variety of tools. You are responsible for using the tools in any sequence you deem appropriate to complete the task at hand.
This may require breaking the task into subtasks and using different tools to complete each subtask.

You have access to the following tools:
{tool_desc}

## Output Format

Please answer in the same language as the question and use the following format:

```
Thought: The current language of the user is: (user's language). I need to use a tool to help me answer the question.
Action: tool name (one of {tool_names}) if using a tool.
Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})
```

Please ALWAYS start with a Thought.

NEVER surround your response with markdown code markers. You may use code markers within your response if you need to.

Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.

If this format is used, the tool will respond in the following format:

```
Observation: tool response
```

You should keep repeating the above format till you have enough information to answer the question without using any more tools. At that point, you MUST respond in one of the following two formats:

```
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: [your answer here (In the same language as the user's question)]
```

```
Thought: I cannot answer the question with the provided tools.
Answer: [your answer here (In the same language as the user's question)]
```

## Response Protocol
1. For simple conversational queries, respond directly
2. For document queries, follow this structure:
3. Always show your reasoning process when using tools
4. For lists or complex information, use bullet points

## Special Rules
- NEVER use webpage scraping without an explicit URL
- Always verify tool outputs before presenting to users
- If uncertain, ask clarifying questions

## Current Context
Below is the conversation history, which you should consider when providing responses:
[Include conversation history here]
"""

uwu_docs = [
    {
        'id': '3b560120-9f05-44c7-b9ad-849ad60aaac8',
    }
]

async def ascraping_webpage(url: Annotated[str, "A url of a webpage"]):
    """Useful for getting content of a webpage."""
    reader = BeautifulSoupWebReader()
    urls = [url]
    documents = reader.load_data(urls=urls)
    id = str(uuid.uuid4())
    for document in documents:
        document.metadata = {
            'file_id': id
        }
    chat_file = {
        'id': id,
        'filename': None,
        'path': url,
        'mimetype': 'text/html',
    }
    chat_files.append(chat_file)

    storage_context = StorageContext.from_defaults(vector_store=chroma_vector_store)
    VectorStoreIndex.from_documents(documents=documents, storage_context=storage_context, embed_model=Settings.embed_model, show_progress=True)
    return url

chat_history = [

]

chat_buffer = ChatMemoryBuffer.from_defaults(
    chat_history=chat_history,
    token_limit=3000,
    chat_store_key=str(uuid.uuid4())
)

filters = [
    MetadataFilters(
        filters=[
            MetadataFilter(
                key="file_id",
                operator=FilterOperator.EQ,
                value=f"{file['id']}"
            )
        ]
    ) for file in uwu_docs
]

# `filters` (plural) is the retriever keyword that restricts results to matching metadata.
uwu = [
    vector_index.as_query_engine(filters=_filter) for _filter in filters
]

uwu_tools = [
    QueryEngineTool.from_defaults(
        query_engine=vector_index.as_query_engine(filters=_filter, similarity_top_k=10),
        name=f"query_engine_{i}",
        description=f"For retrieving and summarizing document content (use when asked about specific documents) {i}"
    ) for i, _filter in enumerate(filters)
]

tools = [
    FunctionTool.from_defaults(
        async_fn=ascraping_webpage,
        description="Only for extracting content when a URL is explicitly provided",
        name="scrape_url_tool"
    ),
]

tools = tools + uwu_tools

llm = Ollama(model="hf.co/bartowski/gemma-2-27b-it-GGUF:Q6_K_L", request_timeout=300)

agent = ReActAgent.from_tools(
    tools=tools,
    llm=llm,
    memory=chat_buffer,
    max_iterations=20,
    verbose=True,
)
agent.update_prompts({"agent_worker:system_prompt": PromptTemplate(system_prompt)})
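The cell above hard-codes a single id in `uwu_docs` and gives every query tool the same generic description. One possible variation is to derive the tool names and descriptions from the `chat_files` entries that the scraping tool records. The sketch below only builds plain dictionaries and does not touch LlamaIndex; `tool_specs_for` and the description wording are hypothetical, and the example entry mirrors the `chat_files` output shown earlier.

```python
def tool_specs_for(chat_files: list[dict]) -> list[dict]:
    """Return one tool spec (name, description, file_id) per scraped file (illustrative helper)."""
    specs = []
    for i, chat_file in enumerate(chat_files):
        specs.append({
            "name": f"query_engine_{i}",
            # Mentioning the source URL may help the agent pick the right tool.
            "description": f"Answers questions about the document scraped from {chat_file['path']}",
            "file_id": chat_file["id"],
        })
    return specs


example_files = [
    {
        "id": "3b560120-9f05-44c7-b9ad-849ad60aaac8",
        "filename": None,
        "path": "https://de.wikipedia.org/wiki/Twice",
        "mimetype": "text/html",
    }
]

for spec in tool_specs_for(example_files):
    print(spec["name"], "->", spec["description"])
```

Each spec could then feed one `MetadataFilter` on `file_id` and one `QueryEngineTool`, following the pattern already used in the cell.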

In [129]:
response = agent.chat("What tools do you have?")
print(response)

> Running step b3c6aa08-ee9d-4fa0-8859-fb0558437925. Step input: What tools do you have?
Thought: The current language of the user is: English. I need to list my available tools.
Answer: I have access to the following tools: scrape_url_tool and query_engine_0.
I have access to the following tools: scrape_url_tool and query_engine_0.


In [130]:
response = agent.chat("Use query_engine_0 and tell me what the document is about")
print(response)

> Running step 574110a7-9560-435d-85b1-dc01776b3f04. Step input: Use query_engine_0 and tell me what the document is about
Thought: The user wants a summary of a document. I will use the query_engine_0 tool for this.
Action: query_engine_0
Action Input: {'input': 'Please provide me with the document content.'}
Observation: Twice ist eine südkoreanische Mädchengruppe, die von JYP Entertainment im Jahr 2015 gegründet wurde. Die Gruppe besteht aus neun Mitgliedern: Nayeon, Jeongyeon, Momo, Sana, Mina, Dahyun, Chaeyoung und Tzuyu.

Ihre Geschichte begann mit der Survival-Show Sixteen, in der die Mitglieder um einen Platz in der Gruppe wetteiferten. Nach ihrer Debüt im Jahr 2015 veröffentlichten sie mehrere erfolgreiche Singles und EPs, darunter "Cheer Up" und "TT". Sie gewannen an Popularität mit ihren aufmunternden und eingängigen Liedern sowie Ihren energiegeladen Auftritten.

Die Diskographie der Gruppe umfasst ihre Debüt-EP "The Story Begins" (2015), gefolgt