## Utils

In [None]:
import textwrap


def pprint(text):
    """Print text wrapped to at most 100 characters per line."""
    print(textwrap.fill(text, width=100))

## Creating a FunctionTool

Let's create a basic `FunctionTool` and call it.

In [None]:
from llama_index.core.tools import FunctionTool


def get_weather(location: str) -> str:
    """Useful for getting the weather for a given location."""
    print(f"Getting weather for {location}")
    return f"The weather in {location} is sunny"


tool = FunctionTool.from_defaults(
    get_weather,
    name="my_weather_tool",
    description="Useful for getting the weather for a given location.",
)
tool.call("New York")

Getting weather for New York


ToolOutput(blocks=[TextBlock(block_type='text', text='The weather in New York is sunny')], tool_name='my_weather_tool', raw_input={'args': ('New York',), 'kwargs': {}}, raw_output='The weather in New York is sunny', is_error=False)
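To make the idea behind a function tool more concrete, here is a minimal standard-library sketch of a wrapper that derives a tool name and description from a plain function and records the raw input next to the raw output. `SimpleTool` and `from_function` are hypothetical names made up for illustration; this is not how `FunctionTool` is implemented, only a rough analogy.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a function tool (not the LlamaIndex class)."""

    fn: Callable[..., Any]
    name: str
    description: str

    @classmethod
    def from_function(cls, fn, name=None, description=None):
        # Fall back to the function's own name, signature, and docstring
        # when no explicit name or description is given.
        default_description = (
            f"{fn.__name__}{inspect.signature(fn)}\n{inspect.getdoc(fn) or ''}"
        )
        return cls(
            fn=fn,
            name=name or fn.__name__,
            description=description or default_description,
        )

    def call(self, *args, **kwargs):
        # Keep the raw input together with the raw output of the function.
        return {
            "tool_name": self.name,
            "raw_input": {"args": args, "kwargs": kwargs},
            "raw_output": self.fn(*args, **kwargs),
        }


def get_weather(location: str) -> str:
    """Useful for getting the weather for a given location."""
    return f"The weather in {location} is sunny"


simple_tool = SimpleTool.from_function(get_weather)
result = simple_tool.call("New York")
print(simple_tool.name)
print(simple_tool.description)
print(result["raw_output"])
```

The description is what an LLM sees when deciding whether to call a tool, so a clear docstring and typed signature matter as much as the function body.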

## LLMs

In [36]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

emb_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")



In [37]:
from llama_index.llms.ollama import Ollama  

chat_model = Ollama(model="qwen2:7b")

## Creating a QueryEngineTool

Let's now re-use the `QueryEngine` we defined in the [previous unit on tools](/tools.ipynb) and convert it into a `QueryEngineTool`.

In [77]:
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SemanticSplitterNodeParser

reader = SimpleDirectoryReader(input_dir="~/pdev/yaub/frontend/simple-yaub/public/", recursive=True)
documents_all = reader.load_data()
# Keep only the documents whose file_type metadata marks them as text.
documents = [doc for doc in documents_all if "text" in doc.metadata["file_type"]]
print("# Documents:", len(documents))

# The splitter breaks each document into sentences, embeds every sentence with embed_model,
# and groups adjacent sentences into nodes by the distance between their embeddings.
# The distance cutoff is derived from breakpoint_percentile_threshold.
splitter = SemanticSplitterNodeParser(embed_model=emb_model)
nodes = splitter.get_nodes_from_documents(documents=documents)

# Documents: 4
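The comments in the cell above describe the splitting idea in words. A simplified sketch of that idea, using hand-written 2-D vectors instead of a real embedding model, might look like the following. The helper names (`cosine_distance`, `percentile`, `semantic_chunks`), the nearest-rank percentile, and the `>=` comparison are assumptions chosen to keep the example small; the actual `SemanticSplitterNodeParser` may differ in its details.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def percentile(values, pct):
    """Nearest-rank percentile (a simplification chosen for this sketch)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def semantic_chunks(sentences, embeddings, breakpoint_percentile=95):
    """Group adjacent sentences, starting a new chunk at large embedding jumps."""
    if len(sentences) < 2:
        return [list(sentences)] if sentences else []
    # Distance between each sentence and the one that follows it.
    distances = [
        cosine_distance(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]
    threshold = percentile(distances, breakpoint_percentile)
    chunks, current = [], [sentences[0]]
    for sentence, distance in zip(sentences[1:], distances):
        if distance >= threshold:
            # Large jump in meaning: close the current chunk.
            chunks.append(current)
            current = [sentence]
        else:
            current.append(sentence)
    chunks.append(current)
    return chunks


toy_sentences = [
    "Kernels measure similarity.",
    "A kernel is an inner product.",
    "Bread needs yeast.",
    "Dough must rest.",
]
# Hand-made toy embeddings: the first two point one way, the last two another.
toy_embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
chunks = semantic_chunks(toy_sentences, toy_embeddings)
print(chunks)
```

With real embeddings the distances are noisier, which is why the cutoff is expressed as a percentile of the observed distances rather than as a fixed number.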


In [7]:
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

from llama_index.core import StorageContext, VectorStoreIndex

qdrant_client = QdrantClient(":memory:")      
vector_store = QdrantVectorStore(collection_name="blog", client=qdrant_client)

storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(nodes, storage_context=storage_context, embed_model=emb_model)
query_engine = index.as_query_engine(llm=chat_model)

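Conceptually, the retrieval step behind a vector index ranks stored node embeddings by their similarity to the query embedding and returns the best matches. The sketch below illustrates that with plain Python and toy 2-D vectors; `top_k`, the node ids, and the vectors are all hypothetical, and a real vector store such as Qdrant uses far more efficient search structures.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k(query_embedding, node_embeddings, k=2):
    """Return the k (node_id, score) pairs most similar to the query."""
    scored = [
        (node_id, cosine_similarity(query_embedding, embedding))
        for node_id, embedding in node_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "index": node ids mapped to made-up embeddings.
toy_nodes = {
    "kernels": (1.0, 0.0),
    "gaussian-processes": (0.8, 0.6),
    "cooking": (0.0, 1.0),
}
hits = top_k((0.9, 0.1), toy_nodes, k=2)
for node_id, score in hits:
    print(node_id, round(score, 3))
```

The query engine then passes the text of the retrieved nodes to the LLM as context for answering the question.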


In [10]:
from llama_index.core.tools import QueryEngineTool

tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="Blogposts Accessor",
    description="Answers questions about the blog posts by retrieving relevant passages.",
)

tool.call(
    "What references in the end does the author mention for Kernel methods post?"
)

ToolOutput(blocks=[TextBlock(block_type='text', text='The author mentions Bishop\'s Section 6.4.5 from his book "Pattern Recognition and Machine Learning" (2006) as a reference for understanding different approximation methods used in Gaussian processes. Specifically, he refers to this section to describe the Laplace approximation method which is used to approximate $p(t^* \\mid \\vec{t})$.')], tool_name='Blogposts Accessor', raw_input={'input': 'What references in the end does the author mention for Kernel methods post?'}, raw_output=Response(response='The author mentions Bishop\'s Section 6.4.5 from his book "Pattern Recognition and Machine Learning" (2006) as a reference for understanding different approximation methods used in Gaussian processes. Specifically, he refers to this section to describe the Laplace approximation method which is used to approximate $p(t^* \\mid \\vec{t})$.', source_nodes=[NodeWithScore(node=TextNode(id_='8f3762d5-2064-4b97-a08d-00d1d6debb6a', embedding=No

## Creating Toolspecs

Let's instantiate the `GmailToolSpec` from LlamaHub and convert it into a list of tools.

In [11]:
from llama_index.tools.google import GmailToolSpec

tool_spec = GmailToolSpec()
tool_spec_list = tool_spec.to_tool_list()
for tool in tool_spec_list:
    print(tool.metadata.name, tool.metadata.description)


load_data load_data() -> List[llama_index.core.schema.Document]
Load emails from the user's account.
search_messages search_messages(query: str, max_results: Optional[int] = None)

        Searches email messages given a query string and the maximum number
        of results requested by the user
           Returns: List of relevant message objects up to the maximum number of results.

        Args:
            query (str): The user's query
            max_results (Optional[int]): The maximum number of search results
            to return.

        
create_draft create_draft(to: Optional[List[str]] = None, subject: Optional[str] = None, message: Optional[str] = None) -> str

        Create and insert a draft email.
           Print the returned draft's message and id.
           Returns: Draft object, including draft id and message meta data.

        Args:
            to (Optional[str]): The email addresses to send the message to
            subject (Optional[str]): The subject for th
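A tool spec can be thought of as a class that bundles several related functions and exposes the selected ones as individual tools. The standard-library sketch below mimics that pattern; `MiniToolSpec` and `NotesToolSpec` are hypothetical classes written for this example, and tools are represented as plain dicts rather than LlamaIndex tool objects.

```python
import inspect


class MiniToolSpec:
    """Hypothetical base class: exposes the methods named in spec_functions."""

    spec_functions = []

    def to_tool_list(self):
        tools = []
        for fn_name in self.spec_functions:
            method = getattr(self, fn_name)
            # Description = signature plus docstring, like the printout above.
            description = (
                f"{fn_name}{inspect.signature(method)}\n{inspect.getdoc(method) or ''}"
            )
            tools.append({"name": fn_name, "description": description, "fn": method})
        return tools


class NotesToolSpec(MiniToolSpec):
    """Toy spec with two public tools and one helper that is not exposed."""

    spec_functions = ["add_note", "list_notes"]

    def __init__(self):
        self._notes = []

    def add_note(self, text: str) -> int:
        """Store a note and return how many notes exist."""
        self._notes.append(text)
        return len(self._notes)

    def list_notes(self) -> list:
        """Return all stored notes."""
        return list(self._notes)

    def _clear(self):
        # Not listed in spec_functions, so it never becomes a tool.
        self._notes.clear()


notes_tools = NotesToolSpec().to_tool_list()
for tool in notes_tools:
    print(tool["name"], "->", tool["description"].splitlines()[0])
```

Because all tools are bound methods of the same instance, they share state (here, the list of notes), which is the main reason to group them in one spec.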

## Agent with Web Search

In [33]:
from langchain_community.tools import DuckDuckGoSearchRun, DuckDuckGoSearchResults

ddg_search = DuckDuckGoSearchRun()
ddg_search_res = DuckDuckGoSearchResults(output_format="list")

res_s = ddg_search.invoke("History of Prussia")
res_r = ddg_search_res.invoke("History of Prussia")

print(res_s,"\n")
for res in res_r:
    print(res)



Your History lists the pages you've visited on Chrome in the last 90 days. It doesn't store: If you’re signed in to Chrome and sync your history, then your History also shows pages you’ve visited … Von Ihnen besuchte Websites werden in Ihrem Browserverlauf gespeichert. Sie können in Chrome Ihren Browserverlauf einsehen oder löschen und ähnliche Suchanfragen finden. Sie … Delete your activity automatically You can automatically delete some of the activity in your Google Account. On your computer, go to your Google Account. At the left, click Data & privacy. … Under "History settings," click My Activity. To access your activity: Browse your activity, organized by day and time. To find specific activity, at the top, use the search bar and filters. Manage … Delete browsing data in Chrome You can delete your Chrome browsing history and other browsing data, like saved form entries, or just delete data from a specific date. 

{'snippet': 'May 22, 2025 · Beginning as a minor electorate in 1648

In [None]:
from duckduckgo_search import DDGS

ddgs = DDGS()

results = ddgs.text("History of Prussia")
for res in results:
    print(res)

{'title': 'Prussia - Wikipedia', 'href': 'https://en.wikipedia.org/wiki/Prussia', 'body': '3 days ago · Prussia formed the German Empire when it united the German states in 1871. It was de facto dissolved by an emergency decree transferring powers of the Prussian government to …'}
{'title': 'Prussia | History, Maps, Flag, & Definition | Britannica', 'href': 'https://www.britannica.com/place/Prussia', 'body': 'Jul 2, 2025 · Prussia, in European history, any of three historical areas of eastern and central Europe. It is most often associated with the kingdom ruled by the German Hohenzollern dynasty, …'}
{'title': 'The Rise of Prussia: How a Small State Became a Military …', 'href': 'https://ancientwarhistory.com/the-rise-of-prussia-how-a-small-state-became-a-military-powerhouse-1648-1815/', 'body': 'May 22, 2025 · Beginning as a minor electorate in 1648 with scattered territories stretching from the Rhine to the Baltic, Prussia would transform itself into a formidable military state by 1

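Each result printed above is a dict with `title`, `href`, and `body` keys. Before handing such results to an LLM, it can help to flatten them into compact text. The sketch below shows one way to do that; `format_search_results` and `max_body_chars` are hypothetical names, and the sample data reuses titles and URLs from the output above with shortened bodies.

```python
def format_search_results(results, max_body_chars=200):
    """Turn a list of {'title', 'href', 'body'} dicts into numbered text."""
    lines = []
    for rank, item in enumerate(results, start=1):
        body = item.get("body", "")
        if len(body) > max_body_chars:
            # Trim long snippets so the prompt stays short.
            body = body[:max_body_chars].rstrip() + "..."
        lines.append(
            f"{rank}. {item.get('title', '(no title)')}\n"
            f"   {item.get('href', '')}\n"
            f"   {body}"
        )
    return "\n".join(lines)


# Abbreviated sample in the same shape as the results printed above.
sample_results = [
    {
        "title": "Prussia - Wikipedia",
        "href": "https://en.wikipedia.org/wiki/Prussia",
        "body": "Prussia formed the German Empire when it united the German states in 1871.",
    },
    {
        "title": "Prussia | History, Maps, Flag, & Definition | Britannica",
        "href": "https://www.britannica.com/place/Prussia",
        "body": "Prussia, in European history, any of three historical areas of eastern and central Europe.",
    },
]
formatted = format_search_results(sample_results)
print(formatted)
```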


In [None]:
# https://www.geeksforgeeks.org/python/performing-google-search-using-python-code/ 
from googlesearch import search

res_g = search("History of Prussia, no wikipedia", stop=10)
for res in res_g:
    print(res)

https://en.wikipedia.org/wiki/Prussia
https://en.wikipedia.org/wiki/Abolition_of_Prussia
https://en.wikipedia.org/wiki/Flag_of_Prussia
https://en.wikipedia.org/wiki/Prussia_(disambiguation)
https://en.wikipedia.org/wiki/Duchy_of_Prussia
https://en.wikipedia.org/wiki/Kingdom_of_Prussia
https://en.wikipedia.org/wiki/Prussia_(region)
https://en.wikipedia.org/wiki/Royal_Prussia
https://en.wikipedia.org/wiki/Category:History_of_Prussia
https://simple.wikipedia.org/wiki/Prussia


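The query above asks for "no wikipedia", yet every returned link is a Wikipedia page, which suggests that natural-language exclusions in the query string are not a reliable filter. One alternative is to filter the URLs after the search. The sketch below does this with `urllib.parse`; `exclude_domains` and its `blocked` parameter are hypothetical names introduced for this example.

```python
from urllib.parse import urlparse


def exclude_domains(urls, blocked=("wikipedia.org",)):
    """Drop URLs whose host is a blocked domain or one of its subdomains."""
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if any(host == domain or host.endswith("." + domain) for domain in blocked):
            continue
        kept.append(url)
    return kept


sample_urls = [
    "https://en.wikipedia.org/wiki/Prussia",
    "https://simple.wikipedia.org/wiki/Prussia",
    "https://www.britannica.com/place/Prussia",
]
filtered = exclude_domains(sample_urls)
print(filtered)
```

Matching on the parsed hostname rather than on a substring of the URL avoids accidentally dropping pages that merely mention the blocked domain in their path.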


In [None]:
res_g = search("Instagram", stop=10)
for res in res_g:
    print(res)



https://www.instagram.com/?hl=en
https://www.instagram.com/
https://en.m.wikipedia.org/wiki/File:Instagram_logo_2016.svg
https://en.wikipedia.org/wiki/Instagram
https://apps.apple.com/us/app/instagram/id389801252
https://www.facebook.com/instagram/
https://nl.wikipedia.org/wiki/Instagram
https://www.sesarju.eu/sites/default/files/webform/sids_2025_poster/_sid_/23akk.pdf
https://apps.apple.com/nl/app/instagram/id389801252
http://www.medianest.be/wat-instagram




In [None]:
import warnings

from duckduckgo_search import DDGS
from llama_index.core.agent.workflow import ReActAgent, FunctionAgent
from llama_index.core.tools import FunctionTool
from llama_index.core.workflow import Context
import llama_index.core

llama_index.core.set_global_handler("simple")

with warnings.catch_warnings():
    warnings.simplefilter("ignore")

    def get_relevant_webpages(query: str) -> list:
        """
        Searches the web and returns the pages relevant to a query.

        Args:
            query (str): the text to search the web for.

        Returns:
            list: search results, each a dict with the page title ('title'),
                its URL ('href'), and a short text snippet ('body').
        """
        search_ggg = DDGS()
        results = search_ggg.text(query)
        return results

    web_search_tool = FunctionTool.from_defaults(
        fn=get_relevant_webpages,
        name="get_relevant_webpages",
        description="Useful for getting a list of relevant webpages (URL links) for a particular query. "
        "Along with each link, the list includes a short snippet of information related to the query.",
    )

    search_agent = ReActAgent(
        name="Web Searcher",
        description="Searches the web and returns links to the relevant pages it finds.",
        system_prompt="",
        tools=[web_search_tool],
        verbose=True,
        llm=chat_model,
    )


    ctx = Context(search_agent)
    answer = await search_agent.run("History of Prussia, search online with the web search tool", ctx=ctx)
    print(answer)
    pprint(answer.response.blocks[0].text)

Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_ag
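The log above repeats the same cycle: run the agent, parse its output, call a tool, feed the result back. A stripped-down sketch of such a ReAct-style loop is shown below, with a scripted function standing in for the LLM so that it runs offline. All names (`scripted_llm`, `fake_web_search`, `react_loop`) and the text format are hypothetical, and the mapping to the step names in the log is only approximate.

```python
import re


def scripted_llm(transcript):
    """Hypothetical stand-in for an LLM: asks for one search, then answers."""
    if "Observation:" not in transcript:
        return (
            "Thought: I should search the web.\n"
            "Action: web_search\n"
            "Action Input: History of Prussia"
        )
    return (
        "Thought: I have enough information.\n"
        "Answer: Prussia united the German states in 1871."
    )


def fake_web_search(query):
    """Stub tool that returns a canned string instead of real search results."""
    return f"[stub results for '{query}']"


def react_loop(question, llm, tools, max_steps=5):
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        output = llm(transcript)  # roughly: run_agent_step
        transcript += "\n" + output
        answer = re.search(r"Answer:\s*(.+)", output)  # roughly: parse_agent_output
        if answer:
            return answer.group(1).strip(), transcript
        action = re.search(r"Action:\s*(\S+)\s*\nAction Input:\s*(.+)", output)
        if action is None:
            raise ValueError("LLM output contained neither an action nor an answer")
        tool_name, tool_input = action.group(1), action.group(2).strip()
        observation = tools[tool_name](tool_input)  # roughly: call_tool
        transcript += f"\nObservation: {observation}"  # roughly: aggregate_tool_results
    raise RuntimeError("No answer within max_steps")


final_answer, transcript = react_loop(
    "History of Prussia", scripted_llm, {"web_search": fake_web_search}
)
print(transcript)
print("Final:", final_answer)
```

The `max_steps` limit guards against a model that keeps calling tools without ever producing an answer.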

In [None]:
from duckduckgo_search import DDGS
from llama_index.core.agent.workflow import ReActAgent, FunctionAgent
from llama_index.core.tools import FunctionTool
from llama_index.core.workflow import Context

import mlflow

# Point MLflow at the tracking server first, so the experiment is created there.
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment(experiment_name="Test")
mlflow.llama_index.autolog()


def get_relevant_webpages(query: str) -> list:
    """
    Searches the web and returns the pages relevant to a query.

    Args:
        query (str): the text to search the web for.

    Returns:
        list: search results, each a dict with the page title ('title'),
            its URL ('href'), and a short text snippet ('body').
    """
    search_ggg = DDGS()
    results = search_ggg.text(query)
    return results


web_search_tool = FunctionTool.from_defaults(
    fn=get_relevant_webpages,
    name="get_relevant_webpages",
    description="Useful for getting a list of relevant webpages (URL links) for a particular query. "
    "Along with each link, the list includes a short snippet of information related to the query.",
)

search_agent = ReActAgent(
    name="Web Searcher",
    description="Searches the web and returns links to the relevant pages it finds.",
    system_prompt="",
    tools=[web_search_tool],
    verbose=True,
    llm=chat_model,
)


ctx = Context(search_agent)
answer = await search_agent.run("History of Prussia, search online with the web search tool", ctx=ctx)
print(answer)
pprint(answer.response.blocks[0].text)


Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step




Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced event StopEvent
Prussia was a Germanic kingdom and state that existed from the 1200s to the 1900s. Its importance peaked when it united the German states to form the German Empire in 1871 before being de facto dissolved by an emergency decree transferring powers of the Prussian government to the German Chancery. The territory included East Prussia, Brandenburg, and Saxony (including much of present-day Poland). Before its dissolution, there were provinces such as East Prussia and the margraves of Brandenburg who became highly dependent on estates representing counts, lords, knights, and towns.

Prussia's coat of arms depicted a black eagle on a white background. The main flag featured this same symbol. 

Key events include the granting of Burzenland in Transylvania to the Teutonic Knights by King Andrew II of Hungary in 1211 as a fiefdom for the German military order.

In the
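The cells above call `await search_agent.run(...)` at the top level, which works in a notebook because an event loop is already running. In a plain Python script, the same call would need to live inside a coroutine that is started with `asyncio.run`. The sketch below shows that pattern with a hypothetical `FakeAgent` standing in for the real agent so that it runs without any model.

```python
import asyncio


class FakeAgent:
    """Hypothetical stand-in exposing an awaitable run(), like the agent above."""

    async def run(self, prompt):
        await asyncio.sleep(0)  # pretend to wait for the LLM and the tools
        return f"echo: {prompt}"


async def main():
    agent = FakeAgent()
    # Same shape as `await search_agent.run(...)` in the notebook cell.
    return await agent.run("History of Prussia")


# asyncio.run creates an event loop, runs main() to completion, and closes the loop.
answer_text = asyncio.run(main())
print(answer_text)
```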

In [None]:
llama_index.core.global_handler


<llama_index.core.callbacks.simple_llm_handler.SimpleLLMHandler at 0x3d50849b0>

