# Agentic AI with a decision process that selects a 'researcher' or a 'journalist' agent using crewAI

# Instructions:

- Go to the **Assessments** tab in iCollege and click on **Assignments** to find the **Homework 2** assignment and further instructions for the assignment.

- Make a copy of this homework notebook and rename the copy using your GSU username as a suffix:

    **AgenticAI_RAG_Homework2_\<YOUR GSU USERNAME\>.ipynb**

- Use the **MSA8700 Python 3.12** kernel when you execute your notebook cells! This kernel has all the necessary Python packages already installed. You do not have to install any packages. Do **NOT** use any of the other kernels.

- Work with your renamed copy to create your solution and all the necessary .txt files that will be generated and submitted with your notebook. 

- Complete all cells in the notebook i.e. replace missing fields (abbreviated with **'...'** in the code cells below) with your code cells, run the notebook and create the 6 necessary .txt files. 

- Run the tests to confirm that your .txt files were generated. Finally, submit your .txt files and your executed .ipynb notebook by running the two submission steps at the very end of the notebook.

# Important Note:

- When running the 'researcher' branch, you might notice that **`pdf_search_tool` fails**, i.e. a response cannot be generated from the PDF file and you get an error message.

- When that occurs, **delete the database folder db/** that has been generated (e.g. with **rm -rf db** on the command line in a terminal window; you might have to restart or shut down the kernel before you can do that) and re-run the cells. The database folder db/ will be re-created and `pdf_search_tool` should return a result without showing an error.
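
The cleanup described above can also be done from Python instead of a terminal. The following is a minimal sketch, assuming the vector store lives in a folder named `db` relative to the notebook's working directory, as the note states. The helper name `reset_vector_db` is hypothetical and not part of crewAI.

```python
import shutil
from pathlib import Path


def reset_vector_db(db_dir: str = "db") -> bool:
    """Delete the vector-store folder if it exists.

    Returns True if a folder was removed, False if there was nothing to remove.
    """
    path = Path(db_dir)
    if not path.is_dir():
        return False
    shutil.rmtree(path)  # removes the folder and everything inside it
    return True
```

After calling `reset_vector_db()`, re-run the cells starting from the tool definition so that the folder is re-created.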

In [1]:
# Do not change the content of this cell. Execute this cell first, and every time after you restart the kernel.
%reload_ext autoreload
%autoreload 2

In [3]:
import os
from pprint import pprint
from langchain_openai import ChatOpenAI
from crewai import Agent, Task, Crew, LLM
from crewai.tools import tool
from crewai_tools import SerperDevTool, ScrapeWebsiteTool
from crewai_tools import PDFSearchTool


### Load the API keys

In [4]:
# Load your API keys here and create your environment variables.

# IMPORTANT: Do not directly copy your API keys into your notebooks and submit them with your solution!

OPENAI_API_KEY =''
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

GEMINI_API_KEY = ''
os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY

SERPER_API_KEY = ''
os.environ["SERPER_API_KEY"] = SERPER_API_KEY

In [5]:
GROQ_API_KEY = ''
os.environ["GROQ_API_KEY"] = GROQ_API_KEY
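
To keep keys out of the submitted notebook, one option is to export them in the shell before starting Jupyter and only read them here. The following is a minimal sketch under that assumption; `require_env` is a hypothetical helper name, not something provided by the course environment.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set or is empty.")
    return value
```

With this in place, a cell could use `OPENAI_API_KEY = require_env("OPENAI_API_KEY")` instead of a literal string, so an unset key is reported immediately rather than surfacing later as an authentication error.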

### Load the PDF document

Download the NIPS 2017 paper and store it locally in the **data/** folder.

Source: 

**Advances in Neural Information Processing Systems 30 (NIPS 2017)**. Edited by: I. Guyon and U. Von Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett. ISBN: 9781510860964. https://proceedings.neurips.cc/paper_files/paper/2017

**Attention is All you Need** Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, Illia Polosukhin. https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

In [6]:
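import os
import requests

pdf_url = 'https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf'
response = requests.get(pdf_url, timeout=60)
response.raise_for_status()  # stop early if the download failed

os.makedirs('data', exist_ok=True)  # make sure the target folder exists
with open('data/NIPS-2017-attention-is-all-you-need-Paper.pdf', 'wb') as file:
    file.write(response.content)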
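
A failed or redirected download can leave an HTML error page on disk under a `.pdf` name, which would then be indexed by the search tool. As a hedged sketch, the check below relies only on the fact that PDF files begin with the bytes `%PDF-`; the helper name `looks_like_pdf` is hypothetical.

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Return True if the file exists and starts with the PDF magic bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(5) == b"%PDF-"
```

For example, `looks_like_pdf('data/NIPS-2017-attention-is-all-you-need-Paper.pdf')` could be called before `PDFSearchTool` is created.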

### Initialize LLMs

In [7]:
# Upgrade the langchain-openai package first if needed:
#!pip install -U langchain-openai

from langchain_openai import ChatOpenAI

crew_llm = ChatOpenAI(
    api_key=OPENAI_API_KEY,
    model="gpt-3.5-turbo",  # or "gpt-4-1106-preview" if you have access
    max_tokens=1024,
    temperature=0.7
)

In [8]:
# Alternative:
# Pick a Groq chat model with API key, max_tokens and temperature parameter values:



### Define Tools

In [9]:
from crewai_tools import PDFSearchTool

pdf_search_tool = PDFSearchTool(
    pdf='data/NIPS-2017-attention-is-all-you-need-Paper.pdf',
    config=dict(
        llm=dict(
            provider="openai",
            config=dict(
                model="gpt-4",
                temperature=0.7,
                top_p=1,
                stream=False
            )
        ),
        embedder=dict(
            provider="openai",
            config=dict(
                model="text-embedding-3-small"
                # Do not include task_type or title here
            )
        )
    )
)


In [10]:
# Custom-defined tool using the @tool decorator:
@tool("Router Function")
def router_tool(question: str) -> str:
  """Router Function"""
  if 'self-attention' in question:
    return 'researcher'
  else:
    return 'journalist'
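
The keyword check in `router_tool` is case-sensitive, so a question that spells the term as 'Self-Attention' would be routed to the journalist. The sketch below shows the same rule as a plain function with case-insensitive matching and a configurable keyword list. The function name and the default keyword tuple are illustrative assumptions, and the function is deliberately not wrapped in `@tool`.

```python
def route_by_keyword(question: str, keywords: tuple[str, ...] = ("self-attention",)) -> str:
    """Return 'researcher' if any keyword occurs in the question, else 'journalist'.

    Matching ignores upper/lower case.
    """
    text = question.lower()
    if any(keyword.lower() in text for keyword in keywords):
        return "researcher"
    return "journalist"
```

A keyword rule like this is cheap and deterministic, but it only covers topics that were anticipated in the keyword list, which is why the notebook goes on to use an LLM-based router.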

### Create decision router

In [11]:
# Create a decision router function based on an input 'query':

# The `decide_hiring()` function shall make a binary decision and return a string. 
# This function shall either return the value 'researcher' or the value 'journalist' 
# depending on whether a certain query can be answered with the content in a PDF file 
# or has to be answered via a web search with web scraping task.

# An agent named 'router' shall execute a task using the tool `pdf_search_tool` to determine whether
# the query can be answered with the PDF file content or has to be answered via a web search with web scraping activity.

# Enter definitions for the variables 'role', 'goal' and 'backstory' of the Agent 'router'.
# Enter definitions for the variables 'description', 'expected_output', 'agent', and 'tools' of the Task 'router_task'.
# Enter definitions for the variables 'agents' and 'tasks' in the Crew class 'crew'. 
# Finally, define the 'inputs' dictionary in order to execute `result = crew.kickoff(inputs={...})` to generate an output string 'result.raw'

# Note: You might have to experiment with different definitions to get a well-performing agentic workflow.

from crewai import Agent, Task, Crew

def decide_hiring(query):
    """Router function to determine if we can answer from the existing PDF file"""

    router = Agent(
        role='router',
        goal='Decide if a query can be answered from the PDF document',
        backstory=(
            "You are a smart router that determines whether a given user question "
            "can be answered using the PDF content or requires external web search."
        ),
        verbose=True,
        allow_delegation=False,
        llm=crew_llm,  # assuming crew_llm is already initialized
    )

    router_task = Task(
        description=f"Given the question: '{query}', determine if it can be answered using only the provided PDF document.",
        expected_output="Return 'yes' if the PDF content contains enough information to answer the question. Return 'no' otherwise.",
        agent=router,
        tools=[pdf_search_tool],  # assuming pdf_search_tool is already initialized
    )

    crew = Crew(
        agents=[router],
        tasks=[router_task],
        verbose=1,
        memory=False,
    )

    result = crew.kickoff(inputs={"query": query})
    return result.raw
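
The comments above ask for the value 'researcher' or 'journalist', whereas this router answers 'yes' or 'no', and a model may wrap its answer in extra words or punctuation. The following is a sketch of one way to map the raw text to the two labels. `normalize_decision` is a hypothetical helper, and falling back to 'journalist' when the output is unclear is an assumption, not a requirement of the assignment.

```python
import re


def normalize_decision(raw: str) -> str:
    """Map free-form router output to 'researcher' or 'journalist'."""
    # Compare whole words so that e.g. 'yesterday' does not count as 'yes'.
    words = re.findall(r"[a-z]+", raw.lower())
    if "yes" in words or "researcher" in words:
        return "researcher"
    return "journalist"
```

Matching on whole words is slightly stricter than a substring test such as `"yes" in decision`, which would also accept words that merely contain those letters.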


### Create branch workflows with agents, tasks and crews

#### Create the 'researcher' branch in the process: 

In [12]:
# For the 'researcher' branch, we want to define a function `hire_researcher(query)` that is executed 
# if the 'router' agent determines that the 'query' can be answered by the content available in the PDF file. 

# Define an Agent 'researcher', Task 'researcher_task' and a Crew that will use the tool `pdf_search_tool` to
# provide a response to the query from the PDF file.

# Similar to the 'router' workflow above:
# Enter definitions for the variables 'role', 'goal' and 'backstory' of the Agent 'researcher'. 
# For the 'role' you can specify 'Expert Data Analyst'.

# Enter definitions for the variables 'description', 'expected_output', 'agent', and 'tools' of the Task 'researcher_task'.
# In the 'description' argument refer to the '{topic}' variable as placeholder for the input query.  

# Enter definitions for the variables 'agents' and 'tasks' in the Crew class 'crew'. 

# By executing `result = crew.kickoff(inputs={"topic": query})` the response from searching the PDF file 
# is generated for the input 'query' by the agentic workflow. 

# Adding the PDF Searching Agent
def hire_researcher(query):
    # Researcher agent
    researcher = Agent(
        role="Expert Data Analyst",
        goal="Answer user questions using information found in a PDF document.",
        backstory=(
            "You are an expert data analyst skilled in navigating and understanding academic research papers. "
            "You help users by finding accurate, relevant information in the provided PDF document to answer their queries clearly."
        ),
        allow_delegation=False,
        verbose=True,
        llm=crew_llm  # assumed to be initialized as ChatOpenAI or compatible LLM
    )

    # Task for the researcher
    researcher_task = Task(
        description="Use the PDF document to answer the following question: '{topic}'",
        expected_output="A concise and accurate answer to the query based solely on the PDF content.",
        tools=[pdf_search_tool],  # assumed to be set up with the PDF path
        agent=researcher
    )

    # Create the Crew
    crew = Crew(
        agents=[researcher],
        tasks=[researcher_task],
        memory=False,
        verbose=1
    )

    print("The researcher was hired...")
    result = crew.kickoff(inputs={"topic": query})
    return result.raw


#### Create the 'journalist' branch in the process: 

In [13]:
# For the 'journalist' branch, alternatively, a function `hire_journalist(query)` is executed 
# if the 'router' agent determines that the 'query' cannot be answered by the content available in the PDF file 
# but needs to be answered via a web searching and web scraping activity. 

# See the Agent definitions of 'web_search_agent' and 'web_scraper_agent'.

# Enter definitions for the variables 'description', 'expected_output', 'agent', and 'tools' 
# of the 'search_task' and 'scraping_task'.

# In the 'description' argument refer to the '{topic}' variable as placeholder for the input query.  

# Enter definitions for the variables 'agents' and 'tasks' in the Crew class 'crew'. 

# By executing `result = crew.kickoff(inputs={"topic": query})` the response 
# from searching and scraping the web is generated.

# Note: Focus on the content scraped from the most relevant searched website with respect to the input 'query' 
# in order to limit the number of iterations in the web searching and web scraping process. 

# Adding the Web Searching and Scraping Agent
def run_web_search_scraping(query):
    """Setup the web scraping agent and related components"""
    search_tool = SerperDevTool()  # Tool for performing web searches
    scrape_website = ScrapeWebsiteTool()  # Tool for extracting data from websites
    
    # Define the web search agent
    web_search_agent = Agent(
        role="Expert Web Search Agent",
        goal="Identify and retrieve relevant web data for user queries",
        backstory="An expert in identifying valuable web sources for the user's needs",
        allow_delegation=False,
        verbose=True,
        llm=crew_llm
    )
    
    # Define the web scraping agent
    web_scraper_agent = Agent(
        role="Expert Web Scraper Agent",
        goal="Extract and analyze content from specific web page identified by the search agent",
        backstory="A highly skilled web scraper, capable of analyzing and summarizing website content accurately",
        allow_delegation=False,
        verbose=True,
        llm=crew_llm
    )
    
    # Define the web search task
    search_task = Task(
        description="Search the web and identify the most relevant source to answer the following query: '{topic}'",
        expected_output="The URL of the most relevant and trustworthy web page",
        tools=[search_tool],
        agent=web_search_agent,
    )
    
    # Define the web scraping task
    scraping_task = Task(
        description="Extract and summarize key insights from the webpage related to: '{topic}'",
        expected_output="A well-structured answer based on the content of the selected webpage",
        tools=[scrape_website],
        agent=web_scraper_agent,
    )
    
    # Define the crew to manage agents and tasks
    crew = Crew(
        agents=[web_search_agent, web_scraper_agent],
        tasks=[search_task, scraping_task],
        verbose=1,
        memory=False,
    )
    
    print("The journalist was hired ...")
    result = crew.kickoff(inputs={"topic": query})
    return result.raw


def generate_journalist_answer(context, query):
    """Generate final answer using LLM for RAG"""
    messages = [
        (
            "system",
            "You are an investigative journalist skilled in distilling information from web research into clear and accurate answers.",
        ),
        ("system", f"Context: {context}"),
        ("human", query),
    ]
    response = crew_llm.invoke(messages)
    return response.content


In [14]:
def hire_journalist(query): 
    # Retrieve context from web scraping
    context = run_web_search_scraping(query)
    # Generate final answer via RAG
    answer = generate_journalist_answer(context, query)
    return answer

#### Define process

In [15]:
# Create the necessary process flow with the functions `decide_hiring`, `hire_researcher`, `hire_journalist` 
# defined above to create the agentic workflow as described above.

# Replace the dots below (...) with your code insertions:

def process_query(query):
    """Main function to process user query"""
    print(f"Processing query: {query}")

    # Step 1: Decide which team will be hired based on the question (input query)
    decision = decide_hiring(query).lower()

    # Step 2: Generate final answer based on whether 'researcher' or 'journalist' branch was selected
    if "yes" in decision:
        # Researcher branch (answer can be found in the PDF); the researcher's
        # output is passed as context to the shared answer-generation helper
        context = hire_researcher(query)
        answer = generate_journalist_answer(context, query)
    else:
        # Journalist branch (answer needs web search + scraping)
        answer = hire_journalist(query)

    return answer
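
The branching logic can be checked without any API calls by passing the three steps in as functions. The following is a sketch only: `dispatch` and the stub lambdas are hypothetical stand-ins for `decide_hiring`, `hire_researcher` and `hire_journalist`, and the 'yes'/'no' convention is taken from the router task above.

```python
from typing import Callable


def dispatch(
    query: str,
    decide: Callable[[str], str],
    researcher: Callable[[str], str],
    journalist: Callable[[str], str],
) -> str:
    """Call the researcher branch when the router says 'yes', otherwise the journalist branch."""
    decision = decide(query).strip().lower()
    if decision.startswith("yes"):
        return researcher(query)
    return journalist(query)


# Offline check with stub functions instead of real crews.
stub_decide = lambda q: "yes" if "transformer" in q.lower() else "no"
print(dispatch("What is the Transformer?", stub_decide, lambda q: "PDF", lambda q: "WEB"))
print(dispatch("How does LoRA work?", stub_decide, lambda q: "PDF", lambda q: "WEB"))
```

Separating the control flow from the crews in this way makes it possible to confirm that each question reaches the intended branch before spending tokens on real runs.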


#### Execute process with query

In [16]:
# Create three different appropriate questions (input query) to test the execution of the 'researcher' branch
# where the 'researcher' agent extracts an answer for the question from the uploaded PDF file:

# Example: "How did the self-attention mechanism evolve in large language models?"

# Choose a different question as input!

In [36]:
# Pick your first 'question' (replace the ... and insert your string) 
# and execute the `process_query(question)`:
question = "What is the transformer architecture introduced in the 'Attention is All You Need' paper?"

# Execute the query
result = process_query(question)

print("\nFinal Answer:")
pprint(result)


Processing query: What is the transformer architecture introduced in the 'Attention is All You Need' paper?


# Agent: router
## Task: Given the question: 'What is the transformer architecture introduced in the 'Attention is All You Need' paper?', determine if it can be answered using only the provided PDF document.




# Agent: router
## Using tool: Search a PDF's content
## Tool Input:
"{\"query\": \"transformer architecture in 'Attention is All You Need' paper\"}"
## Tool Output:
Relevant Content:
[21] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. Effective approaches to attention- based neural machine translation. arXiv preprint arXiv:1508.04025, 2015. [22] Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model. In Empirical Methods in Natural Language Processing, 2016. [23] Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304, 2017. [24] Oﬁr Press and Lior Wolf. Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016. [25] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. arXiv pre



[1m[95m# Agent:[00m [1m[92mrouter[00m
[95m## Final Answer:[00m [92m
yes[00m




The researcher was hired...


# Agent: Expert Data Analyst
## Task: Use the PDF document to answer the following question: 'What is the transformer architecture introduced in the 'Attention is All You Need' paper?'




# Agent: Expert Data Analyst
## Using tool: Search a PDF's content
## Tool Input:
"{\"query\": \"transformer architecture\"}"
## Tool Output:
Relevant Content:
[21] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. Effective approaches to attention- based neural machine translation. arXiv preprint arXiv:1508.04025, 2015. [22] Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model. In Empirical Methods in Natural Language Processing, 2016. [23] Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304, 2017. [24] Oﬁr Press and Lior Wolf. Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016. [25] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909, 



[1m[95m# Agent:[00m [1m[92mExpert Data Analyst[00m
[95m## Final Answer:[00m [92m
To understand the transformer architecture introduced in the 'Attention is All You Need' paper, we need to add "positional encodings" to the input embeddings to provide information about the relative or absolute position of the tokens in the sequence.[00m





Final Answer:
("The transformer architecture introduced in the 'Attention is All You Need' "
 'paper is a neural network model that relies entirely on self-attention '
 'mechanisms to compute representations of input and output sequences. This '
 'architecture is designed to address the limitations of traditional '
 'sequence-to-sequence models like RNNs and LSTMs by eliminating recurrent '
 'connections and instead using self-attention mechanisms to capture '
 'dependencies between input and output tokens.\n'
 '\n'
 'The transformer model consists of an encoder and a decoder, both composed of '
 'multiple layers of self-attention mechanisms and feed-forward neural '
 'networks. The self-attention mechanism allows each token in the sequence to '
 'attend to all other tokens in the sequence, capturing long-range '
 'dependencies efficiently.\n'
 '\n'
 'The key components of the transformer architecture include multi-head '
 'self-attention mechanisms, feed-forward neural networks, laye

In [37]:
with open("researcher_1.txt", "w") as text_file:
    text_file.write(result)

In [38]:
!test/run_test.sh 1



platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_01.py .                                                   [100%]

Testing completed.


In [22]:
# Pick your second 'question' (replace the ... and insert your string) 
# and execute the `process_query(question)`:
question = "What is the difference between supervised and unsupervised learning?"
result = process_query(question)

print("\nFinal Answer:")
pprint(result)


Processing query: What is the difference between supervised and unsupervised learning?


# Agent: router
## Task: Given the question: 'What is the difference between supervised and unsupervised learning?', determine if it can be answered using only the provided PDF document.




# Agent: router
## Using tool: Search a PDF's content
## Tool Input:
"{\"query\": \"supervised and unsupervised learning difference\"}"
## Tool Output:
Relevant Content:
Macherey, et al. Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016. [32] Jie Zhou, Ying Cao, Xuguang Wang, Peng Li, and Wei Xu. Deep recurrent models with fast-forward connections for neural machine translation. CoRR, abs/1606.04199, 2016. 11

makes it more difﬁcult to learn dependencies between distant positions [ 11]. In the Transformer this is reduced to a constant number of operations, albeit at the cost of reduced effective resolution due to averaging attention-weighted positions, an effect we counteract with Multi-Head Attention as described in section 3.2. Self-attention, sometimes called intra-attention is an attention mechanism relating d



[1m[95m# Agent:[00m [1m[92mrouter[00m
[95m## Final Answer:[00m [92m
no[00m




The journalist was hired ...


# Agent: Expert Web Search Agent
## Task: Search the web and identify the most relevant source to answer the following query: 'What is the difference between supervised and unsupervised learning?'




# Agent: Expert Web Search Agent
## Using tool: Search the internet with Serper
## Tool Input:
"{\"search_query\": \"Difference between supervised and unsupervised learning\"}"
## Tool Output:
{'searchParameters': {'q': 'Difference between supervised and unsupervised learning', 'type': 'search', 'num': 10, 'engine': 'google'}, 'organic': [{'title': "Supervised vs. Unsupervised Learning: What's the Difference? - IBM", 'link': 'https://www.ibm.com/think/topics/supervised-vs-unsupervised-learning', 'snippet': 'To put it simply, supervised learning uses labeled input and output data, while an unsupervised learning algorithm does not.', 'position': 1, 'sitelinks': [{'title': "In this article, we'll explore the...", 'link': 'https://www.ibm.com/think/topics/supervised-vs-unsupervised-learning#In+this+article%2C+we%E2%80%99ll+explore+the+basics+of+two+data+science+approaches%3A+supervised+and+unsupervised.+



[1m[95m# Agent:[00m [1m[92mExpert Web Search Agent[00m
[95m## Final Answer:[00m [92m
https://www.ibm.com/think/topics/supervised-vs-unsupervised-learning[00m




[1m[95m# Agent:[00m [1m[92mExpert Web Scraper Agent[00m
[95m## Task:[00m [92mExtract and summarize key insights from the webpage related to: 'What is the difference between supervised and unsupervised learning?'[00m




[1m[95m# Agent:[00m [1m[92mExpert Web Scraper Agent[00m
[95m## Using tool:[00m [92mRead website content[00m
[95m## Tool Input:[00m [92m
"{\"website_url\": \"https://www.ibm.com/think/topics/supervised-vs-unsupervised-learning\"}"[00m
[95m## Tool Output:[00m [92m

Supervised vs. Unsupervised Learning: What’s the Difference? | IBM
Supervised versus unsupervised learning: What's the difference?
Tags
Artificial Intelligence
12 March 2021
5 min read
Link copied
In this article, we’ll explore the basics of two data science approaches: supervised and unsupervised. Find out which approach is right for your situation.
The world is getting “smarter” every day, and to keep up with consumer expectations, companies are increasingly using machine learning algorithms to make things easier. You can see them in use in end-user devices (through face recognition for unlocking smartphones) or for detecting credit card fraud (like triggering alerts for unusual purchases).
Within  artific



[1m[95m# Agent:[00m [1m[92mExpert Web Scraper Agent[00m
[95m## Final Answer:[00m [92m
In this article from IBM, the main difference between supervised and unsupervised learning is explained. Supervised learning uses labeled data sets to train algorithms for classification and regression tasks, while unsupervised learning analyzes and clusters unlabeled data to discover patterns. Supervised learning is ideal for predicting outcomes with labeled data, while unsupervised learning is great for getting insights from large data volumes. The article also discusses the goals, applications, complexity, and drawbacks of both approaches, highlighting the importance of evaluating data structure, defining goals, and reviewing algorithm options when choosing between supervised and unsupervised learning. Additionally, the article introduces semi-supervised learning as a hybrid approach that combines labeled and unlabeled data for improved accuracy in scenarios where relevant features are c


Final Answer:
('Supervised learning and unsupervised learning are two main categories of '
 'machine learning approaches:\n'
 '\n'
 '1. Supervised Learning:\n'
 '- In supervised learning, the algorithm is trained on a labeled dataset, '
 'where the input data is paired with the corresponding correct output.\n'
 '- The goal of supervised learning is to learn a mapping function from input '
 'variables to output labels, enabling the algorithm to make predictions or '
 'classify new data.\n'
 '- It is typically used for tasks like classification and regression, where '
 'the algorithm learns to predict outcomes based on labeled data.\n'
 '- Supervised learning requires labeled data for training, and the algorithm '
 'aims to minimize the error between predicted and actual outputs during '
 'training.\n'
 '- Examples of supervised learning algorithms include linear regression, '
 'logistic regression, support vector machines, and neural networks.\n'
 '\n'
 '2. Unsupervised Learning:\n'
 '

In [23]:
with open("researcher_2.txt", "w") as text_file:
    text_file.write(result)

In [24]:
!test/run_test.sh 2



platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_02.py .                                                   [100%]

Testing completed.


In [25]:
# Pick your third 'question' (replace the ... and insert your string) 
# and execute the `process_query(question)`:
question = " What are the key components of the Transformer architecture? "
result = process_query(question)

print("\nFinal Answer:")
pprint(result)

Processing query:  What are the key components of the Transformer architecture? 


# Agent: router
## Task: Given the question: ' What are the key components of the Transformer architecture? ', determine if it can be answered using only the provided PDF document.




# Agent: router
## Using tool: Search a PDF's content
## Tool Input:
"{\"query\": \"key components of the Transformer architecture\"}"
## Tool Output:
Relevant Content:
[21] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. Effective approaches to attention- based neural machine translation. arXiv preprint arXiv:1508.04025, 2015. [22] Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model. In Empirical Methods in Natural Language Processing, 2016. [23] Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304, 2017. [24] Oﬁr Press and Lior Wolf. Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016. [25] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. arXiv preprint arXiv:150



[1m[95m# Agent:[00m [1m[92mrouter[00m
[95m## Final Answer:[00m [92m
yes[00m




The researcher was hired...


# Agent: Expert Data Analyst
## Task: Use the PDF document to answer the following question: ' What are the key components of the Transformer architecture? '




# Agent: Expert Data Analyst
## Using tool: Search a PDF's content
## Tool Input:
"{\"query\": \"key components of the Transformer architecture\"}"
## Tool Output:
Relevant Content:
[21] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. Effective approaches to attention- based neural machine translation. arXiv preprint arXiv:1508.04025, 2015. [22] Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model. In Empirical Methods in Natural Language Processing, 2016. [23] Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304, 2017. [24] Oﬁr Press and Lior Wolf. Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016. [25] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. arXiv prepr



[1m[95m# Agent:[00m [1m[92mExpert Data Analyst[00m
[95m## Final Answer:[00m [92m
To address the position of tokens in a sequence, the Transformer architecture includes "positional encodings" added to the input embeddings.[00m





Final Answer:
('The key components of the Transformer architecture are:\n'
 '1. **Attention Mechanism**: Allows the model to weigh the importance of '
 'different parts of the input sequence when generating an output.\n'
 '2. **Encoder**: Composed of a stack of identical layers, each with two '
 'sub-layers - a multi-head self-attention mechanism and a feed-forward neural '
 'network.\n'
 '3. **Decoder**: Also composed of a stack of identical layers, with three '
 'sub-layers - a multi-head self-attention mechanism, a multi-head attention '
 "mechanism over the encoder's output, and a feed-forward neural network.\n"
 '4. **Multi-Head Attention**: Allows the model to focus on different parts of '
 'the sequence simultaneously by projecting the input into multiple subspaces '
 'and computing attention in each subspace.\n'
 '5. **Position-wise Feed-Forward Networks**: Used in both the encoder and '
 'decoder layers to apply a simple feed-forward neural network to each '
 'position separa

In [26]:
with open("researcher_3.txt", "w") as text_file:
    text_file.write(result)

In [27]:
!test/run_test.sh 3



platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_03.py .                                                   [100%]

Testing completed.


In [28]:
# Pick three different, appropriate questions (input queries) to test the 'journalist' branch,
# where the 'journalist' agent answers the question
# via a web search and by scraping the most relevant web page found:

# Example: "How does LoRA work in large language model fine-tuning?"

# Choose a different question as input!
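
In the runs recorded in this notebook, the router agent ends with a yes/no verdict, and a verdict of `no` is followed by the journalist branch. The sketch below shows that decision in isolation. It assumes the verdict arrives as a plain string, and `choose_branch` is a hypothetical helper, not the notebook's `process_query`.

```python
def choose_branch(router_answer: str) -> str:
    """Map the router's yes/no verdict to a branch name (hypothetical helper)."""
    verdict = router_answer.strip().lower()
    # Assumption: the router replies "yes" when the PDF alone can answer the question.
    if verdict.startswith("yes"):
        return "researcher"
    # Anything else, including "no", falls back to the web-search branch.
    return "journalist"


print(choose_branch("no"))   # journalist
print(choose_branch("Yes"))  # researcher
```

Normalizing the verdict before comparing it keeps the routing stable if the model adds whitespace or changes capitalization.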

In [30]:
# Pick your first 'question' (replace the ... with your string)
# and execute `process_query(question)`:
question = "How does LoRA improve efficiency in fine-tuning large language models compared to full fine-tuning?"
result = process_query(question)

print("\nFinal Answer:")
pprint(result)

Processing query: How does LoRA improve efficiency in fine-tuning large language models compared to full fine-tuning?


# Agent: router
## Task: Given the question: 'How does LoRA improve efficiency in fine-tuning large language models compared to full fine-tuning?', determine if it can be answered using only the provided PDF document.




# Agent: router
## Using tool: Search a PDF's content
## Tool Input:
"{\"query\": \"LoRA efficiency in fine-tuning large language models compared to full fine-tuning\"}"
## Tool Output:
Relevant Content:
of the sequence, we must inject some information about the relative or absolute position of the tokens in the sequence. To this end, we add "positional encodings" to the input embeddings at the 5

a few cases [22], however, such attention mechanisms are used in conjunction with a recurrent network. In this work we propose the Transformer, a model architecture eschewing recurrence and instead relying entirely on an attention mechanism to draw global dependencies between input and output. The Transformer allows for significantly more parallelization and can reach a new state of the art in translation quality after being trained for as little as twelve hours on eight P100 GPUs. 2 Background The goal of r



# Agent: router
## Final Answer:
no




The journalist was hired ...


# Agent: Expert Web Search Agent
## Task: Search the web and identify the most relevant source to answer the following query: 'How does LoRA improve efficiency in fine-tuning large language models compared to full fine-tuning?'




# Agent: Expert Web Search Agent
## Using tool: Search the internet with Serper
## Tool Input:
"{\"search_query\": \"How does LoRA improve efficiency in fine-tuning large language models compared to full fine-tuning?\"}"
## Tool Output:
{'searchParameters': {'q': 'How does LoRA improve efficiency in fine-tuning large language models compared to full fine-tuning?', 'type': 'search', 'num': 10, 'engine': 'google'}, 'organic': [{'title': 'Customizing LLMs: When to Choose LoRA or Full Fine-Tuning', 'link': 'https://gradientflow.com/lora-or-full-fine-tuning/', 'snippet': "LoRA maintains the base model's performance on tasks outside the target domain better than full fine-tuning, demonstrating stronger ...", 'position': 1}, {'title': 'Is LoRA Fine-Tuning Sometimes Less Effective Than Full ... - Reddit', 'link': 'https://www.reddit.com/r/LocalLLaMA/comments/1eg0cap/is_lora_finetuning_sometimes_less_effectiv



# Agent: Expert Web Search Agent
## Final Answer:
The URL of the most relevant and trustworthy web page discussing how LoRA improves efficiency in fine-tuning large language models compared to full fine-tuning is https://gradientflow.com/lora-or-full-fine-tuning/




# Agent: Expert Web Scraper Agent
## Task: Extract and summarize key insights from the webpage related to: 'How does LoRA improve efficiency in fine-tuning large language models compared to full fine-tuning?'




# Agent: Expert Web Scraper Agent
## Using tool: Read website content
## Tool Input:
"{\"website_url\": \"https://gradientflow.com/lora-or-full-fine-tuning/\"}"
## Tool Output:

Customizing LLMs: When to Choose LoRA or Full Fine-Tuning - Gradient Flow
Customizing LLMs: When to Choose LoRA or Full Fine-Tuning
Posted by Ben Lorica, May 20, 2024
The growing prevalence of large language models (LLMs) has spurred a demand for customization to suit specific tasks and domains. As I’ve noted in previous work, tailoring LLMs to unique needs can significantly enhance performance and cost-efficiency, particularly when striving for higher accuracy in specific applications.
Fine-tuning has some advantages over p



# Agent: Expert Web Scraper Agent
## Final Answer:
Customizing LLMs: When to Choose LoRA or Full Fine-Tuning - Gradient Flow
Customizing LLMs: When to Choose LoRA or Full Fine-Tuning
Posted by Ben Lorica, May 20, 2024
The growing prevalence of large language models (LLMs) has spurred a demand for customization to suit specific tasks and domains. As I’ve noted in previous work, tailoring LLMs to unique needs can significantly enhance performance and cost-efficiency, particularly when striving for higher accuracy in specific applications.
Fine-tuning has some advantages over prompt engineering, or training from scratch.
Fine-tuning LLMs allows developers to adapt pre-trained models to their unique requirements, ultimate


Final Answer:
('LoRA improves efficiency in fine-tuning large language models compared to '
 'full fine-tuning by being a parameter-efficient method that trains only '
 'low-rank perturbations to selected weight matrices. This means that LoRA '
 'focuses on adjusting specific parts of the pre-trained model rather than '
 'optimizing all model parameters, which can lead to faster training times and '
 'reduced computational resources required. Additionally, LoRA acts as a '
 "regularizer, helping to prevent overfitting and maintain the model's "
 'generalizability by preserving performance on tasks outside the target '
 'domain. This regularization property of LoRA can lead to improved efficiency '
 'in scenarios where the model needs to perform well on multiple related tasks '
 'without requiring extensive retraining of the entire model.')
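
The answer above attributes LoRA's efficiency to training only low-rank perturbations of selected weight matrices. A rough way to see the saving, assuming the standard LoRA formulation in which a frozen weight matrix W receives an update B·A of rank r, is to count trainable parameters for a single matrix. The dimensions (4096 × 4096) and rank (8) below are illustrative assumptions, not values taken from this notebook or the scraped article.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    full = d_out * d_in           # every entry of W is trainable
    lora = rank * (d_out + d_in)  # only B (d_out x r) and A (r x d_in) are trainable
    return full, lora


# Illustrative sizes only: one 4096 x 4096 projection with rank 8.
full, lora = lora_trainable_params(d_out=4096, d_in=4096, rank=8)
print(full, lora, f"{lora / full:.2%}")  # 16777216 65536 0.39%
```

Under these assumptions the LoRA update trains well under 1% of the parameters of the full matrix, which is where the reduced training cost comes from.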


In [31]:
# Write the final answer to the .txt file that the test below checks for.
with open("journalist_1.txt", "w", encoding="utf-8") as text_file:
    text_file.write(result)

In [32]:
!test/run_test.sh 4

platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_04.py .                                                   [100%]

Testing completed.


In [39]:
# Pick your second 'question' (replace the ... with your string)
# and execute `process_query(question)`:
question = "How are multimodal transformers evolving to handle more complex inputs in 2024?"
result = process_query(question)

print("\nFinal Answer:")
pprint(result)

Processing query: How are multimodal transformers evolving to handle more complex inputs in 2024?


# Agent: router
## Task: Given the question: 'How are multimodal transformers evolving to handle more complex inputs in 2024?', determine if it can be answered using only the provided PDF document.




# Agent: router
## Using tool: Search a PDF's content
## Tool Input:
"{\"query\": \"multimodal transformers evolving complex inputs 2024\"}"
## Tool Output:
Relevant Content:
[21] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025, 2015. [22] Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model. In Empirical Methods in Natural Language Processing, 2016. [23] Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304, 2017. [24] Ofir Press and Lior Wolf. Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016. [25] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. arXiv preprint arX



# Agent: router
## Final Answer:
no




The journalist was hired ...


# Agent: Expert Web Search Agent
## Task: Search the web and identify the most relevant source to answer the following query: 'How are multimodal transformers evolving to handle more complex inputs in 2024?'




# Agent: Expert Web Search Agent
## Using tool: Search the internet with Serper
## Tool Input:
"{\"search_query\": \"How are multimodal transformers evolving to handle more complex inputs in 2024?\"}"
## Tool Output:
{'searchParameters': {'q': 'How are multimodal transformers evolving to handle more complex inputs in 2024?', 'type': 'search', 'num': 10, 'engine': 'google'}, 'organic': [{'title': 'Generative AI Meets Multimodal Learning Systems in 2024', 'link': 'https://www.rapidinnovation.io/post/generative-ai-meets-multimodal-learning-systems-in-2024', 'snippet': "Another notable advancement is the use of deep learning techniques to refine the accuracy of multimodal systems. Google's Multimodal Transformer, for example, ...", 'position': 1}, {'title': 'The Era of Generative AI: How Large Multimodal Models are ...', 'link': 'https://medium.com/@santismm/the-era-of-generative-ai-how-large-multimodal-



# Agent: Expert Web Search Agent
## Final Answer:
https://www.rapidinnovation.io/post/generative-ai-meets-multimodal-learning-systems-in-2024




# Agent: Expert Web Scraper Agent
## Task: Extract and summarize key insights from the webpage related to: 'How are multimodal transformers evolving to handle more complex inputs in 2024?'




# Agent: Expert Web Scraper Agent
## Using tool: Read website content
## Tool Input:
"{\"website_url\": \"https://www.rapidinnovation.io/post/generative-ai-meets-multimodal-learning-systems-in-2024\"}"
## Tool Output:
Generative AI Meets Multimodal Learning: Revolutionizing Industries in 2024
Blockchain Our Blockchain Services Incubating a culture of innovation & creativity
Blockchain
Smart contract
Enterprise Blockchain
NFT & Web3 Solutions
Layer 2 Solutions Solutions Ethereum Development Build decentralized applications on the most versatile blockchain. Solana Development Experience lightning-fast and low-cost blockchain solutions. Binance Smart chain Development Unlock the power of BNB Chain for scalable DApps. Polkadot Development Seamlessly connect blockchains with Polkadot's interoperability. Polygon Development Scale your DApps with Ethereum’s leading Layer-2 solution. Avalance Development C



# Agent: Expert Web Scraper Agent
## Final Answer:
Generative AI Meets Multimodal Learning: Revolutionizing Industries in 2024
Blockchain Our Blockchain Services Incubating a culture of innovation & creativity
Blockchain
Smart contract
Enterprise Blockchain
NFT & Web3 Solutions
Layer 2 Solutions Solutions Ethereum Development Build decentralized applications on the most versatile blockchain. Solana Development Experience lightning-fast and low-cost blockchain solutions. Binance Smart chain Development Unlock the power of BNB Chain for scalable DApps. Polkadot Development Seamlessly connect blockchains with Polkadot's interoperability. Polygon Development Scale your DApps with Ethereum’s leading Layer-2 solution. Avalance Development Create high-performance DApps on a multi-chain ecosystem. Cardano Development Empower secure and sustainable blockchain innovation. Hyperledger Development Drive enterprise-grade solutions with private blockch


Final Answer:
('In 2024, multimodal transformers are evolving to handle more complex inputs '
 'by integrating various modalities of data, such as text, images, videos, and '
 'audio, into a single model. These models are designed to understand and '
 'process information from different sources simultaneously, allowing for a '
 'more comprehensive analysis of the input data.\n'
 '\n'
 'Some key advancements in multimodal transformers in 2024 include:\n'
 '\n'
 '1. **Improved Fusion Mechanisms:** Multimodal transformers are incorporating '
 'advanced fusion mechanisms to effectively combine information from diverse '
 'modalities. This helps in capturing complex relationships and dependencies '
 'between different types of data.\n'
 '\n'
 '2. **Attention Mechanisms:** Enhanced attention mechanisms are being '
 'developed to enable the model to focus on relevant aspects of each modality '
 "while considering their interactions. This improves the model's ability to "
 'extract meaningful

In [40]:
# Write the final answer to the .txt file that the test below checks for.
with open("journalist_2.txt", "w", encoding="utf-8") as text_file:
    text_file.write(result)

In [41]:
!test/run_test.sh 5

platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_05.py .                                                   [100%]

Testing completed.


In [42]:
# Pick your third 'question' (replace the ... with your string)
# and execute `process_query(question)`:
question = "How do sparse attention mechanisms improve transformer efficiency at scale?"
result = process_query(question)

print("\nFinal Answer:")
pprint(result)


Processing query: How do sparse attention mechanisms improve transformer efficiency at scale?


# Agent: router
## Task: Given the question: 'How do sparse attention mechanisms improve transformer efficiency at scale?', determine if it can be answered using only the provided PDF document.




# Agent: router
## Using tool: Search a PDF's content
## Tool Input:
"{\"query\": \"sparse attention mechanisms improve transformer efficiency at scale\"}"
## Tool Output:
Relevant Content:
[21] Minh-Thang Luong, Hieu Pham, and Christopher D Manning. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025, 2015. [22] Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A decomposable attention model. In Empirical Methods in Natural Language Processing, 2016. [23] Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304, 2017. [24] Ofir Press and Lior Wolf. Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016. [25] Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural machine translation of rare words with subword units. arX



# Agent: router
## Final Answer:
no




The journalist was hired ...


# Agent: Expert Web Search Agent
## Task: Search the web and identify the most relevant source to answer the following query: 'How do sparse attention mechanisms improve transformer efficiency at scale?'




# Agent: Expert Web Search Agent
## Using tool: Search the internet with Serper
## Tool Input:
"{\"search_query\": \"How do sparse attention mechanisms improve transformer efficiency at scale?\"}"
## Tool Output:
{'searchParameters': {'q': 'How do sparse attention mechanisms improve transformer efficiency at scale?', 'type': 'search', 'num': 10, 'engine': 'google'}, 'organic': [{'title': 'Demystifying Sparse Attention: A Comprehensive Guide from Scratch', 'link': 'https://medium.com/@vishal09vns/sparse-attention-dad17691478c', 'snippet': 'The attention mechanism allows the model to dynamically weigh and focus on different parts of the input sequence based on the context of the ...', 'position': 1}, {'title': 'Sparse Transformers: Efficient Scaling for Large Language Models', 'link': 'https://dev.to/nareshnishad/day-29-sparse-transformers-efficient-scaling-for-large-language-models-59j5', 'snippet': '



# Agent: Expert Web Search Agent
## Final Answer:
https://arxiv.org/html/2406.16747v1




# Agent: Expert Web Scraper Agent
## Task: Extract and summarize key insights from the webpage related to: 'How do sparse attention mechanisms improve transformer efficiency at scale?'




# Agent: Expert Web Scraper Agent
## Using tool: Read website content
## Tool Input:
"{\"website_url\": \"https://arxiv.org/html/2406.16747v1\"}"
## Tool Output:

Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers
1 Introduction
2 Related Work
Long-range Transformers
Sparse attention
3 SparseK Attention
3.1 Background
Self-Attention
SparseMax operator
3.2 Learnable Key-Value Pair Selection
Key-value pair selection
Irreversibile selection
Learnable selection
3.3 The Differentiable SparseK Operator
Definition
Solution
3.4 Extensions
Training with fixed-size truncation-free cache
Combine with other efficient attention mechanism
Straight-through estimator
3.5 Techniques for Faster and More Stable Training
Score normalization
Hard selection for keys and soft selection for values
Initialization to mimic attention score
4 Experiments
4.1 Language Modeling from Scratch



# Agent: Expert Web Scraper Agent
## Using tool: Read website content
## Tool Input:
"{\"website_url\": \"https://arxiv.org/html/2406.16747v1\"}"
## Tool Output:
I tried reusing the same input, I must stop using this action input. I'll try something else instead.






# Agent: Expert Web Scraper Agent
## Using tool: Read website content
## Tool Input:
"{\"website_url\": \"https://arxiv.org/html/2406.16747v1\"}"
## Tool Output:
I tried reusing the same input, I must stop using this action input. I'll try something else instead.




You ONLY have access to the following tools, and should NEVER make up tools that are not listed here:

Tool Name: Read website content
Tool Arguments: {'website_url': {'description': 'Mandatory website url to read the file', 'type': 'str'}}
Tool Description: A tool that can be used to read a website content.

IMPORTANT: Use the following format in your response:

```
Thought: you should always think about what to do
Action: the action to take, only one name of [Read website content], just the name, exactly as it's written.
Action Input: the input to the action, just a simple JSON object, enclosed in curly braces, using " to wrap keys and



# Agent: Expert Web Scraper Agent
## Using tool: Read website content
## Tool Input:
"{\"website_url\": \"https://arxiv.org/html/2406.16747v1\"}"
## Tool Output:
I tried reusing the same input, I must stop using this action input. I'll try something else instead.




You ONLY have access to the following tools, and should NEVER make up tools that are not listed here:

Tool Name: Read website content
Tool Arguments: {'website_url': {'description': 'Mandatory website url to read the file', 'type': 'str'}}
Tool Description: A tool that can be used to read a website content.

IMPORTANT: Use the following format in your response:

```
Thought: you should always think about what to do
Action: the action to take, only one name of [Read website content], just the name, exactly as it's written.
Action Input: the input to the action, just a simple JSON object, enclosed in curly braces, using " to wrap keys and



[1m[95m# Agent:[00m [1m[92mExpert Web Scraper Agent[00m
[95m## Using tool:[00m [92mRead website content[00m
[95m## Tool Input:[00m [92m
"{\"website_url\": \"https://arxiv.org/html/2406.16747v1\"}"[00m
[95m## Tool Output:[00m [92m
I tried reusing the same input, I must stop using this action input. I'll try something else instead.

[00m
[93m Maximum iterations reached. Requesting final answer.[00m




[1m[95m# Agent:[00m [1m[92mExpert Web Scraper Agent[00m
[95m## Final Answer:[00m [92m
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers
1 Introduction
2 Related Work
Long-range Transformers
Sparse attention
3 SparseK Attention
3.1 Background
Self-Attention
SparseMax operator
3.2 Learnable Key-Value Pair Selection
Key-value pair selection
Irreversibile selection
Learnable selection
3.3 The Differentiable SparseK Operator
Definition
Solution
3.4 Extensions
Training with fixed-size truncation-free cache
Combine with other efficient attention mechanism
Straight-through estimator
3.5 Techniques for Faster and More Stable Training
Score normalization
Hard selection for keys and soft selection for values
Initialization to mimic attention score
4 Experiments
4.1 Language Modeling from Scratch
4.2 Fine-tuning Existing Models
4.3 Retrieval-based Evaluation and Length Extrapolation
4.4 Downstream Task
5 Conclusion
A Derivation of SparseK
A.1 The


Final Answer:
('Sparse attention mechanisms improve transformer efficiency at scale by '
 'reducing the computational complexity and memory requirements associated '
 'with self-attention mechanisms, especially when dealing with long sequences. '
 'Here are some key ways in which sparse attention mechanisms achieve this:\n'
 '\n'
 '1. **Reduced Computational Complexity**: Traditional self-attention '
 'mechanisms in transformers have a quadratic computational complexity, '
 'meaning the time taken for computation increases quadratically with the '
 'sequence length. Sparse attention mechanisms, on the other hand, enable more '
 'efficient computation by focusing only on a subset of key-value pairs '
 'relevant to each query, resulting in a linear time complexity. This '
 'reduction in computational complexity makes it more feasible to handle long '
 'sequences efficiently.\n'
 '\n'
 '2. **Memory Efficiency**: Sparse attention mechanisms help in reducing the '
 'memory footprint requir
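
The agent log above shows the same `website_url` being sent to the tool repeatedly, each repeat being refused, until the run ends with "Maximum iterations reached". The following is a minimal sketch of that control flow, not crewAI's actual implementation: `run_tool_loop`, `fake_read_website`, and the `max_iter` value of 5 are hypothetical names and settings chosen only for illustration.

```python
import json


def run_tool_loop(proposed_calls, tool, max_iter=5):
    """Run proposed tool calls, refusing exact repeats, for at most max_iter steps.

    Hypothetical sketch; a real agent executor is more involved.
    """
    seen_inputs = set()
    log = []
    for step, tool_input in enumerate(proposed_calls):
        if step >= max_iter:
            # Iteration cap hit: stop calling tools and ask for a final answer.
            log.append("Maximum iterations reached. Requesting final answer.")
            break
        # Serialize the input so identical dictionaries compare equal.
        key = json.dumps(tool_input, sort_keys=True)
        if key in seen_inputs:
            log.append("I tried reusing the same input, I must stop using this action input.")
            continue
        seen_inputs.add(key)
        log.append(tool(**tool_input))
    return log


def fake_read_website(website_url):
    """Stand-in for a scraping tool (assumption: no network access here)."""
    return f"content of {website_url}"


# The agent proposes the same call seven times, as in the log above.
calls = [{"website_url": "https://arxiv.org/html/2406.16747v1"}] * 7
log = run_tool_loop(calls, fake_read_website, max_iter=5)
for line in log:
    print(line)
```

In this sketch only the first call reaches the tool; every later identical call is refused, and the loop ends once the cap is reached. If a run of yours gets stuck this way, changing the query or the task description so the agent proposes a different action is usually more productive than raising the cap.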

In [44]:
# Save the third journalist response as a text file for submission.
with open("journalist_3.txt", "w", encoding="utf-8") as text_file:
    text_file.write(str(result))

In [45]:
!test/run_test.sh 6

  pid, fd = os.forkpty()


platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_06.py .                                                   [100%]

Testing completed.


# Run all tests:

In [46]:
!test/run_test.sh

platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_01.py .                                                   [100%]

platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_02.py .                                                   [100%]

platform linux -- Python 3.12.9, pytest-8.3.4, pluggy-1.5.0
rootdir: /home/hkamble2/BuildingGenerativeAIBusinessSolutions/Homework2
plugins: anyio-4.6.2
collected 1 item

test/test_hw02_03.py .                                                   [

# Submission Instructions:

- Submit your **6 responses** (three responses for each of the two branches listed above) as the text files described above.

- Submit your **executed (!) Jupyter notebook**. See instructions below.

- **Important:** Your notebook solution needs to be executable with the Python packages listed and return correct answers with the selected queries.
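
Before submitting, it can help to confirm that all six response files exist and are not empty. The sketch below is one way to do that; the file names are taken from the submission listing in this notebook, while `find_missing_or_empty` is a hypothetical helper and not part of the `msa8700` package or the provided tests.

```python
from pathlib import Path

# Expected response files: three per branch, named as in the submission listing.
EXPECTED_FILES = [
    f"{branch}_{i}.txt"
    for branch in ("researcher", "journalist")
    for i in (1, 2, 3)
]


def find_missing_or_empty(folder="."):
    """Return the expected response files that are missing or empty in folder.

    Hypothetical helper for a quick pre-submission check.
    """
    base = Path(folder)
    problems = []
    for name in EXPECTED_FILES:
        path = base / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


problems = find_missing_or_empty()
if problems:
    print(f"Check these files before submitting: {problems}")
else:
    print("All six response files are present and non-empty.")
```

This only checks that the files exist and contain something; it does not replace the provided tests in `test/`, which may check more than that.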

# Homework submission on new IFI cluster:
- This homework is due by **2025-03-31, 2:30 PM (EDT) (Section 006)** or **2025-03-31, 6:00 PM (EDT) (Section 003)**, respectively.
- Make sure that all your programs and output files are in the exact folder specified in the instructions.
- All file names on this system are case sensitive. If you copy your work from a local computer to your home directory on the new cluster, verify that the file names still match exactly.
- **Execute the cells below to submit your assignment (6 text files and executed notebook):**

In [48]:
# Execute this cell. DO NOT EDIT THIS CONTENT!
from msa8700 import Homework2
Homework2(__session__).submit().list()

Created Homework 2.

  ____        _               _ _   
 / ___| _   _| |__  _ __ ___ (_) |_ 
 \___ \| | | | '_ \| '_ ` _ \| | __|
  ___) | |_| | |_) | | | | | | | |_ 
 |____/ \__,_|_.__/|_| |_| |_|_|\__|
                                    

Submit on behalf of: hkamble2
Submitted file: AgenticAI_RAG_Homework2_hkamble2.ipynb (1).ipynb
Submitted file: researcher_1.txt
Submitted file: researcher_2.txt
Submitted file: researcher_3.txt
Submitted file: journalist_1.txt
Submitted file: journalist_2.txt
Submitted file: journalist_3.txt
Your submission has been completed. You can re-submit your updated assignment until the deadline.

  _     _     _   _             
 | |   (_)___| |_(_)_ __   __ _ 
 | |   | / __| __| | '_ \ / _` |
 | |___| \__ \ |_| | | | | (_| |
 |_____|_|___/\__|_|_| |_|\__, |
                          |___/    

total 1220
-rw-r--r-- 1 hkamble2 hkamble2 1221797 Apr  1 00:44 AgenticAI_RAG_Homework2_hkamble2.ipynb (1).ipynb
-rw-r--r-- 1 hkamble2 hkamble2     793 Apr  1 00:3

<msa8700.Homework2 at 0x7fa0ba5aa2d0>