## 🔥 Please check this blog post for more details: blog link [TBC]

## Install Packages 

Install the following Python packages, listed in requirements.txt:
- **llama-index**: an open-source framework that helps build applications using LLMs. 
- **llama-index-llms-bedrock-converse**: Amazon Bedrock Converse integration with LlamaIndex.
- **llama-index-embeddings-bedrock**: Amazon Bedrock embedding models integration with LlamaIndex. 
- **llama-index-retrievers-bedrock**: Amazon Bedrock Knowledge Bases integration with LlamaIndex. 
- **llama-index-tools-arxiv**: A prebuilt tool to query arxiv.org
- **llama-index-tools-duckduckgo**: A prebuilt tool integrating DuckDuckGo search capabilities.
- **llama-index-postprocessor-bedrock-rerank**: A LlamaIndex plugin that uses Amazon Bedrock's Rerank API to reorder retrieved documents by relevance. 
- **llama-index-vector-stores-opensearch**: A LlamaIndex integration that uses Amazon OpenSearch as a vector store for embedding storage and similarity search.
- **feedparser**: A Python library for downloading and parsing syndicated feeds, including RSS, Atom, and RDF feeds.
- **opensearch-py**: A Python library for connecting to and interacting with OpenSearch/Elasticsearch clusters.
- **requests-aws4auth**: A Python library that handles AWS Signature Version 4 authentication for making signed HTTP requests to AWS services

In [None]:
%pip install -r requirements.txt -q

In [None]:
import nest_asyncio
nest_asyncio.apply()

In [None]:
# Initialise and configure the BedrockConverse LLM with the Mistral Large 2 model and set it as the default in Settings

from llama_index.llms.bedrock_converse import BedrockConverse
from llama_index.core.agent import FunctionCallingAgent
from llama_index.core.tools import FunctionTool

from llama_index.core import Settings

llm = BedrockConverse(model="mistral.mistral-large-2407-v1:0", max_tokens=2048)
Settings.llm = llm  # reuse the same instance as the default LLM


## API tools integration 

We implement two functions to interact with the GitHub and TechCrunch APIs. To ensure clear communication between the agent and the LLM, we follow Python function best practices, including:
- Type hints for parameter and return value validation
- Detailed docstrings explaining function purpose, parameters, and expected returns
- Clear function descriptions

For arXiv and DuckDuckGo integration, we leverage LlamaIndex's pre-built tools instead of creating custom functions. You can explore other available pre-built tools in the [LlamaIndex documentation](https://docs.llamaindex.ai/en/stable/api_reference/tools/) to avoid duplicating existing solutions. 
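
To illustrate why type hints and docstrings matter, here is a minimal sketch of the kind of metadata a tool wrapper can derive from a function's signature. `describe_tool` and `example_search` are hypothetical names used only for illustration; this is not LlamaIndex's actual implementation, which may extract and format this information differently.

```python
import inspect
from typing import get_type_hints


def describe_tool(fn):
    """Build a simple tool description from a function's signature and docstring."""
    signature = inspect.signature(fn)
    hints = get_type_hints(fn)

    def type_name(hint):
        # Fall back to a readable label when a parameter has no annotation
        if hint is None:
            return "unspecified"
        return hint.__name__ if isinstance(hint, type) else str(hint)

    parameters = {
        name: {
            "type": type_name(hints.get(name)),
            # A parameter without a default value must be supplied by the caller
            "required": param.default is inspect.Parameter.empty,
        }
        for name, param in signature.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": parameters,
    }


def example_search(topic: str, num_results: int = 3) -> list:
    """Search for a topic and return up to num_results items."""
    return []


spec = describe_tool(example_search)
print(spec)
```

Without the annotations and docstring, the description and parameter types would be empty, leaving the LLM with little to go on when deciding how to call the tool.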

In [None]:
# Define a function to search GitHub repositories by topic, sorting by stars or update date, and return top results

import requests

def github_search(topic: str, num_results: int = 3, sort_by: str = "stars") -> list:
    """
    Retrieve a specified number of GitHub repositories based on a given topic, 
    ranked by the specified criteria.

    This function uses the GitHub API to search for repositories related to a 
    specific topic or keyword. The results can be sorted by the number of stars 
    (popularity) or the most recent update, with the most relevant repositories 
    appearing first according to the chosen sorting method.

    Parameters:
    -----------
    topic : str
        The topic or keyword to search for in GitHub repositories.
        The topic cannot contain blank spaces.
    num_results : int, optional
        The number of repository results to retrieve. Defaults to 3.
    sort_by : str, optional
        The criterion for sorting the results. Options include:
        - 'stars': Sort by the number of stars (popularity).
        - 'updated': Sort by the date of the last update (most recent first).
        Defaults to 'stars'.

    Returns:
    --------
    list
        A list of dictionaries, where each dictionary contains information 
        about a repository. Each dictionary includes:
        - 'html_url': The URL of the repository.
        - 'description': A brief description of the repository.
        - 'stargazers_count': The number of stars (popularity) the repository has.
    """
    

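    # Pass the query via params so requests URL-encodes it; fail fast on HTTP errors or a hung connection
    response = requests.get(
        "https://api.github.com/search/repositories",
        params={"q": f"topic:{topic}", "sort": sort_by, "order": "desc"},
        timeout=10,
    )
    response.raise_for_status()
    items = response.json().get('items', [])
    
    # Keep only the fields the agent needs from each repository
    code_repos = [
        {
            'html_url': item['html_url'],
            'description': item['description'],
            'stargazers_count': item['stargazers_count'],
            # 'topics': item['topics']
        }
        for item in items[:num_results]
    ]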
    
    return code_repos

github_tool = FunctionTool.from_defaults(fn=github_search)
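
The request URL used by `github_search` can also be assembled and validated without touching the network. The sketch below is one way to do that with the standard library; `build_github_search_url` and `ALLOWED_SORT` are hypothetical helpers, and the allowed values are limited to the two options the docstring documents.

```python
from urllib.parse import urlencode

# Sort options documented in the github_search docstring
ALLOWED_SORT = {"stars", "updated"}


def build_github_search_url(topic: str, sort_by: str = "stars") -> str:
    """Return a URL-encoded GitHub repository search URL for a topic."""
    if sort_by not in ALLOWED_SORT:
        raise ValueError(f"sort_by must be one of {sorted(ALLOWED_SORT)}")
    # urlencode escapes characters such as ':' and spaces in the query
    query = urlencode({"q": f"topic:{topic}", "sort": sort_by, "order": "desc"})
    return f"https://api.github.com/search/repositories?{query}"


url = build_github_search_url("bedrock")
print(url)
```

Rejecting an unexpected `sort_by` value early gives the agent a clear error message instead of a confusing API response.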

In [None]:
# Define a function to search for TechCrunch news articles by topic and return details for a specified number of results

import feedparser
    
def news_search(topic: str, num_results: int = 3) -> list:
    """
    Retrieve a specified number of news articles from TechCrunch based on a given topic.

    This function queries the TechCrunch RSS feed to search for news articles related to the 
    provided topic and returns a list of the most relevant articles. Each article includes 
    details such as the title, link, publication date, and a summary or description.

    Parameters:
    -----------
    topic : str
        The keyword or subject to search for in the TechCrunch news feed.
        The topic cannot contain blank spaces.
        If multiple words are needed, connect them with "+" (e.g., artificial+intelligence).
    num_results : int, optional
        The number of articles to retrieve from the search results. Defaults to 3.

    Returns:
    --------
    list
        A list of dictionaries, where each dictionary contains information about a retrieved 
        news article. Each dictionary includes:
        - 'title': The title of the article.
        - 'link': The URL to the article.
        - 'published': The publication date of the article.
        - 'summary': A brief summary or description of the article, if available.
    """
    

    url = f"https://techcrunch.com/tag/{topic}/feed/"
    feed = feedparser.parse(url)
    
    news = []
    
    # Loop through the top num_results articles
    for entry in feed.entries[:num_results]:
        # Create a dictionary for each article
        article = {
            'title': entry.title,
            'link': entry.link,
            'published': entry.published,
            # Prefer the summary, fall back to the description, otherwise None
            'summary': getattr(entry, 'summary', getattr(entry, 'description', None)),
        }

        # Add the article dictionary to the list
        news.append(article)
    
    return news

news_tool = FunctionTool.from_defaults(fn=news_search)
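
The docstring asks the LLM to join multi-word topics with "+". As a safeguard, the function could normalise the topic itself. The sketch below shows one possible approach; `normalize_topic` and `build_feed_url` are hypothetical helpers, and whether TechCrunch resolves a given tag is not verified here.

```python
def normalize_topic(topic: str) -> str:
    """Collapse whitespace and join words with '+', as the docstring describes."""
    return "+".join(topic.split())


def build_feed_url(topic: str) -> str:
    """Build the TechCrunch tag feed URL used by news_search."""
    return f"https://techcrunch.com/tag/{normalize_topic(topic)}/feed/"


feed_url = build_feed_url("artificial  intelligence")
print(feed_url)
```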

In [None]:
# Import and configure the ArxivToolSpec and DuckDuckGoSearchToolSpec from LlamaIndex prebuilt tools

from llama_index.tools.arxiv import ArxivToolSpec
from llama_index.tools.duckduckgo import DuckDuckGoSearchToolSpec

arxiv_tool = ArxivToolSpec()
search_tool = DuckDuckGoSearchToolSpec()

api_tools = arxiv_tool.to_tool_list() + search_tool.to_tool_list(spec_functions=["duckduckgo_full_search"])

# Consolidate all tools into one list. 
api_tools.extend([news_tool, github_tool])

In [None]:
# Set up an agent with access to the GitHub, arXiv, DuckDuckGo, and TechCrunch tools, using a system prompt to guide interactions.

from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

import time
current_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time()))

system_prompt = f"""
You are a technology expert with access to the GitHub API, arXiv API, DuckDuckGo API and TechCrunch API. 
You can search for the latest code repositories, papers, and news related to technology.
Always try to use the tools available to you. 
If you don’t know the answer, do not make up any information; simply say: Sorry, I don’t know.

Current time is: {current_time}
"""

agent_worker = FunctionCallingAgentWorker.from_tools(
    api_tools, 
    llm=llm, 
    verbose=True, # Set verbose=True to display the full trace of steps. 
    system_prompt = system_prompt,
    allow_parallel_tool_calls = True # Allow multiple tool invocations
)
agent = AgentRunner(agent_worker)

In [None]:
response = agent.chat("Can you give me top 2 papers about GenAI, and recent news about bedrock")
print(str(response))

In [None]:
# Simple chatbot UI. Enter "exit" to quit. 
while True:
    text_input = input("User: ")
    if text_input == "exit":
        break
    response = agent.chat(text_input)

    print("-" * 120)
    print(" " * 120)
    print(f"Agent: {response}")
    print("=" * 120)
    print(" New question: ")
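
The chat loop above is repeated several times in this notebook. One option is to wrap it in a small function that takes the chat callable and the input source as arguments, which also makes it testable without a live agent. This is a sketch only; `run_chat` is a hypothetical helper and the stub below stands in for `agent.chat`.

```python
from typing import Callable, Iterable


def run_chat(
    chat_fn: Callable[[str], str],
    inputs: Iterable[str],
    write: Callable[[str], None] = print,
    exit_word: str = "exit",
) -> int:
    """Send each input to chat_fn until exit_word is seen; return the number of turns."""
    turns = 0
    for text in inputs:
        # Accept "exit", "Exit", " EXIT " and so on
        if text.strip().lower() == exit_word:
            break
        write(f"Agent: {chat_fn(text)}")
        turns += 1
    return turns


# Demo with a stub in place of the agent.
# With the real agent, something like the following could be used instead:
#   run_chat(lambda q: str(agent.chat(q)), iter(lambda: input("User: "), None))
outputs = []
turn_count = run_chat(lambda q: q.upper(), ["hello", "EXIT", "ignored"], write=outputs.append)
print(turn_count, outputs)
```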

In [None]:
# agent.memory.get() # retrieve conversation history
# agent.memory.reset() # clear the chat memory

In [None]:
# Test questions:
# 1. Any news about GenAI?
# 2. Can you give me the top 5 GitHub code repos related to Bedrock?
# 3. Can you show me the top 3 papers related to GenAI?

## Documents RAG Integration with Amazon OpenSearch Serverless

Below, we download two PDF decision guides from the AWS website. They provide recommendations for selecting GenAI and ML services in different scenarios and outline what to evaluate and consider in the decision-making process. You can replace these with your own internal business documents at this step.

In the section below, we use LlamaIndex to process documents into chunks and convert them into embedding vectors using the Amazon Bedrock Embedding model. We then use Amazon OpenSearch Serverless as a vector store to persist the vectors. 
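
To make the chunking step concrete, here is a simplified word-based splitter with overlap. It is only an illustration of the idea: `chunk_words` is a hypothetical helper, and LlamaIndex's `SentenceSplitter` works on tokens and sentence boundaries rather than plain words.

```python
def chunk_words(text: str, chunk_size: int = 8, overlap: int = 2) -> list:
    """Split text into chunks of chunk_size words, sharing `overlap` words between neighbours."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the text
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=1)
print(chunks)
```

The overlap means the last word of each chunk is repeated at the start of the next, so context that straddles a boundary is less likely to be lost at retrieval time.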


In [None]:
# On Ubuntu/Debian systems
!sudo apt-get update
!sudo apt-get install ca-certificates -y

# Download the test documents from the links below (create the target folder first)
!mkdir -p docs
!wget -O docs/genai_on_aws.pdf "https://docs.aws.amazon.com/pdfs/decision-guides/latest/generative-ai-on-aws-how-to-choose/generative-ai-on-aws-how-to-choose.pdf?did=wp_card&trk=wp_card"
!wget -O docs/ml_on_aws.pdf "https://docs.aws.amazon.com/pdfs/decision-guides/latest/machine-learning-on-aws-how-to-choose/machine-learning-on-aws-how-to-choose.pdf?did=wp_card&trk=wp_card#guide"

In [None]:
# Use LlamaIndex to load the documents

from llama_index.core import SimpleDirectoryReader
loader = SimpleDirectoryReader('docs/')
documents = loader.load_data()

In [None]:
# Initialise and configure the embedding model with Amazon Titan Text Embeddings V2, and set it as the default in Settings

from llama_index.embeddings.bedrock import BedrockEmbedding
embed_model = BedrockEmbedding(model_name="amazon.titan-embed-text-v2:0")
Settings.embed_model = embed_model  # reuse the same instance as the default embedding model

In [None]:
# Create Amazon OpenSearch Serverless collection 
from utils import *
import sagemaker 
import random

region_name = "us-west-2"
suffix = random.randrange(1, 500)
collection_name = "mistral-test-"+str(suffix)
notebook_execution_role = sagemaker.get_execution_role()

endpoint = create_collection(collection_name, notebook_execution_role)
print("Amazon OpenSearch Collection endpoint is: ", endpoint)

import time
# Wait 1 minute (60 seconds) for collection creation to complete
time.sleep(60)
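
A fixed sleep can be too short or needlessly long. As an alternative, the sketch below polls a readiness check until it succeeds or a timeout expires. `wait_until` is a hypothetical helper; with boto3, the check could plausibly call the OpenSearch Serverless `batch_get_collection` API and test for an `ACTIVE` status, but that wiring is an assumption and is not shown here.

```python
import time


def wait_until(check, timeout=300.0, interval=10.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or timeout seconds have passed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo with a fake check that succeeds on the third call (no real waiting)
calls = {"n": 0}


def fake_check():
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until(fake_check, timeout=5, interval=0, sleep=lambda seconds: None)
print(ready, calls["n"])
```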

In [None]:
## create an index in the collection
index_name = "pdf-docs"
create_index(index_name, endpoint, emb_dim=1024)
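
The `create_index` helper lives in `utils` and is not shown in this notebook. For orientation, the sketch below builds the kind of request body an OpenSearch k-NN index typically needs: the `knn` setting enabled and a `knn_vector` field whose dimension matches the embedding model. `knn_index_body` is a hypothetical function, the field names `vector` and `chunk` mirror those passed to the vector client later in this notebook, and the real helper may set additional options.

```python
import json


def knn_index_body(emb_dim: int, vector_field: str = "vector", text_field: str = "chunk") -> dict:
    """Return a minimal index body with one k-NN vector field and one text field."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                # The dimension must match the embedding model's output size
                vector_field: {"type": "knn_vector", "dimension": emb_dim},
                text_field: {"type": "text"},
            }
        },
    }


index_body = knn_index_body(1024)
print(json.dumps(index_body, indent=2))
```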

In [None]:
# Integrate the Amazon OpenSearch Serverless collection and index with LlamaIndex

import boto3
from llama_index.vector_stores.opensearch import OpensearchVectorStore, OpensearchVectorClient
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

dim = 1024 # Amazon Titan Embedding V2 model dimension 

service = 'aoss'
credentials = boto3.Session().get_credentials()
awsauth = AWSV4SignerAuth(credentials, region_name, service)

client = OpensearchVectorClient(
    endpoint, 
    index_name, 
    dim, 
    embedding_field="vector", 
    text_field="chunk",
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

In [None]:
# initialise vector store and save document chunks to the vector store 
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.core.node_parser import SentenceSplitter

vector_store = OpensearchVectorStore(client)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex.from_documents(
    documents, 
    storage_context=storage_context,
    transformations=[SentenceSplitter(chunk_size=1024, chunk_overlap=20)]
)

# LlamaIndex provides various text splitters, more information: https://docs.llamaindex.ai/en/stable/api_reference/node_parsers/

In [None]:
import time
# Wait 5 minutes (300 seconds) for document ingestion to complete
time.sleep(300)

In [None]:
# Use reranking to improve the quality and relevance of the context retrieval in RAG.

from llama_index.postprocessor.bedrock_rerank import AWSBedrockRerank
reranker = AWSBedrockRerank(
    top_n=3,
    model_id="amazon.rerank-v1:0",  # another rerank model option: cohere.rerank-v3-5:0
    region_name="us-west-2",
)
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[reranker],
)

# test the vector store with a simple question 
response = query_engine.query(
    "In which situation should I use Amazon Bedrock over Amazon SageMaker?",
)
print(response)
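
The query engine above first retrieves 10 candidates by vector similarity and then keeps the 3 that the reranker scores highest. The toy sketch below mimics that second stage with a simple word-overlap score. It is only a stand-in to show the control flow: `rerank` is a hypothetical function, and the Bedrock rerank models use a learned relevance model, not word overlap.

```python
def rerank(query: str, candidates: list, top_n: int = 3) -> list:
    """Keep the top_n candidates that share the most words with the query."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(text.lower().split())), position, text)
        for position, text in enumerate(candidates)
    ]
    # Highest score first; ties keep their original retrieval order
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [text for _, _, text in scored[:top_n]]


retrieved = [
    "s3 stores objects",
    "sagemaker trains custom models",
    "bedrock offers managed foundation models",
]
best = rerank("managed foundation models on bedrock", retrieved, top_n=2)
print(best)
```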

In [None]:
# create QueryEngineTool based on the OpenSearch vector store 

from llama_index.core.tools import QueryEngineTool, ToolMetadata
oss_tool = QueryEngineTool(
        query_engine=query_engine,
        metadata=ToolMetadata(
            name="oss_guide_tool",
            description="""
            These decision guides help users select appropriate AWS machine learning and generative AI services based on specific needs. 
            They cover pre-built solutions, customizable platforms, and infrastructure options for ML workflows, 
            while outlining how generative AI can automate processes, personalize content, augment data, reduce costs, 
            and enable faster experimentation in various business contexts.""",
        ),
    )

In [None]:
all_tools = api_tools + [oss_tool]

agent_worker = FunctionCallingAgentWorker.from_tools(
    all_tools, 
    llm=llm, 
    verbose=True, # Set verbose=True to display the full trace of steps. 
    system_prompt = system_prompt,
    # allow_parallel_tool_calls = True  # Uncomment this line to allow multiple tool invocations
)
agent = AgentRunner(agent_worker)

In [None]:
# Simple chatbot UI. Enter "exit" to quit. 

while True:
    text_input = input("User: ")
    if text_input == "exit":
        break
    response = agent.chat(text_input)

    print("-" * 120)
    print(" " * 120)
    print(f"Agent: {response}")
    print("=" * 120)
    print(" New question: ")
    # what services bedrock platform is offering
    #  what are the LLM models available from bedrock


## Documents RAG Integration with Bedrock Knowledge Bases Service

In the section below, we use Amazon Bedrock Knowledge Bases to build the RAG framework. You can create a Bedrock Knowledge Base from the [AWS console](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html) or follow this [notebook example](https://github.com/aws-samples/amazon-bedrock-workshop/blob/main/02_KnowledgeBases_and_RAG/0_create_ingest_documents_test_kb.ipynb) to create it programmatically. 

First, create a new S3 bucket for the Bedrock Knowledge Bases. Then, upload the previously downloaded files to this S3 bucket. You can select different embedding models and chunking strategies that work better for your data.

Once the Knowledge Base is created, remember to sync the data. Data synchronisation may take a few minutes. 

In [None]:
# After you create the knowledge base, provide Bedrock Knowledge Base ID 
knowledge_base_id = "[KNOWLEDGE_BASE_ID]"  # replace with your Knowledge Base ID

In [None]:
# Configure a knowledge base retriever using AmazonKnowledgeBasesRetriever

from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.retrievers.bedrock import AmazonKnowledgeBasesRetriever

# maximum number of relevant text chunks that will be retrieved
# If you need quick, focused answers: lower numbers (1-3)
# If you need detailed, comprehensive answers: higher numbers (5-10)
top_k = 10

# search mode options: HYBRID, SEMANTIC
# HYBRID search combines the strengths of semantic search and keyword search 
# Balances semantic understanding with exact matching
# https://docs.llamaindex.ai/en/stable/examples/retrievers/bedrock_retriever/
search_mode = "HYBRID"


#### Note 🔥: The Knowledge Base service role needs permission to call `bedrock:Rerank` and to invoke the rerank model
- Create the IAM inline policy below and attach it to the Amazon Bedrock Knowledge Base service role from the AWS console.

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": "arn:aws:bedrock:us-west-2::foundation-model/amazon.rerank-v1:0"
        },
        {
            "Effect": "Allow",
            "Action": "bedrock:Rerank",
            "Resource": "*"
        }
    ]
}
```
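
If you work in a different Region or use a different rerank model, the same policy can be generated rather than edited by hand. This is a small sketch; `rerank_policy` is a hypothetical helper, and the statements it emits simply mirror the JSON above.

```python
import json


def rerank_policy(region: str, model_id: str = "amazon.rerank-v1:0") -> dict:
    """Build the inline policy shown above for a given Region and rerank model."""
    model_arn = f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "bedrock:InvokeModel", "Resource": model_arn},
            {"Effect": "Allow", "Action": "bedrock:Rerank", "Resource": "*"},
        ],
    }


policy = rerank_policy("us-west-2")
print(json.dumps(policy, indent=4))
```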

In [None]:
kb_retriever = AmazonKnowledgeBasesRetriever(
    knowledge_base_id=knowledge_base_id,
    retrieval_config={
        "vectorSearchConfiguration": {
            "numberOfResults": top_k,
            "overrideSearchType": search_mode,
            'rerankingConfiguration': {
                'bedrockRerankingConfiguration': {
                    'modelConfiguration': {
                        'modelArn': 'arn:aws:bedrock:us-west-2::foundation-model/amazon.rerank-v1:0'
                    },
                    'numberOfRerankedResults': 3
                },
                'type': 'BEDROCK_RERANKING_MODEL'
            }
        },
        
    }
)
kb_engine = RetrieverQueryEngine(retriever=kb_retriever)
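
The nested `retrieval_config` dictionary is easy to mistype. One option is to build it from a few named arguments, as in the sketch below. `build_retrieval_config` and `VALID_SEARCH_MODES` are hypothetical; the structure copies the configuration above, and the check that the number of reranked results does not exceed `top_k` is a sanity check assumed for this sketch, not a documented service rule.

```python
# Search modes listed in the comments earlier in this notebook
VALID_SEARCH_MODES = {"HYBRID", "SEMANTIC"}


def build_retrieval_config(top_k: int, search_mode: str, rerank_model_arn: str, num_reranked: int = 3) -> dict:
    """Assemble the retrieval_config dictionary for the knowledge base retriever."""
    if search_mode not in VALID_SEARCH_MODES:
        raise ValueError(f"search_mode must be one of {sorted(VALID_SEARCH_MODES)}")
    if num_reranked > top_k:
        raise ValueError("num_reranked cannot exceed top_k")
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": top_k,
            "overrideSearchType": search_mode,
            "rerankingConfiguration": {
                "bedrockRerankingConfiguration": {
                    "modelConfiguration": {"modelArn": rerank_model_arn},
                    "numberOfRerankedResults": num_reranked,
                },
                "type": "BEDROCK_RERANKING_MODEL",
            },
        }
    }


config = build_retrieval_config(
    top_k=10,
    search_mode="HYBRID",
    rerank_model_arn="arn:aws:bedrock:us-west-2::foundation-model/amazon.rerank-v1:0",
)
print(config)
```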

In [None]:
# test the knowledge base with a simple question 
response = kb_engine.query(
    "In which situation should I use Amazon Bedrock over Amazon SageMaker?",
)
print(response)

In [None]:
# Create a query tool for Bedrock Knowledge Base

kb_tool = QueryEngineTool(
        query_engine=kb_engine,
        metadata=ToolMetadata(
            name="kb_tool",
            description="""
            These decision guides help users select appropriate AWS machine learning and generative AI services based on specific needs. 
            They cover pre-built solutions, customizable platforms, and infrastructure options for ML workflows, 
            while outlining how generative AI can automate processes, personalize content, augment data, reduce costs, 
            and enable faster experimentation in various business contexts.""",
        ),
    )

In [None]:
import time
current_time = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time()))

system_prompt = f"""
You are a technology expert with access to the GitHub API, arXiv API, DuckDuckGo API and TechCrunch API. 
You can search for the latest code repositories, research papers, and technology news, and search the Internet using the DuckDuckGo API.
You have access to the Amazon Bedrock user guide, which provides information about services offered by Bedrock, 
such as Agents, Knowledge Bases, Guardrails, Model Evaluation, and Model Fine-Tuning. 
It also provides third-party foundation models and Amazon LLMs via the Bedrock platform.
Always utilise the tools at your disposal.
If you don’t know the answer, do not make up any information; simply say: Sorry, I don’t know.

Current time is: {current_time}
"""

# Update the agent to include all API tools and the Knowledge Base tool.

all_tools = api_tools + [kb_tool]

agent_worker = FunctionCallingAgentWorker.from_tools(
    all_tools, 
    llm=llm, 
    verbose=True, # Set verbose=True to display the full trace of steps. 
    system_prompt = system_prompt,
    # allow_parallel_tool_calls = True  # Uncomment this line to allow multiple tool invocations
)
agent = AgentRunner(agent_worker)

In [None]:
# Simple chatbot UI. Enter "exit" to quit. 

while True:
    text_input = input("User: ")
    if text_input == "exit":
        break
    response = agent.chat(text_input)

    print("-" * 120)
    print(" " * 120)
    print(f"Agent: {response}")
    print("=" * 120)
    print(" New question: ")
    # what services bedrock platform is offering
    #  what are the LLM models available from bedrock
    # "I don't have many ML experts, but I want to build a GenAI application. Which AWS service should I choose?"


In [None]:
# agent.memory.reset() # clear the chat memory
# agent.memory.get() # retrieve conversation history

In [None]:
# Test questions:
# 1. I don't have many ML experts, but I want to build a GenAI application. Which AWS service should I choose?
# 2. What are the benefits of using the Bedrock service?
# 3. Can you give me the top 5 Git repos related to Bedrock?

## Clean Up
1. Navigate to the Amazon S3 console and delete the S3 bucket and data created for this solution.
2. In the Amazon OpenSearch Service console, go to Serverless, select Collection, and delete the collection that was created for storing the embedding vectors. 
3. From the Amazon Bedrock Knowledge Bases console, select the knowledge base you created for this notebook and delete it. 
4. In the Amazon SageMaker console, select your domain and user profile. Launch SageMaker Studio to stop or delete the JupyterLab instance.


## Conclusion

This notebook shows how to combine an LLM (Mistral Large 2), internet search tools, and knowledge bases to build an intelligent research assistant. The agent can find and summarise technical information from code repositories, papers, news, and your own documents, and it can be extended by adding more data sources and tools.