In [None]:
%pip install llama-index
%pip install transformers
%pip install llama-index-readers-web
%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-openai
%pip install llama-index-program-openai
%pip install llama-index-agent-openai

## Setup

### Data

In [None]:
from llama_index.readers.web import BeautifulSoupWebReader

url = "https://www.theverge.com/2023/9/29/23895675/ai-bot-social-network-openai-meta-chatbots"

documents = BeautifulSoupWebReader().load_data([url])

### LLM

In [None]:
import os
import openai

os.environ['OPENAI_API_KEY'] = "sk-..."
openai.api_key = os.environ['OPENAI_API_KEY']

In [None]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4", temperature=0)

In [None]:
from llama_index.core import Settings

Settings.llm = llm
Settings.embed_model = "local:BAAI/bge-small-en-v1.5"


### Index Setup

In [None]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex.from_documents(documents)

In [None]:
from llama_index.core.indices import SummaryIndex

summary_index = SummaryIndex.from_documents(documents)

### Helpful Imports / Logging

In [None]:
from llama_index.core.response.notebook_utils import display_response

In [None]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

## Basic Query Engine

### Compact (default)

In [None]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("How do OpenAI and Meta differ on AI tools?")

display_response(response)

**`Final Response:`** OpenAI and Meta have different approaches to AI tools. OpenAI tends to present its products as productivity tools, focusing on utilities for getting things done. For instance, they have updated their ChatGPT tool to include voice interaction and image questioning capabilities. On the other hand, Meta is more focused on the entertainment aspect of AI. They have developed 28 personality-driven chatbots for their messaging apps, with celebrities lending their voices to the effort. These chatbots are designed to provide a more engaging and personalized experience for users.
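
As a rough mental model (not LlamaIndex's actual implementation), `compact` mode packs as many retrieved chunks as fit into each prompt, so the LLM is called fewer times than once per chunk. The sketch below assumes a character budget `max_chars` and a helper named `pack_chunks`, both invented for illustration; a real implementation would count tokens rather than characters.

```python
from typing import List


def pack_chunks(chunks: List[str], max_chars: int) -> List[str]:
    """Greedily merge consecutive chunks so each merged prompt fits max_chars.

    Hypothetical helper: real implementations budget tokens, not characters.
    A single chunk longer than max_chars is kept on its own.
    """
    packed: List[str] = []
    current = ""
    for chunk in chunks:
        candidate = chunk if not current else current + "\n\n" + chunk
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # The next chunk would overflow the budget: close this prompt.
            packed.append(current)
            current = chunk
    if current:
        packed.append(current)
    return packed


chunks = ["a" * 40, "b" * 40, "c" * 40]
prompts = pack_chunks(chunks, max_chars=100)
print(len(chunks), "chunks ->", len(prompts), "LLM calls")
```

Each entry in `prompts` would then be sent to the LLM, with later prompts refining the answer from earlier ones.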

### Refine

In [None]:
query_engine = vector_index.as_query_engine(response_mode="refine")

response = query_engine.query("How do OpenAI and Meta differ on AI tools?")

display_response(response)

**`Final Response:`** OpenAI and Meta have distinct strategies when it comes to AI tools. OpenAI typically markets its products as productivity enhancers, with a concentration on functionalities that aid in task completion. An example of this is ChatGPT, a comprehensive language model that can engage with users vocally and respond to inquiries about uploaded visuals. Conversely, Meta prioritizes the entertainment factor. They are also creating extensive language models, but their approach includes the introduction of 28 character-based chatbots designed for use in their messaging applications.
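
Conceptually, `refine` mode makes one LLM call per retrieved chunk and feeds the previous answer back in so it can be revised. A minimal sketch of that loop, assuming a stub `fake_llm` in place of a real model (`fake_llm` and `refine_answer` are hypothetical names, and the prompt wording is made up):

```python
from typing import Callable, List, Optional


def refine_answer(
    query: str, chunks: List[str], llm: Callable[[str], str]
) -> Optional[str]:
    """Walk the chunks in order, asking the LLM to revise the running answer."""
    answer: Optional[str] = None
    for chunk in chunks:
        if answer is None:
            # First chunk: answer the question from scratch.
            prompt = f"Context: {chunk}\nQuestion: {query}"
        else:
            # Later chunks: show the existing answer and ask for a revision.
            prompt = (
                f"Existing answer: {answer}\nNew context: {chunk}\n"
                f"Refine the answer to: {query}"
            )
        answer = llm(prompt)
    return answer


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Stub model: records the prompt and returns a numbered draft."""
    calls.append(prompt)
    return f"draft {len(calls)}"


final = refine_answer("How do they differ?", ["chunk A", "chunk B", "chunk C"], fake_llm)
print(final, "after", len(calls), "LLM calls")
```

The number of calls grows linearly with the number of chunks, which is the main cost trade-off against `compact`.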

### Tree Summarize

In [None]:
query_engine = vector_index.as_query_engine(response_mode="tree_summarize")

response = query_engine.query("How do OpenAI and Meta differ on AI tools?")

display_response(response)

**`Final Response:`** OpenAI and Meta have different approaches to AI tools. OpenAI tends to present its products as productivity tools, focusing on utilities for getting things done. For instance, they have developed ChatGPT, a large language model that can interact with users via voice and answer questions about uploaded images. This tool is designed to be more powerful, patient, empathetic, and available, potentially serving as a synthetic companion for users.

On the other hand, Meta is more focused on the entertainment aspect of AI. They have developed 28 personality-driven chatbots for their messaging apps, with celebrities lending their voices to the effort. These chatbots are designed to interact with users in a more personalized and engaging way. Meta also plans to integrate these AI characters across all major surfaces of its products, creating a partially synthetic social network.
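
`tree_summarize` can be pictured as a bottom-up reduction: groups of chunks are summarized, then the summaries are summarized, until a single response remains. The sketch below is an illustration under assumptions, not the library's code; `fan_in` and the stub `fake_llm` are hypothetical.

```python
from typing import Callable, List


def tree_summarize(
    texts: List[str], llm: Callable[[List[str]], str], fan_in: int = 2
) -> str:
    """Summarize groups of `fan_in` texts, then repeat on the summaries."""
    if not texts:
        raise ValueError("need at least one text to summarize")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    while True:
        # One LLM call per group at this level of the tree.
        texts = [llm(texts[i : i + fan_in]) for i in range(0, len(texts), fan_in)]
        if len(texts) == 1:
            return texts[0]


def fake_llm(group: List[str]) -> str:
    """Stub model: 'summarizes' a group by wrapping it in parentheses."""
    return "(" + "+".join(group) + ")"


print(tree_summarize(["A", "B", "C", "D"], fake_llm))
```

The nesting of the parentheses in the printed result mirrors the shape of the summary tree.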

## Router Query Engine

In [None]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    vector_index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts."
    )
)

summary_tool = QueryEngineTool(
    summary_index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document."
    )
)

### Single Selector

In [None]:
from llama_index.core.query_engine import RouterQueryEngine

query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    select_multi=False
)

response = query_engine.query("What was mentioned about Meta?")

display_response(response)

**`Final Response:`** Meta is building large language models (LLMs) and has unveiled 28 personality-driven chatbots for use in its messaging apps. These chatbots feature the voices of celebrities such as Charli D’Amelio, Dwyane Wade, Kendall Jenner, MrBeast, Snoop Dogg, Tom Brady, and Paris Hilton. The company is also planning to place its AI characters on every major surface of its products, including Facebook pages and Instagram accounts. This is part of a shift towards a partially synthetic social network.

### Multi Selector

In [None]:
from llama_index.core.query_engine import RouterQueryEngine

query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    select_multi=True,
)

response = query_engine.query("What was mentioned about Meta? Summarize with any other companies mentioned in the entire document.")

display_response(response)

**`Final Response:`** Meta, previously known as Facebook, is developing large language models (LLMs) and has unveiled 28 personality-driven chatbots for use in its messaging apps. These chatbots feature the voices of various celebrities, including Charli D’Amelio, Dwyane Wade, Kendall Jenner, MrBeast, Snoop Dogg, Tom Brady, and Paris Hilton. Meta's approach is seen as part of the entertainment business, contrasting with OpenAI, which presents its products as productivity tools. OpenAI has announced updates for ChatGPT, a tool that now allows interaction via voice and image uploads. The voice feature is currently being rolled out to ChatGPT Plus subscribers. Both companies' developments are seen as steps towards a synthetic social network.
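
The difference between the two selectors is how many tools may be chosen: one with `select_multi=False`, several with `select_multi=True`. The router makes that choice from the tool descriptions. The toy selector below swaps in simple word overlap where a real router would consult an LLM, so treat `select_tools` and its scoring as hypothetical stand-ins.

```python
from typing import Dict, List, Set

# Tool names and descriptions mirror the ToolMetadata defined above.
TOOLS: Dict[str, str] = {
    "vector_search": "Useful for searching for specific facts.",
    "summary": "Useful for summarizing an entire document.",
}


def _words(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of bare words."""
    return {w.strip(".,?!").lower() for w in text.split()}


def select_tools(query: str, select_multi: bool) -> List[str]:
    """Toy selector: rank tools by word overlap between query and description.

    Assumption: a real router asks an LLM to choose; overlap is a stand-in.
    """
    q = _words(query)
    scores = {name: len(q & _words(desc)) for name, desc in TOOLS.items()}
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    if select_multi:
        # Keep every tool with some overlap, falling back to the top one.
        return [name for name in ranked if scores[name] > 0] or ranked[:1]
    return ranked[:1]


print(select_tools("Find specific facts about Meta", select_multi=False))
print(select_tools("Summarizing specific facts in the entire document", select_multi=True))
```

When several tools are selected, their individual responses still have to be combined into one final answer, which is why the multi-selector response above blends fact lookup with a document-wide summary.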

## SubQuestion Query Engine

In [None]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    vector_index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts."
    )
)

summary_tool = QueryEngineTool(
    summary_index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document."
    )
)

In [None]:
import nest_asyncio
nest_asyncio.apply()

In [None]:
from llama_index.core.query_engine import SubQuestionQueryEngine

query_engine = SubQuestionQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    verbose=True,
)

response = query_engine.query("What was mentioned about Meta? How does it differ from how OpenAI is talked about?")

display_response(response)

Generated 4 sub questions.
[vector_search] Q: What specific facts were mentioned about Meta?
[vector_search] Q: What specific facts were mentioned about OpenAI?
[summary] Q: Can you summarize the entire document about Meta?
[summary] Q: Can you summarize the entire document about OpenAI?
[vector_search] A: OpenAI announced updates for ChatGPT, which include a feature that allows interaction with its large language model via voice and another that lets users upload images and ask questions about them. This has made ChatGPT more powerful as a mobile app, allowing users to chat with it while on the move or ask it questions about images they've taken. The addition of a voice to ChatGPT has given it a hint of personality, with its five native voices being described as livelier and more dynamic than those of Alexa or the Google assistant. The voice feature is current

**`Final Response:`** Meta is working on large language models and has disclosed its applications for generative AI and voices. It has launched 28 chatbots with distinct personalities for its messaging apps, featuring the voices of various celebrities. Meta plans to integrate these AI characters across its product range, including Facebook pages and Instagram accounts. It is also introducing AI-generated imagery in the form of new stickers for its messaging apps.

On the other hand, OpenAI has announced updates for its large language model, ChatGPT. These updates include a voice interaction feature and an image upload feature for asking questions about the images. The voice feature, which is currently available to ChatGPT Plus subscribers, gives ChatGPT a hint of personality, with five native voices described as more lively and dynamic than typical AI assistants. OpenAI's updates have made ChatGPT more versatile, especially as a mobile app. The company envisions its products as productivity tools with potential for emotional connection and support.
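
The verbose output above shows the overall flow: the question is split into (tool, sub-question) pairs, each pair is answered by the named tool, and the answers are merged into one response. A minimal sketch of that flow, using stub tools and a stub synthesizer that are purely hypothetical:

```python
from typing import Callable, Dict, List, Tuple

SubQuestion = Tuple[str, str]  # (tool name, question text)


def answer_with_sub_questions(
    sub_questions: List[SubQuestion],
    tools: Dict[str, Callable[[str], str]],
    synthesize: Callable[[List[str]], str],
) -> str:
    """Run every sub-question against its tool, then merge the answers."""
    answers = [tools[tool_name](question) for tool_name, question in sub_questions]
    return synthesize(answers)


# Sub-questions copied from the verbose output above; in the real engine an
# LLM generates them. The tools and synthesizer below are hypothetical stubs.
sub_questions = [
    ("vector_search", "What specific facts were mentioned about Meta?"),
    ("vector_search", "What specific facts were mentioned about OpenAI?"),
    ("summary", "Can you summarize the entire document about Meta?"),
    ("summary", "Can you summarize the entire document about OpenAI?"),
]
tools = {
    "vector_search": lambda q: f"[facts] {q}",
    "summary": lambda q: f"[summary] {q}",
}
result = answer_with_sub_questions(sub_questions, tools, synthesize=" | ".join)
print(result)
```

In the real engine the final synthesis step is itself an LLM call over the collected sub-answers rather than a string join.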

## SQL Query Engine

Here, we download a sample SQLite database (Chinook) with 11 tables covering music, playlists, and customers. We restrict the query engine to a few of those tables for this test.

In [None]:
!curl https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip -o /content/chinook.zip
!unzip /content/chinook.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  298k  100  298k    0     0  2150k      0 --:--:-- --:--:-- --:--:-- 2162k
Archive:  /content/chinook.zip
  inflating: chinook.db              


In [None]:
from sqlalchemy import create_engine

engine = create_engine("sqlite:////content/chinook.db")

In [None]:
from llama_index.core import SQLDatabase

sql_database = SQLDatabase(engine)

In [None]:
from llama_index.core.indices.struct_store import NLSQLTableQueryEngine

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["albums", "tracks", "artists"]
)

In [None]:
response = query_engine.query("What are some albums?")

display_response(response)

**`Final Response:`** Here are some albums: 
1. For Those About To Rock We Salute You
2. Balls to the Wall
3. Restless and Wild
4. Let There Be Rock
5. Big Ones
6. Jagged Little Pill
7. Facelift
8. Warner 25 Anos
9. Plays Metallica By Four Cellos
10. Audioslave

In [None]:
response = query_engine.query("What are some artists? Limit it to 5.")

display_response(response)

**`Final Response:`** Some artists include AC/DC, Accept, Aerosmith, Alanis Morissette, and Alice In Chains.

This last query requires a more complex join across all three tables.

In [None]:
response = query_engine.query("What are some tracks from the artist AC/DC? Limit it to 3")

display_response(response)

**`Final Response:`** Some tracks from the artist AC/DC are "For Those About To Rock (We Salute You)", "Put The Finger On You", and "Let's Get It Up".

In [None]:
print(response.metadata['sql_query'])

SELECT tracks.Name FROM tracks 
JOIN albums ON tracks.AlbumId = albums.AlbumId 
JOIN artists ON albums.ArtistId = artists.ArtistId 
WHERE artists.Name = 'AC/DC' 
LIMIT 3;
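
To see what the generated join does without downloading the database, the sketch below rebuilds a tiny in-memory SQLite version of the three tables. The schema is trimmed to the columns the query touches, and the rows and IDs are a hand-entered illustrative subset, not the real Chinook data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER);
    INSERT INTO artists VALUES (1, 'AC/DC'), (2, 'Accept');
    INSERT INTO albums VALUES
        (1, 'For Those About To Rock We Salute You', 1),
        (2, 'Balls to the Wall', 2);
    INSERT INTO tracks VALUES
        (1, 'For Those About To Rock (We Salute You)', 1),
        (2, 'Balls to the Wall', 2),
        (3, 'Put The Finger On You', 1),
        (4, 'Let''s Get It Up', 1);
    """
)

# Same query shape as the generated SQL, with the artist passed as a bound
# parameter and an explicit ORDER BY so the LIMIT is deterministic.
rows = conn.execute(
    """
    SELECT tracks.Name FROM tracks
    JOIN albums ON tracks.AlbumId = albums.AlbumId
    JOIN artists ON albums.ArtistId = artists.ArtistId
    WHERE artists.Name = ?
    ORDER BY tracks.TrackId
    LIMIT 3
    """,
    ("AC/DC",),
).fetchall()
track_names = [name for (name,) in rows]
print(track_names)
conn.close()
```

The two joins walk from `tracks` to `albums` to `artists`, so the track belonging to the other artist is filtered out by the `WHERE` clause.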


## Programs

Depending on the LLM, you will have to test with either `OpenAIPydanticProgram` or `LLMTextCompletionProgram`.

In [None]:
from typing import List
from pydantic import BaseModel

from llama_index.core.program import LLMTextCompletionProgram
from llama_index.program.openai import OpenAIPydanticProgram

class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]

In [None]:
prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Using the movie {movie_name} as inspiration.\
"""
# For non-OpenAI LLMs, swap in: program = LLMTextCompletionProgram.from_defaults(
program = OpenAIPydanticProgram.from_defaults(
    output_cls=Album,
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)

In [None]:
output = program(movie_name="The Shining")

Function call: Album with args: {
  "name": "Echoes of The Overlook",
  "artist": "Jack's Torment",
  "songs": [
    {
      "title": "Redrum Reverie",
      "length_seconds": 300
    },
    {
      "title": "Maze in the Snow",
      "length_seconds": 240
    },
    {
      "title": "Room 237",
      "length_seconds": 180
    },
    {
      "title": "All Work and No Play",
      "length_seconds": 360
    },
    {
      "title": "The Twins' Lullaby",
      "length_seconds": 210
    },
    {
      "title": "Blood Elevator Blues",
      "length_seconds": 420
    },
    {
      "title": "Wendy's Escape",
      "length_seconds": 330
    },
    {
      "title": "Frozen in Fear",
      "length_seconds": 270
    }
  ]
}
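
The function-call arguments printed above are plain JSON that has to be turned into typed objects. Pydantic does that validation for you; the sketch below is a standard-library stand-in using dataclasses, with a hypothetical `parse_album` helper, just to make the mapping explicit.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Song:
    title: str
    length_seconds: int


@dataclass
class Album:
    name: str
    artist: str
    songs: List[Song]


def parse_album(raw_json: str) -> Album:
    """Turn function-call arguments into typed objects, rejecting bad lengths."""
    data = json.loads(raw_json)
    songs = []
    for item in data["songs"]:
        if not isinstance(item["length_seconds"], int):
            raise TypeError("length_seconds must be an integer")
        songs.append(Song(title=str(item["title"]), length_seconds=item["length_seconds"]))
    return Album(name=str(data["name"]), artist=str(data["artist"]), songs=songs)


# First two songs from the function call printed above.
raw = """
{"name": "Echoes of The Overlook", "artist": "Jack's Torment",
 "songs": [{"title": "Redrum Reverie", "length_seconds": 300},
           {"title": "Maze in the Snow", "length_seconds": 240}]}
"""
album = parse_album(raw)
total = sum(song.length_seconds for song in album.songs)
print(album.name, "-", total, "seconds")
```

Once the output is a typed object, downstream code can rely on attribute access such as `album.songs` instead of re-parsing text.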


## Data Agent

As with programs, the agent class depends on the LLM: OpenAI LLMs use `OpenAIAgent`, while other LLMs use `ReActAgent`.
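
Whichever agent class is used, the LLM picks a tool by name and supplies its arguments, and the agent has to route that request to the right callable. A minimal sketch of that dispatch step, assuming stub tools and hard-coded calls (`TOOLS`, `run_tool_calls`, and the call format are hypothetical, not the library's internals):

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the query-engine tools; each maps text to text.
TOOLS: Dict[str, Callable[[str], str]] = {
    "vector_search": lambda text: f"facts about {text}",
    "summary": lambda text: f"summary of {text}",
}


def run_tool_calls(tool_calls: List[dict]) -> List[str]:
    """Dispatch each requested call to the matching tool and collect outputs."""
    outputs = []
    for call in tool_calls:
        name = call["name"]
        if name not in TOOLS:
            raise KeyError(f"unknown tool: {name}")
        outputs.append(TOOLS[name](call["args"]["input"]))
    return outputs


# In a real agent the LLM decides these calls; here they are hard-coded.
planned_calls = [
    {"name": "vector_search", "args": {"input": "Meta"}},
    {"name": "vector_search", "args": {"input": "OpenAI"}},
]
observations = run_tool_calls(planned_calls)
print(observations)
```

A real agent feeds each tool output back to the LLM, which then either requests another call or writes the final reply.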

In [None]:
from llama_index.core.agent import ReActAgent
from llama_index.agent.openai import OpenAIAgent

# agent = ReActAgent.from_tools(
agent = OpenAIAgent.from_tools(
    [vector_tool, summary_tool],
    llm=llm,
    verbose=True
)

In [None]:
response = agent.chat("Hello!")
print(response)

Hi there! How can I assist you today?


In [None]:
response = agent.chat("What was mentioned about Meta? How does it differ from how OpenAI is talked about?")
print(response)

=== Calling Function ===
Calling function: vector_search with args: {
  "input": "Meta"
}
Got output: Meta is a company that is in the entertainment business and is also building large language models (LLMs). They have unveiled 28 personality-driven chatbots to be used in their messaging apps. Celebrities like Charli D’Amelio, Dwyane Wade, Kendall Jenner, MrBeast, Snoop Dogg, Tom Brady, and Paris Hilton have lent their voices to this effort. Each of these characters comes with a unique description. Meta plans to place its AI characters on every major surface of its products, including Facebook pages and Instagram accounts.
=== Calling Function ===
Calling function: vector_search with args: {
  "input": "OpenAI"
}
Got output: OpenAI is an organization that develops artificial intelligence technologies. They recently announced updates for ChatGPT, a large language model. The updates include a feature that allows users to interact with the model via voice and another that lets users uploa