<a href="https://colab.research.google.com/github/frank-morales2020/MLxDL/blob/main/Rag_Fusion_Langchain_Llamaindex_PostgreSQL_CLAUDE3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG Fusion Query Pipeline

This notebook shows how to implement RAG Fusion using the LlamaIndex Query Pipeline syntax.

In [1]:
!nvidia-smi

Sun Jun  2 18:11:18 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   34C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## Required Dependencies

In [None]:
#added by Frank Morales(FM) 11/01/2024
%pip install openai  --root-user-action=ignore -q
!pip install llama_index phoenix pyvis network -q
!pip install llama_hub -q
%pip install colab-env --upgrade --quiet --root-user-action=ignore
#!pip install typing_extensions

!pip install langchain --quiet
!pip install accelerate --quiet
!pip install transformers --quiet
!pip install bitsandbytes --quiet

### llama index components
!pip install llama-index-llms-langchain -q
%pip install llama-index-llms-fireworks -q

## Setup / Load Data

We download Paul Graham's essay and save it as `pg_essay.txt`.

In [None]:
import colab_env
import openai
import os
openai.api_key = os.getenv("OPENAI_API_KEY")
!wget "https://www.dropbox.com/s/f6bmb19xdg0xedm/paul_graham_essay.txt?dl=1" -O pg_essay.txt

Load the essay into LlamaIndex documents.

In [4]:
from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader(input_files=["/content/pg_essay.txt"])
docs = reader.load_data()

# POSTGRESQL


https://www.atlantic.net/dedicated-server-hosting/how-to-install-and-configure-postgres-14-on-ubuntu/

In [None]:
#ADDED By FM 01/06/2024
!apt-get update -y
!apt-get install postgresql-14 -y

!service postgresql restart
!sudo apt install -y postgresql-server-dev-all

#apt-get -y install postgresql

In [None]:
print()
# PostGRES SQL Settings
%cd /content/
!sudo -u postgres psql -c "ALTER USER postgres PASSWORD 'postgres'"

print('START: PG embedding COMPILATION')
%cd /content/
!git clone https://github.com/neondatabase/pg_embedding.git
%cd /content/pg_embedding
!make
!make install # may need sudo
print('END: PG embedding COMPILATION')
print()

#!sudo -u postgres psql -c "DROP EXTENSION embedding"
!sudo -u postgres psql -c "CREATE EXTENSION embedding"
#!sudo -u postgres psql -c "DROP TABLE documents"
!sudo -u postgres psql -c "CREATE TABLE documents(id integer PRIMARY KEY, embedding real[])"

# Langchain

In [None]:
!pip install -U langchain-community -q

In [None]:
#ADDED By FM 11/01/2024

from typing import List, Tuple
from langchain.docstore.document import Document
from langchain.document_loaders import TextLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import PGEmbedding

loader = TextLoader("/content/pg_essay.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs0 = text_splitter.split_documents(documents)

collection_name0 = "pg_essay"
print(f'# of Document Pages {len(documents)}')
print(f'# of Document Chunks: {len(docs0)}')
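
To make the chunking step above concrete, here is a minimal sketch of separator-based splitting in plain Python. It is a simplified illustration, not LangChain's implementation: the helper name `split_by_separator` and the greedy packing rule are assumptions made for this example, and overlap is ignored because the notebook uses `chunk_overlap=0`.

```python
from typing import List


def split_by_separator(
    text: str, chunk_size: int = 1000, separator: str = "\n\n"
) -> List[str]:
    """Greedily pack separator-delimited pieces into chunks of at most
    `chunk_size` characters (hypothetical helper, for illustration only)."""
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # The piece fits, or it is a single oversized piece kept whole.
            current = candidate
        else:
            # Adding the piece would overflow: close the chunk, start a new one.
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaaa\n\nbbbb\n\ncccc"
sample_chunks = split_by_separator(sample, chunk_size=10)
print(sample_chunks)
```

With `chunk_size=10`, the first two paragraphs fit together (10 characters including the separator) and the third starts a new chunk.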

# Llama Index

## Setup Llama Pack



In [9]:
import llama_index.core
from llama_index.core.query_pipeline import QueryPipeline

print(f"LLAMA INDEX VERSION: {llama_index.core.__version__}")

LLAMA INDEX VERSION: 0.10.42


In [10]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

In [11]:
from llama_index.core.query_pipeline import QueryPipeline
from llama_index.core import PromptTemplate

# try chaining basic prompts
prompt_str = "Please generate related movies to {movie_name}"
prompt_tmpl = PromptTemplate(prompt_str)
llm = OpenAI(model="gpt-3.5-turbo")

p = QueryPipeline(chain=[prompt_tmpl, llm], verbose=True)

In [12]:
output = p.run(movie_name="The Departed")

> Running module 90801dcb-d1cb-4309-a57d-e5c1999f21a7 with input: 
movie_name: The Departed

> Running module af6788a8-b2cd-4c18-b430-28d89486a248 with input: 
messages: Please generate related movies to The Departed

In [13]:
print(str(output))

assistant: 1. Infernal Affairs (2002) - The original Hong Kong film that inspired The Departed
2. The Town (2010) - A crime thriller directed by Ben Affleck
3. Mystic River (2003) - A crime drama directed by Clint Eastwood
4. Goodfellas (1990) - A classic crime film directed by Martin Scorsese
5. The Irishman (2019) - Another crime drama directed by Martin Scorsese, starring Robert De Niro and Al Pacino
6. The Godfather (1972) - A classic crime film directed by Francis Ford Coppola
7. Heat (1995) - A crime thriller directed by Michael Mann, starring Al Pacino and Robert De Niro
8. The Departed (2006) - A crime thriller directed by Martin Scorsese, starring Leonardo DiCaprio, Matt Damon, and Jack Nicholson.
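
Conceptually, the chained pipeline above fills the prompt template with the input and passes the resulting string to the model. The sketch below mimics that flow in plain Python. It is not LlamaIndex's implementation: `make_chain` and `echo_llm` are hypothetical names, and the "model" is a stub that echoes its prompt so the example runs without an API key.

```python
from typing import Callable


def make_chain(template: str, llm_fn: Callable[[str], str]) -> Callable[..., str]:
    """Return a function that formats `template` and feeds it to `llm_fn`."""

    def run(**kwargs: str) -> str:
        prompt = template.format(**kwargs)  # step 1: fill in the template
        return llm_fn(prompt)  # step 2: call the (stub) model

    return run


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only echoes the prompt."""
    return f"assistant: (reply to) {prompt}"


chain = make_chain("Please generate related movies to {movie_name}", echo_llm)
result = chain(movie_name="The Departed")
print(result)
```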


In [14]:
# Option 1: Use `download_llama_pack`
# from llama_index.llama_pack import download_llama_pack

# RAGFusionPipelinePack = download_llama_pack(
#     "RAGFusionPipelinePack",
#     "./rag_fusion_pipeline_pack",
#     # leave the below line commented out if using the notebook on main
#     # llama_hub_url="https://raw.githubusercontent.com/run-llama/llama-hub/jerry/add_query_pipeline_pack/llama_hub"
# )

# Option 2: Import from llama_hub package
#from llama_hub.llama_packs.query.rag_fusion_pipeline.base import RAGFusionPipelinePack
#from llama_index.llms import OpenAI

# EMBEDDING with OPENAI and Langchain

In [None]:
# 20x faster than pgvector: introducing pg_embedding extension for vector search in Postgres and LangChain
# https://neon.tech/blog/pg-embedding-extension-for-vector-search

#ADDED By FM 11/01/2024

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import PGEmbedding

# https://supabase.com/blog/fewer-dimensions-are-better-pgvector
embeddings = OpenAIEmbeddings(model='text-embedding-ada-002')

collection_name='Paul Graham Essay'
connection_string = os.getenv("DATABASE_URL")

db = PGEmbedding.from_documents(
    embedding=embeddings,
    documents=docs0,
    collection_name=collection_name,
    connection_string=connection_string,
)

#db.create_hnsw_index(dims = 1536, m = 8, ef_construction = 16, ef_search = 16)

In [16]:
#ADDED By FM 11/01/2024
query='What did the author do growing up?'
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)

print()
print(query)
print()

for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)


What did the author do growing up?

--------------------------------------------------------------------------------
Score:  0.59925514
What I Worked On

February 2021

Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.

The first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright f
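
The search above returns each chunk together with a score. As a rough mental model, a vector store embeds the query, compares it with every stored embedding, and returns the closest matches. The sketch below does this in plain Python with Euclidean distance, assuming that a smaller score means a closer match; the metric the store actually uses depends on its configuration, and the names `euclidean` and `top_k` are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(
    query_vec: List[float], store: Dict[str, List[float]], k: int = 4
) -> List[Tuple[str, float]]:
    """Return the k stored items closest to `query_vec`, nearest first."""
    scored = [(doc_id, euclidean(query_vec, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = better match
    return scored[:k]


# Toy 2-dimensional "embeddings"; real ones have many more dimensions.
toy_store = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 0.0]}
nearest = top_k([0.0, 0.0], toy_store, k=2)
print(nearest)
```

A brute-force scan like this is linear in the number of stored vectors; index structures such as HNSW exist to avoid that cost at scale.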

# RAG FUSION PIPELINE

In [17]:
"""RAG Fusion Pipeline."""

from typing import Any, Dict, List, Optional

from llama_index.core import Document, ServiceContext, VectorStoreIndex
from llama_index.core.llama_pack.base import BaseLlamaPack
from llama_index.core.llms.llm import LLM
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.query_pipeline.components.argpacks import ArgPackComponent
from llama_index.core.query_pipeline.components.function import FnComponent
from llama_index.core.query_pipeline.components.input import InputComponent
from llama_index.core.query_pipeline.query import QueryPipeline
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.schema import NodeWithScore
from llama_index.llms.openai import OpenAI

DEFAULT_CHUNK_SIZES = [128, 256, 512, 1024]


def reciprocal_rank_fusion(
    results: List[NodeWithScore],
) -> List[NodeWithScore]:
    """Apply reciprocal rank fusion.

    `results` is the flat list of nodes produced by the "join" component, so
    `rank` below is a node's 1-based position in that combined list rather
    than its rank within a single retriever.

    The original paper uses k=60 for best results:
    https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf
    """
    k = 60.0  # `k` is a parameter used to control the impact of outlier rankings.
    fused_scores: Dict[str, float] = {}
    text_to_node: Dict[str, NodeWithScore] = {}

    # The pack's original nested loop,
    #   for rank, node_with_score in enumerate(
    #       sorted(nodes_with_scores, key=lambda x: x.score or 0.0, reverse=True)
    #   ):
    # raised AttributeError: 'tuple' object has no attribute 'score',
    # because each item of `results` is already a NodeWithScore, not a list.

    # compute reciprocal rank scores by Frank Morales 09/05/2024
    for rank, node_with_score in enumerate(results, start=1):
        if not isinstance(node_with_score, NodeWithScore):
            raise TypeError("node_with_score must be a NodeWithScore object.")
        text = node_with_score.node.get_content()
        text_to_node[text] = node_with_score
        if text not in fused_scores:
            fused_scores[text] = 0.0
        fused_scores[text] += 1.0 / (rank + k)

    # sort results
    reranked_results = dict(
        sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    )

    # adjust node scores
    reranked_nodes: List[NodeWithScore] = []
    for text, score in reranked_results.items():
        reranked_nodes.append(text_to_node[text])
        reranked_nodes[-1].score = score

    return reranked_nodes


class RAGFusionPipelinePack(BaseLlamaPack):
    """RAG Fusion pipeline.

    Create a bunch of vector indexes of different chunk sizes.

    """

    def __init__(
        self,
        documents: List[Document],
        llm: Optional[LLM] = None,
        chunk_sizes: Optional[List[int]] = None,
    ) -> None:
        """Init params."""
        self.documents = documents
        self.chunk_sizes = chunk_sizes or DEFAULT_CHUNK_SIZES

        # construct index
        self.llm = llm or OpenAI(model="gpt-3.5-turbo")

        self.query_engines = []
        self.retrievers = {}
        for chunk_size in self.chunk_sizes:
            splitter = SentenceSplitter(chunk_size=chunk_size, chunk_overlap=0)
            nodes = splitter.get_nodes_from_documents(documents)

            service_context = ServiceContext.from_defaults(llm=self.llm)
            vector_index = VectorStoreIndex(nodes, service_context=service_context)
            self.query_engines.append(vector_index.as_query_engine())

            self.retrievers[str(chunk_size)] = vector_index.as_retriever()

        # define rerank component
        rerank_component = FnComponent(fn=reciprocal_rank_fusion)

        # construct query pipeline
        p = QueryPipeline()
        module_dict = {
            **self.retrievers,
            "input": InputComponent(),
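            # NOTE: no llm is passed here, so the summarizer falls back to the
            # globally configured default LLM rather than self.llm
            "summarizer": TreeSummarize(),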
            # NOTE: Join args
            "join": ArgPackComponent(),
            "reranker": rerank_component,
        }
        p.add_modules(module_dict)
        # add links from input to retriever (id'ed by chunk_size)
        for chunk_size in self.chunk_sizes:
            p.add_link("input", str(chunk_size))
            p.add_link(str(chunk_size), "join", dest_key=str(chunk_size))
        p.add_link("join", "reranker")
        p.add_link("input", "summarizer", dest_key="query_str")
        p.add_link("reranker", "summarizer", dest_key="nodes")

        self.query_pipeline = p

    def get_modules(self) -> Dict[str, Any]:
        """Get modules."""
        return {
            "llm": self.llm,
            "retrievers": self.retrievers,
            "query_engines": self.query_engines,
            "query_pipeline": self.query_pipeline,
        }

    def run(self, *args: Any, **kwargs: Any) -> Any:
        """Run the pipeline."""
        return self.query_pipeline.run(*args, **kwargs)
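
For reference, reciprocal rank fusion as described in the linked paper scores each document by summing `1 / (k + rank)` over every ranked list in which it appears, where `rank` is the document's 1-based position within that list. The sketch below shows this per-list form in plain Python. It is an illustration only, not the pack's code: it works on document ids instead of `NodeWithScore` objects, and the function name `rrf` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rrf(ranked_lists: List[List[str]], k: float = 60.0) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of document ids with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        # rank is 1-based and restarts for every list (one list per retriever)
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Two retrievers that rank the same three documents differently.
fused = rrf([["a", "b", "c"], ["b", "c", "a"]])
for doc_id, score in fused:
    print(doc_id, round(score, 6))
```

Here "b" comes out on top because it sits near the head of both lists, while "a" is first in one list but last in the other.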


https://docs.llamaindex.ai/en/stable/examples/llm/fireworks/

In [None]:
pack = RAGFusionPipelinePack(docs, llm)
query0="What did the author do growing up?"
response0 = pack.run(query=query0)

In [19]:
print(response0)

The author, growing up, worked on writing short stories and programming. They wrote simple games, a program to predict rocket heights, and a word processor on a TRS-80 computer. Additionally, they took philosophy courses in college before switching to studying AI, inspired by works like Heinlein's "The Moon is a Harsh Mistress" and a PBS documentary featuring SHRDLU.


# CLAUDE3 - MODEL - WITH API

In [None]:
!pip install langchain-anthropic -q
!pip install anthropic -q

In [None]:
import anthropic
import os
import colab_env
import json


anthropic_api_key = os.environ["CLAUDE3_API_KEY"]

from langchain_anthropic import AnthropicLLM
llm = AnthropicLLM(anthropic_api_key=anthropic_api_key,model='claude-2.1')

prompt= 'How do you plan out your trip? \
Bob is travelling to SAT from YVR \
1. He has a connection in DFW \
2. His connection is 6 hours long \
3. He has a budget of 100.00 including meals \
4. What can he do? Please suggest a time. \
5. Know- he is a hiker, museum, foodie, has a carry-on bag'

# send the prompt to the model; the next cell prints the reply
response = llm.invoke(prompt)

In [22]:
print(response)


1. Bob is traveling from Vancouver International Airport (YVR) to San Antonio International Airport (SAT) with a connection in Dallas/Fort Worth International Airport (DFW). 

2. His layover at DFW is 6 hours long.

3. His total budget for the layover, including meals, is $100. 

4. Since he has 6 hours and $100, here are some suggestions for what Bob can do during his layover:

- Go to one of the museums at DF at the airport, such as the Founders Plaza exhibit or Crystal Lily display (free)

- Take the DART train to Dallas and visit the Sixth Floor Museum which is about JFK ($16 admission). Get lunch downtown. Total time 4 hours. 

- Book an airport lounge with shower facilities to relax, eat meals and nap ($50+ entrance fee depending on lounge).  

- Book a shared ride to a nearby mall like the Galleria Dallas to shop, see a movie and eat. Total time 4 hours. Estimate $30 roundtrip with rideshare.

- Rent a car for the 6 hours and drive to attractions in Dallas or Fort Worth. Could 

In [None]:
from langchain_anthropic import ChatAnthropic
chat = ChatAnthropic(temperature=0, api_key=anthropic_api_key, model_name="claude-3-opus-20240229")
# invoke() returns a message object; .content holds the reply text
response = chat.invoke('capital city of canada').content

In [24]:
print(response)

The capital city of Canada is Ottawa, located in the province of Ontario. Ottawa is situated on the banks of the Ottawa River, which forms the border between Ontario and Quebec. Some key facts about Ottawa:

1. Ottawa was chosen as the capital of the Province of Canada by Queen Victoria in 1857 and later became the national capital in 1867 when Canada gained independence.

2. The city is home to Parliament Hill, which houses the Parliament of Canada, the Senate, and the House of Commons.

3. Ottawa is known for its historic architecture, numerous museums, and cultural attractions, such as the National Gallery of Canada, the Canadian Museum of History, and the Rideau Canal, which is a UNESCO World Heritage Site.

4. The city has a population of approximately 1 million people in its metropolitan area, making it the fourth-largest city in Canada.

5. Ottawa is a bilingual city, with a significant proportion of its population speaking both English and French, Canada's two official language

In [25]:
#from langchain_anthropic import ChatAnthropic
llm_claude3 = ChatAnthropic(anthropic_api_key=anthropic_api_key,model="claude-3-sonnet-20240229", temperature=0.8, max_tokens=1024)

In [None]:
pack_claude3 = RAGFusionPipelinePack(docs, llm_claude3)
query0="What did the author do growing up?"
response0 = pack_claude3.run(query=query0)

In [27]:
print(response0)

The author, growing up, worked on writing short stories and programming. They wrote simple games, a program to predict rocket heights, and a word processor on a TRS-80 computer. Additionally, they took philosophy courses in college before switching to studying AI due to their interest sparked by a novel and a PBS documentary.


# CLAUDE3 and the FILMS

In [None]:
# versions offered by pip: 0.0.1, 0.1.0, 0.1.1, 0.1.2, 0.1.3, 0.1.4, 0.1.5, 0.1.6, 0.1.7, 0.1.8, 0.1.9, 0.1.10, 0.1.11

%pip install llama-index-llms-anthropic==0.1.4 -q

In [29]:
from llama_index.core.query_pipeline import QueryPipeline
from llama_index.core import PromptTemplate

# try chaining basic prompts
prompt_str = "Please generate related movies to {movie_name}"
prompt_tmpl = PromptTemplate(prompt_str)
#llm = OpenAI(model="gpt-3.5-turbo")

from llama_index.llms.anthropic import Anthropic
llm = Anthropic(api_key=anthropic_api_key,model="claude-3-opus-20240229")

p = QueryPipeline(chain=[prompt_tmpl, llm], verbose=True)

In [30]:
output = p.run(movie_name="The Departed")

> Running module 8a86dd81-608f-4ec3-86df-79dbecc38670 with input: 
movie_name: The Departed

> Running module 5d4b9d5b-4c8e-419f-a5b1-8904d01562be with input: 
messages: Please generate related movies to The Departed

In [31]:
print(str(output))

assistant: Here are some movies related to The Departed:

1. Goodfellas (1990) - Another Martin Scorsese crime drama featuring similar themes and style.

2. Infernal Affairs (2002) - The Hong Kong film that inspired The Departed.

3. Heat (1995) - A crime thriller about a detective chasing a professional thief, featuring intense action and complex characters.

4. Donnie Brasco (1997) - Based on a true story, an undercover FBI agent infiltrates the mob.

5. Reservoir Dogs (1992) - Quentin Tarantino's debut film about a group of thieves and their aftermath of a heist gone wrong.

6. Carlito's Way (1993) - A crime drama about an ex-convict trying to go straight but getting pulled back into the criminal world.

7. The Town (2010) - A thriller about a group of bank robbers and the FBI agent trying to catch them.

8. American Gangster (2007) - Based on a true story, a detective tries to bring down a powerful drug kingpin.

9. The Usual Suspects (1995) - A neo-noir crime thriller with a compl

# Examples of Queries

In [32]:
#modify By FM 11/01/2024

#response = pack.run(query="What did the author do growing up?")
query0="What did the author do growing up?"
query='I bought an ice cream for 6 kids. Each cone was $1.25 and I paid with a $10 bill. How many dollars did I get back? Explain first before answering.'
query1 = "Who is the President of the USA?"
query2 = "Who is the best poet of CANADA?"
#query2 = "Who won the baseball World Series in 2023? and Who Lost"
query3 = 'Anything about FORTRAN'
query4 = 'Anything about LIPS'
query5 = 'Anything about Python'


# run the selected queries through the Claude 3 pack, then print each pair
queries = [query0, query1, query2, query4]
responses = [pack_claude3.run(query=q) for q in queries]

for q, r in zip(queries, responses):
    print()
    print(q)
    print(str(r))
    print()


What did the author do growing up?
The author, growing up, worked on writing short stories and programming. They wrote simple games, a program to predict rocket heights, and a word processor on a TRS-80 computer. Additionally, they took philosophy courses in college before switching to AI due to their interest sparked by a novel and a PBS documentary featuring intelligent computers.


Who is the President of the USA?
I cannot provide the current President of the USA as it is not mentioned in the provided context information.


Who is the best poet of CANADA?
I cannot provide an answer to the query as the information provided does not mention or relate to any specific poet from Canada.


Anything about LIPS
LISP, a programming language, is discussed in the provided context. The language is noted for its interesting nature beyond just its association with AI. The essay mentions the author's decision to focus on Lisp, leading to the creation of a book called "On Lisp." Additionally, the 