Tutorial at: https://wow.groq.com/retrieval-augmented-generation-with-groq-api/

Semantic Search
Make sure you are running Python 3.10.
In this walkthrough we will see how to use Pinecone for semantic search. To begin, we must install the prerequisite libraries:

In [2]:
!pip install -qU \
  pinecone-client==3.1.0 \
  pinecone-datasets==0.7.0 \
  sentence-transformers==2.2.2

Data Download
In this notebook we skip the data preparation steps, which can be very time-consuming, and jump straight in with the prebuilt dataset from Pinecone Datasets. If you'd rather see how it's all done, please refer to this notebook.

Let's go ahead and download the dataset.

In [8]:
from pinecone_datasets import load_dataset

dataset = load_dataset('quora_all-MiniLM-L6-bm25')
# drop the existing metadata column; the blob column is used as metadata instead
dataset.documents.drop(['metadata'], axis=1, inplace=True)
dataset.documents.rename(columns={'blob': 'metadata'}, inplace=True)
# we will use 80K rows of the dataset between rows 240K -> 320K
dataset.documents.drop(dataset.documents.index[320_000:], inplace=True)
dataset.documents.drop(dataset.documents.index[:240_000], inplace=True)
dataset.head()

Unnamed: 0,id,values,sparse_values,metadata
240000,515997,"[-0.00531694, 0.06937869, -0.0092854, 0.003286...","{'indices': [845, 1657, 13677, 20780, 27058, 2...","{'text': ' Why is a ""law of sciences"" importan..."
240001,515998,"[-0.09243751, 0.065432355, -0.06946959, 0.0669...","{'indices': [2110, 6324, 9754, 13677, 15207, 2...",{'text': ' Is it possible to format a BitLocke...
240002,515999,"[-0.021924071, 0.032280188, -0.020190848, 0.07...","{'indices': [2110, 4949, 23579, 23758, 27058, ...",{'text': ' Can formatting a hard drive stress ...
240003,516000,"[-0.120020054, 0.024080949, 0.10693012, -0.018...","{'indices': [22014, 24734, 24773, 25791, 25991...",{'text': ' Are the new Samsung Galaxy J7 and J...
240004,516001,"[-0.095293395, -0.048446465, -0.017618902, -0....","{'indices': [307, 2110, 5785, 12969, 12971, 13...",{'text': ' I just watched an add for Indonesia...


In [10]:
print(len(dataset))

80000
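
The two `drop` calls above remove the tail of the table before the head, so both positional slices still refer to the original row positions. A minimal sketch of the same idea, using a plain Python list as a stand-in for the DataFrame (the `keep_window` helper is hypothetical and not part of pinecone-datasets):

```python
def keep_window(rows, start, stop):
    """Keep rows[start:stop] by deleting the tail first, then the head."""
    rows = list(rows)   # work on a copy
    del rows[stop:]     # drop everything from `stop` onward
    del rows[:start]    # positions below `stop` did not shift, so `start` is still valid
    return rows

window = keep_window(range(10), start=6, stop=8)
print(window)  # [6, 7]
```

If the head were dropped first, the second slice would have to be shifted by the number of rows already removed.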


In [7]:
import os

use_serverless = os.environ.get("USE_SERVERLESS", "True").lower() == "true"

True
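
The flag above is true only when the variable is unset (the default is "True") or holds some casing of "true". A small sketch of that parsing rule, wrapped in a hypothetical `env_flag` helper so the behavior can be checked in isolation:

```python
import os

def env_flag(name, default="True"):
    """Return True when the environment variable equals 'true' (case-insensitive)."""
    return os.environ.get(name, default).lower() == "true"

os.environ["USE_SERVERLESS"] = "FALSE"
print(env_flag("USE_SERVERLESS"))  # False

del os.environ["USE_SERVERLESS"]
print(env_flag("USE_SERVERLESS"))  # True (falls back to the default)
```

Note that values such as "1" or "yes" count as false under this rule.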

Creating an Index
Now that the data is ready, we can set up our index to store it.

We begin by initializing our connection to Pinecone. To do this we need a free API key.

In [3]:
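import os

from pinecone import Pinecone
from pinecone import ServerlessSpec, PodSpec

# configure client; the API key is read from the PINECONE_API_KEY environment
# variable instead of being hardcoded in the notebook
pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])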
# use serverless
spec = ServerlessSpec(cloud='aws', region='us-west-2')
index_name = 'semantic-search-fast'

In [5]:
import time

existing_indexes = [
    index_info["name"] for index_info in pc.list_indexes()
]

# check if the index already exists (it shouldn't if this is the first run)
if index_name not in existing_indexes:
    # if it does not exist, create the index
    pc.create_index(
        index_name,
        dimension=384,  # dimensionality of minilm
        metric='dotproduct',
        spec=spec
    )
    # wait for index to be initialized
    while not pc.describe_index(index_name).status['ready']:
        time.sleep(1)

# connect to index
index = pc.Index(index_name)
time.sleep(1)
# view index stats
index.describe_index_stats()

{'dimension': 384,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

In [11]:
from tqdm.auto import tqdm

for batch in tqdm(dataset.iter_documents(batch_size=500), total=160):
    index.upsert(batch)
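
The `total=160` passed to `tqdm` follows from the sizes used here: 80,000 rows in batches of 500. A minimal sketch of that arithmetic, with a hypothetical `iter_batches` generator standing in for `dataset.iter_documents` (whose internals are not shown in this notebook):

```python
import math

def iter_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

n_rows, batch_size = 80_000, 500
n_batches = math.ceil(n_rows / batch_size)
print(n_batches)  # 160
```

Computing the total this way keeps the progress bar correct if the row window or batch size changes.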


In [12]:
from sentence_transformers import SentenceTransformer
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2', device=device)
model

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  (2): Normalize()
)
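
The model ends with a `Normalize()` module, so its embeddings have unit length, and for unit-length vectors the dot product equals cosine similarity. That is likely why the `dotproduct` metric behaves like a cosine score for this index. A small pure-Python sketch of the identity (the vectors are made-up illustrations, not model output):

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity of two vectors of any length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = [3.0, 4.0], [4.0, 3.0]
print(cosine(a, b))                     # 0.96
print(dot(normalize(a), normalize(b)))  # same value up to floating-point rounding
```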

In [18]:
query = "which city has the highest population in the world?"

# create the query vector
xq = model.encode(query).tolist()

# now query
xc = index.query(vector=xq, top_k=5, include_metadata=True)
xc

{'matches': [{'id': '69331',
              'metadata': {'text': " What's the world's largest city?"},
              'score': 0.785910666,
              'values': []},
             {'id': '69332',
              'metadata': {'text': ' What is the biggest city?'},
              'score': 0.727316499,
              'values': []},
             {'id': '84749',
              'metadata': {'text': " What are the world's most advanced "
                                   'cities?'},
              'score': 0.710067093,
              'values': []},
             {'id': '109231',
              'metadata': {'text': ' Where is the most beautiful city in the '
                                   'world?'},
              'score': 0.696097672,
              'values': []},
             {'id': '109230',
              'metadata': {'text': ' What is the greatest, most beautiful city '
                                   'in the world?'},
              'score': 0.65822351,
              'values': []}],
 'namespa

In [19]:
for result in xc['matches']:
    print(f"{round(result['score'], 2)}: {result['metadata']['text']}")

0.79:  What's the world's largest city?
0.73:  What is the biggest city?
0.71:  What are the world's most advanced cities?
0.7:  Where is the most beautiful city in the world?
0.66:  What is the greatest, most beautiful city in the world?
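
In the output above, the fourth score prints as `0.7` because `round` returns a float and trailing zeros are not shown. A sketch of an alternative, assuming fixed two-decimal output is wanted; `format_match` is a hypothetical helper and the two matches are copied from the results above:

```python
matches = [
    {"score": 0.785910666, "metadata": {"text": " What's the world's largest city?"}},
    {"score": 0.696097672, "metadata": {"text": " Where is the most beautiful city in the world?"}},
]

def format_match(match):
    """Format one match with a fixed two-decimal score and trimmed text."""
    return f"{match['score']:.2f}: {match['metadata']['text'].strip()}"

for m in matches:
    print(format_match(m))
# 0.79: What's the world's largest city?
# 0.70: Where is the most beautiful city in the world?
```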


In [1]:
import os

from pinecone import Pinecone

# read the API key from the PINECONE_API_KEY environment variable rather than hardcoding it
pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])
pinecone_index = pc.Index('semantic-search-fast')

In [2]:
from transformers import AutoModel

embedding_model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-en', trust_remote_code=True)
user_query = "user query"
query_embeddings = embedding_model.encode(user_query).tolist()
query_embeddings

[-0.19169047474861145,
 -0.33060839772224426,
 0.48784947395324707,
 -0.19269895553588867,
 0.10216296464204788,
 0.10024604946374893,
 0.5414406061172485,
 -0.17543010413646698,
 0.8870824575424194,
 1.3601000308990479,
 -0.7892287969589233,
 -0.2601028084754944,
 -0.5770358443260193,
 0.18329253792762756,
 -0.3339479863643646,
 0.20853963494300842,
 0.47661566734313965,
 -0.0634189024567604,
 0.06421688944101334,
 -0.47315430641174316,
 -0.35998308658599854,
 -0.8422293663024902,
 -1.033406376838684,
 0.04525456205010414,
 0.29470688104629517,
 1.126645565032959,
 0.5214318037033081,
 -0.2046876698732376,
 0.2245684564113617,
 0.8224840760231018,
 0.0012009590864181519,
 -0.06141866743564606,
 -0.3988493084907532,
 0.8727911710739136,
 -0.3059637248516083,
 -0.7053397297859192,
 -0.008409630507230759,
 -0.15613269805908203,
 0.35889989137649536,
 0.9658057689666748,
 -0.17554570734500885,
 0.08128426969051361,
 -0.36518222093582153,
 0.678146243095398,
 -0.3760606348514557,
 -0.15936

In [10]:
from sentence_transformers import SentenceTransformer
import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2', device=device)

query = "which city has the highest population in the world?"
# create the query vector
xq = model.encode(query).tolist()
# now query
result = pinecone_index.query(vector=xq, top_k=5, include_values=False, include_metadata=True)
result

{'matches': [{'id': '69331',
              'metadata': {'text': " What's the world's largest city?"},
              'score': 0.785910666,
              'values': []},
             {'id': '69332',
              'metadata': {'text': ' What is the biggest city?'},
              'score': 0.727316499,
              'values': []},
             {'id': '84749',
              'metadata': {'text': " What are the world's most advanced "
                                   'cities?'},
              'score': 0.710067093,
              'values': []},
             {'id': '109231',
              'metadata': {'text': ' Where is the most beautiful city in the '
                                   'world?'},
              'score': 0.696097672,
              'values': []},
             {'id': '109230',
              'metadata': {'text': ' What is the greatest, most beautiful city '
                                   'in the world?'},
              'score': 0.65822351,
              'values': []}],
 'namespa

In [16]:
matched_info = ' '.join(item['metadata']['text'] for item in result['matches'])
#sources = [item['metadata']['source'] for item in result['matches']]
context = f"Information: {matched_info}"
sys_prompt = f"""
Instructions:
- Be helpful and answer questions concisely. If you don't know the answer, say 'I don't know'
- Utilize the context provided for accurate and specific information.
- Incorporate your preexisting knowledge to enhance the depth and relevance of your response.
- Cite your sources
Context: {context}
"""
sys_prompt


"\nInstructions:\n- Be helpful and answer questions concisely. If you don't know the answer, say 'I don't know'\n- Utilize the context provided for accurate and specific information.\n- Incorporate your preexisting knowledge to enhance the depth and relevance of your response.\n- Cite your sources\nContext: Information:  What's the world's largest city?  What is the biggest city?  What are the world's most advanced cities?  Where is the most beautiful city in the world?  What is the greatest, most beautiful city in the world?\n"

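The prompt asks the model to cite its sources, but the retrieved texts are joined into one unlabelled string. One possible variation, offered as an assumption rather than part of the original tutorial, is to number each retrieved passage so the model has something to point at. `build_context` is a hypothetical helper; the match dictionaries mirror the shape of the query results shown earlier:

```python
def build_context(matches):
    """Join retrieved texts into a numbered context block."""
    lines = [
        f"[{i}] {m['metadata']['text'].strip()}"
        for i, m in enumerate(matches, start=1)
    ]
    return "Information:\n" + "\n".join(lines)

matches = [
    {"metadata": {"text": " What's the world's largest city?"}},
    {"metadata": {"text": " What is the biggest city?"}},
]
print(build_context(matches))
# Information:
# [1] What's the world's largest city?
# [2] What is the biggest city?
```
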
In [19]:
import os

from groq import Groq

# read the API key from the GROQ_API_KEY environment variable rather than hardcoding it
client = Groq(api_key=os.environ['GROQ_API_KEY'])

completion = client.chat.completions.create(
    model="llama2-70b-4096",
    messages=[
        {
            "role": "user",
            "content": "which city has the highest population in the world? {sys_prompt}".format(sys_prompt=sys_prompt)
        }
    ],
    temperature=1,
    max_tokens=1024,
    top_p=1,
    stream=True,
    stop=None,
)

for chunk in completion:
    print(chunk.choices[0].delta.content or "", end="")


The city with the highest population in the world is Tokyo, Japan. As of 2023, the estimated population of Tokyo is over 13.8 million people. However, it's important to note that the metropolitan area of Tokyo has a population of over 38 million people, making it the largest metropolitan area in the world.

Source:

* "Tokyo Population 2023" (World Population Review)
* "Tokyo Metropolitan Area Population" (World Urbanization Prospects, United Nations)

It's worth noting that the definition of "city" can vary depending on the source and methodology used. Some sources may define a city based on its administrative boundaries, while others may use a wider definition that includes the surrounding metropolitan area. Therefore, the answer to the question of what the world's largest city is can depend on the specific definition being used.
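
The streaming loop above prints each chunk as it arrives and uses `or ""` because `delta.content` can be `None`. If the full answer is needed afterwards, the pieces can be collected instead. A minimal sketch, using `SimpleNamespace` objects as stand-ins for streamed chunks so it runs without an API key (`collect_stream` and `fake_chunk` are hypothetical helpers):

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Concatenate delta.content from streamed chunks, treating None as empty."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)

def fake_chunk(text):
    """Build a stand-in object with the same attribute path as a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

stream = [fake_chunk("Tokyo"), fake_chunk(", Japan"), fake_chunk(None)]
print(collect_stream(stream))  # Tokyo, Japan
```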

In [32]:
import os

from sentence_transformers import SentenceTransformer
import torch
from groq import Groq

# read the API key from the GROQ_API_KEY environment variable rather than hardcoding it
client = Groq(api_key=os.environ['GROQ_API_KEY'])
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2', device=device)


query = "Who is the richest person in the world?"
# create the query vector
xq = model.encode(query).tolist()
# now query
result = pinecone_index.query(vector=xq, top_k=5, include_values=False, include_metadata=True)

print(query)
print("||||||||||||||||||||||||||||||||||||")

for r in result['matches']:
    print(f"{round(r['score'], 2)}: {r['metadata']['text']}")

matched_info = ' '.join(item['metadata']['text'] for item in result['matches'])
#sources = [item['metadata']['source'] for item in result['matches']]
context = f"Information: {matched_info}"
sys_prompt = f"""
Instructions:
- Be helpful and answer questions concisely. If you don't know the answer, say 'I don't know'
- Utilize the context provided for accurate and specific information.
- Incorporate your preexisting knowledge to enhance the depth and relevance of your response.
- Cite your sources
Context: {context}
"""

print("""
      WITH RAG ---------------------------------------------
        """)

completion = client.chat.completions.create(
    model="mixtral-8x7b-32768",
    messages=[
        {
            "role": "user",
            "content": "{user_query} {sys_prompt}".format(user_query=query, sys_prompt=sys_prompt)
        }
    ],
    temperature=1,
    max_tokens=1024,
    top_p=1,
    stream=True,
    stop=None,
)

for chunk in completion:
    print(chunk.choices[0].delta.content or "", end="")


print("""
      
      WITHOUT RAG ---------------------------------------------
        """)

completion = client.chat.completions.create(
    model="mixtral-8x7b-32768",
    messages=[
        {
            "role": "user",
            "content": "{user_query} {sys_prompt}".format(user_query=query, sys_prompt='')
        }
    ],
    temperature=1,
    max_tokens=1024,
    top_p=1,
    stream=True,
    stop=None,
)

for chunk in completion:
    print(chunk.choices[0].delta.content or "", end="")

Who is the richest person in the world?
||||||||||||||||||||||||||||||||||||
1.0:  Who is the richest person in the world?
0.98:  Who is currently the richest person in the world?
0.95:  Who are the richest people in the world?
0.83:  Who is the richest person in asia?
0.8:  Who is the richest disabled person in the world?

      WITH RAG ---------------------------------------------
        
As of my knowledge up to October 2021, the richest person in the world is Elon Musk, CEO of SpaceX and Tesla. He surpassed the previous titleholder, Jeff Bezos, CEO of Amazon. However, these rankings can change frequently. I'd recommend checking a reliable source for the most recent information.

To answer your other questions:

- The richest person in Asia is Mukesh Ambani, Chairman of Reliance Industries.
- I don't have real-time information on the richest disabled person in the world. It's important to note that a person's disability does not define their financial status or success. If you're i