# Simple RAG with Gemini and LlamaIndex

A simple experiment that builds a RAG (retrieval-augmented generation) pipeline over a scientific article using LlamaIndex and Gemini.


**Article Reference**

Larsson, D.G.J., Flach, CF. Antibiotic resistance in the environment. Nat Rev Microbiol 20, 257–269 (2022). https://doi.org/10.1038/s41579-021-00649-x

Available at: https://rdcu.be/dUn5n


Link to the Gemini parameter documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/adjust-parameter-values?hl=pt-br
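
The notebook later sets `temperature`, `top_p`, and `top_k` on the Gemini model. As a rough illustration of what these parameters control, the sketch below applies them to a toy next-token distribution in plain Python. The token names, scores, and the `filter_distribution` helper are invented for illustration, and the order of operations (temperature, then top-k, then top-p) is one common convention assumed here, not a description of Gemini's internals.

```python
import math

def filter_distribution(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return the renormalized token probabilities left after applying
    temperature scaling, top-k filtering, and top-p (nucleus) filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    # Temperature: divide the scores before softmax; lower values sharpen the distribution.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_score = max(scaled.values())
    exps = {tok: math.exp(s - max_score) for tok, s in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda item: item[1], reverse=True)
    # Top-k: keep only the k most probable tokens.
    if top_k is not None:
        ranked = ranked[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}

# Hypothetical next-token scores, for illustration only.
toy_logits = {"resistance": 2.0, "bacteria": 1.0, "genes": 0.5, "banana": -1.0}
print(filter_distribution(toy_logits, temperature=0.3, top_k=32, top_p=1.0))
print(filter_distribution(toy_logits, temperature=1.0, top_k=2, top_p=1.0))
```

With `top_k=32` and `top_p=1`, as used in this notebook, neither filter removes anything from a four-token toy vocabulary, so only the low temperature changes the result by concentrating probability on the highest-scoring token.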


In [None]:
# Libraries used

!pip install google-generativeai
!pip install llama-index
!pip install llama-index-llms-gemini
!pip install llama-index-embeddings-huggingface
!pip install python-dotenv

In [1]:
import os
import google.generativeai as genai
from dotenv import load_dotenv

load_dotenv()  # read variables such as GOOGLE_API_KEY from a local .env file

genai.configure(api_key=os.getenv('GOOGLE_API_KEY'))
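
`os.getenv` returns `None` when the variable is missing, so a misconfigured `.env` file may only show up later as an authentication error on the first API call. A minimal sketch of failing early instead is shown below. The `require_env` helper is a hypothetical name introduced here, not part of `python-dotenv` or `google-generativeai`.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, failing early if it is unset or blank."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in the shell"
        )
    return value

# Usage in this notebook would look like:
# genai.configure(api_key=require_env("GOOGLE_API_KEY"))
```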

In [2]:
for m in genai.list_models():
    if "generateContent" in m.supported_generation_methods:
        print(m.name)

models/gemini-1.0-pro-latest
models/gemini-1.0-pro
models/gemini-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-vision-latest
models/gemini-pro-vision
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-001
models/gemini-1.5-pro
models/gemini-1.5-pro-exp-0801
models/gemini-1.5-pro-exp-0827
models/gemini-1.5-flash-latest
models/gemini-1.5-flash-001
models/gemini-1.5-flash-001-tuning
models/gemini-1.5-flash
models/gemini-1.5-flash-exp-0827
models/gemini-1.5-flash-8b-exp-0827
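
The listing mixes plain aliases, pinned versions, and names that appear to be experimental or floating builds. If a reproducible run matters, one option is to filter the names before choosing a model. The sketch below assumes that the `-exp-`, `-latest`, and `-tuning` markers carry that meaning, which is inferred from the output above rather than from an official naming rule; `stable_models` is a hypothetical helper.

```python
def stable_models(names):
    """Drop model names that look experimental, floating, or tuning-only."""
    skip_markers = ("-exp-", "-latest", "-tuning")
    return [n for n in names if not any(marker in n for marker in skip_markers)]

# A subset of the names printed above.
listed = [
    "models/gemini-1.5-pro-latest",
    "models/gemini-1.5-pro-001",
    "models/gemini-1.5-pro",
    "models/gemini-1.5-pro-exp-0827",
    "models/gemini-1.5-flash-001-tuning",
    "models/gemini-1.5-flash",
]
print(stable_models(listed))
```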


In [3]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_files=["pubmed_document.pdf"]
).load_data()

In [4]:
print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]), "\n")
print(documents[0])

<class 'list'> 

13 

<class 'llama_index.core.schema.Document'> 

Doc ID: 0badc5c2-8289-42ad-a29c-41507f703e27
Text: 0123456789();: Many bacterial species evolved the ability to
tolerate  antibiotics long before humans started to mass-produce  them
to prevent and treat infectious diseases1,2. Isolated  caves2,
permafrost cores1, and other environments and  specimens that have
been preserved from anthropo-genic bacterial contamination 3,4 can
provide insights  ...


## Basic RAG Pipeline

In [5]:
from llama_index.core import VectorStoreIndex
from llama_index.llms.gemini import Gemini
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

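from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.node_parser import TokenTextSplitter


# Gemini generates the answers; a local Hugging Face model produces the embeddings.
llm = Gemini(model="models/gemini-1.5-pro", temperature=0.3, top_p=1, top_k=32)
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Sentence-aware splitter used for indexing (registered in Settings below).
text_splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=20)
# Token-based alternative with the same sizes; defined for comparison but not used below.
splitter = TokenTextSplitter(chunk_size=1024, chunk_overlap=20)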


# global settings
Settings.llm = llm
Settings.embed_model = embed_model
Settings.text_splitter = text_splitter

index = VectorStoreIndex.from_documents(documents, show_progress=True)

query_engine = index.as_query_engine()

Parsing nodes:   0%|          | 0/13 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/31 [00:00<?, ?it/s]
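
The progress bars above suggest that the 13 loaded documents were split into 31 nodes before embedding. To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping windows. It counts whitespace-separated words, whereas the splitters used in this notebook work on tokens and `SentenceSplitter` also tries to respect sentence boundaries, so treat `chunk_words` as a hypothetical stand-in rather than the library's algorithm.

```python
def chunk_words(text, chunk_size, chunk_overlap):
    """Split text into windows of at most chunk_size words, where each window
    repeats the last chunk_overlap words of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

sample = ("Many bacterial species evolved the ability to tolerate antibiotics "
          "long before humans started to mass-produce them")
for chunk in chunk_words(sample, chunk_size=8, chunk_overlap=2):
    print(chunk)
```

The overlap means a sentence that straddles a boundary still appears with some of its context in the following chunk, at the cost of embedding a little duplicated text.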

In [6]:
response = query_engine.query(
    "What are the factors that contribute to the development and spread of antibiotic resistance?"
)
print(str(response))

The extensive use of antibiotics in human and veterinary medicine has exerted significant selection pressure on bacterial populations, leading to the emergence and dissemination of antibiotic resistance genes (ARGs). The natural microbial world, with its vast genetic diversity, serves as a reservoir of ARGs. These genes can be transferred between different bacterial species through mobile genetic elements, such as plasmids and integrons. The presence of antibiotics, while not essential for all steps, can accelerate the process by favoring the survival and proliferation of resistant bacteria. 

Factors like high metabolic activity and close contact between bacteria, as seen in biofilms, can further enhance the transfer of ARGs. The ability of bacteria and genes to move between different environments, including those of humans, animals, and the environment, highlights the interconnectedness of resistance development and transmission. 
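
Behind an answer like this one, the query engine first embeds the question, ranks the stored chunks by similarity to it, and passes only the best matches to the LLM. The sketch below shows that ranking step with cosine similarity over made-up three-dimensional vectors. The chunk labels, the vectors, and `top_k=2` are assumptions for illustration, and the notebook's vector store may score chunks differently.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, chunk_vecs, top_k=2):
    """Return the ids of the top_k chunks most similar to the query, best first."""
    scored = sorted(chunk_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [chunk_id for chunk_id, _ in scored[:top_k]]

# Made-up 3-dimensional "embeddings"; a real embedding model produces far more dimensions.
toy_chunks = {
    "selection pressure": [0.9, 0.1, 0.0],
    "gene transfer": [0.7, 0.6, 0.1],
    "wastewater treatment": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.3, 0.0]
print(retrieve(toy_query, toy_chunks))
```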



In [7]:
response_2 = query_engine.query(
    "Explain the problem of pollution."
)
print(str(response_2))

Pollution from antibiotic manufacturing is a major problem, especially in developing countries, where it can lead to the emergence and spread of antibiotic-resistant bacteria. While some pharmaceutical companies have pledged to reduce emissions, there is a lack of transparency and enforcement, making it difficult to assess progress. 

Addressing this issue requires a multifaceted approach. Policymakers need to implement stricter regulations and incentives to encourage sustainable practices within the pharmaceutical industry.  Treating industrial wastewater, along with municipal and animal waste, is crucial to remove antibiotics and other contaminants.  Prioritizing these actions is essential to mitigate the global threat of antibiotic resistance. 
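
A short query such as this one gives the retriever little to match on, so it can help to check which chunks the answer was built from. LlamaIndex responses expose the retrieved chunks as `source_nodes`, each with a `score` and a `node`. The sketch below summarizes them; `summarize_sources` is a hypothetical helper, the `page_label` metadata key is an assumption about what the PDF reader records, and a stand-in object is used so the block runs without an API key.

```python
from types import SimpleNamespace

def summarize_sources(response, preview_chars=80):
    """Return (score, page_label, text preview) for each retrieved chunk behind a response."""
    rows = []
    for hit in response.source_nodes:
        text = hit.node.get_content()
        page = hit.node.metadata.get("page_label", "?")
        rows.append((hit.score, page, text[:preview_chars]))
    return rows

# Stand-in object with placeholder values; in the notebook, pass `response_2` instead.
fake_node = SimpleNamespace(
    get_content=lambda: "Placeholder chunk text used only for this demo.",
    metadata={"page_label": "9"},
)
fake_response = SimpleNamespace(
    source_nodes=[SimpleNamespace(node=fake_node, score=0.71)]
)
for score, page, preview in summarize_sources(fake_response):
    print(f"score={score:.2f} page={page} text={preview!r}")
```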



In [8]:
response_3 = query_engine.query(
    "How does the environment contribute to the problem of antibiotic resistance?"
)
print(str(response_3))

The environment contains a vast and diverse microbiome, which serves as a reservoir of resistance genes that could potentially be transferred to pathogens. While the origin of most antibiotic resistance genes (ARGs) is unknown, the external environment offers an unmatched gene pool compared to humans or domestic animals. This diversity makes it likely that resistance factors already exist in the environment for any new antibiotic developed. 

