
# Progress Report 4

The main goal of the project is to implement RAG (Retrieval-Augmented Generation) and Graph RAG techniques to address a current limitation of Large Language Models (LLMs): their answers lose relevance when a question depends on information that was not included in their training data. As part of the project, a chatbot will be developed that lets users access relevant information extracted from a knowledge base.

In this phase of the project, the first components of the data flow have been put in place: **loading the document that serves as the information base**, **integrating a pretrained embedding model**, **building a vectorized knowledge base**, and **implementing baseline models with RAG**.

## Encoder model and similarity search

In [None]:
!pip install -r requirements.txt

In [2]:
#!pip install langchain

In [3]:
import time

In [4]:
import torch
import math

#check for GPU MacOs
print(torch.backends.mps.is_available())
print(torch.backends.mps.is_built())

False
False


In [5]:
print(torch.cuda.is_available())

True


In [6]:
encoder_model_name = 'Alibaba-NLP/gte-large-en-v1.5'

In [None]:
from langchain_community.embeddings import HuggingFaceEmbeddings

# gte-large-en-v1.5 ships custom modeling code, so trust_remote_code must be enabled.
# To pin a device, add e.g. 'device': 'cpu' (or 'mps' on macOS) to model_kwargs.
encoder = HuggingFaceEmbeddings(
    model_name=encoder_model_name,
    model_kwargs={"trust_remote_code": True},
)


In [8]:
embeddings = encoder.embed_query("How are you?")

In [None]:
print(len(embeddings))

## Document loading and text splitting

In [10]:
from langchain_community.document_loaders import PyPDFLoader

loaders = [
    PyPDFLoader("goog023-alphabet-2023-annual-report-web-1.pdf"),
]
pages = []
for loader in loaders:
    pages.extend(loader.load())

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from transformers import AutoTokenizer

# Chunk sizes are measured in tokens of the encoder's own tokenizer
text_splitter = RecursiveCharacterTextSplitter.from_huggingface_tokenizer(
    tokenizer=AutoTokenizer.from_pretrained(encoder_model_name),
    chunk_size=256,
    chunk_overlap=32,
    strip_whitespace=True,
)

docs = text_splitter.split_documents(pages)

In [None]:
print(len(pages))
print(len(docs))
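
To make the effect of `chunk_size=256` and `chunk_overlap=32` more concrete, here is a minimal sketch of a fixed-size sliding window over a list of token ids. It is only an illustration: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, so its chunks will not match this exactly. The helper `sliding_chunks` and the toy token list are hypothetical and not part of the notebook.

```python
def sliding_chunks(tokens, chunk_size=256, chunk_overlap=32):
    """Split a token list into windows that share `chunk_overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


toy_tokens = list(range(600))  # stand-in for real token ids
chunks = sliding_chunks(toy_tokens)
print([len(c) for c in chunks])
```

Each window starts `chunk_size - chunk_overlap` tokens after the previous one, so the last 32 tokens of one chunk reappear at the start of the next. That overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.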

## Vectorized knowledge base

In [13]:
from langchain_community.vectorstores import FAISS
from langchain_community.vectorstores.utils import DistanceStrategy

faiss_db = FAISS.from_documents(
    docs, encoder, distance_strategy=DistanceStrategy.COSINE
)


In [None]:
faiss_db

In [15]:
def similarity_search(question: str, faiss_db, k: int) -> str:
  """Return the text of the k chunks most similar to `question`, joined by newlines."""
  retrieved_docs = faiss_db.similarity_search(question, k=k)
  context = "".join(doc.page_content + "\n" for doc in retrieved_docs)
  return context
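
Conceptually, the retrieval step ranks every stored chunk by how close its embedding is to the query embedding and keeps the best `k`. The sketch below does this by exhaustive search with cosine similarity on toy 2-dimensional vectors. It is not how FAISS is implemented (FAISS uses optimized index structures); `cosine`, `top_k`, and the toy vectors are hypothetical names introduced for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return indices of the k vectors most similar to the query, best first."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


toy_docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0], [-1.0, 0.0]]
print(top_k([1.0, 0.1], toy_docs, k=2))
```

In the notebook, the vectors would be the 256-token chunk embeddings produced by `encoder`, and the returned indices would map back to the chunk texts that `similarity_search` concatenates into the context.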

## LLMs

In [16]:
torch.cuda.empty_cache()

In [17]:
import os
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from dotenv import load_dotenv


In [None]:
# Read the token from the environment instead of hardcoding it in the notebook.
load_dotenv()  # reads a .env file with ACCESS_TOKEN=<your Hugging Face access token>
ACCESS_TOKEN = os.getenv("ACCESS_TOKEN")


model_id_3 = "meta-llama/Meta-Llama-3-8B-Instruct"

quantization_config = BitsAndBytesConfig(load_in_4bit=True,
                                         bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id_3,token=ACCESS_TOKEN)
model = AutoModelForCausalLM.from_pretrained( model_id_3,
                                              quantization_config=quantization_config,
                                              token=ACCESS_TOKEN
                                              )
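
The reason for loading the model in 4-bit can be seen with some rough arithmetic on the weights alone. The sketch below assumes roughly 8 billion parameters (an approximation suggested by the model name) and ignores everything else that occupies GPU memory: quantization constants, any layers kept in higher precision, activations, and the KV cache. `weight_memory_gib` is a hypothetical helper.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


ASSUMED_PARAMS = 8e9  # assumption: roughly 8 billion parameters

for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>5}: {weight_memory_gib(ASSUMED_PARAMS, bits):.1f} GiB")
```

Under these assumptions the 4-bit weights need about a quarter of the memory of bf16 weights, which is what makes an 8B model practical on a single consumer GPU. Actual usage will be higher than this estimate.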



In [19]:
tokenizers=[]
tokenizers.append(tokenizer)

models=[]
models.append(model.eval())

model_names = ['llama_3_4_bits']

device = 'cuda' if torch.cuda.is_available() else 'cpu'

In [20]:
def generate(question: str, context: str, tokenizer, model):
  """Answer `question`, using `context` when one is given, and print the elapsed time."""

  start_time = time.time()  # Start the timer

  # Fall back to a context-free prompt when no retrieved context is supplied
  if context is None or context == "":
      prompt = f"""Give a detailed answer to the following question. Question: {question}"""
  else:
      prompt = f"""Using the information contained in the context, give a detailed answer to the question.
          Context: {context}.
          Question: {question}"""
  chat = [{"role": "user", "content": prompt}]
  # Wrap the prompt in the model's chat template and append the assistant header
  formatted_prompt = tokenizer.apply_chat_template(
      chat,
      tokenize=False,
      add_generation_prompt=True,
  )
  # Only input_ids are passed to the model (no attention mask), which is why
  # transformers logs an attention-mask warning on every call
  inputs = tokenizer.encode(
      formatted_prompt, add_special_tokens=False, return_tensors="pt"
  ).to(device)
  with torch.no_grad():
      outputs = model.generate(
          input_ids=inputs,
          max_new_tokens=250,
          do_sample=False,  # greedy decoding, so runs are reproducible
      )
  response = tokenizer.decode(outputs[0], skip_special_tokens=False)
  response = response[len(formatted_prompt) :]  # remove input prompt from response
  # "<eos>" is not Llama 3's end-of-turn marker, so "<|eot_id|>" can remain in the response
  response = response.replace("<eos>", "")

  end_time = time.time()  # End the timer
  print(f"Running time: {end_time - start_time:.2f} seconds")  # Print the running time

  return response
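
Because `generate` places the whole retrieved context in a single prompt, it is worth checking that the prompt plus the generated answer fits in the model's context window. The following back-of-the-envelope sketch uses the values from this notebook (chunks of at most 256 tokens, `k=10` in the comparison cells, `max_new_tokens=250`). The 8,192-token window for Llama 3 8B and the 100-token allowance for the question, instructions, and chat template are assumptions, and `prompt_token_budget` is a hypothetical helper.

```python
def prompt_token_budget(k, chunk_tokens, max_new_tokens, overhead_tokens=100):
    """Upper-bound estimate of tokens used by one retrieve-then-generate call."""
    return k * chunk_tokens + overhead_tokens + max_new_tokens


CONTEXT_WINDOW = 8192  # assumption: Llama 3 8B context length

used = prompt_token_budget(k=10, chunk_tokens=256, max_new_tokens=250)
print(used, used <= CONTEXT_WINDOW)
```

With these numbers the estimate stays well below the assumed window, so there is room to raise `k` or `max_new_tokens`. A larger `k` also lengthens the prompt, which tends to increase the running time.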

## Model comparison

In [21]:
question = "What are the customers five key capabilities from Google Cloud?"


context = similarity_search(question, faiss_db, k=10)


for i, llm_model in enumerate(models):

  print(model_names[i],'-'*100)
  print('*'*100)
  print(model_names[i],'--Sin contexto:--')
  print(generate(question=question, context='', tokenizer=tokenizers[i], model=llm_model ))
  print('*'*100)
  print(model_names[i],'--Contexto (gte-large-en-v1.5):--')
  print(generate(question=question, context=context, tokenizer=tokenizers[i], model=llm_model ))



The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


llama_3_4_bits ----------------------------------------------------------------------------------------------------
****************************************************************************************************
llama_3_4_bits --Sin contexto:--


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Running time: 23.65 seconds
Google Cloud provides a wide range of capabilities to its customers, but here are five key capabilities that are particularly noteworthy:

**1. Cloud Infrastructure**: Google Cloud offers a robust and scalable cloud infrastructure that enables customers to run their applications and workloads in a secure, reliable, and highly available environment. This includes:

* Compute services: Google Compute Engine, Google Kubernetes Engine (GKE), and Cloud Functions allow customers to run their applications, deploy containers, and execute serverless functions.
* Storage services: Google Cloud Storage, Cloud Datastore, and Cloud Firestore provide scalable and durable storage options for customers' data.
* Networking services: Google Cloud provides a range of networking services, including Virtual Private Cloud (VPC), Cloud Load Balancing, and Cloud CDN, to help customers build and manage their networks.

**2. Artificial Intelligence and Machine Learning**: Google Clou

llama_3_4_bits --Contexto (gte-large-en-v1.5):--
Running time: 31.62 seconds
According to the provided context, the five key capabilities from Google Cloud are:

1. **AI-optimized Infrastructure**: provides open, reliable, and scalable compute, networking, and storage to enable customers to run workloads anywhere — on our Cloud, at the edge, or in their data centers.
2. **Cybersecurity**: helps customers detect, protect, and respond to a broad range of cybersecurity threats, with AI integrated to further strengthen security outcomes, prioritize which threats to investigate, and identify attack paths, as well as accelerate resolution of cybersecurity threats.
3. **Databases and Analytics**: provides a variety of different types of databases — relational, key-value, in-memory — to store and manage data for different types of applications. Our Data Cloud also unifies data lakes, data warehouses, data governance, and advanced machine learning into a single platform that can analyze data across any cloud.
4. **Collaboration Tools**: Google Workspace and Duet AI in Google Workspace provide easy-to-use, secure communication and collaboration tools, including apps like Gmail, Docs, Drive, Calendar, Meet, and more. These tools enable secure and seamless collaboration across devices and platforms.
5. **AI**: provides pre-packaged AI agents that assist developers to write

In [22]:
torch.cuda.empty_cache()

In [23]:
question = "What occurred in the dispute between Epic Games and Google in December 2023?"


context = similarity_search(question, faiss_db, k=10)


for i, llm_model in enumerate(models):

  print(model_names[i],'-'*100)
  print('*'*100)
  print(model_names[i],'--Sin contexto:--')
  print(generate(question=question, context='', tokenizer=tokenizers[i], model=llm_model ))
  print('*'*100)
  print(model_names[i],'--Contexto (gte-large-en-v1.5):--')
  print(generate(question=question, context=context, tokenizer=tokenizers[i], model=llm_model ))



The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


llama_3_4_bits ----------------------------------------------------------------------------------------------------
****************************************************************************************************
llama_3_4_bits --Sin contexto:--


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Running time: 19.07 seconds
I apologize, but since December 2023 has not yet occurred, there was no dispute between Epic Games and Google in that month. However, I can provide information on the ongoing dispute between Epic Games and Google, which has been ongoing since 2020.

The dispute between Epic Games and Google revolves around the revenue split and policies of the Google Play Store, which is the primary platform for Android apps. Epic Games, the developer of the popular game Fortnite, has been critical of the 30% revenue cut that Google takes from app sales and in-app purchases.

In August 2020, Epic Games launched a lawsuit against Google, alleging that the company's policies and practices violate antitrust laws and unfairly favor Google's own apps. The lawsuit claimed that Google's 30% revenue cut is excessive and stifles innovation, while also giving Google an unfair advantage over other app developers.

Since then, the dispute has continued, with Epic Games and Google engagi

llama_3_4_bits --Contexto (gte-large-en-v1.5):--
Running time: 18.59 seconds
According to the provided context, in December 2023, a California jury delivered a verdict in Epic Games v. Google, finding that Google violated antitrust laws related to Google Play's billing practices. The presiding judge will determine remedies in 2024, and the range of potential remedies vary widely. Google plans to appeal the verdict.<|eot_id|>


In [24]:
torch.cuda.empty_cache()

In [25]:
question = "How do Google's financial results for 2023 compare with those of 2022?"


context = similarity_search(question, faiss_db, k=10)


for i, llm_model in enumerate(models):

  print(model_names[i],'-'*100)
  print('*'*100)
  print(model_names[i],'--Sin contexto:--')
  print(generate(question=question, context='', tokenizer=tokenizers[i], model=llm_model ))
  print('*'*100)
  print(model_names[i],'--Contexto (gte-large-en-v1.5):--')
  print(generate(question=question, context=context, tokenizer=tokenizers[i], model=llm_model ))



The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


llama_3_4_bits ----------------------------------------------------------------------------------------------------
****************************************************************************************************
llama_3_4_bits --Sin contexto:--


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Running time: 19.19 seconds
As the largest and most profitable technology company in the world, Google's financial performance is closely watched by investors, analysts, and the general public. Here's a detailed analysis of Google's financial results for 2023 compared to those of 2022:

**Revenue**

Google's revenue for 2023 was $162.4 billion, a 10% increase from $147.9 billion in 2022. This growth is primarily driven by the company's continued dominance in the digital advertising market, as well as the increasing adoption of its cloud computing services.

**Operating Income**

Google's operating income for 2023 was $44.4 billion, a 12% increase from $39.5 billion in 2022. This growth is a result of the company's ability to maintain its operating margins, despite the increasing costs associated with expanding its cloud infrastructure and hiring more employees.

**Net Income**

Google's net income for 2023 was $34.4 billion, a 10% increase from $31.3 billion in 2022. This growth is pri

llama_3_4_bits --Contexto (gte-large-en-v1.5):--
Running time: 29.88 seconds
According to the provided context, Google's financial results for 2023 compared to those of 2022 are as follows:

* Google Services operating income increased by $13.2 billion, primarily driven by an increase in revenues, partially offset by an increase in content acquisition costs and compensation expenses, including an increase in SBC expense. Additionally, operating income benefited from a reduction in costs driven by the change in the estimated useful lives of our servers and certain network equipment.
* General and administrative expenses increased by $701 million, primarily driven by an increase in compensation expenses of $416 million, which was largely the result of $264 million in employee severance charges associated with the reduction in our workforce, in addition to a combination of other factors, none of which were individually significant.
* Google Cloud operating income increased by $3.6 billion, primarily driven by an increase in revenues, partially offset by an increase in compensation expenses largely driven by headcount growth. Additionally, operating income benefited from a reduction in costs driven by the change in the estimated useful lives of our servers and certain network equipment.
* Other Bets operating loss decreased by $541 million, primarily due to growth in revenues as well as a reduction in valuation-based compensation liabilities related to Other

In [26]:
torch.cuda.empty_cache()

In [27]:
question = "What is the number of Google's employees in 2023?"


context = similarity_search(question, faiss_db, k=10)


for i, llm_model in enumerate(models):

  print(model_names[i],'-'*100)
  print('*'*100)
  print(model_names[i],'--Sin contexto:--')
  print(generate(question=question, context='', tokenizer=tokenizers[i], model=llm_model ))
  print('*'*100)
  print(model_names[i],'--Contexto (gte-large-en-v1.5):--')
  print(generate(question=question, context=context, tokenizer=tokenizers[i], model=llm_model ))



The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


llama_3_4_bits ----------------------------------------------------------------------------------------------------
****************************************************************************************************
llama_3_4_bits --Sin contexto:--


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Running time: 19.05 seconds
A question that requires some digging!

As of 2023, Google's parent company, Alphabet Inc., has not publicly disclosed the exact number of employees. However, I can provide some information on the company's workforce and growth trends.

In 2022, Alphabet reported a total of 174,752 employees worldwide in its annual 10-K filing with the Securities and Exchange Commission (SEC). This number includes employees from both Google and its other subsidiaries, such as Waymo, Verily, and Nest.

It's worth noting that this number has been steadily increasing over the years. In 2012, Google had around 54,000 employees, and by 2019, that number had more than tripled to around 163,000 employees.

While we don't have an exact number for 2023, we can make an educated estimate based on the company's historical growth rate. Assuming a similar growth rate, we can estimate that Alphabet's employee count has likely increased by around 5-7% in 2023, which would put the number of 

llama_3_4_bits --Contexto (gte-large-en-v1.5):--
Running time: 10.71 seconds
According to the provided context, Alphabet had 182,502 employees as of December 31, 2023.<|eot_id|>

In [28]:
torch.cuda.empty_cache()

In [29]:
question = "what is the percentage of total revenues that comes from online advertising?"


context = similarity_search(question, faiss_db, k=10)


for i, llm_model in enumerate(models):

  print(model_names[i],'-'*100)
  print('*'*100)
  print(model_names[i],'--Sin contexto:--')
  print(generate(question=question, context='', tokenizer=tokenizers[i], model=llm_model ))
  print('*'*100)
  print(model_names[i],'--Contexto (gte-large-en-v1.5):--')
  print(generate(question=question, context=context, tokenizer=tokenizers[i], model=llm_model ))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


llama_3_4_bits ----------------------------------------------------------------------------------------------------
****************************************************************************************************
llama_3_4_bits --Sin contexto:--


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Running time: 19.06 seconds
The percentage of total revenues that comes from online advertising varies depending on the industry, company, and geographic region. However, I can provide some general insights and statistics to give you an idea of the scope.

**Global Online Advertising Revenue**

According to a report by GroupM, a leading global media investment group, the global online advertising revenue was estimated to be around $135 billion in 2020. This represents a significant growth from $70 billion in 2015, with a compound annual growth rate (CAGR) of around 10%.

**Breakdown by Industry**

Here's a breakdown of online advertising revenue by industry:

1. **Digital Media**: Online advertising revenue from digital media platforms like Google, Facebook, and YouTube accounts for around 40-50% of the total online advertising revenue. This includes search ads, display ads, video ads, and sponsored content.
2. **E-commerce**: Online advertising revenue from e-commerce platforms like A

llama_3_4_bits --Contexto (gte-large-en-v1.5):--
Running time: 14.03 seconds
According to the provided context, in 2023, Google generated more than 75% of its total revenues from online advertising.<|eot_id|>

In [30]:
torch.cuda.empty_cache()

In [33]:
question = "what is the percentage of total revenues that comes from international revenues?"


context = similarity_search(question, faiss_db, k=10)


for i, llm_model in enumerate(models):

  print(model_names[i],'-'*100)
  print('*'*100)
  print(model_names[i],'--Sin contexto:--')
  print(generate(question=question, context='', tokenizer=tokenizers[i], model=llm_model ))
  print('*'*100)
  print(model_names[i],'--Contexto (gte-large-en-v1.5):--')
  print(generate(question=question, context=context, tokenizer=tokenizers[i], model=llm_model ))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


llama_3_4_bits ----------------------------------------------------------------------------------------------------
****************************************************************************************************
llama_3_4_bits --Sin contexto:--


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


Running time: 19.11 seconds
To provide a detailed answer, I'll need more information about the company or industry in question. However, I can provide a general overview of how to calculate the percentage of total revenues that comes from international revenues.

**Methodology:**

To calculate the percentage of total revenues that comes from international revenues, you'll need the following information:

1. Total revenues: The total amount of money earned by the company or industry from all sources, including domestic and international sales.
2. International revenues: The amount of money earned by the company or industry from international sales, including exports, foreign sales, and other international revenue streams.

**Calculation:**

Once you have the total revenues and international revenues, you can calculate the percentage of total revenues that comes from international revenues using the following formula:

Percentage of international revenues = (International revenues ÷ Tota

llama_3_4_bits --Contexto (gte-large-en-v1.5):--
Running time: 12.91 seconds
According to the provided context, international revenues accounted for approximately 53% of our consolidated revenues in 2023.<|eot_id|>

In [34]:
torch.cuda.empty_cache()

In [35]:
question = "what was the most important source of revenue?"


context = similarity_search(question, faiss_db, k=10)


for i, llm_model in enumerate(models):

  print(model_names[i],'-'*100)
  print('*'*100)
  print(model_names[i],'--Contexto (gte-large-en-v1.5):--')
  print(generate(question=question, context=context, tokenizer=tokenizers[i], model=llm_model ))

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


llama_3_4_bits ----------------------------------------------------------------------------------------------------
****************************************************************************************************
llama_3_4_bits --Contexto (gte-large-en-v1.5):--
Running time: 25.60 seconds
Based on the provided context, the most important source of revenue for Alphabet (Google) was Google Services, which includes:

1. Google Advertising: This includes revenues generated from Google Search & other properties, Google Network properties, and YouTube properties.
2. Google Subscriptions, Platforms, and Devices: This includes revenues from consumer subscriptions, such as YouTube TV, YouTube Music, and Premium, as well as Google One.

According to the provided financial information, Google Services accounted for a significant portion of Alphabet's revenues, with a 9% year-over-year increase in 2023. Within Google Services, Google Advertising was the largest contributor, with a 13% increase in operating income and a 27% increase in diluted EPS.

The other sources of revenue, such as Google Cloud and Other Bets, also contributed to Alphabet's overall revenue growth, but Google Services was the primary driver of the company's financial performance.<|eot_id|>
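
The comparison cells above repeat the same retrieve-then-generate pattern for every question. One way to factor it out is a small helper that takes the retrieval and generation steps as callables. This is a sketch only: `compare_answers` and the stub functions are hypothetical and are not part of the notebook.

```python
def compare_answers(question, retrieve, answer, k=10):
    """Answer a question with and without retrieved context."""
    context = retrieve(question, k)
    return {
        "without_context": answer(question, ""),
        "with_context": answer(question, context),
    }


# Stub callables so the sketch runs without a GPU or a vector store.
def fake_retrieve(question, k):
    return f"[{k} chunks about: {question}]"


def fake_answer(question, context):
    return f"grounded in {context}" if context else "ungrounded guess"


result = compare_answers("How many employees?", fake_retrieve, fake_answer, k=3)
print(result["without_context"])
print(result["with_context"])
```

In the notebook, `retrieve` could be `lambda q, k: similarity_search(q, faiss_db, k)` and `answer` could be `lambda q, c: generate(q, c, tokenizer, model)`, which keeps each comparison cell down to a single call.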