# 0. Installing libraries

In [None]:
# !pip install streamlit
# !pip install pyngrok==4.1.1
# ngrok account sign-up: https://dashboard.ngrok.com/signup
# !pip install --upgrade typing_extensions
# !pip install openai
# !pip install pypdf
# !pip install langchain

# 1. Create the vector store

## 1.1. Loading the PDF document

In [1]:
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader("llm_doc.pdf")
documents = loader.load()

## 1.2. Generating chunks

In [16]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Split the pages loaded in 1.1 into chunks of at most 1000 characters,
# with up to 100 characters shared between consecutive chunks.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000,
                                               chunk_overlap=100)
doc_splits = text_splitter.split_documents(documents)

In [17]:
doc_splits[2].page_content

'1.\nIntroduction to LLMOps\nGenerative AI models have gained wide popularity in recent times with the adoption of  \ntransformer-based neural network architectures. Generative model’s ability to generate new data \nenables them to go beyond traditional prediction and classification use cases. These models \nare now used across domains and use cases like chatbots, question answering, fraud detection, \nprotein folding and many more.\nGenerative AI models for natural language use cases are powered by Large Language Models (LLMs). \nLLMs are transformer-based Deep Learning architectures that harness vast amounts of textual \ndata to develop language and domain understanding. The models are built with an emphasis on \ngenerating human-like responses and reasoning. Their ability to understand human languages allows \nthem to serve as powerful tools for information retrieval, natural language processing, language \ntranslation and even creative writing.'

In [18]:
doc_splits[3].page_content

'translation and even creative writing.\nUsing large language models in production environments poses a certain unique set of challenges \nsuch as organizing LLMs into agents for sub-tasks, developing robust instructions for each  \nLLM agent, evaluating the correctness of generated response and efficiencies with fine-tuning.  \nHence, effective usage in a production environment requires appropriate infrastructure and practices \nfocusing on experimentation, deployment, management and monitoring of large language models. \nLarge Language Model Operations (LLMOps) is a framework of tools and best practices  \nto manage the lifecycle of LLM-powered applications, from development to deployment  \nand maintenance.\nThe aim is to enable AI capabilities with LLMs by developing better prompts, longer context, \nfaster inference and customized techniques that enable rapid experimentation and innovation \nwith LLMs. Together, these allow data scientists and engineers to collaborate effectively 

In [19]:
doc_splits[3].metadata

{'source': 'llm_doc.pdf', 'page': 2}
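
The two chunks printed above share the sentence fragment "translation and even creative writing.", which is the effect of `chunk_overlap`. The following is a minimal sketch of that idea using plain fixed-size windows. It is a simplified stand-in, not the algorithm `RecursiveCharacterTextSplitter` uses (which also tries to break on separators such as newlines), and `split_with_overlap` is a hypothetical helper name.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=100):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

With these toy settings, each chunk starts with the last three characters of the previous one, so a sentence cut at a boundary is still present in full in at least one neighbouring chunk when it is shorter than the overlap.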

## 1.3. Create the vector store as a DataFrame

In [20]:
import pandas as pd
data = [{'Chunks': doc.page_content, 'Metadata': doc.metadata} for doc in doc_splits]
df_vector_store = pd.DataFrame(data)
df_vector_store.head()

| | Chunks | Metadata |
|---|---|---|
| 0 | Building Pipelines and Environments for \nLar... | {'source': 'llm_doc.pdf', 'page': 0} |
| 1 | Contents\nIntroduction to LLMOps 1\nWhy LLMOps... | {'source': 'llm_doc.pdf', 'page': 1} |
| 2 | 1.\nIntroduction to LLMOps\nGenerative AI mode... | {'source': 'llm_doc.pdf', 'page': 2} |
| 3 | translation and even creative writing.\nUsing ... | {'source': 'llm_doc.pdf', 'page': 2} |
| 4 | with LLMs. Together, these allow data scientis... | {'source': 'llm_doc.pdf', 'page': 2} |


In [22]:
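from openai import OpenAI
import numpy as np
from google.colab import userdata

# The API key is read from Colab secrets under the name 'openai_key'.
client = OpenAI(api_key=userdata.get('openai_key'))

def text_embedding(text):
    """Return the embedding of `text` (a string or a one-element list) as a list of floats."""
    embeddings = client.embeddings.create(model="text-embedding-ada-002",
                                          input=text,
                                          encoding_format="float")
    # Only the first embedding is returned, so pass one text per call.
    return embeddings.data[0].embedding

# One API call per chunk; each embedding is then stored as a NumPy array.
df_vector_store["Embedding"] = df_vector_store["Chunks"].apply(lambda x: text_embedding([x]))
df_vector_store["Embedding"] = df_vector_store["Embedding"].apply(np.array)

# Persist the vector store so it can be reloaded without re-embedding.
df_vector_store.to_pickle('df_vector_store.pkl')
df_vector_store.head()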

| | Chunks | Metadata | Embedding |
|---|---|---|---|
| 0 | Building Pipelines and Environments for \nLar... | {'source': 'llm_doc.pdf', 'page': 0} | [-0.0004000346, -0.012532205, 0.017549334, -0.... |
| 1 | Contents\nIntroduction to LLMOps 1\nWhy LLMOps... | {'source': 'llm_doc.pdf', 'page': 1} | [0.0031110481, -0.0044141663, -0.0021332903, -... |
| 2 | 1.\nIntroduction to LLMOps\nGenerative AI mode... | {'source': 'llm_doc.pdf', 'page': 2} | [-0.020828877, -0.008388099, -0.013470776, -0.... |
| 3 | translation and even creative writing.\nUsing ... | {'source': 'llm_doc.pdf', 'page': 2} | [-0.006481511, -0.005626922, -0.0033462602, -0... |
| 4 | with LLMs. Together, these allow data scientis... | {'source': 'llm_doc.pdf', 'page': 2} | [0.0069767754, -0.0102603715, 0.0035248334, -0... |
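
The embedding cell above sends one request per chunk. The embeddings endpoint also accepts a list of inputs and returns one item per input, so the chunks could be grouped into batches instead. The sketch below shows only the batching logic. `batched`, `embed_in_batches`, and `fake_embed_batch` are hypothetical names, and the fake embedder stands in for a wrapper around `client.embeddings.create` that would return `[item.embedding for item in response.data]`, under the assumption that results are read back in input order (each returned item carries an `index` that can be used to check this).

```python
def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def embed_in_batches(texts, embed_batch, batch_size=64):
    """Embed texts batch by batch, keeping one vector per input text, in order."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors

# Hypothetical stand-in for a real embedding call, so the sketch runs offline:
# each text is mapped to a one-dimensional vector holding its length.
def fake_embed_batch(batch):
    return [[float(len(text))] for text in batch]

vectors = embed_in_batches(["a", "bb", "ccc", "dddd", "eeeee"],
                           fake_embed_batch, batch_size=2)
print(vectors)
```

The batch size of 64 is an arbitrary default for illustration, not a documented limit.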


# 2. Formulating the question

In [23]:
query = '¿Cómo se selecciona un modelo llm?'  # "How is an LLM selected?"

# 3. Semantic search

In [24]:
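def get_dot_product(row, query_vector):
    """Dot product between one stored embedding and the query embedding."""
    return np.dot(row, query_vector)

def cosine_similarity(row, query_vector):
    """Cosine similarity between one stored embedding and the query embedding."""
    denominator1 = np.linalg.norm(row)
    denominator2 = np.linalg.norm(query_vector.ravel())
    dot_prod = np.dot(row, query_vector)
    return dot_prod / (denominator1 * denominator2)

def get_context_from_query(query, vector_store, n_chunks=5):
    """Return the text of the n_chunks chunks most similar to the query."""
    # Embed the question with the same model used for the chunks.
    query_vector = np.array(text_embedding(query))
    top_matched = (
        vector_store["Embedding"]
        .apply(lambda row: cosine_similarity(row, query_vector))
        .sort_values(ascending=False)[:n_chunks]
        .index)
    # isin() returns the selected rows in document order, not in similarity order.
    top_matched_df = vector_store[vector_store.index.isin(top_matched)][["Chunks"]]
    return list(top_matched_df['Chunks'])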

Context_List = get_context_from_query(
    query        = query,
    vector_store = df_vector_store,
    n_chunks     = 5)

for chunk in Context_List:
  print("#########################")
  print(chunk)

#########################
Contents
Introduction to LLMOps 1
Why LLMOps? 1
Typical stages in an LLMOps Workflow 2
 3.1 Data Collection, Preparation, Labelling 2
 3.2 Selection of Foundation Models 3
 3.3 Using Large Language Models - Prompting and Fine-tuning 4
 3.4 Evaluation of Prompts and Models & Version Control 5
 3.5 Deployment and Monitoring 6
 3.6 Security, Privacy, Governance and Ethical Considerations 8
Setting Up LLMOps Pipelines 9
Conclusions 11
References 11
#########################
translation and even creative writing.
Using large language models in production environments poses a certain unique set of challenges 
such as organizing LLMs into agents for sub-tasks, developing robust instructions for each  
LLM agent, evaluating the correctness of generated response and efficiencies with fine-tuning.  
Hence, effective usage in a production environment requires appropriate infrastructure and practices 
focusing on experimentation, deployment, management and monitoring of l
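
As a worked example of the ranking step, the sketch below computes cosine similarity, $\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$, on small toy vectors with the standard library only, then returns the best matches first. The retrieval cell above selects rows with `isin`, which yields the chunks in document order; `top_k` here is a hypothetical helper showing how similarity order could be kept if that matters for the prompt.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, vectors, k):
    """Return (index, score) pairs for the k most similar vectors, best first."""
    scored = [(i, cosine_similarity(v, query_vector)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy two-dimensional "embeddings"; real ones have many more dimensions.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
toy_query = [2.0, 1.0]

ranking = top_k(toy_query, toy_vectors, k=2)
for index, score in ranking:
    print(index, round(score, 3))
```

Vector 2 ranks first because it points in nearly the same direction as the query, and vector 3 ranks last because it points away from it; vector length plays no role in the score.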

# 5. Build the prompt

In [25]:
custom_prompt = """
You are a highly advanced artificial intelligence that works as a personal assistant.
Use the SEMANTIC SEARCH RESULTS to answer the user's questions.
Only use the information from the SEMANTIC SEARCH RESULTS if it makes sense and is related to the user's question.
If the answer is not found within the semantic search context, do not make up an answer; politely reply that you do not have the information to answer.

SEMANTIC SEARCH RESULTS:
{source}

Read the instructions carefully, take a deep breath, and write a response for the user!
""".format(source = str(Context_List))

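The prompt cell above fills `{source}` with `str(Context_List)`, so the model receives the Python list representation: square brackets, quotes, and newlines escaped as `\n`. One possible alternative, sketched here under the assumption that plainer text is preferable, joins the chunks into numbered passages. `format_context` is a hypothetical helper, not part of the original notebook.

```python
def format_context(chunks):
    """Join retrieved chunks into one text block, numbering each passage."""
    return "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )

example_chunks = ["First retrieved passage.\n", "Second retrieved passage."]
print(format_context(example_chunks))
```

The result could be passed as `source` in place of `str(Context_List)`.
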
# 6. Get the response

In [26]:
from google.colab import userdata
from openai import OpenAI
client = OpenAI(api_key=userdata.get('openai_key'))

# The system message carries the instructions and retrieved context;
# the user message carries the question. Temperature 0 reduces randomness.
completion = client.chat.completions.create(
    model="gpt-4",
    temperature=0.0,
    messages=[
        {"role": "system", "content": custom_prompt},
        {"role": "user", "content": query}
    ]
)

print(completion.choices[0].message.content)

La selección de un modelo LLM (Large Language Model) implica varios factores importantes. Primero, debido a que entrenar LLMs desde cero es costoso y consume mucho tiempo, los investigadores y profesionales a menudo optan por usar LLMs preentrenados como modelos fundamentales. Estos modelos pueden ser ajustados o adaptados a aplicaciones específicas con menos datos y recursos. Algunos modelos populares incluyen BERT, GPT-3, T5 y XLNet.

Al elegir un modelo, uno debe considerar su rendimiento en tareas comparativas, la disponibilidad de modelos pre-entrenados específicos del dominio (por ejemplo, modelos centrados en la industria financiera o en la medicina), y los términos de licencia del modelo. Entre los factores a considerar también está si el modelo está disponible para ejecutarse en hardware menos costoso y si hay servicios de terceros que ofrecen despliegue y soporte técnico por una tarifa.

Es crucial seleccionar el modelo que mejor se adapte a las necesidades de tu proyecto, ev
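
Sections 3 to 6 can be read as one function: retrieve chunks, fill the prompt template, and request a completion. The sketch below wires those steps together with the retriever and the model call passed in as arguments, so it runs without an API key. `answer_question`, `fake_retrieve`, and `fake_complete` are hypothetical names; in the notebook they would correspond to `get_context_from_query` and a wrapper around `client.chat.completions.create`.

```python
def answer_question(query, retrieve, complete, prompt_template):
    """Run retrieval, prompt construction, and completion for one question."""
    chunks = retrieve(query)
    system_prompt = prompt_template.format(source="\n\n".join(chunks))
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": query},
    ]
    return complete(messages)

# Hypothetical stand-ins so the sketch runs offline.
def fake_retrieve(query):
    return ["chunk A", "chunk B"]

def fake_complete(messages):
    return f"{len(messages)} messages, context: {messages[0]['content']}"

result = answer_question("q", fake_retrieve, fake_complete, "CONTEXT:\n{source}")
print(result)
```

Passing the two callables in keeps the pipeline logic separate from the network calls, which makes it straightforward to exercise with fakes.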

# 7. Run the Streamlit app

In [2]:
from google.colab import userdata
ngrok_token = userdata.get('ngrok_token')
!ngrok authtoken $ngrok_token

Authtoken saved to configuration file: /root/.ngrok2/ngrok.yml


In [90]:
!nohup streamlit run app.py &
from pyngrok import ngrok
public_url = ngrok.connect(port='8501')
public_url

nohup: appending output to 'nohup.out'


'http://cd84-34-145-94-217.ngrok-free.app'

In [89]:
# Terminate the ngrok process, which closes all active tunnels
ngrok.kill()