# First interaction with an LLM from VSCode



In [22]:
import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from langchain_ollama.llms import OllamaLLM

# Test call to our local model

In [23]:

model = OllamaLLM(model="qwen2.5-coder:0.5b")

model.invoke("What is LangChain?")

"LangChain is an open-source framework for building conversational AI applications in Python and other programming languages. It includes several modules that facilitate the creation of natural language processing (NLP) systems. Here's a brief overview of what LangChain can do:\n\n1. **Text Processing**: \n   - **Tokenization**: Split text into individual words or tokens.\n   - **Cleaning**: Remove punctuation, stop words, and convert to lowercase.\n   - **Stemming/Lemmatization**: Reduce word frequencies to their root forms.\n\n2. **Machine Learning Models**:\n   - **Natural Language Generation (NLP)**: Use a pre-trained NLP model like GPT-3 or BERT for text generation.\n   - **Question Answering**: Train a model to answer questions based on conversational history or context.\n\n3. **Embeddings**:\n   - **Word Embeddings**: Represent words as numerical vectors using techniques like Word2Vec, GloVe, or FastText.\n   - **Sentence Embeddings**: Represent sentences as numeric vectors usin

## Starting our first project

In [24]:
# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers) # fetch the website
        soup = BeautifulSoup(response.content, 'html.parser') # parse the response as HTML
        self.title = soup.title.string if soup.title else "No title found" # use the page title when there is one
        for irrelevant in soup.find_all(['script', 'style', 'img', 'input']): # remove tags that are irrelevant for this project
            irrelevant.decompose()
        # some responses have no <body>; fall back to an empty string instead of raising AttributeError
        self.text = soup.body.get_text(separator="\n", strip=True) if soup.body else ""
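
The cleaning step in the class above (dropping `script` and `style` content, then collecting the visible text) can be tried without any network access. The following is a minimal sketch of the same idea using the standard library's `html.parser` instead of BeautifulSoup; `VisibleTextParser`, `visible_text`, and the sample HTML are hypothetical names and data made up for illustration, not part of the project.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while we are inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the visible text of an HTML string, one text node per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Made-up sample page, used only to illustrate the cleaning step
sample_html = (
    "<body><style>p {color: red}</style>"
    "<p>Hello</p><script>alert(1)</script><p>World</p></body>"
)
print(visible_text(sample_html))
```

Only the two paragraphs survive; the stylesheet and the script are discarded, which is the effect `decompose()` has in the `Website` class.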

# Testing the code

In [25]:

tacodevtripa = Website("https://github.com/tacodevtripa")
print(tacodevtripa.title)
print(tacodevtripa.text)

tacodevtripa (tacodevtripa) · GitHub
Skip to content
Navigation Menu
Toggle navigation
Sign in
Product
GitHub Copilot
Write better code with AI
Security
Find and fix vulnerabilities
Actions
Automate any workflow
Codespaces
Instant dev environments
Issues
Plan and track work
Code Review
Manage code changes
Discussions
Collaborate outside of code
Code Search
Find more, search less
Explore
All features
Documentation
GitHub Skills
Blog
Solutions
By company size
Enterprises
Small and medium teams
Startups
Nonprofits
By use case
DevSecOps
DevOps
CI/CD
View all use cases
By industry
Healthcare
Financial services
Manufacturing
Government
View all industries
View all solutions
Resources
Topics
AI
DevOps
Security
Software Development
View all
Explore
Learning Pathways
White papers, Ebooks, Webinars
Customer Stories
Partners
Executive Insights
Open Source
GitHub Sponsors
Fund open source developers
The ReadME Project
GitHub community articles
Repositories
Topics
Trending
Collections
Enterprise
E

## Types of prompts


**system prompt** -- Describes the specific task and how it should be carried out

**user prompt** -- Input provided by the user

In [26]:
"""In this definition every detail matters: adding a single instruction can completely change
the format of the LLM's response, including translation to another language or a specific output format (JSON, YAML, etc.)"""

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."

### Function that builds the user input in the desired format, including the variables for the dynamic content

In [27]:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt

In [28]:
print(user_prompt_for(tacodevtripa))

You are looking at a website titled tacodevtripa (tacodevtripa) · GitHub
The contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

Skip to content
Navigation Menu
Toggle navigation
Sign in
Product
GitHub Copilot
Write better code with AI
Security
Find and fix vulnerabilities
Actions
Automate any workflow
Codespaces
Instant dev environments
Issues
Plan and track work
Code Review
Manage code changes
Discussions
Collaborate outside of code
Code Search
Find more, search less
Explore
All features
Documentation
GitHub Skills
Blog
Solutions
By company size
Enterprises
Small and medium teams
Startups
Nonprofits
By use case
DevSecOps
DevOps
CI/CD
View all use cases
By industry
Healthcare
Financial services
Manufacturing
Government
View all industries
View all solutions
Resources
Topics
AI
DevOps
Security
Software Development
View all
Explore
Learning Pathways
White papers, Ebooks, Web

# Messages

OpenAI established a format for the way information is passed to LLMs. The following example
is its most basic form: a list `[]` of dictionaries `{key: value}`, that is, key-value pairs.

This lets the LLM tell which type of prompt it is processing.

In [42]:
messages = [
    {"role": "system", "content": "You are an assistant that answers in Spanish"},
    {"role": "user", "content": "How can I define a class in Python?"}
]
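
Because the structure above is plain Python data, it can be checked before it is sent to a model. The following is a minimal sketch; `validate_messages` and `ALLOWED_ROLES` are hypothetical helpers, and limiting the roles to `system`, `user`, and `assistant` is an assumption that covers only the most common ones.

```python
# Assumption: only the three most common roles are accepted here
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Return a list of problems found; an empty list means the structure looks fine."""
    problems = []
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {index} is not a dictionary")
            continue
        if message.get("role") not in ALLOWED_ROLES:
            problems.append(
                f"message {index} has an unexpected role: {message.get('role')!r}"
            )
        if not isinstance(message.get("content"), str):
            problems.append(f"message {index} has no text content")
    return problems


example = [
    {"role": "system", "content": "You are an assistant that answers in Spanish"},
    {"role": "user", "content": "How can I define a class in Python?"},
]
print(validate_messages(example))  # []
print(validate_messages([{"role": "usr", "content": "hi"}]))
```

The second call reports the misspelled role, which is the kind of mistake that otherwise only shows up as a confusing model response.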

In [40]:
# Call to Ollama; the result is returned once processing has finished completely
model.invoke(messages)

'La respuesta a tu pregunta es 4.'

In [43]:
# The result is returned in chunks, so we can iterate over it and display it as it arrives
for chunk in model.stream(messages):
    print(chunk, end=" ") # `end=" "` overrides print()'s default trailing newline; `end=""` would also avoid the extra spaces between chunks seen in the output

En  Python ,  una  clase  es  una  plant illa  de  código  que  def ini endo  un  nuevo  tipo  de  objeto .  Aqu í  tienes  algunos  pas os  bás icos  para  defin ir  una  clase :

 1 .  ** Def in ir  la  Nombre  de  la  Cl ase **:  El  nombre  de  tu  clase  debe  comenz ar  con  letra  o  may ú sc ula  y  seguir  por  letras ,  números  o  gu iones  baj os  ( p uede  tener  com as ).

 2 .  ** Def in ir  Los  A trib utos  ( O  Prop iedades )** :  Los  atrib utos  son  las  características  que  tu  clase  tiene .  Se  def inen  en  Python  utilizando  la  palabra  ` class `  seg uida  del  nombre  de  la  clase .

     ``` python 
     class  MyClass :
         pass 
     `` `

 3 .  ** Def in ir  Mét odos  ( O  Func iones )** :  Los  mét odos  son  las  acciones  que  una  clase  puede  realizar .  Se  def inen  en  Python  utilizando  el  oper ador  ` def `.

     ``` python 
     class  MyClass :
         def  __ init __( self ,  value ):
             self .value  =  value
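
The split words in the streamed output come from printing a space after every chunk. The following is a minimal sketch of the same loop with `end=""`, which also keeps the full reply in a variable. `fake_stream` is a hypothetical stand-in for `model.stream(messages)` so the example runs without Ollama, and the chunk boundaries are invented for illustration.

```python
def fake_stream():
    """Stand-in for model.stream(): yields a reply in small pieces."""
    for piece in ["En", " Python", ",", " una", " clase", " es", " una", " plant", "illa", "."]:
        yield piece


def print_stream(chunks):
    """Print chunks as they arrive and return the assembled reply."""
    parts = []
    for chunk in chunks:
        # end="" joins chunks exactly as received; flush=True shows them immediately
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_reply = print_stream(fake_stream())
```

With a real model the call would be `print_stream(model.stream(messages))`; the chunks already carry their own leading spaces, so no separator is needed.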

## Function that builds the messages in the format described above

In [44]:
def messages_for_LLM(system_prompt, website):
    return [
        {"role": "system", "content": system_prompt}, # sets the system prompt
        {"role": "user", "content": user_prompt_for(website)} # sets the user input with the data from the web page
    ]

In [45]:
messages_for_LLM(system_prompt, tacodevtripa)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': "You are looking at a website titled tacodevtripa (tacodevtripa) · GitHub\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nSkip to content\nNavigation Menu\nToggle navigation\nSign in\nProduct\nGitHub Copilot\nWrite better code with AI\nSecurity\nFind and fix vulnerabilities\nActions\nAutomate any workflow\nCodespaces\nInstant dev environments\nIssues\nPlan and track work\nCode Review\nManage code changes\nDiscussions\nCollaborate outside of code\nCode Search\nFind more, search less\nExplore\nAll features\nDocumentation\nGitHub Skills\nBlog\nSolutions\nBy company size\nEnterprises\nSmall and medium teams\nStartups\nNonprofits\nBy use case\n

# Function that ties everything together

## Given the URL of the desired website:
- An instance of the `Website` class is created for that site
- The class constructor stores the page title and text
- The LLM's response is returned, using the functions above to build the messages

In [46]:
def summarize(url):
    website = Website(url)
    return model.invoke(messages_for_LLM(system_prompt, website))

In [48]:
# Testing the function
summarize("https://github.com/tacodevtripa")

"# TacodevTripa\n\nWelcome to the official GitHub of 'taco dev tripa'. Here, you'll find all the resources from the YouTube videos.\n\n# Resources\n\n- **Cheat Sheets**: Include my cheat sheets on HTML & CSS, React apps.\n- **Boilerplates**: Provide boilerplates for API with Node.js, Spring Boot, and more.\n- **Reusable Components**: Offer reusable components for frontend projects, backend projects, databases, and AWS solutions.\n- **YouTube Channels**: Watch tutorials from my YouTube Channel.\n- **GitHub Sponsors**: Learn more about GitHub sponsors and support.\n- **Feedback**: Use saved searches to filter your results more quickly.\n- **Code Review**: Manage code changes.\n- **Discussions**: Participate in discussions.\n\n# Contact\n\nLet's Connect:\n\n- **YouTube**: Subscribe to the YouTube Channel\n- **Discord**: Join the Discord Server\n- **Support**: If you enjoy the content and find it helpful, star the repositories you love.\n\n# Footer\n\n© 2025 GitHub, Inc.\nFooter navigatio

In [19]:
# Function that renders the response as Markdown using IPython's `Markdown` class
def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [None]:
# Testing the function
display_summary("https://github.com/tacodevtripa")

# Let's try more websites

This project only works well with very basic websites. Sites built with frameworks such as React render their elements in the browser as you interact with the page, so the HTML fetched here may be incomplete and the summary inaccurate.

Some websites also have basic protection from their CDN or DNS providers, such as CloudFront, and accessing them through these libraries may not work.
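
Since some of these sites will fail to load or parse, it can help to keep one failure from interrupting the notebook. The following is a minimal sketch; `safe_summarize` is a hypothetical wrapper that is not part of the project, and `failing_summarize` only simulates a blocked site so the example runs offline.

```python
def safe_summarize(url, summarize_fn):
    """Call summarize_fn(url) and return a readable message instead of raising."""
    try:
        return summarize_fn(url)
    except Exception as error:  # broad on purpose: network, parsing, or model errors
        return f"Could not summarize {url}: {type(error).__name__}: {error}"


def failing_summarize(url):
    """Simulates a site that refuses the request."""
    raise ConnectionError("blocked by the site")


print(safe_summarize("https://example.com", failing_summarize))
```

In the notebook the call would be `safe_summarize(url, summarize)`, passing the `summarize` function defined earlier.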

In [None]:
display_summary("https://www.thetimes.com/")

In [None]:
display_summary("https://python.langchain.com/docs/introduction/")

## Additional example

- The `System Prompt` defines an assistant that writes email subject lines based on the email's content
- As the `User Prompt` we pass the full content of a made-up email; how that content arrives is left to your imagination
- Note how the System Prompt asks for nothing but the subject line, so the value can be stored in a variable. Whether this works depends on how capable each LLM is, but it usually does. The prompt can also be changed to ask for several options in a single response

In [None]:
system_prompt = """Analyze the email content and suggest a clear, concise, and effective subject line. Your task is to:

1. Read and understand the entire email content.
2. Identify the main topic or purpose of the email.
3. Determine the tone and language used in the email (e.g., formal, informal, promotional).
4. Consider the recipient's context and potential emotional resonance.

Create a subject line that accurately reflects the content of the email and is likely to capture the attention of
the recipient.

Provide nothing but the subject line, so the value can be stored in a variable, and provide 3 options"""

user_prompt = """
Dear Tacodevtripa,

We're excited to announce the launch of our new product, "EcoCycle" - a revolutionary recycling system designed to
help households and businesses reduce their waste output.

As one of our valued partners, we're offering you an exclusive 10% discount on your first purchase. Simply use the
code ECOCYCLE10 at checkout to redeem your discount.

Our team has worked tirelessly to develop EcoCycle, and we're confident it will make a significant impact in
reducing waste and promoting sustainability.

To learn more about EcoCycle and how it can benefit your organization, please click on the link below:


Best regards,
EcoCycle Team
"""

def messages_for_email_agent(system_prompt, content):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": content}
    ]
messages = messages_for_email_agent(system_prompt, user_prompt)


model.invoke(messages)
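
Because the prompt asks for nothing but the subject lines, the reply can be split into a list and stored in a variable. The following is a minimal sketch; `parse_subject_options` is a hypothetical helper, and `sample_response` is a made-up reply, not real model output. It assumes one option per line, optionally numbered or bulleted, which a small model may not always respect.

```python
import re


def parse_subject_options(response, expected=3):
    """Split a model reply into subject lines, dropping numbering, bullets, and quotes."""
    options = []
    for line in response.splitlines():
        # Remove leading markers such as "1.", "2)", "-", "*"
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*•])\s*", "", line).strip()
        cleaned = cleaned.strip('"')
        if cleaned:
            options.append(cleaned)
    return options[:expected]


# Made-up reply, for illustration only
sample_response = (
    "1. Introducing EcoCycle\n"
    '2. "Your exclusive 10% discount"\n'
    "\n"
    "- Meet EcoCycle today"
)
subjects = parse_subject_options(sample_response)
print(subjects)
```

With the real model this would be `parse_subject_options(model.invoke(messages))`.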