# First interaction with an LLM from VSCode



In [9]:
import os
import requests
from bs4 import BeautifulSoup
from IPython.display import Markdown, display
from langchain_ollama.llms import OllamaLLM

# Test call to our local model

In [10]:

model = OllamaLLM(model="llama3.2")

model.invoke("What is LangChain?")

"LangChain is an open-source framework for building blockchain-agnostic data pipelines, specifically designed for the Polkadot and Substrate ecosystems. It aims to simplify the process of integrating data storage, querying, and processing with decentralized applications (dApps).\n\nLangChain provides a set of libraries, tools, and frameworks that enable developers to:\n\n1. Store and manage data on-chain using blockchain-based storage solutions like Pinata and IPFS.\n2. Query data from the blockchain using Substrate's query system or external APIs.\n3. Process and transform data using Rust and other programming languages.\n4. Build robust and scalable data pipelines for dApps.\n\nThe core components of LangChain include:\n\n1. `langchain-store`: A storage library that allows you to store, retrieve, and update data on the blockchain.\n2. `langchain-query`: A querying library that enables you to fetch data from the blockchain using Substrate's query system or external APIs.\n3. `langchai

## Starting our first project

In [11]:
# Some websites need you to use proper headers when fetching them:
headers = {
 "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36"
}

class Website:

    def __init__(self, url):
        self.url = url
        response = requests.get(url, headers=headers)  # fetch the website
        soup = BeautifulSoup(response.content, 'html.parser')  # parse the response as HTML
        self.title = soup.title.string if soup.title else "No title found"  # use the page title if there is one
        for irrelevant in soup.find_all(['script', 'style', 'img', 'input']):  # remove tags that are irrelevant for this project
            irrelevant.decompose()
        # Guard against pages without a <body>, where soup.body is None
        self.text = soup.body.get_text(separator="\n", strip=True) if soup.body else ""
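
As a dependency-free illustration of the same idea (drop script and style content, keep only the visible text), here is a minimal sketch that uses the standard library's `html.parser` instead of BeautifulSoup. The names `VisibleTextParser`, `visible_text`, and `sample_html` are hypothetical and do not appear in the notebook. This is not how `Website` is implemented, only a way to try the technique on an inline HTML string without any network access.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text while skipping everything inside <head>, <script> and <style>."""

    # Skipping <head> roughly mirrors the notebook's use of soup.body only.
    # Void tags such as <img> and <input> carry no text, so they need no handling.
    SKIPPED = {"head", "script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while we are inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.parts.append(text)


def visible_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


sample_html = (
    "<html><head><title>Demo</title><style>p{color:red}</style></head>"
    "<body><h1>Hello</h1><script>alert('x')</script><p>World</p></body></html>"
)
print(visible_text(sample_html))
```

Trying the extraction on a small inline string like this makes it easier to see which tags contribute noise before pointing the code at a real site.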

# Testing the code

In [12]:

tacodevtripa = Website("https://github.com/tacodevtripa")
print(tacodevtripa.title)
print(tacodevtripa.text)

tacodevtripa (tacodevtripa) · GitHub
Skip to content
Navigation Menu
Toggle navigation
Sign in
Product
GitHub Copilot
Write better code with AI
Security
Find and fix vulnerabilities
Actions
Automate any workflow
Codespaces
Instant dev environments
Issues
Plan and track work
Code Review
Manage code changes
Discussions
Collaborate outside of code
Code Search
Find more, search less
Explore
All features
Documentation
GitHub Skills
Blog
Solutions
By company size
Enterprises
Small and medium teams
Startups
Nonprofits
By use case
DevSecOps
DevOps
CI/CD
View all use cases
By industry
Healthcare
Financial services
Manufacturing
Government
View all industries
View all solutions
Resources
Topics
AI
DevOps
Security
Software Development
View all
Explore
Learning Pathways
White papers, Ebooks, Webinars
Customer Stories
Partners
Executive Insights
Open Source
GitHub Sponsors
Fund open source developers
The ReadME Project
GitHub community articles
Repositories
Topics
Trending
Collections
Enterprise
En

## Types of prompts


**system prompt** -- Description of the specific task and of how it should be carried out

**user prompt** -- Input provided by the user

In [13]:
"""Every detail matters in this definition: adding a single instruction can completely change
the format in which the LLM generates its response, including translation into another language or a specific format (JSON, YAML, etc.)"""

system_prompt = "You are an assistant that analyzes the contents of a website \
and provides a short summary, ignoring text that might be navigation related. \
Respond in markdown."
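
To experiment with how a single instruction changes the response format, one option is to build the system prompt from parts. The sketch below is only an illustration: `BASE_TASK` and `build_system_prompt` are hypothetical names, and the exact wording of the added instructions is an assumption, not something the notebook prescribes.

```python
# Base task, copied from the notebook's system prompt without the format instruction
BASE_TASK = (
    "You are an assistant that analyzes the contents of a website "
    "and provides a short summary, ignoring text that might be navigation related."
)


def build_system_prompt(output_format="markdown", language=None):
    """Append a format instruction (and optionally a language instruction) to the base task."""
    parts = [BASE_TASK, f"Respond in {output_format}."]
    if language:
        parts.append(f"Write the response in {language}.")
    return " ".join(parts)


print(build_system_prompt())
print(build_system_prompt(output_format="JSON", language="Spanish"))
```

With the default arguments the function reproduces the notebook's prompt, so variants can be compared against the same baseline.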

### Function to build the user prompt in the desired format, including the variables for the dynamic content

In [16]:

def user_prompt_for(website):
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt
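
The prompt builder only needs an object with `title` and `text` attributes, so it can be checked without fetching a real page. The sketch below repeats the function so that it runs on its own and uses `types.SimpleNamespace` as a stand-in. The `fake_site` object and its contents are made up for illustration.

```python
from types import SimpleNamespace


def user_prompt_for(website):
    # Same logic as the notebook's function, repeated so this block runs on its own
    user_prompt = f"You are looking at a website titled {website.title}"
    user_prompt += "\nThe contents of this website is as follows; \
please provide a short summary of this website in markdown. \
If it includes news or announcements, then summarize these too.\n\n"
    user_prompt += website.text
    return user_prompt


# Hypothetical stand-in: any object exposing .title and .text works here
fake_site = SimpleNamespace(
    title="Example page",
    text="Welcome\nLatest news: version 2 released",
)
prompt = user_prompt_for(fake_site)
print(prompt)
```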

In [17]:
print(user_prompt_for(tacodevtripa))

You are looking at a website titled tacodevtripa (tacodevtripa) · GitHub
The contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.

Skip to content
Navigation Menu
Toggle navigation
Sign in
Product
GitHub Copilot
Write better code with AI
Security
Find and fix vulnerabilities
Actions
Automate any workflow
Codespaces
Instant dev environments
Issues
Plan and track work
Code Review
Manage code changes
Discussions
Collaborate outside of code
Code Search
Find more, search less
Explore
All features
Documentation
GitHub Skills
Blog
Solutions
By company size
Enterprises
Small and medium teams
Startups
Nonprofits
By use case
DevSecOps
DevOps
CI/CD
View all use cases
By industry
Healthcare
Financial services
Manufacturing
Government
View all industries
View all solutions
Resources
Topics
AI
DevOps
Security
Software Development
View all
Explore
Learning Pathways
White papers, Ebooks, Webi

# Messages

OpenAI established a format for how information is passed to LLMs. The following example is
the most basic form: the structure is a list `[]` of dictionaries `{key: value}`, that is, key-value elements.

This lets the LLM distinguish which type of prompt it is processing.

In [20]:
messages = [
    {"role": "system", "content": "You are a Spanish assistant"},
    {"role": "user", "content": "What is 2 + 2?"}
]
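
Because the structure is just a list of dictionaries, it can be checked before it is sent to the model. The following is a simplified sketch: `validate_messages` and `ALLOWED_ROLES` are hypothetical names, and the check assumes only the basic `role`/`content` shape shown above (the full format allows additional keys that this sketch would reject).

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Raise ValueError if the list does not follow the basic role/content structure."""
    if not isinstance(messages, list):
        raise ValueError("messages must be a list")
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            raise ValueError(f"message {index} is not a dict")
        if set(message) != {"role", "content"}:
            raise ValueError(f"message {index} must have exactly 'role' and 'content' keys")
        if message["role"] not in ALLOWED_ROLES:
            raise ValueError(f"message {index} has unknown role {message['role']!r}")
    return True


messages = [
    {"role": "system", "content": "You are a Spanish assistant"},
    {"role": "user", "content": "What is 2 + 2?"},
]
print(validate_messages(messages))
```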

In [None]:
# Call to Ollama; the result is returned only once processing has fully completed
model.invoke(messages)

'Hola! *smile*\n\nLa respuesta es... 4. ¡Es fácil, ¿no?!" (The answer is... 4. Easy, isn\'t it?)'

In [25]:
# The result is returned in chunks, so we can iterate over it and print each chunk as it arrives
for chunk in model.stream(messages):
    print(chunk, end=" ")  # `end=" "` overrides print()'s default behavior of appending a newline

¡ Hola !  ( Hello !)  In  Spanish ,  we  would  say  " ¿ Cu ánt o  es   2  más   2 ?"  which  means  " How  much  is   2  plus   2 ?"

 And  the  answer  is ...  ¡ 4 !  ( Four !)

 ¿ Es  correct o ?  ( Is  that  right ?)  
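
The extra spaces in the output above come from printing every chunk with `end=" "`. The sketch below shows one alternative, assuming the chunks already carry their own whitespace (which the doubled spaces above suggest): print with `end=""` and accumulate the pieces to rebuild the full text. `fake_stream` is a hypothetical stand-in for `model.stream()`, so the block runs without a model.

```python
def fake_stream(text, size=4):
    """Hypothetical stand-in for model.stream(): yields the text in small pieces."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


chunks = []
for chunk in fake_stream("The answer is 4."):
    print(chunk, end="", flush=True)  # end="" inserts nothing between chunks
    chunks.append(chunk)
print()  # finish the line once the stream is exhausted

# Joining the chunks recovers the complete response for later use
full_text = "".join(chunks)
```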

## Function to create the messages in the format specified above

In [26]:
def messages_for_LLM(system_prompt, website):
    return [
        {"role": "system", "content": system_prompt},  # sets the system prompt
        {"role": "user", "content": user_prompt_for(website)}  # sets the user input, built from the website's data
    ]

In [28]:
messages_for_LLM(system_prompt, tacodevtripa)

[{'role': 'system',
  'content': 'You are an assistant that analyzes the contents of a website and provides a short summary, ignoring text that might be navigation related. Respond in markdown.'},
 {'role': 'user',
  'content': "You are looking at a website titled tacodevtripa (tacodevtripa) · GitHub\nThe contents of this website is as follows; please provide a short summary of this website in markdown. If it includes news or announcements, then summarize these too.\n\nSkip to content\nNavigation Menu\nToggle navigation\nSign in\nProduct\nGitHub Copilot\nWrite better code with AI\nSecurity\nFind and fix vulnerabilities\nActions\nAutomate any workflow\nCodespaces\nInstant dev environments\nIssues\nPlan and track work\nCode Review\nManage code changes\nDiscussions\nCollaborate outside of code\nCode Search\nFind more, search less\nExplore\nAll features\nDocumentation\nGitHub Skills\nBlog\nSolutions\nBy company size\nEnterprises\nSmall and medium teams\nStartups\nNonprofits\nBy use case\nD

# Function to tie everything together

## Given the URL of the desired website:
- An instance of that website is created
- The class constructor stores the desired information (title and text)
- The LLM's response is returned, using the previous functions to obtain the desired result

In [29]:
def summarize(url):
    website = Website(url)
    return model.invoke(messages_for_LLM(system_prompt, website))
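
To see how the pieces fit together without a network connection or a running model, here is a sketch of a variant that receives the fetcher and the model as arguments. `summarize_with`, `fake_fetch`, and `fake_llm` are hypothetical names, and the user prompt is shortened for brevity. It is an illustration of the flow, not a replacement for `summarize`.

```python
def summarize_with(url, fetch, llm, system_prompt):
    """Hypothetical variant of summarize() with the fetcher and the model passed in."""
    title, text = fetch(url)  # step 1: obtain the page title and text
    messages = [              # step 2: build the role/content messages
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"You are looking at a website titled {title}\n\n{text}"},
    ]
    return llm(messages)      # step 3: return the model's response


# Stubs used only for this illustration
def fake_fetch(url):
    return "Example page", "Welcome to the example page."


def fake_llm(messages):
    return f"# Summary\n\nReceived {len(messages)} messages."


result = summarize_with(
    "https://example.com", fake_fetch, fake_llm, "Summarize the page in markdown."
)
print(result)
```

In the notebook, `fetch` could wrap the `Website` class and `llm` could be `model.invoke`. Passing them in makes each step easy to swap or test in isolation.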

In [30]:
# Testing the function
summarize("https://github.com/tacodevtripa")

"# Summary of tacodevtripa's Website\n\nThe website `tacodevtripa` appears to be the official GitHub profile of a full-stack development creator. The page is divided into sections:\n\n## Overview\n- Introduction: A brief welcome message introducing the creator and their purpose.\n- About Me: Information about the creator, including their passion for teaching web development.\n\n## Features\n\n### Resources\n- Code and projects tutorials with examples.\n- Full-stack development tools and resources.\n\n## Popular Repositories\n- Links to some of the popular repositories created by `tacodevtripa`.\n\n## Contact\n- Invitation to subscribe to the YouTube Channel or connect through Discord.\n- Call for feedback, support, and help in spreading the content.\n\nThe website focuses on providing tutorials and resources for web development, with a focus on full-stack skills."

In [31]:
# Function that renders the response as formatted Markdown using IPython's `Markdown` class
def display_summary(url):
    summary = summarize(url)
    display(Markdown(summary))

In [32]:
# Testing the function
display_summary("https://github.com/tacodevtripa")

### Summary of tacodevtripa's Website

#### About the Creator
The website is run by a developer who aims to help aspiring developers become full-stack professionals through tutorials, code examples, and resources.

#### Content Overview
- **Frontend Projects**: Tutorials on HTML & CSS basics, React apps, and more.
- **Backend Projects**: Tutorials on APIs with Node.js, Spring Boot, and other technologies.
- **Full-stack Applications**: Complete solutions combining frontend, backend, databases, and AWS solutions.
- **Resources**: Cheat sheets, boilerplates, and reusable components.

#### Features of the Website
The website includes a YouTube channel, Discord community for connection and collaboration, and offers support through stars (similar to likes) and subscribing to tutorials.

#### Social Media Links
- YouTube: [https://www.youtube.com/@tacodevtripa](https://www.youtube.com/@tacodevtripa)
- Discord: Link not available in the provided information

#### News/Announcements
There are no formal announcements or news on the website itself. However, the creator invites users to share their content with their network and subscribe to their YouTube channel for more tutorials.

#### Contact Information
The creator is open to feedback through GitHub's issue system. There isn't a clear contact email provided for direct communication outside of these platforms.

### Note
Given the navigation menu was skipped in the analysis, key features and services like GitHub Copilot, Security, Codespaces, etc., were not included in this summary as their content or specific details about them weren't covered during the initial pass.

# Let's try more websites

This project only works with very basic websites. Sites built with frameworks such as React render their elements in the browser as you interact with the page, so the text obtained here may be incomplete or inaccurate.

Some websites also have basic protection provided by their DNS or CDN providers, such as CloudFront, so accessing them with these libraries may not work.
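
One way to notice these cases early is a rough check on what was fetched before calling the model. The sketch below is a heuristic only: `looks_unusable` is a hypothetical helper, and the `min_chars=200` threshold is an arbitrary assumption that would need tuning per site.

```python
def looks_unusable(status_code, text, min_chars=200):
    """Heuristic guess: True if the page was probably blocked or rendered client-side."""
    if status_code >= 400:
        return True  # e.g. a 403 returned by a protection layer
    if len(text.strip()) < min_chars:
        return True  # very little visible text, possibly filled in later by JavaScript
    return False


print(looks_unusable(403, "Access denied"))
print(looks_unusable(200, "Loading..."))
print(looks_unusable(200, "word " * 100))
```

A check like this cannot tell why a page is unusable, but it can keep an almost empty prompt from being sent to the LLM.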

In [33]:
display_summary("https://www.thetimes.com/")

It seems you have provided a large portion of the content from The Times newspaper. I can help with specific questions or tasks related to this content, but please let me know what you would like me to assist with.

Would you like me to:

1. Answer a specific question about the news or sports section?
2. Summarize an article or provide more context on a particular topic?
3. Help with something else (e.g., finding information on a specific person, place, or event)?
4. Assist with generating ideas or discussing topics related to the content?

Please let me know how I can assist you!

In [34]:
display_summary("https://python.langchain.com/docs/introduction/")

This appears to be the official documentation for the LangChain framework, a Python library for building intelligent agents and language models.

The documentation is organized into several sections:

1. **Getting Started**: This section provides a brief introduction to LangChain and its ecosystem, as well as guides on how to get started with the library.
2. **Architecture**: This section explains the architecture of LangChain and its various components, including chat models, vector stores, and agents.
3. **Guides**: This section provides tutorials, how-to guides, and conceptual overviews of various topics related to LangChain.
4. **Integrations**: This section lists the various integrations available for LangChain, including chat models, vector stores, and other components from different providers.
5. **API Reference**: This section provides detailed documentation on all classes and methods in the LangChain Python packages.
6. **Ecosystem**: This section highlights additional resources and tools that integrate with LangChain, such as LangSmith and LangGraph.
7. **Versions**: This section explains the versioning policies for LangChain and provides information on migrating legacy code to new versions.
8. **Security**: This section provides guidelines for developing safely with LangChain.
9. **Contributing**: This section provides a developer's guide for contributing to LangChain.

The documentation also mentions the following resources:

* Twitter: @LangChain
* GitHub: langchain
* Organization: LangChain, Inc.
* Python and JS/TS versions

Overall, this documentation appears to be designed to help developers build intelligent agents and language models using LangChain.

## Additional example

- The `System Prompt` was defined to create an assistant that writes email subject lines based on the email's content
- As the `User Prompt` we pass the full content of a made-up email; how that content arrives is left to your imagination
- Note how the System Prompt specifies that only the subject line should be returned, so the value can be stored in a variable. This may or may not work depending on how capable each LLM is, but it generally does. The instruction can also be changed to ask for several options in a single response

In [37]:
system_prompt = """Analyze the email content and suggest a clear, concise, and effective subject line. Your task is to:

1. Read and understand the entire email content.
2. Identify the main topic or purpose of the email.
3. Determine the tone and language used in the email (e.g., formal, informal, promotional).
4. Consider the recipient's context and potential emotional resonance.

Create a subject line that accurately reflects the content of the email and is likely to capture the attention of
the recipient.

Provide nothing but the subject line, so the value can be stored in a variable, and provide 3 options"""

user_prompt = """
Dear Tacodevtripa,

We're excited to announce the launch of our new product, "EcoCycle" - a revolutionary recycling system designed to
help households and businesses reduce their waste output.

As one of our valued partners, we're offering you an exclusive 10% discount on your first purchase. Simply use the
code ECOCYCLE10 at checkout to redeem your discount.

Our team has worked tirelessly to develop EcoCycle, and we're confident it will make a significant impact in
reducing waste and promoting sustainability.

To learn more about EcoCycle and how it can benefit your organization, please click on the link below:


Best regards,
EcoCycle Team
"""

def messages_for_email_agent(system_prompt, content):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": content}
    ]
messages = messages_for_email_agent(system_prompt, user_prompt)


model.invoke(messages)

'"Exclusive Offer: Revolutionize Your Waste Management with EcoCycle"\n\n(or)\n\n"Ditch Waste, Not Your Budget: 10% Off EcoCycle Today"\n\n(or)\n\n"Sustainability Meets Convenience: Get Exclusive Access to EcoCycle"'
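
Since the goal is to store the subject lines in a variable, the reply still needs to be split into separate options. The sketch below parses the exact reply shown above. `parse_subject_options` is a hypothetical helper, and it assumes the `(or)` separator and the surrounding quotes seen in this particular response, which another run or another model may not reproduce.

```python
def parse_subject_options(reply):
    """Split a reply into subject lines, dropping blank lines and '(or)' separators."""
    options = []
    for line in reply.splitlines():
        line = line.strip()
        if not line or line.lower() == "(or)":
            continue
        options.append(line.strip('"'))  # remove the quotes around each option
    return options


# Reply copied from the cell output above
reply = (
    '"Exclusive Offer: Revolutionize Your Waste Management with EcoCycle"\n\n(or)\n\n'
    '"Ditch Waste, Not Your Budget: 10% Off EcoCycle Today"\n\n(or)\n\n'
    '"Sustainability Meets Convenience: Get Exclusive Access to EcoCycle"'
)
subjects = parse_subject_options(reply)
for subject in subjects:
    print(subject)
```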