# Prompting and basic LangChain

Importing and installing the required packages

In [1]:
import os
import json
from pathlib import Path
from dotenv import load_dotenv
import pprint
from typing import List

from IPython.display import display, Markdown

In [2]:
# Load environment variables from .env file
load_dotenv()

True

In [None]:
# %pip install langchain langchain-openai langchain-groq langchain-community langchain-chroma langchain-huggingface python-dotenv

In [3]:
from langchain_openai import ChatOpenAI
from langchain_groq import ChatGroq
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain.memory import ChatMessageHistory
from langchain_core.runnables import RunnablePassthrough
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate
from langchain_openai import OpenAIEmbeddings
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from pydantic import BaseModel, Field


## Chat simple basado en LLM

For Groq, create an account and an API key at https://groq.com/

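The cell below reads the model name from an environment variable, creates a chat model instance with a low temperature (less randomness), sends a prompt directly to the LLM, and prints the response. The second assignments to llm_model and llm override the first ones, so the call actually goes to Groq through ChatGroq; keep only the pair that matches your provider.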

In [36]:
llm_model = os.environ["OPENAI_MODEL"]
llm_model = "moonshotai/kimi-k2-instruct"
print(llm_model)

llm = ChatOpenAI(model=llm_model, temperature=0.1)
llm = ChatGroq(model=llm_model, temperature=0.1)

response = llm.invoke("Tell me a joke about data scientists")
print(response.text()) 

moonshotai/kimi-k2-instruct
Why don’t data scientists ever get invited to parties?

Because they always bring their own *standard deviations* and kill the *mean* mood!
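
Since the cell above reads `OPENAI_MODEL` and then overrides it with a hardcoded name, a fallback makes the intent more explicit. The following is a minimal sketch, not part of the original notebook: the helper name `resolve_model_name` is hypothetical, and the default value is simply the model name used above.

```python
import os


def resolve_model_name(env_var: str = "OPENAI_MODEL",
                       default: str = "moonshotai/kimi-k2-instruct") -> str:
    """Return the model name stored in env_var, or default if it is unset or blank."""
    value = os.environ.get(env_var, "").strip()
    return value or default


# Uses the environment variable when present, otherwise the default.
llm_model = resolve_model_name()
print(llm_model)
```

Unlike `os.environ["OPENAI_MODEL"]`, this does not raise `KeyError` when the variable is missing.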


Chat models work with an interleaved sequence of messages, each tagged with a role.

The cell below creates a template with a fixed SystemMessage and a MessagesPlaceholder named "messages", and then chains that template with the LLM. When the chain is invoked, it receives a dictionary with the key "messages" and a list of message objects (HumanMessage, AIMessage, ...). The result, ai_msg, is a message object whose text is printed as the response.

The order and type of the messages matter (context, history, examples). The variable name defined in the placeholder must match the key passed to invoke().

In [37]:
prompt = ChatPromptTemplate.from_messages(
    [
        SystemMessage(
            content="You are a helpful assistant. Answer all questions to the best of your ability."
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | llm # Chaining the prompt and the llm call

ai_msg = chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)
print(ai_msg.text())

I said: “J’adore la programmation,” which is French for “I love programming.”
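
To make the role of the placeholder concrete, here is a conceptual sketch in plain Python of how a fixed system message and a list of caller-supplied messages end up as one flat message list. It is not LangChain's implementation; `render_chat_prompt` and the `(role, content)` tuples are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


def render_chat_prompt(system: str,
                       placeholder: str,
                       inputs: Dict[str, List[Message]]) -> List[Message]:
    """Prepend the system message to the messages found under the placeholder key."""
    if placeholder not in inputs:
        # Mirrors the rule that the placeholder name must match the input key.
        raise KeyError(f"missing input variable: {placeholder!r}")
    return [("system", system), *inputs[placeholder]]


messages = render_chat_prompt(
    "You are a helpful assistant.",
    "messages",
    {
        "messages": [
            ("human", "Translate from English to French: I love programming."),
            ("ai", "J'adore la programmation."),
            ("human", "What did you just say?"),
        ]
    },
)
for role, content in messages:
    print(f"{role}: {content}")
```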


This is an example of a *zero-shot* chain, because the LLM answers based only on its pre-training, with no examples in the prompt.

The next cell builds a chat prompt with a fixed system message (SYSTEM_PROMPT) and a template that expects a "question" variable in the "human" role. ChatPromptTemplate.from_messages() accepts several message representations and returns a template that can be composed with an LLM.

When invoked, qa_chain fills in the prompt variables, runs the model, and returns the model's response message. Finally, the output is rendered as Markdown.
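
The `|` composition can be pictured as plain function chaining. The sketch below is a toy model of that idea, not LangChain's Runnable class: `Step` is a hypothetical name, and an upper-casing function stands in for the LLM.

```python
from typing import Any, Callable


class Step:
    """A tiny pipeline stage: wraps a function and supports `a | b` composition."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # The composed step feeds this step's output into the next one.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)


fill_prompt = Step(lambda inputs: f"human: {inputs['question']}")  # fills the template
fake_llm = Step(str.upper)                                         # stand-in for a model call

chain = fill_prompt | fake_llm
print(chain.invoke({"question": "hi"}))
```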

In [38]:
SYSTEM_PROMPT = """You are an experienced software architect that assists a novice developer 
to design a system. """

qa_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", SYSTEM_PROMPT),
            ("human", "{question}"),
        ]
)

qa_chain = qa_prompt | llm

query = "What are the pros and cons of the Proxy design pattern?"
result = qa_chain.invoke(input=query)
# print(result.content)

display(Markdown(result.content)) # Result in Markdown format


Proxy Design Pattern – Quick-Reference Card  
(what it is: a surrogate / placeholder that controls access to another object)

-------------------------------------------------
PROS – why you reach for it
-------------------------------------------------
1. Transparent protection  
   Client code talks to the Proxy through the exact same interface it would use for the real subject → no changes in calling code.

2. Lazy (virtual) instantiation  
   Heavy or network objects are created only when the first real method is invoked → faster start-up and lower memory footprint.

3. Security gatekeeper  
   You can centralise permission checks in the Proxy instead of scattering them through business code.

4. Remote communication stub (RMI, gRPC, SOAP, …)  
   The Proxy serialises the call and hides all the plumbing (marshalling, retries, network failures).

5. Thread-safe or cached access  
   Synchronisation, pooling, result caching, request coalescing, circuit-breaker, etc. can be added without touching the real object.

6. Instrumentation / AOP hook  
   Add logging, metrics, transaction boundaries, profiling, throttling, … in one place.

7. Future-proofing & hot swapping  
   Replace the real subject with a different implementation (mock, decorator, cloud service) at run-time.

-------------------------------------------------
CONS – where it can bite you
-------------------------------------------------
1. One more abstraction layer  
   Extra classes/interfaces → steeper learning curve for newcomers, more code to navigate.

2. Performance tax  
   Every call goes through an indirection; for very fine-grained objects (e.g., graph nodes, game entities) the overhead can dominate.

3. Lifecycle complexity  
   You now have two objects to manage: proxy and real subject.  
   – Who creates whom?  
   – Who owns the expensive resources?  
   – When can the real subject be disposed?  
   Reference-counting or weak references may be needed.

4. Concurrency pitfalls  
   If the proxy is shared between threads, its own state (e.g., cached reference to the real subject) must be thread-safe, duplicating synchronisation effort.

5. Debugging pain  
   Stack traces are deeper; breakpoints in the proxy can hide the real culprit; logging may show “proxy” instead of the actual class name unless you toString() carefully.

6. Interface coupling  
   If the real subject’s interface evolves you must update the proxy (and all specialised proxies: virtual, remote, cache, security, …).  
   Dynamic proxies (java.lang.reflect.Proxy, CGLib, ByteBuddy) mitigate this but add byte-code magic that not every team is comfortable with.

7. Over-engineering risk  
   A simple “if” in the service layer is sometimes enough; wrapping every repository call in a proxy can turn into a maze of tiny classes.

-------------------------------------------------
Rule-of-thumb
-------------------------------------------------
Use Proxy when the cost of creating/accessing the real object is high, or when you need to add a systemic, cross-cutting behaviour (security, caching, remoting) that must stay orthogonal to business logic.  
Avoid it when the object is cheap, calls are extremely frequent, or the added indirection obscures more than it helps.
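
The call `qa_chain.invoke(input=query)` above passes a bare string even though the template declares a `{question}` variable. The sketch below shows one way such a convenience can work: a single value is wrapped into a dictionary when the template has exactly one variable. This is an illustration in plain Python; `template_variables` and `fill` are hypothetical helpers, not LangChain functions.

```python
from string import Formatter
from typing import Dict, List, Union


def template_variables(template: str) -> List[str]:
    """Return the distinct {placeholders} of a format string, in order."""
    names = [name for _, name, _, _ in Formatter().parse(template) if name]
    return list(dict.fromkeys(names))


def fill(template: str, value: Union[str, Dict[str, str]]) -> str:
    """Fill the template from a dict, or from a bare value if there is one variable."""
    names = template_variables(template)
    if not isinstance(value, dict):
        if len(names) != 1:
            raise TypeError("a bare value needs a template with exactly one variable")
        value = {names[0]: value}
    return template.format(**value)


print(fill("{question}", "What are the pros and cons of the Proxy design pattern?"))
```

With two or more variables the dictionary form is required, which is why the later chains pass both 'question' and 'chat_history' explicitly.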

## Memory management (as part of a chat)

Memory is a list of previous messages (usually in pairs) exchanged between the human and the AI, which provides *context* for the next interaction.

The chain below defines a rewriting prompt that turns a history-dependent question into a standalone one. The template combines a system message, a MessagesPlaceholder named "chat_history", and the human question; it is then composed with the LLM and a StrOutputParser so that the output is a plain string.

MessagesPlaceholder explicitly expects a list of messages as input.

ChatMessageHistory acts as a buffer. Passing chat_history.messages to invoke() provides the structure that MessagesPlaceholder expects.
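
As a rough picture of what such a buffer holds, here is a minimal stand-in written with the standard library only. `SimpleChatHistory` is a hypothetical class for illustration; it borrows the method names `add_user_message` and `add_ai_message` from the notebook but stores plain `(role, content)` tuples instead of message objects.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SimpleChatHistory:
    """Append-only buffer of (role, content) pairs, oldest first."""

    messages: List[Tuple[str, str]] = field(default_factory=list)

    def add_user_message(self, content: str) -> None:
        self.messages.append(("human", content))

    def add_ai_message(self, content: str) -> None:
        self.messages.append(("ai", content))

    def clear(self) -> None:
        self.messages.clear()


history = SimpleChatHistory()
history.add_user_message("What are the pros and cons of the Proxy design pattern?")
history.add_ai_message("Proxy Design Pattern: Quick-Reference Card ...")
print(history.messages)
```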


In [39]:
CONTEXTUALIZED_PROMPT = """Given a chat history and the latest developer's question
    which might reference context in the chat history, formulate a standalone question
    that can be understood without the chat history. Do NOT answer the question,
    just reformulate it if needed and otherwise return it as is."""

contextualized_qa_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", CONTEXTUALIZED_PROMPT),
            MessagesPlaceholder(variable_name="chat_history"),
            ("human", "{question}"),
        ]
    )

# This is a possible chain to keep (and compress) past interactions
# It's a form of rewriting
contextualized_qa_chain = contextualized_qa_prompt | llm | StrOutputParser()


# A buffer to store messages from user and assistant
chat_history = ChatMessageHistory()
chat_history.add_user_message(query)
chat_history.add_ai_message(result)


query = "Can I combine the pattern with other patterns?"
ai_msg = contextualized_qa_chain.invoke(
    {
        'question': query, 
        'chat_history': chat_history.messages
    }
)
print(ai_msg)

Yes—combine the Proxy pattern with other patterns whenever you need to layer responsibilities that don’t belong in the real subject. Typical pairings:

- Decorator – Proxy adds control/access; Decorator adds behaviour. You can wrap a Proxy around a Decorator (or vice-versa) to get both caching and, say, compression.  
- Adapter – Use an Adapter inside a Remote Proxy to translate the real object’s interface into the one clients expect over the wire.  
- Flyweight – A Virtual Proxy can postpone creation of the heavy flyweight object until it’s actually needed.  
- Singleton – A Proxy can be the single entry point that lazily creates and guards the only instance.  
- Observer – A Protection Proxy can filter or veto subscription requests; a Remote Proxy can re-broadcast events across JVMs.  
- Factory / Abstract Factory – Instantiate the correct Proxy subclass (CachingProxy, SecurityProxy, …) without clients knowing.  
- Command – A Remote Proxy can turn each method call into a Command obj

Now the contextualized question can be passed to the LLM so that it answers it.

In [40]:
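def contextualized_question(input: dict):
    # With history, return the rewriting chain (it is then run on the same input);
    # without history, pass the question through unchanged.
    if input.get("chat_history"):
        return contextualized_qa_chain
    else:
        return input["question"]

# A new QA chain that reuses the previous chain.
# Note: the 'context' key ends up holding the LLM's answer message.
qa_chain_with_memory = RunnablePassthrough.assign(
    context=contextualized_question | qa_prompt | llm
)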

result = qa_chain_with_memory.invoke(
    {
        'question': query,  
        'chat_history': chat_history.messages
    }
)

display(Markdown(result['context'].content))

Exactly.  
Think of the Proxy as the “on-ramp” to the object; once traffic is flowing through that on-ramp you can stack as many extra lanes (concerns) as you need without touching the original object.  
Below are the most common combinations you’ll run into, the problem each one solves, and a minimal code sketch so you can see the seams.

-------------------------------------------------
1. Proxy + Caching (Flyweight or simple Map)
-------------------------------------------------
Problem: The real service is CPU- or I/O-heavy and the same arguments are requested over and over.

Key idea: Let the proxy hold a lightweight cache; only delegate to the real object on a miss.

Java example (interface already exists):

public interface DataService { String fetch(String key); }

public class CachingProxy implements DataService {
    private final DataService real;
    private final Map<String,String> cache = new ConcurrentHashMap<>();

    public CachingProxy(DataService real) { this.real = real; }

    @Override
    public String fetch(String key) {
        return cache.computeIfAbsent(key, real::fetch);
    }
}

Usage:
DataService data = new CachingProxy(new RemoteDataService());

-------------------------------------------------
2. Proxy + Security (Protection Proxy)
-------------------------------------------------
Problem: You must check roles/permissions before every call.

Key idea: Intercept the call, ask an AccessManager, throw if denied, forward if allowed.

public class SecureProxy implements DataService {
    private final DataService real;
    private final AccessManager access;

    public SecureProxy(DataService real, AccessManager access) {
        this.real = real; this.access = access;
    }

    @Override
    public String fetch(String key) {
        if (!access.canRead(key))
            throw new ForbiddenException();
        return real.fetch(key);
    }
}

-------------------------------------------------
3. Proxy + Remote Communication (Remote Proxy / Adapter)
-------------------------------------------------
Problem: Real object lives on another process/machine.

Key idea: Proxy hides serialization, network stubs, retries, circuit-breaker, etc.

Spring example (RMI or HTTP):

@FeignClient(name = "data-service")
public interface RemoteDataService extends DataService {}

Spring generates the proxy at runtime; you just autowire DataService and never notice the network.

-------------------------------------------------
4. Proxy + Lazy Loading (Virtual Proxy)
-------------------------------------------------
Problem: Object graph is expensive to create/load.

Key idea: Proxy holds a reference but doesn’t materialize the real object until the first method call.

public class VirtualProxy implements HeavyObject {
    private HeavyObject real;   // null until needed
    @Override
    public void heavyOp() {
        if (real == null) {
            real = new HeavyObjectImpl(); // costly
        }
        real.heavyOp();
    }
}

-------------------------------------------------
5. Proxy + Metrics / Logging (Decorator flavour)
-------------------------------------------------
Problem: Ops wants latency counters, logs, tracing.

Key idea: Proxy wraps and times, then delegates.

public class MetricsProxy implements DataService {
    private final DataService real;
    private final MeterRegistry registry;

    @Override
    public String fetch(String key) {
        return registry.timer("ds.fetch", "key", key)
                       .recordCallable(() -> real.fetch(key));
    }
}

-------------------------------------------------
6. Chaining them together
-------------------------------------------------
Because every proxy implements the same interface you can stack:

DataService service =
    new MetricsProxy(
        new SecureProxy(
            new CachingProxy(
                new RemoteDataService())));

The call order becomes:
Metrics → Security → Cache → Network.

-------------------------------------------------
7. When to stop
-------------------------------------------------
- Keep the interface stable; every new concern is a new wrapper.  
- If the stack becomes hard to reason about, switch to a single interceptor pipeline (e.g., Spring AOP, Jakarta interceptors, .NET DispatchProxy) so ordering is explicit.  
- Document the chain in README or a diagram—future you will thank you.

-------------------------------------------------
Rule of thumb
-------------------------------------------------
“Add a new proxy when the concern is orthogonal to business logic and you can express it as ‘before/after’ the real call.”  
Everything else probably belongs inside the service itself.
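
The routing step used by qa_chain_with_memory can be isolated from the LLM calls. The sketch below reproduces only the branching logic in plain Python; `standalone_question` and `fake_rewrite` are hypothetical names, and `fake_rewrite` is a stub standing in for the rewriting chain.

```python
from typing import Callable, Dict


def standalone_question(inputs: Dict, rewrite: Callable[[Dict], str]) -> str:
    """Rewrite the question only when there is chat history to resolve against."""
    if inputs.get("chat_history"):
        return rewrite(inputs)
    return inputs["question"]


def fake_rewrite(inputs: Dict) -> str:
    # Stub: a real rewriter would call the LLM with the history and the question.
    n = len(inputs["chat_history"])
    return f"{inputs['question']} [rewritten using {n} earlier messages]"


print(standalone_question({"question": "Can I combine it?", "chat_history": []}, fake_rewrite))
print(standalone_question(
    {"question": "Can I combine it?",
     "chat_history": [("human", "Tell me about Proxy"), ("ai", "Proxy is ...")]},
    fake_rewrite,
))
```

Skipping the rewrite on an empty history avoids one model call for the first question of a conversation.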

In [41]:
# What the whole result looks like
pprint.pprint(result)

{'chat_history': [HumanMessage(content='What are the pros and cons of the Proxy design pattern?', additional_kwargs={}, response_metadata={}),
                  AIMessage(content='Proxy Design Pattern – Quick-Reference Card  \n(what it is: a surrogate / placeholder that controls access to another object)\n\n-------------------------------------------------\nPROS – why you reach for it\n-------------------------------------------------\n1. Transparent protection  \n   Client code talks to the Proxy through the exact same interface it would use for the real subject → no changes in calling code.\n\n2. Lazy (virtual) instantiation  \n   Heavy or network objects are created only when the first real method is invoked → faster start-up and lower memory footprint.\n\n3. Security gatekeeper  \n   You can centralise permission checks in the Proxy instead of scattering them through business code.\n\n4. Remote communication stub (RMI, gRPC, SOAP, …)  \n   The Proxy serialises the call and hides al

## Few-shot prompting

The next cell creates a template that expects a variable called "input" and contains a prompt asking for the antonym and an explanation. It then builds a chain that fills the prompt with the input values and sends it to the model.

In [42]:
# Initial template
zero_shot_prompt = PromptTemplate(
    input_variables=['input'], validate_template=True,
    template="""Return the antonym of the input given along with an explanation.
    Input: {input}
    Output:
    Explanation:
    """
)

# Zero-shot chain (for now)
zero_shot_chain = zero_shot_prompt | llm

query = 'I am very sad but still have hope'
result = zero_shot_chain.invoke(input=query)
print(result.content)

Antonym:  
“I am overjoyed and utterly without hope.”

Explanation:  
The original sentence pairs deep sadness (“very sad”) with the retention of hope. Its opposite must reverse both emotional valence and outlook. “Overjoyed” is the direct affective opposite of “very sad,” while “utterly without hope” negates the residual optimism. The conjunction “and” replaces “but” because the two reversed states now align in the same negative-positive direction, forming a coherent antithetical state.


Now suppose there is a collection of antonym examples.

In [43]:
# Example tasks for producing antonyms
example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

The next cell sets up a few-shot flow that dynamically selects semantically similar examples and renders a prompt ready to send to the LLM. Embeddings (here from HuggingFace; OpenAI or others also work) convert the examples into vectors, and Chroma serves as the storage backend. The parameter k=2 means that the 2 nearest examples are returned per query.

When the FewShotPromptTemplate is formatted, the selector looks up the semantically closest examples and inserts them to produce the final prompt, which is then meant to be used with the LLM.

In the rendered prompt below, the same example appears twice. A likely cause is that re-running the cell adds the examples to the same Chroma collection again, so the two nearest neighbours are identical copies.

In [44]:
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # The list of examples available to select from.
    examples,
    # The embedding class used to produce embeddings which are used to measure semantic similarity.
    embeddings, #OpenAIEmbeddings(),
    Chroma, # The database to store the examples with their embeddings
    # The number of examples to produce.
    k=2,
)

similar_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector instead of examples.
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Return the antonym of the given input along with an explanation. \n\nExample(s):",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)

print(similar_prompt.format(adjective="cloudy"))


Return the antonym of the given input along with an explanation. 

Example(s):

Input: sunny
Output: gloomy

Input: sunny
Output: gloomy

Input: cloudy
Output:
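
The rendered prompt above repeats the same example, which wastes one of the k slots. The sketch below shows the selection idea in plain Python with duplicates removed before ranking by cosine similarity. It is a conceptual illustration, not the selector's actual implementation: `TOY_VECTORS` are hand-made 3-dimensional vectors invented for this example (not real embeddings), and `select_examples` and `render_few_shot` are hypothetical helpers.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]

# Hand-made toy vectors: (weather-ness, mood-ness, size-ness). Not real embeddings.
TOY_VECTORS: Dict[str, Vector] = {
    "happy": (0.0, 1.0, 0.0), "tall": (0.0, 0.0, 1.0), "energetic": (0.1, 0.9, 0.0),
    "sunny": (1.0, 0.2, 0.0), "windy": (1.0, 0.0, 0.1), "cloudy": (1.0, 0.1, 0.0),
}


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_examples(query: str, examples: List[Dict[str, str]],
                    embed: Callable[[str], Vector], k: int = 2) -> List[Dict[str, str]]:
    """Return the k distinct examples whose inputs are most similar to the query."""
    unique = list({(e["input"], e["output"]): e for e in examples}.values())
    q = embed(query)
    ranked = sorted(unique, key=lambda e: cosine(q, embed(e["input"])), reverse=True)
    return ranked[:k]


def render_few_shot(prefix: str, selected: List[Dict[str, str]], query: str) -> str:
    blocks = [prefix]
    blocks += [f"Input: {e['input']}\nOutput: {e['output']}" for e in selected]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


examples = [
    {"input": "happy", "output": "sad"}, {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"}, {"input": "windy", "output": "calm"},
]
# Simulate a store that received the same examples twice.
selected = select_examples("cloudy", examples + examples, TOY_VECTORS.__getitem__, k=2)
few_shot_text = render_few_shot("Return the antonym of the given input.", selected, "cloudy")
print(few_shot_text)
```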


All that remains is to use the prompt in a chain.

In [45]:
# Few-shot chain
few_shot_chain = similar_prompt | llm

query = 'rainy' # 'I am very sad but still have hope'
print(similar_prompt.format(adjective=query))
print()

result = few_shot_chain.invoke(input=query)
print(result.text())

Return the antonym of the given input along with an explanation. 

Example(s):

Input: sunny
Output: gloomy

Input: sunny
Output: gloomy

Input: rainy
Output:

Output: sunny  
Explanation: “Rainy” describes weather characterized by rain, while “sunny” describes weather with clear skies and sunshine—essentially the opposite condition.


Alternatively, the LLM can be asked to return a JSON object as its response (structured output) via Pydantic.

In [46]:
# Description of the output structure using Pydantic
class FormattedAntonym(BaseModel):
    antonym: str = Field(description="An antonym for the input word or phrase.", default="")
    explanation: str = Field(description="A short explanation of how the antonym was generated")
    additional_clarifications: List[str] = Field(description="A list of questions that the LLM needs to clarify in order to ...", default=[])


# Wrapper for the structured output
llm_with_structure = llm.with_structured_output(FormattedAntonym)

few_shot_chain1 = similar_prompt | llm_with_structure
result = few_shot_chain1.invoke(input=query)
result # The Pydantic (object) format

FormattedAntonym(antonym='sunny', explanation='The antonym of "rainy" is "sunny" because "rainy" describes weather characterized by rain and overcast skies, whereas "sunny" describes clear skies and sunshine, representing opposite weather conditions.', additional_clarifications=[])

In [47]:
result.antonym

'sunny'

In [48]:
# print(result.model_dump_json(indent=2))
pprint.pprint(result.model_dump()) # This is a dict

{'additional_clarifications': [],
 'antonym': 'sunny',
 'explanation': 'The antonym of "rainy" is "sunny" because "rainy" describes '
                'weather characterized by rain and overcast skies, whereas '
                '"sunny" describes clear skies and sunshine, representing '
                'opposite weather conditions.'}
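
To show what structured output amounts to, the sketch below parses a JSON string into a typed object with one required field and two defaulted ones, mirroring the shape of FormattedAntonym. It uses only the standard library as a stand-in for Pydantic; `AntonymResult` and `parse_antonym` are hypothetical names, and the JSON string is a made-up model reply.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class AntonymResult:
    explanation: str                       # required
    antonym: str = ""                      # optional, defaults to empty
    additional_clarifications: List[str] = field(default_factory=list)


def parse_antonym(raw: str) -> AntonymResult:
    """Parse a JSON reply and check the required field before building the object."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("explanation"), str):
        raise ValueError("missing required string field: explanation")
    return AntonymResult(
        explanation=data["explanation"],
        antonym=str(data.get("antonym", "")),
        additional_clarifications=list(data.get("additional_clarifications", [])),
    )


reply = '{"antonym": "sunny", "explanation": "Opposite weather conditions."}'
parsed = parse_antonym(reply)
print(parsed.antonym, parsed.additional_clarifications)
```

Pydantic adds type coercion and richer error messages on top of this idea, which is why the notebook uses it rather than hand-written checks.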


---