# Model IO
### LangChain Expression Language (LCEL)

---

Alejandro Ricciardi (Omegapy)  
created date: 01/22/2024   
[GitHub](https://github.com/Omegapy)  

Credit: [LangChain](https://python.langchain.com/docs/expression_language/)

<br>

--- 

 
Project Description:  
**LangChain** is a framework for developing applications powered by language models.  
**In this project:**  
I explore the Model I/O concept. The core element of any language model application is the model.  
LangChain gives you the building blocks to interface with any language model.
<br>
<img src="pic/model_io.jpg" width="1500">
<br>

[Conceptual Guide](https://python.langchain.com/docs/modules/model_io/concepts)  
A conceptual explanation of messages, prompts, LLMs vs ChatModels, and output parsers. You should read this before getting started.  
[Quickstart](https://python.langchain.com/docs/modules/model_io/quick_start)  
Covers the basics of getting started working with different types of models. You should walk through [this section](https://python.langchain.com/docs/modules/model_io/output_parsers/) if you want to get an overview of the functionality.  
[Prompts](https://python.langchain.com/docs/modules/model_io/prompts/)  
This section deep dives into the different types of prompt templates and how to use them.  
[LLMs](https://python.langchain.com/docs/modules/model_io/llms/)  
This section covers functionality related to the LLM class. This is a type of model that takes a text string as input and returns a text string.  
[ChatModels](https://python.langchain.com/docs/modules/model_io/chat/)  
This section covers functionality related to the ChatModel class. This is a type of model that takes a list of messages as input and returns a message.  
[Output Parsers](https://python.langchain.com/docs/modules/model_io/output_parsers/)  
Output parsers are responsible for transforming the output of LLMs and ChatModels into more structured data. This section covers the different types of output parsers.  

<p></p>
<b style="font-size:15;">
⚠️ This project requires an OpenAI key.
</b>


##### Project Map  
- [API Keys](#api-keys)  
- [Quickstart](#quickstart)
    - [Models](#models)
    - [Prompt Templates](#prompt-templates)
    - [Output parsers](#output-parsers)
    - [Composing with LCEL](#composing-with-lcel)
- [Concepts](#concepts)
     - [Models (concept)](#models-concept)
     - [Messages (concept)](#messages-concept)
     - [Prompts (concept)](#prompts-concept)
     - [Output Parsers (concept)](#output-parsers-concept)

<br>

---


#### API Keys

In [2]:
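import os
from dotenv import load_dotenv, find_dotenv

# Load variables from the nearest .env file into the process environment.
load_dotenv(find_dotenv())
# OpenAI() and ChatOpenAI() read the OPENAI_API_KEY environment variable by default.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")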

[Project Map](#project-map)

---

---
## Quickstart

The quick start will cover the basics of working with language models. It will introduce the two different types of models - LLMs and ChatModels. It will then cover how to use PromptTemplates to format the inputs to these models, and how to use Output Parsers to work with the outputs. For a deeper conceptual guide into these topics - please see this [documentation](https://python.langchain.com/docs/modules/model_io/concepts).

<br>

---

### Models
This getting-started guide uses OpenAI models.

##### OpenAI

Both ```llm``` and ```chat_model``` are objects that represent configuration for a particular model. You can initialize them with parameters like ```temperature``` and others, and pass them around. 
**The main difference between them is their input and output schemas.**
- ```LLM``` objects take a string as input and output a string.
- ```ChatModel``` objects take a list of messages as input and output a message.

In [6]:
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAI

llm = OpenAI()
chat_model = ChatOpenAI()

In [10]:
from langchain.schema import HumanMessage

text = "If my wife says that I am a lucky individual, am I a lucky individual?"
messages = [HumanMessage(content=text)]

In [11]:
llm.invoke(text)

'?\n\nIt is subjective and ultimately up to personal interpretation. Some may believe that being called "lucky" by one\'s spouse is a sign of good fortune, while others may not put much weight on the opinion of one person. Ultimately, the label of "lucky" is based on a person\'s own experiences and perspective.'

In [12]:
chat_model.invoke(messages)

AIMessage(content='Yes, if your wife says that you are a lucky individual, it means that she believes you have good fortune or that positive things continuously happen to you.')

[Project Map](#project-map)

---

### Prompt Templates

Most LLM applications do not pass user input directly into an LLM. Usually they will add the user input to a larger piece of text, called a prompt template, that provides additional context on the specific task at hand.

In the previous example, the text we passed to the model was a complete, hand-written question. For our application, it would be great if the user only had to provide a short input, such as the description of a company or product, without worrying about giving the model instructions.

PromptTemplates help with exactly this! They bundle up all the logic for going from user input into a fully formatted prompt. This can start off very simple:

In [13]:
from langchain.prompts import PromptTemplate

prompt = PromptTemplate.from_template("What is a good name for a company that makes {product}?")
prompt.format(product="colorful socks")

'What is a good name for a company that makes colorful socks?'

However, these have several advantages over raw string formatting. You can "partial" out variables, that is, format only some of the variables at a time. You can also compose templates, easily combining different templates into a single prompt. For details on these features, see the [section on prompts](https://python.langchain.com/docs/modules/model_io/prompts).

```PromptTemplates``` can also be used to produce a list of messages. In this case, the prompt contains information not only about the content but also about each message (its role, its position in the list, etc.). Most often, a ```ChatPromptTemplate``` is a list of ```ChatMessageTemplates```. Each ```ChatMessageTemplate``` contains instructions for how to format that ```ChatMessage```: its role and its content. Let's take a look at this below:

In [14]:
from langchain.prompts.chat import ChatPromptTemplate

template = "You are a helpful assistant that translates {input_language} to {output_language}."
human_template = "{text}"

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", human_template),
])

chat_prompt.format_messages(input_language="English", output_language="French", text="I love programming.")

[SystemMessage(content='You are a helpful assistant that translates English to French.'),
 HumanMessage(content='I love programming.')]

[Project Map](#project-map)

---

### Output parsers

```OutputParsers``` convert the raw output of a language model into a format that can be used downstream. There are a few main types of ```OutputParser```s, including:

- Convert text from ```LLM``` into structured information (e.g. JSON)
- Convert a ```ChatMessage``` into just a string
- Convert the extra information returned from a call besides the message (like OpenAI function invocation) into a string.

For full information on this, see the [section on output parsers](https://python.langchain.com/docs/modules/model_io/output_parsers).

In this getting started guide, we use a simple one that parses a list of comma separated values.

In [15]:
from langchain.output_parsers import CommaSeparatedListOutputParser

output_parser = CommaSeparatedListOutputParser()
output_parser.parse("hi, bye")

['hi', 'bye']

[Project Map](#project-map)

---

### Composing with LCEL
We can now combine all these into one chain. This chain will take input variables, pass those to a prompt template to create a prompt, pass the prompt to a language model, and then pass the output through an (optional) output parser. This is a convenient way to bundle up a modular piece of logic. Let's see it in action!

In [16]:
template = "Generate a list of 5 {text}.\n\n{format_instructions}"

chat_prompt = ChatPromptTemplate.from_template(template)
chat_prompt = chat_prompt.partial(format_instructions=output_parser.get_format_instructions())
chain = chat_prompt | chat_model | output_parser
chain.invoke({"text": "colors"})
# Output varies between runs, for example: ['red', 'blue', 'green', 'yellow', 'orange']

['red', 'blue', 'green', 'yellow', 'purple']
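
To make the ```|``` syntax less mysterious, here is a minimal plain-Python sketch of how piping steps together can work. It is an illustration only, not LangChain's implementation: the ```Step``` class, the step names, and the canned model reply are hypothetical stand-ins, chosen so the example runs without an API key.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a chain component: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Prompt step: fill the template from a dict of input variables.
prompt_step = Step(lambda variables: "Generate a list of 5 {text}.".format(**variables))
# Fake model step: returns a canned reply instead of calling an API.
model_step = Step(lambda prompt_text: "red, blue, green, yellow, purple")
# Parser step: split the comma-separated reply into a list of strings.
parser_step = Step(lambda reply: [item.strip() for item in reply.split(",")])

chain = prompt_step | model_step | parser_step
result = chain.invoke({"text": "colors"})
print(result)
```

In the real chain above, ```chat_prompt```, ```chat_model```, and ```output_parser``` play the roles of these three steps.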

[Project Map](#project-map)

---

---
## Concepts

[Concept](https://python.langchain.com/docs/modules/model_io/concepts)  
The core element of any language model application is the model. LangChain gives you the building
blocks to interface with any language model. Everything in this section is about making it easier to work
with models. This largely involves a clear interface for what a model is, helper utils for constructing
inputs to models, and helper utils for working with the outputs of models.

<br>

---

### Models (concept)

**LLMs**
LLMs in LangChain refer to pure text completion models. The APIs they wrap take a string
prompt as input and output a string completion. OpenAI's GPT-3 is implemented as an LLM.

**Chat Models**
Chat models are often backed by LLMs but tuned specifically for having conversations. Crucially,
their provider APIs use a different interface than pure text completion models. Instead of a
single string, they take a list of chat messages as input and they return an AI message as output.
See the section below for more details on what exactly a message consists of. GPT-4 and
Anthropic's Claude-2 are both implemented as chat models.

**Considerations**
These two API types have quite different input and output schemas. This means that the best way
to interact with them may be quite different. Although LangChain makes it possible to treat
them interchangeably, that doesn't mean you should. In particular, the prompting strategies for
LLMs vs ChatModels may be quite different. This means that you will want to make sure the
prompt you are using is designed for the model type you are working with.

Additionally, not all models are the same. Different models have different prompting strategies
that work best for them. For example, Anthropic's models work best with XML while OpenAI's
work best with JSON. This means that the prompt you use for one model may not transfer to
other ones. LangChain provides a lot of default prompts; however, these are not guaranteed to
work well with the model you are using. Historically speaking, most prompts work well with
OpenAI but are not heavily tested on other models. This is something the LangChain team is working to address,
but it is something you should keep in mind.

[Project Map](#project-map)

---

### Messages (concept)

ChatModels take a list of messages as input and return a message. There are a few different types of
messages. All messages have a role and a content property.
The role describes WHO is saying the message. LangChain has different message classes for different
roles.
The content property describes the content of the message.

This can be a few different things:
- A string (most models deal with this type of content)
- A list of dictionaries (this is used for multi-modal input, where the dictionary contains information about that input type and that input location)
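
As a rough illustration, the two content shapes can be written as plain Python data. The dictionary keys below follow the layout that OpenAI's vision-capable chat models accept; other providers may expect different keys, and the image URL is a placeholder. The helper ```content_kinds``` is a hypothetical name used only for this sketch.

```python
from typing import Dict, List, Union

# Plain-string content: the common case for text-only models.
text_content = "Describe the weather in one sentence."

# List-of-dictionaries content for multi-modal input: each dictionary
# names an input type and says where that input lives.
multimodal_content = [
    {"type": "text", "text": "What is shown in this image?"},
    {"type": "image_url", "image_url": {"url": "https://example.com/picture.jpg"}},
]


def content_kinds(content: Union[str, List[Dict]]) -> List[str]:
    """Return the input types present in a message's content."""
    if isinstance(content, str):
        return ["text"]
    return [part["type"] for part in content]


print(content_kinds(text_content))
print(content_kinds(multimodal_content))
```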

In addition, messages have an additional_kwargs property. This is where additional information about
messages can be passed. This is largely used for input parameters that are provider specific and not
general. The best known example of this is function_call from OpenAI.

**HumanMessage**
This represents a message from the user. Generally consists only of content.

**AIMessage**
This represents a message from the model. This may have additional_kwargs in it - for
example function_call if using OpenAI Function calling.

**SystemMessage**
This represents a system message. Only some models support this. This tells the model how to
behave. This generally only consists of content.

**FunctionMessage**
This represents the result of a function call. In addition to role and content, this message has
a name parameter which conveys the name of the function that was called to produce this
result.

**ToolMessage**
This represents the result of a tool call. This is distinct from a FunctionMessage in order to match
OpenAI's function and tool message types. In addition to role and content, this message has
a tool_call_id parameter which conveys the id of the call to the tool that was called to produce
this result.

[Project Map](#project-map)

---

### Prompts (concept)
The inputs to language models are often called prompts. Oftentimes, the user input from your app is not
the direct input to the model. Rather, their input is transformed in some way to produce the string or list
of messages that does go into the model. The objects that take user input and transform it into the final
string or messages are known as "Prompt Templates". LangChain provides several abstractions to make
working with prompts easier.

**PromptValue**
ChatModels and LLMs take different input types. PromptValue is a class designed to be
interoperable between the two. It exposes a method to be cast to a string (to work with LLMs)
and another to be cast to a list of messages (to work with ChatModels).
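
The two casts can be pictured with a small plain-Python sketch. LangChain's ```PromptValue``` exposes ```to_string()``` and ```to_messages()```; the ```SimplePromptValue``` class below is a hypothetical stand-in that mimics only that pair of methods, and it represents a message as a ```(role, content)``` tuple rather than a real message object.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SimplePromptValue:
    """Hypothetical stand-in for a prompt value built from a single string."""

    text: str

    def to_string(self) -> str:
        # LLMs take a plain string, so return the text unchanged.
        return self.text

    def to_messages(self) -> List[Tuple[str, str]]:
        # Chat models take a list of messages, so wrap the text in a single
        # human message, represented here as a (role, content) tuple.
        return [("human", self.text)]


value = SimplePromptValue("What is a good name for a company that makes colorful socks?")
print(value.to_string())
print(value.to_messages())
```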

**PromptTemplate**
This is an example of a prompt template. This consists of a template string. This string is then
formatted with user inputs to produce a final string.

**MessagePromptTemplate**
This is an example of a prompt template. This consists of a template message, meaning a
specific role and a PromptTemplate. This PromptTemplate is then formatted with user inputs to
produce a final string that becomes the content of this message.

**HumanMessagePromptTemplate**
This is a MessagePromptTemplate that produces a HumanMessage.

**AIMessagePromptTemplate**
This is a MessagePromptTemplate that produces an AIMessage.

**SystemMessagePromptTemplate**
This is a MessagePromptTemplate that produces a SystemMessage.

**MessagesPlaceholder**
Oftentimes inputs to prompts can be a list of messages. This is when you would use a
MessagesPlaceholder. These objects are parameterized by a variable_name argument. The input
with the same value as this variable_name value should be a list of messages.

**ChatPromptTemplate**
This is an example of a prompt template. This consists of a list of MessagePromptTemplates or
MessagesPlaceholders. These are then formatted with user inputs to produce a final list of
messages.
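
The way message templates and placeholders expand into a final list of messages can be sketched in plain Python. This is an illustration under simplifying assumptions, not LangChain's code: ```Placeholder``` and ```format_chat``` are hypothetical names, and messages are modeled as ```(role, content)``` tuples.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Message = Tuple[str, str]  # (role, content)


@dataclass
class Placeholder:
    """Hypothetical stand-in for a messages placeholder."""

    variable_name: str


def format_chat(template: List[Union[Message, Placeholder]], **inputs) -> List[Message]:
    """Expand message templates and placeholders into a flat list of messages."""
    messages: List[Message] = []
    for item in template:
        if isinstance(item, Placeholder):
            # The input named by variable_name must already be a list of messages.
            messages.extend(inputs[item.variable_name])
        else:
            role, text = item
            # str.format ignores inputs that the template string does not use.
            messages.append((role, text.format(**inputs)))
    return messages


template = [
    ("system", "You are a helpful assistant that translates {input_language} to {output_language}."),
    Placeholder(variable_name="history"),
    ("human", "{text}"),
]
history = [("human", "Hello"), ("ai", "Bonjour")]

messages = format_chat(
    template,
    input_language="English",
    output_language="French",
    history=history,
    text="I love programming.",
)
for message in messages:
    print(message)
```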

[Project Map](#project-map)

---

### Output Parsers (concept)

The output of a model is either a string or a message. Oftentimes, the string or message contains
information formatted in a specific format to be used downstream (e.g. a comma separated list or a JSON
blob). Output parsers are responsible for taking in the output of a model and transforming it into a more
usable form. These generally work on the content of the output message, but occasionally work on
values in the additional_kwargs field.

**StrOutputParser**
This is a simple output parser that just converts the output of a language model (LLM or
ChatModel) into a string. If the model is an LLM (and therefore outputs a string) it just passes
that string through. If the model is a ChatModel (and therefore outputs a message) it passes
through the .content attribute of the message.
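
The behavior described here fits in a few lines of plain Python. The sketch below is not the library's parser: ```FakeMessage``` and ```to_text``` are hypothetical names that only mirror the string pass-through and the ```.content``` lookup.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat model's output message."""

    content: str


def to_text(output: Union[str, FakeMessage]) -> str:
    """Return a string for either kind of model output."""
    if isinstance(output, str):
        # LLM output is already a string: pass it through unchanged.
        return output
    # Chat model output is a message: return its content attribute.
    return output.content


print(to_text("Paris"))
print(to_text(FakeMessage(content="Paris")))
```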

**OpenAI Functions Parsers**
There are a few parsers dedicated to working with OpenAI function calling. They take the output
of the function_call and arguments parameters (which are inside additional_kwargs) and work
with those, largely ignoring content.

**Agent Output Parsers**
[Agents](https://python.langchain.com/docs/modules/agents) are systems that use language models to determine what steps to take. The output of a
language model therefore needs to be parsed into some schema that can represent what actions
(if any) are to be taken.

[Project Map](#project-map)

---