### Build a Simple LLM Application with LCEL

In this quickstart we'll show you how to build a simple LLM application with LangChain. This application will translate text from English into another language. This is a relatively simple LLM application: it's just a single LLM call plus some prompting. Still, this is a great way to get started with LangChain, since a lot of features can be built with just some prompting and an LLM call!

After seeing this video, you'll have a high-level overview of:

- Using language models
- Using PromptTemplates and OutputParsers
- Using LangChain Expression Language (LCEL) to chain components together
- Debugging and tracing your application using LangSmith
- Deploying your application with LangServe

In [2]:
### OpenAI API key and open-source models (Llama3, Gemma2, Mistral) served through Groq

import os
from dotenv import load_dotenv
load_dotenv()

import openai
openai.api_key=os.getenv("OPENAI_API_KEY")

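# The company is spelled "Groq", and GROQ_API_KEY is its conventional variable name.
# Whatever name you use here must match the one in your .env file.
groq_api_key=os.getenv("GROQ_API_KEY")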
groq_api_key
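
The cell above ends by echoing `groq_api_key`, which prints the secret into the notebook output. Here is a minimal sketch of a safer check, using only the standard library: it reports whether a variable is set and shows just its last four characters. The helper name `describe_key` and the variable `DEMO_API_KEY` are made up for this illustration.

```python
import os


def describe_key(name: str) -> str:
    """Report whether an environment variable is set without revealing its value."""
    value = os.getenv(name)
    if not value:
        return f"{name}: missing"
    # Show only the last four characters so the key never lands in saved output
    return f"{name}: set (ends with ...{value[-4:]})"


# Demo with a throwaway variable (hypothetical name and value, not a real key)
os.environ["DEMO_API_KEY"] = "demo-secret-1234"
status_set = describe_key("DEMO_API_KEY")
del os.environ["DEMO_API_KEY"]
status_missing = describe_key("DEMO_API_KEY")

print(status_set)
print(status_missing)
```

In the notebook you would call the helper with the same variable name you pass to `os.getenv`.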

Groq is an AI infrastructure company that delivers fast AI inference. It hosts many models in its own cloud and serves them with its own inference technology, the LPU Inference Engine.

An LPU Inference Engine, with LPU standing for Language Processing Unit, is a hardware and software platform that delivers exceptional compute speed, quality, and energy efficiency. This new type of end-to-end processing unit system provides the fastest inference for computationally intensive applications with sequential components, such as AI language applications like LLMs.

The LPU is designed to overcome the two LLM bottlenecks: compute density and memory bandwidth. For LLM workloads, an LPU has greater compute capacity than a GPU or CPU. This reduces the time needed to compute each word, allowing sequences of text to be generated much faster. Additionally, eliminating external memory bottlenecks enables the LPU Inference Engine to deliver orders of magnitude better performance on LLMs compared to GPUs.
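
To make "time per word" concrete, we can turn the timing figures that a response reports into a throughput number. The values below are copied from the `response_metadata` of the translation call shown later in this notebook (`completion_tokens` and `completion_time`). This is plain arithmetic on one sample response, not a benchmark.

```python
# Figures copied from the response_metadata of the translation call in this notebook
completion_tokens = 125        # tokens generated by the model
completion_time = 0.227272727  # seconds spent generating them

tokens_per_second = completion_tokens / completion_time
ms_per_token = 1000 * completion_time / completion_tokens

print(f"{tokens_per_second:.0f} tokens/s, {ms_per_token:.2f} ms per token")
```

For this single response that works out to roughly 550 tokens per second, or about 1.8 ms per generated token.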

In [3]:
!pip install langchain_groq



This is essentially what LangChain does: it aims to integrate with every platform and every LLM out there. The idea is that the big companies can keep competing and releasing new models, while LangChain provides a wrapper that makes it possible to integrate whichever model comes along.

In [4]:
from langchain_openai import ChatOpenAI
from langchain_groq import ChatGroq
model = ChatGroq(model="gemma2-9b-it",groq_api_key=groq_api_key)
model



ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x000001C14040B5E0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000001C140409150>, model_name='gemma2-9b-it', model_kwargs={})

Building an LLM application basically means that when I send a query to the model, the model should return a response. To do that in LangChain, you need to understand a few key building blocks, which LangChain calls Runnables.

In [5]:
!pip install langchain_core



In [6]:
from langchain_core.messages import HumanMessage,SystemMessage
# Whenever we send messages to the LLM, we should specify which message comes from the human and which message is an instruction to the model (the system message).

messages = [
    SystemMessage(content= "Translate the following from English to French"),
    HumanMessage(content="Hello How are you?")
]
result = model.invoke(messages)


In [7]:
result

AIMessage(content='Here are a few ways to say "Hello, how are you?" in French:\n\n**Formal:**\n\n* **Bonjour, comment allez-vous ?** (This is the most formal option and is used with people you don\'t know well, people in authority, or in professional settings.)\n\n**Informal:**\n\n* **Salut, comment vas-tu ?** (This is used with friends and family.)\n\n* **Coucou, ça va ?** (This is a very informal and friendly way to say hello.) \n\n\nLet me know if you\'d like to learn more French greetings!\n', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 125, 'prompt_tokens': 21, 'total_tokens': 146, 'completion_time': 0.227272727, 'prompt_time': 0.001354339, 'queue_time': 0.274117741, 'total_time': 0.228627066}, 'model_name': 'gemma2-9b-it', 'system_fingerprint': 'fp_10c08bf97d', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None}, id='run--af41963f-1151-4560-a75f-0eb4f9152f81-0', usage_metadata={'input_tokens': 21, 'output_tokens': 125, 'total_

The result above is an AIMessage, but I only want to retrieve its text content. For that, langchain_core provides output_parsers. An output parser is responsible for extracting the text from the response returned by the LLM. We can also write a custom output parser if we want.

In [8]:
from langchain_core.output_parsers import StrOutputParser
parser = StrOutputParser()
parser.invoke(result)

'Here are a few ways to say "Hello, how are you?" in French:\n\n**Formal:**\n\n* **Bonjour, comment allez-vous ?** (This is the most formal option and is used with people you don\'t know well, people in authority, or in professional settings.)\n\n**Informal:**\n\n* **Salut, comment vas-tu ?** (This is used with friends and family.)\n\n* **Coucou, ça va ?** (This is a very informal and friendly way to say hello.) \n\n\nLet me know if you\'d like to learn more French greetings!\n'
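
As a rough idea of what a custom parser could look like, here is a sketch that goes one step beyond returning `content`: it also strips the Markdown bold markers and the extra blank lines seen in the output above. `FakeAIMessage` is a stand-in defined here so the example runs without any API call, and `plain_text_parser` is a hypothetical name. As far as I know, a plain function like this can be placed after the model with `|`, because LangChain wraps callables used in a chain, but treat that as something to verify against your installed version.

```python
import re
from dataclasses import dataclass


@dataclass
class FakeAIMessage:
    """Stand-in for an AIMessage: only the .content attribute is needed here."""
    content: str


def plain_text_parser(message) -> str:
    """Return message.content without Markdown bold markers or runs of blank lines."""
    text = message.content.replace("**", "")
    # Collapse three or more consecutive newlines into a single blank line
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


sample = FakeAIMessage(
    content="**Formal:**\n\n* **Bonjour, comment allez-vous ?**\n\n\nLet me know!\n"
)
cleaned = plain_text_parser(sample)
print(cleaned)
```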

In [9]:
### Using LCEL we can chain the components
chain = model|parser
chain.invoke(messages)

"Hello - Bonjour \n\nHow are you? - Comment allez-vous ? (formal) or Ça va ? (informal) \n\n\nLet me know if you'd like to practice more translations! \n"

Right now we pass the full list of messages on every call. A more convenient technique is to use prompt templates.

In [10]:
# Prompt Templates
# Instead of passing a fixed list of messages, a prompt template combines user input with application logic: the instruction goes in the system message, and the user's input is filled in at run time.
from langchain_core.prompts import ChatPromptTemplate 

# First we should define the generic template
generic_template="Translate the following into {language}:"

# Then build the chat prompt from the generic template
prompt=ChatPromptTemplate.from_messages(
    [("system",generic_template),("user","{text}")]
    # {language} and {text} are supplied at run time
)

In [11]:
result=prompt.invoke({"language":"French","text":"Hello"})

In [12]:
result.to_messages()

[SystemMessage(content='Translate the following into French:', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Hello', additional_kwargs={}, response_metadata={})]
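
To see what the template step is doing conceptually, here is a standard-library sketch that fills the same two placeholders with `str.format`. It is only an analogy for the substitution shown above, not how `ChatPromptTemplate` is implemented, and `fill_messages` is a hypothetical helper name.

```python
def fill_messages(templates, **values):
    """Fill each (role, template) pair and return (role, content) pairs."""
    return [(role, template.format(**values)) for role, template in templates]


templates = [
    ("system", "Translate the following into {language}:"),
    ("user", "{text}"),
]

filled = fill_messages(templates, language="French", text="Hello")
print(filled)

# Leaving out a placeholder fails loudly instead of sending a half-filled prompt
try:
    fill_messages(templates, language="French")
    missing = None
except KeyError as error:
    missing = error.args[0]
print("missing variable:", missing)
```

The filled pairs line up with the `SystemMessage` and `HumanMessage` returned by `to_messages()` above.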

In [14]:
chain=prompt|model|parser
chain.invoke({"language":"French","text":"Hello"})

'Bonjour \n'
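
If the `|` syntax looks like magic, the toy below shows the idea behind it: each step exposes `invoke`, and `a | b` builds a new step that feeds the output of `a` into `b`. This is a deliberately simplified sketch with made-up names (`Step`, `toy_prompt`, `toy_model`, `toy_parser`). It does not call any model and is not LangChain's actual Runnable implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Step:
    """Toy stand-in for a Runnable: wraps a function and exposes invoke()."""
    func: Callable[[Any], Any]

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b returns a new step that runs a first, then passes its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three fake stages mirroring prompt | model | parser
toy_prompt = Step(lambda v: f"Translate the following into {v['language']}: {v['text']}")
toy_model = Step(lambda prompt_text: {"content": f"<reply to: {prompt_text}>"})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_model | toy_parser
result = toy_chain.invoke({"language": "French", "text": "Hello"})
print(result)
```

The point of the design is that every stage shares the same `invoke` interface, so stages can be swapped or reordered without touching the others.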

LangServe helps developers deploy LangChain runnables and chains as a REST API. The library is integrated with FastAPI and uses Pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
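
To give a feel for what "a chain behind a REST API, plus a client" means, here is a miniature version built only from the standard library. It is not LangServe. The `/chain/invoke` path and the `{"input": ...}` / `{"output": ...}` JSON shape are modeled loosely on how I understand LangServe's invoke endpoint and should be treated as assumptions, and `toy_chain` is a stand-in that never calls a model.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def toy_chain(inputs: dict) -> str:
    """Hypothetical stand-in for chain.invoke(); no model is called."""
    return f"[{inputs['language']}] {inputs['text']}"


class InvokeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/chain/invoke":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"output": toy_chain(payload["input"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def call_invoke(port: int, inputs: dict) -> str:
    """Tiny client: POST the inputs as JSON and return the 'output' field."""
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    connection.request(
        "POST",
        "/chain/invoke",
        body=json.dumps({"input": inputs}),
        headers={"Content-Type": "application/json"},
    )
    data = json.loads(connection.getresponse().read())
    connection.close()
    return data["output"]


# Port 0 lets the operating system pick a free port
server = HTTPServer(("127.0.0.1", 0), InvokeHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    reply = call_invoke(server.server_port, {"language": "French", "text": "Hello"})
finally:
    server.shutdown()
    server.server_close()
print(reply)
```

With LangServe the server side is handled for you, and FastAPI plus Pydantic take care of routing and validating the request body.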