# [Groq](https://groq.com/)

- [Groq API Key](https://console.groq.com/keys)
- [Groq models](https://console.groq.com/docs/models)
- [ChatGroq](https://api.python.langchain.com/en/latest/chat_models/langchain_groq.chat_models.ChatGroq.html)

### What is Groq?

Groq is a technology company that specializes in designing and developing high-performance hardware accelerators for artificial intelligence (AI) and machine learning (ML) workloads. The company's flagship product is the GroqChip, which is designed to provide unparalleled performance, scalability, and efficiency for AI and ML tasks, particularly for large language models (LLMs) and generative AI (GenAI) applications.

### What is the LPU™ Inference Engine?

The LPU™ (Language Processing Unit) Inference Engine is a core component of Groq's hardware architecture, built on the design Groq originally described as the Tensor Streaming Processor. It is designed to execute AI and ML inference tasks with high efficiency and speed. Unlike traditional GPU architectures, which rely on many cores performing computations in parallel, the LPU™ Inference Engine uses a deterministic, dataflow-oriented approach to achieve higher performance.

### Why is it so much faster than GPUs for LLMs and GenAI?

The speed advantages of the LPU™ Inference Engine over traditional GPUs for LLMs and GenAI applications can be attributed to several key factors:

1. **Dataflow Architecture**: Groq's architecture is designed to optimize data flow through the processor, reducing latency and maximizing throughput. This is particularly beneficial for the large and complex computations required by LLMs and GenAI.
2. **Predictable Execution**: The GroqChip provides deterministic execution, which means that operations happen in a predictable manner, eliminating the overhead associated with managing multiple parallel threads.
3. **High Utilization**: Groq's architecture ensures that the processing elements are highly utilized, reducing idle time and improving overall efficiency.
4. **Optimized for Inference**: While GPUs are general-purpose and can handle both training and inference, the GroqChip is specifically optimized for inference tasks, allowing it to achieve higher performance in this area.

In [1]:
!pip install python-dotenv



In [2]:
!pip install langchain-groq


In [3]:
!pip install langchain_core



In [4]:
import os
from dotenv import load_dotenv
load_dotenv()
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("GROQ_API_KEY")

In [5]:
from langchain_groq import ChatGroq

In [6]:
llm = ChatGroq(model = "gemma2-9b-it", groq_api_key = secret_value_0)

In [7]:
llm

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x7d99f0268760>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x7d99f02693f0>, model_name='gemma2-9b-it', groq_api_key=SecretStr('**********'))

In [8]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content = "Translate the Following from English to Hindi"),
    HumanMessage(content = "Hello, How are you?")
]

llm.invoke(messages).content

'नमस्ते, आप कैसे हैं? (Namaste, aap kaise hain?) \n\n\nThis translates to "Hello, how are you?" in Hindi. \n'

# LCEL

LangChain Expression Language (LCEL) is a declarative way to compose chains in the LangChain framework. It is not a separate language with its own syntax: it is written in ordinary Python, where components such as prompt templates, language models, and output parsers share a common `Runnable` interface and are combined with the `|` operator. LangChain itself is a library for connecting such components, along with tools and APIs, into applications involving natural language processing and understanding.

LCEL allows users to define chains in a more readable and concise manner, making it easier to specify the sequence of operations and transformations applied to the data. This can include tasks like querying language models, processing text, integrating with external APIs, and more.

### Key Features of LCEL

1. **Readable Syntax**: LCEL provides a syntax that is easy to read and write, reducing the complexity of creating chains.
2. **Modularity**: It promotes the modular design of components, allowing users to reuse and combine different parts of their chains.
3. **Integration**: LCEL seamlessly integrates with LangChain's existing components and tools, making it easier to leverage the full power of the framework.
4. **Flexibility**: Users can define complex logic and transformations using LCEL, enabling the creation of sophisticated workflows and applications.

### Example Usage

Here's a simple example to illustrate how LCEL is used. It assumes the `llm` object created above:

```python
# Define a chain that queries a language model and processes the response
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Step 1: Turn the input into a prompt for the language model
qa_prompt = ChatPromptTemplate.from_template("{question}")

# Step 2: Query the language model
# Step 3: Extract the answer text from the response
qa_chain = qa_prompt | llm | StrOutputParser()

# Output the answer
print(qa_chain.invoke({"question": "What is the capital of France?"}))
```

In this example, LCEL is used to define a chain with three steps: building a prompt, querying a language model, and extracting the answer text from the response. Each `|` passes the output of one step to the next, which keeps the chain easy to understand and maintain.

### Benefits

- **Ease of Use**: LCEL reduces the learning curve for new users of the LangChain framework.
- **Efficiency**: By providing a clear and concise way to define chains, LCEL helps users save time and effort.
- **Maintainability**: The modular and readable nature of LCEL makes it easier to maintain and update chains over time.

LCEL is a powerful tool within the LangChain ecosystem, enhancing the usability and flexibility of the framework for creating advanced natural language processing applications.
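
In Python, LCEL chains are composed with the `|` operator (as in `prompt | llm`). The following is a minimal, stdlib-only sketch of how such an operator can chain steps. The `Step` class and the three lambda steps are hypothetical illustrations, not LangChain's `Runnable` implementation; they only show the idea that `a | b` builds a new step that feeds the output of `a` into `b`.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # `self | other` returns a new step that runs self, then other.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three hypothetical steps: build a prompt, fake a model call, parse the reply.
build_prompt = Step(lambda question: f"Answer briefly: {question}")
fake_model = Step(lambda prompt: {"content": f"(model reply to: {prompt})"})
parse_output = Step(lambda reply: reply["content"])

toy_chain = build_prompt | fake_model | parse_output
result = toy_chain.invoke("What is the capital of France?")
print(result)
```

Because every step exposes the same `invoke` method, any step can be swapped out or reused in another chain without touching the others.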

# Building Chatbot With Message History Using LangChain

In [9]:
import os
from langchain_groq import ChatGroq
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

llm = ChatGroq(model = "llama-3.1-8b-instant", groq_api_key = secret_value_0)

In [10]:
!pip install langchain_community


1. **`langchain_community.chat_message_histories.ChatMessageHistory`**:
   - **Purpose**: This class manages the history of chat messages. 
   - **Functionality**: It allows for storing, retrieving, and managing a sequence of chat messages in a conversational AI or chatbot application. This history can be used to maintain context in ongoing conversations, enabling the chatbot to provide more relevant and coherent responses based on past interactions.

2. **`langchain_core.chat_history.BaseChatMessageHistory`**:
   - **Purpose**: This is a base or abstract class for chat message history.
   - **Functionality**: It provides a foundational structure for implementing chat message history. Being a base class, it defines the basic interface and functionalities that any specific chat message history class should implement. This could include methods for adding messages, retrieving past messages, and possibly persisting the chat history to a storage medium.

3. **`langchain_core.runnables.history.RunnableWithMessageHistory`**:
   - **Purpose**: This class wraps another runnable (such as a chat model or a chain) and manages its message history.
   - **Functionality**: Before each call, it loads the history for the session named in the config, using a function such as `get_session_history`, and passes those messages to the wrapped runnable together with the new input. After the call, it appends the new input and the response to that history. This lets the chatbot take previous turns into account without the caller managing the history by hand.

In [11]:
# Importing library
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

# Dictionary to store chat histories for different sessions
store = {}

# Function to get the chat history for a specific session
def get_session_history(session_id: str) -> BaseChatMessageHistory:
    # If the session_id is not already in the store, create a new ChatMessageHistory
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    # Return the chat history for the given session_id
    return store[session_id]

with_message_history = RunnableWithMessageHistory(llm, get_session_history)
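
Conceptually, a history-aware wrapper does three things on every call: look up the session's history, call the model with the history plus the new message, and save both sides of the exchange. The sketch below shows that flow with the standard library only. The names `toy_store`, `invoke_with_history`, and `echo_model` are hypothetical, and messages are simplified to `(role, text)` tuples; this is an illustration of the idea, not how `RunnableWithMessageHistory` is implemented.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text); a simplified stand-in for message objects

toy_store: Dict[str, List[Message]] = {}


def get_toy_history(session_id: str) -> List[Message]:
    """Return the message list for a session, creating it on first use."""
    return toy_store.setdefault(session_id, [])


def invoke_with_history(
    model: Callable[[List[Message]], str], user_text: str, session_id: str
) -> str:
    history = get_toy_history(session_id)
    new_message = ("human", user_text)
    # The model sees the stored history plus the new message.
    reply = model(history + [new_message])
    # Both sides of the exchange are saved for the next call.
    history.extend([new_message, ("ai", reply)])
    return reply


def echo_model(messages: List[Message]) -> str:
    """Hypothetical model: reports how many messages it was shown."""
    return f"I can see {len(messages)} message(s)."


first = invoke_with_history(echo_model, "Hello, my name is Aman", "chat1")
second = invoke_with_history(echo_model, "What is my name?", "chat1")
other = invoke_with_history(echo_model, "What is my name?", "chat2")
print(first, second, other, sep="\n")
```

The second call in `chat1` sees three messages (the earlier exchange plus the new question), while the same question in `chat2` sees only one, which is why a new session id starts with no memory of earlier turns.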

In [12]:
config = {"configurable":{"session_id":"chat1"}}

In [13]:
response = with_message_history.invoke(
    [HumanMessage(content = "Hello my name is Aman and i am a Student")],
    config = config
)

In [14]:
response.content

'Hello Aman! Nice to meet you! What brings you here? Are you looking for help with your studies, or just chatting to pass the time?'

It remembers my name because I used the same config (session).

In [15]:
with_message_history.invoke(
    [HumanMessage(content = "What is my name?")],
    config = config
)

AIMessage(content='Your name is Aman. You told me that earlier!', response_metadata={'token_usage': {'completion_tokens': 13, 'prompt_tokens': 67, 'total_tokens': 80, 'completion_time': 0.017333333, 'prompt_time': 0.021290091, 'queue_time': None, 'total_time': 0.038623424}, 'model_name': 'llama-3.1-8b-instant', 'system_fingerprint': 'fp_9cb648b966', 'finish_reason': 'stop', 'logprobs': None}, id='run-951a74a2-f392-4a4b-8074-88f4e7b2dac3-0', usage_metadata={'input_tokens': 67, 'output_tokens': 13, 'total_tokens': 80})

After changing the config (session), the LLM doesn't know my name.

In [16]:
# Changing the config
config = {"configurable":{"session_id":"chat2"}}

response = with_message_history.invoke(
    [HumanMessage(content = "What is my name?")],
    config = config
)
response.content

"I'm happy to chat with you, but I don't actually know your name. This is the beginning of our conversation, and I don't have any prior information about you. Would you like to introduce yourself?"

# Prompt Templates

Prompt templates turn raw user input into a format that the LLM can work with. In this case the raw user input is just a message, which we pass to the LLM. Let's now make that a bit more complicated.
First, let's add a system message with some custom instructions (while still taking messages as input). Next, we'll add more inputs besides just the messages.

We use `ChatPromptTemplate` for several important reasons:

1. **Standardizing Inputs**: It helps convert raw user inputs into a format that the language model can work with effectively. This ensures consistency in how information is presented to the model.

2. **Adding Context and Instructions**: It allows us to include system messages or custom instructions along with user messages. This helps guide the language model on how to respond appropriately.

3. **Flexibility and Reusability**: `ChatPromptTemplate` makes it easier to create and reuse prompt structures across different scenarios, saving time and ensuring that best practices are followed.

4. **Simplifying Complex Interactions**: When dealing with complex interactions or multiple variables (like different types of messages or additional parameters), `ChatPromptTemplate` helps manage these elements in a clear and organized way.

5. **Error Handling and Debugging**: By using a structured template, it's easier to identify and fix errors in how prompts are constructed and processed, leading to more reliable and accurate interactions with the language model.

In summary, `ChatPromptTemplate` helps in creating clear, structured, and effective prompts that improve the quality of interactions with the language model.

We use `MessagesPlaceholder` for a few key reasons:

1. **Reserving a Slot for Messages**: It marks the position in the prompt where a list of messages, passed in at invocation time under the given `variable_name`, is inserted.

2. **Combining with Instructions**: Fixed instructions or context, like a system message telling the model it's a helpful assistant, can sit alongside the placeholder in the same template.

3. **Handling Different Messages**: The inserted list can mix different types of messages (like system, human, and AI messages), so a whole conversation history fits in one slot.

4. **Managing Multiple Inputs**: It allows us to handle additional information (like specifying a language) along with the user messages.

Overall, `MessagesPlaceholder` makes sure the messages end up in the right place and in the right format for the language model to process.
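
To make the role of the placeholder concrete, here is a small stdlib-only sketch of how a template with fixed messages and one message slot could be expanded into the final message list. The `Placeholder` class and `format_prompt` function are hypothetical stand-ins written for this illustration, with messages simplified to `(role, text)` tuples; they are not LangChain's classes.

```python
from typing import Any, Dict, List, Tuple, Union

Message = Tuple[str, str]  # (role, text)


class Placeholder:
    """Marks the slot where a caller-supplied list of messages is inserted."""

    def __init__(self, variable_name: str) -> None:
        self.variable_name = variable_name


def format_prompt(
    template: List[Union[Message, Placeholder]], values: Dict[str, Any]
) -> List[Message]:
    # Only string inputs are used to fill {variables} in fixed messages.
    scalars = {k: v for k, v in values.items() if isinstance(v, str)}
    formatted: List[Message] = []
    for item in template:
        if isinstance(item, Placeholder):
            # Splice in the whole message list passed under this key.
            formatted.extend(values[item.variable_name])
        else:
            role, text = item
            formatted.append((role, text.format(**scalars)))
    return formatted


toy_template = [
    ("system", "You are a helpful assistant. Answer in {language}."),
    Placeholder("messages"),
]
formatted = format_prompt(
    toy_template,
    {"messages": [("human", "Hello, my name is Esha")], "language": "Hindi"},
)
print(formatted)
```

The fixed system message is filled from the `language` input, while the list under `messages` is inserted unchanged at the placeholder's position.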

In [17]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a Helpful Assistant. Answer the Questions to the Best of your Ability"),
        MessagesPlaceholder(variable_name="messages")
    ]
)

chain = prompt | llm

In [18]:
prompt

ChatPromptTemplate(input_variables=['messages'], input_types={'messages': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a Helpful Assistant. Answer the Questions to the Best of your Ability')), MessagesPlaceholder(variable_name='messages')])

In [19]:
response = chain.invoke({"messages":[HumanMessage(content = "Hi My name is Ramesh")]})
response.content

"Hello Ramesh! It's nice to meet you. How can I assist you today?"

In [20]:
with_message_history = RunnableWithMessageHistory(chain, get_session_history)

In [21]:
config = {"configurable": {"session_id": "chat3"}}
response = with_message_history.invoke([HumanMessage(content = "Hi my name is Aadam")],
                                      config=config
                                      )
response.content

"Nice to meet you, Aadam! It's great to have you here. Is there something I can help you with today?"

In [22]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a Helpful Assistant. Answer the Questions to the Best of your Ability in {language}"),
        MessagesPlaceholder(variable_name="messages")
    ]
)

chain = prompt | llm

In [23]:
response = chain.invoke({"messages": [HumanMessage(content = "Hello my name is Esha")], "language":"Hindi"})
response.content

'नमस्ते! मैं आपका मददगार सहायक हूँ। आपका नाम ईशा है, ना? क्या मैं आपकी किसी भी चीज़ में मदद कर सकता हूँ?'

Let's now wrap this more complex chain in a message history class. This time, because there are multiple keys in the input, we need to specify which key holds the messages to save in the chat history.

In [24]:
with_message_history = RunnableWithMessageHistory(chain, 
                                                  get_session_history,
                                                 input_messages_key = "messages"
                                                 )

In [25]:
config = {"configurable": {"session_id": "chat4"}}

response = with_message_history.invoke({"messages":[HumanMessage(content = "Hello, My name is Raftaar")],
                                       "language":"Marathi"},
                                      config=config
                                      )
response.content

'नमस्कार राफ्तार! तुमचे मी मदतीला असणार्\u200dया सहाय्यक म्हणून येऊन होतो. काय गोष्टी असू शकतात ज्यामध्ये मी तुम्हाला मदत करू शकतो?'

In [26]:
response = with_message_history.invoke({"messages":[HumanMessage(content = "What is my name?")],
                                       "language":"Marathi"},
                                      config=config
                                      )
response.content

'तुझे नाव राफ्तार आहे!'

# Managing the Conversation History

One important concept to understand when building chatbots is how to manage conversation history. If left unmanaged, the list of messages will grow unbounded and potentially overflow the context window of the LLM. Therefore, it is important to add a step that limits the size of the messages that you are passing in.

All models have finite context windows, meaning there's a limit to how many tokens they can take as input. If you have very long messages or a chain/agent that accumulates a long message history, you'll need to manage the length of the messages you're passing in to the model.

The `trim_messages` util provides some basic strategies for trimming a list of messages to be of a certain token length.

[How to trim messages](https://python.langchain.com/v0.2/docs/how_to/trim_messages/#:~:text=If%20you%20have%20very%20long,of%20a%20certain%20token%20length.)
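
The idea behind keeping only the most recent messages can be sketched without any library. The example below is a simplified illustration under stated assumptions: messages are `(role, text)` tuples, `count_tokens` is a hypothetical counter that treats each whitespace-separated word as one token, and `trim_last` is a toy function, not the `trim_messages` implementation.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, text)


def count_tokens(message: Message) -> int:
    """Hypothetical counter: one token per whitespace-separated word."""
    return len(message[1].split())


def trim_last(messages: List[Message], max_tokens: int) -> List[Message]:
    """Keep the system message (if first) plus the newest messages that fit."""
    system = [messages[0]] if messages and messages[0][0] == "system" else []
    rest = messages[len(system):]
    budget = max_tokens - sum(count_tokens(m) for m in system)

    kept: List[Message] = []
    # Walk from newest to oldest, stopping when the budget runs out.
    for message in reversed(rest):
        cost = count_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    kept.reverse()

    # Drop leading non-human messages so the history starts on a human turn.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


toy_messages = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
]
trimmed = trim_last(toy_messages, max_tokens=12)
print(trimmed)
```

With a 12-token budget, the system message uses 4 tokens, the two newest turns fit in the remainder, and the older ice cream exchange is dropped. A real token counter would give different numbers, so the cut-off point depends on the model's tokenizer.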

In [36]:
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, trim_messages

# Initialize the trimmer with specific parameters
trimmer = trim_messages(
    max_tokens=70,               # The maximum number of tokens allowed after trimming.
    strategy="last",            # Keep the most recent messages; older messages are dropped first until the token limit is met.
    token_counter=llm,          # The object or function used to count tokens; here the chat model itself counts them.
    include_system=True,        # Keep the initial system message (the instructions) even when older messages are dropped.
    allow_partial=True,         # Whether part of a message may be kept when the whole message does not fit.
    start_on="human",           # Ensure the trimmed history starts on a "human" message.
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]
trimmer.invoke(messages)

[SystemMessage(content="you're a good assistant"),
 HumanMessage(content="hi! I'm bob"),
 AIMessage(content='hi!'),
 HumanMessage(content='I like vanilla ice cream'),
 AIMessage(content='nice'),
 HumanMessage(content='whats 2 + 2'),
 AIMessage(content='4'),
 HumanMessage(content='thanks'),
 AIMessage(content='no problem!'),
 HumanMessage(content='having fun?'),
 AIMessage(content='yes!')]

`itemgetter` is a utility function that creates a callable object (essentially a function) that retrieves specific items from an indexable object, such as a list, tuple, or dictionary. It's often used in sorting operations, particularly when you need to sort a list of tuples or dictionaries by one or more keys. In the chain below, `itemgetter("messages")` pulls the `messages` value out of the input dictionary.

`RunnablePassthrough` in LangChain is like a "bypass" in a pipeline. It lets data flow through without changing it. Think of it as a step in a process that doesn't do anything but just passes the data along to the next step. Its `assign` method passes the input dictionary through while adding or replacing the given keys; below, it replaces `messages` with the trimmed list and leaves the other keys, such as `language`, untouched.
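
The two ideas can be tried in plain Python. `itemgetter` below is the real function from the standard library's `operator` module; the `assign` helper and `keep_last_two` are hypothetical stand-ins written for this sketch to mimic a pass-through step that replaces one key, and are not LangChain code.

```python
from operator import itemgetter
from typing import Any, Callable, Dict, List

get_messages = itemgetter("messages")
payload = {"messages": ["hi", "how are you?", "bye"], "language": "English"}
picked = get_messages(payload)  # same as payload["messages"]


def assign(
    **updates: Callable[[Dict[str, Any]], Any]
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Toy assign step: copy the input, then add or replace the given keys."""

    def run(data: Dict[str, Any]) -> Dict[str, Any]:
        return {**data, **{key: func(data) for key, func in updates.items()}}

    return run


def keep_last_two(data: Dict[str, Any]) -> List[str]:
    """Hypothetical trimming step: keep only the two newest messages."""
    return get_messages(data)[-2:]


step = assign(messages=keep_last_two)
updated = step(payload)
print(picked)
print(updated)
```

The `language` key passes through untouched while `messages` is replaced, and the original `payload` dictionary is not modified.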

In [44]:
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough

chain=(
    RunnablePassthrough.assign(messages=itemgetter("messages")|trimmer)
    | prompt
    | llm
    
)

response=chain.invoke(
    {
    "messages":messages + [HumanMessage(content="What ice cream do i like")],
    "language":"English"
    }
)
response.content

'You like vanilla ice cream!'

In [42]:
response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what math problem did i ask")],
        "language": "English",
    }
)
response.content

'You asked me "whats 2 + 2"'

In [45]:
# Let's wrap this in the message history

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)
config={"configurable":{"session_id":"chat5"}}

In [46]:
response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="you are a?")],
        "language": "English",
    },
    config=config,
)

response.content

"I'm a Helpful Assistant!"