# Exploring LLMs and ChatModels for LLM Input / Output with LangChain

## Install OpenAI, HuggingFace and LangChain dependencies

In [3]:
!pip install -qq langchain==0.3.11
!pip install -qq langchain-openai==0.2.12
!pip install -qq langchain-community==0.3.11
!pip install -qq huggingface_hub==0.30.2
!pip install -qq langchain-core==0.3.63 

In [5]:
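# Don't run if you want to use only ChatGPT
# This is for accessing open LLMs from Hugging Face
!pip install -qq transformers==4.46.3
# Provides the `langchain_huggingface` package imported later in this notebook
!pip install -qq langchain-huggingface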

In [10]:
!pip install -qq langchain_google_genai

## Enter API Tokens

In [6]:
import os 
from dotenv import load_dotenv

load_dotenv()

True
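The provider clients used below read their credentials from environment variables. As a minimal sketch, assuming the variable names `OPENAI_API_KEY`, `GOOGLE_API_KEY` and `HUGGINGFACEHUB_API_TOKEN` (adjust them to whatever your `.env` file defines), a hypothetical helper such as `find_missing_keys` can report which ones are unset before any model call is made:

```python
import os

# Assumed variable names; change them to match your own .env file.
REQUIRED_KEYS = ["OPENAI_API_KEY", "GOOGLE_API_KEY", "HUGGINGFACEHUB_API_TOKEN"]


def find_missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = find_missing_keys(REQUIRED_KEYS)
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All API keys found.")
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment.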

# Model I/O

In LangChain, the central part of any application is the language model. This module provides crucial tools for working effectively with any language model, ensuring it integrates smoothly and communicates well.

### Key Components of Model I/O

**LLMs and Chat Models (used interchangeably):**
- **LLMs:**
  - **Definition:** Pure text completion models.
  - **Input/Output:** Receives a text string and returns a text string.
- **Chat Models:**
  - **Definition:** Based on a language model but with different input and output types.
  - **Input/Output:** Takes a list of chat messages as input and produces a chat message as output.
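
The difference between the two interfaces can be sketched in plain Python. This is only an illustration of the input and output shapes, not LangChain's implementation; `ChatMessage`, `toy_llm` and `toy_chat_model` are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ChatMessage:
    """Toy stand-in for a chat message: who spoke and what they said."""
    role: str
    content: str


def toy_llm(prompt: str) -> str:
    """LLM-style interface: text string in, text string out."""
    return f"Completion for: {prompt}"


def toy_chat_model(messages: list[ChatMessage]) -> ChatMessage:
    """Chat-model-style interface: list of messages in, one message out."""
    # This toy assumes at least one user message is present.
    last_user = [m for m in messages if m.role == "user"][-1]
    return ChatMessage(role="assistant", content=f"Reply to: {last_user.content}")


print(toy_llm("Explain Generative AI"))
print(toy_chat_model([ChatMessage("user", "Explain Generative AI")]))
```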


## Chat Models and LLMs

Large Language Models (LLMs) are a core component of LangChain. LangChain does not implement or build its own LLMs. It provides a standard API for interacting with almost every LLM out there.

There are many LLM providers (OpenAI, Hugging Face, etc.); the LLM class is designed to provide a standard interface for all of them.

## Accessing Commercial LLMs like ChatGPT



### Accessing ChatGPT as an LLM

Here we show how to access a basic ChatGPT Instruct LLM. However, the ChatModel interface, which we will see later, is usually the better choice: the LLM API doesn't support chat models like `gpt-3.5-turbo` and only supports the `instruct` models, which can respond to instructions but can't hold a conversation with you.

In [None]:
from langchain_openai import OpenAI, ChatOpenAI

chatgpt = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0)

In [8]:
prompt = """Explain what is Generative AI in 3 bullet points"""
response = chatgpt.invoke(prompt)
print(response)



1. Generative AI is a subset of artificial intelligence that focuses on creating new and original content, rather than just analyzing and processing existing data.

2. It uses algorithms and machine learning techniques to generate new ideas, designs, or solutions based on a set of input data or parameters.

3. Generative AI has a wide range of applications, including creating art, music, and text, as well as assisting in product design and optimization. It has the potential to revolutionize industries by automating creative tasks and providing innovative solutions.


### Accessing ChatGPT as a Chat Model LLM

Here we show how to access the more advanced ChatGPT Turbo chat-based LLM. The ChatModel interface is better because it supports chat models like `gpt-3.5-turbo`, which can respond to instructions as well as hold a conversation with you. We will look at the conversation aspect slightly later in the notebook.

In [9]:
from langchain_openai import ChatOpenAI

chatgpt = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
prompt = """Explain what is Generative AI in 3 bullet points"""
response = chatgpt.invoke(prompt)
print(response)

content='- Generative AI is a type of artificial intelligence that is capable of creating new content, such as images, text, or music, based on patterns and data it has been trained on.\n- It uses algorithms and neural networks to generate new content that is similar to the input data it has been trained on, but with variations and creativity.\n- Generative AI has applications in various fields, including art, design, music composition, and even creating realistic deepfake videos.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 94, 'prompt_tokens': 19, 'total_tokens': 113, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run--1a28f2ae-e2a0-498f-ae11-f653d71dcc1b-0' usage_metadat
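
The printed object carries more than the answer text: its `response_metadata` includes token counts and the model name. A minimal sketch of summarizing that metadata follows, using a dictionary copied from the output above; `summarize_usage` is a hypothetical helper, not part of LangChain.

```python
# Values copied from the `response_metadata` printed above.
response_metadata = {
    "token_usage": {
        "completion_tokens": 94,
        "prompt_tokens": 19,
        "total_tokens": 113,
    },
    "model_name": "gpt-3.5-turbo-0125",
    "finish_reason": "stop",
}


def summarize_usage(metadata):
    """Return a one-line summary of token usage from a metadata dict."""
    usage = metadata.get("token_usage", {})
    return (
        f"{metadata.get('model_name', 'unknown model')}: "
        f"{usage.get('prompt_tokens', 0)} prompt + "
        f"{usage.get('completion_tokens', 0)} completion = "
        f"{usage.get('total_tokens', 0)} total tokens"
    )


print(summarize_usage(response_metadata))
```

In the notebook itself, the same helper could be called with `response.response_metadata`, and `print(response.content)` prints only the answer text.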

## Accessing Gemini as an LLM

In [11]:
from langchain_google_genai import ChatGoogleGenerativeAI

gemini = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)



In [13]:
prompt = """Explain what is Generative AI in 3 bullet points"""
response = gemini.invoke(prompt)
print(response.content)

* **Creates new content:** Generative AI uses algorithms to produce various forms of content, including text, images, audio, and video, rather than just analyzing or classifying existing data.

* **Learns from existing data:**  It's trained on massive datasets to learn patterns and structures, allowing it to generate outputs that resemble the training data but are novel and unique.

* **Uses various techniques:**  Different generative AI models employ techniques like transformers, generative adversarial networks (GANs), and variational autoencoders (VAEs) to achieve content generation.



## Accessing Local LLMs with HuggingFacePipeline API

Hugging Face models can be run locally through the `HuggingFacePipeline` class. Keep in mind that you need a good GPU to get fast inference.

The Hugging Face Model Hub hosts over 500k models, including 90k+ open LLMs.

These can be called from LangChain either through this local pipeline wrapper or by calling their hosted inference endpoints through the `HuggingFaceEndpoint` API shown later in this notebook.

To use the local pipeline, you should have the `transformers` Python package installed, as well as `pytorch`.

The main advantage is that the model runs completely locally, which gives high privacy and security. The main disadvantage is the need for good compute infrastructure, preferably with a GPU.

In [30]:
from langchain_huggingface import HuggingFacePipeline

In [34]:
# gemma_params = {
#                   "do_sample": False, # greedy decoding - temperature = 0
#                   "return_full_text": False, # don't return input prompt
#                   "max_new_tokens": 1000, # maximum number of tokens in the answer
#                 }

# local_llm = HuggingFacePipeline.from_model_id(
#     model_id="microsoft/Phi-3.5-mini-instruct",
#     task="text-generation",
#     pipeline_kwargs=gemma_params,
#     # device=0 # when running on Colab selects the GPU, you can change this if you run it on your own instance if needed
# )

In [36]:
# local_llm

In [None]:
# # Gemma2B when used locally expects input prompt to be formatted in a specific way
# # check more details here: https://huggingface.co/google/gemma-1.1-2b-it#chat-template
# gemma_prompt = """<bos><start_of_turn>user\n""" + prompt + """\n<end_of_turn>
# <start_of_turn>model
# """
# print(gemma_prompt)
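
The string concatenation in the cell above can be wrapped in a small function so that the same turn markers are applied to any prompt. This is a sketch that mirrors the commented-out cell exactly; `build_gemma_prompt` is a hypothetical helper, and the exact template should be checked against the model card linked above before relying on it.

```python
def build_gemma_prompt(user_text: str) -> str:
    """Wrap a user prompt in the Gemma turn markers used in the cell above."""
    return (
        "<bos><start_of_turn>user\n"
        + user_text
        + "\n<end_of_turn>\n"
        + "<start_of_turn>model\n"
    )


print(build_gemma_prompt("Explain what is Generative AI in 3 bullet points"))
```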

In [35]:
# response = local_llm.invoke(gemma_prompt)
# print(response)

### Accessing Open LLMs in HuggingFace as a Chat Model LLM

Here we show how to access open LLMs hosted on Hugging Face, such as Microsoft Phi-3.5 Mini Instruct, and use them as chat models that can hold a conversation with you. We will look at the conversation aspect slightly later in the notebook.

In [None]:
import os

from langchain_huggingface import HuggingFaceEndpoint

repo_id = "microsoft/Phi-3.5-mini-instruct"

phi3_params = {
    "wait_for_model": True,     # wait if the model is not yet available on the Hugging Face server
    "do_sample": False,         # greedy decoding - temperature = 0
    "return_full_text": False,  # don't return the input prompt
    "max_new_tokens": 1000,     # maximum number of tokens in the answer
}

llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    # max_length=128,
    temperature=0.5,
    # read the token from the environment (e.g. loaded from .env by load_dotenv above)
    huggingfacehub_api_token=os.getenv("HUGGINGFACEHUB_API_TOKEN"),
    **phi3_params
)

In [37]:
from langchain_huggingface import ChatHuggingFace

chat_gemma = ChatHuggingFace(llm=llm,
                             model_id=repo_id)  # same model as the endpoint defined above

In [38]:
response = chat_gemma.invoke(prompt)
print(response.content)



## Message Types for ChatModels and Conversational Prompting

Conversational prompting is basically you, the user, having a full conversation with the LLM. The conversation history is typically represented as a list of messages.

ChatModels process a list of messages, receiving them as input and responding with a message. Messages are characterized by a few distinct types and properties:

- **Role:** Indicates who is speaking in the message. LangChain offers different message classes for various roles.
- **Content:** The substance of the message, which can vary:
  - A string (commonly handled by most models)
  - A list of dictionaries (for multi-modal inputs, where each dictionary details the type and location of the input)

Additionally, messages have an `additional_kwargs` property, used for passing extra provider-specific information that does not generalize across providers. A well-known example is `function_call` from OpenAI.
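
To make the two content shapes described above concrete, here is a minimal plain-Python sketch. The part layout (`{"type": "text", ...}` and `{"type": "image_url", ...}`) is an assumption based on the OpenAI-style format, the URL is a placeholder, and `extract_text` is a hypothetical helper.

```python
def extract_text(content):
    """Return the text of a message whose content is a string or a list of parts."""
    if isinstance(content, str):
        return content
    # Assumed part shape: {"type": "text", "text": "..."}; other part types are skipped.
    return " ".join(
        part["text"]
        for part in content
        if isinstance(part, dict) and part.get("type") == "text"
    )


plain = "Describe this picture."
multimodal = [
    {"type": "text", "text": "Describe this picture."},
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
]

print(extract_text(plain))
print(extract_text(multimodal))
```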

### Specific Message Types

- **HumanMessage:** A user-generated message, usually containing only content.
- **AIMessage:** A message from the model, potentially including `additional_kwargs`, like `tool_calls` for invoking OpenAI tools.
- **SystemMessage:** A message from the system instructing model behavior, typically containing only content. Not all models support this type.
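
Each message class corresponds to a role string in the dict-style messages used later in this notebook. The mapping below is a sketch under that assumption (`"assistant"` for `AIMessage` follows the OpenAI convention); `ROLE_BY_CLASS` and `to_role_dict` are hypothetical names.

```python
# Assumed correspondence between the message classes above and dict-style role strings.
ROLE_BY_CLASS = {
    "SystemMessage": "system",
    "HumanMessage": "user",
    "AIMessage": "assistant",
}


def to_role_dict(class_name: str, content: str) -> dict:
    """Build a {"role": ..., "content": ...} dict for a given message class name."""
    if class_name not in ROLE_BY_CLASS:
        raise ValueError(f"Unknown message class: {class_name}")
    return {"role": ROLE_BY_CLASS[class_name], "content": content}


conversation = [
    to_role_dict("SystemMessage", "Act as a helpful assistant."),
    to_role_dict("HumanMessage", "What is Generative AI?"),
    to_role_dict("AIMessage", "It is AI that creates new content."),
]
print(conversation)
```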


## Conversational Prompting with ChatGPT

Here we use the `ChatModel` API in `ChatOpenAI` to have a full conversation with ChatGPT while maintaining the complete conversation history.

In [39]:
from langchain_openai import ChatOpenAI

chatgpt = ChatOpenAI(model_name="gpt-4o-mini", temperature=0)

In [47]:
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage

prompt = """Can you explain what is Generative AI in 3 bullet points?"""
sys_prompt = """Act as a helpful assistant and give meaningful examples in your responses."""

message = [
    {
        "role": "system",
        "content": sys_prompt
    },
    {
        "role": "user",
        "content": prompt
    }
]

response = chatgpt.invoke(message)

In [48]:
message

[{'role': 'system',
  'content': 'Act as a helpful assistant and give meaningful examples in your responses.'},
 {'role': 'user',
  'content': 'Can you explain what is Generative AI in 3 bullet points?'}]

In [44]:
print(response.content)

1. Generative AI refers to a type of artificial intelligence that is capable of creating new content, such as images, text, or music, based on patterns it has learned from existing data.
2. One popular example of generative AI is GPT-3 (Generative Pre-trained Transformer 3), a language model developed by OpenAI that can generate human-like text based on the input it receives.
3. Another example is StyleGAN, a generative adversarial network (GAN) that can generate highly realistic images of faces, animals, and other objects.


In [50]:
response

AIMessage(content="1. Generative AI is a type of artificial intelligence that is capable of creating new content, such as images, text, or music, based on patterns it has learned from existing data.\n2. It uses techniques like neural networks and deep learning to generate realistic and original outputs that mimic human creativity.\n3. Examples of generative AI include text generators like GPT-3, image generators like StyleGAN, and music generators like OpenAI's MuseNet.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 92, 'prompt_tokens': 38, 'total_tokens': 130, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--c8960707-af3b-4144-a55f-cd84b1946123-0', usage_metadata={'inp

In [45]:
# Another way to define the messages, using LangChain message classes

message = [
    SystemMessage(content=sys_prompt),
    HumanMessage(content=prompt),
]
response = chatgpt.invoke(message)
print(response.content)

1. Generative AI refers to a type of artificial intelligence that is capable of creating new content, such as images, text, or music, based on patterns it has learned from existing data.
2. One popular example of generative AI is GPT-3 (Generative Pre-trained Transformer 3), a language model developed by OpenAI that can generate human-like text based on the input it receives.
3. Another example is StyleGAN, a generative adversarial network (GAN) that can generate highly realistic images of faces, animals, and other objects.


In [52]:
message

[{'role': 'system',
  'content': 'Act as a helpful assistant and give meaningful examples in your responses.'},
 {'role': 'user',
  'content': 'Can you explain what is Generative AI in 3 bullet points?'}]

In [53]:
# add the past conversation history into messages
message.append(response)
# add the new prompt to the conversation history list
prompt = """What did we discuss so far?"""
message.append(HumanMessage(content=prompt))
message

[{'role': 'system',
  'content': 'Act as a helpful assistant and give meaningful examples in your responses.'},
 {'role': 'user',
  'content': 'Can you explain what is Generative AI in 3 bullet points?'},
 AIMessage(content="1. Generative AI is a type of artificial intelligence that is capable of creating new content, such as images, text, or music, based on patterns it has learned from existing data.\n2. It uses techniques like neural networks and deep learning to generate realistic and original outputs that mimic human creativity.\n3. Examples of generative AI include text generators like GPT-3, image generators like StyleGAN, and music generators like OpenAI's MuseNet.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 92, 'prompt_tokens': 38, 'total_tokens': 130, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens
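
The append-then-invoke pattern generalizes to any number of turns: add the new prompt, send the whole history, then store the reply. The loop below is a plain-Python sketch of that pattern; `toy_chat` is a hypothetical stand-in for `chatgpt.invoke` so the example runs without an API key.

```python
def toy_chat(messages):
    """Hypothetical stand-in for `chatgpt.invoke`: counts the user turns it can see."""
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    summary = f"I can see {len(user_turns)} user message(s)."
    return {"role": "assistant", "content": summary}


# The history starts with the system instruction.
history = [{"role": "system", "content": "Act as a helpful assistant."}]

for user_text in ["What is Generative AI?", "What did we discuss so far?"]:
    history.append({"role": "user", "content": user_text})  # add the new prompt
    reply = toy_chat(history)                                # model receives the full history
    history.append(reply)                                    # store the answer for the next turn

for turn in history:
    print(f"{turn['role']:>9}: {turn['content']}")
```

Because the model only sees what is in the list, dropping earlier turns from `history` would make the model "forget" them.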

In [55]:
# send the conversation history along with the new prompt to chatgpt
response = chatgpt.invoke(message)
response.content

"So far, we have discussed the concept of Generative AI in three main points:\n1. Generative AI's ability to create new content based on learned patterns.\n2. The use of neural networks and deep learning in generative AI.\n3. Examples of generative AI applications such as text generators, image generators, and music generators."