## LangChain Tutorial - Part 1: LLM Basic Prompt

#### Setting Up OpenAI API

Before starting this tutorial, please follow the steps below to set up your environment:

1. **Access the OpenAI Platform**:
    - Visit [OpenAI Platform](https://platform.openai.com/).
    - Go to [Settings > Billing](https://platform.openai.com/settings/organization/billing/overview) and add credits to your account (if necessary).
    - Open [Dashboard > API keys](https://platform.openai.com/api-keys), where API keys are created and managed.

2. **Create an API Key**:
    - Create a new project (or use the Default project) and generate an API key.
    - Add the key to your `.env` file using the following format:

      ```
      OPENAI_API_KEY=<your-secret-key>
      ```

3. **Load the `.env` File**:
    - Make sure the `.env` file is loaded in your environment so the key is accessible to your application.

#### How to Load the .env File

**Important:**

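- Some environments (for example, notebooks run through certain IDE extensions) load `.env` files automatically, but this is not guaranteed everywhere. Loading the file explicitly, as described in the steps below, works regardless of the setup.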
- The `python-dotenv` package reads key-value pairs from a `.env` file and can set them as environment variables.
  - See: [python-dotenv GitHub repository](https://github.com/theskumar/python-dotenv)

**Step 1: Install the Package `python-dotenv`**

- From a terminal:
  
  ```
  pip install python-dotenv
  ```

- Or from a Jupyter code cell:
  
  ```
  !pip install python-dotenv
  ```

**Step 2: Load the `.env` File**

- Using Python code:

  ```python
  from dotenv import load_dotenv
  load_dotenv()
  ```

- Or using the IPython extension (provided by `python-dotenv`) from a Jupyter code cell:

  ```python
  %load_ext dotenv
  %dotenv
  ```


In [None]:
!pip install python-dotenv

In [2]:
from dotenv import load_dotenv

# override=False (default)
# Whether to override the system environment variables with the variables from the '.env' file.
_ = load_dotenv()
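
To make the `override` behaviour concrete, here is a minimal standard-library sketch of what loading a `.env` file roughly amounts to. It is not the actual `python-dotenv` implementation (the real parser handles many more cases, such as quoting rules and variable interpolation), and `load_env_text` is a hypothetical helper name used only for illustration.

```python
import os


def load_env_text(text: str, override: bool = False) -> dict[str, str]:
    """Toy .env loader: parse KEY=VALUE lines and export them to os.environ."""
    parsed: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        parsed[key.strip()] = value.strip().strip("'\"")
    for key, value in parsed.items():
        # With override=False, a variable already set in the system wins.
        if override or key not in os.environ:
            os.environ[key] = value
    return parsed


os.environ["DEMO_EXISTING_KEY"] = "from-system"
sample = "# demo file\nDEMO_EXISTING_KEY=from-file\nDEMO_NEW_KEY='abc123'\n"

load_env_text(sample)  # override=False
print(os.environ["DEMO_EXISTING_KEY"])  # from-system
print(os.environ["DEMO_NEW_KEY"])       # abc123

load_env_text(sample, override=True)
print(os.environ["DEMO_EXISTING_KEY"])  # from-file
```

In other words, with the default `override=False`, a key that is already exported in your shell is left untouched by the `.env` file.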

In [4]:
%load_ext dotenv
%dotenv

In [None]:
import os
from dotenv import dotenv_values

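# dotenv_values parses a .env file and returns its content as a dict.
# It does not set environment variables.
dotenv_content = dotenv_values()

# Three ways to read a key, which differ in how they handle a missing value:
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")       # returns None if the variable is not set
GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']      # raises KeyError if the variable is not set
TAVILY_API_KEY = dotenv_content['TAVILY_API_KEY']  # raises KeyError if the key is not in the .env file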

print(OPENAI_API_KEY)
print(GOOGLE_API_KEY)
print(TAVILY_API_KEY)
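
Printing full API keys is convenient for a quick local check, but the values end up in the notebook output. A possible alternative is sketched below with two hypothetical helpers, `require_env` and `mask_secret` (neither is part of `python-dotenv` or LangChain): the first fails with a clear message when a key is missing, the second shows only the last few characters.

```python
import os


def require_env(name: str) -> str:
    """Return the variable's value, or raise a descriptive error if it is unset or empty."""
    value = os.getenv(name)  # os.getenv returns None instead of raising
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file and reload it.")
    return value


def mask_secret(secret: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters so the output does not leak the key."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


os.environ["DEMO_API_KEY"] = "sk-demo-1234567890"  # placeholder value, not a real key
print(mask_secret(require_env("DEMO_API_KEY")))  # only the last 4 characters are readable
```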

In [None]:
# Install the packages langchain and langchain_openai
# Use the following syntax, with --upgrade, to upgrade the version of any package, if a new version is available
!pip install --upgrade langchain langchain_openai

In [5]:
# Import the class ChatOpenAI from the package langchain_openai
from langchain_openai import ChatOpenAI

In [6]:
# Create an instance of ChatOpenAI

# ChatOpenAI is able to get the OPENAI_API_KEY from the system environment
llm = ChatOpenAI()

# If OPENAI_API_KEY is not defined in the environment, pass the key explicitly:
# llm = ChatOpenAI(api_key=OPENAI_API_KEY)

In [None]:
# First call to the llm
llm.invoke("Tell me a joke.")

# Similar expected response:
# AIMessage(content="Why don't scientists trust atoms?\n\nBecause they make up everything!", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 13, 'prompt_tokens': 12, 'total_tokens': 25, 'completion_tokens_details': {'audio_tokens': 0, 'reasoning_tokens': 0, 'text_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-cd71b40d-ba77-4e15-bd35-7bb72a0aeb1a-0', usage_metadata={'input_tokens': 12, 'output_tokens': 13, 'total_tokens': 25, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [10]:
# Creating a structured prompt with a prompt template

# Import the class ChatPromptTemplate from langchain_core.prompts
from langchain_core.prompts import ChatPromptTemplate

In [11]:
# Create an instance of ChatPromptTemplate using the from_messages class method
# Without any input placeholder
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an excellent joke teller."),
        ("user", "Tell me a joke about politics.")
    ]
)

In [12]:
# Create a simple chain that:
# - takes the prompt and passes it to the chat model to be processed by the llm
# - and then returns the response
chain = prompt | llm
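
The `|` syntax works because LangChain components implement Python's `__or__` operator, so `a | b` produces a new object that feeds the output of `a` into `b`. The toy sketch below illustrates the idea in plain Python. `Step` is a hypothetical class written for this example, not LangChain's actual `Runnable` implementation, which offers much more (batching, streaming, async execution).

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for a prompt template and a chat model (no API call is made).
fake_prompt = Step(lambda inputs: f"Tell me a joke about {inputs['topic']}.")
fake_llm = Step(lambda text: {"content": f"(model reply to: {text})"})

toy_chain = fake_prompt | fake_llm
result = toy_chain.invoke({"topic": "cats"})
print(result["content"])
# (model reply to: Tell me a joke about cats.)
```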

In [13]:
# Invoke the chain
# The prompt has no placeholders, so pass an empty dictionary {}
ai_msg = chain.invoke({})

In [None]:
# Type of ai_msg
type(ai_msg)

# langchain_core.messages.ai.AIMessage

In [None]:
# type(ai_msg) == AIMessage
ai_msg

# Similar expected response:
# AIMessage(content='Why did the politician go to the doctor? Because they heard it was a good way to get a second opinion!', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 27, 'total_tokens': 50, 'completion_tokens_details': {'audio_tokens': None, 'reasoning_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-7bd6db23-301c-4a34-a108-477acc21acb4-0', usage_metadata={'input_tokens': 27, 'output_tokens': 23, 'total_tokens': 50, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 0}})

In [None]:
# Type of ai_msg.content
type(ai_msg.content)

# str

In [None]:
# type(ai_msg.content) == str
ai_msg.content

# Similar expected response:
# Why did the politician go to the doctor? Because they heard it was a good way to get a second opinion!

In [19]:
# A prompt with user input (placeholder)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an excellent joke teller."),
        ("user", "Tell me a joke about {user_input}.")
    ]
)

In [20]:
# Create a simple chain
chain = prompt | llm

In [21]:
# Invoke the chain
# Pass a dictionary to supply the value for the placeholder 'user_input'
ai_msg = chain.invoke({'user_input': 'dictatorships'})

In [None]:
ai_msg

# Similar expected response:
# AIMessage(content='Why did the dictator go to the beach? \n\nTo catch some waves of oppression!', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 22, 'total_tokens': 39, 'completion_tokens_details': {'audio_tokens': None, 'reasoning_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-2b6e4759-6dda-40c5-8c6f-4180d87b8cf9-0', usage_metadata={'input_tokens': 22, 'output_tokens': 17, 'total_tokens': 39, 'input_token_details': {'cache_read': 0}, 'output_token_details': {'reasoning': 0}})

In [None]:
# Other examples...

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an unhelpful assistant who makes fun of anything."),
        ("user", "I would like to understand {user_input}. Can you tell me more about it?")
    ]
)

chain = prompt | llm

ai_msg = chain.invoke({'user_input': 'Earth Planet'})

print(ai_msg.content)

# Similar expected response:
# Oh, you want to understand Earth, the big blue marble in space? Well, it's a pretty average planet, nothing too special. Just a little ball of rock and water floating around the sun. But hey, at least it's got some nice beaches, right?

In [None]:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an unhelpful assistant who tells the opposite of the truth."),
        ("user", "I would like to understand {user_input}. Can you tell me more about it?")
    ]
)

chain = prompt | llm

ai_msg = chain.invoke({'user_input': 'Earth Planet'})

print(ai_msg.content)

# Similar expected response:
# Sorry, I can't help you with that. Earth is a mysterious planet that no one really knows much about.

In [None]:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ("user", "I would like to understand {user_input}. Can you tell me more about it?")
    ]
)

chain = prompt | llm

ai_msg = chain.invoke({'user_input': 'Earth Planet'})

print(ai_msg.content)

# Similar expected response:
# Of course! Earth is the third planet from the Sun in our solar system and is the only known planet to support life. It is often referred to as the "Blue Planet" because of its abundance of water, which covers about 71% of its surface. Earth has a diverse range of environments, including oceans, mountains, forests, deserts, and more.
# The planet has a diameter of about 12,742 kilometers (7,918 miles) and is the fifth-largest planet in our solar system. Earth has a relatively thin atmosphere composed mainly of nitrogen (78%) and oxygen (21%), which is essential for supporting life as we know it.
# Earth orbits the Sun at an average distance of about 150 million kilometers (93 million miles) and takes approximately 365.25 days to complete one orbit, resulting in a year. It also rotates on its axis, causing day and night cycles, with a full rotation taking about 24 hours, resulting in a day.
# Earth is home to a wide variety of life forms, ranging from microscopic bacteria to complex organisms like plants, animals, and humans. The planet's diverse ecosystems provide habitats for countless species and play a crucial role in maintaining the balance of life on Earth.
# Overall, Earth is a unique and fascinating planet that continues to be studied and explored by scientists to better understand its complexities and the interconnected systems that support life.

In [24]:
# When using a chat model such as ChatOpenAI, the response is an AIMessage object
# - Use the syntax `ai_msg.content` to obtain the content string

# Other LangChain model classes return a plain string instead
# - For example, text-completion (LLM) classes such as GoogleGenerativeAI for Google Gemini
# - In that case, `ai_msg` itself is already a string

# This difference can be an issue when switching between model classes or providers through LangChain
# We can use LangChain output parsers to solve this issue

# An output parser standardizes the output across different models, making it easier to handle responses regardless of the model used

# Let's use a simple output parser
# Import the class StrOutputParser from the package langchain_core.output_parsers
from langchain_core.output_parsers import StrOutputParser

In [25]:
# Create an instance of StrOutputParser
output_parser = StrOutputParser()

In [26]:
# Update the latest chain by recreating it

# Create a simple chain that:
# - takes the prompt and passes it to the chat model to be processed by the llm
# - and then takes the response and passes it to be processed by the output parser
# - and then returns the response content
chain = prompt | llm | output_parser

# alternatively:
# chain = prompt | llm | StrOutputParser()

In [27]:
# Invoke the chain
ai_msg = chain.invoke({"user_input": "Earth Planet"})

In [None]:
# Type of ai_msg
type(ai_msg)

# str

In [None]:
# type(ai_msg) == str
ai_msg

# Similar expected response:
# 'Certainly! Earth is the third planet from the Sun in our solar system and is the only known planet to support life. It has a diverse range of ecosystems, including oceans, forests, deserts, and more. Earth has a unique atmosphere that consists mainly of nitrogen and oxygen, which is essential for supporting life as we know it.\n\nThe planet has a solid outer layer called the crust, a semi-liquid layer beneath the crust called the mantle, and a solid inner core at its center. Earth is constantly changing due to processes like plate tectonics, erosion, and volcanic activity.\n\nEarth is also known as the "Blue Planet" because of its abundance of water, which covers about 71% of its surface. The planet has a rich biodiversity, with millions of species of plants, animals, and microorganisms inhabiting its various environments.\n\nOverall, Earth is a fascinating and dynamic planet that provides a home for a wide variety of life forms.'

In [None]:
# The output parser has already extracted the content from the AIMessage object
# The ai_msg contains a string now
# The code below will fail
ai_msg.content

# AttributeError: 'str' object has no attribute 'content'