# Getting Started with LangChain and OpenAI

Outline:
- Set up LangChain, LangSmith, and LangServe.
- Use LangChain's core components: prompt templates, models, and output parsers.
- Build a simple application with LangChain.
- Trace the application with LangSmith.
- Serve the application with LangServe.

In [3]:
import os 
from dotenv import load_dotenv
load_dotenv()

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

# For LangSmith tracing to work, the following environment variables must be set:
os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = os.getenv("LANGCHAIN_PROJECT")
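One caveat about the setup cell: `os.getenv` returns `None` when a variable is missing, and assigning `None` into `os.environ` raises a `TypeError`. The sketch below shows one way to check for missing variables first. The helper name `missing_env_vars` is hypothetical (it is not part of LangChain or python-dotenv); the variable names are the ones used in the cell above.

```python
import os

# Variable names taken from the setup cell above.
REQUIRED_VARS = ["OPENAI_API_KEY", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT"]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this right after `load_dotenv()` reports a clear list of what is missing instead of failing on the first unset variable.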

In [4]:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")
llm

ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x10c259cc0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x10c25bd90>, root_client=<openai.OpenAI object at 0x103c6ed10>, root_async_client=<openai.AsyncOpenAI object at 0x10c259d20>, model_name='gpt-4o', model_kwargs={}, openai_api_key=SecretStr('**********'))

In [5]:
# Invoke the model directly with a plain string prompt
result = llm.invoke("Who is the president of Nigeria?")

In [6]:
print(result)

content='As of the latest available information, Bola Ahmed Tinubu is the President of Nigeria. He assumed office on May 29, 2023.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 14, 'total_tokens': 44, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_4691090a87', 'finish_reason': 'stop', 'logprobs': None} id='run-268e0d8a-b92f-4628-a030-4173eedb36de-0' usage_metadata={'input_tokens': 14, 'output_tokens': 30, 'total_tokens': 44, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}


In [7]:
result2 = llm.invoke("What is Mars?")

In [8]:
result2

AIMessage(content='Mars is the fourth planet from the Sun in our solar system and is often referred to as the "Red Planet" due to its reddish appearance, which is caused by iron oxide, or rust, on its surface. It is a terrestrial planet with a thin atmosphere composed mostly of carbon dioxide, and it has surface features both similar to those found on Earth and the Moon, including valleys, deserts, and polar ice caps.\n\nMars has two small moons, Phobos and Deimos, which are thought to be captured asteroids. The planet has been a significant focus of space exploration, with numerous missions conducted by various space agencies to study its geology, climate, potential for past or present life, and its suitability for future human colonization. Notably, Mars hosts the tallest volcano and the deepest, longest canyon in the solar system, Olympus Mons and Valles Marineris, respectively.\n\nThe exploration of Mars is driven by its potential to have supported microbial life in the past, and i

In [9]:
# Chat prompt template
from langchain_core.prompts import ChatPromptTemplate

prompt =  ChatPromptTemplate.from_messages(
    [
        ("system", "Act like an expert AI Engineer, Provide me answers based on the questions"),
        ("user", "{input}")
    ]
)
prompt

ChatPromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], input_types={}, partial_variables={}, template='Act like an expert AI Engineer, Provide me answers based on the questions'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})])

In [10]:
# Chain the prompt with the LLM: the prompt's output becomes the model's input
chain = prompt | llm

response = chain.invoke({"input": "Tell me about langsmith and langserve."})
print(response)

content='As of my last training cutoff in October 2023, I don\'t have any specific information about products or tools named "Langsmith" and "Langserve" within the realm of AI and technology. It\'s possible that these could be newer tools developed after my last update, or they could be niche solutions not widely covered in the information I\'ve been trained on. \n\nIf these terms refer to recent developments, I\'d recommend checking the latest resources, such as product websites, recent AI conference papers, or industry news articles for the most up-to-date information. Alternatively, they could be hypothetical or conceptual terms that might not be officially recognized or documented widely. If you provide more context, I could potentially be more helpful.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 142, 'prompt_tokens': 34, 'total_tokens': 176, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning

In [11]:
chain = prompt | llm

response = chain.invoke({"input": "Tell me about Generative Adversarial Network."})
print(response)

content="Generative Adversarial Networks (GANs) are a class of machine learning models introduced by Ian Goodfellow and his colleagues in 2014. They are primarily used for generating synthetic data that resemble real data distributions and are composed of two neural networks: the generator and the discriminator, which are trained simultaneously through an adversarial process.\n\n1. **Generator**: The generator's role is to produce data that mimic real data. It starts with random noise as input and generates data through a series of transformations. The objective of the generator is to improve over time such that the data it generates is indistinguishable from real data.\n\n2. **Discriminator**: The discriminator acts as a judge that evaluates the authenticity of the data. It receives input either from the real data distribution or from the generator and outputs a probability indicating how real or fake the input data is. The discriminator’s goal is to correctly distinguish between real

In [12]:
type(response)

langchain_core.messages.ai.AIMessage

In [13]:
# Return only the answer string instead of the full AIMessage

from langchain_core.output_parsers import StrOutputParser
output_parser = StrOutputParser()
chain = prompt | llm | output_parser

response2 = chain.invoke({"input": "Tell me about Variational Autoencoders."})
print(response2)

Variational Autoencoders (VAEs) are a type of generative model and an extension of traditional autoencoders used for unsupervised learning. Introduced by Kingma and Welling in 2013, VAEs are designed to learn a latent variable model for the data, where the latent variables follow a known distribution, typically a Gaussian distribution. They are particularly powerful for generating new data samples that resemble the original dataset.

Here's a brief overview of key concepts related to VAEs:

1. **Components**: Like a standard autoencoder, a VAE consists of two main parts: an encoder and a decoder.
   - **Encoder**: Maps the input data to a latent space, representing it as a distribution. Instead of mapping inputs to a fixed vector, VAEs map inputs to a distribution, typically parameterized as a mean and a variance that define a Gaussian.
   - **Decoder**: Reconstructs the input data from latent variables sampled from the distribution generated by the encoder.

2. **Latent Space Represen