### Getting started with LangChain and OpenAI

In this quick start we'll see how to:

 - Get set up with LangChain, LangSmith and LangServe
 - Use the most basic and common components of LangChain: prompt templates, models and output parsers
 - Build a simple application with LangChain
 - Trace your application with LangSmith
 - Serve your application with LangServe

In [None]:
## Importing required libraries
import os
from dotenv import load_dotenv
load_dotenv()

## Load the required variables from the .env file.
## Unset variables are skipped, because assigning None to os.environ raises a TypeError.
for name in ("OPENAI_API_KEY", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT"):
    value = os.getenv(name)
    if value is not None:
        os.environ[name] = value
os.environ["LANGCHAIN_TRACING_V2"] = "true"  ## enable LangSmith tracing

In [5]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model='gpt-5.2')
print(llm)

profile={'max_input_tokens': 272000, 'max_output_tokens': 128000, 'image_inputs': True, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True, 'image_url_inputs': True, 'pdf_inputs': True, 'pdf_tool_message': True, 'image_tool_message': True, 'tool_choice': True} client=<openai.resources.chat.completions.completions.Completions object at 0x135a94160> async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x135a96a10> root_client=<openai.OpenAI object at 0x132cfb0a0> root_async_client=<openai.AsyncOpenAI object at 0x135a96a70> model_name='gpt-5.2' model_kwargs={} openai_api_key=SecretStr('**********') stream_usage=True


In [6]:
## Let's provide some input and get a response from the LLM

result = llm.invoke("What is generative AI, explain in max lines")

In [9]:
print(result)

content='Generative AI is a type of artificial intelligence that learns patterns from existing data and then creates new content—like text, images, music, audio, video, or code—that resembles what it was trained on.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 45, 'prompt_tokens': 16, 'total_tokens': 61, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-5.2-2025-12-11', 'system_fingerprint': None, 'id': 'chatcmpl-D2gzOMljpLKLXYLyTIwSRyTaQJOVf', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None} id='lc_run--019c0076-46ea-7271-b9e2-31d363a73619-0' tool_calls=[] invalid_tool_calls=[] usage_metadata={'input_tokens': 16, 'output_tokens': 45, 'total_tokens': 61, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_tok
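Printing the whole message is noisy. In practice the two pieces most often needed are the answer text (`result.content`) and the token counts (`result.usage_metadata`). The sketch below uses a plain dictionary copied from the printed output as a stand-in for `result.usage_metadata`, so it runs without an API key; the helper name `summarize_usage` is hypothetical and not part of LangChain.

```python
# Stand-in for result.usage_metadata, with values copied from the printed output above.
usage_metadata = {"input_tokens": 16, "output_tokens": 45, "total_tokens": 61}


def summarize_usage(usage):
    """Return a one-line, human-readable summary of a token-usage dictionary."""
    return (
        f"{usage['input_tokens']} in + {usage['output_tokens']} out "
        f"= {usage['total_tokens']} total tokens"
    )


summary = summarize_usage(usage_metadata)
print(summary)
```

With a real response, passing `result.usage_metadata` instead of the stand-in dictionary should give the same kind of summary, assuming the provider reports usage.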

In [10]:
## Prompt template: provides rules or pre-instructions for the model
## Chat prompt template
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "you are an AI Engineer. Provide me answers based on the question"),
        ("user", "{input}")
    ]
)
prompt

ChatPromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], input_types={}, partial_variables={}, template='you are an AI Engineer. Provide me answers based on the question'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})])
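To make the role of the template concrete, here is a minimal plain-Python sketch of the idea: a list of `(role, text)` pairs whose `{input}` placeholder is filled in at call time. This is only an illustration of the concept, not how `ChatPromptTemplate` is implemented, and `format_messages` here is a hypothetical helper written for this example.

```python
def format_messages(template, **variables):
    """Fill the placeholders in every (role, text) pair of a chat template."""
    return [(role, text.format(**variables)) for role, text in template]


# Same structure as the prompt above: a fixed system message plus a user slot.
template = [
    ("system", "you are an AI Engineer. Provide me answers based on the question"),
    ("user", "{input}"),
]

messages = format_messages(template, input="What is LangSmith?")
for role, text in messages:
    print(f"{role}: {text}")
```

The system message passes through unchanged because it has no placeholders; only the user message depends on the input.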

In [12]:
chain = prompt|llm
response = chain.invoke({"input":"Can you tell me about LangSmith in a few crisp sentences?"})

In [15]:
response

AIMessage(content='LangSmith is a platform from LangChain for **building, testing, and monitoring LLM applications**. It helps you **trace** every step of an LLM run (prompts, tool calls, outputs, latency, and errors), so debugging is faster and more reliable. It also supports **evaluation** (offline/online), **datasets**, and **prompt/version management** to measure quality and prevent regressions as you iterate.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 90, 'prompt_tokens': 35, 'total_tokens': 125, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-5.2-2025-12-11', 'system_fingerprint': None, 'id': 'chatcmpl-D2hBd3jVmBJu0Iig4OoSH8ZKCZNHv', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019c0081-db6

In [16]:
type(response)

langchain_core.messages.ai.AIMessage
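The `prompt|llm` expression works because Python lets a class overload the `|` operator. The following is a rough plain-Python sketch of that composition pattern, assuming each step exposes an `invoke` method. The `Step` class and the fake prompt and model are hypothetical stand-ins, not LangChain classes, and the real implementation does considerably more (streaming, batching, tracing).

```python
class Step:
    """A tiny composable unit: wraps a function and supports `a | b` chaining."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `self | other` builds a new Step that feeds self's output into other.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the prompt and the model.
fake_prompt = Step(lambda inputs: f"Question: {inputs['input']}")
fake_llm = Step(lambda text: {"content": text.upper()})

chain = fake_prompt | fake_llm
response = chain.invoke({"input": "what is langsmith?"})
print(response)
```

The chain returns whatever the last step returns, which is why the real `prompt|llm` chain above yields an `AIMessage` rather than a plain string.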

In [18]:
## Output parser: formats the LLM output to match the user's requirements
## String output parser (StrOutputParser)
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()
chain = prompt|llm|output_parser
response = chain.invoke({"input":"Explain RAG steps in crisp and clear"})
print(response)

RAG (Retrieval-Augmented Generation) combines **search** + **LLM generation** so answers are grounded in your data. Core steps:

1. **Ingest data**
   - Collect source docs (PDFs, web pages, tickets, DB rows, etc.).
   - Clean/normalize text and metadata.

2. **Chunking**
   - Split documents into small passages (e.g., 200–800 tokens) with optional overlap.
   - Keep metadata (source, section, timestamp, ACLs).

3. **Embedding**
   - Convert each chunk into a vector using an embedding model.
   - Store `{chunk_text, embedding, metadata}`.

4. **Indexing (Vector store)**
   - Save embeddings in a vector database (FAISS, Pinecone, Weaviate, etc.).
   - Optionally add keyword/BM25 index for hybrid retrieval.

5. **Query understanding**
   - Take user question; optionally rewrite/expand it (query rewriting), detect intent, apply filters (tenant, time, permissions).

6. **Retrieval**
   - Embed the query and fetch top‑k most relevant chunks (vector similarity and/or hybrid).
   - Apply meta