### Getting started with LangChain and OpenAI
* Set up LangChain, LangSmith, and LangServe
* Use the basic, common components of LangChain: prompt templates, models, and output parsers
* Build a simple application with LangChain
* Trace your application with LangSmith
* Serve your application with LangServe (expose it as an API and make it production-ready)
* It is good practice to use a separate virtual environment, so set one up with conda create
* 1). Create your own virtual environment for LangChain: conda create -p venv python==3.12 -y
* 2). Activate the conda environment: "conda activate /mnt/c/projects/agentic-ai/langchain/venv"
* 3). Create a separate (or new) requirements.txt file for LangChain in the langchain folder and, once the virtual environment (venv) is active, install it with "pip install -r requirements.txt".
##### langchain-community (Python package)
* This is the official community-maintained Python package for LangChain integrations.
* It contains third-party connectors (LLM providers, tools, vector stores, document loaders, etc.) that are not part of the core LangChain logic.
* You can install with "pip install langchain-community"
##### langchain_community (the Python namespace)
* This is the actual Python import namespace provided by the langchain-community package once installed.
* So even though the package you install is called langchain-community, the modules inside are accessed as langchain_community.*
* This dual naming (hyphen in PyPI (Python Package Index), underscore in import) is a common pattern in Python packaging.
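The split between the install name and the import name can be checked from Python itself. The following is a minimal sketch using only the standard library; the helper name `describe_package` is hypothetical, and the values it reports depend on what is installed in the active environment.

```python
from importlib import metadata, util


def describe_package(dist_name: str, import_name: str) -> dict:
    """Report the installed version (looked up by distribution name) and
    whether the module can be found (looked up by import name)."""
    try:
        version = metadata.version(dist_name)  # e.g. "langchain-community"
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment
    # find_spec locates the top-level module without importing it
    importable = util.find_spec(import_name) is not None
    return {
        "distribution": dist_name,
        "module": import_name,
        "version": version,
        "importable": importable,
    }


print(describe_package("langchain-community", "langchain_community"))
```

If the package is installed, `version` holds a version string and `importable` is `True`; otherwise the sketch reports `None` and `False` instead of raising.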



In [None]:
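# First, load all the environment variables defined in the .env file
import os 
from dotenv import load_dotenv
load_dotenv()                   # reads .env and adds its entries to os.environ

# Note: os.getenv returns None for a missing key, and assigning None to os.environ raises a TypeError
os.environ['OPENAI_API_KEY']=os.getenv("OPENAI_API_KEY")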

# for langsmith tracking purpose
os.environ['LANGCHAIN_API_KEY']=os.getenv("LANGCHAIN_API_KEY")
# os.environ['LANGSMITH_API_KEY']=os.getenv("LANGSMITH_API_KEY")

# Tracing needs to be enabled (the documented variable names are LANGCHAIN_TRACING_V2 and LANGSMITH_TRACING)
os.environ['LANGCHAIN_TRACING_V2']="true"

os.environ['LANGCHAIN_PROJECT']=os.getenv("LANGCHAIN_PROJECT")
project_name=os.getenv("LANGCHAIN_PROJECT")
#print(project_name)
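Because a missing key only surfaces as an error at assignment time, it can help to check the required variables up front. This is a minimal sketch, not part of LangChain or python-dotenv; the helper `missing_env_vars` is a hypothetical name, and the list of required keys is taken from the cell above.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


required = ["OPENAI_API_KEY", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT"]
missing = missing_env_vars(required)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this right after `load_dotenv()` shows which entries still need to be added to the `.env` file.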


In [32]:
from langchain_openai import ChatOpenAI 
# ChatOpenAI is a LangChain wrapper around OpenAI’s chat models (like GPT‑3.5 or GPT‑4).
llm=ChatOpenAI()  
print(llm)

profile={'max_input_tokens': 16385, 'max_output_tokens': 4096, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': False, 'structured_output': False, 'image_url_inputs': False, 'pdf_inputs': False, 'pdf_tool_message': False, 'image_tool_message': False, 'tool_choice': True} client=<openai.resources.chat.completions.completions.Completions object at 0x00000233EA659AF0> async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x00000233EA522060> root_client=<openai.OpenAI object at 0x00000233EA521A00> root_async_client=<openai.AsyncOpenAI object at 0x00000233EA521220> model_kwargs={} openai_api_key=SecretStr('**********') stream_usage=True


In [47]:
# You can also pass other basic LLM parameters if needed.
# Without arguments, ChatOpenAI uses its default model; gpt-4 is a more capable option.
llm = ChatOpenAI(
    # model_name="gpt-4",
    # temperature=0.7,
    # max_tokens=500
)
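One way to keep these parameters in one place is to collect and validate them before constructing the model. The sketch below is an illustration only: `build_chat_kwargs` is a hypothetical helper, and the range check assumes the 0 to 2 temperature range that the OpenAI chat API accepts. The `ChatOpenAI` call is left commented out because it needs `langchain-openai` installed and `OPENAI_API_KEY` set.

```python
def build_chat_kwargs(model="gpt-4", temperature=0.7, max_tokens=500):
    """Validate and collect constructor arguments for a chat model."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return {"model": model, "temperature": temperature, "max_tokens": max_tokens}


chat_kwargs = build_chat_kwargs(temperature=0.2)
print(chat_kwargs)

# With langchain-openai installed and OPENAI_API_KEY set:
# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(**chat_kwargs)
```

A lower temperature makes answers more repeatable, while `max_tokens` caps the length of each completion.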


In [48]:
# Input and get response from LLM 
# You can check the monitoring with langsmith after this call 
# LangSmith is a unified DevOps platform for developing, collaborating, testing, deploying, and monitoring LLM applications
result=llm.invoke("What is Generative AI?")

In [49]:
print(result)

content='Generative AI is a type of artificial intelligence technology that can generate new content, such as images, texts, or sounds, based on patterns and data it has been trained on. This technology utilizes machine learning algorithms to learn from existing data and then create new content that is similar to the original data. Generative AI has a wide range of applications, including creating art, generating realistic human faces, composing music, and more.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 85, 'prompt_tokens': 13, 'total_tokens': 98, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CowI76cEWFKe6TfOTVO6RzKEQxmlP', 'service_tier': 'default', 'finish_reason': 'stop
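The printed object bundles the answer text with metadata such as token usage. The sketch below shows one way to pull out the parts that are usually needed. It runs on a `SimpleNamespace` stand-in that copies the token counts from the output above, so no API call is made; `summarize_response` is a hypothetical helper that works on any object exposing `.content` and `.response_metadata`, as the message returned by `llm.invoke` does.

```python
from types import SimpleNamespace


def summarize_response(message):
    """Extract the answer text and token counts from a chat response."""
    usage = message.response_metadata.get("token_usage", {})
    return {
        "text": message.content,
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# Stand-in for the message printed above (content shortened for brevity).
fake_result = SimpleNamespace(
    content="Generative AI is a type of artificial intelligence technology ...",
    response_metadata={
        "token_usage": {"completion_tokens": 85, "prompt_tokens": 13, "total_tokens": 98}
    },
)
summary = summarize_response(fake_result)
print(summary["text"])
print(summary["total_tokens"])
```

In the notebook, `summarize_response(result)` can be called on the real response in the same way.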

In [50]:
results=llm.invoke("What is Data Science?")
print(results)

content='Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. It combines expertise from various fields such as statistics, computer science, information theory, and domain knowledge to analyze complex data sets and solve real-world problems. Data science involves techniques such as data mining, machine learning, data visualization, and predictive modeling to extract and analyze data, gaining valuable insights that can drive decision-making and innovation in various industries.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 99, 'prompt_tokens': 12, 'total_tokens': 111, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gp

In [38]:
### If LangSmith does not show a trace for the calls above, check that tracing is enabled before the model is created and that the API key and project name are correct



In [51]:
# Define a chat prompt template

from langchain_core.prompts import ChatPromptTemplate

prompt=ChatPromptTemplate.from_messages(
    [
        ("system","You are an AI engineer, provide me answers based on questions"),
        ("user","{input}")
    ]
)

prompt

ChatPromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], input_types={}, partial_variables={}, template='You are an AI engineer, provide me answers based on questions'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], input_types={}, partial_variables={}, template='{input}'), additional_kwargs={})])
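Conceptually, the template stores (role, text) pairs and fills the `{input}` placeholder when it is invoked. The sketch below imitates that step in plain Python as an aid to understanding. It is a simplified stand-in, not LangChain's implementation, and `render_messages` is a hypothetical helper.

```python
def render_messages(templates, **values):
    """Fill each (role, template) pair with the supplied values."""
    return [(role, text.format(**values)) for role, text in templates]


templates = [
    ("system", "You are an AI engineer, provide me answers based on questions"),
    ("user", "{input}"),
]
messages = render_messages(templates, input="What is DataScience?")
for role, text in messages:
    print(f"{role}: {text}")
```

The system message has no placeholder, so it passes through unchanged; only the user message receives the question.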

In [None]:
### Create a chain object
# The pipe operator (|) creates a chain object that automatically performs the following:
# takes the input dict --> fills the values into the prompt template --> sends the resulting messages to the LLM --> returns the model output.

chain=prompt|llm
response=chain.invoke({"input":"What is DataScience?"})
print(response)


content='Data science is a multidisciplinary field that uses scientific methods, algorithms, processes, and systems to extract knowledge and insights from structured and unstructured data. It combines elements from statistics, computer science, machine learning, and domain expertise to analyze data and solve complex problems. Data science involves various stages such as data collection, data cleaning, data analysis, modeling, and interpretation to derive actionable insights and make informed decisions.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 83, 'prompt_tokens': 28, 'total_tokens': 111, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CowIgHdUDV3Bg6YTqXxi2QydZkpBz', 'servic

In [58]:
type(response)

langchain_core.messages.ai.AIMessage

In [None]:
### Output parser (StrOutputParser, the string output parser): returns only the text content, without the metadata
### StrOutputParser is an output parser provided by LangChain.
### Its purpose: take the message returned by the chat model and return its content as a plain string.
### Chat models return a message object that carries metadata (token usage, model name, finish reason) alongside the text. Parsers standardize this output for downstream processing.

from langchain_core.output_parsers import StrOutputParser
output_parser=StrOutputParser()
chain=prompt|llm|output_parser

response=chain.invoke({"input":"What is DataScience?"})
print(response)

Data science is a multidisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. It combines knowledge from various fields such as statistics, machine learning, computer science, and domain expertise to analyze and interpret complex data to make data-driven decisions and predictions.
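The way `prompt|llm|output_parser` composes three stages can be imitated with Python's `__or__` operator. The sketch below is a toy model of the idea, not LangChain's actual runnable implementation; the `Step` class and the three `fake_*` stages are hypothetical stand-ins that make no API call.

```python
class Step:
    """A stage that can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Run this step first, then feed its output into the next step.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the prompt, the model, and the parser.
fake_prompt = Step(lambda d: f"user: {d['input']}")
fake_llm = Step(lambda text: {"content": f"(answer to) {text}", "metadata": {}})
fake_parser = Step(lambda msg: msg["content"])

toy_chain = fake_prompt | fake_llm | fake_parser
toy_output = toy_chain.invoke({"input": "What is DataScience?"})
print(toy_output)  # (answer to) user: What is DataScience?
```

The last stage keeps only the `content` field, which mirrors why the chain with the parser prints a plain string instead of a full message object.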
