#### Getting Started with LangChain and OpenAI

In this quickstart we'll see how to:

- Set up LangChain, LangSmith, and LangServe
- Utilize the core components of LangChain, including prompt templates, models, and output parsers
- Create a simple application using LangChain
- Monitor your application with LangSmith
- Serve your application using LangServe

In [None]:
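import os
from dotenv import load_dotenv

load_dotenv()  # Load variables from a local .env file into os.environ

# Note: os.getenv() returns None for a missing variable, and assigning None to
# os.environ raises a TypeError, so all three keys must be defined in .env.
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
## LangSmith tracking
os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = os.getenv("LANGCHAIN_PROJECT")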

In [None]:
from langchain_openai import ChatOpenAI
llm=ChatOpenAI(model="gpt-4o-mini")
print(llm)

In [None]:
result=llm.invoke("What is Agentic AI?")

In [None]:
result

In [None]:
from langchain_core.prompts import ChatPromptTemplate

In [None]:
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an expert Data Scientist and Gen AI Engineer. Provide answers based on the question asked."),
        ("user", "{query}"),
    ]
)
prompt

### Chains
Chains are easily reusable components linked together.

Chains encode a sequence of calls to components like models, document retrievers, other Chains, etc., and provide a simple interface to this sequence.

The Chain interface makes it easy to create apps that are:

- Stateful: add Memory to any Chain to give it state
- Observable: pass Callbacks to a Chain to execute additional functionality, like logging, outside the main sequence of component calls
- Composable: combine Chains with other components, including other Chains
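
To make the idea of composing components concrete, here is a toy sketch in plain Python of how a `|` operator can link steps into a sequence. The `Step` class and the three example steps are hypothetical and are not LangChain's implementation; they only mimic the shape of a prompt, a model, and a parser being piped together.

```python
class Step:
    """A hypothetical composable step that wraps a single function."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        # Run this step on the given input.
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for a prompt template, a model, and an output parser.
fill_prompt = Step(lambda inputs: f"Question: {inputs['query']}")
fake_model = Step(lambda text: {"content": text.upper()})
get_content = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_model | get_content
toy_result = toy_chain.invoke({"query": "what is a chain?"})
print(toy_result)
```

Because each `|` returns another `Step`, the composed object can itself be piped into further steps, which is the "composable" property described in the list.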

In [None]:
## Chain
chain = prompt | llm

response = chain.invoke({"query": "Can you tell me something about Generative AI vs. Agentic AI?"})
print(response)


In [None]:
response.content

### StrOutputParser
`StrOutputParser` is a LangChain output parser that converts the output of an LLM or chat model into a plain string. An LLM already returns a string, whereas a chat model returns a message object; the parser extracts the message's text content, so subsequent processing steps always receive a string.
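
As a rough mental model, the parser's job can be sketched in plain Python. `FakeMessage` and `to_text` below are hypothetical stand-ins, not LangChain classes; the sketch assumes only that a chat model's reply exposes its text through a `content` attribute.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat model's message object."""

    content: str


def to_text(output):
    # Plain strings pass through unchanged; message-like objects
    # are reduced to their text content.
    if isinstance(output, str):
        return output
    return output.content


print(to_text("already a string"))
print(to_text(FakeMessage(content="text from a chat model")))
```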

In [None]:
from langchain_core.output_parsers import StrOutputParser
output_parser=StrOutputParser()
chain=prompt|llm|output_parser

response=chain.invoke({"query":"Can you tell me about LangSmith?"})
print(response)

In [None]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser
output_parser=JsonOutputParser()
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": output_parser.get_format_instructions()},
)

In [None]:
from langchain_core.output_parsers import JsonOutputParser
output_parser=JsonOutputParser()
chain=prompt|llm|output_parser

response=chain.invoke({"query":"Can you tell me about LangSmith?"})
print(response)
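
The two cells above rely on two ideas: format instructions are filled into the prompt ahead of time, and the model's reply is then parsed as JSON. The sketch below illustrates both with the standard library only. The instruction text, the sample reply, and the `parse_json_reply` helper are made up for illustration and do not reproduce what `JsonOutputParser` does internally.

```python
import json

# Hypothetical instruction text; in the notebook the real text comes from
# output_parser.get_format_instructions().
format_instructions = "Return a JSON object."
template = "Answer the user query.\n{format_instructions}\n{query}\n"

# Filling one variable up front mirrors the role of partial_variables.
filled = template.format(
    format_instructions=format_instructions,
    query="Can you tell me about LangSmith?",
)


def parse_json_reply(reply):
    # Keep only the outermost {...} span, which drops any surrounding
    # text the model may have added around the JSON object.
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in reply")
    return json.loads(reply[start : end + 1])


# Invented reply, used only to exercise the helper.
sample_reply = 'Here you go:\n{"name": "LangSmith", "purpose": "tracing"}\n'
parsed = parse_json_reply(sample_reply)
print(filled)
print(parsed["name"])
```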

### RAG

In [None]:
## Data ingestion: scrape the data from the website
from langchain_community.document_loaders import WebBaseLoader

In [None]:
import requests
from bs4 import BeautifulSoup
from urllib.parse import urldefrag, urljoin, urlparse
import time

def get_all_links(url, base_domain, visited=None, depth=2):
    """
    Recursively collect all unique internal links from a webpage and its sub-pages.

    :param url: The page URL to fetch links from
    :param base_domain: The base domain to ensure internal links only
    :param visited: A set to track discovered URLs (to avoid duplicates)
    :param depth: How many levels of links to follow (limits how deep to go)
    :return: A set of all discovered unique URLs
    """
    if visited is None:
        visited = set()

    if url in visited:
        return visited  # Already handled

    # Record the URL before the depth check so that pages at the last level
    # are still reported even though their own links are not followed.
    visited.add(url)
    if depth == 0:
        return visited

    try:
        time.sleep(0.5)  # Short delay before each request to prevent overload
        response = requests.get(url, timeout=5)
        if response.status_code != 200:
            return visited  # Skip if page fetch fails
    except requests.RequestException:
        return visited  # Skip if request fails

    soup = BeautifulSoup(response.text, "html.parser")

    for a_tag in soup.find_all("a", href=True):
        # Convert relative URL to absolute and drop any "#fragment" suffix
        link, _ = urldefrag(urljoin(url, a_tag["href"]))

        # Follow only links on the same domain. The recursive call marks each
        # link as visited itself; adding it here first would make that call
        # return immediately and the crawl would never go past the first page.
        if urlparse(link).netloc == base_domain:
            get_all_links(link, base_domain, visited, depth - 1)

    return visited

# Example Usage
main_page = "https://python.langchain.com/"
base_domain = urlparse(main_page).netloc  # Extract base domain

all_unique_links = get_all_links(main_page, base_domain, depth=2)  # Set depth limit

print(len(all_unique_links))

# Print all collected unique links
for link in sorted(all_unique_links):
    print(link)
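
The same-domain filter at the heart of the crawler can be tried without any network access. The sketch below applies the same `urljoin` and `urlparse` calls to a small hand-written HTML snippet. It uses the standard library's `HTMLParser` in place of BeautifulSoup so that it has no third-party dependencies; the snippet, `LinkCollector`, and `internal_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, page_url):
    collector = LinkCollector()
    collector.feed(html)
    domain = urlparse(page_url).netloc
    # Resolve relative links against the page URL, then keep same-domain ones.
    absolute = {urljoin(page_url, href) for href in collector.hrefs}
    return {link for link in absolute if urlparse(link).netloc == domain}


sample_html = """
<a href="/docs/introduction/">Intro</a>
<a href="tutorials/">Tutorials</a>
<a href="https://example.com/other">External</a>
"""
sample_links = internal_links(sample_html, "https://python.langchain.com/docs/")
for sample_link in sorted(sample_links):
    print(sample_link)
```

Only the two links that resolve to the page's own domain survive; the external link is dropped.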


In [None]:
# loader=WebBaseLoader(["https://python.langchain.com/","https://python.langchain.com/docs/introduction/"])
loader=WebBaseLoader("https://python.langchain.com/")
loader

In [None]:
documents=loader.load()
documents

In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter=RecursiveCharacterTextSplitter(chunk_size=1000,chunk_overlap=200)
documents=text_splitter.split_documents(documents)
documents


In [None]:
len(documents)
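
To see what `chunk_size` and `chunk_overlap` mean, here is a deliberately simplified fixed-width chunker. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter also tries to break on separators such as paragraph and line breaks); `simple_chunks` is a hypothetical helper that only shows how consecutive chunks share characters.

```python
def simple_chunks(text, chunk_size, chunk_overlap):
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


toy_chunks = simple_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=4)
print(toy_chunks)
```

Each chunk starts with the last four characters of the previous one, so text that straddles a boundary still appears intact in at least one chunk.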

In [None]:
from langchain_openai import OpenAIEmbeddings
embeddings=OpenAIEmbeddings()

In [None]:
from langchain_community.vectorstores import FAISS
vectorstoredb=FAISS.from_documents(documents,embeddings)

In [None]:
vectorstoredb

In [None]:
query="Langchain is a framework"
result=vectorstoredb.similarity_search(query)
result[0].page_content
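
Conceptually, a similarity search embeds the query and returns the stored texts whose vectors lie closest to it. The toy sketch below makes that concrete with hand-written 2-D vectors and Euclidean distance. The vectors, texts, and `nearest_texts` helper are invented for illustration; real embeddings come from the embedding model and have far more dimensions.

```python
import math

# Hypothetical (text, vector) pairs standing in for embedded chunks.
toy_index = [
    ("LangChain is a framework for LLM apps", [0.9, 0.1]),
    ("FAISS stores vectors for fast lookup", [0.2, 0.8]),
    ("Bananas are rich in potassium", [-0.7, -0.6]),
]


def nearest_texts(query_vector, index, k=2):
    # Rank stored texts by Euclidean distance to the query (smaller is closer).
    ranked = sorted(index, key=lambda item: math.dist(query_vector, item[1]))
    return [text for text, _ in ranked[:k]]


toy_hits = nearest_texts([1.0, 0.0], toy_index, k=2)
print(toy_hits)
```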