# Environment Setup
This section loads the API endpoint and key settings from a local `.env` file into the process environment.

In [27]:
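import os

from dotenv import load_dotenv

# Read KEY=value pairs from a local .env file into os.environ
# (variables that are already set are left unchanged).
load_dotenv()

# Fail early with a clear message if a required variable is missing.
for name in ('LLM_ENDPOINT', 'LANGCHAIN_ENDPOINT', 'LANGCHAIN_API_KEY'):
    if not os.getenv(name):
        raise RuntimeError(f'{name} is not set; add it to .env or the environment')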



# Load and Parse Web Page
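This section loads the blog post at the URL below with LangChain's `WebBaseLoader` and uses a BeautifulSoup `SoupStrainer` to keep only the elements with the `post-content`, `post-title`, and `post-header` classes.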

In [5]:
import bs4
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

docs = loader.load()

# Split Documents into Chunks
This section splits the loaded documents into chunks of up to 1,000 characters, with a 200-character overlap between neighboring chunks so that context is not lost at chunk boundaries.

In [6]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)

splits = text_splitter.split_documents(docs)
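
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a plain sliding-window chunker. It is a simplification for illustration only: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and sentences, so its chunks will not line up with these. The helper `sliding_chunks` and the sample text are hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # distance between window starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample: 2,500 characters of repeating digits.
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_chunks(sample)

print([len(c) for c in chunks])
# The tail of one chunk is repeated at the head of the next.
print(chunks[0][-200:] == chunks[1][:200])
```

With a step of 800 characters, each chunk repeats the last 200 characters of its predecessor, which is what lets a sentence that straddles a boundary appear whole in at least one chunk.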

# Create Vector Store and Generate Embeddings
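This section embeds each text chunk with an Ollama-served model (`qwen3:14b` here) and stores the resulting vectors in a Chroma collection. No persistence directory is passed, so the collection lives in memory only.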

In [25]:
from langchain_community.vectorstores import Chroma
from langchain_ollama.embeddings import OllamaEmbeddings

vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=OllamaEmbeddings(model="qwen3:14b")
)

# Cleanup Vector Store
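This section deletes the Chroma collection to free up resources. Run it only when you have finished querying: the retrieval cell below needs the collection, so re-run the vector store cell first if you have already deleted it.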

In [24]:
vectorstore.delete_collection()

# Retrieve and Print Document
This section retrieves the chunks most similar to the query and prints the content of the top-ranked one.

In [26]:
retriever = vectorstore.as_retriever()

docs = retriever.invoke("What is Task Decomposition?")

print(docs[0].page_content)

Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains.
Self-Reflection#
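
The idea behind the retrieval step can be sketched in plain Python: embed the query, score every stored chunk against it, and return the best matches. This is an illustration only. The three-dimensional vectors and chunk labels are made up, and Chroma uses its own index and configured distance metric rather than this brute-force cosine ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=4):
    """Return the k chunk labels whose vectors are most similar to the query."""
    scored = sorted(
        chunk_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [label for label, _ in scored[:k]]


# Hypothetical toy embeddings; real embeddings have far more dimensions.
toy_vectors = {
    "task decomposition": [0.9, 0.1, 0.0],
    "self-reflection": [0.2, 0.9, 0.1],
    "memory": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

ranked = top_k(toy_query, toy_vectors, k=2)
print(ranked)
```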


# Retrieve Prompt from LangChain Hub
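This section pulls the `rlm/rag-prompt` prompt from the LangChain Hub and prints it. In the recorded run below the pull failed with an SSL certificate error. The host in the error message is `api.smith.lanchain.com`, which looks like a misspelling of `api.smith.langchain.com` in `LANGCHAIN_ENDPOINT`.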

In [17]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

print(prompt)

Failed to get info from https://api.smith.lanchain.com: LangSmithConnectionError('Connection error caused failure to GET /info in LangSmith API. Please confirm your LANGCHAIN_ENDPOINT. SSLError(MaxRetryError("HTTPSConnectionPool(host=\'api.smith.lanchain.com\', port=443): Max retries exceeded with url: /info (Caused by SSLError(SSLCertVerificationError(1, \'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:997)\')))"))\nContent-Length: None\nAPI Key: lsv2_********************************************3a')


LangSmithConnectionError: Connection error caused failure to GET /commits/rlm/rag-prompt/latest in LangSmith API. Please confirm your LANGCHAIN_ENDPOINT. SSLError(MaxRetryError("HTTPSConnectionPool(host='api.smith.lanchain.com', port=443): Max retries exceeded with url: /commits/rlm/rag-prompt/latest (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:997)')))"))
Content-Length: None
API Key: lsv2_********************************************3a
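
The host in the error above, `api.smith.lanchain.com`, is missing the "g" in "langchain". A small check like the following sketch could catch such a typo before any request is made. The helper name `endpoint_host_matches` is hypothetical, and the expected host is assumed to be the default LangSmith API host; a self-hosted or regional deployment would use a different value.

```python
from urllib.parse import urlparse

# Assumption: the default LangSmith API host. Adjust for other deployments.
EXPECTED_HOST = "api.smith.langchain.com"


def endpoint_host_matches(endpoint: str, expected_host: str = EXPECTED_HOST) -> bool:
    """Return True if the URL's hostname equals the expected host."""
    return urlparse(endpoint).hostname == expected_host


print(endpoint_host_matches("https://api.smith.lanchain.com"))   # host from the error above
print(endpoint_host_matches("https://api.smith.langchain.com"))  # correctly spelled host
```

In the notebook, the value to check would be `os.getenv('LANGCHAIN_ENDPOINT')`. Once the hostname is corrected, re-running `hub.pull` shows whether the certificate error was only a side effect of the misspelled host.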