# <font color="#418FDE" size="6.5" uppercase>**LangChain Basics**</font>

>Last update: 20260118.
    
By the end of this lecture, you will be able to:
- Explain the main LangChain abstractions and how they relate to each other. 
- Create a minimal LangChain script that calls a Llama 3 model. 
- Configure basic settings such as temperature, max tokens, and model selection for Llama 3 within LangChain. 


## **1. LangChain Core Abstractions**

### **1.1. LLM and Chat Interfaces**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_01_01.jpg?v=1768760953" width="250">



>* LangChain offers text and chat model interfaces
>* Use single prompts or multi-message conversations as needed

>* Basic LLM interface takes one text prompt per call
>* Chat interface takes role-tagged messages; the model keeps no memory, so the caller passes the history on every call

>* LLM and chat interfaces underpin all components
>* They standardize configuration, context, and text generation



In [None]:
#@title Python Code - LLM and Chat Interfaces

# Demonstrate basic LangChain LLM and chat interfaces clearly.
# Show single text prompt versus multi message chat conversation.
# Use a fake model so everything runs offline.

# pip install langchain langchain-community langchain-core

# Import required LangChain core classes.
from langchain_core.language_models import LLM
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
from langchain_core.outputs import ChatGeneration, ChatResult

# Import typing tools for type hints.
from typing import List


# Create a simple fake LLM returning formatted responses.
class SimpleFakeLLM(LLM):

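    # Define required _call method for text interface.
    # run_manager and **kwargs mirror the base-class signature so extra call options are accepted.
    def _call(self, prompt: str, stop: List[str] | None = None, run_manager=None, **kwargs) -> str:
        return f"[LLM reply] You said: {prompt[:60]}"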

    # Define required _llm_type property name string.
    @property
    def _llm_type(self) -> str:
        return "simple_fake_llm"


# Create a simple fake chat model handling message lists.
class SimpleFakeChatModel(BaseChatModel):

    # Define _generate method processing chat messages list.
    def _generate(self, messages: List, stop: List[str] | None = None, **kwargs):
        last_user = next((m for m in reversed(messages) if isinstance(m, HumanMessage)), None)
        content = last_user.content if last_user else "nothing received"
        ai_message = AIMessage(content=f"[Chat reply] I remember: {content[:60]}")
        generation = ChatGeneration(message=ai_message)
        return ChatResult(generations=[generation])

    # Define required _llm_type property name string.
    @property
    def _llm_type(self) -> str:
        return "simple_fake_chat_model"


# Instantiate both fake models for demonstration.
text_llm = SimpleFakeLLM()
chat_llm = SimpleFakeChatModel()

# Demonstrate traditional single text LLM interface.
single_prompt = "Summarize this note about a ten foot ladder safety checklist."
text_response = text_llm.invoke(single_prompt)

# Print result from traditional LLM interface.
print("Traditional LLM interface output:")
print(text_response)


# Build a short chat history with roles and messages.
messages = [
    SystemMessage(content="You are a helpful workshop safety assistant."),
    HumanMessage(content="I bought a new eight foot ladder yesterday."),
    AIMessage(content="Great, always inspect rungs before climbing."),
    HumanMessage(content="Remind me of your earlier ladder safety advice."),
]

# Call the chat model with structured conversation messages.
chat_response = chat_llm.invoke(messages)

# Print result from chat oriented interface.
print("\nChat interface output:")
print(chat_response.content)



### **1.2. Designing Prompt Templates**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_01_02.jpg?v=1768761017" width="250">



>* Prompt templates are reusable blueprints for instructions
>* They separate stable context from changing user inputs

>* Templates define roles, goals, and output format
>* They reduce ambiguity and simplify iteration across inputs

>* Templates connect human intent to LangChain workflows
>* They enable chaining, versioning, testing, and reuse



In [None]:
#@title Python Code - Designing Prompt Templates

# Demonstrate simple prompt templates with dynamic placeholders.
# Show separating stable instructions from changing user inputs.
# Print filled prompts to illustrate reusable template behavior.

# No external packages are required; this cell uses only Python's built-in str.format.

# Define a stable prompt template with placeholders.
template_text = "You are a helpful tutor. Explain {topic} to a {audience}."

# Define dynamic values that change between calls.
user_topic_one = "Python lists basics"
user_audience_one = "high school student"

# Fill the template for the first scenario.
filled_prompt_one = template_text.format(topic=user_topic_one, audience=user_audience_one)

# Define another set of dynamic values for reuse.
user_topic_two = "neural networks overview"
user_audience_two = "busy business manager"

# Fill the template again with different values.
filled_prompt_two = template_text.format(topic=user_topic_two, audience=user_audience_two)

# Print the original template to show structure.
print("TEMPLATE STRUCTURE:\n" + template_text)

# Print the first filled prompt to show customization.
print("\nFILLED PROMPT ONE:\n" + filled_prompt_one)

# Print the second filled prompt to show reuse.
print("\nFILLED PROMPT TWO:\n" + filled_prompt_two)



### **1.3. Building and Running Chains**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_01_03.jpg?v=1768761037" width="250">



>* Chains connect workflow components into repeatable pipelines
>* They wrap prompts and models into one callable

>* Chains pass structured data through modular steps
>* They coordinate complex workflows without manual orchestration

>* Chains run like functions with extra features
>* They enable reuse, scaling, integration, and teamwork



In [None]:
#@title Python Code - Building and Running Chains

# Demonstrate building simple LangChain style chains conceptually using plain Python functions.
# Show how data flows through chained processing steps in a repeatable pipeline.
# Illustrate replacing manual orchestration with a single callable workflow style function.

# This example needs no external libraries; the install line is kept only for reference.
# !pip install langchain-community langchain-openai llama-cpp-python

# Define a function that simulates a prompt template expansion step.
def build_prompt(user_question, tone_instruction):
    template_text = f"Instruction: {tone_instruction}\nQuestion: {user_question}"
    return template_text

# Define a function that simulates a language model style response.
def fake_llm(prompt_text):
    answer_text = "This is a concise helpful answer based on: " + prompt_text
    return answer_text

# Define a function that simulates an output formatting parser step.
def format_answer(raw_answer):
    formatted_text = "Customer Support Reply: " + raw_answer
    return formatted_text

# Define a chain function that connects all previous components sequentially.
def support_chain(user_question, tone_instruction):
    prompt_text = build_prompt(user_question, tone_instruction)
    raw_answer = fake_llm(prompt_text)
    final_answer = format_answer(raw_answer)
    return final_answer

# Define another chain that reuses components but changes control flow slightly.
def short_summary_chain(user_question):
    prompt_text = build_prompt(user_question, "Be brief and friendly.")
    raw_answer = fake_llm(prompt_text)
    summary_text = raw_answer[:140]
    return summary_text

# Prepare example inputs that will be passed into both chain functions.
user_question = "My package arrived damaged, what can I do now?"
preferred_tone = "Be very polite and reassuring in every sentence."

# Run the support chain and print the single polished answer output.
support_reply = support_chain(user_question, preferred_tone)
print("Full chain reply:\n", support_reply)

# Run the summary chain and print its shorter processed output result.
summary_reply = short_summary_chain(user_question)
print("\nSummary chain reply:\n", summary_reply)



## **2. First Llama Call**

### **2.1. Connecting Llama API**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_02_01.jpg?v=1768761062" width="250">



>* Set up LangChain, Llama provider, and auth
>* Use secure API keys or variables before calling

>* Configure a LangChain client for specific Llama models
>* Swap models later by only changing configuration

>* Centralize stable settings for Llama connection configuration
>* Shared connection simplifies maintenance, debugging, and scaling



In [None]:
#@title Python Code - Connecting Llama API

# Demonstrate connecting LangChain client with a fake Llama style API key.
# Show environment variable usage instead of hard coded secret strings.
# Print confirmation that configuration and simple test call both succeed.

# No external packages are required; this cell uses only the standard library.

# Import os module for environment variable handling.
import os

# Create fake API key value for demonstration only.
fake_api_key_value = "llama-3-demo-key-12345"

# Set environment variable that would normally store real secret key.
os.environ["LLAMA_API_KEY"] = fake_api_key_value

# Read environment variable back to simulate secure configuration usage.
configured_key_value = os.getenv("LLAMA_API_KEY", "missing-key-placeholder")

# Check whether environment variable was correctly configured and accessible.
connection_ok = configured_key_value == fake_api_key_value

# Prepare dictionary representing minimal LangChain style client configuration.
llama_connection_config = {
    "api_key": configured_key_value,
    "base_url": "https://api.llama-provider.example.com/v1",
}

# Simulate simple model call using configuration without performing network request.
def fake_llama_completion(prompt_text, config_dict):
    # Use the passed-in configuration rather than a global flag.
    has_key = config_dict.get("api_key") not in (None, "", "missing-key-placeholder")
    status_text = "connected" if has_key else "not_connected"
    return f"Status {status_text} for prompt '{prompt_text}'."

# Define short prompt that would be sent through LangChain to Llama model.
prompt_message_text = "Explain why environment variables protect secret keys."

# Call fake completion function to mimic LangChain Llama invocation.
response_text = fake_llama_completion(prompt_message_text, llama_connection_config)

# Print confirmation of connection status and example response text.
print("Connection configured:", connection_ok, "\nResponse:", response_text)
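
One way to centralize the stable connection settings is a small immutable config object that is loaded once and fails fast when the key is missing. The sketch below uses only the standard library; `LLAMA_BASE_URL`, `LLAMA_MODEL`, the `LlamaConnection` class, and the model name `llama-3-large` are hypothetical names chosen for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LlamaConnection:
    """Stable connection settings shared by every call (hypothetical helper)."""
    api_key: str
    base_url: str
    model: str


def load_connection(env: Optional[Mapping[str, str]] = None) -> LlamaConnection:
    """Read settings from the environment and fail fast if the key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("LLAMA_API_KEY", "")
    if not api_key:
        raise RuntimeError("LLAMA_API_KEY is not set; export it before running.")
    return LlamaConnection(
        api_key=api_key,
        # Placeholder URL and model name; replace with your provider's values.
        base_url=env.get("LLAMA_BASE_URL", "https://api.llama-provider.example.com/v1"),
        model=env.get("LLAMA_MODEL", "llama-3-mini"),
    )


def mask(secret: str) -> str:
    """Show only the first characters so the key never appears in full in output."""
    return secret[:4] + "..." if len(secret) > 8 else "****"


# A plain dict stands in for os.environ so the demo does not touch real settings.
demo_env = {"LLAMA_API_KEY": "llama-3-demo-key-12345"}
connection = load_connection(demo_env)
print(connection.model, connection.base_url, mask(connection.api_key))

# Swapping models is a configuration change only; the calling code stays the same.
swapped = load_connection({**demo_env, "LLAMA_MODEL": "llama-3-large"})
print(swapped.model)
```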



### **2.2. Basic Completion Demo**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_02_02.jpg?v=1768761083" width="250">



>* Send a short prompt, receive continued text
>* Move from setup to real model interaction

>* Use simple, checkable prompts for first completion
>* Review output to confirm sensible request–response behavior

>* Experiment with varied prompts, tones, and constraints
>* Observe strengths, weaknesses, and instruction-following behavior



In [None]:
#@title Python Code - Basic Completion Demo

# Demonstrate a basic Llama completion using LangChain in Colab.
# Show how a simple prompt receives a model generated continuation.
# Help you verify the basic request response loop works correctly.

# !pip install langchain-openai langchain-core python-dotenv

# Import required classes from LangChain and OpenAI wrapper.
from langchain_openai import ChatOpenAI

# Import environment helper for reading API keys securely.
import os

# Explain how to set your OpenAI compatible API key.
api_key_instructions = "Set OPENAI_API_KEY in environment before running this cell."

# Print short reminder about required environment variable.
print(api_key_instructions)

# Read the API key from environment variables securely.
api_key = os.getenv("OPENAI_API_KEY", "missing_key_placeholder")

# Create a simple check that warns when key seems missing.
if api_key == "missing_key_placeholder":

    # Print friendly message explaining how to provide the key.
    print("API key missing, please set OPENAI_API_KEY and rerun script.")
else:

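    # Initialize the LangChain chat model. As written this targets OpenAI's "gpt-4o-mini";
    # to call Llama 3, pass your provider's OpenAI-compatible base_url and its Llama 3 model name.
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, api_key=api_key)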

    # Define a short prompt that requests a clear, verifiable explanation.
    prompt_text = "Explain in three sentences how a household refrigerator works."

    # Call the model with the prompt and store the response object.
    response = llm.invoke(prompt_text)

    # Print a separator line to distinguish prompt and model output.
    print("--- Model completion starts below this line ---")

    # Print the model generated text content for quick inspection.
    print(response.content)
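
A first completion is easiest to judge when the prompt has an answer you can check. The sketch below is a small smoke-test helper that works with any function mapping a prompt to text; `smoke_test`, `canned_ask`, and the keyword list are hypothetical, and the canned reply stands in for a real model so the example runs offline.

```python
from typing import Callable, List, Tuple


def smoke_test(
    ask: Callable[[str], str], prompt: str, expected_keywords: List[str]
) -> Tuple[str, List[str]]:
    """Send one prompt and report which expected keywords are missing from the answer."""
    answer = ask(prompt)
    lowered = answer.lower()
    missing = [kw for kw in expected_keywords if kw.lower() not in lowered]
    return answer, missing


def canned_ask(prompt: str) -> str:
    """Offline stand-in for a model call; always returns the same text."""
    return (
        "A refrigerator uses a compressor to circulate refrigerant, "
        "which absorbs heat inside and releases it outside."
    )


answer, missing = smoke_test(
    canned_ask,
    "Explain in three sentences how a household refrigerator works.",
    ["refrigerant", "heat", "compressor"],
)
print("Answer:", answer)
print("Missing keywords:", missing)
```

With a live model, the same helper could be called with `lambda p: llm.invoke(p).content` in place of `canned_ask`. A keyword check is only a rough sanity test, so the output still deserves a manual read.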



### **2.3. Handling Basic Errors**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_02_03.jpg?v=1768761198" width="250">



>* Common errors: connection, configuration, and content
>* Use categories to quickly diagnose and fix

>* Catch errors and show clear, simple messages
>* Hide technical details, keep user experience friendly

>* Design scripts to recover from common errors
>* Use retries, validation, and clear fail-fast checks



In [None]:
#@title Python Code - Handling basic errors

# Demonstrate simple error handling around a fake Llama model call.
# Show different error categories and friendly user facing messages.
# Keep everything beginner friendly and runnable inside Google Colab.

# No external packages are required; this cell uses only the standard library.

# Import random for simulating different error situations.
import random

# Define a custom exception for configuration related problems.
class ConfigurationError(Exception):
    pass

# Define a custom exception for content related problems.
class ContentError(Exception):
    pass

# Define a function that simulates calling a remote Llama model.
def fake_llama_call(prompt, model_name):
    # Check for empty prompt and raise content related error.
    if not prompt.strip():
        raise ContentError("Prompt is empty and cannot be processed.")

    # Check for obviously wrong model name and raise configuration error.
    if model_name != "llama-3-mini":
        raise ConfigurationError("Requested model name is not recognized.")

    # Randomly simulate a connectivity failure using random choice.
    if random.choice([False, True]):
        raise ConnectionError("Network connection to Llama service failed.")

    # If everything passes, return a short pretend completion string.
    return "Pretend Llama response for your helpful prompt."

# Import logging so technical details go to the log instead of the user-facing message.
import logging

# Define a helper function that wraps the fake call with error handling.
def safe_llama_request(prompt, model_name):
    # Try calling the fake model and catch different error categories.
    try:
        response = fake_llama_call(prompt, model_name)
        print("Model call succeeded with response below.")
        print(response)

    # Handle connectivity problems with a friendly retry suggestion.
    except ConnectionError as error:
        print("Could not reach model service right now.")
        print("Please check internet and try again soon.")
        logging.warning("Connection error: %s", error)

    # Handle configuration problems like wrong model names or parameters.
    except ConfigurationError as error:
        print("Configuration problem detected with your model settings.")
        print("Please verify model name and configuration values.")
        logging.warning("Configuration error: %s", error)

    # Handle content problems like empty prompts or disallowed material.
    except ContentError as error:
        print("There is a problem with the provided prompt content.")
        print("Please adjust the text and avoid restricted material.")
        logging.warning("Content error: %s", error)

    # Handle any completely unexpected error types gracefully for users.
    except Exception as error:
        print("An unexpected error occurred during the model call.")
        print("Please try again or contact the system maintainer.")
        logging.exception("Unexpected error: %s", error)

# Define an example prompt; the simulated network failure makes the outcome vary between runs.
prompt_text = "Write a short two sentence story about a ten foot rocket."

# Call the safe request function using a correct model name example.
safe_llama_request(prompt_text, "llama-3-mini")
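
The retry idea from the bullets can be sketched as a small wrapper that retries only transient connection failures and waits longer after each one. This is a minimal illustration using the standard library; `call_with_retries`, `flaky_call`, and the delay values are hypothetical, and the `sleep` parameter is injectable so the demo does not actually wait.

```python
import time
from typing import Callable, List


def call_with_retries(
    call: Callable[[], str],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry on ConnectionError with exponential backoff; other errors fail fast."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except ConnectionError:
            # Give up after the final attempt and let the caller handle the error.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")


# Simulated service that fails twice and then succeeds.
attempts = {"count": 0}


def flaky_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("Simulated network failure.")
    return "Pretend Llama response."


# Record the requested delays instead of sleeping, so the demo runs instantly.
delays: List[float] = []
result = call_with_retries(flaky_call, max_attempts=4, base_delay=0.5, sleep=delays.append)
print(result, "after", attempts["count"], "attempts; delays requested:", delays)
```

Configuration and content problems are deliberately not retried here, since repeating the same bad request would fail the same way.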



## **3. Configuring Llama Models**

### **3.1. Temperature and Randomness**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_03_01.jpg?v=1768761225" width="250">



>* Temperature controls response randomness versus predictability
>* Low suits stable tasks; high suits creativity

>* Low temperature gives more stable, repeatable answers, though not necessarily more accurate ones
>* High temperature increases creativity, variety, and surprise

>* Treat temperature as a tunable workflow knob
>* Test tasks, adjust temperature for reliability versus creativity



In [None]:
#@title Python Code - Temperature and randomness

# Demonstrate temperature randomness using simple random word choices.
# Show how low temperature feels predictable and stable.
# Show how high temperature feels creative and surprising.

# No external packages are required; this cell uses only the standard library.

# Import random module for random number generation.
import random

# Prepare two word lists representing safe and creative choices.
safe_words = ["calm", "clear", "simple", "direct", "plain"]
creative_words = ["whimsical", "sparkling", "twisted", "unexpected", "wild"]

# Define a function that simulates temperature controlled word picking.
def pick_word(temperature):
    # Generate random number between zero and one.
    r = random.random()
    # Low temperature favors safe predictable words strongly.
    if temperature < 0.3 and r < 0.9:
        return random.choice(safe_words)
    # High temperature favors creative surprising words strongly.
    if temperature > 0.7 and r < 0.9:
        return random.choice(creative_words)
    # Otherwise mix both lists for moderate randomness.
    return random.choice(safe_words + creative_words)

# Set a fixed seed for reproducible demonstration runs.
random.seed(42)

# Choose a simple base sentence for comparison.
base_sentence = "The evening breeze feels"

# Build example with low temperature showing stable predictable behavior.
low_temp_sentence = base_sentence + " " + pick_word(0.1)

# Build example with medium temperature showing balanced behavior.
mid_temp_sentence = base_sentence + " " + pick_word(0.5)

# Build example with high temperature showing creative behavior.
high_temp_sentence = base_sentence + " " + pick_word(0.9)

# Print all three sentences to compare randomness effects.
print("Low temperature example:", low_temp_sentence)
print("Medium temperature example:", mid_temp_sentence)
print("High temperature example:", high_temp_sentence)
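
The word-picking cell above is only a simulation. In typical samplers, temperature divides the model's raw token scores (logits) before they are turned into probabilities with softmax, so a low temperature concentrates probability on the top token and a high temperature spreads it out. The sketch below shows that calculation; the four logits are made-up numbers for illustration.

```python
import math
from typing import Dict


def softmax_with_temperature(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive for this formula")
    scaled = {token: score / temperature for token, score in logits.items()}
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled.values())
    exps = {token: math.exp(score - peak) for token, score in scaled.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


# Made-up scores for four candidate next words.
example_logits = {"calm": 2.0, "clear": 1.5, "whimsical": 0.5, "wild": 0.0}

for temperature in (0.2, 1.0, 2.0):
    probabilities = softmax_with_temperature(example_logits, temperature)
    summary = ", ".join(f"{token}={p:.2f}" for token, p in probabilities.items())
    print(f"temperature={temperature}: {summary}")
```

Running it shows the top word taking most of the probability at 0.2 and the four words moving closer together at 2.0.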



### **3.2. Max Tokens and Truncation**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_03_02.jpg?v=1768761251" width="250">



>* Max tokens limits response length and token usage
>* Impacts cost, speed, and response detail level

>* Long prompts can exceed the model’s context
>* Truncation risks missing information and misleading outputs

>* Adjust max tokens to match task length
>* Monitor truncation and tune settings from usage



In [None]:
#@title Python Code - Max tokens and truncation

# Demonstrate max tokens limiting response length clearly.
# Show truncation when context becomes too large.
# Compare short and long outputs side by side.

# pip install langchain-openai
# pip install langchain-community
# pip install python-dotenv

# Import required standard and external libraries.
import os
from textwrap import shorten

# Import LangChain OpenAI chat model wrapper.
from langchain_openai import ChatOpenAI

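# Use the real key if it is already set; otherwise fall back to a placeholder
# so the model objects can be constructed offline (no request is sent in this cell).
# os.environ["OPENAI_API_KEY"] = "your_api_key_here"
os.environ["OPENAI_API_KEY"] = os.environ.get("OPENAI_API_KEY", "test_key")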

# Create model with small max tokens for short answer.
short_model = ChatOpenAI(model="gpt-4o-mini", max_tokens=20, temperature=0.2)

# Create model with larger max tokens for longer answer.
long_model = ChatOpenAI(model="gpt-4o-mini", max_tokens=120, temperature=0.2)

# Define a prompt asking for detailed explanation.
prompt = "Explain how a car engine works using simple everyday language."

# Canned stand-in for the short answer; uncomment the live call to query the model.
# short_answer = short_model.invoke(prompt).content
short_answer = "This is a short example answer about how a car engine works."

# Canned stand-in for the longer answer; uncomment the live call to query the model.
# long_answer = long_model.invoke(prompt).content
long_answer = (
    "A car engine burns a mixture of fuel and air inside metal cylinders. "
    "Small explosions push pistons up and down, turning the crankshaft, "
    "which eventually turns the wheels through the transmission."
)

# Helper function to safely preview text length.
def preview(text, width):
    return shorten(text.replace("\n", " "), width=width, placeholder="...")

# Print the short answer preview.
print("SHORT ANSWER (max_tokens=20):", preview(short_answer, 140))

# Print the long answer preview; preview() clips the displayed text to 140 characters.
print("\nLONG ANSWER (max_tokens=120):", preview(long_answer, 140))
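
To make the effect of a token limit visible without a live model, the sketch below simulates it. It assumes a crude tokenizer that counts one token per whitespace-separated word (real tokenizers split text into subword pieces), and it mimics the way OpenAI-style chat APIs typically report a `finish_reason` of `"length"` when the limit cuts a reply short. The helper names and the context window sizes are illustrative.

```python
from typing import Tuple


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def generate_with_limit(full_answer: str, max_tokens: int) -> Tuple[str, str]:
    """Return the answer cut at max_tokens plus a finish reason ('stop' or 'length')."""
    words = full_answer.split()
    if len(words) <= max_tokens:
        return full_answer, "stop"
    return " ".join(words[:max_tokens]), "length"


def fits_context(prompt: str, max_tokens: int, context_window: int) -> bool:
    """Check that the prompt plus the reserved output budget fits the context window."""
    return count_tokens(prompt) + max_tokens <= context_window


full_answer = (
    "A car engine burns a mixture of fuel and air inside metal cylinders. "
    "Small explosions push pistons up and down, turning the crankshaft, "
    "which eventually turns the wheels through the transmission."
)
demo_prompt = "Explain how a car engine works using simple everyday language."

short_text, short_reason = generate_with_limit(full_answer, max_tokens=20)
long_text, long_reason = generate_with_limit(full_answer, max_tokens=120)

print(f"max_tokens=20  -> finish_reason={short_reason}: {short_text}")
print(f"max_tokens=120 -> finish_reason={long_reason}: {long_text}")

# Illustrative context sizes: a tiny window fails the check, a larger one passes.
print("Fits a 128-token window:", fits_context(demo_prompt, 120, 128))
print("Fits an 8192-token window:", fits_context(demo_prompt, 120, 8192))
```

Checking the finish reason after each call is a simple way to monitor truncation and decide whether the limit needs raising.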



### **3.3. Choosing Llama 3 Variants**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master LangChain & Llama 3/Module_01/Lecture_B/image_03_03.jpg?v=1768761298" width="250">



>* Smaller Llama models are faster and cheaper
>* Larger models handle complex, nuanced tasks better

>* Match each Llama variant to its use case
>* Balance speed, volume, accuracy, and user expectations

>* Iteratively test multiple Llama variants on tasks
>* Use metrics and tiered models to balance tradeoffs



In [None]:
#@title Python Code - Choosing Llama 3 Variants

# Demonstrate choosing different Llama variants conceptually using simple scoring simulation.
# Compare small fast model versus large accurate model for different task difficulty levels.
# Show how task type influences which simulated model variant is preferable.

# No external packages are required; this cell uses only the standard library.

# Import random module for simple score variability simulation.
import random

# Define function that simulates small model performance scores for tasks.
def simulate_small_model_score(task_difficulty):
    # Use base score higher for easy tasks, lower for hard tasks.
    base_scores = {"easy": 85, "medium": 70, "hard": 55}
    # Add random noise to base score for realism and variability.
    noise = random.randint(-5, 5)
    # Return final clamped score between zero and one hundred.
    return max(0, min(100, base_scores[task_difficulty] + noise))

# Define function that simulates large model performance scores for tasks.
def simulate_large_model_score(task_difficulty):
    # Large model: strong on easy tasks, strongest on hard tasks.
    base_scores = {"easy": 88, "medium": 90, "hard": 95}
    # Add random noise to base score for realism and variability.
    noise = random.randint(-3, 3)
    # Return final clamped score between zero and one hundred.
    return max(0, min(100, base_scores[task_difficulty] + noise))

# Define list describing example tasks with difficulty and description.
tasks = [
    ("easy", "Tag short customer questions by topic for quick routing."),
    ("medium", "Summarize a one page product review for marketing."),
    ("hard", "Analyze multi page legal style contract for risky clauses."),
]

# Print header explaining comparison between conceptual model variants.
print("Comparing small versus large Llama style variants for tasks.\n")

# Loop through tasks and simulate scores for both model variants.
for difficulty, description in tasks:
    # Compute simulated scores for small and large model variants.
    small_score = simulate_small_model_score(difficulty)
    large_score = simulate_large_model_score(difficulty)
    # Print task description and difficulty label for context.
    print(f"Task difficulty: {difficulty.upper()} - {description}")
    # Print simulated scores showing relative model performance differences.
    print(f"Small variant score: {small_score} versus large variant score: {large_score}\n")
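
The tiered-model idea can be sketched as a routing rule: pick the cheapest variant whose expected quality meets a threshold, and fall back to the strongest one otherwise. The scores below reuse the simulated base scores from the cell above; the relative costs, the threshold, and the `choose_variant` helper are hypothetical and would in practice come from your own measurements.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    name: str
    relative_cost: float            # hypothetical cost units, not real prices
    expected_score: Dict[str, int]  # simulated scores, not benchmark results


VARIANTS: List[Variant] = [
    Variant("small", 1.0, {"easy": 85, "medium": 70, "hard": 55}),
    Variant("large", 8.0, {"easy": 88, "medium": 90, "hard": 95}),
]


def choose_variant(difficulty: str, min_score: int, variants: List[Variant] = VARIANTS) -> Variant:
    """Return the cheapest variant that meets min_score, else the highest scoring one."""
    good_enough = [v for v in variants if v.expected_score[difficulty] >= min_score]
    if not good_enough:
        return max(variants, key=lambda v: v.expected_score[difficulty])
    return min(good_enough, key=lambda v: v.relative_cost)


for difficulty in ("easy", "medium", "hard"):
    chosen = choose_variant(difficulty, min_score=80)
    print(f"{difficulty}: route to {chosen.name} (relative cost {chosen.relative_cost})")
```

With these illustrative numbers, easy tasks go to the small variant and the rest go to the large one, which keeps the expensive model for the work that needs it.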



# <font color="#418FDE" size="6.5" uppercase>**LangChain Basics**</font>


In this lecture, you learned to:
- Explain the main LangChain abstractions and how they relate to each other. 
- Create a minimal LangChain script that calls a Llama 3 model. 
- Configure basic settings such as temperature, max tokens, and model selection for Llama 3 within LangChain. 

In the next lecture (Lecture C), we will cover 'Environment Setup'.