# LSM-002: Your First LangSmith Project

## 🎯 Learning Objectives

By the end of this notebook, you will:
- Set up your development environment with LangSmith
- Create and run your first traced LLM application
- Navigate the LangSmith dashboard
- Understand basic tracing concepts through hands-on experience
- Run your first evaluation

## 🛠️ Environment Setup

Let's start by installing the necessary packages and setting up your environment.

In [None]:
# Install required packages
!pip install langsmith openai python-dotenv

# Optional: Install langchain if you want to see LangChain integration examples
!pip install langchain langchain-openai

## 🔑 Configuration

Before we start coding, configure your API keys. The cell below loads them from a `.env` file in your project directory (a template follows in the next section) and verifies that the required variables are set:

In [None]:
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Required environment variables:
# LANGSMITH_API_KEY=your_langsmith_api_key_here
# LANGSMITH_PROJECT=your_project_name_here
# OPENAI_API_KEY=your_openai_api_key_here (optional, for OpenAI examples)

# Verify your setup
langsmith_api_key = os.getenv("LANGSMITH_API_KEY")
langsmith_project = os.getenv("LANGSMITH_PROJECT")

if not langsmith_api_key:
    print("⚠️  LANGSMITH_API_KEY not found. Please set it in your .env file.")
else:
    print(f"✅ LangSmith API Key: {langsmith_api_key[:8]}...")

if not langsmith_project:
    print("⚠️  LANGSMITH_PROJECT not found. Please set it in your .env file.")
else:
    print(f"✅ LangSmith Project: {langsmith_project}")

## 📝 Your .env File Template

Create a file named `.env` in your project directory with this content:

```env
# LangSmith Configuration
LANGSMITH_API_KEY=your_langsmith_api_key_here
LANGSMITH_PROJECT=learning-langsmith
LANGSMITH_TRACING=true

# Optional: OpenAI API Key (for OpenAI examples)
OPENAI_API_KEY=your_openai_api_key_here

# Optional: Other LLM API keys
# ANTHROPIC_API_KEY=your_anthropic_key_here
# GOOGLE_API_KEY=your_google_key_here
```

Replace the placeholder values with your actual API keys.
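
The verification cell earlier checks only two of these variables. A slightly more general check is sketched below, assuming `LANGSMITH_API_KEY`, `LANGSMITH_PROJECT`, and `LANGSMITH_TRACING` are the settings you want to enforce. It reports what is missing and masks secrets before printing. The helper names `missing_settings` and `mask_secret` are hypothetical and not part of the LangSmith SDK.

```python
import os

# Settings this notebook relies on (taken from the .env template above)
REQUIRED_SETTINGS = ("LANGSMITH_API_KEY", "LANGSMITH_PROJECT", "LANGSMITH_TRACING")


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the names of required settings that are absent or blank."""
    return [name for name in required if not (env.get(name) or "").strip()]


def mask_secret(value, visible=4):
    """Show only the first few characters of a secret for logging."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All settings found. API key:", mask_secret(os.environ["LANGSMITH_API_KEY"]))
```

Passing the environment in as an argument keeps the check easy to exercise with a plain dictionary, without touching your real configuration.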

## 🚀 Example 1: Simple LLM Call with Manual Tracing

Let's start with a basic example using the LangSmith SDK directly to trace a simple LLM call.

In [None]:
from langsmith import traceable
import openai

# Initialize OpenAI client (named to avoid clashing with the LangSmith `Client` created in Example 4)
openai_client = openai.OpenAI()

@traceable(run_type="llm")
def call_openai(messages, model="gpt-3.5-turbo"):
    """A simple traced OpenAI call"""
    response = openai_client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0.7
    )
    return response.choices[0].message.content

# Test the function
messages = [
    {"role": "user", "content": "Explain quantum computing in simple terms."}
]

try:
    result = call_openai(messages)
    print("Response:")
    print(result)
    print("\n✅ Success! Check your LangSmith dashboard to see the trace.")
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure your OpenAI API key is set correctly.")
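
To build intuition for what `@traceable` records, here is a toy decorator that captures the same kind of information (function name, inputs, output or error, latency) in an in-memory list. This is a conceptual sketch only: `toy_traceable` and `TRACE_LOG` are hypothetical names, and the real SDK sends runs to LangSmith and handles nesting and async code, none of which is modeled here.

```python
import functools
import time

TRACE_LOG = []  # hypothetical in-memory stand-in for the LangSmith backend


def toy_traceable(run_type="chain"):
    """Toy decorator: record inputs, output or error, and latency of each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            record = {
                "name": func.__name__,
                "run_type": run_type,
                "inputs": {"args": args, "kwargs": kwargs},
            }
            start = time.perf_counter()
            try:
                record["outputs"] = func(*args, **kwargs)
                return record["outputs"]
            except Exception as exc:
                record["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so every call is logged
                record["latency_seconds"] = time.perf_counter() - start
                TRACE_LOG.append(record)
        return wrapper
    return decorator


@toy_traceable(run_type="tool")
def shout(text):
    return text.upper()


shout("hello")
print(TRACE_LOG[-1]["name"], TRACE_LOG[-1]["outputs"])
```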

## 🔍 Example 2: Multi-Step Application with LangChain

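Next, let's build a multi-step application. With `LANGSMITH_TRACING=true` set, LangChain traces each `llm.invoke` call automatically, and because both calls run inside a `@traceable` function, they appear as child runs under a single parent trace.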

In [None]:
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
from langsmith import traceable

# Initialize the LLM
llm = ChatOpenAI(temperature=0.7, model="gpt-3.5-turbo")

@traceable(run_type="chain")
def research_assistant(topic):
    """A multi-step research assistant that generates questions and provides answers"""
    
    # Step 1: Generate research questions
    question_messages = [
        SystemMessage(content="You are a research assistant. Generate 3 interesting research questions about the given topic."),
        HumanMessage(content=f"Topic: {topic}")
    ]
    
    questions = llm.invoke(question_messages)
    
    # Step 2: Answer the first question in detail
    answer_messages = [
        SystemMessage(content="You are an expert researcher. Provide a detailed answer to the research question."),
        HumanMessage(content=f"Research questions: {questions.content}\n\nPlease answer the first question in detail.")
    ]
    
    answer = llm.invoke(answer_messages)
    
    return {
        "topic": topic,
        "questions": questions.content,
        "detailed_answer": answer.content
    }

# Test the research assistant
try:
    result = research_assistant("Renewable Energy")
    
    print(f"📚 Research Topic: {result['topic']}\n")
    print(f"❓ Generated Questions:\n{result['questions']}\n")
    print(f"📖 Detailed Answer:\n{result['detailed_answer']}")

    print("\n✅ Multi-step trace created! Check your LangSmith dashboard.")
    
except Exception as e:
    print(f"❌ Error: {e}")
    print("Make sure your API keys are configured correctly.")
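
Step 2 above passes all three questions back to the model and asks it to answer the first. If you would rather select the question in code, a small parser can split the numbered list beforehand. The sketch below assumes the model returns one question per line, prefixed with `1.`, `2)`, `-`, or similar. Real model output varies, so treat `extract_questions` (a hypothetical helper) and the made-up `sample` text as a starting point only.

```python
import re

# Matches leading list markers such as "1.", "2)", "-", "*" plus whitespace
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def extract_questions(text):
    """Split LLM output into individual questions, one per non-empty line."""
    questions = []
    for line in text.splitlines():
        cleaned = _LIST_MARKER.sub("", line).strip()
        # Keep only lines that look like questions; skip intros and blanks
        if cleaned.endswith("?"):
            questions.append(cleaned)
    return questions


# Illustrative text, not actual model output
sample = """Here are three questions:
1. How has solar panel efficiency changed since 2010?
2) What limits grid-scale battery storage?
- Which policies accelerate wind adoption?"""

first_question = extract_questions(sample)[0]
print(first_question)
```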

## 📊 Example 3: Adding Metadata and Tags

Let's enhance our tracing with metadata and tags for better organization and filtering.

In [None]:
from langsmith import traceable
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
import time

llm = ChatOpenAI(temperature=0.2, model="gpt-3.5-turbo")

@traceable(
    run_type="chain",
    metadata={"version": "1.0", "environment": "development"},
    tags=["sentiment-analysis", "quickstart"]
)
def sentiment_analyzer(text, include_reasoning=True):
    """Analyze sentiment with optional reasoning"""
    
    start_time = time.time()
    
    # Create prompt based on requirements
    if include_reasoning:
        prompt = f"""Analyze the sentiment of the following text and provide reasoning:
        
Text: {text}

Please provide:
1. Sentiment (Positive/Negative/Neutral)
2. Confidence score (0-1)
3. Brief reasoning
"""
    else:
        prompt = f"""Analyze the sentiment of the following text:
        
Text: {text}

Respond with just: Sentiment (Positive/Negative/Neutral) and Confidence (0-1)
"""
    
    messages = [
        SystemMessage(content="You are an expert sentiment analysis assistant."),
        HumanMessage(content=prompt)
    ]
    
    response = llm.invoke(messages)
    
    processing_time = time.time() - start_time
    
    return {
        "input_text": text,
        "analysis": response.content,
        "include_reasoning": include_reasoning,
        "processing_time_seconds": round(processing_time, 2)
    }

# Test with different types of text
test_texts = [
    "I absolutely love this new product! It exceeded all my expectations.",
    "The service was terrible and I waited for hours.",
    "The weather today is quite ordinary, nothing special."
]

print("🎭 Sentiment Analysis Results:\n")

for i, text in enumerate(test_texts, 1):
    try:
        result = sentiment_analyzer(text, include_reasoning=(i % 2 == 1))
        
        print(f"📝 Example {i}:")
        print(f"Text: {result['input_text'][:60]}...")
        print(f"Analysis: {result['analysis']}")
        print(f"⏱️  Processing time: {result['processing_time_seconds']}s\n")

    except Exception as e:
        print(f"❌ Error processing example {i}: {e}\n")

print("✅ All sentiment analyses completed! Check your dashboard for traces with tags and metadata.")

## 🧪 Example 4: Creating Your First Dataset and Evaluation

Now let's create a simple dataset and run an evaluation to test our sentiment analyzer.

In [None]:
from langsmith import Client

# Initialize LangSmith client
client = Client()

# Create a dataset for sentiment analysis
dataset_name = "sentiment-analysis-quickstart"

# Sample data for our dataset
examples = [
    {
        "inputs": {"text": "This movie was absolutely fantastic! I loved every minute of it."},
        "outputs": {"expected_sentiment": "Positive", "confidence": 0.9}
    },
    {
        "inputs": {"text": "The food was terrible and the service was even worse."},
        "outputs": {"expected_sentiment": "Negative", "confidence": 0.9}
    },
    {
        "inputs": {"text": "The weather today is okay, not great but not bad either."},
        "outputs": {"expected_sentiment": "Neutral", "confidence": 0.7}
    },
    {
        "inputs": {"text": "I'm thrilled with this purchase! Exactly what I needed."},
        "outputs": {"expected_sentiment": "Positive", "confidence": 0.95}
    },
    {
        "inputs": {"text": "This is the worst experience I've ever had."},
        "outputs": {"expected_sentiment": "Negative", "confidence": 0.95}
    }
]

try:
    # Create the dataset
    dataset = client.create_dataset(
        dataset_name=dataset_name,
        description="A small dataset for testing sentiment analysis"
    )
    
    # Add examples to the dataset
    client.create_examples(
        inputs=[example["inputs"] for example in examples],
        outputs=[example["outputs"] for example in examples],
        dataset_id=dataset.id
    )
    
    print(f"✅ Dataset '{dataset_name}' created successfully with {len(examples)} examples!")
    print(f"📊 Dataset ID: {dataset.id}")

except Exception as e:
    # Default to None so the evaluation cell can safely check `if dataset:`
    dataset = None
    if "already exists" in str(e):
        print(f"ℹ️  Dataset '{dataset_name}' already exists. That's okay!")
        # Get the existing dataset
        datasets = list(client.list_datasets(dataset_name=dataset_name))
        if datasets:
            dataset = datasets[0]
            print(f"📊 Using existing dataset ID: {dataset.id}")
    else:
        print(f"❌ Error creating dataset: {e}")
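
Before uploading, it can help to validate the examples locally so a malformed record does not end up in the dataset. This sketch assumes the schema used above (an `inputs.text` string and an `outputs.expected_sentiment` label). `validate_examples`, `ALLOWED_LABELS`, and `candidate_examples` are hypothetical names for illustration, not LangSmith APIs.

```python
ALLOWED_LABELS = {"Positive", "Negative", "Neutral"}


def validate_examples(examples):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for index, example in enumerate(examples):
        text = example.get("inputs", {}).get("text")
        label = example.get("outputs", {}).get("expected_sentiment")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"example {index}: missing or empty inputs.text")
        if label not in ALLOWED_LABELS:
            problems.append(f"example {index}: unexpected label {label!r}")
    return problems


# Illustrative records: the second one is deliberately malformed
candidate_examples = [
    {"inputs": {"text": "Great value for the price."},
     "outputs": {"expected_sentiment": "Positive"}},
    {"inputs": {"text": ""},
     "outputs": {"expected_sentiment": "Mixed"}},
]

for problem in validate_examples(candidate_examples):
    print(problem)
```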

In [None]:
# Now let's run an evaluation
from langsmith.evaluation import evaluate

# Create a simple evaluator function
def sentiment_correctness_evaluator(run, example):
    """Evaluator that checks if the predicted sentiment matches expected sentiment"""
    
    # Extract the actual prediction from the run output (outputs is None if the target raised)
    prediction = (run.outputs or {}).get("analysis", "")
    expected = (example.outputs or {}).get("expected_sentiment", "")
    
    # Simple keyword matching (in a real scenario, you'd use more sophisticated parsing)
    prediction_lower = prediction.lower()
    expected_lower = expected.lower()
    
    # Check if the expected sentiment appears in the prediction
    is_correct = expected_lower in prediction_lower
    
    return {
        "key": "sentiment_accuracy",
        "score": 1.0 if is_correct else 0.0,
        "comment": f"Expected: {expected}, Found in prediction: {is_correct}"
    }

# Wrapper function for evaluation
def sentiment_analyzer_for_eval(inputs):
    """Wrapper function that matches the evaluation interface"""
    result = sentiment_analyzer(inputs["text"], include_reasoning=False)
    return result

if dataset:
    try:
        print("🧪 Running evaluation...")
        
        # Run the evaluation
        results = evaluate(
            sentiment_analyzer_for_eval,
            data=dataset_name,
            evaluators=[sentiment_correctness_evaluator],
            experiment_prefix="sentiment-quickstart",
            description="Quick start sentiment analysis evaluation"
        )
        
        print("✅ Evaluation completed!")
        print("📊 Check your LangSmith dashboard to see detailed results.")
        
    except Exception as e:
        print(f"❌ Error running evaluation: {e}")
else:
    print("⚠️  Skipping evaluation because dataset wasn't created successfully.")
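
The evaluator above uses substring matching, which can misfire: a response such as "not negative at all" still contains "negative". One way to tighten it is to pull out the label that follows the word "Sentiment" and compare that exactly. The sketch below assumes the model follows the `Sentiment: <label>` shape requested in the prompt. `parse_sentiment_label` and `exact_match_score` are hypothetical helpers, and the parser will still be fooled if the model echoes the option list before giving its answer.

```python
import re

# "sentiment", any punctuation or spaces, then one of the three labels
_LABEL_PATTERN = re.compile(
    r"sentiment\W*(positive|negative|neutral)\b", re.IGNORECASE
)


def parse_sentiment_label(analysis):
    """Return 'Positive', 'Negative', or 'Neutral', or None if no label is found."""
    match = _LABEL_PATTERN.search(analysis or "")
    return match.group(1).capitalize() if match else None


def exact_match_score(analysis, expected):
    """Score 1.0 only when the parsed label equals the expected label."""
    predicted = parse_sentiment_label(analysis)
    return 1.0 if predicted is not None and predicted == expected else 0.0


print(parse_sentiment_label("Sentiment: Negative, Confidence: 0.9"))
print(exact_match_score("Sentiment: Positive. It is not negative at all.", "Negative"))
```

To use this inside a LangSmith evaluator, you could call `exact_match_score` on the `analysis` output and return its result as the `score`.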

## 🎯 Understanding Your LangSmith Dashboard

Now that you've created traces and run evaluations, let's explore what you can see in your dashboard:

### 📊 Projects View
1. **Go to your LangSmith dashboard**: Visit [smith.langchain.com](https://smith.langchain.com)
2. **Select your project**: Click on the project you specified in your environment variables

### 🔍 Traces Tab
Here you'll see all the traces from the examples above:
- **Simple LLM calls** with input/output
- **Multi-step research assistant** showing the chain of LLM calls
- **Sentiment analysis** with metadata and tags

**What to look for:**
- ⏱️ **Latency**: How long each step took
- 💰 **Cost**: Token usage and estimated costs
- 🏷️ **Tags**: Filter by "sentiment-analysis" or "quickstart"
- 📋 **Metadata**: Version and environment information

### 🧪 Experiments Tab
You'll find your evaluation results here:
- **Overall accuracy** of your sentiment analyzer
- **Individual test results** for each example
- **Comparison views** to compare different runs

### 📊 Monitoring Tab
View aggregate metrics:
- **Request volume** over time
- **Average latency** trends
- **Error rates** (hopefully zero!)
- **Cost analysis** by model and time period

## 🔧 Framework-Agnostic Example

Let's create an example that doesn't use LangChain to demonstrate LangSmith's framework-agnostic capabilities.

In [None]:
# No HTTP or JSON imports are needed here because the API call below is simulated
from langsmith import traceable

@traceable(run_type="llm")
def call_huggingface_api(text, model="microsoft/DialoGPT-medium"):
    """Example using Hugging Face Inference API (free tier)"""
    
    # Note: This is a simplified example. In practice, you'd handle authentication properly.
    api_url = f"https://api-inference.huggingface.co/models/{model}"
    
    # For this demo, we'll simulate the API call
    # In a real scenario, you'd make an actual HTTP request
    
    # Simulated response
    simulated_response = f"This is a simulated response to: '{text[:50]}...'"
    
    return {
        "model": model,
        "input": text,
        "response": simulated_response,
        "tokens_used": len(text.split()) + len(simulated_response.split())
    }

@traceable(run_type="chain", tags=["custom-framework", "demo"])
def custom_chatbot(user_message):
    """A simple chatbot using custom framework integration"""
    
    # Step 1: Preprocess the message
    processed_message = user_message.strip().lower()
    
    # Step 2: Generate response using our custom LLM call
    llm_response = call_huggingface_api(processed_message)
    
    # Step 3: Post-process the response
    final_response = f"Bot: {llm_response['response']}"
    
    return {
        "user_input": user_message,
        "processed_input": processed_message,
        "raw_llm_response": llm_response,
        "final_response": final_response
    }

# Test the custom chatbot
test_messages = [
    "Hello, how are you today?",
    "What's the weather like?",
    "Can you help me with Python programming?"
]

print("🤖 Custom Framework Chatbot Demo:\n")

for message in test_messages:
    try:
        result = custom_chatbot(message)
        print(f"👤 User: {result['user_input']}")
        print(f"🤖 {result['final_response']}")
        print(f"📊 Tokens used: {result['raw_llm_response']['tokens_used']}\n")
    except Exception as e:
        print(f"❌ Error: {e}\n")

print("✅ Custom framework integration complete! Check your traces in LangSmith.")
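
The cell above simulates the model call. For a real request you would POST JSON to the URL with a bearer token. The sketch below only builds the request pieces, so it runs without network access. It assumes the endpoint shown above accepts a body of the form `{"inputs": ...}` and an `Authorization: Bearer` header. The `build_hf_request` helper and the placeholder token are hypothetical, so check the provider's current documentation before relying on this.

```python
import json


def build_hf_request(text, model="microsoft/DialoGPT-medium", token=None):
    """Assemble URL, headers, and JSON body for a text-generation request."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return {
        "url": f"https://api-inference.huggingface.co/models/{model}",
        "headers": headers,
        "body": json.dumps({"inputs": text}),
    }


# Placeholder token for illustration only
request = build_hf_request("Hello, how are you today?", token="hf_example_token")
print(request["url"])
# To send it (not executed here), something like:
#   requests.post(request["url"], headers=request["headers"], data=request["body"], timeout=30)
```

Keeping request construction separate from sending makes the traced function easy to test, and the `@traceable` decorator can still wrap whichever function performs the actual call.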

## 📈 Key Takeaways

Congratulations! You've successfully completed your first LangSmith project. Here's what you've accomplished:

### ✅ What You've Built
1. **Simple traced LLM calls** with automatic observability
2. **Multi-step applications** showing complex execution flows
3. **Enhanced tracing** with metadata and tags for organization
4. **Your first dataset** with real examples
5. **Automated evaluation** to test your application quality
6. **Framework-agnostic integration** showing flexibility

### 🔍 Key Concepts You've Learned
- **Tracing**: Calls to functions decorated with `@traceable` (and LangChain runs, when tracing is enabled) are logged with inputs, outputs, and metadata
- **Run Types**: Different types of operations (llm, chain, tool) for better organization
- **Metadata & Tags**: Powerful filtering and organization tools
- **Datasets**: Collections of examples for testing and evaluation
- **Evaluations**: Systematic testing of your application quality

### 💡 Best Practices You've Applied
- **Environment variables** for secure API key management
- **Descriptive naming** for functions and metadata
- **Error handling** for robust applications
- **Incremental complexity** from simple to advanced examples

## 🎉 What's Next?

You're now ready to dive deeper into LangSmith's advanced capabilities:

### 🔍 Deep Dive Learning Path:
- **LSM-003: Observability Deep Dive** - Master advanced tracing, debugging, and the new Agent Observability features
- **LSM-004: Evaluation Mastery** - Build comprehensive testing pipelines with custom evaluators
- **LSM-005: Prompt Engineering** - Leverage the Prompt Hub for collaborative prompt development
- **LSM-006: Production Monitoring** - Set up enterprise-grade monitoring with OpenTelemetry integration

### 🛠️ Immediate Next Steps:
1. **Explore your dashboard** - Spend time clicking through your traces and understanding the interface
2. **Experiment with tags** - Try filtering your traces by the tags you've created
3. **Create more examples** - Add your own examples to the dataset
4. **Invite team members** - If you're working with others, invite them to collaborate

### 📚 Additional Resources:
- [LangSmith Documentation](https://docs.langchain.com/langsmith)
- [LangSmith Cookbook](https://github.com/langchain-ai/langsmith-cookbook)
- [Community Examples](https://smith.langchain.com/public)

---

**Ready for advanced tracing and debugging?** Continue to **LSM-003: Observability Deep Dive** to master LangSmith's observability features! 🚀