# Self-Supervised Neural Networks: An Interactive Lab

## 🎯 Learning Objectives
By the end of this lab, you will:
- Understand the core principles of self-supervised learning (SSL)
- Implement pretext tasks for vision and time-series data
- Evaluate representation quality through transfer learning
- Compare generative vs. discriminative SSL approaches
- Build intuition about when and why SSL works

## 📚 Prerequisites
- Basic understanding of neural networks and backpropagation
- Familiarity with NumPy and Python
- Linear algebra fundamentals (matrix multiplication, derivatives)

## 🔗 Recommended Reading
Before starting, consider reviewing:
- [A Survey on Self-supervised Learning](https://arxiv.org/abs/2301.05712)
- [Representation Learning: A Review and New Perspectives](https://arxiv.org/abs/1206.5538)
- [Self-supervised Visual Feature Learning with Deep Neural Networks](https://arxiv.org/abs/1712.05577)

## Setup: Install Required Package for Anthropic API

First, let's install the Anthropic Python SDK if you haven't already:

In [None]:
# Install Anthropic SDK (uncomment if needed)
# !pip install anthropic

## Module 1: Introduction to Self-Supervised Learning

Self-supervised learning (SSL) is a paradigm where models learn representations from unlabeled data by solving **pretext tasks**. The supervision signal comes from the structure of the data itself (for example, a hidden patch or a known rotation angle) rather than from human annotation.
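
As a rough illustration of labels that come from the data itself, the sketch below builds a rotation-prediction batch with NumPy. The helper name `make_rotation_batch` and the random stand-in images are assumptions made for this example, not part of the lab's code.

```python
import numpy as np

def make_rotation_batch(images: np.ndarray, rng: np.random.Generator):
    """Rotate each square image by a random multiple of 90 degrees.

    Returns the rotated images and the rotation index (0-3), which acts
    as a free classification label. Hypothetical helper for illustration.
    """
    labels = rng.integers(0, 4, size=len(images))  # 0, 90, 180 or 270 degrees
    rotated = np.stack(
        [np.rot90(img, k=int(k)) for img, k in zip(images, labels)]
    )
    return rotated, labels

rng = np.random.default_rng(0)
images = rng.random((5, 8, 8))  # stand-in for unlabeled 8x8 images
rotated, labels = make_rotation_batch(images, rng)
print(rotated.shape, labels.shape)
```

A classifier trained to predict `labels` from `rotated` never sees a human-provided label; the targets exist only because we applied the rotation ourselves.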

### Key Concepts

**Two Main Families of SSL:**
1. **Generative/Predictive Methods**: Reconstruct or predict part of the input
   - Examples: Autoencoders, masked language modeling (BERT), image inpainting
   - Learns by minimizing a reconstruction or prediction error

2. **Discriminative/Contrastive Methods**: Learn to distinguish between different views
   - Examples: SimCLR, MoCo, SwAV
   - Learns by pulling positive pairs together and pushing negatives apart (SwAV instead compares cluster assignments across views, without explicit negative pairs)
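
The two families differ mainly in their training objective. The sketch below contrasts a mean-squared reconstruction loss with a simplified, one-directional InfoNCE-style contrastive loss. It is an illustration under assumptions (the function names, the temperature of 0.5, and the random embeddings are all made up here) and is not the exact loss used by SimCLR or MoCo.

```python
import numpy as np

def reconstruction_loss(x: np.ndarray, x_hat: np.ndarray) -> float:
    """Generative objective: mean squared error between input and reconstruction."""
    return float(np.mean((x - x_hat) ** 2))

def info_nce_loss(z_a: np.ndarray, z_b: np.ndarray, temperature: float = 0.5) -> float:
    """Contrastive objective: row i of z_a and row i of z_b are a positive pair;
    every other row of z_b acts as a negative for row i of z_a."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature            # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))    # cross-entropy on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 16))                      # stand-in embeddings
aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
shuffled = info_nce_loss(z, z[::-1])              # positives deliberately mismatched
print(f"aligned={aligned:.3f}  shuffled={shuffled:.3f}")
```

The loss is low when each embedding is closest to its own positive and high when the pairing is scrambled, which is the "pull together, push apart" behavior described above.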

### 🤔 Critical Thinking Question 1
**Why might SSL be particularly valuable in domains like medical imaging or astronomy?**

*Think about data availability, labeling costs, and domain expertise requirements.*

In [None]:
# Setup and imports
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
import seaborn as sns
from typing import Tuple, List, Optional, Dict
import json
import os

# Set random seed for reproducibility
np.random.seed(42)

# Configure matplotlib
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

## Assessment Module: Open-Ended Questions with Anthropic Claude

This module provides automatic evaluation of open-ended questions using the Claude API.

### Setting up your API Key

1. Get your API key from [Anthropic Console](https://console.anthropic.com/)
2. Set it as an environment variable:
   ```bash
   export ANTHROPIC_API_KEY="your-api-key-here"
   ```
   Or set it directly in the notebook (see next cell)

In [None]:
import os
from typing import Dict, List, Optional
import json

# Option 1: Set API key directly (replace with your actual key; do not commit or share the notebook with the key in it)
# os.environ['ANTHROPIC_API_KEY'] = 'your-api-key-here'

# Option 2: Load from environment (if already set in ~/.bashrc)
api_key = os.getenv('ANTHROPIC_API_KEY')
if api_key:
    print("✅ Anthropic API key found in environment")
else:
    print("⚠️ No API key found. Set ANTHROPIC_API_KEY environment variable for automatic evaluation.")
    print("   You can still use manual evaluation with provided rubrics.")
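
If you would rather not type the key into a cell at all, one option is to prompt for it with the standard library's `getpass`, which does not echo the input. The helper `resolve_api_key` below is a hypothetical sketch, not part of the lab or the Anthropic SDK.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Optional

def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    prompt: Optional[Callable[[str], str]] = None,
) -> Optional[str]:
    """Return the key from the environment, else ask for it if a prompt is given.

    Hypothetical helper: with prompt=None the function never blocks, so it
    stays usable in non-interactive runs; pass `getpass` to enable prompting.
    """
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if key:
        return key
    if prompt is None:
        return None
    key = prompt("Anthropic API key (input hidden): ").strip()
    return key or None

# Interactive use in a notebook (uncomment to be prompted):
# os.environ["ANTHROPIC_API_KEY"] = resolve_api_key(prompt=getpass) or ""
```

Because the key only lives in the kernel's environment, it is not saved in the notebook file.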

In [None]:
class OpenEndedAssessment:
    """Handle open-ended questions with Claude AI verification."""
    
    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or os.getenv('ANTHROPIC_API_KEY')
        self.questions = self._load_questions()
        
        # Only import anthropic if we have an API key
        if self.api_key:
            try:
                import anthropic
                self.client = anthropic.Anthropic(api_key=self.api_key)
                print("✅ Anthropic client initialized successfully")
            except ImportError:
                print("⚠️ Please install anthropic: pip install anthropic")
                self.client = None
        else:
            self.client = None
    
    def _load_questions(self) -> List[Dict]:
        """Load assessment questions."""
        return [
            {
                "id": "q1",
                "question": "Explain why rotation prediction is an effective pretext task for learning visual features. What properties of the task make it useful?",
                "rubric": [
                    "Mentions that rotation is a geometric transformation",
                    "Notes that it requires understanding object structure",
                    "Explains that labels are free (self-supervised)",
                    "Discusses invariance/equivariance properties"
                ],
                "sample_answer": "Rotation prediction works because the network cannot tell how an image was rotated without recognizing the objects in it and their usual orientation, which forces it to learn spatial structure and object geometry. The learned features must stay sensitive to orientation, so they are closer to rotation-equivariant than rotation-invariant. Labels are automatically generated without human annotation."
            },
            {
                "id": "q2",
                "question": "Compare and contrast autoencoders with contrastive learning methods. When would you choose one over the other?",
                "rubric": [
                    "Identifies autoencoders as generative/reconstructive",
                    "Identifies contrastive as discriminative",
                    "Mentions computational efficiency differences",
                    "Discusses use cases for each"
                ],
                "sample_answer": "Autoencoders learn by reconstruction, capturing all input details including noise. Contrastive methods learn by comparing samples, focusing on discriminative features. Autoencoders are simpler but may learn trivial solutions. Contrastive methods are more robust but require careful augmentation design."
            },
            {
                "id": "q3",
                "question": "Design a novel pretext task for learning representations from text data. Explain your reasoning.",
                "rubric": [
                    "Proposes a specific, implementable task",
                    "Explains how labels are generated automatically",
                    "Justifies why the task would learn useful features",
                    "Considers computational feasibility"
                ],
                "sample_answer": "One novel task could be 'sentence ordering': given shuffled sentences from a paragraph, predict the correct order. This requires understanding discourse structure, temporal relationships, and causal dependencies. Labels come from the original ordering, making it fully self-supervised."
            }
        ]
    
    def evaluate_answer(self, question_id: str, user_answer: str) -> Dict:
        """Evaluate user answer using Claude API."""
        question_data = next((q for q in self.questions if q['id'] == question_id), None)
        if not question_data:
            return {"error": "Question not found"}
        
        if not self.client:
            return self._manual_evaluation(question_data, user_answer)
        
        # Prepare evaluation prompt
        evaluation_prompt = self._create_evaluation_prompt(question_data, user_answer)
        
        # Call Claude API
        try:
            response = self._call_claude_api(evaluation_prompt)
            return self._parse_evaluation(response)
        except Exception as e:
            return {"error": f"API call failed: {str(e)}"}
    
    def _create_evaluation_prompt(self, question_data: Dict, user_answer: str) -> str:
        """Create prompt for Claude evaluation."""
        prompt = f"""You are evaluating a student's answer to a self-supervised learning question.
        
Question: {question_data['question']}

Evaluation Rubric (each item worth 25 points):
{chr(10).join(f'- {item}' for item in question_data['rubric'])}

Reference Answer: {question_data['sample_answer']}

Student's Answer: {user_answer}

Please evaluate the answer and provide a JSON response with the following structure:
{{
    "score": <number between 0-100>,
    "rubric_met": [<list of rubric points that were addressed>],
    "strengths": "<what the student did well>",
    "improvements": "<what could be improved>",
    "feedback": "<constructive feedback for the student>"
}}

Be encouraging but honest. Focus on understanding rather than perfect wording.
Return ONLY the JSON object, no additional text."""
        return prompt
    
    def _call_claude_api(self, prompt: str) -> str:
        """Call Claude API for evaluation."""
        if not self.client:
            raise ValueError("Claude client not initialized")
        
        # Use Claude Haiku for cost-effective evaluation
        message = self.client.messages.create(
            model="claude-3-haiku-20240307",  # Fast and affordable
            max_tokens=500,
            temperature=0.3,  # Low temperature for consistent evaluation
            messages=[
                {"role": "user", "content": prompt}
            ]
        )
        
        return message.content[0].text
    
    def _parse_evaluation(self, response: str) -> Dict:
        """Parse Claude API response."""
        try:
            # Claude might include some text before/after JSON, so we try to extract it
            import re
            json_match = re.search(r'\{.*\}', response, re.DOTALL)
            if json_match:
                return json.loads(json_match.group())
            else:
                return json.loads(response)
        except json.JSONDecodeError:
            return {"error": "Failed to parse API response", "raw_response": response}
    
    def _manual_evaluation(self, question_data: Dict, user_answer: str) -> Dict:
        """Provide manual evaluation guidance when API is not available."""
        return {
            "message": "Automatic evaluation is unavailable (no API key or SDK not installed). Please self-evaluate using the rubric below.",
            "rubric": question_data['rubric'],
            "sample_answer": question_data['sample_answer'],
            "self_evaluation_guide": [
                "Compare your answer to the sample",
                "Check each rubric point (25 points each)",
                "Give yourself partial credit where appropriate",
                "Focus on understanding, not exact wording"
            ],
            "your_answer": user_answer
        }

# Initialize assessment system
assessment = OpenEndedAssessment()

# Display questions
print("\n📝 Open-Ended Assessment Questions\n")
print("="*50)
for i, q in enumerate(assessment.questions, 1):
    print(f"\nQuestion {i} (ID: {q['id']})")
    print("-"*40)
    print(f"{q['question']}\n")
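
The `_parse_evaluation` method above extracts JSON with a greedy regular expression, which spans from the first `{` to the last `}` in the reply. If the model adds text containing braces after the JSON, that span is no longer valid JSON. One possible alternative, sketched here with the standard library's `json.JSONDecoder.raw_decode`, is to try decoding at each `{` and keep the first object that parses. The function name `extract_first_json_object` is an assumption for this example.

```python
import json
from typing import Any, Dict, Optional

def extract_first_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the first decodable JSON object found in `text`, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)  # try the next opening brace
    return None

reply = 'Here is my evaluation: {"score": 75, "rubric_met": ["a"]} (scores use {braces})'
print(extract_first_json_object(reply))
```

This is a sketch rather than a drop-in replacement: it returns `None` instead of the error dictionary the class uses, so the caller would still need to build that error result.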

### How to Submit and Evaluate Your Answers

Use the function below to submit an answer. If an API key is configured, Claude evaluates the answer automatically; otherwise the function prints the rubric and sample answer so you can self-evaluate.

In [None]:
def submit_answer(question_id: str, answer: str):
    """Submit and evaluate an answer."""
    print(f"\n📊 Evaluating answer for question {question_id}...\n")
    print("="*50)
    
    result = assessment.evaluate_answer(question_id, answer)
    
    if 'error' in result:
        print(f"❌ Error: {result['error']}")
        if 'raw_response' in result:
            print(f"\nRaw response: {result['raw_response']}")
    elif 'message' in result:
        # Manual evaluation mode
        print(f"ℹ️ {result['message']}\n")
        print("📋 Rubric (25 points each):")
        for i, item in enumerate(result['rubric'], 1):
            print(f"  {i}. {item}")
        print(f"\n📖 Sample Answer:\n{result['sample_answer']}")
        print(f"\n✍️ Your Answer:\n{result['your_answer']}")
        print("\n💡 Self-Evaluation Guide:")
        for tip in result['self_evaluation_guide']:
            print(f"  • {tip}")
    else:
        # Automated evaluation results
        # Use .get so a reply without a "score" field does not raise KeyError
        print(f"🎯 Score: {result.get('score', 'N/A')}/100\n")
        
        if 'rubric_met' in result and result['rubric_met']:
            print("✅ Rubric Points Addressed:")
            for point in result['rubric_met']:
                print(f"  • {point}")
        
        if 'strengths' in result:
            print(f"\n💪 Strengths:\n{result['strengths']}")
        
        if 'improvements' in result:
            print(f"\n📈 Areas for Improvement:\n{result['improvements']}")
        
        if 'feedback' in result:
            print(f"\n💬 Feedback:\n{result['feedback']}")
    
    print("\n" + "="*50)
    return result

### Example: Submit Your Answer

Try answering one of the questions and submitting it for evaluation:

In [None]:
# Example answer submission (replace with your actual answer)
my_answer_q1 = """
Rotation prediction is effective because it requires the network to understand 
the geometric structure of objects. When an image is rotated, the spatial 
relationships between pixels change in predictable ways. The network must learn 
features that capture these relationships to correctly predict the rotation angle. 
This is self-supervised because we can automatically generate labels by rotating 
images ourselves, requiring no human annotation.
"""

# Uncomment to submit your answer
# result = submit_answer("q1", my_answer_q1)

### Try More Questions

Answer the other questions and submit them for evaluation:

In [None]:
# Question 2: Autoencoders vs Contrastive Learning
my_answer_q2 = """
Your answer here...
"""

# Uncomment to submit
# result = submit_answer("q2", my_answer_q2)

In [None]:
# Question 3: Novel Pretext Task
my_answer_q3 = """
Your creative pretext task design here...
"""

# Uncomment to submit
# result = submit_answer("q3", my_answer_q3)

## Configuration for Automated Assessment

This configuration can be used by external systems or for tracking progress:

In [None]:
# Create assessment configuration for Anthropic
assessment_config = {
    "lab_title": "Self-Supervised Neural Networks Lab",
    "version": "2.1",
    "api_configuration": {
        "provider": "anthropic",
        "model": "claude-3-haiku-20240307",
        "temperature": 0.3,
        "max_tokens": 500,
        "api_key_env_var": "ANTHROPIC_API_KEY"
    },
    "questions": assessment.questions,
    "grading_scheme": {
        "open_ended": {
            "q1": {"points": 100, "rubric_items": 4},
            "q2": {"points": 100, "rubric_items": 4},
            "q3": {"points": 100, "rubric_items": 4}
        },
        "total_points": 300
    },
    "cost_estimate": {
        "model": "claude-3-haiku",
        "input_tokens_per_eval": "~500",
        "output_tokens_per_eval": "~200",
        "cost_per_eval": "~$0.0004",
        "note": "Haiku is 10x cheaper than Sonnet, perfect for educational use"
    }
}

# Save configuration
with open('assessment_config_anthropic.json', 'w') as f:
    json.dump(assessment_config, f, indent=2)

print("✅ Assessment configuration saved to assessment_config_anthropic.json")
# cost_per_eval already includes the "~$" prefix, so do not add it again
print(f"\n💰 Cost estimate: {assessment_config['cost_estimate']['cost_per_eval']} per evaluation")
print(f"   Using {assessment_config['cost_estimate']['model']} for cost-effective assessment")
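
The per-evaluation cost in the configuration can be checked with a short calculation. The per-million-token prices below are assumptions based on the list prices published for Claude 3 Haiku and Claude 3 Sonnet; verify them against current pricing before relying on the numbers.

```python
# Assumed list prices in USD per million tokens (check current pricing).
HAIKU_INPUT, HAIKU_OUTPUT = 0.25, 1.25
SONNET_INPUT, SONNET_OUTPUT = 3.00, 15.00

def cost_per_eval(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Cost in USD of one evaluation at per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Token counts taken from the "cost_estimate" entry in the configuration.
haiku = cost_per_eval(500, 200, HAIKU_INPUT, HAIKU_OUTPUT)
sonnet = cost_per_eval(500, 200, SONNET_INPUT, SONNET_OUTPUT)
print(f"Haiku:  ${haiku:.6f} per evaluation")
print(f"Sonnet: ${sonnet:.6f} per evaluation ({sonnet / haiku:.0f}x Haiku)")
```

Under these assumed prices, the Haiku figure rounds to the ~$0.0004 quoted in the configuration, and the ratio to Sonnet is about an order of magnitude.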

## Testing the System

Let's test if everything is working correctly:

In [None]:
def test_assessment_system():
    """Test the assessment system with a sample answer."""
    test_answer = "Rotation prediction works because it teaches the network about spatial relationships."
    
    print("🧪 Testing Assessment System\n")
    print("="*50)
    
    if assessment.client:
        print("✅ Claude API client is configured")
        print("   Submitting a test answer...\n")
    else:
        print("⚠️ No API key found - will use manual evaluation mode")
        print("   To enable automatic evaluation:")
        print("   1. Get API key from https://console.anthropic.com/")
        print("   2. Set: export ANTHROPIC_API_KEY='your-key'")
        print("   3. Restart Jupyter from that same shell so the kernel inherits the variable\n")
    
    # Test with a simple answer
    result = submit_answer("q1", test_answer)
    
    if 'score' in result:
        print("\n✅ Automatic evaluation is working!")
    elif 'message' in result:
        print("\nℹ️ Manual evaluation mode is active")
    else:
        print("\n❌ Something went wrong")
    
    return result

# Run the test
test_result = test_assessment_system()
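
To exercise the automated path without an API key or network access, one option is a small stand-in client that returns a canned reply in the same shape the lab reads (`message.content[0].text`). `StubClient` and `StubMessages` are hypothetical names for this sketch and are not part of the Anthropic SDK.

```python
import json
from types import SimpleNamespace

class StubMessages:
    """Mimics the single call the lab makes: messages.create(...)."""

    def __init__(self, canned_reply: dict):
        self._text = json.dumps(canned_reply)

    def create(self, **kwargs):
        # Return an object shaped like the real response: .content[0].text
        return SimpleNamespace(content=[SimpleNamespace(text=self._text)])

class StubClient:
    """Hypothetical offline stand-in for the real client (tests only)."""

    def __init__(self, canned_reply: dict):
        self.messages = StubMessages(canned_reply)

stub = StubClient({"score": 50, "rubric_met": [], "feedback": "stub reply"})
message = stub.messages.create(
    model="any-model",
    max_tokens=500,
    messages=[{"role": "user", "content": "prompt"}],
)
print(json.loads(message.content[0].text)["score"])

# In the notebook, after `assessment` is defined:
# assessment.client = stub
# submit_answer("q1", "test answer")
```

Swapping the stub in checks the parsing and printing logic only; it says nothing about how the real model would grade an answer.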

## Summary

### What You've Learned
- How to set up automatic evaluation using the Claude API
- How to submit answers for assessment
- How to use manual evaluation when the API is not available

### Next Steps
1. Complete the main lab exercises in `snn_lab_interactive.ipynb`
2. Answer the open-ended questions thoughtfully
3. Submit your answers using this notebook for evaluation
4. Review the feedback and improve your understanding

### Cost-Effective Learning
- We use Claude 3 Haiku for evaluation (roughly 10x cheaper than Sonnet)
- Each evaluation costs approximately $0.0004
- Perfect for educational environments

Good luck with your self-supervised learning journey! 🚀