# Progress Analytics Demo

This notebook demonstrates the new analytics functionality that tracks student progress and exports detailed metrics for instructors.

## ⚠️ Important Setup Instructions

If you encounter a `TypeError` about an unexpected keyword argument (such as `checkpoint_name`), you are running an older version of the package. To fix this:

1. **Option A - Quick Fix**: Run the reinstall script from terminal:
   ```bash
   cd /Users/michael.lynn/code/mongodb/developer-days/jupyter-utils/jupyter-lab-progress
   ./reinstall.sh
   ```

2. **Option B - Manual Fix**:
   ```bash
   pip uninstall jupyter-lab-progress
   cd /Users/michael.lynn/code/mongodb/developer-days/jupyter-utils/jupyter-lab-progress
   pip install -e .
   ```

3. **Then restart your Jupyter kernel**: Kernel → Restart

The first cell below helps diagnose whether the correct version is installed.
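
Before running that cell, it can be useful to see which copy of the package the kernel would import. The following is a minimal sketch using only the standard library. The `describe_install` helper name is hypothetical; the distribution name `jupyter-lab-progress` and the module name `jupyter_lab_progress` are taken from the commands and imports in this notebook.

```python
from importlib import metadata
from importlib.util import find_spec


def describe_install(dist_name, module_name):
    """Return the installed version and import location, or None for each if missing."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment

    spec = find_spec(module_name)  # locates a top-level module without importing it
    location = spec.origin if spec is not None else None
    return {"version": version, "location": location}


info = describe_install("jupyter-lab-progress", "jupyter_lab_progress")
print(info)
```

If `location` points somewhere other than the development checkout, the kernel is likely picking up a stale install, and one of the options above should help.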

In [1]:
# SOLUTION: Force load from development source
import sys
import os
import inspect

# Add the development source to the Python path (adjust dev_path to your local checkout)
dev_path = "/Users/michael.lynn/code/mongodb/developer-days/jupyter-utils/jupyter-lab-progress"
if dev_path not in sys.path:
    sys.path.insert(0, dev_path)

# Remove any cached modules
modules_to_remove = [key for key in sys.modules.keys() if key.startswith('jupyter_lab_progress')]
for module in modules_to_remove:
    del sys.modules[module]

# Now import from source
from jupyter_lab_progress import LabProgress, show_info, show_warning
import time
import json
import pandas as pd

# Verify we have the new methods
test_progress = LabProgress(['test'], 'test')
methods_check = {
    'mark_partial with checkpoint_name': 'checkpoint_name' in inspect.signature(LabProgress.mark_partial).parameters,
    'display_analytics_dashboard': hasattr(test_progress, 'display_analytics_dashboard'),
    'export_analytics_csv': hasattr(test_progress, 'export_analytics_csv'),
    'get_analytics_summary': hasattr(test_progress, 'get_analytics_summary')
}

print("✅ Module loaded from source!")
print("\nAvailable features:")
for feature, available in methods_check.items():
    status = "✅" if available else "❌"
    print(f"  {status} {feature}")

if all(methods_check.values()):
    show_info("All analytics features are now available! 🎉")
else:
    show_warning("Some features are still missing. Please check the source code.")

✅ Module loaded from source!

Available features:
  ✅ mark_partial with checkpoint_name
  ✅ display_analytics_dashboard
  ✅ export_analytics_csv
  ✅ get_analytics_summary


## Setting Up Lab with Analytics

Analytics are enabled automatically for every `LabProgress` instance.

In [2]:
# Create a lab - analytics tracking is automatic
progress = LabProgress(
    steps=[
        "Environment Setup",
        "Data Import",
        "Data Cleaning",
        "Feature Engineering",
        "Model Building",
        "Model Evaluation"
    ],
    lab_name="Data Science Pipeline Lab",
    persist=True  # Enable persistence to save analytics
)

# Show analytics info with error handling
try:
    show_info(f"Lab created with student ID: {progress._student_id}")
    show_info(f"Session ID: {progress._session_id[:20]}...")
except AttributeError as e:
    show_warning(f"Analytics attributes not fully initialized: {e}")
    show_info("Lab created successfully - analytics will be available after first interaction")

## Simulating Student Progress

Let's simulate a student working through the lab with several patterns: quick completion, gradual progress with checkpoints, and repeated attempts.

In [3]:
# Step 1: Quick completion
progress.mark_done("Environment Setup", score=100)
show_info("Environment setup completed quickly!")
time.sleep(1)  # Small delay to show time progression

In [4]:
# Step 2: Gradual progress with checkpoints
# mark_partial only accepts checkpoint_name in newer versions of the package,
# so inspect its signature before passing that argument.
# (inspect was already imported in the first cell.)
sig = inspect.signature(progress.mark_partial)
has_checkpoint_param = 'checkpoint_name' in sig.parameters

if has_checkpoint_param:
    # Use the new signature with checkpoints
    progress.mark_partial("Data Import", 0.3, "Started loading CSV files", checkpoint_name="csv_started")
    time.sleep(1)
    progress.mark_partial("Data Import", 0.7, "Loaded main dataset", checkpoint_name="main_loaded")
else:
    # Fallback to old signature without checkpoints
    show_warning("Using fallback mode without checkpoint names - please restart kernel for full functionality")
    progress.mark_partial("Data Import", 0.3, "Started loading CSV files")
    time.sleep(1)
    progress.mark_partial("Data Import", 0.7, "Loaded main dataset")

time.sleep(1)
progress.mark_done("Data Import", score=95)
show_info("Data import completed with checkpoints!")
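
The signature check in the cell above can be generalized. As a rough sketch (the `call_with_supported_kwargs` helper is hypothetical and not part of `jupyter_lab_progress`), keyword arguments can be dropped when the target callable does not declare them, so the same call works against both old and new versions. The two `mark_partial` stand-ins below are illustrative only.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, silently dropping keyword arguments it does not declare."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the callable accepts any keyword argument.
    accepts_any = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    if not accepts_any:
        kwargs = {name: value for name, value in kwargs.items() if name in params}
    return func(*args, **kwargs)


# Stand-ins for an old and a new mark_partial signature (illustrative only).
def old_mark_partial(step, progress, notes=""):
    return (step, progress, notes, None)


def new_mark_partial(step, progress, notes="", checkpoint_name=None):
    return (step, progress, notes, checkpoint_name)


print(call_with_supported_kwargs(old_mark_partial, "Data Import", 0.3, checkpoint_name="csv_started"))
print(call_with_supported_kwargs(new_mark_partial, "Data Import", 0.3, checkpoint_name="csv_started"))
```

Silently dropping an argument trades an explicit error for quiet degradation, so a warning such as the `show_warning` call in the cell above is still worth keeping when the fallback path is taken.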

In [5]:
# Step 3: Multiple attempts (student struggling)
progress.increment_attempts("Data Cleaning")
time.sleep(0.5)
progress.increment_attempts("Data Cleaning")
time.sleep(0.5)
progress.mark_partial("Data Cleaning", 0.4, "Fixed missing values")
time.sleep(1)
progress.increment_attempts("Data Cleaning")
progress.mark_done("Data Cleaning", score=85)
show_info("Data cleaning completed after multiple attempts")

In [6]:
# Step 4: Steady progress
progress.mark_partial("Feature Engineering", 0.5, "Created basic features")
time.sleep(1)
progress.mark_done("Feature Engineering", score=92)
show_info("Feature engineering completed!")

## Real-Time Analytics Dashboard

View current progress analytics in a visual dashboard.

In [8]:
# Display the analytics dashboard
# Check if the method exists, provide fallback if not

if hasattr(progress, 'display_analytics_dashboard'):
    progress.display_analytics_dashboard()
else:
    show_warning("Analytics dashboard not available in this version. Showing basic summary instead.")
    
    # Fallback: Show basic analytics manually
    if hasattr(progress, 'get_analytics_summary'):
        summary = progress.get_analytics_summary()
        show_info(f"Analytics Summary Available: {summary}")
    else:
        # Manual calculation of basic metrics
        completed = sum(1 for s in progress.steps.values() if s.get('completed', False))
        total = len(progress.steps)
        completion_rate = (completed / total * 100) if total > 0 else 0
        
        show_info(f"""
        📊 Progress Summary:
        - Steps Completed: {completed}/{total}
        - Completion Rate: {completion_rate:.1f}%
        - Lab: {progress.lab_name}
        """)
        
        # Show individual step status
        print("\nStep Status:")
        for step_name, step_info in progress.steps.items():
            status = "✅" if step_info.get('completed', False) else "⏳"
            score = step_info.get('score', 'N/A')
            print(f"{status} {step_name} - Score: {score}")
    
    print("\n💡 To get full analytics features, please reinstall the package:")
    print("   pip uninstall jupyter-lab-progress")
    print("   pip install -e /Users/michael.lynn/code/mongodb/developer-days/jupyter-utils/jupyter-lab-progress")

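| Metric | Value |
| --- | --- |
| Total Events | 8 |
| Session Duration | 41 seconds |
| Steps Completed | 4/6 |
| Completion Rate | 66.7% |
| Average Score | 93.0 |
| Total Attempts | 9 |

| Event Type | Count |
| --- | --- |
| step_completed | 3 |
| attempt | 3 |
| partial_progress | 2 |

| Step | Status | Attempts | Score | Time |
| --- | --- | --- | --- | --- |
| Environment Setup | ✅ | 0 | 100.0 | |
| Data Import | ✅ | 0 | 95.0 | -343s |
| Data Cleaning | ✅ | 9 | 85.0 | 1s |
| Feature Engineering | ✅ | 0 | 92.0 | 1s |
| Model Building | ⏳ | 0 | | |
| Model Evaluation | ⏳ | 0 | | |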


In [9]:
# Get raw analytics summary
if hasattr(progress, 'get_analytics_summary'):
    summary = progress.get_analytics_summary()
    print("Analytics Summary:")
    for key, value in summary.items():
        if key not in ['event_types', 'step_analytics']:
            print(f"  {key}: {value}")
else:
    show_warning("Analytics summary not available in this version.")
    print("Basic summary:")
    print(f"  Lab: {progress.lab_name}")
    print(f"  Total steps: {len(progress.steps)}")
    completed = sum(1 for s in progress.steps.values() if s.get('completed', False))
    print(f"  Completed: {completed}")
    print(f"  Completion rate: {(completed/len(progress.steps)*100):.1f}%")

Analytics Summary:
  total_events: 8
  session_duration: 41.291257
  steps_completed: 4
  total_steps: 6
  completion_rate: 66.66666666666666
  average_score: 93.0
  total_attempts: 9
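
The completion rate and average score above follow directly from the per-step results. The sketch below recomputes them from the scores assigned earlier in this notebook. It is a plain-Python illustration and does not reflect how `get_analytics_summary` is implemented internally.

```python
# Scores assigned with mark_done earlier in this notebook; None marks an unfinished step.
step_scores = {
    "Environment Setup": 100,
    "Data Import": 95,
    "Data Cleaning": 85,
    "Feature Engineering": 92,
    "Model Building": None,
    "Model Evaluation": None,
}

completed = [score for score in step_scores.values() if score is not None]
completion_rate = len(completed) / len(step_scores) * 100
average_score = sum(completed) / len(completed) if completed else None

print(f"completion_rate: {completion_rate:.1f}")  # 4 of 6 steps -> 66.7
print(f"average_score: {average_score:.1f}")      # (100 + 95 + 85 + 92) / 4 -> 93.0
```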


## Export Analytics Data

Export analytics in various formats for instructor review.

In [10]:
# Export to CSV for spreadsheet analysis
if hasattr(progress, 'export_analytics_csv'):
    csv_filename = progress.export_analytics_csv()
    show_info(f"Analytics exported to CSV: {csv_filename}")
    
    # Load and display the CSV data
    try:
        df = pd.read_csv(csv_filename)
        print("\nCSV Content Preview:")
        print(df.head())
        print(f"\nTotal events: {len(df)}")
    except Exception as e:
        print(f"Could not load CSV: {e}")
else:
    show_warning("CSV export not available in this version.")
    print("\n💡 Creating a basic progress report instead...")
    
    # Create a basic report
    report_data = []
    for step_name, step_info in progress.steps.items():
        report_data.append({
            'step': step_name,
            'completed': step_info.get('completed', False),
            'score': step_info.get('score', None),
            'attempts': step_info.get('attempts', 0)
        })
    
    df = pd.DataFrame(report_data)
    print("\nProgress Report:")
    print(df)


CSV Content Preview:
   checkpoint_name        event_type                   lab_name  \
0              NaN    step_completed  Data Science Pipeline Lab   
1              NaN           attempt  Data Science Pipeline Lab   
2              NaN           attempt  Data Science Pipeline Lab   
3              NaN  partial_progress  Data Science Pipeline Lab   
4              NaN           attempt  Data Science Pipeline Lab   

                  notes  progress  score                  session_id  \
0                   NaN       NaN  100.0  2025-07-11T10:13:51.805928   
1                   NaN       NaN    NaN  2025-07-11T10:13:51.805928   
2                   NaN       NaN    NaN  2025-07-11T10:13:51.805928   
3  Fixed missing values       0.4    NaN  2025-07-11T10:13:51.805928   
4                   NaN       NaN    NaN  2025-07-11T10:13:51.805928   

           step_name student_id  time_spent                   timestamp  
0  Environment Setup   2ab13cac         NaN  2025-07-11T10:13:58.064
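
Once exported, the event log can be summarized without the library. The following sketch uses only the standard library and a few made-up rows that mirror some of the columns in the preview above (`event_type`, `step_name`, `score`). The sample rows are illustrative, not real export data.

```python
import csv
import io
from collections import Counter

# Illustrative sample rows, not real export data.
sample_csv = """event_type,step_name,score
step_completed,Environment Setup,100.0
attempt,Data Cleaning,
attempt,Data Cleaning,
partial_progress,Data Cleaning,
step_completed,Data Cleaning,85.0
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Count events by type.
events_by_type = Counter(row["event_type"] for row in rows)

# Count attempts per step to see where a student retried the most.
attempts_by_step = Counter(
    row["step_name"] for row in rows if row["event_type"] == "attempt"
)

print(dict(events_by_type))
print(dict(attempts_by_step))
```

To run the same counts on a real export, replace `io.StringIO(sample_csv)` with `open(csv_filename, newline="")`.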

In [11]:
# Export to JSON for programmatic analysis
json_filename = progress.export_analytics_json()
show_info(f"Analytics exported to JSON: {json_filename}")

# Load and display sample JSON data
try:
    with open(json_filename, 'r') as f:
        json_data = json.load(f)
    
    print("\nJSON Structure:")
    print(f"Lab Name: {json_data['lab_name']}")
    print(f"Student ID: {json_data['student_id']}")
    print(f"Total Events: {len(json_data['events'])}")
    print(f"Session Duration: {json_data['summary']['session_duration']:.1f} seconds")
    
    print("\nSample Events:")
    for event in json_data['events'][:3]:
        print(f"  {event['timestamp']}: {event['event_type']} - {event.get('step_name', 'N/A')}")
    
except Exception as e:
    print(f"Could not load JSON: {e}")


JSON Structure:
Lab Name: Data Science Pipeline Lab
Student ID: 2ab13cac
Total Events: 8
Session Duration: 41.3 seconds

Sample Events:
  2025-07-11T10:13:58.064895: step_completed - Environment Setup
  2025-07-11T10:14:36.156333: attempt - Data Cleaning
  2025-07-11T10:14:36.659788: attempt - Data Cleaning
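
The ISO 8601 timestamps in the event log make it possible to derive the elapsed time between events. As a small sketch, assuming timestamps in the format shown above, `datetime.fromisoformat` parses them and subtraction yields a `timedelta`. The `elapsed_seconds` helper is hypothetical; it raises on negative gaps, which would indicate events recorded or compared out of order.

```python
from datetime import datetime


def elapsed_seconds(start_iso, end_iso):
    """Return seconds between two ISO 8601 timestamps, rejecting negative gaps."""
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    delta = (end - start).total_seconds()
    if delta < 0:
        raise ValueError(f"end {end_iso} is earlier than start {start_iso}")
    return delta


# Timestamps taken from the first two sample events above.
gap = elapsed_seconds("2025-07-11T10:13:58.064895", "2025-07-11T10:14:36.156333")
print(f"{gap:.3f} seconds")
```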


## Complete the Lab and View Final Analytics

In [12]:
# Complete remaining steps
progress.mark_done("Model Building", score=88)
time.sleep(1)
progress.mark_done("Model Evaluation", score=94)

show_info("Lab completed! 🎉")

In [13]:
# Final analytics dashboard
progress.display_analytics_dashboard()

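| Metric | Value |
| --- | --- |
| Total Events | 10 |
| Session Duration | 61 seconds |
| Steps Completed | 6/6 |
| Completion Rate | 100.0% |
| Average Score | 92.3 |
| Total Attempts | 9 |

| Event Type | Count |
| --- | --- |
| step_completed | 5 |
| attempt | 3 |
| partial_progress | 2 |

| Step | Status | Attempts | Score | Time |
| --- | --- | --- | --- | --- |
| Environment Setup | ✅ | 0 | 100 | |
| Data Import | ✅ | 0 | 95 | -343s |
| Data Cleaning | ✅ | 9 | 85 | 1s |
| Feature Engineering | ✅ | 0 | 92 | 1s |
| Model Building | ✅ | 0 | 88 | |
| Model Evaluation | ✅ | 0 | 94 | |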


## Instructor Analytics - Multiple Students

This section demonstrates how analytics could be compared across multiple student sessions.

In [14]:
# Simulate different student performance patterns
students_data = []

# Student 1: Fast learner
student1 = LabProgress(
    steps=["Setup", "Analysis", "Conclusion"],
    lab_name="Quick Lab"
)
student1.mark_done("Setup", score=100)
student1.mark_done("Analysis", score=95)
student1.mark_done("Conclusion", score=98)
students_data.append(("Fast Learner", student1.get_analytics_summary()))

# Student 2: Struggles with middle section
student2 = LabProgress(
    steps=["Setup", "Analysis", "Conclusion"],
    lab_name="Quick Lab"
)
student2.mark_done("Setup", score=90)
student2.increment_attempts("Analysis")
student2.increment_attempts("Analysis")
student2.increment_attempts("Analysis")
student2.mark_done("Analysis", score=70)
student2.mark_done("Conclusion", score=85)
students_data.append(("Struggling Student", student2.get_analytics_summary()))

# Student 3: Incomplete work
student3 = LabProgress(
    steps=["Setup", "Analysis", "Conclusion"],
    lab_name="Quick Lab"
)
student3.mark_done("Setup", score=85)
student3.mark_partial("Analysis", 0.6, "Partial completion")
students_data.append(("Incomplete Work", student3.get_analytics_summary()))

# Display comparison
print("Instructor View - Student Comparison:")
print("=" * 60)

for student_type, summary in students_data:
    print(f"\n{student_type}:")
    print(f"  Completion Rate: {summary['completion_rate']:.1f}%")
    # Compare against None so that a legitimate average of 0 is still printed
    average = summary['average_score']
    print(f"  Average Score: {average:.1f}" if average is not None else "  Average Score: N/A")
    print(f"  Total Attempts: {summary['total_attempts']}")
    print(f"  Total Events: {summary['total_events']}")

Instructor View - Student Comparison:
============================================================

Fast Learner:
  Completion Rate: 100.0%
  Average Score: 97.7
  Total Attempts: 0
  Total Events: 3

Struggling Student:
  Completion Rate: 100.0%
  Average Score: 81.7
  Total Attempts: 3
  Total Events: 6

Incomplete Work:
  Completion Rate: 33.3%
  Average Score: 85.0
  Total Attempts: 0
  Total Events: 2
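
An instructor could turn these summaries into a simple triage list. The sketch below flags sessions that are incomplete or show repeated attempts. The `flag_students` helper and both thresholds are illustrative assumptions, and the input dictionaries copy values printed above.

```python
def flag_students(summaries, min_completion=100.0, max_attempts=2):
    """Return (label, reasons) pairs for sessions that may need instructor attention."""
    flagged = []
    for label, summary in summaries:
        reasons = []
        if summary["completion_rate"] < min_completion:
            reasons.append("incomplete")
        if summary["total_attempts"] > max_attempts:
            reasons.append("many attempts")
        if reasons:
            flagged.append((label, reasons))
    return flagged


# Values copied from the comparison output above.
summaries = [
    ("Fast Learner", {"completion_rate": 100.0, "total_attempts": 0}),
    ("Struggling Student", {"completion_rate": 100.0, "total_attempts": 3}),
    ("Incomplete Work", {"completion_rate": 33.3, "total_attempts": 0}),
]

for label, reasons in flag_students(summaries):
    print(f"{label}: {', '.join(reasons)}")
```

The `students_data` list built in the cell above has the same shape, a list of `(label, summary)` pairs, so it could be passed to the helper directly.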


## Persistent Analytics

When persistence is enabled, analytics data is saved automatically and restored when the lab is resumed.

In [15]:
# Show that analytics persist across sessions
print(f"Original lab events: {len(progress._analytics_data)}")

# Save current state
original_student_id = progress._student_id

# Simulate resuming the session
resumed_progress = LabProgress.resume(lab_name="Data Science Pipeline Lab")

print(f"Resumed lab events: {len(resumed_progress._analytics_data)}")
print(f"Student ID preserved: {resumed_progress._student_id == original_student_id}")

# Show that new events are added to existing analytics
resumed_progress.increment_attempts("Model Building")
print(f"After new activity: {len(resumed_progress._analytics_data)} events")

Original lab events: 10


Resumed lab events: 10
Student ID preserved: True
After new activity: 11 events
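
The resume behavior can be pictured as a load, append, save cycle on a JSON file. The sketch below illustrates that general pattern only. The file layout, the `load_events` and `append_event` helpers, and the field names are assumptions, not the library's actual storage format.

```python
import json
import os
import tempfile


def load_events(path):
    """Return previously saved events, or an empty list if no file exists yet."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def append_event(path, event):
    """Append one event to the saved list and write the list back to disk."""
    events = load_events(path)
    events.append(event)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(events, f)
    return events


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "events.json")
    append_event(path, {"event_type": "step_completed", "step_name": "Setup"})
    # A later "session" sees the earlier event and adds to it.
    events = append_event(path, {"event_type": "attempt", "step_name": "Analysis"})
    print(len(events))
```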


## Clean Up

In [16]:
import os

# Clean up generated files
files_to_remove = [
    progress.persist_file,
    csv_filename if 'csv_filename' in locals() else None,
    json_filename if 'json_filename' in locals() else None
]

for filename in files_to_remove:
    if filename and os.path.exists(filename):
        os.remove(filename)
        print(f"Removed: {filename}")

show_info("Analytics demo completed! Files cleaned up.")

Removed: .data_science_pipeline_lab_progress.json
Removed: data_science_pipeline_lab_analytics_20250711_101455.csv
Removed: data_science_pipeline_lab_analytics_20250711_101456.json


## Summary

The Progress Analytics feature provides:

- **Automatic event tracking** for all student interactions
- **Time tracking** for individual steps and the overall session
- **CSV export** for spreadsheet analysis
- **JSON export** for programmatic analysis
- **Real-time dashboard** for immediate insights
- **Persistent storage** that survives session restarts
- **Student identification** for tracking across sessions
- **Comprehensive metrics** including completion rates, scores, and attempts

This enables instructors to:
- Monitor student progress in real-time
- Identify struggling students early
- Analyze common stumbling points
- Track engagement and time spent
- Generate reports for assessment