# Materials Science 465: Computational Electron Microscopy
## Assignment 1: Environment Setup and GitHub Integration

**Due: Friday, January 9, 2026 at 11:59 PM**

### Objectives
This assignment establishes the foundational computational environment for the entire course. You will:

1. Set up a complete Python environment optimized for electron microscopy analysis
2. Configure version control workflows using Git and GitHub
3. Demonstrate proficiency with Jupyter Lab and reproducible research practices
4. Load and visualize sample electron microscopy data
5. Create your first documented workflow for computational EM analysis

### Evaluation Criteria
- **Environment Setup (25%)**: Successful installation and configuration of all required packages
- **GitHub Integration (25%)**: Proper repository setup with meaningful commit history
- **Code Quality (25%)**: Clean, well-documented code following Python best practices
- **Data Analysis (25%)**: Successful EM data loading, processing, and visualization

### Submission Requirements
1. Complete this Jupyter notebook with all cells executed
2. Push to your personal GitHub repository with at least 5 meaningful commits
3. Include a README.md file describing your setup process
4. Submit the GitHub repository URL via Canvas

**Note**: This notebook serves as both an assignment and a template for future computational EM workflows. Take time to understand each step thoroughly.

## Part 1: Environment Verification and Setup

Before proceeding with any computational electron microscopy work, we must verify that our Python environment is properly configured with all necessary packages. This section will test the installation of core scientific computing libraries and EM-specific tools.
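
As a complement to the import-based check in the cell below, installed versions can also be read from package metadata without importing the package. The following is a minimal sketch using the standard-library `importlib.metadata` module. The helper name `installed_version` is hypothetical, and it expects the distribution name (for example `scikit-learn`), not the import name (`sklearn`).

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as they would appear in `pip list` or `conda list`
for name in ["numpy", "scikit-learn", "h5py"]:
    version = installed_version(name)
    status = version if version is not None else "NOT INSTALLED"
    print(f"{name}: {status}")
```

Reading metadata is fast because nothing is imported, but it only shows that a package is installed. An actual import, as in the cell below, also catches broken installations.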

In [None]:
"""
Environment Verification Script for MS 465: Computational Electron Microscopy
This cell verifies that all required packages are installed and accessible.
"""

import sys
import platform
from datetime import datetime

print("="*80)
print(f"MS 465 Environment Verification Report")
print(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*80)

# System Information
print(f"\nSystem Information:")
print(f"Python Version: {sys.version}")
print(f"Platform: {platform.platform()}")
print(f"Architecture: {platform.architecture()[0]}")

# Test core scientific computing packages
# Maps the distribution (pip/conda) name to the name used in `import`
import importlib

packages_core = {
    'numpy': 'numpy',
    'scipy': 'scipy',
    'matplotlib': 'matplotlib',
    'pandas': 'pandas',
    'scikit-learn': 'sklearn',
    'scikit-image': 'skimage',
    'h5py': 'h5py',
    'zarr': 'zarr'
}

print("\n1. Core Scientific Computing Packages:")
print("-" * 50)

for package, import_name in packages_core.items():
    try:
        module = importlib.import_module(import_name)
        version = getattr(module, '__version__', 'Unknown')
        print(f"✓ {package}: {version}")
    except ImportError as e:
        print(f"✗ {package}: NOT INSTALLED ({e})")

# Test Jupyter and development tools
jupyter_packages = {
    'jupyterlab': 'jupyterlab',
    'ipywidgets': 'ipywidgets',
    'tqdm': 'tqdm'
}

print(f"\n2. Jupyter and Development Tools:")
print("-" * 50)

for package, module_name in jupyter_packages.items():
    try:
        module = __import__(module_name)
        version = getattr(module, '__version__', 'Unknown')
        print(f"✓ {package}: {version}")
    except ImportError as e:
        print(f"✗ {package}: NOT INSTALLED ({e})")

print(f"\n3. Machine Learning and Deep Learning:")
print("-" * 50)

# Test ML/DL packages
ml_packages = ['torch', 'tensorflow']

for package in ml_packages:
    try:
        if package == 'torch':
            import torch
            print(f"✓ PyTorch: {torch.__version__}")
            if torch.cuda.is_available():
                print(f"  - CUDA available: {torch.cuda.get_device_name()}")
            else:
                print(f"  - CUDA available: No")
        elif package == 'tensorflow':
            import tensorflow as tf
            print(f"✓ TensorFlow: {tf.__version__}")
    except ImportError as e:
        print(f"✗ {package}: NOT INSTALLED ({e})")

print(f"\n4. Environment Setup Status:")
print("-" * 50)

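# Check if we're in the correct conda environment.
# Every conda environment contains a 'conda-meta' directory, and
# os.path.basename handles both '/' and '\\' path separators, so this
# also works on Windows.
import os
is_conda_env = os.path.isdir(os.path.join(sys.prefix, 'conda-meta'))
conda_env = os.path.basename(sys.prefix) if is_conda_env else 'Not using conda'
print(f"Active Environment: {conda_env}")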

if conda_env == 'ms465-2026':
    print("✓ Correct conda environment activated")
else:
    print("⚠ Warning: Expected 'ms465-2026' environment")

print("\n" + "="*80)

### Testing EM-Specific Packages

Now we'll verify the installation of specialized packages for electron microscopy data analysis. These packages are essential for the course and may require additional setup steps.
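
Some EM packages are slow to import, so it can be useful to ask only whether a module can be found. The sketch below uses the standard-library `importlib.util.find_spec` for that purpose. The helper name `is_importable` is hypothetical, and the module names are the import names used in the verification cell that follows.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Import names taken from the verification cell in this section
for name in ["py4DSTEM", "hyperspy", "ase", "ncempy"]:
    status = "found" if is_importable(name) else "missing"
    print(f"{name}: {status}")
```

A module that is found can still fail at import time (for example because of a missing compiled dependency), so treat this as a quick pre-check, not a replacement for the full test.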

In [None]:
"""
Test EM-specific packages that are critical for the course
"""

print("5. Electron Microscopy Specific Packages:")
print("-" * 50)

# Core EM analysis packages
em_packages = {
    'py4DSTEM': 'py4DSTEM',
    'hyperspy': 'hyperspy', 
    'ase': 'ase',
    'ncempy': 'ncempy'
}

em_status = {}

for package_name, import_name in em_packages.items():
    try:
        module = __import__(import_name)
        version = getattr(module, '__version__', 'Unknown')
        print(f"✓ {package_name}: {version}")
        em_status[package_name] = True
    except ImportError as e:
        print(f"✗ {package_name}: NOT INSTALLED")
        print(f"  Installation command: pip install {package_name.lower()}")
        em_status[package_name] = False

# Advanced EM packages (may not be required for Week 1)
print(f"\n6. Advanced EM Packages (Optional for Week 1):")
print("-" * 50)

advanced_em = {
    'atomai': 'atomai',
    'libertem': 'libertem', 
    'pyxem': 'pyxem',
    'abtem': 'abtem'
}

for package_name, import_name in advanced_em.items():
    try:
        module = __import__(import_name)
        version = getattr(module, '__version__', 'Unknown')
        print(f"✓ {package_name}: {version}")
    except ImportError:
        print(f"○ {package_name}: Not installed (will be needed in later weeks)")

# Summary
print(f"\n7. Installation Summary:")
print("-" * 50)

required_count = len([p for p in em_status.values() if p])
total_required = len(em_status)

if required_count == total_required:
    print(f"✓ All required EM packages installed ({required_count}/{total_required})")
    print("  Ready to proceed with electron microscopy data analysis!")
else:
    print(f"⚠ Missing packages: {total_required - required_count}/{total_required}")
    print("  Install missing packages before proceeding with EM analysis")

print("\n" + "="*80)

## Part 2: GitHub Repository Setup and Version Control

Version control is essential for reproducible computational research. In this section, you'll set up your course repository and demonstrate proper Git workflows.
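
The helper in the next cell splits the command string on whitespace, which is fine for the read-only commands used here but would break a quoted argument such as a commit message. The sketch below shows one possible variant that uses `shlex.split` and returns an explicit success flag. The name `run_git` and the `(ok, output)` return convention are assumptions made for illustration, not part of the assignment code.

```python
import shlex
import subprocess


def run_git(command):
    """Run a git command given as one string; return (ok, output_text)."""
    args = shlex.split(command)  # keeps quoted arguments together, unlike str.split()
    try:
        result = subprocess.run(args, capture_output=True, text=True, check=True)
        return True, result.stdout.strip()
    except subprocess.CalledProcessError as e:
        return False, e.stderr.strip()
    except FileNotFoundError:
        return False, "Git not found. Please install Git."


# The quoted commit message stays a single argument
print(shlex.split('git commit -m "Add environment verification cell"'))

ok, output = run_git("git --version")
print(ok, output)
```

Returning a boolean alongside the text avoids searching the output for the substring `"Error:"`, which could also appear in legitimate git output.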

In [None]:
"""
Git Configuration and Repository Status Check
"""

import subprocess
import os
import getpass

print("="*80)
print("Git Configuration and Repository Status")
print("="*80)

def run_git_command(command):
    """Run a git command and return the output"""
    try:
        result = subprocess.run(command.split(), 
                              capture_output=True, 
                              text=True, 
                              check=True)
        return result.stdout.strip()
    except subprocess.CalledProcessError as e:
        return f"Error: {e.stderr.strip()}"
    except FileNotFoundError:
        return "Error: Git not found. Please install Git."

# Check Git installation and version
git_version = run_git_command("git --version")
print(f"Git Version: {git_version}")

# Check if we're in a Git repository
git_status = run_git_command("git status --porcelain")
is_git_repo = "Error:" not in git_status

if is_git_repo:
    print("✓ Currently in a Git repository")
    
    # Get repository information
    repo_url = run_git_command("git config --get remote.origin.url")
    current_branch = run_git_command("git branch --show-current")
    
    print(f"Repository URL: {repo_url}")
    print(f"Current Branch: {current_branch}")
    
    # Check for uncommitted changes
    if git_status:
        print(f"⚠ Uncommitted changes detected:")
        for line in git_status.split('\n'):
            print(f"  {line}")
    else:
        print("✓ Working directory clean")
        
    # Show recent commits
    print(f"\nRecent Commits:")
    recent_commits = run_git_command("git log --oneline -5")
    for line in recent_commits.split('\n'):
        if line:
            print(f"  {line}")
            
else:
    print("⚠ Not in a Git repository")
    print("You need to initialize a Git repository for this assignment")

# Check Git configuration
print(f"\nGit Configuration:")
git_user = run_git_command("git config --get user.name")
git_email = run_git_command("git config --get user.email")

print(f"User Name: {git_user}")
print(f"User Email: {git_email}")

if "Error:" in git_user or "Error:" in git_email:
    print("⚠ Git user information not configured")
    print("Run the following commands to configure Git:")
    print('  git config --global user.name "Your Name"')
    print('  git config --global user.email "your.email@example.com"')

print("\n" + "="*80)

### Action Items for Git Setup

**If you haven't already, complete these steps:**

1. **Create a GitHub account** at https://github.com if you don't have one
2. **Create a new repository** named `ms465-computational-em-2026`
3. **Clone the repository** to your local machine
4. **Configure Git** with your name and email (see output above)
5. **Create the initial commit** with this notebook

**Repository Structure to Create:**
```
ms465-computational-em-2026/
├── README.md
├── week_01/
│   ├── assignment_01_environment_setup.ipynb (this file)
│   └── data/
├── week_02/
├── ...
├── final_project/
└── .gitignore
```
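
The layout above can also be created from Python. This is a minimal sketch using `pathlib`; the function name `create_course_skeleton` and the placeholder contents of `README.md` and `.gitignore` are assumptions for illustration. The demo runs in a temporary directory so that nothing is written to your working folder.

```python
import tempfile
from pathlib import Path


def create_course_skeleton(root):
    """Create the course folder layout under `root`; existing files are left alone."""
    root = Path(root)
    for folder in ["week_01/data", "week_02", "final_project"]:
        (root / folder).mkdir(parents=True, exist_ok=True)

    # Placeholder contents (assumptions, replace with your own text)
    defaults = {
        "README.md": "# MS 465: Computational Electron Microscopy\n",
        ".gitignore": "__pycache__/\n.ipynb_checkpoints/\n",
    }
    for name, text in defaults.items():
        path = root / name
        if not path.exists():
            path.write_text(text, encoding="utf-8")
    return root


# Demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    repo = create_course_skeleton(Path(tmp) / "ms465-computational-em-2026")
    for p in sorted(repo.rglob("*")):
        print(p.relative_to(repo))
```

Git does not track empty directories, so `week_02/` and `final_project/` will not appear on GitHub until each contains at least one file.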

## Part 3: Data Management and HDF5 Handling

Electron microscopy generates large datasets that require efficient storage and management. We'll demonstrate best practices using HDF5 format and proper metadata handling.
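
To make "large" concrete, the uncompressed size of a 4D-STEM datacube is the product of the scan size, the detector size, and the bytes per pixel. The short sketch below works this out for the dataset simulated in this section (64x64 scan, 128x128 detector, `uint16`) and for a hypothetical larger acquisition. The function name `datacube_nbytes` and the larger shape are assumptions for illustration.

```python
from math import prod


def datacube_nbytes(scan_shape, detector_shape, bytes_per_pixel):
    """Uncompressed size in bytes of a (scan_x, scan_y, det_x, det_y) datacube."""
    return prod(scan_shape) * prod(detector_shape) * bytes_per_pixel


# Dataset used in this notebook: uint16 means 2 bytes per pixel
small = datacube_nbytes((64, 64), (128, 128), 2)
print(f"64x64 scan, 128x128 detector: {small / 1024**2:.0f} MiB")

# Hypothetical larger acquisition, for scale
large = datacube_nbytes((256, 256), (256, 256), 2)
print(f"256x256 scan, 256x256 detector: {large / 1024**3:.0f} GiB")
```

The first case is 128 MiB and fits comfortably in memory; the second is 8 GiB, which is where chunked storage and partial reads start to matter.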

In [None]:
"""
Demonstrate HDF5 data management for electron microscopy datasets
"""

import numpy as np
import h5py
import matplotlib.pyplot as plt
from datetime import datetime
import os

print("="*80)
print("HDF5 Data Management Demonstration")
print("="*80)

# Create sample EM-like data
print("Creating sample electron microscopy dataset...")

# Simulate a 4D-STEM dataset: (scan_x, scan_y, detector_x, detector_y)
scan_size = (64, 64)  # 64x64 scan positions
detector_size = (128, 128)  # 128x128 detector pixels

# Create synthetic 4D data
np.random.seed(42)  # For reproducibility
sample_4d_data = np.random.poisson(10, size=(*scan_size, *detector_size)).astype(np.uint16)

# Add some realistic structure (central beam and diffraction spots)
# The masks do not depend on the scan position, so build them once.
center_x, center_y = detector_size[1]//2, detector_size[0]//2
y, x = np.ogrid[:detector_size[0], :detector_size[1]]
mask = (x - center_x)**2 + (y - center_y)**2 < 100
spot_x, spot_y = center_x + 30, center_y + 20
spot_mask = (x - spot_x)**2 + (y - spot_y)**2 < 25

for i in range(scan_size[0]):
    for j in range(scan_size[1]):
        # Central beam. The Poisson draw is cast to uint16 because an
        # in-place add of an int64 array to a uint16 array raises a
        # casting error in NumPy.
        sample_4d_data[i, j][mask] += np.random.poisson(50, size=mask.sum()).astype(np.uint16)
        
        # Add a diffraction spot in one region of the scan
        if i > 20 and j > 20:
            sample_4d_data[i, j][spot_mask] += np.random.poisson(20, size=spot_mask.sum()).astype(np.uint16)

print(f"Generated 4D dataset with shape: {sample_4d_data.shape}")
print(f"Data type: {sample_4d_data.dtype}")
print(f"Memory usage: {sample_4d_data.nbytes / 1024**2:.2f} MB")

# Create HDF5 file with proper structure and metadata
hdf5_filename = "sample_4dstem_data.h5"

print(f"\nCreating HDF5 file: {hdf5_filename}")

with h5py.File(hdf5_filename, "w") as f:
    # Create main data group
    data_group = f.create_group("4dstem_data")
    
    # Store the 4D dataset with compression
    dataset = data_group.create_dataset(
        "datacube", 
        data=sample_4d_data,
        compression="gzip",
        compression_opts=6,
        chunks=True,  # Enable chunking for better I/O performance
        shuffle=True  # Improve compression
    )
    
    # Add comprehensive metadata
    dataset.attrs["description"] = "Simulated 4D-STEM dataset for MS 465 course"
    dataset.attrs["created"] = datetime.now().isoformat()
    dataset.attrs["scan_shape"] = scan_size
    dataset.attrs["detector_shape"] = detector_size
    dataset.attrs["data_type"] = str(sample_4d_data.dtype)
    dataset.attrs["units"] = "counts"
    
    # Experimental parameters (typical for 4D-STEM)
    params_group = f.create_group("experimental_parameters")
    params_group.attrs["accelerating_voltage_kV"] = 200.0
    params_group.attrs["camera_length_mm"] = 195.0
    params_group.attrs["convergence_angle_mrad"] = 1.0
    params_group.attrs["pixel_size_nm"] = 0.5
    params_group.attrs["dwell_time_ms"] = 1.0
    params_group.attrs["microscope"] = "Simulated TEM"
    
    # Processing parameters
    processing_group = f.create_group("processing")
    processing_group.attrs["software"] = "MS 465 Assignment 1"
    processing_group.attrs["version"] = "1.0"
    processing_group.attrs["processed_date"] = datetime.now().isoformat()
    
    print("✓ 4D dataset stored with compression")
    print("✓ Comprehensive metadata added")
    print("✓ Experimental parameters recorded")

# Verify file size and compression ratio
original_size = sample_4d_data.nbytes
compressed_size = os.path.getsize(hdf5_filename)
compression_ratio = original_size / compressed_size

print(f"\nCompression Statistics:")
print(f"Original size: {original_size / 1024**2:.2f} MB")
print(f"Compressed size: {compressed_size / 1024**2:.2f} MB")
print(f"Compression ratio: {compression_ratio:.2f}x")

print("\n" + "="*80)

In [None]:
"""
Read and explore the HDF5 file structure
"""

print("="*80)
print("Reading and Exploring HDF5 Data Structure")
print("="*80)

def explore_hdf5_structure(filename):
    """Print every group and dataset in an HDF5 file, with its attributes"""
    with h5py.File(filename, "r") as f:
        print(f"HDF5 File: {filename}")
        print(f"File size: {os.path.getsize(filename) / 1024**2:.2f} MB")
        print("-" * 60)
        
        def print_structure(name, obj):
            # `name` is the full path from the file root, so the number
            # of slashes gives the nesting depth
            item_indent = "  " * (name.count("/") + 1)
            if isinstance(obj, h5py.Dataset):
                print(f"{item_indent}📊 {name}: {obj.shape} {obj.dtype}")
            elif isinstance(obj, h5py.Group):
                print(f"{item_indent}📁 {name}/")
            # Groups and datasets both carry attributes
            for attr_name, attr_value in obj.attrs.items():
                print(f"{item_indent}   └─ {attr_name}: {attr_value}")
        
        f.visititems(print_structure)

# Explore the structure
explore_hdf5_structure(hdf5_filename)

# Load and examine the data
print(f"\nLoading data for analysis...")

with h5py.File(hdf5_filename, "r") as f:
    # Load the 4D dataset
    datacube = f["4dstem_data"]["datacube"][:]
    
    # Get metadata
    scan_shape = f["4dstem_data"]["datacube"].attrs["scan_shape"]
    detector_shape = f["4dstem_data"]["datacube"].attrs["detector_shape"]
    
    print(f"✓ Loaded 4D datacube: {datacube.shape}")
    print(f"✓ Scan dimensions: {scan_shape}")
    print(f"✓ Detector dimensions: {detector_shape}")
    
    # Calculate some basic statistics
    total_counts = np.sum(datacube)
    mean_counts = np.mean(datacube)
    max_counts = np.max(datacube)
    
    print(f"\nDataset Statistics:")
    print(f"Total counts: {total_counts:,}")
    print(f"Mean counts per pixel: {mean_counts:.2f}")
    print(f"Maximum counts: {max_counts}")

print("\n" + "="*80)

## Part 4: Basic EM Data Visualization

Now we'll create meaningful visualizations of our simulated 4D-STEM data, demonstrating common analysis techniques used in electron microscopy.
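
A virtual detector image is a sum, at each scan position, over the detector pixels selected by a mask. The sketch below illustrates the idea on a toy datacube small enough to check by hand. The helper names `circular_mask` and `virtual_image` are hypothetical. It uses boolean indexing on the last two axes, which gives the same result as multiplying by the mask and summing (the approach used in the cell below) without building a full-size temporary array.

```python
import numpy as np


def circular_mask(shape, center, radius):
    """Boolean mask that is True inside a circle on the detector plane."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0])**2 + (cols - center[1])**2 <= radius**2


def virtual_image(datacube, mask):
    """Sum the detector pixels selected by `mask` at every scan position."""
    # Indexing the last two axes with a 2D boolean mask yields an array of
    # shape (scan_rows, scan_cols, n_selected_pixels)
    return datacube[..., mask].sum(axis=-1)


# Toy datacube: 2x3 scan, 5x5 detector, one count in every pixel
toy = np.ones((2, 3, 5, 5), dtype=np.uint16)
bf = circular_mask((5, 5), center=(2, 2), radius=1)

image = virtual_image(toy, bf)
print(bf.sum())      # number of detector pixels inside the mask
print(image.shape)   # one value per scan position
print(image)
```

With radius 1 the mask covers the central pixel and its four neighbours, so every scan position in the toy image sums to 5.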

In [None]:
"""
Create comprehensive visualizations of 4D-STEM data
"""

# Set up matplotlib for high-quality figures
plt.style.use('default')
plt.rcParams['figure.dpi'] = 100
plt.rcParams['savefig.dpi'] = 300
plt.rcParams['font.size'] = 10

print("="*80)
print("4D-STEM Data Visualization")
print("="*80)

# Load the data
with h5py.File(hdf5_filename, "r") as f:
    datacube = f["4dstem_data"]["datacube"][:]

# 1. Virtual Bright Field (VBF) Image
print("Creating Virtual Bright Field image...")

# Define central detector region (bright field)
center_x, center_y = detector_shape[0]//2, detector_shape[1]//2
radius = 15  # pixels

# Create circular mask for bright field detector
y, x = np.ogrid[:detector_shape[0], :detector_shape[1]]
bf_mask = (x - center_x)**2 + (y - center_y)**2 <= radius**2

# Sum over detector pixels within bright field region
vbf_image = np.sum(datacube * bf_mask, axis=(2, 3))

# 2. Virtual Dark Field (VDF) Image
print("Creating Virtual Dark Field image...")

# Define annular dark field detector (outer ring)
inner_radius = 25
outer_radius = 50

df_mask = ((x - center_x)**2 + (y - center_y)**2 >= inner_radius**2) & \
          ((x - center_x)**2 + (y - center_y)**2 <= outer_radius**2)

vdf_image = np.sum(datacube * df_mask, axis=(2, 3))

# 3. Average Diffraction Pattern
print("Computing average diffraction pattern...")

avg_dp = np.mean(datacube, axis=(0, 1))

# 4. Selected Area Diffraction Patterns
print("Extracting selected area diffraction patterns...")

# Select a few interesting (row, col) scan positions
positions = [(10, 10), (32, 32), (50, 50)]
selected_dps = [datacube[row, col] for row, col in positions]

# Create comprehensive visualization
fig = plt.figure(figsize=(16, 12))

# Virtual Bright Field
ax1 = plt.subplot(2, 4, 1)
im1 = ax1.imshow(vbf_image, cmap='gray', origin='lower')
ax1.set_title('Virtual Bright Field (VBF)')
ax1.set_xlabel('Scan X (pixels)')
ax1.set_ylabel('Scan Y (pixels)')
plt.colorbar(im1, ax=ax1, label='Intensity')

# Virtual Dark Field
ax2 = plt.subplot(2, 4, 2)
im2 = ax2.imshow(vdf_image, cmap='hot', origin='lower')
ax2.set_title('Virtual Dark Field (VDF)')
ax2.set_xlabel('Scan X (pixels)')
ax2.set_ylabel('Scan Y (pixels)')
plt.colorbar(im2, ax=ax2, label='Intensity')

# Average Diffraction Pattern
ax3 = plt.subplot(2, 4, 3)
im3 = ax3.imshow(avg_dp, cmap='viridis', origin='lower', 
                 norm=plt.Normalize(vmin=0, vmax=np.percentile(avg_dp, 99)))
ax3.set_title('Average Diffraction Pattern')
ax3.set_xlabel('Detector X (pixels)')
ax3.set_ylabel('Detector Y (pixels)')

# Add circles to show virtual detector positions
circle_bf = plt.Circle((center_x, center_y), radius, fill=False, color='red', linewidth=2)
circle_df_inner = plt.Circle((center_x, center_y), inner_radius, fill=False, color='yellow', linewidth=2)
circle_df_outer = plt.Circle((center_x, center_y), outer_radius, fill=False, color='yellow', linewidth=2)
ax3.add_patch(circle_bf)
ax3.add_patch(circle_df_inner)
ax3.add_patch(circle_df_outer)

plt.colorbar(im3, ax=ax3, label='Intensity')

# Virtual detector diagram
ax4 = plt.subplot(2, 4, 4)
detector_diagram = np.zeros(detector_shape)
detector_diagram[bf_mask] = 1  # Bright field
detector_diagram[df_mask] = 2  # Dark field

im4 = ax4.imshow(detector_diagram, cmap='viridis', origin='lower')
ax4.set_title('Virtual Detector Configuration')
ax4.set_xlabel('Detector X (pixels)')
ax4.set_ylabel('Detector Y (pixels)')

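# Add legend. Sample the colors from viridis so the patches match the
# image: 0 (background) maps to the low end, 1 (bright field) to the
# midpoint, and 2 (dark field) to the high end of the colormap.
from matplotlib.patches import Patch
legend_elements = [Patch(facecolor=plt.cm.viridis(0.0), label='Background'),
                  Patch(facecolor=plt.cm.viridis(0.5), label='Bright Field'),
                  Patch(facecolor=plt.cm.viridis(1.0), label='Dark Field')]
ax4.legend(handles=legend_elements, loc='upper right')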

# Selected area diffraction patterns
for i, (pos, dp) in enumerate(zip(positions, selected_dps)):
    ax = plt.subplot(2, 4, 5 + i)
    im = ax.imshow(dp, cmap='plasma', origin='lower',
                   norm=plt.Normalize(vmin=0, vmax=np.percentile(dp, 99)))
    ax.set_title(f'Diffraction Pattern\nPosition ({pos[0]}, {pos[1]})')
    ax.set_xlabel('Detector X (pixels)')
    ax.set_ylabel('Detector Y (pixels)')
    
    if i == 0:  # Add colorbar to first DP
        plt.colorbar(im, ax=ax, label='Intensity')

# Add position markers to VBF image
for pos in positions:
    ax1.plot(pos[1], pos[0], 'ro', markersize=8, markerfacecolor='none', markeredgewidth=2)
    ax1.text(pos[1]+2, pos[0]+2, f'({pos[0]},{pos[1]})', color='red', fontsize=8)

plt.tight_layout()
plt.savefig('4dstem_analysis_overview.png', dpi=300, bbox_inches='tight')
plt.show()

# Print analysis summary
print(f"\n4D-STEM Analysis Summary:")
print(f"Dataset shape: {datacube.shape}")
print(f"VBF image shape: {vbf_image.shape}")
print(f"VDF image shape: {vdf_image.shape}")
print(f"Average DP shape: {avg_dp.shape}")
print(f"BF detector area: {np.sum(bf_mask)} pixels")
print(f"DF detector area: {np.sum(df_mask)} pixels")

# Calculate contrast metrics
vbf_contrast = (np.max(vbf_image) - np.min(vbf_image)) / np.mean(vbf_image)
vdf_contrast = (np.max(vdf_image) - np.min(vdf_image)) / np.mean(vdf_image)

print(f"VBF contrast: {vbf_contrast:.3f}")
print(f"VDF contrast: {vdf_contrast:.3f}")

print("\n✓ 4D-STEM visualization complete")
print("✓ Figure saved as '4dstem_analysis_overview.png'")

print("\n" + "="*80)

## Part 5: Reproducible Research Practices

Document your work with proper metadata, version information, and analysis parameters to ensure reproducibility.
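
The cell below fingerprints the in-memory array with MD5. A complementary approach is to hash the file on disk, read in chunks so that large files never need to fit in memory. The following is a minimal sketch with the standard-library `hashlib`. The function name `file_sha256` and the default chunk size are assumptions for illustration, and SHA-256 is swapped in here for MD5.

```python
import hashlib
import os
import tempfile


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a small temporary file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"electron microscopy" * 1000)
    print(file_sha256(demo_path))
```

The HDF5 file written in Part 3 stores creation timestamps as attributes, so its file hash will change on every run even though the array hash stays the same with a fixed random seed. Choose the fingerprint that matches what you want to verify.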

In [None]:
"""
Generate analysis report with complete metadata for reproducibility
"""

import json
import hashlib

print("="*80)
print("Reproducible Research Documentation")
print("="*80)

# Create comprehensive analysis metadata
analysis_metadata = {
    "analysis_info": {
        "assignment": "MS 465 Assignment 1: Environment Setup",
        "date": datetime.now().isoformat(),
        "analyst": getpass.getuser(),  # Current username
        "python_version": sys.version,
        "platform": platform.platform()
    },
    
    "data_info": {
        "filename": hdf5_filename,
        "file_size_bytes": os.path.getsize(hdf5_filename),
        "data_shape": list(datacube.shape),
        "data_type": str(datacube.dtype),
        "total_counts": int(np.sum(datacube)),
        "data_hash": hashlib.md5(datacube.tobytes()).hexdigest()
    },
    
    "analysis_parameters": {
        "bf_detector_radius": radius,
        "df_detector_inner_radius": inner_radius,
        "df_detector_outer_radius": outer_radius,
        "selected_positions": positions,
        "visualization_percentile": 99
    },
    
    "results": {
        "vbf_contrast": float(vbf_contrast),
        "vdf_contrast": float(vdf_contrast),
        "bf_detector_area_pixels": int(np.sum(bf_mask)),
        "df_detector_area_pixels": int(np.sum(df_mask)),
        "avg_dp_mean_intensity": float(np.mean(avg_dp)),
        "avg_dp_max_intensity": float(np.max(avg_dp))
    },
    
    "software_versions": {}
}

# Add package versions
key_packages = ['numpy', 'matplotlib', 'h5py', 'scipy']
for package in key_packages:
    try:
        module = __import__(package)
        version = getattr(module, '__version__', 'Unknown')
        analysis_metadata["software_versions"][package] = version
    except ImportError:
        analysis_metadata["software_versions"][package] = "Not installed"

# Save metadata as JSON
metadata_filename = "analysis_metadata.json"
with open(metadata_filename, 'w') as f:
    json.dump(analysis_metadata, f, indent=2)

print(f"Analysis Metadata Summary:")
print(f"Analyst: {analysis_metadata['analysis_info']['analyst']}")
print(f"Date: {analysis_metadata['analysis_info']['date']}")
print(f"Data file: {analysis_metadata['data_info']['filename']}")
print(f"Data hash: {analysis_metadata['data_info']['data_hash'][:16]}...")
print(f"Analysis results saved to: {metadata_filename}")

# Create a summary report
report = f"""
# MS 465 Assignment 1 Analysis Report

## Analysis Summary
- **Date**: {analysis_metadata['analysis_info']['date']}
- **Analyst**: {analysis_metadata['analysis_info']['analyst']}
- **Dataset**: {analysis_metadata['data_info']['filename']}
- **Data Shape**: {analysis_metadata['data_info']['data_shape']}

## Key Results
- **VBF Contrast**: {analysis_metadata['results']['vbf_contrast']:.3f}
- **VDF Contrast**: {analysis_metadata['results']['vdf_contrast']:.3f}
- **Total Counts**: {analysis_metadata['data_info']['total_counts']:,}

## Virtual Detector Configuration
- **Bright Field Radius**: {analysis_metadata['analysis_parameters']['bf_detector_radius']} pixels
- **Dark Field Inner Radius**: {analysis_metadata['analysis_parameters']['df_detector_inner_radius']} pixels
- **Dark Field Outer Radius**: {analysis_metadata['analysis_parameters']['df_detector_outer_radius']} pixels

## Software Environment
"""

for package, version in analysis_metadata['software_versions'].items():
    report += f"- **{package}**: {version}\n"

report += f"\n## Reproducibility Information\n"
report += f"- **Data Hash**: `{analysis_metadata['data_info']['data_hash']}`\n"
report += f"- **Python Version**: {sys.version.split()[0]}\n"
report += f"- **Platform**: {platform.platform()}\n"

# Save report
report_filename = "analysis_report.md"
with open(report_filename, 'w') as f:
    f.write(report)

print(f"\n✓ Metadata saved to: {metadata_filename}")
print(f"✓ Report saved to: {report_filename}")

# Display file summary
files_created = [
    hdf5_filename,
    "4dstem_analysis_overview.png",
    metadata_filename,
    report_filename
]

print(f"\nFiles Created This Session:")
print("-" * 40)
for filename in files_created:
    if os.path.exists(filename):
        size = os.path.getsize(filename)
        print(f"✓ {filename:<30} ({size:,} bytes)")
    else:
        print(f"✗ {filename:<30} (not found)")

print("\n" + "="*80)

## Part 6: Assignment Completion Checklist

Before submitting this assignment, verify that you have completed all required components.
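
The submission requirements ask for at least 5 meaningful commits. The count (though not the "meaningful" part) can be checked automatically. The sketch below uses `git rev-list --count HEAD`; the names `count_commits`, `meets_commit_requirement`, and `MIN_COMMITS` are hypothetical.

```python
import subprocess

MIN_COMMITS = 5  # from the submission requirements


def count_commits(repo_dir="."):
    """Return the number of commits reachable from HEAD, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-list", "--count", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Not a repository, no commits yet, or git is not installed
        return None
    return int(result.stdout.strip())


def meets_commit_requirement(n_commits, minimum=MIN_COMMITS):
    """True when a commit count is known and is at least `minimum`."""
    return n_commits is not None and n_commits >= minimum


n = count_commits()
print(f"Commits found: {n}")
print(f"Meets minimum of {MIN_COMMITS}: {meets_commit_requirement(n)}")
```

This only counts commits. Whether each commit message is meaningful still needs a manual look at `git log --oneline`.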

In [None]:
"""
Assignment completion verification
"""

print("="*80)
print("ASSIGNMENT 1 COMPLETION CHECKLIST")
print("="*80)

checklist_items = [
    ("Environment Verification", "All required packages installed and working"),
    ("Git Configuration", "Git configured with user name and email"),
    ("GitHub Repository", "Repository created and properly structured"),
    ("HDF5 Data Management", "Sample EM data created and stored with metadata"),
    ("Data Visualization", "4D-STEM analysis plots generated"),
    ("Reproducibility Documentation", "Analysis metadata and report created"),
    ("Code Quality", "Code is well-documented with comments"),
    ("File Organization", "All files properly organized and named")
]

print("Please verify the following items are complete:")
print()

for i, (item, description) in enumerate(checklist_items, 1):
    print(f"{i}. ☐ {item}")
    print(f"   {description}")
    print()

print("SUBMISSION REQUIREMENTS:")
print("-" * 40)
print("1. Complete this Jupyter notebook with all cells executed")
print("2. Commit all files to your GitHub repository with meaningful messages")
print("3. Include the following files in your repository:")

required_files = [
    "assignment_01_environment_setup.ipynb",
    "sample_4dstem_data.h5", 
    "4dstem_analysis_overview.png",
    "analysis_metadata.json",
    "analysis_report.md",
    "README.md (repository description)"
]

for filename in required_files:
    print(f"   - {filename}")

print("\n4. Submit your GitHub repository URL via Canvas")
print("5. Ensure the repository is public or accessible to the instructor")

print("\nGRADING CRITERIA:")
print("-" * 40)
print("- Environment Setup (25%): All packages working correctly")
print("- GitHub Integration (25%): Proper version control workflow")
print("- Code Quality (25%): Clean, documented, reproducible code")
print("- Data Analysis (25%): Successful EM data processing and visualization")

print(f"\nDUE DATE: Friday, January 9, 2026 at 11:59 PM")

print("\n" + "="*80)

# Final system check
print("FINAL SYSTEM STATUS:")
print("-" * 40)

try:
    # Check if all required files exist
    files_exist = all(os.path.exists(f) for f in files_created)
    print(f"{'✓' if files_exist else '✗'} All output files created: {files_exist}")
    
    # Check if in git repository
    git_status = run_git_command("git status --porcelain")
    in_git_repo = "Error:" not in git_status
    print(f"{'✓' if in_git_repo else '✗'} In Git repository: {in_git_repo}")
    
    # Check package availability
    core_packages = ['numpy', 'matplotlib', 'h5py', 'scipy']
    packages_ok = True
    for pkg in core_packages:
        try:
            __import__(pkg)
        except ImportError:
            packages_ok = False
            break
    print(f"{'✓' if packages_ok else '✗'} Core packages available: {packages_ok}")
    
    overall_status = files_exist and in_git_repo and packages_ok
    
    if overall_status:
        print("\nStatus: ✅ READY FOR SUBMISSION")
        print("Your environment is properly configured and all components are working!")
    else:
        print("\nStatus: ⚠️  NEEDS ATTENTION")
        print("Please review the checklist and fix any issues before submitting.")
        
except Exception as e:
    print(f"\nStatus: ❌ ERROR")
    print(f"Error during status check: {e}")

print("\n" + "="*80)