# Ungraded Lab: Experiment Management Lab

## Overview 
In this hands-on lab, you'll learn to manage multiple analysis approaches using Git branching strategies at EngageMetrics. You'll create feature branches for different analytical experiments, handle merge conflicts, and document your branching strategy—essential skills for collaborative data science projects.

<b>Pro tip:</b> Stuck on a step? The screencast shows a similar workflow that can help you troubleshoot. Take a moment to review that section before continuing.

## Learning Outcomes 

By the end of this lab, you will be able to:
- Create and manage feature branches for different analysis approaches
- Implement Git Flow for data science experimentation
- Resolve merge conflicts in Jupyter notebooks
- Document branching strategies effectively

## Activities

### Activity 1:  Setting Up Feature Branches

<b>Step 1:</b> Import the required libraries and load the dataset: 

In [1]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
df = pd.read_csv('employee_insights_cleaned.csv')

<b>Step 2:</b> Create Analysis Branches 

In [2]:
# Initialize a new Git repository with a main branch (if the repo doesn't exist yet)
!git init -b main

# (Optional) Set your user info if it is not set globally. Remove the leading # to run the commands below
#!git config user.name "Your Name"
#!git config user.email "your.email@example.com"

# Add all files to staging and commit initial files
!git add .
!git commit -m "initial commit"

# Create a branch for department analysis
!git checkout -b feature/department-analysis

# Create a branch for work-mode analysis (it starts from the current branch, which still matches main)
!git checkout -b feature/work-mode-analysis

Reinitialized existing Git repository in /home/jovyan/work/.git/
[main 49b7d9f] initial commit
 4 files changed, 77 insertions(+), 42 deletions(-)
 mode change 100644 => 100755 .dotfiles-coursera/.gitconfig
 mode change 100644 => 100755 .ipynb_checkpoints/ExperimentManagementLab-checkpoint.ipynb
 mode change 100644 => 100755 ExperimentManagementLab.ipynb
 mode change 100644 => 100755 employee_insights_cleaned.csv
fatal: A branch named 'feature/department-analysis' already exists.
fatal: A branch named 'feature/work-mode-analysis' already exists.


<b>Tip:</b> Use descriptive branch names that clearly indicate the purpose of each analysis. Run !git branch to list the branches and confirm that both were created.
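
If you'd rather check the branches from Python than read the listing by eye, here is a minimal sketch. It parses the text printed by git branch and reports which of the two feature branches are missing. The helper names (parse_branch_list, missing_branches, read_branch_listing) are hypothetical, and the sample listing is illustrative, not captured from the lab environment.

```python
import subprocess

# Branch names taken from this lab
EXPECTED_BRANCHES = ["feature/department-analysis", "feature/work-mode-analysis"]


def parse_branch_list(output):
    """Split the text printed by `git branch` into (branch names, current branch)."""
    branches = []
    current = None
    for line in output.splitlines():
        if not line.strip():
            continue
        name = line[2:].strip()  # drop the two-character "* " or "  " prefix
        if line.startswith("*"):
            current = name
        branches.append(name)
    return branches, current


def missing_branches(output, expected=EXPECTED_BRANCHES):
    """Return the expected branch names that do not appear in the listing."""
    branches, _ = parse_branch_list(output)
    return [name for name in expected if name not in branches]


def read_branch_listing():
    """Run `git branch` in the current directory and return its printed text."""
    completed = subprocess.run(
        ["git", "branch"], capture_output=True, text=True, check=True
    )
    return completed.stdout


# Illustrative listing; in the lab, pass read_branch_listing() instead
sample = "  feature/department-analysis\n  feature/work-mode-analysis\n* main\n"
branches, current = parse_branch_list(sample)
print(current)                   # main
print(missing_branches(sample))  # []
```

An empty list means both feature branches exist. Any name that is printed still needs to be created.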

### Activity 2:  Developing Parallel Analyses

<b>Step 1:</b> Work on Department Analysis

In [3]:
# Switch to department analysis branch
!git checkout feature/department-analysis

# Add a function to analyze and visualize department performance:
def analyze_department_performance(df):
    """
    Analyze performance metrics by department
    """
    # Your analysis code here
    
# You may call the function to check the output
# analyze_department_performance(df)

Switched to branch 'feature/department-analysis'


<b>Step 2:</b> Work on Work-Mode Analysis


In [4]:
# Switch to work mode analysis branch
!git checkout feature/work-mode-analysis

# Add a function to analyze and visualize work-mode impact:
def analyze_work_mode_impact(df):
    """
    Analyze performance metrics by work mode
    """
    # Your analysis code here
    
# You may call the function to check the output
# analyze_work_mode_impact(df)

Switched to branch 'feature/work-mode-analysis'


<b>Tip:</b> Keep your analyses in separate functions to minimize merge conflicts.

### Activity 3: Handling Merge Conflicts

<b>Step 1:</b>  Merge Feature Branches

In [5]:
# Switch to main branch
!git checkout main

# Merge department analysis
!git merge feature/department-analysis

Switched to branch 'main'
Already up to date.


<b>Step 2:</b> Resolve Conflicts

When conflicts occur:
1. Open the conflicting notebook
2. Choose which analysis version to keep
3. Update the notebook metadata
4. Commit the resolved changes
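
Before choosing which version to keep, it helps to see exactly where the conflict sits. The sketch below assumes you read the conflicted notebook as plain text. It lists the lines that look like Git conflict markers and checks whether the file still parses as JSON (a notebook containing markers generally does not, so Jupyter may refuse to open it). The function names and the sample text are hypothetical.

```python
import json


def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that look like Git conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        is_marker = (
            line.startswith("<<<<<<< ")
            or line.startswith(">>>>>>> ")
            or line.rstrip() == "======="
        )
        if is_marker:
            hits.append((number, line.rstrip()))
    return hits


def is_valid_json(text):
    """Report whether the text parses as JSON, as a notebook file should."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Illustrative fragment of a conflicted notebook file
sample = "\n".join([
    "{",
    "<<<<<<< HEAD",
    ' "cells": [],',
    "=======",
    ' "cells": [{}],',
    ">>>>>>> feature/department-analysis",
    ' "nbformat": 4',
    "}",
])

for number, line in find_conflict_markers(sample):
    print(number, line)
print(is_valid_json(sample))  # False
```

To try it on a real file, read the notebook with open(path, encoding="utf-8").read() and pass the text to find_conflict_markers. Once the markers are resolved, is_valid_json should return True again.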

<b>Test Your Work</b>
1. Run all cells in the merged notebook
2. Verify that the analyses produce the expected results
3. Run !git status to confirm the workspace is clean
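
The clean-workspace check can also be scripted. This is a minimal sketch, assuming the short format printed by git status --porcelain, where each line is a two-character status code, a space, and a path. The names summarize_status and read_status are hypothetical, and the sample text is illustrative.

```python
import subprocess

# Two-character codes that git status --porcelain uses for unmerged paths
UNMERGED_CODES = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def summarize_status(porcelain):
    """Group `git status --porcelain` lines into unmerged paths and other changes."""
    unmerged, other = [], []
    for line in porcelain.splitlines():
        if not line.strip():
            continue
        code, path = line[:2], line[3:]
        if code in UNMERGED_CODES:
            unmerged.append(path)
        else:
            other.append(path)
    return {
        "clean": not unmerged and not other,
        "unmerged": unmerged,
        "other": other,
    }


def read_status():
    """Run `git status --porcelain` in the current directory and return its text."""
    completed = subprocess.run(
        ["git", "status", "--porcelain"], capture_output=True, text=True, check=True
    )
    return completed.stdout


# Illustrative output; in the lab, pass read_status() instead
sample = "UU ExperimentManagementLab.ipynb\n M employee_insights_cleaned.csv\n"
print(summarize_status(sample))
print(summarize_status("")["clean"])  # True
```

An empty listing means there is nothing left to commit or resolve. Any path under "unmerged" still has a conflict to fix.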

## Success Checklist
- Feature branches created and organized
- Analyses completed in separate branches
- Merge conflicts resolved successfully
- Documentation updated for all changes
- Final notebook runs without errors

## Common Issues & Solutions 
- Problem: Notebook merge conflicts 
    - Solution: Clear outputs before merging and use nbdime for notebook-specific merging
- Problem: Lost analysis version 
    - Solution: Create backup branches before merging
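
Clearing outputs before merging can be sketched in a few lines. The example assumes the nbformat 4 JSON layout, in which each code cell stores its results under "outputs" and "execution_count". The function name clear_outputs and the sample notebook are hypothetical. This is an illustration of the idea, not a replacement for nbdime.

```python
import copy


def clear_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and counters removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Illustrative notebook with one markdown cell and one executed code cell
sample = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
            "source": ["print('hi')"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = clear_outputs(sample)
print(cleaned["cells"][1]["outputs"])         # []
print(sample["cells"][1]["execution_count"])  # 3 (the original is left untouched)
```

To apply this to a file, load it with json.load, pass the result through clear_outputs, and write it back with json.dump. Because outputs and execution counts change on every run, removing them before committing leaves fewer lines for Git to conflict on.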
    
## Summary 
Congratulations! You've learned to manage multiple analysis approaches using Git branching strategies. This workflow will help you organize collaborative data science projects effectively.

### Key Points
- Feature branches keep experiments organized
- Clear branch naming aids collaboration
- Regular merges prevent major conflicts
- Documentation ensures reproducibility

## Solution Code
Stuck on your code or want to check your solution? Here's a complete reference implementation. It represents just one effective approach, so try solving the lab independently first, then use this code to get past obstacles or compare techniques. Happy coding!


### Activity 1: Setting Up Feature Branches - Solution Code

In [6]:
# Activity 1: Setting Up Feature Branches

"""
# Notebook commands for branch setup
!git checkout -b feature/department-analysis
!git checkout -b feature/work-mode-analysis

# Verify branches
!git branch
"""

# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
df = pd.read_csv('employee_insights_cleaned.csv')

### Activity 2: Developing Parallel Analyses - Solution Code

In [7]:
# Step 1: Department Analysis Branch

"""
!git checkout feature/department-analysis
"""

def analyze_department_performance(df):
    """Analyze performance metrics by department.
    
    Args:
        df (pandas.DataFrame): Employee data containing department information
            and performance metrics
    
    Returns:
        dict: Dictionary containing performance metrics by department
    """
    results = {}
    
    # Calculate average performance metrics by department
    dept_metrics = df.groupby('department').agg({
        'satisfaction_score': ['mean', 'std'],
        'projects_completed': ['mean', 'count']
    }).round(2)
    
    # Visualize department performance
    plt.figure(figsize=(12, 6))
    sns.barplot(data=df, x='department', y='satisfaction_score')
    plt.title('Average Satisfaction by Department')
    plt.xticks(rotation=45)
    plt.tight_layout()
    
    results['metrics'] = dept_metrics
    results['top_department'] = dept_metrics['satisfaction_score']['mean'].idxmax()
    
    return results

# Step 2: Work Mode Analysis Branch

"""
!git checkout feature/work-mode-analysis
"""

def analyze_work_mode_impact(df):
    """Analyze performance metrics by work mode.
    
    Args:
        df (pandas.DataFrame): Employee data containing work mode information
            and performance metrics
    
    Returns:
        dict: Dictionary containing performance metrics by work mode
    """
    results = {}
    
    # Calculate metrics by work mode
    work_mode_metrics = df.groupby('work_mode').agg({
        'satisfaction_score': ['mean', 'std'],
        'projects_completed': ['mean', 'count']
    }).round(2)
    
    # Visualize work mode impact
    plt.figure(figsize=(10, 6))
    sns.boxplot(data=df, x='work_mode', y='satisfaction_score')
    plt.title('Satisfaction Distribution by Work Mode')
    plt.tight_layout()
    
    results['metrics'] = work_mode_metrics
    results['most_satisfied'] = work_mode_metrics['satisfaction_score']['mean'].idxmax()
    
    return results

### Activity 3: Handling Merge Conflicts - Solution Code

In [8]:
# Step 1: Merge Feature Branch

"""
!git checkout main

# Merge department analysis
!git merge feature/department-analysis
"""


# Step 2: Merge Conflict Resolution

"""
# Check whether there are conflicts:
!git status

# If there are, open the notebook file named in the conflict and manually
# resolve the conflict markers (<<<<<<<, =======, >>>>>>>).

# Stage the resolved files and commit the merge:
!git add .

!git commit -m "resolve: merge conflicts in analysis notebook"
"""

