# CustomFit Tailoring Time Prediction

CustomFit is a premium tailoring service that wants to predict alteration times for customer garments more accurately. It has collected data on garment complexity (measured by the number of alterations needed) and the total time taken to complete those alterations.

## Sample Data

| Alterations Required | Completion Time (minutes) |
|---------------------|---------------------------|
| 3 | 45 |
| 7 | 95 |
| 2 | 35 |
| 8 | 105 |
| 1 | 25 |
| 4 | 55 |
| 6 | 85 |
| 2 | 30 |
| 3 | 40 |
| 7 | 90 |
| 5 | 70 |
| 4 | 50 |
| 5 | 75 |
| 6 | 80 |

In [None]:
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from utils.testing.regression_tests import (
    check_coefficients,
    check_prediction,
    check_diagnostics
)

## Part 1: Linear Regression Analysis

Fit a linear regression model to predict completion time based on number of alterations.

Hint:
- Expected relationship: Time = β₀ + β₁ × Alterations
- Use sklearn's LinearRegression
- Reshape the feature array to 2-D before fitting, e.g. alterations.reshape(-1, 1), because sklearn expects shape (n_samples, n_features)
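
The hints above can be sketched as follows. This is one possible way to fit the line with sklearn, not the worksheet's reference solution; the names `X`, `model`, and `r_squared` are illustrative, and whether `check_coefficients` accepts these values depends on a tolerance that is not shown here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Sample data from the table above
alterations = np.array([3, 7, 2, 8, 1, 4, 6, 2, 3, 7, 5, 4, 5, 6])
times = np.array([45, 95, 35, 105, 25, 55, 85, 30, 40, 90, 70, 50, 75, 80])

# sklearn expects features with shape (n_samples, n_features)
X = alterations.reshape(-1, 1)

model = LinearRegression().fit(X, times)

slope = float(model.coef_[0])        # β₁: extra minutes per additional alteration
intercept = float(model.intercept_)  # β₀: baseline minutes at zero alterations
r_squared = model.score(X, times)    # share of the variance in time explained by the fit

print(f"Time ≈ {intercept:.2f} + {slope:.2f} × Alterations (R² = {r_squared:.3f})")
```

On the sample data this gives roughly 12.1 minutes per additional alteration on top of a baseline of about 8.4 minutes.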

In [None]:
### insert code here ###
# Prepare data
alterations = np.array([3, 7, 2, 8, 1, 4, 6, 2, 3, 7, 5, 4, 5, 6])
times = np.array([45, 95, 35, 105, 25, 55, 85, 30, 40, 90, 70, 50, 75, 80])

# Fit model and extract coefficients
slope = None      # Replace None with your calculation
intercept = None  # Replace None with your calculation

# Test your calculations
if check_coefficients(slope, intercept):
    print("✓ Regression coefficients correct!")
else:
    print("✗ Check your regression calculations")

## Part 2: Prediction

Use your model to make predictions and calculate confidence intervals.

Hint:
- Use model.predict() for point predictions
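- Quantify the uncertainty around the point prediction
- Remember the difference between the two intervals: a confidence interval covers the mean completion time at a given number of alterations, while a prediction interval covers a single new garment and is always wider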
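
A minimal sketch of these hints, assuming a 95% level and the standard simple-linear-regression interval formulas. The point prediction uses model.predict(); the intervals are computed by hand because sklearn's LinearRegression does not provide them. The names `ci_half`, `pi_half`, and `leverage` are illustrative, not part of the worksheet.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

alterations = np.array([3, 7, 2, 8, 1, 4, 6, 2, 3, 7, 5, 4, 5, 6])
times = np.array([45, 95, 35, 105, 25, 55, 85, 30, 40, 90, 70, 50, 75, 80])

X = alterations.reshape(-1, 1)
model = LinearRegression().fit(X, times)

# Point prediction for a garment needing 5 alterations
test_alterations = 5
predicted_time = float(model.predict(np.array([[test_alterations]]))[0])

# Residual standard error with n - 2 degrees of freedom
n = len(alterations)
residuals = times - model.predict(X)
s = np.sqrt(np.sum(residuals ** 2) / (n - 2))

# Distance of the new point from the centre of the data, scaled by the spread of x
x_mean = alterations.mean()
s_xx = np.sum((alterations - x_mean) ** 2)
leverage = 1 / n + (test_alterations - x_mean) ** 2 / s_xx

# Assumed 95% level: two-sided t critical value
t_crit = stats.t.ppf(0.975, df=n - 2)

ci_half = t_crit * s * np.sqrt(leverage)      # uncertainty in the mean time
pi_half = t_crit * s * np.sqrt(1 + leverage)  # uncertainty for one new garment

print(f"Predicted time: {predicted_time:.1f} min")
print(f"95% confidence interval: ±{ci_half:.1f} min")
print(f"95% prediction interval: ±{pi_half:.1f} min")
```

The extra `1 +` under the square root is what makes the prediction interval wider: it adds the garment-to-garment scatter to the uncertainty in the fitted line.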

In [None]:
### insert code here ###
# Make predictions
test_alterations = 5  # Predict time for 5 alterations
predicted_time = None  # Replace None with your calculation

# Test your prediction
if check_prediction(predicted_time, test_alterations):
    print("✓ Prediction correct!")
else:
    print("✗ Check your prediction calculation")

## Part 3: Model Diagnostics

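Check whether the linear regression assumptions (linearity, independent errors, constant variance, and normally distributed residuals) are satisfied.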

Hint:
- Calculate and plot residuals
- Check for normality using Q-Q plot
- Look for patterns in residuals vs. fitted values
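
The checks above can also be done numerically. The sketch below is one possible approach; the Shapiro-Wilk test and the Spearman check on absolute residuals are choices made here, not requirements of the worksheet. The dictionary layout of `model_diagnostics` is hypothetical, since the format that `check_diagnostics` expects is not shown.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

alterations = np.array([3, 7, 2, 8, 1, 4, 6, 2, 3, 7, 5, 4, 5, 6])
times = np.array([45, 95, 35, 105, 25, 55, 85, 30, 40, 90, 70, 50, 75, 80])

X = alterations.reshape(-1, 1)
model = LinearRegression().fit(X, times)

fitted = model.predict(X)
residuals = times - fitted

# Normality: Shapiro-Wilk test (a small p-value suggests non-normal residuals)
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Q-Q data: theoretical normal quantiles vs. ordered residuals, plus the
# correlation of the Q-Q points (values near 1 indicate a straight Q-Q line)
(theoretical_q, ordered_resid), (qq_slope, qq_intercept, qq_r) = stats.probplot(
    residuals, dist="norm"
)

# Constant variance: does the size of the residuals grow or shrink with the
# fitted value? A strong rank correlation would hint at a funnel shape.
spread_corr, spread_p = stats.spearmanr(fitted, np.abs(residuals))

# Hypothetical container; adapt it to whatever check_diagnostics expects
model_diagnostics = {
    "residuals": residuals,
    "shapiro_p": shapiro_p,
    "qq_r": qq_r,
    "spread_corr": spread_corr,
}

print(f"Shapiro-Wilk p-value: {shapiro_p:.3f}")
print(f"Q-Q correlation: {qq_r:.3f}")
print(f"Spearman correlation of |residuals| with fitted values: {spread_corr:.3f}")
```

With only 14 observations these tests have little power, so the plots remain the more informative check.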

In [None]:
### insert code here ###
# Calculate residuals and run diagnostics
model_diagnostics = None  # Replace None with your calculations

# Test your diagnostics
if check_diagnostics(model_diagnostics):
    print("✓ Model diagnostics correct!")
else:
    print("✗ Review your diagnostic calculations")

## Part 4: Visualization

Create visual representations of:
1. Data points and regression line
2. Residual plot
3. Q-Q plot for normality

These plots will help communicate your findings to the CustomFit team.
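
One way to lay out the three plots side by side is sketched below. It is an illustration rather than the expected solution: it fits the line with np.polyfit to stay self-contained, and the colours, titles, and the names `fig` and `axes` are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

alterations = np.array([3, 7, 2, 8, 1, 4, 6, 2, 3, 7, 5, 4, 5, 6])
times = np.array([45, 95, 35, 105, 25, 55, 85, 30, 40, 90, 70, 50, 75, 80])

# np.polyfit returns coefficients from highest degree to lowest
slope, intercept = np.polyfit(alterations, times, deg=1)
fitted = intercept + slope * alterations
residuals = times - fitted

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# 1. Data points and regression line
axes[0].scatter(alterations, times, alpha=0.5)
x_line = np.linspace(alterations.min(), alterations.max(), 100)
axes[0].plot(x_line, intercept + slope * x_line, color="red", label="Fitted line")
axes[0].set_xlabel("Number of Alterations")
axes[0].set_ylabel("Completion Time (minutes)")
axes[0].set_title("Alterations vs. Time")
axes[0].legend()

# 2. Residuals vs. fitted values (look for curvature or a funnel shape)
axes[1].scatter(fitted, residuals, alpha=0.5)
axes[1].axhline(0, color="gray", linestyle="--")
axes[1].set_xlabel("Fitted Time (minutes)")
axes[1].set_ylabel("Residual (minutes)")
axes[1].set_title("Residuals vs. Fitted")

# 3. Normal Q-Q plot of the residuals
stats.probplot(residuals, dist="norm", plot=axes[2])
axes[2].set_title("Normal Q-Q Plot")

fig.tight_layout()
```

In a notebook the figure renders at the end of the cell; in a plain script, add a call to plt.show().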

In [None]:
### insert code here ###
# Create visualizations

# 1. Regression Plot
plt.figure(figsize=(12, 4))
plt.subplot(131)
plt.scatter(alterations, times, alpha=0.5)
plt.xlabel('Number of Alterations')
plt.ylabel('Completion Time (minutes)')
plt.title('Alterations vs. Time')

# Add the fitted regression line to the panel above, then add the
# residual plot (subplot 132) and the Q-Q plot (subplot 133)

## Extension Questions

1. Statistical Considerations:
   - How would adding polynomial terms affect the model?
   - What other factors might influence completion time?
   - How could we validate the model with new data?

2. Business Implications:
   - How can CustomFit use this model for scheduling?
   - What pricing strategies could this inform?
   - How could the model improve customer satisfaction?
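
As a starting point for the polynomial and validation questions, the sketch below compares a straight line with a quadratic fit using leave-one-out cross-validation. This is only one of several reasonable validation schemes, chosen here because the sample has just 14 garments; the names `candidates` and `loo_mae` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

alterations = np.array([3, 7, 2, 8, 1, 4, 6, 2, 3, 7, 5, 4, 5, 6])
times = np.array([45, 95, 35, 105, 25, 55, 85, 30, 40, 90, 70, 50, 75, 80])
X = alterations.reshape(-1, 1)

candidates = {
    "linear": LinearRegression(),
    "quadratic": make_pipeline(
        PolynomialFeatures(degree=2, include_bias=False),
        LinearRegression(),
    ),
}

# Leave-one-out: fit on 13 garments, predict the held-out one, repeat 14 times
loo_mae = {}
for name, estimator in candidates.items():
    scores = cross_val_score(
        estimator, X, times,
        cv=LeaveOneOut(),
        scoring="neg_mean_absolute_error",
    )
    loo_mae[name] = -scores.mean()  # flip sign: sklearn reports negated errors

for name, mae in loo_mae.items():
    print(f"{name}: leave-one-out MAE = {mae:.2f} minutes")
```

If the quadratic model does not lower the held-out error, the extra term is probably fitting noise rather than a real curve in the data.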

## Statistical Notes

Key concepts used:
1. Simple linear regression
2. Coefficient interpretation
3. Model diagnostics
4. Prediction intervals

Remember:
- Check model assumptions
- Consider practical significance
- Think about prediction uncertainty