# Module 6: Testing and Deployment

**Course**: End-to-End Machine Learning (DataCamp)  
**Case Study**: CardioCare Heart Disease Prediction  
**Author**: Seif

---

## Overview

In this module, you'll:
- Understand why testing is critical pre-deployment
- Learn Python's built-in `unittest` framework
- Write tests for model inference (shape, speed, value ranges)
- Run tests automatically within the notebook

Testing ensures our model doesn't crash and returns predictions quickly and in the correct format, which is especially important when cardiologists rely on it.

## Why testing before deployment?

- Catch failures before clinicians see them
- Verify latency (inference time) is acceptable
- Check predictions have the right type and shape
- Validate that inputs fall within expected ranges

We'll use `unittest.TestCase` to define focused tests and run them automatically.

## Unittest basics

- Create a test class by subclassing `unittest.TestCase`
- Name test methods starting with `test_...`
- Use assertions like `assertEqual`, `assertTrue`, `assertIn`, etc.
- Run with `unittest.main()` (we'll use a notebook-friendly variant)
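
The points above can be sketched with a toy example before we apply them to the model. `clip_probability` is a hypothetical helper invented for illustration and is not part of the CardioCare code; the sketch only shows the shape of a test class, a few assertions, and a runner that works inside a notebook.

```python
import unittest


def clip_probability(p):
    """Hypothetical helper: clamp a score into the [0.0, 1.0] interval."""
    return max(0.0, min(1.0, p))


class TestClipProbability(unittest.TestCase):
    # Each method whose name starts with test_ is collected as a separate test
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clip_probability(0.3), 0.3)

    def test_values_outside_range_are_clamped(self):
        self.assertEqual(clip_probability(-2.0), 0.0)
        self.assertEqual(clip_probability(5.0), 1.0)

    def test_result_is_a_valid_probability(self):
        self.assertTrue(0.0 <= clip_probability(0.99) <= 1.0)


# Load only this class and run it without exiting the interpreter
suite = unittest.TestLoader().loadTestsFromTestCase(TestClipProbability)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

The returned `result` object exposes `wasSuccessful()` and `testsRun`, which is handy if a later cell should react to the outcome.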

In [None]:
# Prepare a small model and data for testing purposes
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import numpy as np

# Use synthetic data as a stand-in for CardioCare inputs
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=5, n_redundant=2,
    random_state=42, class_sep=1.2
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

deployed_model = LogisticRegression(max_iter=1000, solver="liblinear", random_state=42)
deployed_model.fit(X_train, y_train)

# Helper: simple inference function (could be an API endpoint in prod)
def predict_labels(model, X):
    return model.predict(X)

In [None]:
# Write unit tests for inference behavior
import time
import unittest

import numpy as np

class TestModelInference(unittest.TestCase):
    def setUp(self):
        # In real systems, load model artifacts and a validation sample here
        self.model = deployed_model
        # Use a small batch for speed
        self.X_batch = X_test[:64]
        self.y_batch = y_test[:64]

    def test_predictions_shape(self):
        y_pred = predict_labels(self.model, self.X_batch)
        self.assertEqual(y_pred.shape[0], self.X_batch.shape[0], 
                         "Prediction length should match input batch size")

    def test_inference_latency(self):
        # Warm up once so one-time setup costs do not count against the budget
        predict_labels(self.model, self.X_batch)
        t0 = time.perf_counter()
        _ = predict_labels(self.model, self.X_batch)
        dt_ms = (time.perf_counter() - t0) * 1000.0
        # Expect inference to be comfortably under 50 ms for this small batch
        self.assertLess(dt_ms, 50.0, f"Inference too slow: {dt_ms:.2f} ms")

    def test_input_value_ranges(self):
        # Example range check (replace with fixed, domain-specific ranges in production).
        # Training min/max serve as stand-in bounds here; a held-out sample can legitimately
        # fall outside them, so treat a failure as a prompt to review the bounds.
        x_min = np.min(X_train, axis=0)
        x_max = np.max(X_train, axis=0)
        # Check only the first sample of the batch against the observed training bounds
        sample = self.X_batch[0]
        for i, val in enumerate(sample):
            self.assertTrue(x_min[i] - 1e-6 <= val <= x_max[i] + 1e-6,
                            f"Feature {i} out of expected range: {val} not in [{x_min[i]}, {x_max[i]}]")

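# Notebook-friendly test runner: argv=[''] keeps unittest from parsing the kernel's
# command-line arguments, and exit=False stops it from shutting down the kernel
unittest.main(argv=[''], verbosity=2, exit=False)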

## Testing do's and don'ts

Do:
- Run tests on every change and before deployment
- Add tests alongside new functionality (TDD mindset)
- Test failure modes and edge cases (empty inputs, wrong types, out-of-range values)

Don't:
- Over-test trivial, stable library internals (e.g., re-checking sklearn math)
- Write redundant or flaky tests that slow development

Tip: Factor model I/O and preprocessing into small functions so they can be unit-tested independently.
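
As a minimal sketch of this tip and of the edge cases listed above, the example below tests a small input check on its own, without touching the model. `validate_batch` and `N_FEATURES` are hypothetical names introduced for illustration, and the value 8 is an assumption that mirrors the synthetic data used in this module.

```python
import unittest

N_FEATURES = 8  # assumption: matches the 8-feature synthetic data used in this module


def validate_batch(rows, n_features=N_FEATURES):
    """Hypothetical pre-inference check: return rows unchanged or raise."""
    if len(rows) == 0:
        raise ValueError("batch is empty")
    for row in rows:
        if len(row) != n_features:
            raise ValueError(f"expected {n_features} features, got {len(row)}")
        for value in row:
            # bool is a subclass of int, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise TypeError(f"non-numeric feature value: {value!r}")
    return rows


class TestValidateBatch(unittest.TestCase):
    def test_valid_batch_passes_through(self):
        batch = [[0.0] * N_FEATURES, [1.5] * N_FEATURES]
        self.assertEqual(validate_batch(batch), batch)

    def test_empty_batch_is_rejected(self):
        with self.assertRaises(ValueError):
            validate_batch([])

    def test_wrong_feature_count_is_rejected(self):
        with self.assertRaises(ValueError):
            validate_batch([[0.0] * (N_FEATURES - 1)])

    def test_non_numeric_value_is_rejected(self):
        with self.assertRaises(TypeError):
            validate_batch([["high"] + [0.0] * (N_FEATURES - 1)])


suite = unittest.TestLoader().loadTestsFromTestCase(TestValidateBatch)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Because the check is a plain function, these tests run in milliseconds and fail with a clear message long before a malformed batch reaches `model.predict`.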

## Next steps

- Extract `predict_labels` into a small `src/` module and write tests in a `tests/` folder
- Add CI to run tests automatically on push (e.g., GitHub Actions)
- Extend tests to cover preprocessing, postprocessing, and schema validation
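
For the schema validation step, one possible starting point is sketched below. The field names and bounds in `SCHEMA` are illustrative placeholders, not the real CardioCare schema and not clinical guidance, and `schema_errors` is a hypothetical helper; replace both with the actual input contract of your model.

```python
import unittest

# Hypothetical schema: field name -> (expected type, minimum, maximum).
# The names and bounds are placeholders, not clinical guidance.
SCHEMA = {
    "age": (int, 0, 120),
    "resting_bp": (float, 0.0, 300.0),
}


def schema_errors(record, schema=SCHEMA):
    """Return a list of problems found in record; an empty list means it is valid."""
    errors = []
    for name, (expected_type, low, high) in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, expected_type):
            errors.append(
                f"{name}: expected {expected_type.__name__}, got {type(value).__name__}"
            )
        elif not low <= value <= high:
            errors.append(f"{name}: {value} outside [{low}, {high}]")
    for name in record:
        if name not in schema:
            errors.append(f"unexpected field: {name}")
    return errors


class TestSchemaErrors(unittest.TestCase):
    def test_valid_record_has_no_errors(self):
        self.assertEqual(schema_errors({"age": 54, "resting_bp": 130.0}), [])

    def test_missing_field_is_reported(self):
        self.assertEqual(schema_errors({"age": 54}), ["missing field: resting_bp"])

    def test_out_of_range_value_is_reported(self):
        errors = schema_errors({"age": 500, "resting_bp": 130.0})
        self.assertEqual(errors, ["age: 500 outside [0, 120]"])

    def test_wrong_type_is_reported(self):
        errors = schema_errors({"age": "54", "resting_bp": 130.0})
        self.assertEqual(errors, ["age: expected int, got str"])


suite = unittest.TestLoader().loadTestsFromTestCase(TestSchemaErrors)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue with a record at once, which tends to be friendlier for whoever has to fix the upstream data.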

In [None]:
# Test that model prediction outputs are only {0, 1}
import unittest
import numpy as np

class TestModelPredictionValues(unittest.TestCase):
    def setUp(self):
        # Reuse model and test split prepared earlier; provide safe fallbacks
        try:
            self.model = deployed_model
        except NameError:
            try:
                self.model = model
            except NameError:
                from sklearn.linear_model import LogisticRegression
                from sklearn.datasets import make_classification
                X_tmp, y_tmp = make_classification(n_samples=500, n_features=8, n_informative=5, random_state=0)
                self.model = LogisticRegression(max_iter=500, solver="liblinear", random_state=0).fit(X_tmp, y_tmp)
        try:
            self.X_test = X_test
        except NameError:
            try:
                self.X_test = X
            except NameError:
                from sklearn.datasets import make_classification
                X_tmp, _ = make_classification(n_samples=200, n_features=8, n_informative=5, random_state=0)
                self.X_test = X_tmp[:64]

    def test_prediction_output_values(self):
        print("Running test_prediction_output_values test case")
        # Use helper if defined; otherwise call model.predict directly
        try:
            y_pred = predict_labels(self.model, self.X_test)
        except NameError:
            y_pred = self.model.predict(self.X_test)
        unique_values = np.unique(y_pred)
        for value in unique_values:
            # Compare the raw value: int() would truncate e.g. 0.5 to 0 and hide a bad output
            self.assertIn(value, (0, 1), f"Unexpected prediction value: {value}")

# Run just this test class to avoid re-running earlier suites
suite = unittest.TestLoader().loadTestsFromTestCase(TestModelPredictionValues)
unittest.TextTestRunner(verbosity=2).run(suite)