# 🚀 Deployment Postmortem – Complaint Classifier (DistilBERT + LogisticRegression)

This document captures all major deployment issues encountered while hosting a DistilBERT-based complaint classifier using AWS SageMaker, and how each was resolved.

---

## 🔥 Summary

- **Model architecture:** DistilBERT embeddings + sklearn LogisticRegression
- **Serving method:** `sagemaker.model.Model` with HuggingFace inference container
- **Final solution:** Local BERT files + custom `inference.py` + script mode override

---

## 🧨 Deployment Errors + Fixes

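### ❌ 1. `ModelInvocationTimeout`

**Error:**
`ModelError: Invocation timed out while waiting for a response from container primary.`

**Cause:**
The model tried to download DistilBERT from the HuggingFace Hub at runtime. The endpoint container had **no outbound internet access** (as happens, for example, with network isolation enabled or in a VPC without a NAT route), so the load never completed and the invocation timed out.

**Fix:**
- Pre-downloaded the DistilBERT model and tokenizer:
  ```python
  from transformers import DistilBertModel, DistilBertTokenizer
  # Save the tokenizer and the model weights into the same local folder.
  DistilBertTokenizer.from_pretrained("distilbert-base-uncased").save_pretrained("./bert")
  DistilBertModel.from_pretrained("distilbert-base-uncased").save_pretrained("./bert")
  ```
- Bundled the `bert/` folder inside `model.tar.gz`.
- Updated `inference.py` to load from the local path:
  ```python
  tokenizer = DistilBertTokenizer.from_pretrained(os.path.join(model_dir, "bert"))
  ```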
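
To make the local-loading fix concrete, here is a minimal sketch of what such an `inference.py` could look like. It uses the hook names the HuggingFace inference toolkit allows a script to override (`model_fn`, `input_fn`, `predict_fn`, `output_fn`) and the file names from the tarball in this document. The request shape `{"inputs": ...}`, the response shape, the use of the `[CLS]` position as the sentence embedding, and the assumption that `label_encoder.joblib` holds an sklearn `LabelEncoder` are all illustrative guesses, not the original script.

```python
import json
import os


def model_fn(model_dir):
    """Load every artifact from the unpacked model.tar.gz (no network calls)."""
    # Heavy imports are kept inside the hook so the module itself imports cheaply.
    import joblib
    from transformers import DistilBertModel, DistilBertTokenizer

    bert_dir = os.path.join(model_dir, "bert")
    return {
        "tokenizer": DistilBertTokenizer.from_pretrained(bert_dir),
        "bert": DistilBertModel.from_pretrained(bert_dir).eval(),
        "clf": joblib.load(os.path.join(model_dir, "logreg_model.joblib")),
        "labels": joblib.load(os.path.join(model_dir, "label_encoder.joblib")),
    }


def input_fn(input_data, content_type):
    """Parse the request body. Assumed shape: {"inputs": "text"} or {"inputs": [...]}."""
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    texts = json.loads(input_data)["inputs"]
    return [texts] if isinstance(texts, str) else list(texts)


def predict_fn(texts, model):
    """Embed the texts with DistilBERT, then classify with LogisticRegression."""
    import torch

    enc = model["tokenizer"](texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model["bert"](**enc).last_hidden_state
    # Assumption: the first ([CLS]) position is used as the sentence embedding.
    features = hidden[:, 0, :].numpy()
    predicted = model["clf"].predict(features)
    return model["labels"].inverse_transform(predicted).tolist()


def output_fn(prediction, accept):
    """Serialize the predicted labels. The response shape is an assumption."""
    return json.dumps({"labels": prediction})
```

Everything `model_fn` touches lives under `model_dir`, so the container never needs to reach the Hub.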


### ❌ 2. TorchServe Backend Crash

**Error:**
`WorkerThread - Backend worker error`

**Cause:**
Used `PyTorchModel(...)`, which wraps the model with TorchServe. TorchServe expects `.pt` files plus a `handler.py`, not `.joblib` artifacts with HuggingFace logic.

**Fix:**
- Switched to the raw `sagemaker.model.Model()` to avoid TorchServe.



### ❌ 3. `predictor = None`

**Cause:**
The HuggingFace container defaulted to the TorchServe handler, so `inference.py` was never executed. The default handler was still in effect:

`--handler sagemaker_huggingface_inference_toolkit.handler_service`

**Fix:**
Forced script mode via `env`:

    env={
      "SAGEMAKER_PROGRAM": "inference.py",
      "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code"
    }


### ❌ 4. Tokenizer Load Failure

**Cause:**
The model attempted to load the tokenizer from the internet, which was not allowed in the container.

**Fix:**
- Bundled the tokenizer in the `bert/` folder.
- Loaded it using `DistilBertTokenizer.from_pretrained(os.path.join(model_dir, "bert"))`.
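
A missing tokenizer file only shows up as a timeout or a load error once the endpoint is running, so a preflight check can help. The sketch below is a hypothetical helper, not part of the original project: it checks that `bert/` contains the files listed in this document's tarball layout and sets the HuggingFace offline switches `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE`. The function names are made up for illustration.

```python
import os

# File names taken from the tarball layout described in this document.
REQUIRED = ("config.json", "tokenizer_config.json", "vocab.txt")
WEIGHTS = ("pytorch_model.bin", "model.safetensors")  # either one is enough


def missing_bert_files(model_dir):
    """Return the expected files that are absent from <model_dir>/bert."""
    bert_dir = os.path.join(model_dir, "bert")
    missing = [n for n in REQUIRED if not os.path.isfile(os.path.join(bert_dir, n))]
    if not any(os.path.isfile(os.path.join(bert_dir, n)) for n in WEIGHTS):
        missing.append(" or ".join(WEIGHTS))
    return missing


def require_local_bert(model_dir):
    """Fail fast with a readable error instead of waiting on a download."""
    # These switches are read when the HuggingFace libraries are imported,
    # so this should run before `transformers` is imported.
    os.environ.setdefault("HF_HUB_OFFLINE", "1")
    os.environ.setdefault("TRANSFORMERS_OFFLINE", "1")
    missing = missing_bert_files(model_dir)
    if missing:
        raise FileNotFoundError(f"bert/ is missing: {', '.join(missing)}")
    return os.path.join(model_dir, "bert")
```

Calling `require_local_bert(model_dir)` at the top of `model_fn` would turn a silent hang into an immediate, named failure in the container logs.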


### ❌ 5. Misleading Log: “`model_fn` found”

**Log:**
`model_fn implementation found. It will be used in place of the default one.`

**Cause:**
TorchServe detected a `model_fn()` symbol but did not use it properly because `handler.py` was missing.

**Fix:**
- Fully exited TorchServe by enabling script mode (see #3).
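
Since the log line only reports that a symbol was found, it can help to check before packaging which hooks the script actually defines. The sketch below is a hypothetical helper that parses the script with `ast` and does not execute it, so it works without `torch` or `transformers` installed. The hook names are the ones the HuggingFace inference toolkit lets a script override. It says nothing about whether the container ends up calling them.

```python
import ast

# Hook names the HuggingFace inference toolkit looks for in inference.py.
HOOKS = ("model_fn", "input_fn", "predict_fn", "output_fn", "transform_fn")


def defined_hooks(source):
    """Return the toolkit hooks defined at module top level in `source`."""
    tree = ast.parse(source)
    # Only top-level `def` statements count; nested helpers are ignored.
    names = {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}
    return [hook for hook in HOOKS if hook in names]
```

For example, `defined_hooks(open("code/inference.py").read())` lists the overrides present in the script that will be bundled.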

---

## Final Setup

- **Deployment class:** `sagemaker.model.Model`
- **Image URI:** `763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:1.13.1-transformers4.26.0-cpu-py39-ubuntu20.04`

**Tarball structure:**

    model.tar.gz
    ├── logreg_model.joblib
    ├── label_encoder.joblib
    ├── bert/
    │   ├── config.json
    │   ├── pytorch_model.bin or model.safetensors
    │   ├── tokenizer_config.json
    │   └── vocab.txt
    └── code/
        └── inference.py
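
The layout above can be produced with the standard-library `tarfile` module. This is a minimal sketch, assuming a local staging folder that already holds the files exactly as shown. The function name and the staging-folder idea are illustrative. What matters is that the entries sit at the archive root rather than under an extra top-level directory, because SageMaker unpacks the archive into `/opt/ml/model`.

```python
import os
import tarfile


def build_model_tarball(src_dir, out_path="model.tar.gz"):
    """Pack the contents of src_dir so they sit at the archive root."""
    # out_path should live outside src_dir, or the archive would include itself.
    with tarfile.open(out_path, "w:gz") as tar:
        for name in sorted(os.listdir(src_dir)):
            # arcname drops the staging folder prefix; directories are added recursively.
            tar.add(os.path.join(src_dir, name), arcname=name)
    return out_path
```

Listing the result with `tarfile.open("model.tar.gz").getnames()` should show `code/inference.py` and `bert/config.json` without any leading folder.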


**Environment variables:**

    env={
      'SAGEMAKER_PROGRAM': 'inference.py',
      'SAGEMAKER_SUBMIT_DIRECTORY': '/opt/ml/model/code'
    }
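
Putting the final setup together, a deployment call could look roughly like the sketch below. The image URI and environment variables are the ones recorded in this document. The S3 path, IAM role, instance type, and JSON serializers are placeholders and assumptions. One detail worth knowing: `sagemaker.model.Model.deploy()` returns `None` unless `predictor_cls` is set, so the sketch passes `Predictor` explicitly.

```python
IMAGE_URI = (
    "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
    "huggingface-pytorch-inference:1.13.1-transformers4.26.0-cpu-py39-ubuntu20.04"
)


def model_kwargs(model_data, role):
    """Arguments for sagemaker.model.Model, mirroring the final setup."""
    return {
        "image_uri": IMAGE_URI,
        "model_data": model_data,  # e.g. the S3 URI of model.tar.gz
        "role": role,
        "env": {
            "SAGEMAKER_PROGRAM": "inference.py",
            "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code",
        },
    }


def deploy(model_data, role, instance_type="ml.m5.large"):
    """Create the endpoint. The instance type is an illustrative default."""
    # Imported lazily so model_kwargs() can be used without the SDK installed.
    from sagemaker.deserializers import JSONDeserializer
    from sagemaker.model import Model
    from sagemaker.predictor import Predictor
    from sagemaker.serializers import JSONSerializer

    model = Model(predictor_cls=Predictor, **model_kwargs(model_data, role))
    predictor = model.deploy(initial_instance_count=1, instance_type=instance_type)
    # Assumption: the endpoint exchanges JSON in both directions.
    predictor.serializer = JSONSerializer()
    predictor.deserializer = JSONDeserializer()
    return predictor
```

Keeping the arguments in a plain dictionary makes it easy to review the exact configuration before anything is created in AWS.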
