**Module 3.4: MLflow with Docker & Cloud**

## 🎯 **Learning Objectives Expanded**

### 1️⃣ **Log an MLflow Model to Disk**

* **What it means:**
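  Saving a trained ML model in MLflow's standard model format so MLflow can track, reload, and deploy it. `log_model` writes to the active run's artifact store, which is typically a local `mlruns/` directory when no tracking server is configured.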

* **Detailed Steps:**

  * Train your ML model (e.g., Random Forest, Logistic Regression).
  * Use MLflow's built-in logging methods, such as:

    ```python
    import mlflow.sklearn

    # `model` is the trained estimator; "model_name" is the artifact path within the run
    mlflow.sklearn.log_model(model, "model_name")
    ```
  * MLflow creates a directory structure on disk, storing metadata, artifacts, and the model itself for later deployment or serving.

* **Why it matters:**
  A logged model carries its metadata and dependency list with it, so the same artifact can be reloaded, compared across runs, or deployed later without retraining.
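
To make the on-disk layout concrete, the sketch below checks a logged model directory for the files MLflow usually writes for a scikit-learn model. The file list and the helper names (`EXPECTED_FILES`, `describe_model_dir`, `looks_like_mlflow_model`) are illustrative assumptions, not part of MLflow. Only the `MLmodel` file is common to every flavor; the rest can differ between versions.

```python
from pathlib import Path

# Assumed file list for a scikit-learn model; it can vary by MLflow version
# and flavor. Only `MLmodel` is present for every MLflow model.
EXPECTED_FILES = [
    "MLmodel",
    "model.pkl",
    "conda.yaml",
    "python_env.yaml",
    "requirements.txt",
]


def describe_model_dir(model_dir):
    """Return {filename: bool} for each expected file in a model directory."""
    root = Path(model_dir)
    return {name: (root / name).is_file() for name in EXPECTED_FILES}


def looks_like_mlflow_model(model_dir):
    """Treat a directory as an MLflow model only if it has an MLmodel file."""
    return describe_model_dir(model_dir)["MLmodel"]
```

Pointing `describe_model_dir` at the model folder inside your artifact location shows at a glance whether the metadata and environment files were written alongside the serialized model.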

---

### 2️⃣ **Build a Docker Image using `mlflow models build-docker`**

* **What it means:**
  Packaging the MLflow-logged model and its required environment into a Docker image. This allows the model to run in isolation with all necessary dependencies.

* **Detailed Steps:**

  * After logging your ML model, use the command:

    ```bash
    mlflow models build-docker -m runs:/<RUN_ID>/model_name -n my_model_image
    ```
  * This command generates a Docker image that bundles your model, its dependencies, and an MLflow scoring server. It only builds the image; it does not start a container, and it needs the Docker daemon to be running.

* **Why it matters:**
  Dockerizing your MLflow model ensures portability, consistency, and ease of deployment to different environments (e.g., cloud, local machines, Kubernetes).
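
If you would rather trigger the build from a script than type the command by hand, the sketch below assembles the same `mlflow models build-docker` call as an argument list. The helper names (`build_docker_command`, `build_image`) and the `dry_run` switch are hypothetical; the flags are the ones used in the command above.

```python
import shlex
import subprocess


def build_docker_command(run_id, artifact_path, image_name):
    """Assemble the `mlflow models build-docker` call as an argument list."""
    model_uri = f"runs:/{run_id}/{artifact_path}"
    return ["mlflow", "models", "build-docker", "-m", model_uri, "-n", image_name]


def build_image(run_id, artifact_path, image_name, dry_run=True):
    """Print the command; execute it only when dry_run is False.

    Executing requires MLflow and a running Docker daemon on this machine.
    """
    cmd = build_docker_command(run_id, artifact_path, image_name)
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd
```

Passing the arguments as a list avoids shell-quoting problems, and the dry-run default lets you inspect the exact command before starting a build that can take several minutes.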

---

### 3️⃣ **Serve the Image via Docker Container**

* **What it means:**
  Running the Docker image you've built as a container, effectively creating a RESTful API endpoint to serve model predictions.

* **Detailed Steps:**

  * Use the following Docker command:

    ```bash
    docker run -p 5001:8080 my_model_image
    ```
  * The Docker container exposes your model at port `8080` inside the container, which you map to `5001` on your local machine.
  * Your model is now running as an API server and ready to accept prediction requests.

* **Why it matters:**
  Serving your model in a Docker container isolates it from your local environment, making deployment safer, cleaner, and more reliable.
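
A container can take a few seconds to start, so a script that sends requests immediately after `docker run` may fail. The sketch below polls the mapped host port until a TCP connection succeeds. The function name `wait_for_port` and its defaults are assumptions chosen for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False
```

For the mapping used above you would call `wait_for_port("127.0.0.1", 5001)`. A successful connection only shows that the host port is accepting connections, not that the model has finished loading, so treat it as a first check before sending a real prediction request.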

---

### 4️⃣ **Test the Model using Curl or REST API**

* **What it means:**
  Sending data to your model's REST API endpoint and checking if it returns predictions correctly.

* **Detailed Steps:**

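  * Once your Docker container is running, send a request like this. MLflow 2.x and later expect the input wrapped in a key such as `dataframe_records`, `dataframe_split`, `instances`, or `inputs`; a bare JSON list is rejected:

    ```bash
    curl http://127.0.0.1:5001/invocations \
      -H "Content-Type: application/json" \
      -d '{"dataframe_records": [{"feature1": 5.1, "feature2": 3.5}]}'
    ```
  * Alternatively, use Python's `requests` library:

    ```python
    import requests

    url = "http://127.0.0.1:5001/invocations"
    # Wrap the rows in a key the scoring server recognizes
    data = {"dataframe_records": [{"feature1": 5.1, "feature2": 3.5}]}
    response = requests.post(url, json=data)
    response.raise_for_status()  # fail loudly on a 4xx/5xx response
    print(response.json())
    ```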

* **Why it matters:**
  Testing confirms that the endpoint accepts your input format and returns predictions in the expected shape before you integrate it into larger systems or workflows.
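
If `requests` is not installed, the same test can be written with the standard library. The sketch below converts a list of feature dictionaries into the `dataframe_split` layout (column names once, then rows of values) and posts it to the endpoint. The helper names `to_dataframe_split` and `predict` are hypothetical, and the sketch assumes an MLflow 2.x or later scoring server.

```python
import json
import urllib.request


def to_dataframe_split(rows):
    """Convert a list of {feature: value} dicts into a `dataframe_split` payload."""
    columns = list(rows[0].keys())
    data = [[row[col] for col in columns] for row in rows]
    return {"dataframe_split": {"columns": columns, "data": data}}


def predict(url, rows, timeout=10.0):
    """POST rows to an /invocations endpoint and return the parsed JSON response."""
    body = json.dumps(to_dataframe_split(rows)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container from the previous step running, `predict("http://127.0.0.1:5001/invocations", [{"feature1": 5.1, "feature2": 3.5}])` sends one row. The `dataframe_split` layout avoids repeating the column names for every row, which keeps larger batches compact.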



In [1]:
# 📓 Module 3.4: MLflow with Docker & Cloud
# Goal: Build a Docker image for a logged MLflow model and run it in a container or cloud

# ✅ Step 1: Log a model to be packaged into a Docker image
!pip install -q mlflow scikit-learn

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(X_train, y_train)

mlflow.set_experiment("dockerized-ml-model")

with mlflow.start_run():
    mlflow.sklearn.log_model(model, "rf_model")
    run_id = mlflow.active_run().info.run_id
    print("✅ Model logged with Run ID:", run_id)

# ✅ Step 2: Export command to build Docker image (run in terminal)
print("""
🐳 To build a Docker image for your model, use:

mlflow models build-docker -m runs:/<RUN_ID>/rf_model -n my_mlflow_rf_image

Then run the container locally with:

docker run -p 5001:8080 my_mlflow_rf_image

Make sure Docker is installed and running.
""")

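# ✅ Step 3: Sample curl command to test the Dockerized model (terminal only)
# Raw string so the trailing backslashes are printed instead of being consumed by Python.
# MLflow 2.x and later expect the rows wrapped in a key such as "dataframe_records".
print(r"""
📤 To send a test request:

curl http://127.0.0.1:5001/invocations \
  -H "Content-Type: application/json" \
  -d '{"dataframe_records": [{"sepal length (cm)": 5.1, "sepal width (cm)": 3.5, "petal length (cm)": 1.4, "petal width (cm)": 0.2}]}'

Replace the JSON with the correct feature names from your model.
""")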


2025/07/30 19:25:38 INFO mlflow.tracking.fluent: Experiment with name 'dockerized-ml-model' does not exist. Creating a new experiment.


✅ Model logged with Run ID: cae3a564eb5b406a8e9fd7bd71fd7736


---

## 📝 Assessment: MLflow with Docker & Cloud    

### 📘 Multiple Choice (Correct answers in **bold**)    

**1. What does `mlflow models build-docker` do?**    
A. Starts a Docker container    
**B. Creates a Docker image that can serve your MLflow model** ✅    
C. Logs a model to S3    
D. Registers a model with the cloud registry    

---

**2. Which default port does a Dockerized MLflow model serve on?**    
A. 8081    
**B. 8080** ✅    
C. 5000    
D. 6006    

---

**3. What must be installed on your system to run a Dockerized MLflow model?**    
A. Anaconda    
**B. Docker engine** ✅    
C. Kubernetes    
D. Apache Spark    

---

**4. Which MLflow flavor(s) support Docker-based serving?**    
A. Only pyfunc    
**B. Any flavor that exposes the `python_function` (pyfunc) interface** ✅    
C. Only sklearn    
D. Only models from the registry    

---

### ✏️ Short Answer

**5. What are the benefits of containerizing MLflow models using Docker?**    
*Ensures environment consistency, simplifies deployment, allows scaling in production environments, and removes dependency issues.*    

---

**6. Why is it important to expose the port (e.g., `-p 5001:8080`) when running a Docker container?**    
*It maps the container's internal port (8080) to a host machine port (5001), making the REST API accessible from outside the container.*    

---

### 🧪 Mini Project

**7. Task:**    

* Log any MLflow-compatible model    
* Build a Docker image using `mlflow models build-docker`    
* Run the container locally with Docker    
* Use `curl` or `requests.post()` to send JSON input and receive prediction    