# 🧱 Week 11-12 · Notebook 01 · MLOps Fundamentals for GenAI in Manufacturing

Design an auditable lifecycle for GenAI copilots powering maintenance, quality, and EHS workflows across multiple plants.

## 🎯 Learning Objectives
- Map the GenAI lifecycle from data intake to monitoring with manufacturing-specific checkpoints.
- Stand up MLflow experiment tracking and register models tagged by plant and release.
- Define governance artefacts that satisfy ISO 9001, SOX ITGC, and OSHA audit requirements.
- Produce an MLOps readiness scorecard for the capstone deployment team.

## 🧩 Scenario
Arvind Manufacturing operates six plants across India, Mexico, and Hungary. Each plant runs a GenAI assistant for maintenance troubleshooting. Leadership requires a unified lifecycle in which models, prompts, and datasets are versioned and auditable before the Week 12 production launch.

## 🔄 Lifecycle Blueprint
```text
Data Intake → Feature/Embedding Store → Model & Prompt Versioning → Continuous Integration → Deployment → Monitoring → Feedback Loop
```

| Stage | Manufacturing Considerations | Tools | Evidence |
| --- | --- | --- | --- |
| Data Intake | PII scrub, SOP freshness & approvals | pandas, Great Expectations | Data quality report, SME sign-off |
| Versioning | Tag models by plant/equipment, freeze prompts | Git, MLflow, DVC | Model card, prompt changelog |
| CI/CD | Safety linting, regression tests | GitHub Actions, pytest | Pipeline logs, approval records |
| Deployment | Edge vs. Cloud, shift scheduling | Cloud Run, GKE, Helm | Deployment manifest, runbook |
| Monitoring | Drift alerts, cost caps | Prometheus, Grafana, BigQuery | Weekly KPI report, alert runbook |
| Feedback | Technician ratings, root-cause capture | ServiceNow, custom forms | Feedback dashboard |
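The Data Intake row can be made concrete with a small gate that runs before logs reach the embedding store. The following is a minimal sketch using only the standard library, not a full PII policy: the regular expressions, the function names `scrub_pii` and `sop_is_fresh`, the 365-day freshness window, and the sample log line are all assumptions made for illustration.

```python
import re
from datetime import date

# Hypothetical patterns for illustration; a real PII policy would cover more cases
# (names, employee IDs, badge numbers) and would be reviewed by the plant's DPO.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{8,}\d")


def scrub_pii(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def sop_is_fresh(approved_on: date, today: date, max_age_days: int = 365) -> bool:
    """Return True if the SOP approval is no older than max_age_days (assumed window)."""
    return (today - approved_on).days <= max_age_days


# Made-up maintenance log line used only to exercise the helpers.
log_entry = "Press 4 jammed; call Ravi at +91 98765 43210 or ravi@example.com"
clean_entry = scrub_pii(log_entry)
print(clean_entry)
print(sop_is_fresh(date(2024, 3, 1), today=date(2025, 1, 15)))
```

Note that the technician's name survives the scrub: regex rules catch structured identifiers only, which is one reason the table pairs automated checks with SME sign-off.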

In [None]:
# Example: Logging a manufacturing maintenance model to MLflow
import os
from datetime import datetime, timezone
from pathlib import Path

import mlflow
import pandas as pd

mlflow.set_tracking_uri(os.getenv('MLFLOW_TRACKING_URI', 'http://localhost:5000'))
mlflow.set_experiment('maintenance_copilot_week11')

Path('docs').mkdir(exist_ok=True)
Path('release_notes').mkdir(exist_ok=True)

# datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware UTC timestamp
with mlflow.start_run(run_name=f'plant-pune-{datetime.now(timezone.utc).date()}'):
    mlflow.log_params({
        'plant': 'Pune',
        'equipment_cluster': 'Presses',
        'rag_version': 'v0.9.1',
        'adapter_version': 'lora-2025-09-30'
    })
    mlflow.log_metrics({
        'precision_top3': 0.88,
        'avg_latency_ms': 125,
        'sme_rating': 4.5
    })
    model_card_path = Path('docs/model_card_pune.md')
    model_card_path.write_text('# Model Card: Pune Plant\n\nKey metrics tracked weekly.')
    mlflow.log_artifact(str(model_card_path))
    mlflow.set_tags({
        'release_train': 'Week11',
        'change_ticket': 'CHG-4582',
        'regulated': 'true'
    })
    run_id = mlflow.active_run().info.run_id
run_id

### 🗃️ Model Registry Entry
Document dependencies, prompts, and dataset snapshots before promoting to staging.

In [None]:
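# NOTE: this URI resolves only if the run logged a model under the artifact
# path 'model' (for example via an mlflow.<flavor>.log_model call). The logging
# cell above stores params, metrics, tags, and the model card, but no model.
model_version = mlflow.register_model(
    model_uri=f'runs:/{run_id}/model',
    name='maintenance_copilot_lora'
)
model_version.version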
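One way to document dataset snapshots before promotion is to record a content hash per file, so an auditor can later confirm that the registered model was built from exactly those bytes. The sketch below assumes SHA-256 as the fingerprint; the helper names `sha256_of_file` and `snapshot_manifest` and the throwaway sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large parquet or PDF files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot_manifest(paths):
    """Map each file name to its SHA-256 hex digest."""
    return {p.name: sha256_of_file(p) for p in sorted(paths)}


# Demo on a throwaway file; in the notebook this would point at the real datasets.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample_logs.csv"
    sample.write_bytes(b"machine_id,fault\nP-04,jam\n")
    manifest = snapshot_manifest([sample])

print(manifest)
```

Inside an active run, a manifest like this could be attached with `mlflow.log_dict(manifest, 'dataset_manifest.json')` so that it travels with the run referenced in the registry entry.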

## 🧾 Audit-Ready Metadata
Create a release note capturing mandatory governance fields.

In [None]:
import json
release_note = {
    'release_id': 'REL-2025-11-PlantPune',
    'approved_by': 'Head of Maintenance',
    'change_ticket': 'CHG-4582',
    'mlflow_run_id': run_id,
    'datasets': ['maintenance_logs_2024Q4.parquet', 'sop_manuals_v7.pdf'],
    'prompts_version': 'prompts/maintenance_v12.yaml',
    'risk_assessment': {
        'hallucination_mitigation': 'RAG with filtered top-k=5 + SME review',
        'pii_controls': 'Great Expectations + manual sampling'
    }
}
with open('release_notes/REL-2025-11-PlantPune.json', 'w', encoding='utf-8') as fp:
    json.dump(release_note, fp, indent=2)
release_note
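A release note is only audit-ready if every governance field is actually filled in, so a small completeness check before writing the file can catch gaps early. This is a sketch under assumptions: the `REQUIRED_FIELDS` tuple simply mirrors the keys used in the release note above and is not an official ISO 9001 or SOX field list, and `missing_fields` and `draft_note` are hypothetical names.

```python
# Assumed mandatory fields, mirroring the keys of the release note in this notebook.
REQUIRED_FIELDS = (
    "release_id",
    "approved_by",
    "change_ticket",
    "mlflow_run_id",
    "datasets",
    "prompts_version",
    "risk_assessment",
)


def missing_fields(note: dict) -> list[str]:
    """Return required fields that are absent or empty, in REQUIRED_FIELDS order."""
    return [field for field in REQUIRED_FIELDS if not note.get(field)]


# Illustrative draft with an empty approver and no risk assessment.
draft_note = {
    "release_id": "REL-2025-11-PlantPune",
    "approved_by": "",
    "change_ticket": "CHG-4582",
    "mlflow_run_id": "example-run-id",  # placeholder, not a real run ID
    "datasets": ["maintenance_logs_2024Q4.parquet", "sop_manuals_v7.pdf"],
    "prompts_version": "prompts/maintenance_v12.yaml",
}

gaps = missing_fields(draft_note)
print(gaps)
```

Running the same check on the notebook's `release_note` dictionary before `json.dump` would block a release file that lacks an approver or a risk assessment.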

## 🧮 MLOps Readiness Scorecard
Assess maturity across lifecycle dimensions.

In [None]:
scorecard = pd.DataFrame([
    {'dimension': 'Data Quality', 'score': 4, 'notes': 'Automated Great Expectations, SME review weekly'},
    {'dimension': 'Model Versioning', 'score': 3, 'notes': 'MLflow registry, need branching policy'},
    {'dimension': 'CI/CD', 'score': 2, 'notes': 'Basic tests, add security scan'},
    {'dimension': 'Monitoring', 'score': 2, 'notes': 'Latency tracked, hallucination alerts pending'},
    {'dimension': 'Governance', 'score': 3, 'notes': 'Change tickets recorded; add automated evidence export'}
])
scorecard

### Interpretation
- Prioritize CI/CD and Monitoring upgrades (both scored 2, the lowest on the scorecard) before the Week 12 go-live.
- Maintain the scorecard in the project tracker and update it weekly.
- Escalate gaps to the Capstone PM and plant SMEs.
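The prioritization above can be derived from the scores rather than read off by eye, which helps once the scorecard is updated weekly. A minimal sketch, assuming a cutoff of 3 below which a dimension counts as a go-live gap; the threshold and the function name `priority_gaps` are assumptions, and the scores are copied from the scorecard cell.

```python
# Scores copied from the scorecard above (notes omitted for brevity).
scores = [
    {"dimension": "Data Quality", "score": 4},
    {"dimension": "Model Versioning", "score": 3},
    {"dimension": "CI/CD", "score": 2},
    {"dimension": "Monitoring", "score": 2},
    {"dimension": "Governance", "score": 3},
]


def priority_gaps(rows, threshold=3):
    """Return dimensions scoring below the (assumed) threshold, weakest first."""
    gaps = [row for row in rows if row["score"] < threshold]
    # Ties are broken alphabetically so the ordering is stable week to week.
    return sorted(gaps, key=lambda row: (row["score"], row["dimension"]))


gap_names = [row["dimension"] for row in priority_gaps(scores)]
print(gap_names)
```

With the notebook's DataFrame, the same filter can be written as `scorecard[scorecard['score'] < 3]`.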

## 🧪 Lab Assignment
1. Deploy an MLflow tracking server (local or hosted) and connect via VPN if required by IT.
2. Log runs for at least two plants and compare their metrics in the MLflow tracking UI.
3. Draft a lifecycle policy document describing branching, tagging, and rollback procedures.
4. Present the readiness scorecard to stakeholders and capture action items.

## ✅ Checklist
- [ ] Lifecycle documented with manufacturing-specific checkpoints
- [ ] MLflow experiment & registry configured
- [ ] Release metadata captured with audit evidence
- [ ] Readiness scorecard shared with leadership

## 📚 References
- MLflow Tracking & Registry Documentation
- *MLOps Zoomcamp* — Experiment Tracking Module
- ISO 9001 Change Management Templates