# 🤖 Smart Cage - MQ2 Gas Sensor Model Training

**Samsung Innovation Campus - Phase 3**

Train an ML model to classify gas conditions:
- **Aman** (safe): MQ2 LOW
- **Waspada** (caution): MQ2 HIGH
- **Bahaya** (danger): MQ2 HIGH + temperature > 55°C
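
The labeling rule above can be sketched as a small function. This is only an illustration: the helper name `label_reading` and the 0/1 encoding for MQ2 LOW/HIGH are assumptions, and the labels in the actual dataset may have been produced differently.

```python
TEMP_THRESHOLD_C = 55  # temperature threshold taken from the rule above


def label_reading(gas_detected: int, temp_c: float) -> str:
    """Map one reading to a label (hypothetical helper, not part of the notebook)."""
    if not gas_detected:              # MQ2 LOW, regardless of temperature
        return "Aman"
    if temp_c > TEMP_THRESHOLD_C:     # MQ2 HIGH and hot
        return "Bahaya"
    return "Waspada"                  # MQ2 HIGH, temperature at or below the threshold


for reading in [(0, 30), (1, 30), (1, 60), (0, 60)]:
    print(reading, "->", label_reading(*reading))
```

Note that under this reading of the rule, a high temperature without gas still counts as Aman.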

## 📦 Step 1: Install Dependencies

In [None]:
!pip install pandas scikit-learn matplotlib seaborn joblib

## 📚 Step 2: Import Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import joblib
import os

print("✅ Libraries imported!")

## 📁 Step 3: Upload Dataset

In [None]:
from google.colab import files

print("📤 Upload MQ2 dataset CSV:")
uploaded = files.upload()
csv_file = list(uploaded.keys())[0]
print(f"📁 Uploaded: {csv_file}")

## 🔍 Step 4: Load & Explore Dataset

In [None]:
df = pd.read_csv(csv_file)

print(f"📊 Dataset: {len(df)} rows")
print(f"Columns: {list(df.columns)}")

print("\n📈 Label Distribution:")
print(df['label'].value_counts())

print("\n📋 Preview:")
display(df.head())
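
Before plotting or training, it can help to confirm that the CSV has the columns this notebook relies on and no missing values. The sketch below assumes the three columns used in later steps (`gas_detected`, `temp`, `label`); `check_dataset` is a hypothetical helper, and the demo rows are made up for illustration, not real sensor data.

```python
import pandas as pd

EXPECTED_COLUMNS = ["gas_detected", "temp", "label"]  # columns used later in this notebook


def check_dataset(df: pd.DataFrame) -> list:
    """Return a list of human-readable problems (hypothetical helper)."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    n_null = int(df[EXPECTED_COLUMNS].isna().sum().sum())
    if n_null:
        problems.append(f"{n_null} missing values")
    bad_gas = ~df["gas_detected"].isin([0, 1])
    if bad_gas.any():
        problems.append(f"{int(bad_gas.sum())} rows with gas_detected not in {{0, 1}}")
    return problems


# Made-up rows for illustration only
demo = pd.DataFrame({
    "gas_detected": [0, 1, 2],
    "temp": [30.0, None, 60.0],
    "label": ["Aman", "Waspada", "Bahaya"],
})
print(check_dataset(demo))
```

On the uploaded data this would be called as `check_dataset(df)`; an empty list means none of these checks failed.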

## 🎨 Step 5: Data Visualization

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Scatter plot
colors = {'Aman': 'green', 'Waspada': 'orange', 'Bahaya': 'red'}
for label in df['label'].unique():
    subset = df[df['label'] == label]
    axes[0].scatter(subset['gas_detected'], subset['temp'], 
                   c=colors.get(label, 'gray'), label=label, alpha=0.6)
axes[0].set_xlabel('Gas Detected (0/1)')
axes[0].set_ylabel('Temperature (°C)')
axes[0].set_title('Gas vs Temperature')
axes[0].axhline(y=55, color='red', linestyle='--', label='55°C threshold')
axes[0].legend()

# Pie chart
label_counts = df['label'].value_counts()
pie_colors = [colors.get(l, 'gray') for l in label_counts.index]
axes[1].pie(label_counts, labels=label_counts.index, autopct='%1.1f%%', colors=pie_colors)
axes[1].set_title('Label Distribution')

plt.tight_layout()
plt.show()

## ✂️ Step 6: Prepare & Split Data

In [None]:
# Features: gas_detected and temp
X = df[['gas_detected', 'temp']]
y = df['label']

# Split 80/20
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"✅ Train: {len(X_train)}, Test: {len(X_test)}")
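
The `stratify=y` argument keeps the label proportions roughly equal in the train and test sets, which matters when one class (for example Bahaya) is rare. A minimal sketch of how to check this, using made-up, imbalanced rows rather than the real dataset:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up, imbalanced labels for illustration only
demo = pd.DataFrame({
    "gas_detected": [0] * 60 + [1] * 40,
    "temp": [30] * 60 + [35] * 30 + [60] * 10,
    "label": ["Aman"] * 60 + ["Waspada"] * 30 + ["Bahaya"] * 10,
})

X_tr, X_te, y_tr, y_te = train_test_split(
    demo[["gas_detected", "temp"]], demo["label"],
    test_size=0.2, random_state=42, stratify=demo["label"],
)

# Share of each label in the two splits, side by side
shares = pd.DataFrame({
    "train": y_tr.value_counts(normalize=True),
    "test": y_te.value_counts(normalize=True),
}).round(2)
print(shares)
```

The same comparison can be run on `y_train` and `y_test` from the cell above.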

## 🤖 Step 7: Train Model

In [None]:
model = DecisionTreeClassifier(random_state=42, max_depth=5)
model.fit(X_train, y_train)

print(f"✅ Model trained: {type(model).__name__}")
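
A decision tree with two features is small enough to read directly, which makes it easy to see whether it learned something close to the 55°C rule. The sketch below uses scikit-learn's `export_text` on a demo tree trained on made-up rows; the rows and the name `demo_model` are assumptions for illustration.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up rows that follow the labeling rule; illustration only
demo = pd.DataFrame({
    "gas_detected": [0, 0, 0, 1, 1, 1, 1],
    "temp": [25, 40, 60, 30, 50, 58, 65],
    "label": ["Aman", "Aman", "Aman", "Waspada", "Waspada", "Bahaya", "Bahaya"],
})

demo_model = DecisionTreeClassifier(random_state=42, max_depth=5)
demo_model.fit(demo[["gas_detected", "temp"]], demo["label"])

# Plain-text view of the learned split thresholds
rules = export_text(demo_model, feature_names=["gas_detected", "temp"])
print(rules)
```

For the notebook's model, the equivalent call would be `export_text(model, feature_names=list(X.columns))`.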

## 📊 Step 8: Evaluate Model

In [None]:
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print(f"✅ Accuracy: {accuracy:.4f} ({accuracy*100:.2f}%)")

print("\n📋 Classification Report:")
print(classification_report(y_test, y_pred))

## 📈 Step 9: Confusion Matrix

In [None]:
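# Pass the label order explicitly so the matrix rows/columns always match the tick labels,
# even if a class is absent from the test set
labels = list(model.classes_)
cm = confusion_matrix(y_test, y_pred, labels=labels)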

plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=labels, yticklabels=labels)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()

## 💾 Step 10: Save Model

In [None]:
model_filename = "mq2_gas_model.pkl"
joblib.dump(model, model_filename)

print(f"✅ Saved: {model_filename}")

# Download
files.download(model_filename)
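
To confirm that a saved file is usable, the model can be loaded back with `joblib.load` and compared against the original. A minimal sketch with a throwaway demo model and a temporary file (the rows and the file name `demo_model.pkl` are made up for illustration):

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up rows for illustration only
X_demo = pd.DataFrame({"gas_detected": [0, 0, 1, 1], "temp": [30, 60, 30, 60]})
y_demo = ["Aman", "Aman", "Waspada", "Bahaya"]
demo_model = DecisionTreeClassifier(random_state=42, max_depth=5).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")  # hypothetical file name
    joblib.dump(demo_model, path)
    restored = joblib.load(path)

# The restored model should give the same predictions as the original
same = bool((restored.predict(X_demo) == demo_model.predict(X_demo)).all())
print("Round trip OK:", same)
```

When the `.pkl` file is loaded on another machine, it is generally safest to use the same scikit-learn version that was used for training.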

## 🧪 Step 11: Test Predictions

In [None]:
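# Use a DataFrame with the training column names so predict() sees the same feature names
test_data = pd.DataFrame(
    [
        [0, 30],   # No gas, normal temp → expected: Aman
        [1, 30],   # Gas detected, normal temp → expected: Waspada
        [1, 60],   # Gas + high temp → expected: Bahaya
        [0, 60],   # No gas, high temp → expected: Aman
    ],
    columns=['gas_detected', 'temp'],
)

predictions = model.predict(test_data)

print("🧪 Test Predictions:")
print("-" * 40)
for (gas, temp), pred in zip(test_data.itertuples(index=False), predictions):
    print(f"Gas={gas}, Temp={temp}°C → {pred}")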