# ML Model Training for AgriTrace

This notebook trains and saves two scikit-learn models: a linear regression for price prediction and a logistic regression that flags quality anomalies as a binary classification task.

## 1. Import Required Libraries
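Import NumPy for data generation, scikit-learn for modeling and evaluation, and joblib for saving the models. pandas is imported as well, but the synthetic example below does not use it; it is available for loading real data.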

In [None]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, accuracy_score
import joblib

## 2. Load or Create Sample Data
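For demonstration, we create synthetic data: 100 samples with 3 random features for each task. The price is a noisy linear combination of the features, and the anomaly label is 1 when a noisy combination of the features exceeds 1.0. Replace this with your real data for production use.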

In [None]:
# Synthetic data for price prediction (regression)
np.random.seed(42)
X_price = np.random.rand(100, 3)  # 3 features
price = 10 + 5 * X_price[:, 0] + 2 * X_price[:, 1] - 3 * X_price[:, 2] + np.random.randn(100)

# Synthetic data for quality anomaly detection (classification)
X_quality = np.random.rand(100, 3)
quality_anomaly = (X_quality[:, 0] + X_quality[:, 1] - X_quality[:, 2] + np.random.randn(100) > 1.0).astype(int)
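
Before training, it can help to confirm the array shapes and the class balance of the synthetic labels, since an accuracy score is only meaningful relative to the majority-class rate. The following is a minimal sketch that regenerates the data from the cell above; the names `class_counts` and `anomaly_rate` are illustrative and not part of the original notebook.

```python
import numpy as np

# Regenerate the synthetic data exactly as in the cell above (same seed, same call order).
np.random.seed(42)
X_price = np.random.rand(100, 3)
price = 10 + 5 * X_price[:, 0] + 2 * X_price[:, 1] - 3 * X_price[:, 2] + np.random.randn(100)
X_quality = np.random.rand(100, 3)
quality_anomaly = (
    X_quality[:, 0] + X_quality[:, 1] - X_quality[:, 2] + np.random.randn(100) > 1.0
).astype(int)

# Confirm the feature matrices and targets have the expected shapes.
print("X_price:", X_price.shape, "price:", price.shape)
print("X_quality:", X_quality.shape, "quality_anomaly:", quality_anomaly.shape)

# Count samples per class (index 0 = normal, index 1 = anomaly).
class_counts = np.bincount(quality_anomaly, minlength=2)
anomaly_rate = quality_anomaly.mean()
print("Class counts [normal, anomaly]:", class_counts)
print("Anomaly rate:", anomaly_rate)
```

If one class dominates, a classifier that always predicts that class already reaches a high accuracy, which is worth keeping in mind when reading the score in section 4.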


## 3. Train Price Prediction Model (Regression)
Train a simple linear regression model for price prediction. We hold out 20% of the samples as a test set and report the root mean squared error (RMSE) on it.

In [None]:
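# Hold out 20% of the samples for evaluation; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X_price, price, test_size=0.2, random_state=42)
reg = LinearRegression()
reg.fit(X_train, y_train)
# Evaluate on the held-out test set only.
price_pred = reg.predict(X_test)
print("Price RMSE:", np.sqrt(mean_squared_error(y_test, price_pred)))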
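
An RMSE value is easier to interpret next to a reference point. The sketch below, which is an addition and not part of the original notebook, compares the model against a baseline that always predicts the training mean, and prints the fitted coefficients. Because the synthetic prices were generated with weights 5, 2, -3 and an intercept of 10, the fitted values should land roughly near those, up to noise. The names `rmse_model` and `rmse_baseline` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Regenerate the synthetic price data from section 2.
np.random.seed(42)
X_price = np.random.rand(100, 3)
price = 10 + 5 * X_price[:, 0] + 2 * X_price[:, 1] - 3 * X_price[:, 2] + np.random.randn(100)

X_train, X_test, y_train, y_test = train_test_split(
    X_price, price, test_size=0.2, random_state=42
)
reg = LinearRegression().fit(X_train, y_train)

# RMSE of the fitted model on the held-out test set.
rmse_model = np.sqrt(mean_squared_error(y_test, reg.predict(X_test)))

# Baseline: predict the training-set mean for every test sample.
baseline_pred = np.full_like(y_test, y_train.mean())
rmse_baseline = np.sqrt(mean_squared_error(y_test, baseline_pred))

print("Model RMSE:", rmse_model)
print("Baseline RMSE:", rmse_baseline)

# Fitted parameters, for comparison with the generating weights (5, 2, -3) and intercept 10.
print("Coefficients:", reg.coef_)
print("Intercept:", reg.intercept_)
```

A model whose RMSE is not clearly below the baseline is not extracting much signal from the features.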

## 4. Train Quality Anomaly Detection Model (Classification)
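Train a logistic regression model that classifies each sample as normal (0) or anomalous (1). As before, 20% of the samples are held out, and we report accuracy on that test set.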

In [None]:
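# Hold out 20% of the samples for evaluation; random_state makes the split reproducible.
Xq_train, Xq_test, yq_train, yq_test = train_test_split(X_quality, quality_anomaly, test_size=0.2, random_state=42)
clf = LogisticRegression()
clf.fit(Xq_train, yq_train)
# Predict hard 0/1 labels and compare them with the held-out labels.
quality_pred = clf.predict(Xq_test)
print("Quality Anomaly Accuracy:", accuracy_score(yq_test, quality_pred))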
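
Accuracy alone can hide how the errors are distributed between the two classes. As a hedged sketch (an addition, not part of the original notebook), the code below prints a confusion matrix and the accuracy of a majority-class baseline for the same split. The names `cm`, `majority_class`, and `majority_acc` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Regenerate the synthetic data from section 2. The price draws come first
# so that the random state matches the notebook when the quality data is drawn.
np.random.seed(42)
_ = np.random.rand(100, 3)   # X_price draw
_ = np.random.randn(100)     # price noise draw
X_quality = np.random.rand(100, 3)
quality_anomaly = (
    X_quality[:, 0] + X_quality[:, 1] - X_quality[:, 2] + np.random.randn(100) > 1.0
).astype(int)

Xq_train, Xq_test, yq_train, yq_test = train_test_split(
    X_quality, quality_anomaly, test_size=0.2, random_state=42
)
clf = LogisticRegression().fit(Xq_train, yq_train)
quality_pred = clf.predict(Xq_test)

# Rows are true labels, columns are predicted labels, in the order [0, 1].
cm = confusion_matrix(yq_test, quality_pred, labels=[0, 1])
print("Confusion matrix:\n", cm)

# Baseline: always predict the most frequent class in the training set.
majority_class = np.bincount(yq_train, minlength=2).argmax()
majority_acc = accuracy_score(yq_test, np.full_like(yq_test, majority_class))
print("Model accuracy:", accuracy_score(yq_test, quality_pred))
print("Majority-class accuracy:", majority_acc)
```

The off-diagonal entries of the matrix show missed anomalies and false alarms separately, which the single accuracy number does not.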

## 5. Save Trained Models
Save the trained models as .pkl files for use in the FastAPI service.

In [None]:
# Serialize each fitted estimator to a file in the current working directory.
joblib.dump(reg, "price_model.pkl")
joblib.dump(clf, "quality_model.pkl")
print("Models saved as price_model.pkl and quality_model.pkl")
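
To check that a saved model can be used by another process, it can be reloaded with `joblib.load` and compared against the in-memory model. The sketch below is an illustration under assumptions: it writes to a temporary directory instead of the working directory, fits a fresh regression on the synthetic data, and uses a made-up feature row `sample`. How the FastAPI service actually loads the files is not shown in this notebook.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Fit a small regression model on the synthetic price data from section 2.
np.random.seed(42)
X_price = np.random.rand(100, 3)
price = 10 + 5 * X_price[:, 0] + 2 * X_price[:, 1] - 3 * X_price[:, 2] + np.random.randn(100)
reg = LinearRegression().fit(X_price, price)

# Save and reload inside a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "price_model.pkl")
    joblib.dump(reg, model_path)
    loaded_reg = joblib.load(model_path)

# Hypothetical input row with 3 features, matching the training data layout.
sample = np.array([[0.5, 0.5, 0.5]])
original_pred = reg.predict(sample)
loaded_pred = loaded_reg.predict(sample)

print("Original prediction:", original_pred)
print("Reloaded prediction:", loaded_pred)
print("Predictions match:", np.allclose(original_pred, loaded_pred))
```

joblib files are pickle-based, so they should only be loaded from trusted sources, and preferably with the same scikit-learn version that was used for training.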