# Traffic Optimization with AWS IoT and Machine Learning

This notebook walks through simulating IoT traffic data, processing it in real time, training and deploying a predictive model with Amazon SageMaker, and analyzing the data with Amazon Athena.

## 1. Introduction and Setup
In this section, we set up AWS configurations and import necessary libraries.

In [None]:
import boto3
import sagemaker
import pandas as pd
import numpy as np
from sagemaker import get_execution_role

# AWS configuration
role = get_execution_role()
bucket = 'your-s3-bucket-name'
sagemaker_session = sagemaker.Session()

## 2. Simulate IoT Data
Here, we simulate IoT data for traffic lights and send it to AWS IoT Core.

In [None]:
# Example code to simulate IoT data and publish to AWS IoT Core
from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient
import json
import time

def generate_traffic_data():
    return {
        'traffic_light_id': 'TL-101',
        'location': 'Main St & 1st Ave',
        # time.time() % 1 is the fractional part of the current second, in [0, 1)
        'vehicle_count': int(50 + 20 * (time.time() % 1)),
        'average_speed': int(20 + 10 * (time.time() % 1)),
        'CO2_level': round(0.04 + 0.02 * (time.time() % 1), 4)
    }
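
The cell above only generates readings. A minimal sketch of the publishing step might look like the following. It uses the `AWSIoTMQTTClient` imported above; the client ID, endpoint, certificate file names, and the `traffic/lights` topic are placeholders chosen for illustration, not values from this project. `publish_readings` accepts any object with a `publish(topic, payload, QoS)` method, so it can be tried without a live connection.

```python
import json
import time


def build_client(client_id, endpoint, root_ca, private_key, certificate):
    """Create and connect an MQTT client (requires AWSIoTPythonSDK)."""
    # Imported lazily so the rest of this cell runs without the SDK installed.
    from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient

    client = AWSIoTMQTTClient(client_id)
    client.configureEndpoint(endpoint, 8883)  # 8883 is the MQTT-over-TLS port
    client.configureCredentials(root_ca, private_key, certificate)
    client.connect()
    return client


def publish_readings(client, topic, make_reading, count=5, interval=1.0):
    """Publish `count` JSON readings to `topic`, pausing `interval` seconds between them."""
    sent = []
    for i in range(count):
        payload = json.dumps(make_reading())
        client.publish(topic, payload, 1)  # QoS 1: at-least-once delivery
        sent.append(payload)
        if i < count - 1:
            time.sleep(interval)
    return sent


# Usage (all arguments are placeholders, replace them with your own values):
# client = build_client('traffic-simulator', 'your-iot-endpoint',
#                       'AmazonRootCA1.pem', 'private.pem.key', 'certificate.pem.crt')
# publish_readings(client, 'traffic/lights', generate_traffic_data)
# client.disconnect()
```

Passing the reading generator as an argument keeps the publishing loop independent of the simulated data, so the same loop could later forward readings from real sensors.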


## 3. Data Processing with Kinesis and Lambda
Set up Amazon Kinesis and AWS Lambda to process the IoT data in real time.
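
As a rough sketch of the processing step, the Lambda function below decodes a batch of Kinesis records, adds a `high_traffic` label, and writes the batch to S3 as CSV. Several things here are assumptions rather than parts of this notebook: that the IoT messages are routed into a Kinesis stream which triggers the function, the threshold of 60 vehicles used for the label, the bucket name, and the per-batch key naming under `processed-data/`.

```python
import base64
import csv
import io
import json

# Hypothetical threshold for labelling a reading as congested; tune it for real data.
HIGH_TRAFFIC_VEHICLE_COUNT = 60

FIELDS = ['traffic_light_id', 'location', 'vehicle_count',
          'average_speed', 'CO2_level', 'high_traffic']


def decode_records(event):
    """Decode the base64-encoded JSON payloads of a Kinesis event and label each reading."""
    rows = []
    for record in event['Records']:
        payload = base64.b64decode(record['kinesis']['data'])
        reading = json.loads(payload)
        reading['high_traffic'] = int(
            reading['vehicle_count'] >= HIGH_TRAFFIC_VEHICLE_COUNT)
        rows.append(reading)
    return rows


def to_csv(rows):
    """Serialize the labelled readings as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def lambda_handler(event, context):
    """Lambda entry point: write one CSV object per invocation."""
    import boto3  # bundled with the Lambda Python runtime

    rows = decode_records(event)
    key = f'processed-data/batch-{context.aws_request_id}.csv'  # placeholder naming
    boto3.client('s3').put_object(
        Bucket='your-s3-bucket-name',  # placeholder bucket
        Key=key,
        Body=to_csv(rows).encode('utf-8'),
    )
    return {'records': len(rows), 'key': key}
```

Because this sketch writes one object per batch, the batches would need to be consolidated (or read together) before loading them as a single CSV file.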

## 4. Data Exploration and Preparation
Load processed data from S3 for exploration and feature engineering.

In [None]:
# Load data from S3
s3 = boto3.client('s3')
data_key = 'processed-data/traffic_data.csv'
obj = s3.get_object(Bucket=bucket, Key=data_key)
df = pd.read_csv(obj['Body'])
df.head()
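
Before training, it helps to check the data for missing values, duplicates, and class balance. The helpers below are one possible sketch of that step; the function names are our own, and the small in-memory `sample` frame (with made-up values) stands in for the S3 data so the cell runs on its own. In the notebook, you would call the helpers on `df` instead.

```python
import pandas as pd

FEATURES = ['vehicle_count', 'average_speed', 'CO2_level']


def summarize(frame, label='high_traffic'):
    """Return row count, missing values per column, and the share of each label."""
    return {
        'rows': len(frame),
        'missing': frame.isna().sum().to_dict(),
        'label_share': frame[label].value_counts(normalize=True).to_dict(),
    }


def clean(frame, features=FEATURES, label='high_traffic'):
    """Drop rows with a missing feature or label, then drop exact duplicates."""
    columns = list(features) + [label]
    return frame.dropna(subset=columns).drop_duplicates().reset_index(drop=True)


# Made-up sample: one row has a missing vehicle_count, and the last row repeats the second.
sample = pd.DataFrame({
    'vehicle_count': [50, 65, None, 65],
    'average_speed': [20, 25, 22, 25],
    'CO2_level': [0.04, 0.05, 0.045, 0.05],
    'high_traffic': [0, 1, 0, 1],
})

print(summarize(sample))
print(clean(sample))
```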

## 5. Model Training with SageMaker
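Train a random forest classifier on the processed data. The model is fit locally in the notebook with scikit-learn and then uploaded to S3 for deployment.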

In [None]:
# Define features and labels, and split data
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
import joblib

X = df[['vehicle_count', 'average_speed', 'CO2_level']]
y = df['high_traffic']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Save model to file
joblib.dump(model, 'model.joblib')

# SageMaker expects model artifacts packaged as a gzipped tar archive
import tarfile
with tarfile.open('model.tar.gz', 'w:gz') as archive:
    archive.add('model.joblib')

# Upload the archive to S3
s3.upload_file('model.tar.gz', bucket, 'model/model.tar.gz')

## 6. Deploy Model as a SageMaker Endpoint
Deploy the trained model as a real-time endpoint using SageMaker.

In [None]:
from sagemaker.sklearn.model import SKLearnModel

model_path = f's3://{bucket}/model/model.tar.gz'
sklearn_model = SKLearnModel(
    model_data=model_path,
    role=role,
    entry_point='inference.py',  # Assumes inference.py is in the same folder
    framework_version='0.23-1'
)

# Deploy endpoint
predictor = sklearn_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)
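
The deployment above references an `inference.py` entry point that this notebook does not show. A minimal sketch of such a script follows, assuming the extracted model directory contains a file named `model.joblib`. The SageMaker scikit-learn container calls `model_fn` to load the model; `predict_fn` is optional and is included here only to make the prediction step explicit.

```python
import os

import joblib


def model_fn(model_dir):
    """Load the model from the directory where SageMaker places the artifacts."""
    return joblib.load(os.path.join(model_dir, 'model.joblib'))


def predict_fn(input_data, model):
    """Run the classifier on the deserialized request (rows of feature values)."""
    return model.predict(input_data)
```

Models saved with `joblib` are not guaranteed to load under a different scikit-learn version, so it is worth checking that the version used for training matches the `framework_version` of the container.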

## 7. Testing the SageMaker Endpoint
Send test data to the deployed endpoint to get predictions.

In [None]:
# Example test data: one row in training order (vehicle_count, average_speed, CO2_level)
test_data = [[50, 20, 0.04]]
prediction = predictor.predict(test_data)
print('Prediction:', prediction)

## 8. Data Analysis with Athena
Set up Athena to query S3 data and analyze traffic trends.
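
One way to run such a query from the notebook is through the boto3 Athena client, sketched below. The database name `traffic_db`, the table name `traffic_data`, and the results location are hypothetical; the sketch also assumes a table has already been defined over the processed data in S3 with the same column names as the CSV.

```python
import time


def build_trend_query(table):
    """Build a query for average vehicle count and speed per location."""
    return (
        'SELECT location, '
        'AVG(vehicle_count) AS avg_vehicle_count, '
        'AVG(average_speed) AS avg_speed '
        f'FROM {table} '
        'GROUP BY location '
        'ORDER BY avg_vehicle_count DESC'
    )


def run_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Start an Athena query, wait for it to finish, and return the raw results."""
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={'Database': database},
        ResultConfiguration={'OutputLocation': output_location},
    )
    query_id = started['QueryExecutionId']
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status['QueryExecution']['Status']['State']
        if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
            break
        time.sleep(poll_seconds)
    if state != 'SUCCEEDED':
        raise RuntimeError(f'Athena query ended in state {state}')
    return athena.get_query_results(QueryExecutionId=query_id)


# Usage (names are placeholders):
# import boto3
# athena = boto3.client('athena')
# results = run_query(athena, build_trend_query('traffic_data'), 'traffic_db',
#                     's3://your-s3-bucket-name/athena-results/')
```

Athena runs queries asynchronously, which is why the sketch polls `get_query_execution` until the query reaches a final state before fetching results.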