🎬 Iron Man (2008) – J.A.R.V.I.S. as Deep Learning AI
🧠 Scene Analogy:
When Tony Stark builds the Iron Man suit in his lab, J.A.R.V.I.S. doesn’t just follow commands — he learns, adapts, and responds like a deep learning system.

🔍 Mapping to Deep Learning Concepts:
| Movie Element | Deep Learning Equivalent |
| --- | --- |
| J.A.R.V.I.S. | AI model (multi-modal deep learning assistant) |
| Tony feeding designs, voice inputs | Training data (supervised learning from input-output examples) |
| Real-time feedback (e.g., flight test failures) | Reinforcement learning (trial and error) |
| Speech interaction | Natural Language Processing (NLP) |
| Gesture and visual sensing | Computer Vision + Sensor Fusion |
| Self-upgrades (e.g., flight stabilizers) | Model optimization and fine-tuning |

🎯 Key Scene to Reference:
🔥 "Test Flight Scene" — Tony crashes multiple times while testing the suit.
→ Each failure is a feedback loop, like how reinforcement learning helps models learn optimal behavior.

💡 Final Analogy:
J.A.R.V.I.S. is like a deep learning assistant that learns Tony's preferences, helps build the suit, corrects mistakes, and even holds conversations, much as LLMs and other deep models improve as they are trained on more data.

# Building a Deep Learning Model
## Loading the dataset into the DBFS (Databricks File System)

In [0]:
%sh
rm -rf /dbfs/deepml_lab
mkdir -p /dbfs/deepml_lab
wget -O /dbfs/deepml_lab/diabetes.csv https://raw.githubusercontent.com/fanidam91/smartdata-lab/main/dev/Data_AI/Data_ai/diabetes.csv

     

## Data Preparation and Feature Engineering

In [0]:
import numpy as np
from pyspark.sql.functions import col
from sklearn.model_selection import train_test_split

# Load the data, removing any incomplete rows
df = spark.read.format("csv").option("header", "true").load("/deepml_lab/diabetes.csv").dropna()

# Convert relevant columns to numeric
for col_name in ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 
                 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']:
    df = df.withColumn(col_name, col(col_name).cast("float"))

# Define the feature columns and the label column
features = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 
            'BMI', 'DiabetesPedigreeFunction', 'Age']
label = 'Outcome'

# Convert to Pandas and split
pandas_df = df.toPandas()
x_train, x_test, y_train, y_test = train_test_split(pandas_df[features].values,
                                                    pandas_df[label].values,
                                                    test_size=0.30,
                                                    random_state=0)

# Ensure correct data types
x_train = x_train.astype(np.float32)
y_train = y_train.astype(np.int64)
x_test = x_test.astype(np.float32)
y_test = y_test.astype(np.int64)

print('Training Set: %d rows, Test Set: %d rows\n' % (len(x_train), len(x_test)))
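
As a quick check on the row counts printed above, the sketch below reproduces how a fractional `test_size` is turned into row counts, assuming scikit-learn rounds the test share up and gives the remainder to the training set. The helper name `split_sizes` and the 768-row input are hypothetical, chosen only for illustration.

```python
import math

def split_sizes(n_rows, test_size=0.30):
    """Return (train_rows, test_rows) for a fractional test_size.

    Assumption: the test share is rounded up and the training set
    receives whatever is left, mirroring train_test_split's defaults.
    """
    n_test = math.ceil(test_size * n_rows)
    n_train = n_rows - n_test
    return n_train, n_test

# Hypothetical dataset of 768 rows
train_rows, test_rows = split_sizes(768)
print('Training Set: %d rows, Test Set: %d rows' % (train_rows, test_rows))
```

If the numbers printed by the notebook differ from this arithmetic, the usual cause is rows removed by `dropna()` before the split.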


## Install and import the PyTorch Libraries

In [0]:

import torch
import torch.nn as nn
import torch.utils.data as td
import torch.nn.functional as F
   
# Set random seed for reproducibility
torch.manual_seed(0)
   
print("Libraries imported - ready to use PyTorch", torch.__version__)
     

## Create Data Loaders

In [0]:

# Create a dataset and loader for the training data and labels
train_x = torch.Tensor(x_train).float()
train_y = torch.Tensor(y_train).long()
train_ds = td.TensorDataset(train_x,train_y)
train_loader = td.DataLoader(train_ds, batch_size=20,
    shuffle=False, num_workers=1)

# Create a dataset and loader for the test data and labels
test_x = torch.Tensor(x_test).float()
test_y = torch.Tensor(y_test).long()
test_ds = td.TensorDataset(test_x,test_y)
test_loader = td.DataLoader(test_ds, batch_size=20,
                             shuffle=False, num_workers=1)
print('Ready to load data')
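
To make the effect of `batch_size=20` concrete, here is a minimal pure-Python sketch of how a loader with `shuffle=False` walks through a dataset in order. The `iter_batches` helper and the 50-row dataset are hypothetical stand-ins, not part of PyTorch.

```python
def iter_batches(rows, batch_size=20):
    """Yield consecutive slices of rows, like an unshuffled data loader."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

rows = list(range(50))  # hypothetical dataset of 50 rows
batch_lengths = [len(b) for b in iter_batches(rows)]
print(batch_lengths)    # the last batch holds the remainder
```

The final batch is smaller than the others whenever the row count is not a multiple of the batch size, which is why the training function averages the loss over the number of batches rather than assuming a fixed count.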

## Define the Neural Network

In [0]:
h1 = 10  # number of nodes in each hidden layer

# Define the neural network
class DiabetesNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(len(features), h1)  # input features -> first hidden layer
        self.fc2 = nn.Linear(h1, h1)             # first hidden layer -> second hidden layer
        self.fc3 = nn.Linear(h1, 2)              # second hidden layer -> 2 output classes

    def forward(self, x):
        fc1_output = torch.relu(self.fc1(x))
        fc2_output = torch.relu(self.fc2(fc1_output))
        # Return log-probabilities for the two classes
        y = F.log_softmax(self.fc3(fc2_output).float(), dim=1)
        return y

# Create a model instance from the network
model = DiabetesNet()
print(model)
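
The size of this network can be worked out by hand: each `nn.Linear(n_in, n_out)` layer holds `n_in * n_out` weights plus `n_out` biases. The sketch below does that arithmetic for the three layers, assuming 8 input features and 10 nodes per hidden layer as defined above; `linear_params` is a hypothetical helper.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

n_features, hidden, n_classes = 8, 10, 2
layer_shapes = [(n_features, hidden), (hidden, hidden), (hidden, n_classes)]
per_layer = [linear_params(n_in, n_out) for n_in, n_out in layer_shapes]
total_params = sum(per_layer)
print(per_layer, total_params)
```

In the notebook itself, `sum(p.numel() for p in model.parameters())` should report the same total.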


## Create Functions to Test and Train a Neural Network Model

In [0]:


def train(model, data_loader, optimizer):
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    # set the model to training mode
    model.train()
    train_loss=0

    for batch, tensor in enumerate(data_loader):
        data, target = tensor
        # move the batch to the same device as the model
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        out = model(data)
        loss = loss_criteria(out, target)
        train_loss = train_loss + loss.item()

        # backpropagate adjustments to the weights
        loss.backward()
        optimizer.step()
    
    # Return the average loss 
    avg_loss = train_loss / (batch+1)
    print('Training set: Average loss: {:.6f}'.format(avg_loss))
    return avg_loss

def test(model, data_loader):
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model.to(device)
    # switch the model to evaluation mode
    model.eval()
    test_loss = 0
    correct = 0

    with torch.no_grad():
        batch_count = 0
        for batch, tensor in enumerate(data_loader):
            batch_count += 1
            data, target = tensor
            # move the batch to the same device as the model
            data, target = data.to(device), target.to(device)
            # get the predictions
            out = model(data)

            # calculate the loss
            test_loss += loss_criteria(out, target).item()

            # calculate the accuracy
            _, predicted = torch.max(out, 1)
            correct += torch.sum(target == predicted).item()

    # Calculate the average loss and total accuracy for this epoch
    avg_loss = test_loss/batch_count
    print('Validation set: Average loss: {:.6f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        avg_loss, correct, len(data_loader.dataset),
        100. * correct / len(data_loader.dataset)))
       
    # return average loss for the epoch
    return avg_loss
     



## Training the model

In [0]:

# Specify the loss criteria (CrossEntropyLoss for classification).
# The model already returns log-probabilities, so nn.NLLLoss would give the same result.
loss_criteria = nn.CrossEntropyLoss()
   
# Use an optimizer to adjust weights and reduce loss
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
optimizer.zero_grad()
   
# We'll track metrics for each epoch in these arrays
epoch_nums = []
training_loss = []
validation_loss = []
   
# Train over 100 epochs
epochs = 100
for epoch in range(1, epochs + 1):
   
    # print the epoch number
    print('Epoch: {}'.format(epoch))
       
    # Feed training data into the model
    train_loss = train(model, train_loader, optimizer)
       
    # Feed the test data into the model to check its performance
    test_loss = test(model, test_loader)
       
    # Log the metrics for this epoch
    epoch_nums.append(epoch)
    training_loss.append(train_loss)
    validation_loss.append(test_loss)
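
Because `DiabetesNet.forward` already applies `log_softmax`, it is worth checking what `CrossEntropyLoss` does with those outputs. The pure-Python sketch below (hypothetical scores, no PyTorch needed) computes cross-entropy as log-softmax followed by the negative log-likelihood of the true class, and shows that applying log-softmax a second time leaves the values unchanged, so the loss is still the intended one.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]

def cross_entropy(scores, target):
    """Cross-entropy for one sample: -log_softmax(scores)[target]."""
    return -log_softmax(scores)[target]

raw_scores = [0.2, 1.5]              # hypothetical output of the last linear layer
log_probs = log_softmax(raw_scores)  # what forward() returns
twice = log_softmax(log_probs)       # log-softmax applied again inside the loss

loss_from_raw = cross_entropy(raw_scores, target=1)
loss_from_log_probs = cross_entropy(log_probs, target=1)
print(loss_from_raw, loss_from_log_probs)
```

The two losses agree because log-probabilities already exponentiate to a sum of 1, so the second normalization subtracts zero.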

## Review the Training and Validation Loss

In [0]:
%matplotlib inline
from matplotlib import pyplot as plt
   
plt.plot(epoch_nums, training_loss)
plt.plot(epoch_nums, validation_loss)
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['training', 'validation'], loc='upper right')
plt.show()
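
When reading the plot, a common thing to look for is the epoch where validation loss stops improving while training loss keeps falling. A minimal sketch of locating that point, using hypothetical loss values (the `demo_*` lists) in place of a real run:

```python
# Hypothetical per-epoch losses standing in for the lists logged during training
demo_epochs = [1, 2, 3, 4, 5, 6]
demo_training_loss = [0.70, 0.62, 0.55, 0.50, 0.46, 0.43]
demo_validation_loss = [0.69, 0.63, 0.58, 0.57, 0.59, 0.62]

# Index of the lowest validation loss
best_idx = min(range(len(demo_validation_loss)), key=demo_validation_loss.__getitem__)
best_epoch = demo_epochs[best_idx]
print('Lowest validation loss %.2f at epoch %d' % (demo_validation_loss[best_idx], best_epoch))
```

The same two lines can be applied to `epoch_nums` and `validation_loss` from the training loop; a validation curve that turns upward after its minimum suggests the model has started to overfit.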

## View the learned weights and bias

In [0]:
for param_tensor in model.state_dict():
    # .cpu() is needed because train() may have moved the model to a GPU
    print(param_tensor, "\n", model.state_dict()[param_tensor].cpu().numpy())
     

## Save the trained model

In [0]:
# Save the model weights
model_file = '/dbfs/diabetes_predictor.pt'
torch.save(model.state_dict(), model_file)
del model
print('model saved as', model_file)

## Inference/Test the saved model

In [0]:

# New Diabetes Features
x_new = [[8,85,65,29,0,26.6,0.672,32]]
print ('New sample: {}'.format(x_new))
   
# Create a new model class and load weights
model = DiabetesNet()
model.load_state_dict(torch.load(model_file))
   
# Set model to evaluation mode
model.eval()
   
# Get a prediction for the new data sample
x = torch.Tensor(x_new).float()
_, predicted = torch.max(model(x).data, 1)
   
print('Prediction:',predicted.item())
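
The model returns log-probabilities, so `torch.max` only reports the winning class. To see how confident the prediction is, the values can be exponentiated. A small sketch with hypothetical log-probabilities (the numbers are illustrative, not output from the trained model):

```python
import math

demo_log_probs = [-0.105, -2.303]  # hypothetical model output for one sample
demo_probs = [math.exp(lp) for lp in demo_log_probs]
predicted_class = max(range(len(demo_probs)), key=demo_probs.__getitem__)
print('Class probabilities:', [round(p, 3) for p in demo_probs])
print('Prediction:', predicted_class)
```

With the real model, `torch.exp(model(x))` gives the same kind of probabilities as a tensor.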

## Logging the Model using MLflow for real-time inferencing via an exposed endpoint

In [0]:
import mlflow
import mlflow.pytorch

# Log the model in MLflow
with mlflow.start_run():   
    mlflow.pytorch.log_model(model, "diabetes_predictor_model")
    print("Model logged in MLflow")

## Inferencing the Model through the serving endpoint

In [0]:

 {
   "dataframe_records": [
   {
      "Pregnancies": 8,
      "Glucose": 85,
      "BloodPressure": 65,
      "SkinThickness": 29,
      "Insulin": 0,
      "BMI": 26.6,
      "DiabetesPedigreeFunction": 0.672,
      "Age": 34
   }
   ]
 }
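
The JSON above is the request body for the serving endpoint. A hedged sketch of building that request with the Python standard library follows; the workspace URL, endpoint name, and token are placeholders (assumptions, not values from this lab), and the request is constructed but not sent.

```python
import json
import urllib.request

# Placeholders: replace with your own workspace URL, endpoint name, and token
ENDPOINT_URL = "https://example.cloud.databricks.com/serving-endpoints/diabetes-endpoint/invocations"
API_TOKEN = "REPLACE_WITH_YOUR_TOKEN"

payload = {
    "dataframe_records": [
        {
            "Pregnancies": 8, "Glucose": 85, "BloodPressure": 65,
            "SkinThickness": 29, "Insulin": 0, "BMI": 26.6,
            "DiabetesPedigreeFunction": 0.672, "Age": 34,
        }
    ]
}

request = urllib.request.Request(
    ENDPOINT_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer " + API_TOKEN,
        "Content-Type": "application/json",
    },
    method="POST",
)

# To send it: response = urllib.request.urlopen(request); print(response.read())
print(request.get_method(), len(request.data), "bytes")
```

The column names in each record must match the feature names used during training, in any order, since the endpoint receives them as a dataframe.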