<a href="https://colab.research.google.com/github/hdkhosravian/deep_learning_tutorial/blob/main/Gardening_with_AI_Making_Your_Green_Thumb_Greener_with_Simple_Regression_Models_Torch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Data Preparation**

In our case, think of "Data Preparation" as preparing the soil and seeds for your garden. The soil and seeds are the basis for your garden, just as data is the basis for our model.

We're creating a hypothetical garden where each plant is influenced by factors like Growing Days, Soil pH, Sunlight Hours, and Humidity. The plants, in our scenario, represent our data points, and these factors are akin to the different nurturing elements necessary for growth.

The "Plant Height" we're aiming to predict is the outcome of these combined factors, much like the height of a plant in your garden is the outcome of how many days it's been growing, the pH of the soil it's planted in, how many hours of sunlight it receives, and the level of humidity it's exposed to.

Let's create our garden, or in other words, let's generate our dataset:



```
# Importing necessary libraries
import pandas as pd
import numpy as np

# Number of plants (data points)
n = 1000

# Create the dataframe (garden)
df = pd.DataFrame({
 'Growing Days': np.random.randint(20, 50, n),
 'Soil pH': np.random.uniform(6.0, 7.5, n).round(1),
 'Sunlight Hours': np.random.uniform(5.0, 9.0, n).round(1),
 'Humidity': np.random.randint(50, 80, n),
 'Plant Height': np.random.randint(20, 50, n)
})

df.head()
```



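This code creates a pandas dataframe with n (1000) entries. Each entry represents a plant with its corresponding characteristics (Growing Days, Soil pH, Sunlight Hours, Humidity) and outcome (Plant Height). We use the np.random functions to simulate variability in these factors and outcomes, just like in a real garden where these factors vary from plant to plant. Two details are worth noting: `np.random.randint(20, 50, n)` excludes the upper bound, so the values run from 20 to 49, and Plant Height is drawn independently of the other columns, so this toy dataset contains no real relationship between the factors and the height. It is only meant to demonstrate the workflow.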

At the end of this step, we have our garden ready, and each plant's data can be used for the next steps of our AI-assisted gardening journey!
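If you would like the garden to contain a pattern the model can actually learn, one option is to compute the height from the factors and add some noise. The following is a minimal sketch of that idea; the coefficients, the noise level, the seed, and the name `df_demo` are illustrative assumptions and are not part of this tutorial's dataset.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (the seed is arbitrary)
rng = np.random.default_rng(0)
n = 1000

df_demo = pd.DataFrame({
    'Growing Days': rng.integers(20, 50, n),
    'Soil pH': rng.uniform(6.0, 7.5, n).round(1),
    'Sunlight Hours': rng.uniform(5.0, 9.0, n).round(1),
    'Humidity': rng.integers(50, 80, n),
})

# Hypothetical relationship: height grows with days and sunlight, plus noise
df_demo['Plant Height'] = (
    0.8 * df_demo['Growing Days']
    + 2.0 * df_demo['Sunlight Hours']
    + rng.normal(0.0, 2.0, n)
)

# Correlation of each column with the target
print(df_demo.corr()['Plant Height'].round(2))
```

With heights generated this way, Growing Days should show a strong positive correlation with Plant Height, which gives a regression model something to discover.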

# **Train-Test Split**

The Train-Test split is like an exam preparation strategy for our AI model, similar to how students prepare for their exams.

Think of the entire dataset as the complete syllabus for the exam. Now, students don't go to the exam hall without studying and revising, right? They read their textbooks, take notes, and learn the material first - this is their "training". And, they don't use the entire syllabus for revision. Instead, they leave some questions or chapters for self-testing and assessment - this is their "testing" or validation phase.

That's exactly what we are doing here with our data. We split our data into a training set (around 80% of the data) and a testing set (the remaining 20%). The training set is like the textbook. Our model learns from this data - it adjusts its weights and biases based on the features and their corresponding target values in the training set.

Once the model has been trained, we need to assess how well it has learned. We do this by using the testing set. The model has never seen this data during the training process. So, this phase is like the exam - we're testing the model's ability to make correct predictions on new, unseen data.

Let's see this in action:



```
# Importing the necessary library
from sklearn.model_selection import train_test_split

# Split data into input features (X) and target variable (y)
X = df.drop('Plant Height', axis=1).values
y = df['Plant Height'].values

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

```



In this code, we first separate our features (X) and target (y). We then use the train_test_split function from the sklearn.model_selection module to divide our data into training and testing sets. The test_size=0.2 argument means that 20% of the data will be set aside for testing, and the remaining 80% will be used for training. The random_state parameter ensures reproducibility - with the same random state, you'll get the same train-test split every time you run the code.

By the end of this step, we have our "textbook" (X_train, y_train) and our "exam" (X_test, y_test) ready for the next steps!
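To see the 80/20 split and the effect of `random_state` concretely, here is a small self-contained sketch. It uses stand-in arrays (`X_demo`, `y_demo`) rather than the garden dataframe, purely for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 1000 rows, 4 feature columns, 1 target value per row
X_demo = np.arange(4000).reshape(1000, 4)
y_demo = np.arange(1000)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
print(X_tr.shape, X_te.shape)  # (800, 4) (200, 4)

# The same random_state produces the same split again
X_tr2, X_te2, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
print(np.array_equal(X_te, X_te2))  # True

# No plant appears in both the "textbook" and the "exam"
print(set(y_tr) & set(y_te))  # set()
```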


# **Standardization**

Standardization in data preparation is a lot like converting various currencies to a single, universal one when dealing with international trade.

Imagine you own an international business that involves transactions in different currencies such as the US Dollar, Euro, British Pound, and Japanese Yen. When analyzing the financial data of your business, dealing with multiple currencies could get confusing and cumbersome. Moreover, each of these currencies has a different scale: 1 US Dollar is not the same as 1 Euro, or 1 British Pound, or 1 Japanese Yen. The solution? Convert all transactions to a common currency, say, US Dollar. Now, you have standardized the financial data, making it simpler and more efficient to analyze.

Similarly, our AI model can face difficulty when the features (input variables) are on different scales. For instance, 'Growing Days' might range from 20 to 50, 'Sunlight Hours' from 5.0 to 9.0, and 'Humidity' from 50 to 80. Each of these features is on a different scale, like different currencies. If we feed these features to our model as they are, features with larger numeric values can dominate the weight updates, which tends to make gradient-based training slower and less stable.

Standardizing these features means that we convert all features to the same scale, much like converting all currencies to one standard currency. It involves subtracting the mean and dividing by the standard deviation of each feature, bringing them all to a standard scale where the mean is 0 and the standard deviation is 1. This makes it easier for our model to learn from the data.

Here's how we do it in Python:


```
# Importing the necessary library
from sklearn.preprocessing import StandardScaler

# Initialize a scaler using StandardScaler
scaler = StandardScaler().fit(X_train)

# Apply the scaler to the training and testing data
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

```



We initialize a StandardScaler and use the fit method on the training data (X_train). This calculates the mean and standard deviation of each feature in the training data. We then transform our training data using these calculated values with the transform method. The mean of each feature in the transformed training data is now 0 and the standard deviation is 1.

We also apply this transformation to the testing data (X_test) using the same scaler. We use the same scaler to ensure that the model sees the testing data in the same way as the training data. If we fit a new scaler to the testing data, it would have different mean and standard deviation values, creating inconsistency between the training and testing sets.

Remember, we always fit the scaler to the training data only, not the testing data. The testing data is supposed to be unseen, new data, so we shouldn't use it to influence any part of the training process.

After this step, our data is ready to be fed into the neural network. All features are now speaking the same "language", making it easier for the model to learn.
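The "subtract the mean, divide by the standard deviation" recipe can be checked by hand against `StandardScaler`. The sketch below uses a tiny made-up feature matrix (`train_demo`, `test_demo`), not the garden data, and relies on the fact that `StandardScaler` uses the population standard deviation (the NumPy default, `ddof=0`).

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Tiny stand-in feature matrix: columns play the role of Growing Days and Sunlight Hours
train_demo = np.array([[20.0, 5.0], [30.0, 7.0], [40.0, 9.0]])
test_demo = np.array([[50.0, 6.0]])

# Manual standardization using training statistics only
mean = train_demo.mean(axis=0)
std = train_demo.std(axis=0)  # population std (ddof=0), as StandardScaler uses
manual_train = (train_demo - mean) / std
manual_test = (test_demo - mean) / std

scaler_demo = StandardScaler().fit(train_demo)
print(np.allclose(manual_train, scaler_demo.transform(train_demo)))  # True
print(np.allclose(manual_test, scaler_demo.transform(test_demo)))    # True

# Training columns now have mean 0 and standard deviation 1
print(manual_train.mean(axis=0), manual_train.std(axis=0))
```

Note that the test row is scaled with the training mean and standard deviation, so its values are not guaranteed to have mean 0 themselves. That is expected.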

# **Model Architecture**

Creating our model architecture is like designing the blueprint of a building. Each component of the building has a specific role and is connected in a particular way to make the whole structure functional. Similarly, in a neural network, we have different layers (floors of the building), and each layer has neurons (rooms on the floor) performing specific operations.

In our plant height prediction project, we need to design a model that can accept our garden data, process it through a series of operations, and finally make a prediction. We use a specific type of neural network for this, called a fully connected or dense network, where every neuron in each layer is connected to every neuron in the next layer.



```
# Importing necessary libraries
import torch  # needed for torch.relu in forward()
from torch import nn

# Define a PyTorch model
class Net(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Net, self).__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.layer2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = self.layer2(x)
        return x

```



1. `class Net(nn.Module):` - Just as a garden has a plan, our machine learning model also needs a plan or a blueprint. Here, we're defining that blueprint and calling it "Net". It is a subclass of PyTorch's nn.Module, which means it will inherit all the properties and functions of a general PyTorch model.

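2. `def __init__(self, input_size, hidden_size, output_size):` - This is our model's constructor. `input_size` is the number of features we feed in (four in our garden: Growing Days, Soil pH, Sunlight Hours, and Humidity), `hidden_size` is the number of neurons in the hidden layer (how much capacity the model has to nurture patterns), and `output_size` is the number of values we predict (one: the plant height).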

3. `super(Net, self).__init__():` - With this line, we're calling the constructor of nn.Module, the parent class. It's like referring back to general gardening knowledge before we apply our specific plan.

4. `self.layer1 = nn.Linear(input_size, hidden_size)` - Here, we're defining our first layer of the neural network. Think of it as the first stage of growing a plant where we plant seeds (input) and start nurturing them (hidden nodes). `nn.Linear` means that every input is connected to every hidden node, much like each seed receiving uniform care and attention.

5. `self.layer2 = nn.Linear(hidden_size, output_size)` - This is the second stage of plant growth where our nurtured plants (hidden nodes) grow to their full height (output). Again, `nn.Linear` ensures that every nurtured plant contributes to the final output.

6. `def forward(self, x):` - This is our model's growth process. We take our initial conditions (`x`, or seeds), and put them through the two stages we defined (layer1 and layer2).

7. `x = torch.relu(self.layer1(x))` - Here, our seeds go through the first stage of growth. The `torch.relu` is an activation function, acting like a growth regulation factor that controls how much each input (seed) contributes to the hidden nodes (nurturing stage).

8. `x = self.layer2(x)` - Now, our nurtured plants go through the second stage and grow to their final height. The output here (`x`) is our final plant height.

9. `return x` - We're returning the final plant heights predicted by our model.
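To make the blueprint concrete, the sketch below builds the same two-layer network and passes a small batch through it. The class is repeated so the snippet runs on its own; `hidden_size=8`, the random input batch, and the name `demo_model` are arbitrary choices for illustration.

```python
import torch
from torch import nn

class Net(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.layer2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        return self.layer2(x)

# 4 input features, 1 predicted height; hidden_size=8 is an arbitrary choice
demo_model = Net(input_size=4, hidden_size=8, output_size=1)

batch = torch.randn(5, 4)      # 5 plants, 4 standardized features each
heights = demo_model(batch)
print(heights.shape)           # torch.Size([5, 1])

# Trainable parameters: (4*8 weights + 8 biases) + (8*1 weights + 1 bias) = 49
n_params = sum(p.numel() for p in demo_model.parameters())
print(n_params)                # 49
```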

### **Why do we need activation functions?**

They introduce non-linearity to our model. In real life, things are not usually linear. For example, doubling the amount of water you give to a plant won't necessarily double its growth. Activation functions help our model capture these non-linear relationships, making its predictions more accurate and versatile.
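One way to see why the activation matters: two linear layers stacked without an activation in between are mathematically equivalent to a single linear layer. The following sketch checks this numerically; the layer sizes and the seed are arbitrary.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer1 = nn.Linear(4, 8)
layer2 = nn.Linear(8, 1)
x = torch.randn(5, 4)

with torch.no_grad():
    # Two linear layers with no activation in between
    stacked = layer2(layer1(x))

    # The same mapping written as one linear layer: W = W2 @ W1, b = W2 @ b1 + b2
    W = layer2.weight @ layer1.weight
    b = layer2.weight @ layer1.bias + layer2.bias
    single = x @ W.T + b

    # With ReLU in between, the network is no longer a single linear map
    with_relu = layer2(torch.relu(layer1(x)))

print(torch.allclose(stacked, single, atol=1e-5))  # True
print(torch.allclose(stacked, with_relu, atol=1e-5))
```

The first check confirms that the extra layer adds no expressive power on its own. The second comparison will generally show a difference, because ReLU zeroes out negative hidden values and breaks the purely linear chain.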

# **Loss Function and Optimizer Definition**
Defining the loss function and the optimizer in a machine learning model is akin to setting up a system for learning from mistakes and making improvements, much like a gardener trying different methods to optimize plant growth.

Let's look at the loss function first:


```
criterion = nn.MSELoss()
```



In this line, we're defining the loss function to be Mean Squared Error (MSE). Think of it as a gardener measuring the difference between the actual height of the plant and the height he expected the plant to reach. If the plant is shorter than expected, there's a 'loss' or error. If the plant is taller, that's also a deviation. We square these deviations (to make sure they are positive) and take the mean of all squared errors. This gives us a single number that tells us how off our predictions are from the actual values, or in the gardener's context, how off his expectation was from the actual plant height.
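The arithmetic behind MSE is easy to verify by hand. Here is a small worked example with three made-up plant heights, compared against `nn.MSELoss`.

```python
import torch
from torch import nn

predicted = torch.tensor([30.0, 42.0, 25.0])  # heights the gardener expected
actual = torch.tensor([32.0, 40.0, 25.0])     # heights actually measured

errors = predicted - actual        # [-2, 2, 0]
squared = errors ** 2              # [4, 4, 0]
mse_by_hand = squared.mean()       # (4 + 4 + 0) / 3

mse_torch = nn.MSELoss()(predicted, actual)
print(mse_by_hand.item(), mse_torch.item())  # both approximately 2.6667
```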


Next, let's look at the optimizer:


```
import torch.optim as optim
optimizer = optim.SGD(model.parameters(), lr=0.01)
```





Think of SGD as a very determined, yet slightly short-sighted gardener, who is trying to climb to the top of a hill (our optimal model), but can only see a few feet ahead (current batch of data). The gardener's goal is to reach the top of the hill (the best model that minimizes error) as quickly as possible.

1.   Step Size (Learning Rate): The size of the steps the gardener takes is like our learning rate (lr). If the steps are too large (lr is too high), the gardener might overshoot the top and end up on the other side of the hill. If the steps are too small (lr is too low), it could take the gardener a very long time to reach the top. So, the learning rate is a crucial aspect of SGD that needs to be set correctly. In our code, lr=0.01 means the gardener is taking small steps towards the top of the hill.

2.   Stochastic (Random): The word 'stochastic' means random. When the gardener can't see the entire hill, he decides to move in the direction which seems to be the steepest, hoping it will lead him to the top. In the case of our model, instead of looking at the entire dataset (the whole hill), we look at a small random sample (current surroundings of the gardener), calculate the error (determine the slope), and make a step (update the model parameters). Note that the training loop in this tutorial passes the whole training set to the model at every step, so in practice it performs full-batch gradient descent; `optim.SGD` applies the same update rule either way.

3.   Gradient Descent: The word 'gradient' refers to the slope, and 'descent' means to go down. But wait, aren't we trying to climb the hill, not descend? The confusion arises because we're technically trying to minimize the error (descend a hill of error), even though we often visualize this as trying to maximize accuracy (climb a hill of accuracy). So, the gardener is using the steepness of the slope to guide his steps.


Overall, Stochastic Gradient Descent is a strategy where the model, like a gardener, is trying to find the best parameters (location on the hill) that minimize the error (distance from the top), by taking steps proportional to the negative of the gradient (opposite of the slope) at the current point, based on a small random sample of the total data (limited view of the hill). And it keeps repeating this process, improving with each step, until it reaches the best possible point (the top of the hill).
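The phrase "steps proportional to the negative of the gradient" corresponds to the update rule `new = old - lr * gradient`. The sketch below computes that update by hand for a tiny linear model and checks that `optim.SGD` (with no momentum or weight decay) does the same thing. The data is random stand-in data and the names ending in `_demo` are illustrative.

```python
import torch
from torch import nn
import torch.optim as optim

torch.manual_seed(0)
lr = 0.01
model_demo = nn.Linear(4, 1)
x = torch.randn(16, 4)   # one small batch of stand-in inputs
y = torch.randn(16, 1)   # stand-in targets

# Compute gradients of the loss for this batch
loss = nn.MSELoss()(model_demo(x), y)
loss.backward()

# What the update rule predicts: new = old - lr * gradient
expected_weight = model_demo.weight.detach() - lr * model_demo.weight.grad
expected_bias = model_demo.bias.detach() - lr * model_demo.bias.grad

# What the optimizer actually does
optimizer_demo = optim.SGD(model_demo.parameters(), lr=lr)
optimizer_demo.step()

print(torch.allclose(model_demo.weight, expected_weight))  # True
print(torch.allclose(model_demo.bias, expected_bias))      # True
```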

# **Model Training**



```
# Training loop (assumes model, criterion, optimizer, inputs, and targets are already defined)
epochs = 100
for epoch in range(epochs):
    model.train()
    optimizer.zero_grad()  
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    loss.backward()  
    optimizer.step()
    print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, epochs, loss.item()))

```




1. `for epoch in range(epochs):` - The term 'epoch' is like one complete cycle of a gardener trying different gardening methods on all of his plants in the garden and learning from it. The more epochs, the more times the gardener tries different methods and learns from the plants' reactions. The range of epochs is the number of times we want the model (our gardener) to learn from the entire dataset (all plants in the garden).

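2. `model.train()` - Here, we're telling our model that it's training time. This matters for layers such as dropout and batch normalization, which behave differently during training and evaluation; our simple network has none, but the call keeps the loop correct if such layers are added later. It's like the gardener getting his tools and plants ready for a new cycle of gardening.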

3. `optimizer.zero_grad()` - Before calculating new gradients, it's essential to set the old ones to zero. If we don't zero the gradients, they will accumulate and interfere with the current gradients. It's like erasing the blackboard before writing new information; otherwise, it would be tough to read and understand what's written.

4. `outputs = model(inputs)` - In this line, the model is making a prediction (outputs) based on the current input data (inputs). It's like the gardener predicting how tall a plant will grow given the current soil pH, sunlight hours, and other factors.

5. `loss = criterion(outputs, targets)` - Here, we're calculating the 'loss' or error between the model's predictions (outputs) and the actual plant heights (targets). This tells the gardener how far off his predictions were from the actual results.

6. `loss.backward()` - This line is where the model starts learning from its mistakes. It calculates the gradients of the loss function with respect to the model parameters. Gradients can be thought of as the directions and magnitudes to adjust the parameters. Without this step, the model wouldn't know how to improve its predictions.

7. `optimizer.step()` - Now, the optimizer uses the gradients to update the model parameters. This is like a step of learning or an adjustment made based on the mistakes. The model parameters are tweaked slightly in the direction that reduces the loss. Without this step, the model wouldn't learn from its mistakes, and the loss wouldn't decrease over time.

8. `print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, epochs, loss.item()))` - This line just prints out the progress of training so we can monitor it. The gardener is simply making a note of how well his current gardening methods are working.
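The gradient accumulation mentioned in step 3 can be observed directly. This sketch uses a single bare tensor instead of a model and optimizer, so `w.grad.zero_()` plays the role that `optimizer.zero_grad()` plays in the training loop.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

# First backward pass: d(3*w)/dw = 3
(3 * w).backward()
first = w.grad.item()
print(first)    # 3.0

# Second backward pass without zeroing: the new gradient is added, 3 + 3 = 6
(3 * w).backward()
second = w.grad.item()
print(second)   # 6.0

# Erase the blackboard, then compute again: back to 3
w.grad.zero_()
(3 * w).backward()
third = w.grad.item()
print(third)    # 3.0
```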

In a nutshell, during model training, the model, like a gardener, is learning from its mistakes and improving its predictions (gardening methods) iteratively until it can predict the plant heights as accurately as possible. This is why we have to go through this process: without it, our model wouldn't learn from the data, and wouldn't be able to make accurate predictions.
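The loop above feeds all of the data to the model in one go. A common variation, and the one that puts the "stochastic" into SGD, is to update the parameters on small shuffled batches. The following is a minimal sketch of that variation using `DataLoader`; the synthetic data, the "true" weights, the batch size, and the number of epochs are all assumptions chosen for illustration.

```python
import torch
from torch import nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)

# Stand-in data with a known linear relationship plus a little noise
X = torch.randn(800, 4)
true_w = torch.tensor([[1.5], [-2.0], [0.5], [3.0]])  # hypothetical weights
y = X @ true_w + 0.1 * torch.randn(800, 1)

loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
model_demo = nn.Linear(4, 1)
criterion_demo = nn.MSELoss()
optimizer_demo = optim.SGD(model_demo.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(20):
    model_demo.train()
    running = 0.0
    for xb, yb in loader:              # each batch is a random sample of up to 32 plants
        optimizer_demo.zero_grad()
        loss = criterion_demo(model_demo(xb), yb)
        loss.backward()
        optimizer_demo.step()
        running += loss.item() * len(xb)
    epoch_losses.append(running / len(X))  # average loss over the epoch

print(f'first epoch: {epoch_losses[0]:.4f}, last epoch: {epoch_losses[-1]:.4f}')
```

With this setup the model makes 25 parameter updates per epoch instead of one, so it usually needs far fewer epochs to bring the loss down.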


# **Model Evaluation**


```
# Switch to evaluation mode
model.eval()

with torch.no_grad():
    predictions = model(test_inputs)
    predictions = torch.flatten(predictions)

test_loss = criterion(predictions, test_targets)
print(f'Test Loss: {test_loss.item()}')

```



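1. `model.eval()` - First, we tell our model to switch into evaluation mode with `model.eval()`. This changes how certain layers behave (for example, dropout and batch normalization switch to their inference behavior); our simple network has no such layers, but calling it is a good habit. It's like telling our gardener, "Now is the time to test your knowledge and predict the height of new plants." Note that `model.eval()` does not by itself stop gradient tracking; the next line takes care of that.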

2. `with torch.no_grad():` - This line of code is us asking PyTorch to perform the next operations without tracking gradients. This is because during testing we don't want to update our weights and biases, we just want to test our model. It's like telling our gardener not to change his methods based on these new plants.

3. `predictions = model(test_inputs)` - Here, we ask our model to predict the output (`predictions`) given our test inputs. It's like the gardener predicting the height of new plants based on what he has learned.

4. `predictions = torch.flatten(predictions)` - Sometimes, our predictions come out in a different shape than we want (for example, they might be in a two-dimensional matrix but we want a one-dimensional vector). To make sure our predictions and actual targets are in the same shape, we 'flatten' our predictions. It's like our gardener listing down his predictions for each plant height one after another in a single line.

5. `test_loss = criterion(predictions, test_targets)` - Here, we're calculating our test loss, which is the error of our model on the test data. We use the same criterion we used for training. This tells us how far off our predictions were from the actual results. It's like comparing the gardener's predicted plant heights to their actual heights and seeing how much he was off by.

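6. `print(f'Test Loss: {test_loss.item()}')` - Finally, we print out the test loss. This gives us a numerical value of how well (or not so well) our model performed. Because the loss is a mean squared error, it's like telling the gardener: "On average, the square of your prediction error was this much." Taking the square root of this number gives a figure in the same units as the targets.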

Each step here is necessary for testing our model effectively. We need to put our model in evaluation mode, prevent it from learning from the test data, make predictions, ensure those predictions are in the correct shape, calculate the loss to quantify the error, and then print it out to know how our model performed. This testing phase is crucial to verify that our model has indeed learned meaningful patterns from the data and is not just memorizing it, and that it can generalize its learning to new, unseen data.
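The flattening step is worth a closer look, because a shape mismatch does not crash the loss function. When predictions have shape `(N, 1)` and targets have shape `(N,)`, PyTorch issues a warning and broadcasts both to `(N, N)`, which silently produces a wrong loss. The sketch below illustrates this with three made-up values whose predictions are all exactly right.

```python
import warnings
import torch
from torch import nn

criterion_demo = nn.MSELoss()
preds_2d = torch.tensor([[1.0], [2.0], [3.0]])   # shape (3, 1), as a model would return
targets_1d = torch.tensor([1.0, 2.0, 3.0])       # shape (3,)

# Shapes match after flattening: every prediction is exact, so the loss is 0
good = criterion_demo(torch.flatten(preds_2d), targets_1d)

# Shapes (3, 1) and (3,) broadcast to (3, 3): each prediction is compared
# with every target, which gives a misleading non-zero loss
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # PyTorch warns about the size mismatch
    bad = criterion_demo(preds_2d, targets_1d)

print(good.item())  # 0.0
print(bad.item())   # approximately 1.3333 (= 12 / 9)
```

If the targets are already stored with shape `(N, 1)`, as in some pipelines, the predictions should be left unflattened instead. What matters is that both tensors have the same shape.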

# **FULL CODE**

```
# Importing necessary libraries
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Number of plants (data points)
n = 10000

# Create the dataframe (garden)
df = pd.DataFrame({
    'Growing Days': np.random.randint(20, 50, n),
    'Soil pH': np.random.uniform(6.0, 7.5, n).round(1),
    'Sunlight Hours': np.random.uniform(5.0, 9.0, n).round(1),
    'Humidity': np.random.randint(50, 80, n),
    'Plant Height': np.random.randint(20, 50, n)
})

# Separate the target variable from the input features
inputs_df = df[['Growing Days', 'Soil pH', 'Sunlight Hours', 'Humidity']]
targets_df = df[['Plant Height']]

# Split the data into train and test datasets before scaling
inputs_train_df, inputs_test_df, targets_train_df, targets_test_df = train_test_split(
    inputs_df, targets_df, test_size=0.2, random_state=42)

# Standardize the input features (fit on the training data only)
input_scaler = StandardScaler()
inputs_train = input_scaler.fit_transform(inputs_train_df)
inputs_test = input_scaler.transform(inputs_test_df)

# Standardize the target variable (fit on the training data only)
target_scaler = StandardScaler()
targets_train = target_scaler.fit_transform(targets_train_df)
targets_test = target_scaler.transform(targets_test_df)

# Create input and target tensors
inputs_train = torch.tensor(inputs_train, dtype=torch.float32)
inputs_test = torch.tensor(inputs_test, dtype=torch.float32)
targets_train = torch.tensor(targets_train, dtype=torch.float32)
targets_test = torch.tensor(targets_test, dtype=torch.float32)

# Define a model
model = nn.Linear(4, 1) # Our model will have 4 input features and 1 output feature

# Define the loss function (criterion) and the optimizer
criterion = nn.MSELoss() # We use Mean Squared Error as our loss function
optimizer = optim.SGD(model.parameters(), lr=0.01) # We use Stochastic Gradient Descent as our optimizer

# Training loop
epochs = 100
for epoch in range(epochs):
    model.train() # Put the model in training mode
    optimizer.zero_grad() # Reset the gradients
    outputs = model(inputs_train) # Get the model's predictions
    loss = criterion(outputs, targets_train) # Calculate the loss
    loss.backward() # Compute gradients
    optimizer.step() # Update model parameters
    print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, epochs, loss.item()))

# Evaluate the model
model.eval() # Switch the model to evaluation mode
with torch.no_grad():
    predictions = model(inputs_test) # Make predictions on the test set
    test_loss = criterion(predictions, targets_test) # Calculate the test loss (in standardized units)

    # Inverse transform the predictions back to the original height scale
    predictions = target_scaler.inverse_transform(predictions.numpy())
    predictions = predictions.flatten()

    for i, prediction in enumerate(predictions):
        print(f"Sample {i+1}: {prediction}")

    print(f'Test Loss: {test_loss.item()}')
```


Epoch [1/100], Loss: 1.2304
Epoch [2/100], Loss: 1.2214
Epoch [3/100], Loss: 1.2127
Epoch [4/100], Loss: 1.2044
Epoch [5/100], Loss: 1.1964
Epoch [6/100], Loss: 1.1887
Epoch [7/100], Loss: 1.1813
Epoch [8/100], Loss: 1.1742
Epoch [9/100], Loss: 1.1673
Epoch [10/100], Loss: 1.1607
Epoch [11/100], Loss: 1.1544
Epoch [12/100], Loss: 1.1484
Epoch [13/100], Loss: 1.1425
Epoch [14/100], Loss: 1.1369
Epoch [15/100], Loss: 1.1315
Epoch [16/100], Loss: 1.1263
Epoch [17/100], Loss: 1.1214
Epoch [18/100], Loss: 1.1166
Epoch [19/100], Loss: 1.1120
Epoch [20/100], Loss: 1.1076
Epoch [21/100], Loss: 1.1033
Epoch [22/100], Loss: 1.0992
Epoch [23/100], Loss: 1.0953
Epoch [24/100], Loss: 1.0915
Epoch [25/100], Loss: 1.0879
Epoch [26/100], Loss: 1.0844
Epoch [27/100], Loss: 1.0810
Epoch [28/100], Loss: 1.0778
Epoch [29/100], Loss: 1.0747
Epoch [30/100], Loss: 1.0717
Epoch [31/100], Loss: 1.0689
Epoch [32/100], Loss: 1.0661
Epoch [33/100], Loss: 1.0635
Epoch [34/100], Loss: 1.0609
Epoch [35/100], Loss: 1