<a href="https://colab.research.google.com/github/praveenjune17/Agents/blob/main/AI_tutor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# !pip install crewai==0.28.8 crewai_tools==0.1.6 langchain_community==0.0.29 langchain-google-genai

In [2]:
#!pip install crewai crewai_tools langchain_community langchain-google-genai

In [3]:
# Warning control
import warnings
warnings.filterwarnings('ignore')

In [4]:
import os
import csv
from csv import reader
import pandas as pd
from google.colab import userdata
from langchain_google_genai import ChatGoogleGenerativeAI

os.environ['GOOGLE_API_KEY'] = userdata.get('GOOGLE_API_KEY')

In [5]:
import pathlib
import textwrap

import google.generativeai as genai
genai.configure(api_key=userdata.get('GOOGLE_API_KEY'))
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

models/gemini-1.0-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-latest
models/gemini-1.0-pro-vision-latest
models/gemini-1.5-flash
models/gemini-1.5-flash-001
models/gemini-1.5-flash-latest
models/gemini-1.5-pro
models/gemini-1.5-pro-001
models/gemini-1.5-pro-latest
models/gemini-pro
models/gemini-pro-vision


In [6]:
from crewai import Agent, Task, Crew
#from crewai_tools import ScrapeWebsiteTool, SerperDevTool

In [7]:
llm_pro = ChatGoogleGenerativeAI(model="gemini-1.5-pro")#"gemini-1.5-flash")#"gemini-pro")
llm_flash = ChatGoogleGenerativeAI(model="gemini-1.5-flash")#"gemini-1.5-flash")#"gemini-pro")
llm_base = ChatGoogleGenerativeAI(model="gemini-1.0-pro")
# scrape_tool = ScrapeWebsiteTool(["https://www.analyticsvidhya.com/blog/2022/01/introduction-to-neural-networks/",
#                                  "https://www.seas.upenn.edu/~danroth/Teaching/CS446-17/LectureNotesNew/neuralnet1/main.pdf"])

In [8]:
content_generator_agent = Agent(
    role="AI Course instructor",
    goal="Create a document that needs to be presented in an online forum of "
         "50+ students who are new to Machine learning "
         ,
    backstory="Specializing in Machine learning concepts, this agent "
              "loves creating an engaging content with the audience"
              ". With a knack for understanding the expectations of the students, "
              "the content generator agent is the cornerstone for "
              "the success of the session.",
    #verbose=True,
    llm=llm_flash,
    allow_delegation=False,
    #tools = [scrape_tool]
)

In [9]:
content_validator_agent = Agent(
    role="Course quality assurance coordinator",
    goal="Be the most friendly and helpful "
        "support representative in your team",
    backstory=
        ( "You work at a Leading Edtech company and "
          " your work specifically focuses on "
          " validating the course content and your main goal is to make "
          " sure the course provided is a quality content so you will  "
          " check the correctness of the content by validating it in the web "
          " also test the code shared by the AI Course instructor agent "
          " by creating test cases."
          " You need to make sure that you provide the best support!"
          " Make sure to provide a thorough quality test on the entire content "
          " and make no assumptions."),
    verbose=True,
    llm=llm_pro,
    allow_delegation=True,
    #tools = [scrape_tool]
)

In [10]:
teaching_assistant_agent = Agent(
    role="Teaching Assistant(TA)",
    goal="closely work with the AI Course instructor to create quizzes"
         "and assignments based on the content.",
    backstory=
        ( "You work closely with the AI Course instructor "
          "to understand the learning objectives of the course and each specific unit or topic. "
          "This ensures that the assessments align with what students are expected to learn. "),
    #verbose=True,
    llm=llm_base,
    allow_delegation=False,
    #tools = [scrape_tool]
)

In [11]:
content_generation_task = Task(
    description=(
        "Create an engaging document that should excite people who are attending the session "
        "the Session is on '{module}' and the topics to touched upon in the session are mentioned below \n"
        "{topics} "
        "The content to be generated should be in sync with the learning journey of the students "
        "and the context about the learning journey is as follows(within the single quotes) '{journey_context}' "
        "Apply interactive teaching techniques and make sure the content has some "
        "crisp theoretical information. Add relevant images to the topics and also create code snippets in python "
        "(use pytorch to code deep learning concepts) for applicable topics . "
        "In case a dataset is required then use any open source datasets within the pytorch "

    ),
    expected_output=(
        "A draft word document in which two pages are dedicated to each topic, each topic should contain "
        "the associated code if relevant"
    ),
    agent=content_generator_agent,
)

In [27]:
content_quality_assurance_check = Task(
    description=(
        "Review the response drafted by the AI Course instructor and verify the correctness of the quiz "
        "answers generated by teaching Assistant(TA) "
        "on {module} and randomly review 50% of the associated {topics}.If there are any "
        "issues identified with the content then share the feedback with the AI Course instructor and review "
        "the remaining 20% of the topics and share the feedback if any issues found with the content "
        "Ensure that the content is comprehensive, accurate, and adheres to the "
		    "high-quality standards expected for a best Edtech course.\n"
        "Verify that all parts of the contents are thoroughly test "
		    "in case you are not confident with the quality revert the draft in a helpful and friendly tone.\n"
        "Check for references and sources used to "
        " find the information, "
		    "ensuring the content is of rich quality. Please note that you should not delegate the content "
        "generated by the AI Course instructor with the teaching Assistant. "

    ),
    expected_output=(
        "A final version word document in which two pages are dedicated to each topic, "
        "each topic should contain "
        "the associated code if it is applicable to the topic ."
        "the entire content needs to be validated and in the quizzes for the entire module needs to be added "
        "along with the answers marked in bold "
    ),
    agent=content_validator_agent,
)

In [28]:
quiz_and_assignment_task = Task(
    description=(
        "TAs often write or adapt questions for quizzes based on the content shared by the AI Course instructor. "
        "This involves crafting clear, concise, and relevant questions that assess students "
        "understanding of the content by the AI Course instructor "
        "Also mark the answers for each quiz questions "
    ),
    expected_output=(
        "At the end of the content add quizzes(not more than 5) and mark its associated answers in Bold font. "
    ),
    agent=teaching_assistant_agent,
)

In [29]:
from crewai import Crew
crew = Crew(
  agents=[content_generator_agent,
          content_validator_agent,
          teaching_assistant_agent
          ],
  tasks=[content_generation_task,
         content_quality_assurance_check,
         quiz_and_assignment_task
         ],
  verbose=2,
  max_rpm = 40,
  #manager_llm=llm,
  #memory=True
)



In [30]:
# Logistic regression as a Single neuron
# The XOR problem and introduction to multi layer perceptron
# Understanding the output & Activation Functions
# Derivatives of Activation Functions
# Understanding Computational graph
# Backpropagation using computational graph
# Random initialization

topics = '''
Gradient Descent for Neural Networks
Backpropagation Algorithm
Understanding Computational graph
Backpropagation using computational graph
'''
journey_context = '''
Given the fundamental understanding of basic
regression algorithms, we will now deep dive into
the Neural Networks. We will learn the basic unit
of neural networks and will slowly learn to create
a network.
'''

In [31]:
# Example data for kicking off the process
course_content_generator_inputs = {
    'module': 'Introduction to Neural Networks',
    'topics': topics,
    'journey_context': journey_context
}

In [32]:
%%time
result = crew.kickoff(inputs=course_content_generator_inputs)

[1m[95m [DEBUG]: == Working Agent: AI Course instructor[00m
[1m[95m [INFO]: == Starting Task: Create an engaging document that should excite people who are attending the session the Session is on 'Introduction to Neural Networks' and the topics to touched upon in the session are mentioned below 

Gradient Descent for Neural Networks
Backpropagation Algorithm
Understanding Computational graph
Backpropagation using computational graph
 The content to be generated should be in sync with the learning journey of the students and the context about the learning journey is as follows(within the single quotes) '
Given the fundamental understanding of basic
regression algorithms, we will now deep dive into
the Neural Networks. We will learn the basic unit
of neural networks and will slowly learn to create
a network.
' Apply interactive teaching techniques and make sure the content has some crisp theoretical information. Add relevant images to the topics and also create code snippets in pyth



[32;1m[1;3mThought: I need to try again and make sure my Action Input is formatted correctly. Let's try this again with the proper formatting.

Action: Delegate work to co-worker
Action Input:
```tool_code
{
"coworker": "Teaching Assistant(TA)",
"task": "Please create a comprehensive quiz for the 'Introduction to Neural Networks' module. The quiz should cover all the key concepts, including: \n\n* The basic structure of a neuron\n* Different types of activation functions\n* How neural networks are organized in layers\n* The concept of gradient descent and its role in training\n* The backpropagation algorithm and its significance\n* Understanding computational graphs and their purpose\n* How to implement a simple neural network using PyTorch\n\nPlease provide the quiz questions along with their correct answers.",
"context": "We need a quiz to assess the learners' understanding of the 'Introduction to Neural Networks' module. Your expertise in creating engaging and effective assessment



[32;1m[1;3mThought: I am unable to delegate the task of creating quiz questions to the Teaching Assistant because the available tools do not allow me to receive the generated content back. I need to find a workaround to get the quiz questions.

Action: Ask question to co-worker
Action Input:
```json
{
"coworker": "Teaching Assistant(TA)",
"question": "Can you please provide a comprehensive quiz, including questions and correct answers, covering the key concepts of the 'Introduction to Neural Networks' module?",
"context": "I need a quiz to assess the learners' understanding of the 'Introduction to Neural Networks' module. The quiz should cover the following key concepts:\n\n* The basic structure of a neuron\n* Different types of activation functions\n* How neural networks are organized in layers\n* The concept of gradient descent and its role in training\n* The backpropagation algorithm and its significance\n* Understanding computational graphs and their purpose\n* How to implement a

In [14]:
#next steps
  #System-2
    #Add fall back in-case of failure
    #Presentation Specialist agent -
      #make sure the content is attractive with innovation choices
        #of font, font size and background also relevant images if applicable
        #for some topics are embedded
    #use gemini embeddings for memory=True
    #Random review of the 50% of concepts - reduce rpm
    #QA should not share the work of instructor to TA


  #System-1
    #prompt and check for correctness, code translation of the content by Agents
    #Create description / goal for agents and task

  #Make sure the output tokens are controllable
  #QA agent for content
    #quality code check agent -
  #QA agent for code
    #supply the websites to be used for scrapping
        #analytics vidhya
        #use youtube videos
  #use elevenlabs to convert the content into audio
  #Tool to export as word document or ppt
  #use Gemini 1.0 Pro Vision for generating images
  #Create a summarizer agent -optional

In [25]:
from IPython.display import Markdown
Markdown(result)

## Introduction to Neural Networks

**Welcome to the exciting world of Neural Networks!** This session will be your gateway to understanding the core concepts that power modern AI. We'll be building upon your existing knowledge of regression algorithms and diving deep into the fascinating world of Neural Networks.

**Why Neural Networks?**

Imagine a computer that learns like a human brain. That's the power of Neural Networks. They are inspired by the structure and function of the human brain, allowing computers to solve complex problems that traditional algorithms struggle with.

**Let's start with the basics:**

* **What is a Neuron?** The fundamental building block of a Neural Network. It's like a tiny computational unit that receives inputs, processes them, and produces an output. Think of it as a simple decision-maker.
* **Connecting the Dots: The Network** Neurons are interconnected in a structured way to form a network. These connections represent the "knowledge" of the network.
* **Learning by Example: Training** The magic happens during training. We feed the network with data, and it adjusts the strength of its connections to improve its ability to make accurate predictions.

**Here's a simple analogy:**

Think of a child learning to recognize a dog. They see a picture of a dog and are told "That's a dog." Over time, they see different breeds, sizes, and colors, and they learn to identify the common features of a dog. A Neural Network works similarly, learning from examples to recognize patterns and make predictions.

**Let's Dive into Some Key Concepts:**

### 1. Gradient Descent for Neural Networks

**Think of it like a treasure hunt:**

Imagine you're searching for a treasure buried on a mountain. You start at a random point and want to reach the lowest point (the treasure). Gradient Descent helps you find that path.

**How it works:**

* **The Mountain: The Error Function:** The error function represents the "height" of the mountain. It measures how well the network is performing.
* **The Steps: The Gradient:** The gradient is the direction of steepest descent on the mountain. It tells us which way to move to reduce the error.
* **The Treasure: The Optimal Weights:** The lowest point on the mountain corresponds to the set of weights that minimizes the error function.

**Visualizing Gradient Descent:**

[Image of a 3D surface with a contour plot, showing the path of gradient descent.]

**Code Example (Python):**

```python
import numpy as np
import matplotlib.pyplot as plt

# Sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([3, 5, 7, 9, 11])

# Initialize parameters
m = 0  # Slope
c = 0  # Intercept
learning_rate = 0.01
iterations = 1000

# Gradient Descent
for i in range(iterations):
    # Predictions
    y_pred = m * x + c

    # Calculate gradients
    dm = -(2/len(x)) * sum(x * (y - y_pred))
    dc = -(2/len(x)) * sum(y - y_pred)

    # Update parameters
    m = m - learning_rate * dm
    c = c - learning_rate * dc

    # Calculate and print loss every 100 iterations
    if i % 100 == 0:
        loss = (1/len(x)) * sum((y - y_pred)**2)
        print(f"Iteration {i}, Loss: {loss}")

# Plot the data and the fitted line
plt.scatter(x, y)
plt.plot(x, m * x + c, color='red')
plt.xlabel("x")
plt.ylabel("y")
plt.title("Linear Regression with Gradient Descent")
plt.show()
```

### 2. Backpropagation Algorithm

**Think of it like a chain reaction:**

Imagine a chain of dominoes, where each domino represents a neuron in the network. When we make a prediction, we start at the output neuron and work our way backward, adjusting the weights of each neuron to minimize the error.

**How it works:**

* **The dominoes: The Neurons:** Each neuron receives an input and produces an output.
* **The chain reaction: The Gradient:** The error at the output neuron is propagated backward through the network, adjusting the weights of each neuron along the way.
* **The final adjustments: The Weight Updates:** By adjusting the weights, the network learns to make more accurate predictions in the future.

**Visualizing Backpropagation:**

[Image depicting a simple neural network with arrows indicating the flow of information and gradient during backpropagation.]

**Code Example (Python):**

```python
# Simplified example of backpropagation using Python and NumPy

import numpy as np

# Define sigmoid activation function
def sigmoid(x):
  return 1 / (1 + np.exp(-x))

# Derivative of sigmoid
def sigmoid_derivative(x):
  return x * (1 - x)

# Input dataset
X = np.array([[0,0,1],
              [1,1,1],
              [1,0,1],
              [0,1,1]])

# Output dataset           
y = np.array([[0,1,1,0]]).T

# Seed random numbers to make calculation
# deterministic (just a good practice)
np.random.seed(1)

# Initialize weights randomly with mean 0
synapse_0 = 2 * np.random.random((3,1)) - 1

# Iterate over the training data
for iter in range(10000):

    # Forward propagation
    layer_0 = X
    layer_1 = sigmoid(np.dot(layer_0, synapse_0))

    # Error calculation
    layer_1_error = y - layer_1

    # Multiply the error by the input and again by the gradient of the sigmoid curve
    layer_1_delta = layer_1_error * sigmoid_derivative(layer_1)
    synapse_0_update = np.dot(layer_0.T, layer_1_delta)

    # Update weights
    synapse_0 += synapse_0_update

print("Output After Training:")
print(layer_1)
```

### 3. Understanding Computational Graph

**Think of it like a blueprint:**

A computational graph is a visual representation of how the computations in a Neural Network are organized. It helps us understand the flow of data and gradients through the network.

**How it works:**

* **Nodes: The Operations:** Each node in the graph represents a mathematical operation, such as addition, multiplication, or activation function.
* **Edges: The Data Flow:** The edges connecting the nodes represent the flow of data and gradients between the operations.

**Visualizing a Computational Graph:**

[Image depicting a simple computational graph for a single layer of a neural network, showing the input, weights, activation function, and output.]

### 4. Backpropagation Using Computational Graph

**Think of it like tracing the path:**

By using the computational graph, we can trace the path of the gradient backward through the network, adjusting the weights of each operation to minimize the error.

**How it works:**

* **Start at the output:** We start at the output node of the graph, where we have the error value.
* **Follow the edges:** We follow the edges backward, applying the chain rule of calculus to calculate the gradient at each node.
* **Adjust the weights:** We use the calculated gradients to update the weights of each operation, moving towards the optimal solution.

**Visualizing Backpropagation with a Computational Graph:**

[Image depicting a computational graph with arrows indicating the flow of gradients during backpropagation.]

**Code Example (Python):**

```python
# While this code doesn't explicitly define a computational graph, 
# frameworks like TensorFlow and PyTorch automatically build and 
# utilize them under the hood during backpropagation. 

# The previous example with the sigmoid function and its derivative
# demonstrates the core concepts of backpropagation, which are 
# inherently tied to how computational graphs work.
```

**Quiz Questions:**

1.  **What is the fundamental building block of a neural network?**
    *   (A) Activation Function
    *   (B) Neuron
    *   (C) Weight
    *   (D) Bias

2.  **What does Gradient Descent aim to minimize?**
    *   (A) Accuracy
    *   (B) Complexity
    *   (C) Error Function
    *   (D) Number of layers

3.  **What is the role of the learning rate in Gradient Descent?**
    *   (A) Determines the size of steps taken towards the minimum.
    *   (B) Controls the number of hidden layers.
    *   (C) Defines the activation function.
    *   (D) Regularizes the model to prevent overfitting.

4.  **What does Backpropagation calculate?**
    *   (A) The output of the neural network.
    *   (B) The gradients of the error function with respect to the weights.
    *   (C) The optimal number

In [None]:
# from crewai import Crew, Process


# # Define the crew with agents and tasks
# financial_trading_crew = Crew(
#     agents=[content_generator_agent,
#             # trading_strategy_agent,
#             # execution_agent,
#             # risk_management_agent
#             ],

#     tasks=[content_generation_task,
#           #  strategy_development_task,
#           #  execution_planning_task,
#           #  risk_assessment_task
#            ],

#     manager_llm=llm,
#     process=Process.hierarchical,
#     verbose=True
# )

In [29]:
from IPython.display import Markdown
Markdown(result)

## Applied Deep Learning with PyTorch

### Session Objectives

* Understand the basics of PyTorch and its applications in deep learning
* Learn how to work with tensors and datasets in PyTorch
* Implement linear regression models in PyTorch
* Explore multiple input/output linear regression models
* Implement softmax regression for multi-class classification
* Build and train shallow neural networks
* Understand the importance of data splitting and validation
* Analyze bias and variance in deep learning models
* Learn about overfitting and how to prevent it
* Explore regularization techniques like dropout

### PyTorch Basics

**What is PyTorch?**

PyTorch is an open-source machine learning library developed by Facebook's AI Research lab. It's a powerful and flexible framework that has gained immense popularity in recent years. PyTorch is primarily used for:

* **Tensor computations:** PyTorch provides efficient tools for performing numerical computations on tensors, which are multi-dimensional arrays.
* **Deep learning:** PyTorch offers a rich set of tools and APIs for building, training, and deploying deep learning models.
* **Dynamic computational graphs:** Unlike some other frameworks, PyTorch allows you to define and modify computational graphs dynamically, making it easier to experiment with complex models.

**Why PyTorch?**

* **Ease of Use:** PyTorch has a simple and intuitive API that makes it easy to get started with deep learning.
* **Flexibility:** PyTorch allows for dynamic computational graphs, making it highly adaptable to different tasks and architectures.
* **Community Support:** PyTorch has a large and active community, providing ample resources and support.
* **Production Deployment:** PyTorch can be deployed for production use cases, making it suitable for real-world applications.

**Getting Started with PyTorch**

```python
import torch

# Create a tensor
x = torch.tensor([1, 2, 3])

# Print the tensor
print(x)

# Perform tensor operations
y = x + 2
print(y)
```

**Output:**

```
tensor([1, 2, 3])
tensor([3, 4, 5])
```

### Tensors and Datasets in PyTorch

**Tensors**

Tensors are the fundamental data structure in PyTorch. They represent multi-dimensional arrays and are used to store and manipulate data in deep learning models.

**Creating Tensors**

```python
# Create a tensor from a list
x = torch.tensor([1, 2, 3])

# Create a tensor of zeros
y = torch.zeros(3)

# Create a tensor of ones
z = torch.ones(3)

# Create a tensor with random values
a = torch.rand(3)
```

**Tensor Operations**

PyTorch supports a wide range of tensor operations, including:

* **Arithmetic operations:** +, -, *, /, %, **, //
* **Matrix operations:** dot product, transpose, inverse
* **Indexing and slicing**
* **Reshaping and concatenation**
* **Broadcasting**

**Datasets**

Datasets are used to load and manage data for training deep learning models. PyTorch provides various ways to load and process datasets, including:

* **Loading from files:** PyTorch supports loading data from files in various formats like CSV, JSON, and image files.
* **Using built-in datasets:** PyTorch offers a collection of built-in datasets like MNIST, CIFAR-10, and ImageNet.
* **Creating custom datasets:** You can create custom datasets by inheriting from the `torch.utils.data.Dataset` class.

**Example: Loading and Processing MNIST Dataset**

```python
from torchvision import datasets, transforms

# Load MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())

# Create data loaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)

# Iterate over training data
for batch_idx, (data, target) in enumerate(train_loader):
    # Process the data and target
    print(data.shape)
    print(target.shape)
    break
```

**Output:**

```
torch.Size([64, 1, 28, 28])
torch.Size([64])
```

### Linear Regression in PyTorch

**Linear Regression Model**

Linear regression is a fundamental supervised learning algorithm used to predict a continuous target variable based on one or more independent variables. 

**Model Definition**

```python
import torch.nn as nn

class LinearRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        out = self.linear(x)
        return out
```

**Training the Model**

```python
# Load the dataset
from sklearn.datasets import load_boston
boston = load_boston()
X = torch.tensor(boston.data, dtype=torch.float)
y = torch.tensor(boston.target, dtype=torch.float)

# Define the model
model = LinearRegression(input_size=X.shape[1], output_size=1)

# Define the loss function and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(100):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y.view(-1, 1))

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 10 epochs
    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}, Loss: {loss.item():.4f}')
```

**Making Predictions**

```python
# Make predictions on new data
new_data = torch.tensor([[6.3200, 18.00, 2.310, 0.0, 0.5380, 6.5750, 65.20, 4.0900, 1.0, 296.0, 15.30, 396.90, 4.98]])
predictions = model(new_data)
print(f'Prediction: {predictions.item():.4f}')
```

### Multiple Input Output Linear Regression

**Model Definition**

```python
import torch.nn as nn

class MultiOutputLinearRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super(MultiOutputLinearRegression, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        out = self.linear(x)
        return out
```

**Training the Model**

```python
# Load the dataset
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=5, n_targets=3, random_state=42)
X = torch.tensor(X, dtype=torch.float)
y = torch.tensor(y, dtype=torch.float)

# Define the model
model = MultiOutputLinearRegression(input_size=X.shape[1], output_size=y.shape[1])

# Define the loss function and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(100):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 10 epochs
    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}, Loss: {loss.item():.4f}')
```

**Making Predictions**

```python
# Make predictions on new data
new_data = torch.tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])
predictions = model(new_data)
print(f'Predictions: {predictions.tolist()}')
```

### Softmax Regression

**Softmax Regression Model**

Softmax regression, also known as multinomial logistic regression, is a generalization of logistic regression used for multi-class classification problems. It predicts the probability of an input belonging to each class.

**Model Definition**

```python
import torch.nn as nn

class SoftmaxRegression(nn.Module):
    def __init__(self, input_size, num_classes):
        super(SoftmaxRegression, self).__init__()
        self.linear = nn.Linear(input_size, num_classes)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        out = self.linear(x)
        out = self.softmax(out)
        return out
```

**Training the Model**

```python
# Load the dataset
from sklearn.datasets import load_iris
iris = load_iris()
X = torch.tensor(iris.data, dtype=torch.float)
y = torch.tensor(iris.target, dtype=torch.long)

# Define the model
model = SoftmaxRegression(input_size=X.shape[1], num_classes=3)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(100):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 10 epochs
    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}, Loss: {loss.item():.4f}')
```

**Making Predictions**

```python
# Make predictions on new data
new_data = torch.tensor([[5.1, 3.5, 1.4, 0.2]])
predictions = model(new_data)
print(f'Predictions: {predictions.tolist()}')
```

### Shallow Neural Networks

**Shallow Neural Network Model**

Shallow neural networks are neural networks with a single hidden layer, making them simpler than deep neural networks but still capable of learning complex patterns.

**Model Definition**

```python
import torch.nn as nn

class ShallowNeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(ShallowNeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out
```

**Training the Model**

```python
# Load the dataset
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=100, n_features=5, n_classes=3, random_state=42)
X = torch.tensor(X, dtype=torch.float)
y = torch.tensor(y, dtype=torch.long)

# Define the model
model = ShallowNeuralNetwork(input_size=X.shape[1], hidden_size=10, output_size=3)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(100):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 10 epochs
    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}, Loss: {loss.item():.4f}')
```

**Making Predictions**

```python
# Make predictions on new data
new_data = torch.tensor([[1.0, 2.0, 3.0, 4.0, 5.0]])
predictions = model(new_data)
print(f'Predictions: {predictions.tolist()}')
```

### Splitting the Data (Train/Test/Dev)

**Data Splitting**

Splitting the dataset into training, validation (or development), and testing sets is crucial for building robust deep learning models. This allows us to:

* **Train:** Train the model on the training set to learn the patterns in the data.
* **Validate:** Evaluate the model's performance on the validation set during training to adjust hyperparameters and prevent overfitting.
* **Test:** Evaluate the model's final performance on the testing set, which was not seen during training.

**Example: Data Splitting for MNIST Dataset**

```python
from torchvision import datasets, transforms
from torch.utils.data import random_split

# Load MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())

# Split the data into training, validation, and testing sets
train_size = int(0.8 * len(train_dataset))
val_size = int(0.1 * len(train_dataset))
test_size = len(train_dataset) - train_size - val_size

train_dataset, val_dataset, test_dataset = random_split(train_dataset, [train_size, val_size, test_size])

# Create data loaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=64, shuffle=False)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)
```

### Understanding Bias and Variance

**Bias and Variance Trade-off**

In machine learning, bias and variance are two important concepts that affect the performance of models.

* **Bias:** Bias refers to the error introduced by approximating a complex real-world relationship with a simpler model. High bias models are underfitted and may not capture the underlying patterns in the data.
* **Variance:** Variance refers to the sensitivity of the model to changes in the training data. High variance models are overfitted and may perform poorly on unseen data.

**The Trade-off:** There is a trade-off between bias and variance. Increasing model complexity typically reduces bias but increases variance, and vice versa. The goal is to find a model that balances these two factors to achieve optimal performance.

**Example: Bias and Variance in Linear Regression**

* **High Bias:** A linear regression model with a single feature may have high bias, as it may not capture the complex relationship between the input and output.
* **High Variance:** A linear regression model with a large number of features may have high variance, as it may be overfitting to the training data.

### Understanding Overfitting

**Overfitting**

Overfitting occurs when a model learns the training data too well, including its noise and random fluctuations. This leads to poor generalization performance on unseen data.

**Causes of Overfitting:**

* **High model complexity:** Complex models with many parameters are more prone to overfitting.
* **Insufficient data:** When the training data is limited, the model may learn the specific characteristics of the training examples rather than general patterns.
* **Poor regularization:** Regularization techniques help prevent overfitting by penalizing complex models.

**Signs of Overfitting:**

* **High training accuracy but low validation accuracy.**
* **The model performs well on the training data but poorly on the testing data.**

### Using Regularization

**Regularization**

Regularization is a technique used to prevent overfitting by adding a penalty term to the loss function. This penalty discourages the model from assigning large weights to individual features, promoting simpler and more generalizable models.

**Common Regularization Techniques:**

* **L1 Regularization:** Adds a penalty proportional to the absolute value of the weights.
* **L2 Regularization:** Adds a penalty proportional to the square of the weights.
* **Dropout:** Randomly drops out units (neurons) during training, forcing the model to rely on other units and preventing over-reliance on specific features.

**Example: Using L2 Regularization in Linear Regression**

```python
import torch.nn as nn

class LinearRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        out = self.linear(x)
        return out

# Load the dataset
from sklearn.datasets import load_boston
boston = load_boston()
X = torch.tensor(boston.data, dtype=torch.float)
y = torch.tensor(boston.target, dtype=torch.float)

# Define the model
model = LinearRegression(input_size=X.shape[1], output_size=1)

# Define the loss function and optimizer with L2 regularization
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.01)

# Train the model
for epoch in range(100):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y.view(-1, 1))

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 10 epochs
    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}, Loss: {loss.item():.4f}')
```

### Regularization Techniques (like dropout)

**Dropout**

Dropout is a regularization technique that randomly drops out units (neurons) during training. This prevents the model from becoming too reliant on specific features and encourages it to learn more robust representations.

**How Dropout Works:**

During training, each unit has a probability of being dropped out. This means that the unit's output is set to zero, effectively removing it from the network. The dropout probability is typically set to 0.5.

**Advantages of Dropout:**

* **Reduces overfitting:** By randomly dropping out units, dropout prevents the model from memorizing the training data.
* **Encourages robust representations:** The model learns to rely on multiple units, making it more robust to noise and variations in the input data.

**Example: Using Dropout in a Shallow Neural Network**

```python
import torch.nn as nn

class ShallowNeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(ShallowNeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=0.5)
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.dropout(out)
        out = self.fc2(out)
        return out

# Load the dataset
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=100, n_features=5, n_classes=3, random_state=42)
X = torch.tensor(X, dtype=torch.float)
y = torch.tensor(y, dtype=torch.long)

# Define the model
model = ShallowNeuralNetwork(input_size=X.shape[1], hidden_size=10, output_size=3)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(100):
    # Forward pass
    outputs = model(X)
    loss = criterion(outputs, y)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 10 epochs
    if (epoch+1) % 10 == 0:
        print(f'Epoch {epoch+1}, Loss: {loss.item():.4f}')
```

**Note:** The code snippets provided in this document are for illustrative purposes. You may need to modify them based on your specific dataset and model architecture.

In [16]:
from IPython.display import Markdown
Markdown(result)

# Applied Deep Learning with PyTorch

## Introduction

PyTorch is a popular deep learning framework that is known for its flexibility and ease of use. It is widely used in a variety of applications, including computer vision, natural language processing, and speech recognition.

In this tutorial, we will introduce the basics of PyTorch and show you how to use it to build a variety of deep learning models. We will cover the following topics:

* PyTorch basics
* Tensor and Datasets in PyTorch
* Linear Regression in PyTorch
* Multiple Input Output Linear Regression
* Softmax Regression
* Shallow Neural Networks
* Splitting the data (train/test/dev)
* Understanding Bias and Variance
* Understanding overfitting
* Using regularization
* Regularization techniques (like dropout)

## PyTorch basics

PyTorch is a Python-based deep learning framework that is built on the Torch library. It is designed to be flexible and easy to use, and it provides a wide range of features for building and training deep learning models.

One of the key features of PyTorch is its support for dynamic computation graphs. This means that you can define your model as a series of operations that are executed on the fly. This gives you a lot of flexibility in how you build your models, and it allows you to experiment with different architectures easily.

## Tensor and Datasets in PyTorch

A tensor is a multi-dimensional array that is used to represent data in PyTorch. Tensors can be of any shape or size, and they can be used to represent a variety of data types, including images, text, and audio.

PyTorch provides a number of built-in datasets that you can use to train your models. These datasets include the MNIST dataset of handwritten digits, the CIFAR-10 dataset of small images, and the ImageNet dataset of large images.

## Linear Regression in PyTorch

Linear regression is a simple but powerful machine learning algorithm that can be used to predict a continuous value from a set of input features. In PyTorch, you can implement linear regression using the `nn.Linear` module.

The following code shows you how to implement linear regression in PyTorch:

```python
import torch
import torch.nn as nn

# Create a linear regression model
model = nn.Linear(1, 1)

# Define the loss function
loss_fn = nn.MSELoss()

# Define the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(1000):
    # Forward pass
    y_pred = model(x)

    # Compute the loss
    loss = loss_fn(y_pred, y)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()

    # Update the weights
    optimizer.step()
```

## Multiple Input Output Linear Regression

Multiple input output linear regression is a generalization of linear regression that can be used to predict multiple continuous values from a set of input features. In PyTorch, you can implement multiple input output linear regression using the `nn.Linear` module.

The following code shows you how to implement multiple input output linear regression in PyTorch:

```python
import torch
import torch.nn as nn

# Create a multiple input output linear regression model
model = nn.Linear(2, 2)

# Define the loss function
loss_fn = nn.MSELoss()

# Define the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(1000):
    # Forward pass
    y_pred = model(x)

    # Compute the loss
    loss = loss_fn(y_pred, y)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()

    # Update the weights
    optimizer.step()
```

## Softmax Regression

Softmax regression is a type of logistic regression that is used to predict the probability of a data point belonging to a particular class. In PyTorch, you can implement softmax regression using the `nn.Softmax` module.

The following code shows you how to implement softmax regression in PyTorch:

```python
import torch
import torch.nn as nn

# Create a softmax regression model
model = nn.Softmax(dim=1)

# Define the loss function
loss_fn = nn.CrossEntropyLoss()

# Define the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(1000):
    # Forward pass
    y_pred = model(x)

    # Compute the loss
    loss = loss_fn(y_pred, y)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()

    # Update the weights
    optimizer.step()
```

## Shallow Neural Networks

Shallow neural networks are a type of deep learning model that consists of a few layers of neurons. Shallow neural networks are often used for simple tasks, such as image classification and natural language processing.

In PyTorch, you can implement shallow neural networks using the `nn.Sequential` module.

The following code shows you how to implement a shallow neural network in PyTorch:

```python
import torch
import torch.nn as nn

# Create a shallow neural network
model = nn.Sequential(
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
    nn.Softmax(dim=1)
)

# Define the loss function
loss_fn = nn.CrossEntropyLoss()

# Define the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Train the model
for epoch in range(1000):
    # Forward pass
    y_pred = model(x)

    # Compute the loss
    loss = loss_fn(y_pred, y)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()

    # Update the weights
    optimizer.step()
```

## Splitting the data (train/test/dev)

When training a machine learning model, it is important to split the data into training, testing, and development sets. The training set is used to train the model, the testing set is used to evaluate the model's performance, and the development set is used to fine-tune the model's hyperparameters.

In PyTorch, you can use the `sklearn.model_selection.train_test_split` function to split the data into training, testing, and development sets.

The following code shows you how to split the data into training, testing, and development sets in PyTorch:

```python
from sklearn.model_selection import train_test_split

# Split the data into training, testing, and development sets
x_train, x_test, x_dev, y_train, y_test, y_dev = train_test_split(x, y, test_size=0.2, random_state=42)
```

## Understanding Bias and Variance

Bias and variance are two important concepts in machine learning. Bias is the systematic error that is introduced by a model, while variance is the random error that is introduced by a model.

Bias can be reduced by increasing the size of the training set, while variance can be reduced by increasing the complexity of the model.

## Understanding overfitting

Overfitting occurs when a model is too complex and it learns the training data too well. This can lead to poor performance on new data.

Overfitting can be reduced by using regularization techniques, such as dropout and early stopping.

## Using regularization

Regularization is a technique that is used to reduce overfitting. Regularization techniques penalize the model for making complex predictions.

Some common regularization techniques include:

* L1 regularization
* L2 regularization
* Dropout
* Early stopping

## Regularization techniques (like dropout)

Dropout is a regularization technique that is used to reduce overfitting. Dropout randomly drops out neurons during training, which helps to prevent the model from overfitting to the training data.

The following code shows you how to use dropout in PyTorch:

```python
import torch
import torch.nn as nn

# Create a dropout layer
dropout = nn.Dropout(p=0.5)

# Add the dropout layer to the model
model.add_module('dropout', dropout)
```

## Conclusion

In this tutorial, we have introduced the basics of PyTorch and shown you how to use it to build a variety of deep learning models. We have also discussed some important concepts in machine learning, such as bias, variance, overfitting, and regularization.

We encourage you to experiment with PyTorch and to build your own deep learning models. With a little practice, you will be able to build powerful models that can solve a variety of real-world problems.