# Introduction to Deep Learning

## Basics of Neural Networks
A neural network is a model built from layers of interconnected units (neurons) that learns underlying relationships in a set of data, loosely inspired by the way the human brain operates. Because its weights are learned from data rather than hand-coded, a network can adapt to changing input by retraining, without needing to redesign the output criteria.

**When to Use**: Neural networks are used for a wide range of tasks, including image and speech recognition, medical diagnosis, and financial forecasting.

## Activation Functions
An activation function is applied to a neuron's pre-activation (the weighted sum of its inputs plus a bias) and determines how strongly the neuron fires. Its purpose is to introduce non-linearity into the output of a neuron; without it, a stack of layers would collapse into a single linear transformation.

**Common Activation Functions**:
- **ReLU (Rectified Linear Unit)**: Outputs max(0, x). It is cheap to compute and helps mitigate the vanishing gradient problem.
- **Sigmoid**: Maps input values to a range between 0 and 1.
- **Tanh**: Maps input values to a range between -1 and 1.

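**When to Use**: Activation functions are used in the hidden layers of a neural network to introduce non-linearity, which allows the network to model complex relationships. The output layer usually gets an activation matched to the task, such as softmax for multi-class classification.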
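
As a minimal sketch of the functions listed above, they can be written in a few lines of NumPy and applied to a single neuron. The helper names (`relu`, `sigmoid`, `tanh`, `neuron_output`) and the example inputs, weights, and bias are illustrative assumptions, not part of any library.

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: keeps positive values, replaces negatives with 0."""
    return np.maximum(0.0, z)

def sigmoid(z):
    """Logistic sigmoid: squashes any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Hyperbolic tangent: squashes any real value into the open interval (-1, 1)."""
    return np.tanh(z)

def neuron_output(x, w, b, activation):
    """Compute activation(w . x + b) for a single neuron."""
    z = np.dot(w, x) + b  # pre-activation: weighted sum plus bias
    return activation(z)

# Made-up inputs, weights, and bias, chosen only for illustration
x = np.array([1.0, -2.0, 0.5])
w = np.array([0.4, 0.3, -0.2])
b = 0.1

for name, fn in [("relu", relu), ("sigmoid", sigmoid), ("tanh", tanh)]:
    print(name, neuron_output(x, w, b, fn))
```

With these values the pre-activation is negative, so ReLU outputs 0, while sigmoid returns a value below 0.5 and tanh a value below 0.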

## Backpropagation Algorithm
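Backpropagation is the heart of neural network training. It computes the gradient of the loss with respect to every weight by applying the chain rule backward, from the output layer to the input layer; an optimizer then uses these gradients to fine-tune the weights. This is repeated for every batch of training data, usually many times per epoch (one full pass over the training set).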

**When to Use**: Backpropagation is used during the training phase of the neural network. It provides the gradients that the optimizer needs to adjust the weights and reduce the error.
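
To make the chain-rule idea concrete, here is a hedged sketch of a backward pass written by hand in NumPy for a tiny network with one tanh hidden layer and a linear output, trained with mean squared error. The data is random, and every name (`forward`, `backward`, `mse`, the layer sizes, the learning rate) is an assumption made for illustration; frameworks such as Keras compute these gradients automatically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny illustrative problem: 4 samples, 3 input features, 1 regression target
X = rng.normal(size=(4, 3))
y = rng.normal(size=(4, 1))

# One hidden layer with 5 tanh units, followed by a single linear output unit
W1 = rng.normal(scale=0.5, size=(3, 5))
b1 = np.zeros(5)
W2 = rng.normal(scale=0.5, size=(5, 1))
b2 = np.zeros(1)

def forward(X, W1, b1, W2, b2):
    """Forward pass: returns the hidden activations and the predictions."""
    h = np.tanh(X @ W1 + b1)
    y_hat = h @ W2 + b2
    return h, y_hat

def mse(y_hat, y):
    """Mean squared error averaged over all samples."""
    return np.mean((y_hat - y) ** 2)

def backward(X, y, W1, b1, W2, b2):
    """Backward pass: apply the chain rule layer by layer, output to input."""
    h, y_hat = forward(X, W1, b1, W2, b2)
    n = X.shape[0]
    d_yhat = 2.0 * (y_hat - y) / n   # dL/dy_hat
    dW2 = h.T @ d_yhat               # dL/dW2
    db2 = d_yhat.sum(axis=0)         # dL/db2
    d_h = d_yhat @ W2.T              # error propagated back to the hidden layer
    d_z1 = d_h * (1.0 - h ** 2)      # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_z1                 # dL/dW1
    db1 = d_z1.sum(axis=0)           # dL/db1
    return dW1, db1, dW2, db2

# One gradient-descent step using the backpropagated gradients
learning_rate = 0.01
dW1, db1, dW2, db2 = backward(X, y, W1, b1, W2, b2)
W1_new, b1_new = W1 - learning_rate * dW1, b1 - learning_rate * db1
W2_new, b2_new = W2 - learning_rate * dW2, b2 - learning_rate * db2

print("loss before:", mse(forward(X, W1, b1, W2, b2)[1], y))
print("loss after: ", mse(forward(X, W1_new, b1_new, W2_new, b2_new)[1], y))
```

For a sufficiently small learning rate, the loss after the step should be lower than before. A common sanity check for a hand-written backward pass is to compare one of its gradients against a finite-difference estimate of the same derivative.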

## Optimization Techniques
Optimization algorithms adjust the trainable parameters of the neural network, such as its weights and biases, to reduce the loss. Some of them also adapt the effective learning rate of each parameter as training progresses.

**Common Optimization Techniques**:
- **Gradient Descent**: The simplest optimization algorithm. It repeatedly moves the weights a small step in the direction opposite to the gradient of the cost function.
- **Adam (Adaptive Moment Estimation)**: Combines the advantages of two other extensions of gradient descent: AdaGrad and RMSProp.
- **RMSprop**: Divides the learning rate for a weight by the square root of a running average of the squared recent gradients for that weight.

**When to Use**: Optimization techniques are used during the training process to efficiently converge to the minimum of the cost function.
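
The update rules of the three optimizers above can be sketched in NumPy on a toy problem. This is a simplified illustration, not the Keras implementation: the quadratic cost, the `target` vector, the function names, and the hyperparameter defaults are all assumptions chosen so the example is easy to follow.

```python
import numpy as np

# Toy cost f(w) = ||w - target||^2, whose minimum is at w == target
target = np.array([3.0, -1.0])

def grad(w):
    """Gradient of the toy cost with respect to w."""
    return 2.0 * (w - target)

def gradient_descent(w, steps=1000, lr=0.1):
    """Plain gradient descent: step against the gradient with a fixed learning rate."""
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def rmsprop(w, steps=1000, lr=0.01, rho=0.9, eps=1e-8):
    """RMSprop: scale each step by a running average of squared gradients."""
    avg_sq = np.zeros_like(w)
    for _ in range(steps):
        g = grad(w)
        avg_sq = rho * avg_sq + (1.0 - rho) * g ** 2
        w = w - lr * g / (np.sqrt(avg_sq) + eps)
    return w

def adam(w, steps=1000, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: running averages of the gradient and its square, with bias correction."""
    m = np.zeros_like(w)  # first moment (mean of gradients)
    v = np.zeros_like(w)  # second moment (mean of squared gradients)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g ** 2
        m_hat = m / (1.0 - beta1 ** t)  # correct the bias toward zero in early steps
        v_hat = v / (1.0 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w0 = np.zeros(2)
for name, fn in [("gradient descent", gradient_descent), ("rmsprop", rmsprop), ("adam", adam)]:
    print(name, fn(w0))
```

All three should end close to `target`. On a problem this simple the differences are small; adaptive methods such as RMSprop and Adam tend to matter more when gradients differ widely in scale across parameters.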



In [None]:
import pandas as pd
import io
from sklearn.datasets import load_iris, load_wine, fetch_openml, load_breast_cancer
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from IPython.display import display, Markdown
import ipywidgets as widgets
import matplotlib.pyplot as plt
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import classification_report

# Global variable to store dataset
data = pd.DataFrame()

# Step 1: Data Collection
# Create a file upload widget
upload_button = widgets.FileUpload(description="Upload CSV", accept='.csv')

# Create a dropdown for selecting built-in datasets
builtin_datasets_dropdown = widgets.Dropdown(
    options=['Select Dataset', 'Iris', 'Wine', 'California Housing', 'Breast Cancer', 'Diabetes'],
    description='Dataset:',
    disabled=False,
)

# Function to load built-in datasets
def load_builtin_dataset(change):
    global data
    dataset_name = builtin_datasets_dropdown.value
    if dataset_name == 'Iris':
        data = load_iris(as_frame=True).frame
    elif dataset_name == 'Wine':
        data = load_wine(as_frame=True).frame
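    elif dataset_name == 'California Housing':
        # Use scikit-learn's dedicated California Housing loader (downloads the data on first use)
        from sklearn.datasets import fetch_california_housing
        data = fetch_california_housing(as_frame=True).frame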
    elif dataset_name == 'Diabetes':
        data = fetch_openml(name='diabetes', version=1, as_frame=True).frame
    elif dataset_name == 'Breast Cancer':
        data = load_breast_cancer(as_frame=True).frame

    if dataset_name != 'Select Dataset':
        display(Markdown("### Dataset Information"))
        display(Markdown(f"**Number of instances:** {data.shape[0]}"))
        display(Markdown(f"**Number of features:** {data.shape[1]}"))
        display(Markdown("### First 5 Rows of the Dataset"))
        display(data.head())
        target_input.options = data.columns.tolist()
        target_input.value = data.columns[-1]
        feature_dropdown.options = data.columns[:-1].tolist()
        columns_to_drop.options = data.columns.tolist()
        columns_to_fill.options = data.columns[data.isnull().any()].tolist()
        categorical_columns.options = data.select_dtypes(include=['object', 'category']).columns.tolist()

builtin_datasets_dropdown.observe(load_builtin_dataset, names='value')

# Function to load the dataset from upload button
def load_dataset(change):
    global data
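    uploaded_file = upload_button.value
    if uploaded_file:
        # ipywidgets 7 exposes a dict keyed by filename; ipywidgets 8 exposes a tuple of dicts
        if isinstance(uploaded_file, dict):
            file_info = next(iter(uploaded_file.values()))
        else:
            file_info = uploaded_file[0]
        data = pd.read_csv(io.BytesIO(file_info['content']))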
        display(Markdown("### Dataset Information"))
        display(Markdown(f"**Number of instances:** {data.shape[0]}"))
        display(Markdown(f"**Number of features:** {data.shape[1]}"))
        display(Markdown("### First 5 Rows of the Dataset"))
        display(data.head())
        target_input.options = data.columns.tolist()
        target_input.value = data.columns[-1]
        feature_dropdown.options = data.columns[:-1].tolist()
        columns_to_drop.options = data.columns.tolist()
        columns_to_fill.options = data.columns[data.isnull().any()].tolist()
        categorical_columns.options = data.select_dtypes(include=['object', 'category']).columns.tolist()

# Attach the load_dataset function to the file upload button
upload_button.observe(load_dataset, names='value')

# Display the upload button and built-in dataset dropdown
display(Markdown("## Step 1: Data Collection"))
display(widgets.HBox([upload_button, builtin_datasets_dropdown]))

# Step 2: Data Preprocessing
# Create a dropdown widget for selecting the feature to visualize and use for modeling
feature_dropdown = widgets.Dropdown(
    options=[],
    description='Feature:',
    disabled=False,
)

# Create a dropdown widget for the target column name (auto-filled after loading data)
target_input = widgets.Dropdown(
    options=[],
    description='Target:',
    disabled=False,
)

# Create a dropdown widget for selecting the preprocessing method
preprocess_dropdown = widgets.Dropdown(
    options=['None', 'Standard Scaler', 'Min-Max Scaler', 'Robust Scaler'],
    value='None',
    description='Preprocess:',
    disabled=False,
)

# Create a widget to select columns to drop
columns_to_drop = widgets.SelectMultiple(
    options=[],
    description='Drop Columns:',
    disabled=False,
)

# Create a widget to select columns to fill missing values
columns_to_fill = widgets.SelectMultiple(
    options=[],
    description='Fill Columns:',
    disabled=False,
)

# Create a dropdown for filling method
fill_method_dropdown = widgets.Dropdown(
    options=['Mean', 'Median', 'Mode'],
    value='Mean',
    description='Fill Method:',
    disabled=False,
)

# Create a widget to select categorical columns
categorical_columns = widgets.SelectMultiple(
    options=[],
    description='Categorical Columns:',
    disabled=False,
)

# Create a dropdown for selecting the encoding method
encoding_method_dropdown = widgets.Dropdown(
    options=['Label Encoding', 'One-Hot Encoding'],
    value='Label Encoding',
    description='Encoding Method:',
    disabled=False,
)

# Display the preprocessing widgets
display(Markdown("## Step 2: Data Preprocessing"))
display(feature_dropdown)
display(target_input)
display(preprocess_dropdown)
display(columns_to_drop)
display(columns_to_fill)
display(fill_method_dropdown)
display(categorical_columns)
display(encoding_method_dropdown)

# Function to handle categorical data
def handle_categorical_data(b):
    global data
    categorical_cols = list(categorical_columns.value)
    if categorical_cols:
        if encoding_method_dropdown.value == 'Label Encoding':
            label_encoder = LabelEncoder()
            for col in categorical_cols:
                data[col] = label_encoder.fit_transform(data[col])
        elif encoding_method_dropdown.value == 'One-Hot Encoding':
            data = pd.get_dummies(data, columns=categorical_cols, drop_first=True)
        display(Markdown("### Handled Categorical Data"))
        display(data.head())

handle_categorical_data_button = widgets.Button(description="Handle Categorical Data")
handle_categorical_data_button.on_click(handle_categorical_data)
display(handle_categorical_data_button)

# Function to fill missing values
def fill_missing_values(b):
    global data
    if columns_to_fill.value:
        for column in columns_to_fill.value:
            # Assign the result back instead of calling fillna(inplace=True) on a column
            # selection, which may operate on a copy and leave `data` unchanged
            if fill_method_dropdown.value == 'Mean':
                data[column] = data[column].fillna(data[column].mean())
            elif fill_method_dropdown.value == 'Median':
                data[column] = data[column].fillna(data[column].median())
            elif fill_method_dropdown.value == 'Mode':
                data[column] = data[column].fillna(data[column].mode()[0])
        display(Markdown("### Filled Missing Values"))
        display(data.head())

fill_missing_values_button = widgets.Button(description="Fill Missing Values")
fill_missing_values_button.on_click(fill_missing_values)
display(fill_missing_values_button)

# Function to drop selected columns
def drop_columns(b):
    global data
    if columns_to_drop.value:
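        # The widget value is a tuple, which pandas would treat as a single label, so convert it to a list
        data.drop(columns=list(columns_to_drop.value), inplace=True)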
        display(Markdown("### Dropped Selected Columns"))
        display(data.head())

drop_columns_button = widgets.Button(description="Drop Columns")
drop_columns_button.on_click(drop_columns)
display(drop_columns_button)

# Function to apply scaler
def apply_scaler(b):
    global data
    scaler = None
    if preprocess_dropdown.value == 'Standard Scaler':
        scaler = StandardScaler()
    elif preprocess_dropdown.value == 'Min-Max Scaler':
        scaler = MinMaxScaler()
    elif preprocess_dropdown.value == 'Robust Scaler':
        scaler = RobustScaler()

    if scaler:
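        # Scale only the numeric feature columns; leave the target untouched so class labels stay intact
        numerical_cols = [col for col in data.select_dtypes(include=[np.number]).columns if col != target_input.value]
        data[numerical_cols] = scaler.fit_transform(data[numerical_cols])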
        display(Markdown("### Applied Scaler"))
        display(data.head())

apply_scaler_button = widgets.Button(description="Apply Scaler")
apply_scaler_button.on_click(apply_scaler)
display(apply_scaler_button)

# Step 3: Train-Test Split
def train_test_split_data(b):
    global data, X_train, X_test, y_train, y_test
    X = data.drop(columns=[target_input.value])
    y = data[target_input.value]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    display(Markdown(f"### Data Split\n- Train shape: {X_train.shape}\n- Test shape: {X_test.shape}"))

split_button = widgets.Button(description="Train-Test Split")
split_button.on_click(train_test_split_data)
display(Markdown("## Step 3: Train-Test Split"))
display(split_button)

# Step 4: Model Training
layer_count = widgets.IntSlider(value=3, min=2, max=10, step=1, description='Number of Layers:')
layer_sizes_sliders = [widgets.IntSlider(value=10, min=1, max=100, step=1, description=f'Units in Layer {i+1}') for i in range(layer_count.value)]
activation_functions_dropdowns = [widgets.Dropdown(options=['relu', 'sigmoid', 'tanh', 'softmax'], value='relu', description=f'Activation {i+1}') for i in range(layer_count.value)]

def update_layers(change):
    global layer_sizes_sliders, activation_functions_dropdowns
    layer_sizes_sliders = [widgets.IntSlider(value=10, min=1, max=100, step=1, description=f'Units in Layer {i+1}') for i in range(layer_count.value)]
    activation_functions_dropdowns = [widgets.Dropdown(options=['relu', 'sigmoid', 'tanh', 'softmax'], value='relu', description=f'Activation {i+1}') for i in range(layer_count.value)]
    layer_controls.children = [widgets.HBox([layer_sizes_sliders[i], activation_functions_dropdowns[i]]) for i in range(layer_count.value)]

layer_count.observe(update_layers, names='value')

# Create initial layer controls
layer_controls = widgets.VBox([widgets.HBox([layer_sizes_sliders[i], activation_functions_dropdowns[i]]) for i in range(layer_count.value)])

# Dropdowns for optimizer and regularization
optimizer_dropdown = widgets.Dropdown(
    options=['sgd', 'adam', 'rmsprop'],
    value='adam',
    description='Optimizer:'
)

regularization_dropdown = widgets.Dropdown(
    options=[None, 'l1', 'l2'],
    value=None,
    description='Regularization:'
)

# Slider for epochs
epoch_slider = widgets.IntSlider(value=10, min=1, max=100, step=1, description='Epochs:')

# Display model training widgets
display(Markdown("## Step 4: Model Training"))
display(layer_count)
display(layer_controls)
display(optimizer_dropdown)
display(regularization_dropdown)
display(epoch_slider)

# Function to create and train the neural network
def train_model(b):
    global data, X_train, X_test, y_train, y_test

    # Build the optional kernel regularizer once so every hidden layer can share it
    regularizer = None
    if regularization_dropdown.value == 'l1':
        from keras.regularizers import l1
        regularizer = l1(0.01)
    elif regularization_dropdown.value == 'l2':
        from keras.regularizers import l2
        regularizer = l2(0.01)

    # Hidden layers configured by the sliders and dropdowns
    model = Sequential()
    for i in range(layer_count.value):
        units = layer_sizes_sliders[i].value
        activation = activation_functions_dropdowns[i].value
        if i == 0:
            model.add(Dense(units, activation=activation, kernel_regularizer=regularizer, input_shape=(X_train.shape[1],)))
        else:
            model.add(Dense(units, activation=activation, kernel_regularizer=regularizer))

    # Output layer: one softmax unit per class, matching sparse_categorical_crossentropy.
    # Assumes the target column holds integer class labels 0..n_classes-1.
    n_classes = int(max(y_train.max(), y_test.max())) + 1
    model.add(Dense(n_classes, activation='softmax'))

    optimizer = optimizer_dropdown.value

    model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    history = model.fit(X_train, y_train, epochs=epoch_slider.value, batch_size=32, validation_data=(X_test, y_test))

    y_pred = np.argmax(model.predict(X_test), axis=-1)
    report = classification_report(y_test, y_pred, output_dict=True)
    display(Markdown("### Model Training Report"))
    display(pd.DataFrame(report).transpose())

    plot_history(history)

train_button = widgets.Button(description="Train Model")
train_button.on_click(train_model)
display(train_button)

# Function to plot training history
def plot_history(history):
    plt.figure(figsize=(14, 5))
    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Train Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.legend()
    plt.title('Loss')
    plt.subplot(1, 2, 2)
    plt.plot(history.history['accuracy'], label='Train Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.legend()
    plt.title('Accuracy')
    plt.show()

