# Ⅰ -- Math Assignment:
## Raw Forward Propagation

1. **Input to Hidden Layer Computation:**
   - For each neuron in the hidden layer, the pre-activation is the dot product of the input vector $x$ and the weight vector $w$. There is no bias term, so $out = x \cdot w$.
   - All initial weights are 0.1, and there are three inputs, $x = [6, 2, 2]$.
   - Therefore, the $out$ for each neuron in the hidden layer is $6 \times 0.1 + 2 \times 0.1 + 2 \times 0.1 = 1.0$.

2. **Activation in Hidden Layer:**
   - Apply the ReLU activation function, $\text{ReLU}(out) = \max(0, out)$. Since $out = 1.0$ is positive, the output after activation remains $1.0$.

3. **Hidden to Output Layer Computation:**
   - Similarly, the $out$ for the output layer is the dot product of the hidden-layer outputs and the output weights: $out = 1.0 \times 0.1 + 1.0 \times 0.1 = 0.2$.

4. **Activation in Output Layer:**
   - Apply the ReLU activation function to the output layer: $\text{ReLU}(out) = \max(0, 0.2) = 0.2$.
   - So, the raw model output $\hat{y}$ is $0.2$.
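
The forward pass above can be checked numerically. The following is a minimal sketch in plain Python, assuming the 3-2-1 layout implied by the arithmetic (three inputs, two hidden neurons, one output, no biases, every weight initialised to 0.1). The helper names `relu` and `forward` are illustrative and not part of the assignment.

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)


def forward(x, w_hidden, w_out):
    """Forward pass of a bias-free network with one ReLU hidden layer.

    w_hidden holds one weight list per hidden neuron;
    w_out holds one weight per hidden neuron.
    """
    hidden = [relu(sum(xi * wi for xi, wi in zip(x, w))) for w in w_hidden]
    y_hat = relu(sum(h * w for h, w in zip(hidden, w_out)))
    return hidden, y_hat


x = [6.0, 2.0, 2.0]
w_hidden = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]]  # assumed: two hidden neurons
w_out = [0.1, 0.1]

hidden, y_hat = forward(x, w_hidden, w_out)
print(hidden, y_hat)  # approximately [1.0, 1.0] and 0.2 (up to float rounding)
```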

## Backpropagation

To compute the gradient and update the weights, we use the loss function $L = y - \hat{y}$ and a learning rate of 0.05.

1. **Loss Computation:**
   - The given true value is $y = 0.7$ and the model output is $\hat{y} = 0.2$.
   - The loss is $L = y - \hat{y} = 0.7 - 0.2 = 0.5$.

2. **Gradient for Output Layer Weights:**
   - $\frac{\partial L}{\partial w} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial w}$.
   - $\frac{\partial L}{\partial \hat{y}} = -1$ since the loss function is $L = y - \hat{y}$.
   - $\frac{\partial \hat{y}}{\partial w}$ is the derivative of the ReLU function times the output of the hidden layer. ReLU's derivative is a unit step function, so for $out = 0.2$ it is $1$.
   - The output of the hidden layer is 1, so $\frac{\partial \hat{y}}{\partial w} = 1 \times 1 = 1$.
   - Thus, the gradient for each output layer weight is $\frac{\partial L}{\partial w} = -1 \times 1 \times 1 = -1$.

3. **Gradient Calculation for Hidden Layer Weights:**
   - The gradient for the hidden layer weights (each hidden neuron outputs a positive value, so its ReLU derivative is 1 and its gradient is non-zero):
     $\frac{\partial L}{\partial w_{\text{hidden}}} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial \text{out}_{\text{hidden}}} \cdot \frac{\partial \text{out}_{\text{hidden}}}{\partial w_{\text{hidden}}}$
   - The ReLU derivative is 1 wherever $\text{out} > 0$, which holds at both the output layer and the hidden layer, hence:
     $\frac{\partial \hat{y}}{\partial \text{out}_{\text{hidden}}} = w_{\text{out}} = 0.1$
   - $\frac{\partial \text{out}_{\text{hidden}}}{\partial w_{\text{hidden}}}$ is the input $x$, hence:
     $\frac{\partial L}{\partial w_{\text{hidden}}} = -1 \times 0.1 \times x = -1 \times 0.1 \times [6, 2, 2] = [-0.6, -0.2, -0.2]$

4. **Update Weights:**
   - After computing gradients for all weights, update them simultaneously.
   - For the output layer weights $w_{\text{out}}$:
     $w_{\text{out}_{\text{new}}} = w_{\text{out}_{\text{old}}} - \alpha \times \frac{\partial L}{\partial w_{\text{out}}} = 0.1 - 0.05 \times (-1) = 0.1 + 0.05 = 0.15$
   - For the hidden layer weights $w_{\text{hidden}}$, one per input $x_i$ (the subscript gives the input value that the weight multiplies):
     $w_{\text{hidden}_{\text{new}_6}} = w_{\text{hidden}_{\text{old}_6}} - \alpha \times \frac{\partial L}{\partial w_{\text{hidden}_6}} = 0.1 - 0.05 \times (-0.6) = 0.1 + 0.03 = 0.13$
     $w_{\text{hidden}_{\text{new}_2}} = w_{\text{hidden}_{\text{old}_2}} - \alpha \times \frac{\partial L}{\partial w_{\text{hidden}_2}} = 0.1 - 0.05 \times (-0.2) = 0.1 + 0.01 = 0.11$
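
The gradients and the update step can be reproduced in a few lines. This sketch assumes the assignment's loss $L = y - \hat{y}$, a learning rate of 0.05, and that every ReLU is active (positive pre-activation), as found in the forward pass. The variable names are illustrative.

```python
x = [6.0, 2.0, 2.0]
w_hidden_old, w_out_old = 0.1, 0.1  # every weight starts at 0.1
hidden_out = 1.0                    # hidden activation from the forward pass
alpha = 0.05                        # learning rate

dL_dyhat = -1.0   # derivative of L = y - y_hat with respect to y_hat
relu_grad = 1.0   # ReLU derivative where the pre-activation is positive

# Output-layer weight: dL/dw_out = dL/dy_hat * ReLU'(out) * hidden activation
grad_w_out = dL_dyhat * relu_grad * hidden_out

# Hidden-layer weights: chain through w_out and the hidden ReLU down to each input
grad_w_hidden = [dL_dyhat * relu_grad * w_out_old * relu_grad * xi for xi in x]

# Gradient-descent update; all gradients were computed with the old weights
w_out_new = w_out_old - alpha * grad_w_out
w_hidden_new = [w_hidden_old - alpha * g for g in grad_w_hidden]

print(grad_w_out, grad_w_hidden)  # approximately -1.0 and [-0.6, -0.2, -0.2]
print(w_out_new, w_hidden_new)    # approximately 0.15 and [0.13, 0.11, 0.11]
```
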
## Updated Forward Propagation

1. **Input to Hidden Layer Computation:**
   - For each neuron in the hidden layer, the pre-activation is the dot product of the input vector $x$ and the updated weights $w_{\text{new}}$. There is no bias term, so $out = x \cdot w_{\text{new}}$.
   - The updated hidden layer weights are $[0.13, 0.11, 0.11]$, and the three inputs are unchanged, $x = [6, 2, 2]$.
   - Therefore, the $out$ for each neuron in the hidden layer is $6 \times 0.13 + 2 \times 0.11 + 2 \times 0.11 = 1.22$.

2. **Activation in Hidden Layer:**
   - Apply the ReLU activation function, $\text{ReLU}(out) = \max(0, out)$. Since $out = 1.22$ is positive, the output after activation remains $1.22$.

3. **Hidden to Output Layer Computation:**
   - Similarly, the $out$ for the output layer is the dot product of the hidden-layer outputs and the updated output weights: $out = 1.22 \times 0.15 + 1.22 \times 0.15 = 0.366$.

4. **Activation in Output Layer:**
   - Apply the ReLU activation function to the output layer: $\text{ReLU}(out) = \max(0, 0.366) = 0.366$.
   - So, the updated model output $\hat{y}_{\text{new}}$ is $0.366$.
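
As a quick check of the updated forward pass, the sketch below recomputes the output with the new weights. It assumes two hidden neurons that share the same updated weights, as in the arithmetic above; the names are illustrative.

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)


x = [6.0, 2.0, 2.0]
w_hidden_new = [0.13, 0.11, 0.11]  # shared by both hidden neurons
w_out_new = 0.15
n_hidden = 2                       # assumed number of hidden neurons
y = 0.7                            # true value from the loss computation

hidden = relu(sum(xi * wi for xi, wi in zip(x, w_hidden_new)))
y_hat_new = relu(n_hidden * hidden * w_out_new)
loss_new = y - y_hat_new

print(hidden, y_hat_new)  # approximately 1.22 and 0.366
print(loss_new)           # approximately 0.334, down from 0.5 before the update
```
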
## The Final Answer is 0.366


# Ⅱ -- Coding Assignment:
## a. Data Preprocessing
1. Convert the categorical variables into dummy variables (e.g., season, weathersit, month, hour, weekday).
2. Normalize continuous variables using z-score (mean=0, sd=1).
3. Exclude the useless features in your training and modeling.
4. Separate the training and validation data. Use the last 21 days’ data for validation. Note that the target column is “cnt”. The other two columns, “casual” and “registered”, can be ignored and should not be used as variables in your code.
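
Before the full class, steps 1, 2, and 4 can be illustrated on a toy frame. This is a minimal sketch with made-up values, not the bike-sharing data; only the column names (`dteday`, `season`, `temp`, `cnt`) are borrowed from the dataset, and `n_val` stands in for the 21-day window.

```python
import pandas as pd

# Toy frame (made-up values) with the same kinds of columns as the task
df = pd.DataFrame({
    "dteday": pd.to_datetime(
        ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04"]
    ),
    "season": [1, 1, 2, 2],
    "temp": [0.2, 0.4, 0.6, 0.8],
    "cnt": [10, 20, 30, 40],
})

# Step 1: one-hot encode a categorical column (adds season_1, season_2)
df = pd.get_dummies(df, columns=["season"], prefix="season")

# Step 2: z-score a continuous column
# (pandas .std() uses the sample standard deviation, ddof=1)
df["temp"] = (df["temp"] - df["temp"].mean()) / df["temp"].std()

# Step 4: hold out the last n_val days for validation
n_val = 1
cutoff = df["dteday"].max() - pd.Timedelta(days=n_val)
train = df[df["dteday"] <= cutoff]
valid = df[df["dteday"] > cutoff]

print(len(train), len(valid))  # 3 1
```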

In [188]:
import pandas as pd
import numpy as np
import os
import tqdm 
import logging
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

In [206]:
class DataPreProcess(object):
    SourceDataPath='bikeRidershipPredictionDataHour.csv'
    DataSaveFolder='DataPreProcessed'
    # Correlation threshold: features with |corr| to the target below this value are excluded
    UnCorrelatedBar=0.1
    # Overlooked features
    OverlookedFeatures=['casual','registered']
    # The categorical variables
    CategoryVariables=['season','yr','mnth','hr','holiday','weekday','workingday','weathersit']
    # The continuous variables
    ContinuousVariables=['temp','atemp','hum','windspeed']
    # The target variable
    TargetVariable='cnt'
    # Validation data length (last days)
    ValidationDays=21
    def DropOverlookedVariables(self):
        # Drop the useless features
        self.RawData.drop(self.OverlookedFeatures,axis=1,inplace=True)
    def FileSystemMaker(self):
        # Create the folders for saving the preprocessed data
        # (exist_ok=True makes each call a no-op when the folder already exists)
        os.makedirs(self.DataSaveFolder,exist_ok=True)
        if self.IFVisualAnalysis:
            self.VisualFolder=os.path.join(self.DataSaveFolder,'VisualAnalysisResults')
            os.makedirs(self.VisualFolder,exist_ok=True)
        if self.IfDummy:
            self.DummyFolder=os.path.join(self.DataSaveFolder,'Dummy')
            os.makedirs(self.DummyFolder,exist_ok=True)
        if self.IfNormalize:
            self.NormalizeFolder=os.path.join(self.DataSaveFolder,'Normalize')
            os.makedirs(self.NormalizeFolder,exist_ok=True)
        self.ProcessedDatasetFolder=os.path.join(self.DataSaveFolder,'ProcessedDataset')
        os.makedirs(self.ProcessedDatasetFolder,exist_ok=True)
    def VisualAnalysis(self):
        # Visual analysis of the data
        for i in tqdm.tqdm(self.CategoryVariables,desc='Category Var Visual Analysis'):
            plt.figure(figsize=(20, 12))
            sns.boxplot(x = i, y = self.TargetVariable, data = self.RawData)
            plt.title('Boxplot of '+i+' vs '+self.TargetVariable)
            plt.savefig(os.path.join(self.VisualFolder,i+'.png'))
            if self.IFShowVisualResult:
                plt.show()
        print('The boxplots of the categorical variables have been saved in the folder:',self.VisualFolder)
        for i in tqdm.tqdm(self.ContinuousVariables,desc='Continuous Var Visual Analysis'):
            plt.figure(figsize=(20, 12))
            sns.boxplot(self.RawData[i])
            plt.title('Boxplot of '+i)
            plt.savefig(os.path.join(self.VisualFolder,i+'.png'))
            if self.IFShowVisualResult:
                plt.show()
        print('The boxplots of the continuous variables have been saved in the folder:',self.VisualFolder)
        Varlist=self.ContinuousVariables+[self.TargetVariable]
        sns.pairplot(self.RawData[Varlist])
        plt.title('Pairplot of Continuous Variables Vs Target Variable')
        plt.savefig(os.path.join(self.VisualFolder,'Pairplot of Continuous Variables Vs Target Variable.png'))
        if self.IFShowVisualResult:
            plt.show()
        print('The pairplot of the continuous variables have been saved in the folder:',self.VisualFolder)
    def DummyVariables(self):
        # Create the dummy variables for the categorical variables
        for i in tqdm.tqdm(self.CategoryVariables,desc='Creating Dummy Variables'):
            DummyData=pd.get_dummies(self.RawData[i],drop_first=False, prefix=i)
            DummyResult=pd.concat([self.RawData,DummyData],axis=1)
            # Keep only the key column 'instant' and the new dummy columns:
            # every other original column is dropped before saving
            Columns=self.RawData.columns.tolist()
            Columns.remove('instant')
            DummyResult.drop(Columns,axis=1,inplace=True)
            DummyResult.to_csv(os.path.join(self.DummyFolder,i+'.csv'),index=False)
    def NormalizeVariables(self,target_mean=0,target_sd=1):
        # Normalize the continuous variables
        for i in tqdm.tqdm(self.ContinuousVariables,desc='Normalizing Variables'):
            Mean=self.RawData[i].mean()
            SD=self.RawData[i].std()
            NormalizedData=(self.RawData[i]-Mean)/SD
            NormalizedData=NormalizedData*target_sd+target_mean
            NormalizedData=pd.concat([self.RawData['instant'],NormalizedData],axis=1)
            NormalizedData.to_csv(os.path.join(self.NormalizeFolder,i+'.csv'),index=False)
    def ExcludeUselessFeatures(self):
        # Exclude the useless features
        self.UselessFeatureList=[]
        for i in self.CategoryVariables:
            corr=self.RawData[[i,self.TargetVariable]].corr()[self.TargetVariable][i]
            print('The correlation between',i,'and',self.TargetVariable,'is:',corr)
            if abs(corr)<self.UnCorrelatedBar:
                self.UselessFeatureList.append(i)
        for i in self.ContinuousVariables:
            corr=self.RawData[[i,self.TargetVariable]].corr()[self.TargetVariable][i]
            print('The correlation between',i,'and',self.TargetVariable,'is:',corr)
            if abs(corr)<self.UnCorrelatedBar:
                self.UselessFeatureList.append(i)
        print('The bar for uncorrelated features is:',self.UnCorrelatedBar)        
        print('The features:',self.UselessFeatureList,'are uncorrelated with the target variable:',self.TargetVariable)
        self.RawData.drop(self.UselessFeatureList,axis=1,inplace=True)
        print('The uncorrelated features have been dropped.')
    def SeparateData(self):
        # Collect the unique dates, newest first; the first ValidationDays of them form the validation set
        Date=pd.to_datetime(self.RawData['dteday'].unique()).strftime('%Y-%m-%d')
        Date=Date.sort_values(ascending=False)
        ValidationDate=[]
        print('Rawdata is form:',Date[0],'to',Date[-1],'totally',len(Date),'days.')
        for i in range(0,self.ValidationDays):
            ValidationDate.append(Date[i])
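        # .copy() gives an independent frame, so the in-place drop below does not act on a slice of RawData
        self.ValidationData=self.RawData[self.RawData['dteday'].isin(ValidationDate)].copy()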
        self.ValidationData.drop('dteday',axis=1,inplace=True)
        print('The validation data is from:',ValidationDate[0],'to',ValidationDate[-1],'totally',len(ValidationDate),'days','with',len(self.ValidationData),'records.')
        TrainDate=Date[self.ValidationDays:]
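        # .copy() gives an independent frame, so the in-place drop below does not act on a slice of RawData
        self.TrainData=self.RawData[self.RawData['dteday'].isin(TrainDate)].copy()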
        self.TrainData.drop('dteday',axis=1,inplace=True)
        print('The training data is from:',TrainDate[0],'to',TrainDate[-1],'totally',len(TrainDate),'days','with',len(self.TrainData),'records.')
    def CreatDataset(self):
        # Create the dataset
        self.SeparateData()
        for i in self.CategoryVariables:
            if (i not in self.UselessFeatureList):
                DummyData=pd.read_csv(os.path.join(self.DummyFolder,i+'.csv'))
                self.ValidationData.drop(i,axis=1,inplace=True)
                self.ValidationData=pd.merge(self.ValidationData,DummyData,on='instant',how='left')
        for i in self.ContinuousVariables:
            if (i not in self.UselessFeatureList):
                NormalizedData=pd.read_csv(os.path.join(self.NormalizeFolder,i+'.csv'))
                self.ValidationData.drop(i,axis=1,inplace=True)
                self.ValidationData=pd.merge(self.ValidationData,NormalizedData,on='instant',how='left')
        for i in self.CategoryVariables:
            if (i not in self.UselessFeatureList):
                DummyData=pd.read_csv(os.path.join(self.DummyFolder,i+'.csv'))
                self.TrainData.drop(i,axis=1,inplace=True)
                self.TrainData=pd.merge(self.TrainData,DummyData,on='instant',how='left')
        for i in self.ContinuousVariables:
            if (i not in self.UselessFeatureList):
                NormalizedData=pd.read_csv(os.path.join(self.NormalizeFolder,i+'.csv'))
                self.TrainData.drop(i,axis=1,inplace=True)
                self.TrainData=pd.merge(self.TrainData,NormalizedData,on='instant',how='left')
        np.save(os.path.join(self.ProcessedDatasetFolder,'ValidationData.npy'),self.ValidationData)
        np.save(os.path.join(self.ProcessedDatasetFolder,'TrainData.npy'),self.TrainData)
        print('The preprocessed data has been saved in the folder:',self.ProcessedDatasetFolder)
        print('The preprocessed data has been saved as ValidationData.npy and TrainData.npy.')
    def __init__(self,IFVisualAnalysis=False,IFShowVisualResult=False,IfCheckInfo=False,IfDummy=True,IfNormalize=True):
        # Read the source data
        self.RawData=pd.read_csv(self.SourceDataPath)
        # Drop the overlooked features
        self.DropOverlookedVariables()
        print('The overlooked features:',self.OverlookedFeatures,' have been dropped.')
        # Check the basic information of the source data
        if IfCheckInfo:
            self.RawData.info()
        # Set the parameters for preprocessing
        self.IFVisualAnalysis=IFVisualAnalysis
        self.IFShowVisualResult=IFShowVisualResult
        self.IfDummy=IfDummy
        self.IfNormalize=IfNormalize
        # Create the folder for saving the preprocessed data
        self.FileSystemMaker()
        if self.IFVisualAnalysis:
            print('The visual analysis results have been saved in the folder:',self.VisualFolder)
            self.VisualAnalysis()
        if self.IfDummy:
            print('The values of the categorial variables are:',self.CategoryVariables)
            self.DummyVariables()
            print('The dummy variables have been created.')
            print('The dummy variables have been saved in the folder:',self.DummyFolder)
        if self.IfNormalize:
            print('The values of the continuous variables are:',self.ContinuousVariables)
            self.NormalizeVariables()
            print('The continuous variables have been normalized.')
            print('The normalized data has been saved in the folder:',self.NormalizeFolder)
        self.ExcludeUselessFeatures()

In [207]:
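# Run the preprocessing pipeline: dummy variables, normalization, and feature exclusion
# (note: this rebinds the name DataPreProcess from the class to the instance)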
DataPreProcess=DataPreProcess(IFVisualAnalysis=False,IFShowVisualResult=False,IfCheckInfo=False,IfDummy=True,IfNormalize=True)

The overlooked features: ['casual', 'registered']  have been dropped.
The values of the categorial variables are: ['season', 'yr', 'mnth', 'hr', 'holiday', 'weekday', 'workingday', 'weathersit']


Creating Dummy Variables: 100%|██████████| 8/8 [00:00<00:00, 57.60it/s]


The dummy variables have been created.
The dummy variables have been saved in the folder: DataPreProcessed\Dummy
The values of the continuous variables are: ['temp', 'atemp', 'hum', 'windspeed']


Normalizing Variables: 100%|██████████| 4/4 [00:00<00:00, 58.11it/s]

The continuous variables have been normalized.
The normalized data has been saved in the folder: DataPreProcessed\Normalize
The correlation between season and cnt is: 0.17805573098267663
The correlation between yr and cnt is: 0.2504948988596485
The correlation between mnth and cnt is: 0.12063776021315144
The correlation between hr and cnt is: 0.39407149778294204
The correlation between holiday and cnt is: -0.030927303249110614
The correlation between weekday and cnt is: 0.02689985999083953
The correlation between workingday and cnt is: 0.030284367747910722
The correlation between weathersit and cnt is: -0.14242613813809568
The correlation between temp and cnt is: 0.4047722757786578
The correlation between atemp and cnt is: 0.4009293041266357
The correlation between hum and cnt is: -0.32291074082456017
The correlation between windspeed and cnt is: 0.09323378392612537
The bar for uncorrelated features is: 0.1
The features: ['holiday', 'weekday', 'workingday', 'windspeed'] are uncorrelate
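
The exclusion rule applied here can be reproduced from the printed correlations alone. The sketch below copies the values from the output above, rounded to four decimals, and applies the same 0.1 threshold (`UnCorrelatedBar`); the dictionary and variable names are illustrative.

```python
# Correlations with cnt, copied from the printed output and rounded
corr_with_cnt = {
    "season": 0.1781, "yr": 0.2505, "mnth": 0.1206, "hr": 0.3941,
    "holiday": -0.0309, "weekday": 0.0269, "workingday": 0.0303,
    "weathersit": -0.1424, "temp": 0.4048, "atemp": 0.4009,
    "hum": -0.3229, "windspeed": 0.0932,
}
threshold = 0.1  # same value as UnCorrelatedBar

# A feature is dropped when the absolute correlation is below the threshold
dropped = [name for name, r in corr_with_cnt.items() if abs(r) < threshold]
kept = [name for name in corr_with_cnt if name not in dropped]

print(dropped)  # ['holiday', 'weekday', 'workingday', 'windspeed']
```

Note that this rule uses the Pearson correlation of the integer category codes, so a categorical feature with a strong but non-linear relationship to `cnt` could still fall below the threshold.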




In [208]:
DataPreProcess.CreatDataset()

Rawdata is form: 2012-12-31 to 2011-01-01 totally 731 days.
The validation data is from: 2012-12-31 to 2012-12-11 totally 21 days with 502 records.
The training data is from: 2012-12-10 to 2011-01-01 totally 710 days with 16877 records.
The preprocessed data has been saved in the folder: DataPreProcessed\ProcessedDataset
The preprocessed data has been saved as ValidationData.npy and TrainData.npy.
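
When reading the saved files back, keep in mind that `np.save` stores only the array values, so the column names are not preserved. The following sketch shows a round trip on a toy frame (made-up values, written to a temporary folder rather than to `DataPreProcessed`); keeping the column list alongside the array is an assumption about how one might restore the frame, not something the class above does.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Toy stand-in for the processed frame (made-up values)
frame = pd.DataFrame({"instant": [1, 2], "cnt": [16, 40], "temp": [-1.3, -1.4]})
columns = list(frame.columns)  # keep the names, since np.save drops them

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "TrainData.npy")
    np.save(path, frame)  # the frame is converted to a plain 2-D array
    # allow_pickle=True is only required if the array was saved with object dtype
    values = np.load(path, allow_pickle=True)

restored = pd.DataFrame(values, columns=columns)
print(values.shape)  # (2, 3)
```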
