# **MAI Assignment**
<hr>
Name: Goh Pin Pin Isaac<br>
Admin Number: P2317623<br>
Class: DAAA/FT/2B/23<hr>

# **Background**
Raising children in today's world is more important than ever, as we are seeing a troubling rise in aggression among the younger generation. This increase can be linked to various factors such as exposure to violent media, societal pressures, and the fast-paced, stressful nature of modern life. It is crucial for parents, educators, and communities to provide a nurturing and supportive environment that promotes emotional well-being and positive behavior.

By teaching values like empathy, patience, and respect, we can help children develop healthy ways to express themselves and manage their emotions. Ensuring that children have strong role models and the tools to navigate challenges will not only reduce aggression but also foster a more compassionate and harmonious society.

Hence, a study was carried out to explore the relationship between aggression and several potential predicting factors in 666 children who had an older sibling.<br><br>
[**Case Study & Dataset**](https://rdrr.io/github/profandyfield/discovr/man/child_aggression.html)
<br>
<hr>

# **About Dataset**
- **aggression**: The child's aggression (high score = more aggression seen)
- **television**: Time spent watching television (high score = more time spent watching television)
- **computer_games**: Time spent playing video games (high score = more time spent playing video games)
- **sibling_aggression**: Aggression in older sibling (high score = more aggression seen in their older sibling)
- **diet**: The child's diet (high score = the child has a good diet low in harmful additives)
- **parenting_style**: The parent's parenting style (high score = bad parenting practices)
<br><br>
<hr>


## **Importing**

In [2]:
# importing libraries
import pandas as pd
import numpy as np

# plotting
import matplotlib.pyplot as plt
import seaborn as sns

# gradient descent
import sympy as sp

In [3]:
# importing data
child_aggression = pd.read_csv('./data/child_aggression.csv')
child_aggression.drop(['television', 'computer_games','diet'], axis=1, inplace=True)

In [3]:
child_aggression.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 666 entries, 0 to 665
Data columns (total 3 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   aggression          666 non-null    float64
 1   sibling_aggression  666 non-null    float64
 2   parenting_style     666 non-null    float64
dtypes: float64(3)
memory usage: 15.7 KB


In [4]:
child_aggression.describe().T

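|  | count | mean | std | min | 25% | 50% | 75% | max |
| -------- | ------- | ------- | ------- | ------- | ------- | ------- | ------- | ------- |
| aggression | 666.0 | -0.005011186 | 0.319404 | -1.295608 | -0.174279 | -0.005548 | 0.149611 | 1.178823 |
| sibling_aggression | 666.0 | 0.008274587 | 0.326806 | -1.433127 | -0.156414 | 0.008459 | 0.185136 | 1.103671 |
| parenting_style | 666.0 | -6.001206e-17 | 1.0 | -4.460406 | -0.580084 | 0.027364 | 0.517836 | 3.993256 |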


# $$Error Function Formula$$

$$\frac{1}{n} \sum_{i=1}^{n} e_i^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \frac{1}{n} \left[ (y_1 - \hat{y}_1)^2 + (y_2 - \hat{y}_2)^2 + \cdots + (y_n - \hat{y}_n)^2 \right]$$
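The formula above can be sketched as a small NumPy helper. The function name `mean_squared_error` and the toy values are illustrative assumptions, not part of the original notebook or the dataset.

```python
import numpy as np

def mean_squared_error(y, y_hat):
    """Average of the squared residuals e_i = y_i - y_hat_i."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return np.mean((y - y_hat) ** 2)

# Toy values, made up for illustration (not taken from the dataset)
y_toy = [1.0, 2.0, 3.0]
y_hat_toy = [1.5, 2.0, 2.0]

mse_toy = mean_squared_error(y_toy, y_hat_toy)
print(mse_toy)  # (0.25 + 0 + 1) / 3
```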

# **MODEL 1:  SLR with intercept $a$ fixed at 0** ($\hat{y}_i = b x_i$)

In [43]:
# X and y data
x_data = child_aggression['parenting_style'].values
y_data = child_aggression['aggression'].values

In [44]:
# Define the symbols
b = sp.symbols('b')

In [45]:
# Define the error function (Mean Squared Error)
error_function = sum((y - b*x)**2 for x, y in zip(x_data, y_data))/len(x_data)
error_function.simplify()

0.998498498498497*b**2 - 0.13438726075923*b + 0.101890580668516

# $$Error Function$$

$$
E(b) = \frac{1}{666} \left[ \begin{array}{c}
(y_1 - bx_1)^2 + \\ 
(y_2 - bx_2)^2 + \\ 
(y_3 - bx_3)^2 + \\ 
\vdots \\
(y_{664} - bx_{664})^2 + \\ 
(y_{665} - bx_{665})^2 + \\ 
(y_{666} - bx_{666})^2
\end{array} \right] 
$$

### $$Substitute$$

$$
E(b) = \frac{1}{666} \left[ \begin{array}{c}
(0.374 - (-0.279b))^2 + \\ 
(0.771 - (-1.248b))^2 + \\ 
(-0.098 - (-0.328b))^2 + \\
\vdots \\
(-0.062 - (-0.335b))^2 + \\
(0.104 - (-0.475b))^2 + \\
(-0.316 - (-1.102b))^2
\end{array} \right] 
$$

### $$Simplify$$

$$
E(b) = 0.998b^2 - 0.134b + 0.102
$$

### $$E'(b)$$
$$
E'(b) = 1.997b - 0.134
$$

The function $E(b)=0.998b^2-0.134b+0.102$ is a **parabola** that opens upwards because the coefficient of $b^2$ ($0.998$) is positive. <br>
A parabola that opens upwards has `exactly one minimum point`, so any minimum that gradient descent finds is the global minimum.
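
Because the parabola has a single minimum, the value that gradient descent should approach can be checked by hand by setting the derivative to zero. A quick worked check, using the coefficients of the simplified expression rounded to four decimals:

$$
E'(b) = 2(0.9985)b - 0.1344 = 0 \quad\Rightarrow\quad b^{*} = \frac{0.1344}{2(0.9985)} \approx 0.0673
$$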

In [46]:
# Initialize the variables for gradient descent
b_value = 0.0  # Starting value of b
learning_rate = 0.15
epsilon = 0.00001
max_iter = 1000
diff = 1
iter_count = 0

In [47]:
# differentiating the error function
error_derivative = error_function.diff(b)
error_derivative_function = sp.lambdify(b, error_derivative, 'numpy') # Convert the symbolic derivative to a numerical function

# Gradient descent algorithm
while diff > epsilon and iter_count < max_iter:
    b_new = b_value - learning_rate * error_derivative_function(b_value)
    diff = abs(b_new - b_value)
    b_value = b_new
    iter_count += 1
    print(f"Iteration {iter_count}: b-value is {b_value}")

Iteration 1: b-value is 0.0201580891138845
Iteration 2: b-value is 0.03427783171392523
Iteration 3: b-value is 0.044168011778368196
Iteration 4: b-value is 0.05109559285954335
Iteration 5: b-value is 0.05594802014838451
Iteration 6: b-value is 0.05934690502863136
Iteration 7: b-value is 0.0617276554740295
Iteration 8: b-value is 0.06339525319591874
Iteration 9: b-value is 0.06456332277138621
Iteration 10: b-value is 0.06538149763167986
Iteration 11: b-value is 0.06595458858111979
Iteration 12: b-value is 0.06635601039480406
Iteration 13: b-value is 0.06663718648501984
Iteration 14: b-value is 0.06683413640406739
Iteration 15: b-value is 0.06697209006358042
Iteration 16: b-value is 0.0670687197665276
Iteration 17: b-value is 0.06713640408548385
Iteration 18: b-value is 0.0671838135971852
Iteration 19: b-value is 0.06721702161101203
Iteration 20: b-value is 0.06724028217925561
Iteration 21: b-value is 0.06725657505475956
Iteration 22: b-value is 0.06726798740674543
Iteration 23: b-value 

In [95]:
print("Model 1:")
print(f"Aggression = {b_value:.4f} * parenting_style")
print()
print(f"Number of iterations is {iter_count}")
print(f"The local minimum occurs when b: {b_value:.4f}")
# error_derivative_function returns E'(b), so this is the gradient, not the error
print(f"Gradient at this b: {error_derivative_function(b_value):.4f}")

Model 1:
Aggression = 0.0673 * parenting_style

Number of iterations is 34
The local minimum occurs when b: 0.0673
Gradient at this b: -0.0001


![image](./Images/model1.png)
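
The update rule used for Model 1 can also be written without SymPy. The sketch below is a minimal version that assumes the derivative is supplied as a plain Python function; the name `gradient_descent_1d` is hypothetical, and the two coefficients are copied from the simplified error function printed earlier. It compares the result with the vertex of the parabola.

```python
def gradient_descent_1d(grad, b0=0.0, learning_rate=0.15, epsilon=1e-5, max_iter=1000):
    """Repeat b <- b - learning_rate * grad(b) until the step size is at most epsilon."""
    b = b0
    for iteration in range(1, max_iter + 1):
        b_new = b - learning_rate * grad(b)
        step = abs(b_new - b)
        b = b_new
        if step <= epsilon:
            return b, iteration
    return b, max_iter

# Coefficients of E(b) = A*b**2 + B*b + C, copied from the simplified expression
A = 0.998498498498497
B = -0.13438726075923

def error_gradient(b):
    """E'(b) = 2*A*b + B."""
    return 2 * A * b + B

b_hat, n_iter = gradient_descent_1d(error_gradient)
b_closed_form = -B / (2 * A)  # vertex of the parabola

print(f"gradient descent: b = {b_hat:.4f} after {n_iter} iterations")
print(f"closed form:      b = {b_closed_form:.4f}")
```

Stopping on the step size means the returned `b_hat` is within roughly `epsilon` of the vertex rather than exactly on it.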

# **MODEL 2:  SLR with intercept $a$** ($\hat{y}_i = a + b x_i$)

In [48]:
next_a = 1 # Initial value of a
next_b = 1 # Initial value of b
alpha = 0.0005 # Learning rate
epsilon = 0.00001 # Stopping criterion constant
max_iters = 1000 # Maximum number of iterations

In [49]:
# Define the symbols 
a = sp.Symbol('a')
b = sp.Symbol('b')

In [50]:
func_expr = sum((y_data[i] - (a + b * x_data[i]))**2 for i in range(len(x_data)))
func_expr.simplify()

666.0*a**2 - 6.3948846218409e-14*a*b + 6.67490040749479*a + 665.0*b**2 - 89.5019156656474*b + 67.8591267252315

# $$Error Function$$

Unlike Model 1, the code for this model minimizes the sum of squared errors (no $\frac{1}{n}$ factor). Scaling by a constant does not move the minimum, so the fitted $a$ and $b$ are unaffected.

$$
E(a,b) = \left[ \begin{array}{c}
(y_1 - (a + bx_1))^2 + \\ 
(y_2 - (a + bx_2))^2 + \\ 
(y_3 - (a + bx_3))^2 + \\ 
\vdots \\
(y_{664} - (a + bx_{664}))^2 + \\ 
(y_{665} - (a + bx_{665}))^2 + \\ 
(y_{666} - (a + bx_{666}))^2
\end{array} \right] 
$$

### $$Substitute$$

$$
E(a,b) = \left[ \begin{array}{c}
(0.374 - (a - 0.279b))^2 + \\ 
(0.771 - (a - 1.248b))^2 + \\ 
(-0.098 - (a - 0.328b))^2 + \\ 
\vdots \\
(-0.062 - (a - 0.335b))^2 + \\ 
(0.104 - (a - 0.475b))^2 + \\ 
(-0.316 - (a - 1.102b))^2
\end{array} \right] 
$$

### $$Simplify$$

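$$
E(a,b) = 666a^2 - 6.395 \times 10^{-14}ab + 6.675a + 665b^2 - 89.502b + 67.859
$$

### $$E'(a,b)$$
$$
\frac{\partial E}{\partial a} = 1332a - 5.684 \times 10^{-14}b + 6.675
$$
$$
\frac{\partial E}{\partial b} = -5.684 \times 10^{-14}a + 1330b - 89.502
$$

The coefficients of order $10^{-14}$ are floating-point round-off: parenting_style has a mean of approximately 0, so the $ab$ cross term is effectively zero and its printed digits can differ between expressions.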
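
Treating the $10^{-14}$ cross terms as zero (an assumption based only on their size), setting both partial derivatives to zero gives the stationary point directly. This is a hand check rather than part of the original notebook:

$$
1332a + 6.675 = 0 \quad\Rightarrow\quad a^{*} = -\frac{6.675}{1332} \approx -0.0050
$$
$$
1330b - 89.502 = 0 \quad\Rightarrow\quad b^{*} = \frac{89.502}{1330} \approx 0.0673
$$

Both coefficients of the squared terms ($666$ and $665$) are positive, so this stationary point is a minimum.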

In [51]:
# Calculate partial derivatives
partial_a = sp.diff(func_expr, a)
partial_b = sp.diff(func_expr, b)

# Convert symbolic expressions to numerical functions
partialf_a = sp.lambdify((a, b), partial_a)
partialf_b = sp.lambdify((a, b), partial_b)
func = sp.lambdify((a, b), func_expr)

next_func = func(next_a, next_b)                                    # Initial value of function

for n in range(max_iters):
    current_a = next_a
    current_b = next_b
    current_func = next_func
    next_a = current_a - alpha * partialf_a(current_a, current_b)   # update of a
    next_b = current_b - alpha * partialf_b(current_a, current_b)   # update of b
    next_func = func(next_a, next_b)
    change_func = abs(next_func - current_func)                     # stopping criterion: values of function converge
    print("Iteration", n+1, ": a = ", next_a, ", b = ", next_b, ", f(a,b) = ", next_func)
    if change_func < epsilon:
        break

Iteration 1 : a =  0.33066254979625254 , b =  0.37975095783282364 , f(a,b) =  204.796926188995
Iteration 2 : a =  0.10710384142820084 , b =  0.1719675287068196 , f(a,b) =  80.48838459383433
Iteration 3 : a =  0.032435232833271585 , b =  0.10236007994960826 , f(a,b) =  66.5824616222365
Iteration 4 : a =  0.007495917562565218 , b =  0.07904158461594246 , f(a,b) =  65.0268451611675
Iteration 5 : a =  -0.0008338137378507103 , b =  0.07122988867916442 , f(a,b) =  64.85282117796832
Iteration 6 : a =  -0.00361594399218963 , b =  0.06861297054034378 , f(a,b) =  64.83335325830508
Iteration 7 : a =  -0.0045451754971388295 , b =  0.06773630296383887 , f(a,b) =  64.8311753787749
Iteration 8 : a =  -0.004855538819791862 , b =  0.06744261932570972 , f(a,b) =  64.83093173684605
Iteration 9 : a =  -0.004959200169557975 , b =  0.06734423530693646 , f(a,b) =  64.83090448009573
Iteration 10 : a =  -0.004993823060379857 , b =  0.06731127666064741 , f(a,b) =  64.83090143079697


In [52]:
print("Model 2:")
print(f"Aggression = {next_a:.4f} + {next_b:.4f} * parenting_style")
print()
print(f"Number of iterations is {n+1}")
print(f"The optimal value of intercept is: {next_a:.4f}")
# use Model 2's own slope and error, not b_value / the derivative from Model 1
print(f"The local minimum occurs when b: {next_b:.4f}")
print(f"Minimum error: {next_func:.4f}")

Model 2:
Aggression = -0.0050 + 0.0673 * parenting_style

Number of iterations is 10
The optimal value of intercept is: -0.0050
The local minimum occurs when b: 0.0673
Minimum error: 64.8309


![image](./Images/model2.png)

# **MODEL 3:  MLR** ($\hat{y}_i = a + b x_i + c w_i$)

| Response | Predictors | 
| -------- | ------- |
| Aggression | Parenting Style |
|           | Sibling Aggression | 

In [4]:
# X and y data
x_data1 = child_aggression['parenting_style'].values
x_data2 = child_aggression['sibling_aggression'].values
y_data = child_aggression['aggression'].values

# Combine input variables
X = np.column_stack((x_data1, x_data2))

In [5]:
# Set up parameters
feature_count = X.shape[1]
params = [0,0,0]
learning_rate = 0.001
convergence_threshold = 0.00001
max_iterations = 1000

In [6]:
symbols = sp.symbols('x0:%d' % (feature_count + 1))

# Define prediction function (Array with all the data points with symbols)
prediction = symbols[0] + sum(symbols[i+1] * X[:, i] for i in range(feature_count))

# Define loss function (sum of squared errors)
loss_expr = sum((y_data[i] - prediction[i])**2 for i in range(len(y_data)))

# Calculate gradients
gradients = [sp.diff(loss_expr, symbol) for symbol in symbols]

In [7]:
loss_expr.simplify()

666.0*x0**2 - 4.9737991503207e-14*x0*x1 + 11.0217504158594*x0*x2 + 6.67490040749478*x0 + 665.0*x1**2 + 75.6163788266224*x1*x2 - 89.5019156656473*x1 + 71.0688739115785*x2**2 - 17.9002518078758*x2 + 67.8591267252315

In [8]:
gradients

[1332.0*x0 - 5.6843418860808e-14*x1 + 11.0217504158594*x2 + 6.67490040749499,
 -5.6843418860808e-14*x0 + 1330.0*x1 + 75.6163788266224*x2 - 89.5019156656474,
 11.0217504158594*x0 + 75.6163788266224*x1 + 142.137747823157*x2 - 17.9002518078758]

# $$Error Function$$

As in the code, the error here is the sum of squared errors (no $\frac{1}{n}$ factor), with $x_i$ = parenting_style and $w_i$ = sibling_aggression.

$$
E(a,b,c) = \left[ \begin{array}{c}
(y_1 - (a + bx_1 + cw_1))^2 + \\
(y_2 - (a + bx_2 + cw_2))^2 + \\
(y_3 - (a + bx_3 + cw_3))^2 + \\
\vdots \\
(y_{664} - (a + bx_{664} + cw_{664}))^2 + \\
(y_{665} - (a + bx_{665} + cw_{665}))^2 + \\
(y_{666} - (a + bx_{666} + cw_{666}))^2
\end{array} \right] 
$$

### $$Substitute$$

$$
E(a,b,c) = \left[ \begin{array}{c}
(0.374 - (a - 0.279b - 0.328c))^2 + \\
(0.771 - (a - 1.248b + 0.577c))^2 + \\
(-0.098 - (a - 0.328b - 0.217c))^2 + \\
\vdots \\
(-0.062 - (a - 0.335b + 0.306c))^2 + \\
(0.104 - (a - 0.475b + 0.094c))^2 + \\
(-0.316 - (a - 1.102b + 0.149c))^2
\end{array} \right] 
$$

### $$Simplify$$

$$
E(a,b,c) = 666a^2 - 4.974 \times 10^{-14}ab + 11.022ac + 6.675a + 665b^2 + 75.616bc - 89.502b + 71.069c^2 - 17.900c + 67.859
$$
### $$E'(a,b,c)$$
$$
\frac{\partial E}{\partial a} = 1332a - 5.684 \times 10^{-14}b + 11.022c + 6.675
$$
$$
\frac{\partial E}{\partial b} = -5.684 \times 10^{-14}a + 1330b + 75.616c - 89.502
$$
$$
\frac{\partial E}{\partial c} = 11.022a + 75.616b + 142.138c - 17.900
$$
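
Since all three partial derivatives are linear in $a$, $b$ and $c$, setting them to zero gives a 3×3 linear system that can be solved directly. The sketch below copies the coefficients from the printed `gradients` output and treats the $10^{-14}$ cross terms as zero (an assumption based on their size). The names `H`, `g` and `theta_star` are introduced here for illustration only.

```python
import numpy as np

# Gradient written as H @ theta + g, with theta = (a, b, c).
# Values copied from the printed gradients; round-off cross terms set to 0.
H = np.array([
    [1332.0, 0.0, 11.0217504158594],
    [0.0, 1330.0, 75.6163788266224],
    [11.0217504158594, 75.6163788266224, 142.137747823157],
])
g = np.array([6.67490040749499, -89.5019156656474, -17.9002518078758])

# Stationary point: H @ theta + g = 0
theta_star = np.linalg.solve(H, -g)
a_star, b_star, c_star = theta_star

# The gradient should vanish at the solution
grad_at_star = H @ theta_star + g

print(f"a = {a_star:.4f}, b = {b_star:.4f}, c = {c_star:.4f}")
print("gradient at solution:", grad_at_star)
```

The solution should land close to the gradient descent estimates. Any small gap would reflect the loop stopping once the change in loss falls below the threshold, which can happen slightly before the exact minimizer is reached.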

In [9]:
# Convert symbolic expressions to numerical functions
grad_funcs = [sp.lambdify(symbols, gradient) for gradient in gradients]
loss_func = sp.lambdify(symbols, loss_expr)

current_loss = loss_func(*params)  # Initial loss value

for iteration in range(max_iterations):
    old_params = params.copy()
    old_loss = current_loss
    
    # Update parameters
    for i in range(len(params)):
        params[i] = old_params[i] - learning_rate * grad_funcs[i](*old_params)
    
    current_loss = loss_func(*params)
    loss_change = abs(current_loss - old_loss)
    print(f"Iteration {iteration+1}: params = {params}, loss = {current_loss}")
    
    if loss_change < convergence_threshold:
        break

print("Final parameters:", params)

Iteration 1: params = [-0.00667490040749499, 0.0895019156656474, 0.0179002518078758], loss = 64.98287838184005
Iteration 2: params = [-0.004656125580014091, 0.05861273127418749, 0.026561960463972866], loss = 64.57214779570491
Iteration 3: params = [-0.005421826013720142, 0.06815119508034398, 0.036305991195687656], loss = 64.46022027080734
Iteration 4: params = [-0.005275009744499157, 0.06426669370520476, 0.04395220261096015], loss = 64.39861503599374
Iteration 5: params = [-0.005408027379726552, 0.06497040034003501, 0.05080371251154482], loss = 64.35519078077566
Iteration 6: params = [-0.005439381156927092, 0.06422009078236407, 0.05662961856536657], loss = 64.32314028944984
Iteration 7: params = [-0.00549318338536798, 0.06402715901722136, 0.061684524718925784], loss = 64.29931399154061
Iteration 8: params = [-0.005531034959525726, 0.06370859280107789, 0.06603611969253509], loss = 64.28158228377494
Iteration 9: params = [-0.005566430430615385, 0.06348466779838079, 0.06979369477144147], 

In [10]:
intercept, coef1, coef2 = params

print("Model 3:")
print(f"Aggression = {intercept:.4f} + {coef1:.4f} * parenting_style + {coef2:.4f} * sibling_aggression")
print()
print(f"Number of iterations is {iteration+1}")
print(f"Intercept = {intercept:.4f}")
print(f"Coefficient for parenting_style = {coef1:.4f}")
print(f"Coefficient for sibling_aggression = {coef2:.4f}")
# loss_func is the sum of squared errors at the final parameters
print(f"Minimum error: {loss_func(intercept, coef1, coef2):.4f}")

Model 3:
Aggression = -0.0058 + 0.0620 * parenting_style + 0.0928 * sibling_aggression

Number of iterations is 34
Intercept = -0.0058
Coefficient for parenting_style = 0.0620
Coefficient for sibling_aggression = 0.0928
Minimum error: 64.2300


![image](./Images/model3.png)
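
The same multiple-regression fit can be sketched with NumPy alone, using the matrix form of the gradient, $-2X^{\top}(y - X\theta)$, instead of symbolic differentiation. The data below are synthetic and generated only for illustration (they are not the child aggression dataset), and `fit_gd` is a hypothetical helper name. The result is compared with NumPy's least-squares solver.

```python
import numpy as np

# Synthetic data for illustration only (not the child aggression dataset)
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
w = rng.normal(size=n)
y = 0.5 + 2.0 * x - 1.0 * w + rng.normal(scale=0.1, size=n)

# Design matrix with a leading column of ones for the intercept
X = np.column_stack((np.ones(n), x, w))

def fit_gd(X, y, learning_rate=0.001, threshold=1e-5, max_iterations=1000):
    """Gradient descent on the sum of squared errors, stopping when the loss change is small."""
    theta = np.zeros(X.shape[1])
    loss = np.sum((y - X @ theta) ** 2)
    for _ in range(max_iterations):
        grad = -2.0 * X.T @ (y - X @ theta)  # gradient of the sum of squared errors
        theta = theta - learning_rate * grad
        new_loss = np.sum((y - X @ theta) ** 2)
        if abs(new_loss - loss) < threshold:
            break
        loss = new_loss
    return theta

theta_gd = fit_gd(X, y)
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

print("gradient descent:", theta_gd)
print("least squares:   ", theta_ls)
```

Working with arrays avoids building a symbolic expression with one term per data point, which may matter for larger datasets.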