<a href="https://colab.research.google.com/github/jdasam/mas1004/blob/2024/live_coding/2_function_approximation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Function Approximation

## A. Regression with One Variable
- Regression:
  - Process of training a model to predict a **continuous** numerical output based on one or more input features.
  - e.g., predicting a child's height from the parents' heights.
  - Why is it called "regression"?
    - The term was first used by Sir Francis Galton, a British statistician and cousin of Charles Darwin, in the late 19th century. Galton was studying the relationship between heights of parents and their children. He observed that although tall parents often had tall children, the children's heights tended to "regress" towards the average or mean height of the population. Similarly, children of short parents were often short but their heights still regressed towards the average.

- Let's make a function that computes f(x) = ax + b
    - In this cell, we will create a function that follows the linear equation f(x) = ax + b. It takes an input x and returns the value of the equation. The coefficients a and b are values that we define: a is the slope of the line and b is the y-intercept. This function will help us understand function approximation in the context of regression.


In [1]:
# Let's make a function that computes f(x) = ax + b

2.5
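
The outputs printed in this notebook (2.5 here, and later y values running from -9.5 at x = -5 to 10.5 at x = 5) are consistent with a = 2 and b = 0.5. A minimal sketch under that assumption, using the name `my_function` that appears in a later cell (the call `my_function(1)` is also an assumption about what produced the 2.5):

```python
def my_function(x, a=2, b=0.5):
    """Linear function f(x) = a*x + b.

    The defaults a=2 and b=0.5 are inferred from the printed outputs,
    not taken from the original cell.
    """
    return a * x + b


print(my_function(1))  # 2.5
```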

In [1]:
# Let's plot this function
# First, let's make many x candidates
# from -5 to 5, with 501 total x (step 0.02)
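
One way to build the x candidates with plain Python is sketched below. It assumes 501 points with a step of 0.02, which matches the `(501, 501)` lengths and the `xs[23]: -4.54` value printed further down; the rounding is only there to hide floating-point noise.

```python
# 501 evenly spaced points from -5 to 5 (step 0.02).
# round(..., 2) avoids values such as -4.539999999999999.
xs = [round(-5 + 0.02 * i, 2) for i in range(501)]

print(len(xs))
print("First 5 numbers:", xs[:5])
print("Last 5 numbers:", xs[-5:])
```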

In [3]:
# Now, let's make y
# y = f(x)

# Using a for loop
ys = []

# Using a list comprehension
ys = []


First 10 numbers: [-9.5, -9.46, -9.42, -9.38, -9.34, -9.3, -9.26, -9.22, -9.18, -9.14]
Last 10 numbers: [10.14, 10.18, 10.22, 10.26, 10.3, 10.34, 10.38, 10.42, 10.46, 10.5]
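
The two approaches named in the cell above can be sketched as follows. The coefficients a = 2 and b = 0.5 are inferred from the printed numbers, and `ys_comprehension` is a hypothetical name used only to keep both results side by side.

```python
def my_function(x, a=2, b=0.5):
    # Coefficients inferred from the printed outputs (assumption)
    return a * x + b


xs = [round(-5 + 0.02 * i, 2) for i in range(501)]

# Using a for loop: append one y per x
ys = []
for x in xs:
    ys.append(round(my_function(x), 2))

# Using a list comprehension: same result in one expression
ys_comprehension = [round(my_function(x), 2) for x in xs]

print("First 3 numbers:", ys[:3])
print("Last 3 numbers:", ys[-3:])
```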


In [4]:
# Check that the lengths of xs and ys are equal
# The items of xs and ys map 1:1


(501, 501)

In [5]:
# We can check that the i-th values of xs and ys map 1:1


ys[23] = my_function(xs[23])
xs[23]: -4.54, ys[23]: -8.58, my_function(xs[23]): -8.58
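
A possible way to run both checks (equal lengths, and the i-th y coming from the i-th x) is sketched here. It assumes the same inferred `my_function` and `xs` as before; the index 23 is taken from the output above.

```python
def my_function(x, a=2, b=0.5):
    # Coefficients inferred from the printed outputs (assumption)
    return a * x + b


xs = [round(-5 + 0.02 * i, 2) for i in range(501)]
ys = [my_function(x) for x in xs]

# The two lists must have the same length
print(len(xs), len(ys))

# The i-th y must be exactly the function applied to the i-th x
i = 23
print(f"xs[{i}]: {xs[i]}, ys[{i}]: {ys[i]:.2f}, "
      f"my_function(xs[{i}]): {my_function(xs[i]):.2f}")

# zip pairs items at the same index, so this checks every position at once
all_match = all(y == my_function(x) for x, y in zip(xs, ys))
print(all_match)
```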


In [8]:
# Let's plot this function


In [7]:
# Let's add some noise
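
As a rough sketch, noise can be added by drawing one random offset per point. The Gaussian distribution, the standard deviation of 1.0, and the fixed seed are all assumptions chosen for illustration; the name `y_noise` follows the comment in a later cell.

```python
import random

random.seed(0)  # fixed seed so the example is reproducible


def my_function(x, a=2, b=0.5):
    # Coefficients inferred from the printed outputs (assumption)
    return a * x + b


xs = [round(-5 + 0.02 * i, 2) for i in range(501)]
ys = [my_function(x) for x in xs]

# Add zero-mean Gaussian noise to every y value.
# noise_std = 1.0 is an arbitrary choice, not the notebook's value.
noise_std = 1.0
y_noise = [y + random.gauss(0, noise_std) for y in ys]

print(ys[:3])
print(y_noise[:3])
```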

In [2]:
# plot again

##### (Extra) Short Explanation about Random Number Generator
- A random number generator is a function that produces a sequence of numbers that appear to occur in random order.

In [3]:
# random is not actually completely random
# computers usually use pseudo-random numbers

# random.random() is in fact deterministic: it always returns the same
# sequence of values if the seed is the same

# No matter how many times you run this code, it will always return the same value
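
A small demonstration of this behaviour with the standard `random` module; the seed value 42 is arbitrary.

```python
import random

random.seed(42)
first = random.random()

random.seed(42)          # reset the generator to the same seed
second = random.random()

# Same seed, same "random" number
print(first == second)   # True
```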

### 1) Guessing the Regression Manually
- Since we know the function f(x) = ax + b, we can guess the value of a and b.
- If we select correct a and b, the line will be the best fit line for the data.
    - What machine learning does is to find the best a and b automatically.
    - But here, we will find the best a and b manually, so that we can understand the concept of machine learning better.

In [4]:
# Let's suppose we try guessing a and b manually


#### 1-1) Calculating Error
- How can we calculate how good or bad our estimation is?
    - We can calculate the error between the actual value and the predicted value.
        - There are many ways to calculate the error.
            - For example, we can calculate the absolute value of the difference between the actual value and the predicted value.
            - Or we can calculate the square of the difference between the actual value and the predicted value.
    - We call this error value the **loss**.
        - Sometimes it is also called the **cost**.
    - The function that calculates the loss is called the **loss function**.
        - It is sometimes also called the **cost function** or **objective function**.

In [5]:
def cal_error(pred, target):
  # Absolute error between a single prediction and its target value
  return abs(pred - target)


In [6]:
# compare every value in y_noise and estimation
# we have to compare values at the same index
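
The comparison could look roughly like the sketch below, which averages `cal_error` over all points and compares two guesses. `mean_error` is a hypothetical helper, and the noisy data is regenerated here with an assumed noise level so the block runs on its own.

```python
import random

random.seed(0)


def cal_error(pred, target):
    return abs(pred - target)


def mean_error(a, b, xs, targets):
    """Average absolute error of the line a*x + b over all data points."""
    estimation = [a * x + b for x in xs]
    # zip pairs the prediction and the target at the same index
    errors = [cal_error(p, t) for p, t in zip(estimation, targets)]
    return sum(errors) / len(errors)


# Noisy data generated from f(x) = 2x + 0.5 (noise level is an assumption)
xs = [round(-5 + 0.02 * i, 2) for i in range(501)]
y_noise = [2 * x + 0.5 + random.gauss(0, 1.0) for x in xs]

loss_rough = mean_error(1.0, 0.0, xs, y_noise)   # a rough first guess
loss_better = mean_error(2.0, 0.5, xs, y_noise)  # the generating coefficients
print(f"rough guess: {loss_rough:.3f}, better guess: {loss_better:.3f}")
```

A lower value means the guessed line is, on average, closer to the data.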


In [7]:
# change a and b to get a better estimation



#### (Extra) Calculate the gradient
- How can we calculate the slope (gradient) with respect to each parameter?
  - We can calculate the gradient of the loss function with respect to each parameter.
  - One brute-force way is to change one parameter by a small amount, recompute the loss, and see how much the loss changes (a finite-difference estimate).
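
The brute-force idea can be sketched as follows. This is an illustration under assumptions, not the notebook's own code: it uses the squared-error loss mentioned earlier, noise-free data from f(x) = 2x + 0.5, and hypothetical names (`loss_fn`, `numerical_gradient`, `eps`, `lr`). The short loop at the end shows how such gradients could be used to improve a and b step by step.

```python
def loss_fn(a, b, xs, targets):
    """Mean squared error of the line a*x + b."""
    return sum((a * x + b - t) ** 2 for x, t in zip(xs, targets)) / len(xs)


def numerical_gradient(a, b, xs, targets, eps=1e-4):
    """Nudge each parameter by eps and measure how the loss changes."""
    base = loss_fn(a, b, xs, targets)
    grad_a = (loss_fn(a + eps, b, xs, targets) - base) / eps
    grad_b = (loss_fn(a, b + eps, xs, targets) - base) / eps
    return grad_a, grad_b


xs = [round(-5 + 0.02 * i, 2) for i in range(501)]
targets = [2 * x + 0.5 for x in xs]  # noise-free data for clarity

a, b = 1.0, 0.0   # initial guess
lr = 0.05         # step size (arbitrary choice)
for step in range(200):
    grad_a, grad_b = numerical_gradient(a, b, xs, targets)
    # Move each parameter against its gradient to reduce the loss
    a -= lr * grad_a
    b -= lr * grad_b

print(f"a = {a:.3f}, b = {b:.3f}")
```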

##### (Extra) Naming Convention
- CamelCase
  - uses uppercase letters to mark word boundaries
    - MyModelFunction (upper camel case, also called PascalCase)
    - myModelFunction (lower camel case)
- snake_case
  - uses underscores to separate words
    - my_model_function

### 2) Using an Artificial Neural Network

#### 2-1) Define the function
- We want to design a more complex function that is not linear.
    - And we will try to approximate that function using a neural network.

- To do this, we will practice with ``class``
    - ``class`` is a template for creating objects.
        - It has attributes and methods.
            - Attributes are variables that store data.
            - Methods are functions that are defined inside the class.
        - We can create an object from a class.
            - This process is called **instantiation**.
            - The object created from a class is called an **instance**.
        - We can access the attributes and methods of an object using dot notation.
            - ``object.attribute``
            - ``object.method()``
        - We can define a class using ``class`` keyword.
            - ``class ClassName:``
        - We can define a method using ``def`` keyword.
            - ``def method_name(self, arguments):``
            - Every instance method takes ``self`` as its first argument.
                - ``self`` is a reference to the current instance of the class.
                - We can access the attributes and methods of the instance using ``self``.
                - ``self`` is not a keyword. You can use any name instead of ``self``.
                    - But it is a strong convention to use ``self``.
            - There are special methods whose names start and end with double underscores.
                - ``__init__`` is a special method that is called when an instance of a class is created.
                    - This is the instantiation step described above.
                    - ``class_instance = ClassName(arguments)``
                - ``__call__`` is a special method that is called when an instance of a class is called.
                    - ``class_instance(arguments)``
                

        - We can define an attribute using ``self``.
            - ``self.attribute_name = value``
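
The pieces listed above fit together as in this small sketch. `LinearFunction` and `describe` are hypothetical names chosen for illustration.

```python
class LinearFunction:
    """Callable object that computes f(x) = a*x + b."""

    def __init__(self, a, b):
        # Attributes store the data of this particular instance
        self.a = a
        self.b = b

    def __call__(self, x):
        # Runs when the instance itself is called like a function
        return self.a * x + self.b

    def describe(self):
        # A regular method, accessed with dot notation
        return f"f(x) = {self.a}x + {self.b}"


f = LinearFunction(2, 0.5)   # instantiation: __init__ is called
print(f.a, f.b)              # attribute access
print(f(1))                  # __call__ is invoked
print(f.describe())          # method call
```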
    



#### 2-2) Making Artificial Neuron
An artificial neuron is a mathematical function designed to model the behavior of biological neurons. It serves as a fundamental building block of neural networks in machine learning.

##### Basic Components
- Input: Receives various forms of information (or data) that the neural network will learn from.
- Weights: These are parameters that transform input data within the neuron's internal function.
- Bias: An additional parameter added to the weighted sum, which shifts the input to the activation function.
- Activation Function: This function processes the incoming information, and depending on its output, the artificial neuron activates or not.

$\text{Output} = \text{Activation Function}(\sum^n_{i=1}(\text{Input}[i] \times \text{Weight}[i]) + \text{Bias})$
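
The formula can be written almost literally in Python. This is a minimal sketch: the choice of ReLU as the activation function and the example numbers are assumptions made for illustration.

```python
def relu(z):
    # One common activation function: pass positive values, clip the rest to 0
    return max(0.0, z)


def neuron(inputs, weights, bias, activation=relu):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum + bias)


# 1.0*0.5 + 2.0*1.0 + 3.0*(-0.5) = 1.0, plus bias 0.5 gives 1.5
print(neuron([1.0, 2.0, 3.0], [0.5, 1.0, -0.5], bias=0.5))   # 1.5

# With a strongly negative bias the neuron does not activate
print(neuron([1.0, 2.0, 3.0], [0.5, 1.0, -0.5], bias=-2.0))  # 0.0
```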


#### 2-4) Implementing the following diagram in Python
- Ignoring the bias and activation for now
![data_ai_figure.jpg](https://github.com/jdasam/mas1004-2023/blob/main/live_coding/data_ai_figure.jpg?raw=true)

#### 2-5) Implementing it as a matrix and layer
- Using ``torch`` library
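
Before turning to ``torch``, it may help to see what a layer computes in matrix form. The sketch below uses plain Python lists instead of tensors, so it is a library-free illustration rather than the notebook's implementation; `linear_layer` and the example numbers are hypothetical.

```python
def linear_layer(inputs, weight_matrix, biases):
    """Apply several neurons at once.

    Each row of weight_matrix holds the weights of one neuron, so the
    result is the matrix-vector product plus the bias vector.
    """
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weight_matrix, biases)
    ]


# A layer with 2 inputs and 3 neurons: weight matrix of shape (3, 2)
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]
biases = [0.0, 0.5, -1.0]

print(linear_layer([2.0, 3.0], weights, biases))  # [2.0, 3.5, 4.0]
```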

#### 2-6) Make a two-layer model for function approximation

#### 2-7) Combination of Linear Operation
- Let's suppose the first layer of the neural network takes 1-dim input and has 4 neurons
    - Each neuron has 1 weight and 1 bias
    - weights = $[w_1, w_2, w_3, w_4]$
    - bias = $[b_1, b_2, b_3, b_4]$
    - for input $x$, the result is
        - $[w_1 x + b_1, w_2 x + b_2, w_3 x + b_3, w_4 x + b_4]$
        - or, we can denote them using $o_n$, so that
            - $o_n = w_n x + b_n$
            - $o_1 = w_1 x + b_1$
- So the second layer takes $[o_1, o_2, o_3, o_4]$ as an input
    - This layer has only one neuron, and it has 4 weights and 1 bias
        - Because the input dimension is 4
    - weights = $[u_1, u_2, u_3, u_4]$
    - bias = $c$
    - output = $u_1 o_1 + u_2 o_2 + u_3 o_3 + u_4 o_4 + c$
        - = $u_1 (w_1 x + b_1) + u_2 (w_2 x + b_2) + u_3 (w_3 x + b_3) + u_4 (w_4 x + b_4) + c$
        - = $(u_1 w_1 + u_2 w_2 + u_3 w_3 + u_4 w_4) x + (u_1 b_1 + u_2 b_2 + u_3 b_3 + u_4 b_4 + c)$
    - Therefore, if we rewrite the equation above using the new symbols $v$ and $d$,
        - $v = u_1 w_1 + u_2 w_2 + u_3 w_3 + u_4 w_4$
        - $d = u_1 b_1 + u_2 b_2 + u_3 b_3 + u_4 b_4 + c$
        - output = $v x + d$
            - Which is a linear equation
- So, we can say that the combination of linear operations is also a linear operation
    - **If** there is NO non-linear activation function between layers

#### 2-8) Combination of Linear and Non-linear Operations
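
As a small illustration of why the non-linearity matters, the sketch below puts a ReLU between two tiny layers. The choice of ReLU and of these particular weights is an assumption; with them, the two-layer model computes the absolute value of x, which no single line v x + d can represent.

```python
def relu(z):
    return max(0.0, z)


def two_layer_relu(x):
    # First layer: two neurons with weights [1, -1] and biases [0, 0],
    # each followed by ReLU
    hidden = [relu(1.0 * x + 0.0), relu(-1.0 * x + 0.0)]
    # Second layer: one neuron with weights [1, 1] and bias 0
    return 1.0 * hidden[0] + 1.0 * hidden[1] + 0.0


for x in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    print(x, two_layer_relu(x))

# For any straight line, f(-1) + f(1) equals 2 * f(0).
# Here the two sides differ, so the output is not a line.
print(two_layer_relu(-1.0) + two_layer_relu(1.0), 2 * two_layer_relu(0.0))
```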

#### 2-9) Training the model

#### 2-10) Visualizing the result
- Draw how each neuron in the first layer is activated

### Updating a plot in a for loop
```
from IPython import display
import matplotlib.pyplot as plt

for i in range(10):
  plt.plot([i], [i], 'o')
  display.clear_output(wait=True)
  display.display(plt.gcf())
  plt.close()
```