# Linear Regression 1D

<p>In this exercise, a random distribution of points is generated and approximated by a straight line. Initially, the line's parameters theta0 (y-intercept) and theta1 (slope) are set randomly. They are then updated using gradient descent.</p>
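
<p>The procedure described above can be sketched as follows. This is a minimal illustration, not the notebook's own code: the function name <code>fit_line</code>, the learning rate, the iteration count, the seed, and the noiseless sample points are all assumptions chosen for the example.</p>

```python
import numpy as np

def fit_line(x, y, alpha=0.1, iterations=2000, seed=0):
    """Fit y ~ theta0 + theta1 * x with batch gradient descent (illustrative helper)."""
    rng = np.random.default_rng(seed)
    theta0, theta1 = rng.normal(size=2)  # random starting parameters
    n = len(x)
    for _ in range(iterations):
        residual = theta0 + theta1 * x - y
        # gradients of the mean squared error with respect to theta0 and theta1
        grad0 = 2.0 * np.sum(residual) / n
        grad1 = 2.0 * np.sum(residual * x) / n
        # step against the gradient
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * x  # assumed sample data: points on a known line
theta0, theta1 = fit_line(x, y)
print(theta0, theta1)
```

<p>Because the sample points lie exactly on a line, the fitted parameters should approach the intercept and slope used to create them; with noisy points they would only come close.</p>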

<p>You can <a href="01_LinearRegression_1D_Vorlage.ipynb" target="_blank">download</a>&nbsp;the Jupyter notebook shown here and run it on your own machine.</p>

<hr />

<h2>Introduction</h2>

<p>First, NumPy is imported for working with matrices and Matplotlib for plotting them. Then NumPy's random number generator is seeded and the number of points to be generated later is set.</p>


In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas
from datetime import datetime

# enable interactive plots
%matplotlib notebook
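
<p>The seeding step mentioned in the introduction could look like the sketch below. The seed value, the number of points, and the line used to generate the points are assumptions made for illustration only.</p>

```python
import numpy as np

np.random.seed(42)  # fixed seed so every run produces the same points (assumed value)
num_points = 50     # number of points to generate (assumed value)

# points scattered around an assumed line y = 1 + 2x
x = np.random.rand(num_points)
noise = 0.1 * np.random.randn(num_points)
y = 1.0 + 2.0 * x + noise
print(x.shape, y.shape)
```

<p>Seeding again with the same value reproduces the same sequence, which makes runs of the notebook comparable.</p>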

In [2]:
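def rssiToDistance(rssi):
    # log-distance model; reads the globals t (base_power) and k (env_factor) set in a later cell
    return np.power(10, (t - rssi) / k)

def calc_error(y, prediction):
    # sum of squared differences between the actual and the predicted distances
    return np.sum(np.square(y - prediction))

def get_file(path):
    # load a semicolon-separated CSV and keep only the columns used below
    return pandas.read_csv(path, sep=';')[['time_ms', 'rssi_median', 'distance_median']]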
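
<p>To make the model in <code>rssiToDistance</code> easier to try out in isolation, here is a self-contained sketch that takes the two parameters as arguments instead of reading globals. The names <code>rssi_to_distance</code>, <code>base_power</code>, <code>env_factor</code> and <code>squared_error</code> are hypothetical, the defaults mirror the starting values <code>t = -56</code> and <code>k = 30</code> used further down, and the sample RSSI values are made up.</p>

```python
import numpy as np

def rssi_to_distance(rssi, base_power=-56.0, env_factor=30.0):
    """Distance estimate 10 ** ((base_power - rssi) / env_factor)."""
    rssi = np.asarray(rssi, dtype=float)
    return np.power(10.0, (base_power - rssi) / env_factor)

def squared_error(y, prediction):
    """Sum of squared differences, the same quantity calc_error returns."""
    return float(np.sum(np.square(np.asarray(y) - np.asarray(prediction))))

sample_rssi = np.array([-56.0, -71.0, -86.0])  # made-up readings
distances = rssi_to_distance(sample_rssi)
print(distances)
```

<p>With these defaults, a reading equal to <code>base_power</code> maps to a distance of 1, and every further drop of <code>env_factor</code> in RSSI multiplies the estimate by 10.</p>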

In [3]:
# each entry pairs a measurement table with the offset that is later added to its RSSI values
all_measures = [[get_file('data.csv'), 25]]
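
<p>The training cell that follows minimizes the summed squared error between the predicted and the actual distances by adjusting <code>t</code> and <code>k</code>. As a reading aid, the partial derivatives it uses can be derived as shown here, where r_i denotes an RSSI value, y_i the corresponding actual distance, and d_i the predicted distance (the symbols are introduced only for this derivation):</p>

```latex
\begin{aligned}
d_i &= 10^{(t - r_i)/k}, \qquad
E(t, k) = \sum_i \left(d_i - y_i\right)^2 \\[4pt]
\frac{\partial d_i}{\partial t} &= \frac{\ln 10}{k}\, d_i, \qquad
\frac{\partial d_i}{\partial k} = -\frac{\ln 10\,(t - r_i)}{k^2}\, d_i \\[4pt]
\frac{\partial E}{\partial t} &= \sum_i \frac{2 \ln 10}{k}\, d_i \left(d_i - y_i\right) \\[4pt]
\frac{\partial E}{\partial k} &= -\sum_i \frac{2 \ln 10\,(t - r_i)}{k^2}\, d_i \left(d_i - y_i\right) \\[4pt]
t &\leftarrow t - \alpha\, \frac{\partial E}{\partial t}, \qquad
k \leftarrow k - \alpha\, \frac{\partial E}{\partial k}
\end{aligned}
```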



In [4]:
k = 30 # env_factor
t = -56 # base_power

actual_dist = []
rssi = []

for df in all_measures:
    
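    # derive a distance label for every sample from the time elapsed since the first one
    times = df[0]['time_ms'].to_numpy()
    first_time = datetime.strptime(times[0], '%Y-%m-%d %H:%M:%S')

    for idx,item in enumerate(times):
        current_time = datetime.strptime(item, '%Y-%m-%d %H:%M:%S')
#         times[idx] = str(current_time - first_time)
        # the label grows by one for every 30 s elapsed (after a 1 s offset);
        # total_seconds() also covers recordings longer than one day
        elapsed = (current_time - first_time).total_seconds()
        actual_dist.append(int(np.floor(max(elapsed - 1, 0) / 30)))
    
    # append the RSSI medians, shifted by the per-file offset in df[1]
    rssi.extend(df[1] + df[0]['rssi_median'].to_numpy())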

predictions = []
rssi = np.array(rssi)
actual_dist = np.array(actual_dist)

print(rssi)
print(actual_dist)

# learning rate, scaled by the number of samples
alpha = 0.5 / len(rssi)

tx_power = 25

# train for 400 iterations
for i in range(400):


    # prediction with input data
    dist_pred = rssiToDistance(rssi)
    

    # partial derivatives of the summed squared error with respect to t and k
    residual = dist_pred - actual_dist
    t_deri_sum = np.sum(2 * np.log(10) * dist_pred * residual / k)
    k_deri_sum = np.sum(-2 * np.log(10) * (t - rssi) * dist_pred * residual / np.power(k, 2))

    # update t and k
    t = t - alpha * t_deri_sum
    k = k - alpha * k_deri_sum

    # print error
    print("{:2d} error: {:5.3f}".format(i, calc_error(actual_dist, dist_pred)))


    # remember the current predictions 
    predictions.append(dist_pred)



# df[0]['distance_median'] = rssiToDistance(rssi)
# df[0]['actual_distance'] = actual_dist
# df[0]['times_ms'] = times


# print(df[0].iloc[:, :-1].to_csv(index=False))

[-31.  -33.  -35.  -35.5 -36.  -37.  -38.  -39.5 -38.  -38.  -39.5 -41.
 -45.  -46.  -46.  -46.  -43.5 -43.5 -43.5 -44.  -43.5 -43.  -43.  -43.5
 -44.  -44.  -44.  -44.  -45.  -46.5 -47.  -47.5 -48.  -48.5 -48.5 -48.5
 -48.5 -48.  -48.  -48.  -48.  -48.  -48.  -48.  -48.  -48.  -48.  -48.
 -48.  -48.  -48.  -49.5 -51.5 -51.5 -51.5 -51.5 -52.  -53.  -52.  -52.
 -52.  -52.5 -53.5 -53.5 -54.  -55.  -55.  -54.5 -55.  -56.  -55.  -55.
 -54.5 -54.5 -54.5 -57.  -60.  -31.  -33.  -35.  -35.5 -36.  -37.  -38.
 -39.5 -38.  -38.  -39.5 -41.  -45.  -46.  -46.  -46.  -43.5 -43.5 -43.5
 -44.  -43.5 -43.  -43.  -43.5 -44.  -44.  -44.  -44.  -45.  -46.5 -47.
 -47.5 -48.  -48.5 -48.5 -48.5 -48.5 -48.  -48.  -48.  -48.  -48.  -48.
 -48.  -48.  -48.  -48.  -48.  -48.  -48.  -48.  -49.5 -51.5 -51.5 -51.5
 -51.5 -52.  -53.  -52.  -52.  -52.  -52.5 -53.5 -53.5 -54.  -55.  -55.
 -54.5 -55.  -56.  -55.  -55.  -54.5 -54.5 -54.5 -57.  -60. ]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2