# Linear Regression 1D

<p>In this exercise, a random distribution of points is generated and approximated by a straight line. Initially, the line's parameters theta0 (y-intercept) and theta1 (slope) are set to random values. They are then adjusted using gradient descent.</p>
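
<p>The procedure described above can be sketched as follows. This is a minimal illustration, assuming a mean-squared-error loss; the names <code>fit_line</code>, <code>alpha</code> and <code>iterations</code> as well as the sample data are hypothetical and not taken from the notebook.</p>

```python
import numpy as np

def fit_line(x, y, alpha=0.1, iterations=2000, seed=0):
    """Fit y = theta0 + theta1 * x with batch gradient descent (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    theta0, theta1 = rng.normal(size=2)      # random starting parameters
    for _ in range(iterations):
        residual = theta0 + theta1 * x - y   # prediction minus target
        grad0 = 2.0 * np.mean(residual)      # d(MSE)/d(theta0)
        grad1 = 2.0 * np.mean(residual * x)  # d(MSE)/d(theta1)
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Noise-free sample data on a known line (assumed values for illustration)
x = np.linspace(0.0, 1.0, 50)
y = 1.5 + 2.0 * x
theta0, theta1 = fit_line(x, y)
```

<p>On noise-free data like this, the fitted parameters should approach the intercept and slope used to generate the points.</p>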

<p>You can <a href="01_LinearRegression_1D_Vorlage.ipynb" target="_blank">download</a> the Jupyter notebook shown here and run it on your own computer.</p>

<hr />

<h2>Introduction</h2>

<p>First, numpy is imported for working with matrices and matplotlib for plotting them. Then NumPy's random number generator is seeded and the number of points to be generated later is set.</p>
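
<p>Seeding and point generation might look like the sketch below. The helper <code>make_points</code>, the seed value, the number of points and the line used to generate the data are all assumptions made for illustration.</p>

```python
import numpy as np

def make_points(seed=42, num_points=100):
    """Generate reproducible noisy points around a line (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # seeded generator gives repeatable draws
    x = rng.uniform(0.0, 1.0, size=num_points)
    # assumed ground-truth line plus Gaussian noise
    y = 0.5 + 2.0 * x + rng.normal(scale=0.1, size=num_points)
    return x, y

x, y = make_points()
```

<p>Calling <code>make_points</code> twice with the same seed yields identical arrays, which keeps runs of the notebook comparable.</p>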


In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas
from datetime import datetime

# enable interactive plots
%matplotlib notebook

In [2]:
def rssiToDistance(rssi):
    # Log-distance model; reads the globals t, tx_power and k at call time
    return np.power(10, (t - tx_power - rssi) / k)

def calc_error(y, prediction):
    d = y - prediction
    d = np.square(d)
    e = np.sum(d)
    return e

def get_file(path):
    return pandas.read_csv(path,sep=';')[['time_ms','rssi_median', 'distance_median']]
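
<p>Because <code>rssiToDistance</code> depends on the global variables <code>t</code>, <code>tx_power</code> and <code>k</code>, its behaviour can be easier to inspect in a variant that takes them as arguments. The function <code>rssi_to_distance</code> below is a hypothetical sketch of such a variant, not part of the notebook; the example values mirror the starting values the notebook uses.</p>

```python
import numpy as np

def rssi_to_distance(rssi, t, tx_power, k):
    """Same formula as rssiToDistance, with explicit parameters (hypothetical variant)."""
    rssi = np.asarray(rssi, dtype=float)
    return np.power(10.0, (t - tx_power - rssi) / k)

# With t = -56, tx_power = 25 and k = 30, the exponent is zero at rssi = -81,
# so the model returns a distance of exactly 1 there.
distances = rssi_to_distance([-81.0, -91.0], t=-56, tx_power=25, k=30)
```

<p>For positive <code>k</code>, a weaker signal (more negative RSSI) maps to a larger distance, which is the behaviour the fit relies on.</p>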

In [3]:
# each entry is a pair: [measurement DataFrame, tx_power]
all_measures = [[get_file('d1.csv'), 25], [get_file('d2.csv'), 25]]

k = 30 # env_factor
t = -56 # base_power
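
<p>The training cell updates <code>t</code> and <code>k</code> with the partial derivatives of the summed squared error from <code>calc_error</code>. The derivation below sketches where those expressions come from. The symbols are notation introduced here: r_i is the measured RSSI, P is <code>tx_power</code>, y_i is the reference distance and d_i is the predicted distance.</p>

$$
\begin{aligned}
d_i(t, k) &= 10^{(t - P - r_i)/k}, \qquad E(t, k) = \sum_i \left(d_i - y_i\right)^2 \\
\frac{\partial d_i}{\partial t} &= \frac{\ln 10}{k}\, d_i, \qquad
\frac{\partial d_i}{\partial k} = -\frac{\ln 10\,(t - P - r_i)}{k^2}\, d_i \\
\frac{\partial E}{\partial t} &= \sum_i \frac{2 \ln 10 \; d_i \left(d_i - y_i\right)}{k} \\
\frac{\partial E}{\partial k} &= \sum_i \frac{2 \ln 10 \,\left(-t + r_i + P\right)\, d_i \left(d_i - y_i\right)}{k^2}
\end{aligned}
$$

<p>Each gradient descent step then subtracts <code>alpha</code> times the corresponding derivative from <code>t</code> and <code>k</code>.</p>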

In [4]:
for df in all_measures:
    
    # Convert timestamps to elapsed time and derive the reference distance from it
    times = df[0]['time_ms'].to_numpy()
    first_time = datetime.strptime(times[0], '%Y-%m-%d %H:%M:%S')
    actual_dist = []

    for idx,item in enumerate(times):
        current_time = datetime.strptime(item, '%Y-%m-%d %H:%M:%S')
        times[idx] = str(current_time - first_time)
        actual_dist.append(((current_time - first_time).seconds//60)+1)
    
    rssi = df[0]['rssi_median'].to_numpy() #get rssi
    tx_power = df[1] #get tx_power
    
    predictions = []

    # learning rate, scaled by the number of samples
    alpha = 0.5 / len(rssi)


    # train for 200 iterations
    for i in range(200):


        # prediction with input data
        dist_pred = rssiToDistance(rssi)


        # partial derivatives of the squared error with respect to t and k
        residual = dist_pred - actual_dist
        t_deri_sum = np.sum(2*np.log(10)*dist_pred*residual/k)
        k_deri_sum = np.sum(2*np.log(10)*(-t+rssi+tx_power)*dist_pred*residual/np.power(k,2))

        # update the parameters t and k
        t = t - alpha * t_deri_sum
        k = k - alpha * k_deri_sum

        # print error
        #print("{:2d} error: {:5.3f}".format(i, calc_error(actual_dist, dist_pred)))


        # remember the current predictions 
        predictions.append(dist_pred)
        
    # print error
    print("{:2d} error: {:5.3f}".format(i, calc_error(actual_dist, rssiToDistance(rssi))))


    df[0]['distance_median'] = rssiToDistance(rssi)
    df[0]['actual_distance'] = actual_dist
    df[0]['times_ms'] = times
    
    
    print(df[0].iloc[:, :-1].to_csv(index=False))

[-56.  -58.  -60.  -60.5 -61.  -62.  -63.  -64.5 -63.  -63.  -64.5 -66.
 -70.  -71.  -71.  -71.  -68.5 -68.5 -68.5 -69.  -68.5 -68.  -68.  -68.5
 -69.  -69.  -69.  -69.  -70.  -71.5 -72.  -72.5 -73.  -73.5 -73.5 -73.5
 -73.5 -73.  -73.  -73.  -73.  -73.  -73.  -73.  -73.  -73.  -73.  -73.
 -73.  -73.  -73.  -74.5 -76.5 -76.5 -76.5 -76.5 -77.  -78.  -77.  -77.
 -77.  -77.5 -78.5 -78.5 -79.  -80.  -80.  -79.5 -80.  -81.  -80.  -80.
 -79.5 -79.5 -79.5 -82.  -85.  -85. ]
199 error: 13.851
time_ms,rssi_median,distance_median,actual_distance
0:00:00,-56.0,0.4297552456892586,1
0:00:01,-58.0,0.5041696368941883,1
0:00:05,-60.0,0.5914692730704016,1
0:00:05,-60.5,0.6155609525835106,1
0:00:09,-61.0,0.6406339324755039,1
0:00:09,-62.0,0.6938853024579316,1
0:00:13,-63.0,0.7515630823778531,1
0:00:14,-64.5,0.847192425938067,1
0:00:17,-63.0,0.7515630823778531,1
0:00:18,-63.0,0.7515630823778531,1
0:00:21,-64.5,0.847192425938067,1
0:00:22,-66.0,0.9549897053165545,1
0:00:25,-70.0,1.31434595030