# **Homework 1: Linear Regression**

# **Load 'train.csv'**
train.csv covers 12 months, with 20 days taken from each month and 24 hours of data per day (each hour has 18 features).

In [1]:
import sys
import pandas as pd
import numpy as np
data = pd.read_csv('./../data/train.csv', encoding = 'big5')

# **Preprocessing** 

In [2]:
data = data.iloc[:, 3:]
data[data == 'NR'] = 0
raw_data = data.to_numpy()
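The cell above keeps only the hourly columns and turns every `NR` entry into 0. A minimal sketch of the same step on a toy frame follows. The column names and values are made up for illustration, and `DataFrame.replace` followed by an explicit `astype(float)` is one possible alternative to the boolean-mask assignment, not the notebook's own method.

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the layout: three metadata columns, then hourly readings.
# All names and values here are hypothetical.
toy = pd.DataFrame({
    "date": ["2014/1/1", "2014/1/1"],
    "station": ["A", "A"],
    "item": ["PM2.5", "RAINFALL"],
    "0": ["26", "NR"],
    "1": ["39", "0.2"],
})

readings = toy.iloc[:, 3:]                    # keep only the hourly columns
readings = readings.replace("NR", "0")        # every 'NR' entry becomes "0"
toy_raw = readings.to_numpy().astype(float)   # force a numeric float array

print(toy_raw)
```

Converting to `float` explicitly makes a stray non-numeric string fail loudly here instead of later inside the training loop.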

# **Extract Features (1)**

Regroup the original 4320 * 18 data by month into 12 arrays of shape 18 (features) * 480 (hours).

In [3]:
month_data = {}
for month in range(12):
    sample = np.empty([18, 480])
    for day in range(20):
        sample[:, day * 24 : (day + 1) * 24] = raw_data[18 * (20 * month + day) : 18 * (20 * month + day + 1), :]
    month_data[month] = sample
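The index arithmetic in the loop above can be checked on a small synthetic array. The sketch below uses toy sizes (2 months of 3 days, not the real 12 and 20) and compares the loop against a reshape and transpose that should produce the same layout. The variable names are chosen here for illustration.

```python
import numpy as np

n_months, n_days, n_feats, n_hours = 2, 3, 18, 24   # toy sizes, not the real 12 x 20
rng = np.random.default_rng(0)
toy_raw = rng.random((n_months * n_days * n_feats, n_hours))

# Loop version, same indexing as the cell above.
loop_result = {}
for month in range(n_months):
    sample = np.empty([n_feats, n_days * n_hours])
    for day in range(n_days):
        start = n_feats * (n_days * month + day)
        sample[:, day * n_hours:(day + 1) * n_hours] = toy_raw[start:start + n_feats, :]
    loop_result[month] = sample

# Reshape version: (month, day, feature, hour) -> (month, feature, day * hour).
vectorized = (toy_raw.reshape(n_months, n_days, n_feats, n_hours)
              .transpose(0, 2, 1, 3)
              .reshape(n_months, n_feats, n_days * n_hours))

same = all(np.array_equal(loop_result[m], vectorized[m]) for m in range(n_months))
print(same)
```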

In [4]:
items = np.empty([18, 480*12])
for i in range(18):
    for m in range(12):
        if m == 0:
            temp = month_data[m][i,:]
        else:
            temp = np.concatenate((temp, month_data[m][i,:]), axis = -1)
    items[i] = temp

In [5]:
cor = np.corrcoef(items)[9]
cor

array([-0.01712724,  0.25465706,  0.28311942,  0.29177826,  0.02997038,
        0.44911349,  0.37556381,  0.35667002,  0.77642643,  1.        ,
       -0.06265388, -0.26419607,  0.3708308 ,  0.3521594 ,  0.18613794,
        0.15699025, -0.08470312, -0.04545785])

In [13]:
items_bool = np.copy(cor)
# items_bool[items_bool == 1] = 0
items_bool[items_bool > 0.3] = 1
items_bool[items_bool <= 0.3] = 0
items_bool

array([0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 0., 0., 1., 1., 0., 0., 0.,
       0.])

In [14]:
new_month_data = {}
for month in range(12):
    j = 0
    sample = np.empty([int(np.sum(items_bool)), 480])
    for i in range(18):
        if items_bool[i] == 1:
            sample[j,:] = month_data[month][i,:]
            j += 1
    new_month_data[month] = sample
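The selection above keeps the rows whose correlation with row 9 (the row later used as the target) exceeds 0.3. A minimal sketch of the same idea with a boolean mask, on deterministic toy data, could look like this. Row 0 plays the role of the target and all names are hypothetical.

```python
import numpy as np

hours = np.arange(100, dtype=float)
wiggle = np.tile([1.0, -1.0], 50)   # alternating +1 / -1

toy_items = np.stack([
    hours,            # row 0: plays the role of the target feature
    hours + wiggle,   # row 1: almost identical to the target
    wiggle,           # row 2: nearly uncorrelated with the target
    -hours,           # row 3: perfectly anti-correlated with the target
])

cor = np.corrcoef(toy_items)[0]   # correlation of every row with row 0
mask = cor > 0.3                  # boolean mask instead of a 0/1 float array
selected = toy_items[mask]        # keeps only the rows that pass the threshold

print(mask, selected.shape)
```

A threshold on the signed correlation drops strongly anti-correlated rows such as row 3. If those should be kept, `np.abs(cor) > 0.3` is one option, though that is a design choice the notebook does not make.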

# **Extract Features (2)**

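Each month has 480 hours. Every 9 consecutive hours form one sample, so each month yields 471 samples and the total is 471 * 12 samples. With all 18 features, each sample would have 9 * 18 features (18 features per hour * 9 hours); with the 7 features selected above, each sample has 9 * 7 = 63.

There are 471 * 12 corresponding targets (the PM2.5 value of the 10th hour).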

In [15]:
x = np.empty([12 * 471, int(np.sum(items_bool)) * 9], dtype = float)
y = np.empty([12 * 471, 1], dtype = float)
for month in range(12):
    for day in range(20):
        for hour in range(24):
            if day == 19 and hour > 14:
                continue
            x[month * 471 + day * 24 + hour, :] = new_month_data[month][:,day * 24 + hour : day * 24 + hour + 9].reshape(1, -1) # flattened feature-major: 9 hours of the first selected feature, then 9 of the next, ...
            y[month * 471 + day * 24 + hour, 0] = month_data[month][9, day * 24 + hour + 9] #value
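The windowing above can be traced on a toy month. The sketch below assumes 3 features and 20 hours (the notebook has 480 hours per month) and a hypothetical target row. It shows why a month of `n_hours` hours yields `n_hours - 9` samples, which gives 480 - 9 = 471 in the notebook.

```python
import numpy as np

n_feats, n_hours, window = 3, 20, 9    # toy sizes; the notebook uses 480 hours per month
month = np.arange(n_feats * n_hours, dtype=float).reshape(n_feats, n_hours)
target_row = 1                         # hypothetical index of the target feature

n_samples = n_hours - window           # every sample needs one following hour as its target
x_toy = np.empty([n_samples, n_feats * window])
y_toy = np.empty([n_samples, 1])
for start in range(n_samples):
    x_toy[start] = month[:, start:start + window].reshape(-1)   # feature-major flattening
    y_toy[start, 0] = month[target_row, start + window]         # value in the hour after the window

print(x_toy.shape, y_toy.shape)
```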

# **Normalize (1)**


In [16]:
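mean_x = np.mean(x, axis = 0) # one mean per column (number of selected features * 9 columns)
std_x = np.std(x, axis = 0) # one standard deviation per column
for i in range(len(x)): # 12 * 471 samples
    for j in range(len(x[0])): # number of selected features * 9 columns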
        if std_x[j] != 0:
            x[i][j] = (x[i][j] - mean_x[j]) / std_x[j]
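The nested loop above standardizes each column and leaves columns with zero standard deviation untouched. A vectorized sketch with the same behavior, on a tiny made-up matrix, might look like this (`safe_std` is a name introduced here):

```python
import numpy as np

x_toy = np.array([[1.0, 5.0, 2.0],
                  [3.0, 5.0, 4.0],
                  [5.0, 5.0, 6.0]])   # the middle column is constant on purpose

mean = x_toy.mean(axis=0)
std = x_toy.std(axis=0)
safe_std = np.where(std == 0, 1.0, std)   # placeholder divisor for constant columns

# Standardize where std != 0; keep the original values elsewhere, as the loop does.
x_norm = np.where(std != 0, (x_toy - mean) / safe_std, x_toy)

print(x_norm)
```

The same `mean` and `std` computed on the training data have to be reused for the test data, which is what the testing cell in this notebook does with `mean_x` and `std_x`.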

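# **Split Training Data Into "train_set" and "validation_set"**
This part is a simple demonstration for Questions 2 and 3 of the homework report. It produces the train_set used for training and the validation_set, which is never used for training and serves only for validation.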

In [17]:
import math
x_train_set = x[: math.floor(len(x) * 0.8), :]
y_train_set = y[: math.floor(len(y) * 0.8), :]
x_validation = x[math.floor(len(x) * 0.8): , :]
y_validation = y[math.floor(len(y) * 0.8): , :]
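The split above is sequential: the first 80% of the rows go to training and the last 20% to validation. Since the rows are ordered by month, the validation set comes from the final months. If a random split is preferred, which is an assumption here and not what this notebook does, a seeded permutation is one way to sketch it. All names and data below are toy stand-ins.

```python
import math
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the split is reproducible
x_toy = np.arange(20, dtype=float).reshape(10, 2)
y_toy = np.arange(10, dtype=float).reshape(10, 1)

order = rng.permutation(len(x_toy))      # shuffled row indices
cut = math.floor(len(x_toy) * 0.8)
train_idx, val_idx = order[:cut], order[cut:]

x_train_toy, y_train_toy = x_toy[train_idx], y_toy[train_idx]
x_val_toy, y_val_toy = x_toy[val_idx], y_toy[val_idx]

print(x_train_toy.shape, x_val_toy.shape)
```

One caveat: consecutive samples in this notebook share 8 of their 9 hours, so a random split places near-duplicates of training rows in the validation set and can make the validation loss look better than it is.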

# **Training**

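Because of the constant (bias) term, the dimension (dim) needs one extra column. The eps term is a tiny value added to keep the Adagrad denominator from being 0.

Each dimension (dim) has its own gradient and weight (w), which are learned over repeated iterations (iter_time).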
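Written out, the update that the training cell below performs is roughly the following. The symbols are chosen here for illustration: $X$ is the design matrix including the column of ones, $g_t$ is the gradient of the summed squared error, $\eta$ is `learning_rate`, $\epsilon$ is `eps`, and the square, square root, and division are element-wise.

$$
g_t = 2 X^\top (X w_t - y), \qquad
G_t = \sum_{s=0}^{t} g_s^{2}, \qquad
w_{t+1} = w_t - \frac{\eta \, g_t}{\sqrt{G_t + \epsilon}}
$$

On the very first step $G_0 = g_0^2$, so every weight with a nonzero gradient moves by almost exactly $\eta$ against the sign of its gradient.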

In [18]:
dim = int(np.sum(items_bool) * 9 + 1)
w = np.zeros([dim, 1])

train_x = np.concatenate((np.ones([x_train_set.shape[0], 1]), x_train_set), axis = 1).astype(float)
validation_x = np.concatenate((np.ones([x_validation.shape[0], 1]), x_validation), axis = 1).astype(float)

learning_rate = 1.5
iter_time = 20000
early_stopping_iter = 250
adagrad = np.zeros([dim, 1])
eps = 0.0000000001

temp_loss = np.inf
temp_iter = 0

for t in range(iter_time):
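    # Both losses divide by 471 * 12 (the total sample count) rather than by the size of each split,
    # so they are scaled RMSE values: fine for tracking progress, but not directly comparable to each other.
    loss = np.sqrt(np.sum(np.power(np.dot(train_x, w) - y_train_set, 2))/471/12) # scaled RMSE on the training split
    validation_loss = np.sqrt(np.sum(np.power(np.dot(validation_x, w) - y_validation, 2))/471/12) # scaled RMSE on the validation split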
    if validation_loss < temp_loss:
        temp_w = np.copy(w)
        temp_loss = np.copy(validation_loss)
        temp_iter = 0
    else:
        if temp_iter < early_stopping_iter:
            temp_iter += 1
        else:
            print("early stopping at iter", t)
            w = temp_w
            break
    if(t%100==0):
        print("training " + str(t) + ":" + str(loss), "validation " + str(t) + ":" + str(validation_loss))
    gradient = 2 * np.dot(train_x.transpose(), np.dot(train_x, w) - y_train_set) #dim*1
    adagrad += gradient ** 2
    w = w - learning_rate * gradient / np.sqrt(adagrad + eps)
# np.save('./../model/weight4.npy', w)
# w

training 0:24.36221492236629 validation 0:11.803946645288281
training 100:6.493453344161135 validation 100:3.3468758972785735
training 200:5.674992539386751 validation 200:2.8520444021738234
training 300:5.507788589140977 validation 300:2.744473739271659
training 400:5.4244335699907085 validation 400:2.6809801578842714
training 500:5.371104358853002 validation 500:2.6374906797457984
training 600:5.333824623688139 validation 600:2.6067259987987867
training 700:5.306369691910539 validation 700:2.5843520583358406
training 800:5.2854338373260195 validation 800:2.5676825640398566
training 900:5.269081594459484 validation 900:2.555018261016676
training 1000:5.2560923448562695 validation 1000:2.5452479229049705
training 1100:5.245648938371093 validation 1100:2.537619011885057
training 1200:5.237177624965252 validation 1200:2.531605610146172
training 1300:5.230260176619787 validation 1300:2.5268302104783453
training 1400:5.224582738944069 validation 1400:2.523015714172004
training 1500:5.21990
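The training loop above combines Adagrad with early stopping on the validation loss. A self-contained sketch of that pattern on synthetic data is given below. The function names, default arguments, and data are hypothetical, and unlike the cell above, `rmse` here divides by the number of rows in each split, so its values are not on the same scale as the logged numbers.

```python
import numpy as np

def rmse(x, y, w):
    """Root-mean-square error, averaged over the rows of x."""
    return float(np.sqrt(np.mean((x @ w - y) ** 2)))

def train_adagrad(train_x, train_y, val_x, val_y, lr, max_iter=2000, patience=250, eps=1e-10):
    """Adagrad with early stopping on validation RMSE; returns the best weights and loss."""
    w = np.zeros([train_x.shape[1], 1])
    accum = np.zeros_like(w)              # running sum of squared gradients
    best_w, best_loss, wait = w.copy(), np.inf, 0
    for _ in range(max_iter):
        val_loss = rmse(val_x, val_y, w)
        if val_loss < best_loss:
            best_w, best_loss, wait = w.copy(), val_loss, 0
        else:
            wait += 1
            if wait > patience:           # too many checks in a row without improvement
                break
        gradient = 2 * train_x.T @ (train_x @ w - train_y)
        accum += gradient ** 2
        w = w - lr * gradient / np.sqrt(accum + eps)
    return best_w, best_loss

# Synthetic data: y = 1 + 2 * x0 - 3 * x1 plus a little noise (made up for the demo).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
design = np.concatenate((np.ones([200, 1]), features), axis=1)
targets = design @ np.array([[1.0], [2.0], [-3.0]]) + 0.01 * rng.normal(size=(200, 1))

best_w, best_loss = train_adagrad(design[:160], targets[:160], design[160:], targets[160:], lr=1.0)
print(best_w.ravel(), best_loss)
```

Wrapping the loop in a function also avoids copying the whole loop for every learning-rate sweep.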

# **Testing**
![alt text](https://drive.google.com/uc?id=1165ETzZyE6HStqKvgR0gKrJwgFLK6-CW)

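Load the test data and process it in a way similar to the preprocessing and feature extraction used for the training data, so that the test data forms 240 samples. With all 18 features each sample would have dimension 18 * 9 + 1; with the 7 selected features it is 7 * 9 + 1 = 64.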

In [19]:
testdata = pd.read_csv('./../data/test.csv', header = None, encoding = 'big5')
test_data = testdata.iloc[:, 2:].copy() # copy so the assignment below does not write into a slice of testdata
test_data[test_data == 'NR'] = 0
test_data = test_data.to_numpy()
test_x = np.empty([240, int(np.sum(items_bool)*9)], dtype = float)
k = 0
for i in range(18):
    if items_bool[i] == 1:
        for j in range(240):
            test_x[j, k*9 : (k+1)*9] = test_data[18 * j + i, :]
        k += 1
for i in range(len(test_x)):
    for j in range(len(test_x[0])):
        if std_x[j] != 0:
            test_x[i][j] = (test_x[i][j] - mean_x[j]) / std_x[j]
test_x = np.concatenate((np.ones([240, 1]), test_x), axis = 1).astype(float)
# test_x
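The feature extraction for the test data can be cross-checked on a toy array. The sketch below uses 4 samples instead of 240 and a hypothetical feature mask. It compares the loop indexing used above with a reshape that should give the same feature-major layout as the training features.

```python
import numpy as np

n_samples, n_feats, window = 4, 18, 9    # toy sizes; the notebook has 240 test samples
rng = np.random.default_rng(0)
toy_test = rng.random((n_samples * n_feats, window))
mask = np.zeros(n_feats, dtype=bool)
mask[[5, 8, 9]] = True                   # hypothetical selection, not the notebook's

# Loop version, same indexing as the cell above.
loop_x = np.empty([n_samples, int(mask.sum()) * window])
k = 0
for i in range(n_feats):
    if mask[i]:
        for j in range(n_samples):
            loop_x[j, k * window:(k + 1) * window] = toy_test[n_feats * j + i, :]
        k += 1

# Reshape version: (sample, feature, hour) -> keep selected features -> flatten.
vec_x = toy_test.reshape(n_samples, n_feats, window)[:, mask, :].reshape(n_samples, -1)

print(np.array_equal(loop_x, vec_x))
```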


# **Prediction**
The explanatory figure is the same as above.

![alt text](https://drive.google.com/uc?id=1165ETzZyE6HStqKvgR0gKrJwgFLK6-CW)

With the weights and the test data, the target can be predicted.

In [20]:
# w = np.load('./../model/weight.npy')
ans_y = np.dot(test_x, w)
ans_y

array([[  6.82975691],
       [ 17.76852395],
       [ 23.40144313],
       [  6.66709447],
       [ 26.63020263],
       [ 21.89300618],
       [ 24.30005025],
       [ 29.22948677],
       [ 17.37058281],
       [ 58.54494234],
       [ 13.10911139],
       [ 11.04098904],
       [ 62.40551913],
       [ 52.14296428],
       [ 22.03875111],
       [ 11.44868447],
       [ 32.76744294],
       [ 67.70890778],
       [  1.83830453],
       [ 17.04073663],
       [ 41.76233972],
       [ 71.80365825],
       [  9.57207555],
       [ 18.68183015],
       [ 13.38175287],
       [ 37.30683375],
       [ 13.01843992],
       [ 74.46157981],
       [  7.46458492],
       [ 56.06918693],
       [ 23.78328306],
       [  8.21451732],
       [  3.12748865],
       [ 19.05679879],
       [ 28.70185589],
       [ 37.62004885],
       [ 43.07562382],
       [ 30.85712044],
       [ 42.22687575],
       [ 34.93079092],
       [  8.18479243],
       [ 39.33054737],
       [ 32.77027141],
       [ 51

# **Save Prediction to CSV File**


In [21]:
# import csv
# with open('./../results/submit.csv', mode='w', newline='') as submit_file:
#     csv_writer = csv.writer(submit_file)
#     header = ['id', 'value']
#     print(header)
#     csv_writer.writerow(header)
#     for i in range(240):
#         row = ['id_' + str(i), ans_y[i][0]]
#         csv_writer.writerow(row)
#         print(row)
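The commented-out cell above writes an `id,value` file row by row with the `csv` module. As an alternative sketch, and not the notebook's own method, the same table can be built with pandas. The prediction values below are made up and the output goes to an in-memory buffer instead of a real path.

```python
import io
import numpy as np
import pandas as pd

toy_ans = np.array([[6.8], [17.7], [23.4]])   # stand-in for ans_y, values made up

submission = pd.DataFrame({
    "id": ["id_" + str(i) for i in range(len(toy_ans))],
    "value": toy_ans[:, 0],
})

buffer = io.StringIO()                  # swap for a file path to write to disk
submission.to_csv(buffer, index=False)  # index=False keeps the file to two columns
csv_text = buffer.getvalue()
print(csv_text)
```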

---
# Homework1 report 

## Problem1

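### Train with four different learning rates (all other parameters must stay the same), then plot and discuss the convergence (x-axis: number of iterations, y-axis: loss; draw the four convergence curves in different colors on a single figure for comparison).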

In [22]:
import matplotlib.pyplot as plt

In [24]:
learning_rate_list = [10**power for power in range(-1,3)]
loss_history = []
validation_loss_history = []
best_validation = []

dim = int(np.sum(items_bool) * 9 + 1)
iter_time = 100000

early_stopping_iter = 250
eps = 0.0000000001

for lr in learning_rate_list:
    print("Use learning rate:", lr)
    history = []
    validation_history = []
    w = np.zeros([dim, 1])
    learning_rate = lr
    adagrad = np.zeros([dim, 1])
    temp_loss = np.inf
    temp_iter = 0
    for t in range(iter_time):
        loss = np.sqrt(np.sum(np.power(np.dot(train_x, w) - y_train_set, 2))/471/12)#rmse
        history.append(loss)
        validation_loss = np.sqrt(np.sum(np.power(np.dot(validation_x, w) - y_validation, 2))/471/12)#rmse
        validation_history.append(validation_loss)
        if validation_loss < temp_loss:
            temp_w = np.copy(w)
            temp_loss = np.copy(validation_loss)
            temp_iter = 0
        else:
            if temp_iter < early_stopping_iter:
                temp_iter += 1
            else:
                print("early stopping at iter", t-temp_iter)
                best_validation.append((lr,temp_loss))
                w = temp_w
                break
        if(t%100==0):
            print("training " + str(t) + ":" + str(loss), "validation " + str(t) + ":" + str(validation_loss))
        gradient = 2 * np.dot(train_x.transpose(), np.dot(train_x, w) - y_train_set) #dim*1
        adagrad += gradient ** 2
        w = w - learning_rate * gradient / np.sqrt(adagrad + eps)
    loss_history.append(np.array(history))
    validation_loss_history.append(np.array(validation_history))

Use learning rate: 0.1
training 0:24.36221492236629 validation 0:11.803946645288281
training 100:19.160932572549996 validation 100:10.354122022619407
training 200:18.320721973030576 validation 200:10.115329330447938
training 300:17.72373186683206 validation 300:9.883045617898041
training 400:17.24428875630067 validation 400:9.67187610878531
training 500:16.83553896675529 validation 500:9.476990970286032
training 600:16.47477708331891 validation 600:9.29515377592024
training 700:16.14932634207709 validation 700:9.12451502306083
training 800:15.851332233783229 validation 800:8.963827798456059
training 900:15.575527821936078 validation 900:8.81209085492047
training 1000:15.318165884922138 validation 1000:8.66842528491573
training 1100:15.07646004419237 validation 1100:8.53204031949323
training 1200:14.8482679760517 validation 1200:8.402225864349752
training 1300:14.631899165324779 validation 1300:8.278349806748114
training 1400:14.42599148868811 validation 1400:8.159854062336034
training 

training 12400:6.772758168697487 validation 12400:3.4959943257781716
training 12500:6.748719607047353 validation 12500:3.4805407087708042
training 12600:6.725036808526068 validation 12600:3.465311289729609
training 12700:6.701704645336145 validation 12700:3.4503030726981696
training 12800:6.678718070339984 validation 12800:3.43551310474902
training 12900:6.656072115160954 validation 12900:3.420938474931805
training 13000:6.633761888357204 validation 13000:3.406576313264139
training 13100:6.611782573664992 validation 13100:3.392423789763351
training 13200:6.590129428308223 validation 13200:3.378478113517245
training 13300:6.568797781371105 validation 13300:3.3647365317921283
training 13400:6.547783032231181 validation 13400:3.351196329176543
training 13500:6.527080649049876 validation 13500:3.337854826759078
training 13600:6.506686167317957 validation 13600:3.3247093813387703
training 13700:6.486595188453387 validation 13700:3.3117573846666475
training 13800:6.4668033784492875 validatio

training 24600:5.430334179410326 validation 24600:2.637484110256648
training 24700:5.426699322594682 validation 24700:2.6352851839862192
training 24800:5.423121880679522 validation 24800:2.6331238794366194
training 24900:5.419600955034202 validation 24900:2.630999586275896
training 25000:5.416135660545218 validation 25000:2.6289117032068128
training 25100:5.412725125430595 validation 25100:2.6268596378555786
training 25200:5.409368491056315 validation 25200:2.6248428066613525
training 25300:5.406064911754858 validation 25300:2.622860634766559
training 25400:5.40281355464577 validation 25400:2.620912555908003
training 25500:5.399613599458269 validation 25500:2.618998012308782
training 25600:5.3964642383559065 validation 25600:2.61711645457104
training 25700:5.393364675763206 validation 25700:2.6152673415695307
training 25800:5.390314128194329 validation 25800:2.613450140346016
training 25900:5.38731182408372 validation 25900:2.6116643260045143
training 26000:5.384357003618706 validation

training 36800:5.233074767920486 validation 36800:2.5264506169289804
training 36900:5.232546859183382 validation 36900:2.526203940062195
training 37000:5.232027061889515 validation 37000:2.5259617365620244
training 37100:5.231515247012033 validation 37100:2.5257239249772607
training 37200:5.23101128760614 validation 37200:2.5254904252786643
training 37300:5.230515058775604 validation 37300:2.525261158835174
training 37400:5.230026437639786 validation 37400:2.5250360483905
training 37500:5.229545303301178 validation 37500:2.5248150180400843
training 37600:5.229071536813461 validation 37600:2.5245979932084244
training 37700:5.228605021150075 validation 37700:2.524384900626767
training 37800:5.228145641173268 validation 37800:2.524175668311145
training 37900:5.227693283603639 validation 37900:2.5239702255407774
training 38000:5.227247836990166 validation 38000:2.5237685028368024
training 38100:5.2268091916807125 validation 38100:2.5235704319413594
training 38200:5.226377239792983 validati

training 48900:5.20356982293351 validation 48900:2.514132780048232
training 49000:5.203481909908598 validation 49000:2.5141011535816973
training 49100:5.20339521813651 validation 49100:2.5140699579586294
training 49200:5.20330972931812 validation 49200:2.5140391844684764
training 49300:5.20322542544219 validation 49300:2.51400882456954
training 49400:5.203142288780721 validation 49400:2.5139788698859085
training 49500:5.203060301884368 validation 49500:2.513949312204445
training 49600:5.202979447577942 validation 49600:2.51392014347182
training 49700:5.202899708955977 validation 49700:2.513891355791607
training 49800:5.2028210693783725 validation 49800:2.5138629414214217
training 49900:5.202743512466114 validation 49900:2.5138348927701104
training 50000:5.202667022097048 validation 50000:2.513807202394994
training 50100:5.202591582401743 validation 50100:2.5137798629991526
training 50200:5.202517177759407 validation 50200:2.513752867428764
training 50300:5.202443792793874 validation 50

training 61000:5.198278437091838 validation 61000:2.5119095529871673
training 61100:5.198260193035598 validation 61100:2.5118972975998073
training 61200:5.198242166049169 validation 61200:2.511885080829325
training 61300:5.198224353257962 validation 61300:2.511872902070932
training 61400:5.198206751829016 validation 61400:2.5118607607355785
training 61500:5.198189358970348 validation 61500:2.51184865624961
training 61600:5.198172171930333 validation 61600:2.51183658805444
training 61700:5.198155187997074 validation 61700:2.511824555606224
training 61800:5.1981384044977945 validation 61800:2.511812558375541
training 61900:5.198121818798238 validation 61900:2.5118005958470793
training 62000:5.198105428302073 validation 62000:2.511788667519335
training 62100:5.198089230450312 validation 62100:2.511776772904305
training 62200:5.198073222720738 validation 62200:2.5117649115271945
training 62300:5.198057402627342 validation 62300:2.5117530829261274
training 62400:5.198041767719767 validation

training 73300:5.197066526456236 validation 73300:2.510599554894608
training 73400:5.19706184990889 validation 73400:2.5105902195914944
training 73500:5.197057221153621 validation 73500:2.5105809043818823
training 73600:5.197052639653318 validation 73600:2.5105716092810986
training 73700:5.197048104877602 validation 73700:2.510562334304704
training 73800:5.19704361630273 validation 73800:2.5105530794684756
training 73900:5.197039173411496 validation 73900:2.5105438447883777
training 74000:5.1970347756931545 validation 74000:2.5105346302805462
training 74100:5.197030422643319 validation 74100:2.5105254359612617
training 74200:5.197026113763879 validation 74200:2.510516261846934
training 74300:5.197021848562917 validation 74300:2.5105071079540764
training 74400:5.197017626554615 validation 74400:2.5104979742992906
training 74500:5.19701344725918 validation 74500:2.510488860899244
training 74600:5.197009310202754 validation 74600:2.5104797677706556
training 74700:5.197005214917335 validat

training 85500:5.196734389323034 validation 85500:2.509612583118817
training 85600:5.196732966283381 validation 85600:2.5096057677428645
training 85700:5.196731556472238 validation 85700:2.5095989727576886
training 85800:5.196730159759618 validation 85800:2.5095921981432654
training 85900:5.1967287760169185 validation 85900:2.5095854438792244
training 86000:5.196727405116908 validation 86000:2.5095787099448583
training 86100:5.196726046933711 validation 86100:2.509571996319122
training 86200:5.196724701342789 validation 86200:2.5095653029806337
training 86300:5.196723368220925 validation 86300:2.5095586299076813
training 86400:5.196722047446211 validation 86400:2.509551977078225
training 86500:5.19672073889803 validation 86500:2.509545344469894
training 86600:5.196719442457039 validation 86600:2.5095387320599962
training 86700:5.196718158005161 validation 86700:2.5095321398255197
training 86800:5.196716885425562 validation 86800:2.5095255677431325
training 86900:5.196715624602643 valid

training 97600:5.196629995636578 validation 97600:2.508927835006834
training 97700:5.1966295229393795 validation 97700:2.5089232664439676
training 97800:5.196629054446203 validation 97800:2.5089187141542846
training 97900:5.196628590118707 validation 97900:2.5089141780958286
training 98000:5.19662812991892 validation 98000:2.508909658226601
training 98100:5.196627673809225 validation 98100:2.508905154504565
training 98200:5.196627221752368 validation 98200:2.5089006668876475
training 98300:5.196626773711448 validation 98300:2.5088961953337403
training 98400:5.196626329649913 validation 98400:2.5088917398007013
training 98500:5.1966258895315605 validation 98500:2.508887300246357
training 98600:5.196625453320528 validation 98600:2.508882876628505
training 98700:5.196625020981295 validation 98700:2.5088784689049133
training 98800:5.19662459247868 validation 98800:2.5088740770333224
training 98900:5.19662416777783 validation 98900:2.5088697009714505
training 99000:5.1966237468442245 valida

training 500:5.404587095994841 validation 500:2.662362619745554
training 600:5.363921812625104 validation 600:2.628466236352276
training 700:5.333642246541786 validation 700:2.603482966158103
training 800:5.3102332059780695 validation 800:2.5845524863986014
training 900:5.291649423815118 validation 900:2.5699025498353856
training 1000:5.276612873868813 validation 1000:2.558376954145005
training 1100:5.2642767547203455 validation 1100:2.5491903608107087
training 1200:5.254051901698825 validation 1200:2.54179137014789
training 1300:5.245511474684761 validation 1300:2.535781935778285
training 1400:5.238335841471147 validation 1400:2.5308678984701234
training 1500:5.232279191066629 validation 1500:2.526827509507473
training 1600:5.227148427009191 validation 1600:2.5234907559917596
training 1700:5.222789285418574 validation 1700:2.5207253928613222
training 1800:5.2190768672616095 validation 1800:2.518427266769924
training 1900:5.215908966332377 validation 1900:2.516513462965121
training 200

In [25]:
best_validation

[(1, array(2.50772645)), (10, array(2.50727658)), (100, array(2.50730128))]
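Problem 1 asks for the convergence curves of all learning rates on one figure. A minimal plotting sketch is given below. It uses synthetic curves as stand-ins for `loss_history`, so the shapes are illustrative only, and it renders to an in-memory buffer with a headless backend.

```python
import io
import matplotlib
matplotlib.use("Agg")                   # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Stand-ins for learning_rate_list and loss_history; these curves are synthetic.
toy_rates = [0.1, 1, 10, 100]
toy_history = [5 + 20 * np.exp(-np.arange(300) * rate / 50) for rate in toy_rates]

fig, ax = plt.subplots()
for rate, history in zip(toy_rates, toy_history):
    ax.plot(history, label="lr = " + str(rate))   # default color cycle gives one color per curve
ax.set_xlabel("iteration")
ax.set_ylabel("loss (RMSE)")
ax.set_xscale("log")                    # runs can stop at very different iteration counts
ax.legend()

image = io.BytesIO()
fig.savefig(image, format="png")
plt.close(fig)
```

With the real data, `toy_rates` and `toy_history` would be replaced by `learning_rate_list` and `loss_history` from the cell above.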

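### Among these runs the lowest validation loss is at $\eta = 10$, with $\eta = 1$ and $\eta = 100$ both slightly worse, so the best learning rate $\eta$ likely lies between 1 and 100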
### Let's try to tune the learning rate

In [31]:
np.round(np.dot(train_x, w))

array([[20.],
       [38.],
       [45.],
       ...,
       [ 7.],
       [ 9.],
       [ 5.]])

In [33]:
learning_rate_list = np.arange(1,11,1)
loss_history = []
validation_loss_history = []
best_validation = []
dim = int(np.sum(items_bool) * 9 + 1)
iter_time = 100000
early_stopping_iter = 250
eps = 0.0000000001
for lr in learning_rate_list:
    print("Use learning rate:", lr)
    history = []
    validation_history = []
    w = np.zeros([dim, 1])
    learning_rate = lr
    adagrad = np.zeros([dim, 1])
    temp_loss = np.inf
    temp_iter = 0
    for t in range(iter_time):
        loss = np.sqrt(np.sum(np.power(np.dot(train_x, w) - y_train_set, 2))/471/12)#rmse
#         loss = np.sqrt(np.sum(np.power(np.round(np.dot(train_x, w)) - y_train_set, 2))/471/12)#rmse
        history.append(loss)
        validation_loss = np.sqrt(np.sum(np.power(np.dot(validation_x, w) - y_validation, 2))/471/12)#rmse
#         validation_loss = np.sqrt(np.sum(np.power(np.round(np.dot(validation_x, w)) - y_validation, 2))/471/12)#rmse
        validation_history.append(validation_loss)
        if validation_loss < temp_loss:
            temp_w = np.copy(w)
            temp_loss = np.copy(validation_loss)
            temp_iter = 0
        else:
            if temp_iter < early_stopping_iter:
                temp_iter += 1
            else:
                print("early stopping at iter", t-temp_iter)
                best_validation.append((lr,temp_loss))
                break
        if(t%100==0):
            print("training " + str(t) + ":" + str(loss), "validation " + str(t) + ":" + str(validation_loss))
        gradient = 2 * np.dot(train_x.transpose(), np.dot(train_x, w) - y_train_set) #dim*1
        adagrad += gradient ** 2
        w = w - learning_rate * gradient / np.sqrt(adagrad + eps)
    loss_history.append(np.array(history))
    validation_loss_history.append(np.array(validation_history))

Use learning rate: 1
training 0:24.36221492236629 validation 0:11.803946645288281
training 100:8.495138279083523 validation 100:4.583482915914626
training 200:6.305840427421834 validation 200:3.2204616074907664
training 300:5.691683258548431 validation 300:2.825704729564583
training 400:5.486230592551655 validation 400:2.7106902205979373
training 500:5.394471102094957 validation 500:2.65389723973897
training 600:5.350820587768866 validation 600:2.621127885891252
training 700:5.314740015238981 validation 700:2.6003225342979897
training 800:5.292221454776598 validation 800:2.5841807390356486
training 900:5.268196316336295 validation 900:2.5679374800854755
training 1000:5.260432629624095 validation 1000:2.556924461822797
training 1100:5.253381615737255 validation 1100:2.547600622419404
training 1200:5.245680324469374 validation 1200:2.5411684625363553
training 1300:5.240989976876617 validation 1300:2.5344058518593933
training 1400:5.232222284592438 validation 1400:2.5271701205335346
train

training 1800:5.223660039074294 validation 1800:2.518544103640355
training 1900:5.221966234258308 validation 1900:2.5185089782370684
training 2000:5.216304939874662 validation 2000:2.5163302456486734
training 2100:5.214218543338288 validation 2100:2.5144311016687104
training 2200:5.208684701523675 validation 2200:2.513621771281181
training 2300:5.207172906820808 validation 2300:2.512882589800071
training 2400:5.206918066737319 validation 2400:2.51344579540804
early stopping at iter 2180
Use learning rate: 6
training 0:24.36221492236629 validation 0:11.803946645288281
training 100:6.036263965484358 validation 100:3.1222156739711666
training 200:5.685104884811649 validation 200:2.878588255860831
training 300:5.528972138535777 validation 300:2.7619008398411697
training 400:5.446841478768562 validation 400:2.6955048948866898
training 500:5.395225406162188 validation 500:2.651696301750298
training 600:5.3575287159256035 validation 600:2.621904031458232
training 700:5.331242955523024 validat

training 2400:5.207155917869925 validation 2400:2.5116149154216934
training 2500:5.207053983000611 validation 2500:2.513234608095529
training 2600:5.20557570308666 validation 2600:2.5112978968415556
training 2700:5.205405758954569 validation 2700:2.5112626700845024
training 2800:5.2043349833136325 validation 2800:2.5100294221242168
training 2900:5.205184823287833 validation 2900:2.5106285039376393
training 3000:5.204063005211223 validation 3000:2.511967111364616
early stopping at iter 2781
Use learning rate: 10
training 0:24.36221492236629 validation 0:11.803946645288281
training 100:6.0509455432556285 validation 100:3.1300825617857413
training 200:5.693423777480364 validation 200:2.8869964948888236
training 300:5.535224653107407 validation 300:2.770024565490035
training 400:5.449423247432082 validation 400:2.700554319130287
training 500:5.4005681072553875 validation 500:2.6572618124119347
training 600:5.360318541093967 validation 600:2.6243996381500034
training 700:5.3327693438526484 

In [34]:
best_validation

[(1, array(2.51400887)),
 (2, array(2.5110513)),
 (3, array(2.50999418)),
 (4, array(2.50999418)),
 (5, array(2.51076944)),
 (6, array(2.50999418)),
 (7, array(2.50999418)),
 (8, array(2.50999418)),
 (9, array(2.50999418)),
 (10, array(2.50999418))]
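Picking the best learning rate from a list like `best_validation` can be done with `min` and a key function. The sketch below copies a few of the pairs printed above, wrapped in 0-d arrays because `np.copy` of a scalar produces one.

```python
import numpy as np

# A few (learning rate, validation loss) pairs copied from the output above.
best_validation = [(1, np.array(2.51400887)), (2, np.array(2.5110513)),
                   (3, np.array(2.50999418)), (5, np.array(2.51076944))]

best_lr, best_loss = min(best_validation, key=lambda pair: float(pair[1]))
print(best_lr, float(best_loss))
```

When several learning rates tie, as many do in the full output, `min` returns the first one in list order.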

4, array(2.50726051)

### 0.01 to 0.1

In [None]:
learning_rate_list = np.arange(0.01,0.11,0.01)
loss_history = []
validation_loss_history = []
best_validation = []
dim = int(np.sum(items_bool) * 9 + 1)
iter_time = 100000
early_stopping_iter = 250
eps = 0.0000000001
for lr in learning_rate_list:
    print("Use learning rate:", lr)
    history = []
    validation_history = []
    w = np.zeros([dim, 1])
    learning_rate = lr
    adagrad = np.zeros([dim, 1])
    temp_loss = np.inf
    temp_iter = 0
    for t in range(iter_time):
        loss = np.sqrt(np.sum(np.power(np.dot(train_x, w) - y_train_set, 2))/471/12)#rmse
        history.append(loss)
        validation_loss = np.sqrt(np.sum(np.power(np.dot(validation_x, w) - y_validation, 2))/471/12)#rmse
        validation_history.append(validation_loss)
        if validation_loss < temp_loss:
            temp_w = np.copy(w)
            temp_loss = np.copy(validation_loss)
            temp_iter = 0
        else:
            if temp_iter < early_stopping_iter:
                temp_iter += 1
            else:
                print("early stopping at iter", t)
                best_validation.append((lr,temp_loss))
                history = history[:-temp_iter-1]
                validation_history = validation_history[:-temp_iter-1]
                break
        if(t%100==0):
            print("training " + str(t) + ":" + str(loss), "validation " + str(t) + ":" + str(validation_loss))
        gradient = 2 * np.dot(train_x.transpose(), np.dot(train_x, w) - y_train_set) #dim*1
        adagrad += gradient ** 2
        w = w - learning_rate * gradient / np.sqrt(adagrad + eps)
    loss_history.append(np.array(history))
    validation_loss_history.append(np.array(validation_history))
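`np.arange` with a fractional step computes each value in floating point, so grid points are not always the nearest float to the intended decimal, and the number of points near the end of the range can be surprising. A later sweep in this notebook logs `Use learning rate: 4.129999999999999`, which is the usual symptom. One way to sidestep this, sketched here as a suggestion, is to build the grid from integers and scale it once.

```python
import numpy as np

drifting = np.arange(0.01, 0.11, 0.01)   # fractional step: values and length depend on rounding
exact = np.arange(1, 11) / 100           # integer grid scaled once: 0.01, 0.02, ..., 0.10
fine = np.arange(410, 416) / 100         # same idea for a grid from 4.10 to 4.15

print(exact)
print(fine)
```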

In [None]:
best_validation

### 1 to 5

In [None]:
learning_rate_list = np.arange(1,5.5,0.5)
loss_history = []
validation_loss_history = []
best_validation = []

dim = int(np.sum(items_bool) * 9 + 1)
iter_time = 100000
early_stopping_iter = 250
eps = 0.0000000001

for lr in learning_rate_list:
    print("Use learning rate:", lr)
    history = []
    validation_history = []
    w = np.zeros([dim, 1])
    learning_rate = lr
    adagrad = np.zeros([dim, 1])
    temp_loss = np.inf
    temp_iter = 0
    for t in range(iter_time):
        loss = np.sqrt(np.sum(np.power(np.dot(train_x, w) - y_train_set, 2))/471/12)#rmse
        history.append(loss)
        validation_loss = np.sqrt(np.sum(np.power(np.dot(validation_x, w) - y_validation, 2))/471/12)#rmse
        validation_history.append(validation_loss)
        if validation_loss < temp_loss:
            temp_w = np.copy(w)
            temp_loss = np.copy(validation_loss)
            temp_iter = 0
        else:
            if temp_iter < early_stopping_iter:
                temp_iter += 1
            else:
                print("early stopping at iter", t)
                history = history[:-temp_iter-1]
                validation_history = validation_history[:-temp_iter-1]
                best_validation.append((lr,temp_loss))
                break
        if(t%100==0):
            print("training " + str(t) + ":" + str(loss), "validation " + str(t) + ":" + str(validation_loss))
        gradient = 2 * np.dot(train_x.transpose(), np.dot(train_x, w) - y_train_set) #dim*1
        adagrad += gradient ** 2
        w = w - learning_rate * gradient / np.sqrt(adagrad + eps)
    loss_history.append(np.array(history))
    validation_loss_history.append(np.array(validation_history))

In [None]:
best_validation

### 4.10 to 4.15

In [50]:
learning_rate_list = np.arange(4.1,4.16,0.01)

loss_history = []
validation_loss_history = []
best_validation = []

dim = int(np.sum(items_bool) * 9 + 1)
iter_time = 100000
early_stopping_iter = 250
eps = 0.0000000001

for lr in learning_rate_list:
    print("Use learning rate:", lr)
    history = []
    validation_history = []
    w = np.zeros([dim, 1])
    learning_rate = lr
    adagrad = np.zeros([dim, 1])
    temp_loss = np.inf
    temp_iter = 0
    for t in range(iter_time):
        loss = np.sqrt(np.sum(np.power(np.dot(train_x, w) - y_train_set, 2))/471/12)#rmse
#         loss = np.sqrt(np.sum(np.power(np.round(np.dot(train_x, w)) - y_train_set, 2))/471/12)#rmse
        history.append(loss)
        validation_loss = np.sqrt(np.sum(np.power(np.dot(validation_x, w) - y_validation, 2))/471/12)#rmse
#         validation_loss = np.sqrt(np.sum(np.power(np.round(np.dot(validation_x, w)) - y_validation, 2))/471/12)#rmse
        validation_history.append(validation_loss)
        if validation_loss < temp_loss:
            temp_w = np.copy(w)
            temp_loss = np.copy(validation_loss)
            temp_iter = 0
        else:
            if temp_iter < early_stopping_iter:
                temp_iter += 1
            else:
                print("early stopping at iter", t)
                history = history[:-temp_iter-1]
                validation_history = validation_history[:-temp_iter-1]
                best_validation.append((lr,temp_loss))
                break
        if(t%100==0):
            print("training " + str(t) + ":" + str(loss), "validation " + str(t) + ":" + str(validation_loss))
        gradient = 2 * np.dot(train_x.transpose(), np.dot(train_x, w) - y_train_set) #dim*1
        adagrad += gradient ** 2
        w = w - learning_rate * gradient / np.sqrt(adagrad + eps)
    loss_history.append(np.array(history))
    validation_loss_history.append(np.array(validation_history))

Use learning rate: 4.1
training 0:24.36221492236629 validation 0:11.803946645288281
training 100:6.0110581880270555 validation 100:3.1053001701230247
training 200:5.666494795011621 validation 200:2.8688986108915113
training 300:5.519861636039562 validation 300:2.7564347149053936
training 400:5.437230013093678 validation 400:2.6890954330036148
training 500:5.383666536813436 validation 500:2.6444537849263434
training 600:5.345940183768055 validation 600:2.613108023954564
training 700:5.317899167965888 validation 700:2.590235185803498
training 800:5.296284536571396 validation 800:2.5730705138960888
training 900:5.279198698512755 validation 900:2.559912996258312
training 1000:5.265451076343708 validation 1000:2.5496602867013136
training 1100:5.254247760836925 validation 1100:2.5415678790452247
training 1200:5.245032407373539 validation 1200:2.535115589067987
training 1300:5.237399289772711 validation 1300:2.529929565276545
training 1400:5.23104304849983 validation 1400:2.5257347131926284
t

early stopping at iter 4095
Use learning rate: 4.129999999999999
training 0:24.36221492236629 validation 0:11.803946645288281
training 100:6.011421820441479 validation 100:3.1055647565976785
training 200:5.6667243594528545 validation 200:2.8690668424110006
training 300:5.520035599223531 validation 300:2.7565709602309134
training 400:5.437372634485638 validation 400:2.689210870198837
training 500:5.38378831818823 validation 500:2.644552770086135
training 600:5.346046676830572 validation 600:2.6131934012579805
training 700:5.317993641368006 validation 700:2.5903091553138604
training 800:5.2963690513656045 validation 800:2.5731348282216957
training 900:5.279274641679367 validation 900:2.5599690640871233
training 1000:5.265519450133002 validation 1000:2.5497092534869
training 1100:5.254309343787567 validation 1100:2.5416106896211583
training 1200:5.24508784395673 validation 1200:2.5351530340961745
training 1300:5.237449139803038 validation 1300:2.5299623149565336
training 1400:5.2310878139

training 200:5.666951978852356 validation 200:2.8692336887634964
training 300:5.520208014146998 validation 300:2.7567060821527924
training 400:5.437513912115561 validation 400:2.6893253273426034
training 500:5.383908898181912 validation 500:2.6446508939332487
training 600:5.346152082410921 validation 600:2.613278021188975
training 700:5.318087125356547 validation 700:2.5903824586259634
training 800:5.2964526648890145 validation 800:2.5731985564809143
training 900:5.279349764664768 validation 900:2.560024616554752
training 1000:5.265587079121436 validation 1000:2.5497577675375784
training 1100:5.254370252193311 validation 1100:2.541653103083746
training 1200:5.245142671366779 validation 1200:2.535190131486322
training 1300:5.237498441260716 validation 1300:2.5299947610300357
training 1400:5.2311320868160145 validation 1400:2.525791705837589
training 1500:5.225807967704276 validation 1500:2.5223743372240706
training 1600:5.221340791824275 validation 1600:2.519584944378702
training 1700:5

In [51]:
best_validation

[(4.1, array(2.50726044)),
 (4.109999999999999, array(2.50726044)),
 (4.119999999999999, array(2.50726043)),
 (4.129999999999999, array(2.50726043)),
 (4.139999999999999, array(2.50726043)),
 (4.149999999999999, array(2.50726043)),
 (4.159999999999998, array(2.50726043))]

In [53]:
best_validation[-3][1] < best_validation[-2][1]

False

In [54]:
best_validation[-2][0]

4.149999999999999
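
The manual indexing into `best_validation` can also be expressed as a small helper. This is a minimal sketch, assuming the list holds `(learning_rate, loss)` pairs as printed above; the name `pick_best_lr` and the example values are hypothetical and not part of the notebook. The losses printed above are rounded to eight decimals, so they look tied, and the sketch uses clearly distinct illustrative values instead.

```python
import numpy as np

def pick_best_lr(results):
    """Return the (learning_rate, loss) pair with the smallest loss.

    `results` is assumed to be a list of (learning_rate, loss) pairs,
    where loss may be a float or a 0-d NumPy array.
    On ties, min() returns the first pair encountered.
    """
    lr, loss = min(results, key=lambda pair: float(pair[1]))
    return lr, float(loss)

# Illustrative values only (not taken from the run above).
example_results = [
    (0.1, np.array(2.9)),
    (0.2, np.array(2.5)),
    (0.3, np.array(2.7)),
]
best_lr, best_loss = pick_best_lr(example_results)
print(best_lr, best_loss)
```

Comparing unrounded losses directly avoids reading the winner off a rounded printout.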

### Training

In [55]:
dim = int(np.sum(items_bool) * 9 + 1)
w = np.zeros([dim, 1])

train_x = np.concatenate((np.ones([x_train_set.shape[0], 1]), x_train_set), axis = 1).astype(float)
validation_x = np.concatenate((np.ones([x_validation.shape[0], 1]), x_validation), axis = 1).astype(float)

history = []
validation_history = []

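learning_rate = 4.15          # picked from the learning-rate search above
iter_time = 200000            # upper bound on the number of iterations
early_stopping_iter = 1000    # patience: iterations allowed without validation improvement
adagrad = np.zeros([dim, 1])  # running sum of squared gradients
eps = 1e-10                   # avoids division by zero in the AdaGrad update

temp_loss = np.inf  # best validation loss so far
temp_iter = 0       # iterations since the last improvement

for t in range(iter_time):
    # Both losses are normalised by 471 * 12 (the full dataset size), as in the search above.
    loss = np.sqrt(np.sum(np.power(np.dot(train_x, w) - y_train_set, 2))/471/12)  # rmse
    history.append(loss)
    validation_loss = np.sqrt(np.sum(np.power(np.dot(validation_x, w) - y_validation, 2))/471/12)  # rmse
    validation_history.append(validation_loss)
    if validation_loss < temp_loss:
        # New best: remember the weights and reset the patience counter.
        temp_w = np.copy(w)
        temp_loss = np.copy(validation_loss)
        temp_iter = 0
    elif temp_iter < early_stopping_iter:
        temp_iter += 1
    else:
        # Patience exhausted: drop the entries recorded after the best iteration and stop.
        print("early stopping at iter", t - temp_iter)
        history = history[:-temp_iter-1]
        validation_history = validation_history[:-temp_iter-1]
        break
    if t % 100 == 0:
        print("training " + str(t) + ":" + str(loss), "validation " + str(t) + ":" + str(validation_loss))
    # AdaGrad update
    gradient = 2 * np.dot(train_x.transpose(), np.dot(train_x, w) - y_train_set)  # dim*1
    adagrad += gradient ** 2
    w = w - learning_rate * gradient / np.sqrt(adagrad + eps)
# This saves the final w; temp_w holds the weights with the lowest validation loss.
np.save('./../model/weight6.npy', w)
# w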

training 0:24.36221492236629 validation 0:11.803946645288281
training 100:6.011662433625542 validation 100:3.105739289424246
training 200:5.666876321126352 validation 200:2.8691782268094053
training 300:5.520150713783925 validation 300:2.7566611660118534
training 400:5.437466968056147 validation 400:2.689287283498333
training 500:5.383868837485717 validation 500:2.6446182813150725
training 600:5.346117067202393 validation 600:2.6132498983333914
training 700:5.318056073142943 validation 700:2.590358097863415
training 800:5.296424893078199 validation 800:2.573177378522527
training 900:5.279324814072966 validation 900:2.5600061560308136
training 1000:5.265564618206963 validation 1000:2.549741646222313
training 1100:5.254350023722302 validation 1100:2.541639009146015
training 1200:5.245124462685826 validation 1200:2.535177804103143
training 1300:5.237482067883757 validation 1300:2.529983979219747
training 1400:5.231117383484583 validation 1400:2.525782280353798
training 1500:5.225794783581
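
The training loop above combines AdaGrad with patience-based early stopping. The sketch below repackages the same idea as a function and runs it on synthetic data. It rests on several assumptions that are not in the notebook: the names `rmse` and `adagrad_early_stopping`, the synthetic dataset, the hyperparameter values, and the choice to normalise the RMSE by the actual number of rows and to return the best-validation weights rather than the final ones.

```python
import numpy as np

def rmse(x, w, y):
    """Root-mean-square error, normalised by the number of rows in x."""
    return float(np.sqrt(np.mean((x @ w - y) ** 2)))

def adagrad_early_stopping(train_x, train_y, val_x, val_y,
                           learning_rate=1.0, max_iter=5000,
                           patience=100, eps=1e-10):
    """Fit linear weights with AdaGrad; stop once the validation loss
    has not improved for more than `patience` iterations."""
    dim = train_x.shape[1]
    w = np.zeros((dim, 1))
    accum = np.zeros((dim, 1))  # running sum of squared gradients
    best_w, best_loss, best_iter = w.copy(), np.inf, 0
    for t in range(max_iter):
        val_loss = rmse(val_x, w, val_y)
        if val_loss < best_loss:
            # New best: keep a copy of the weights and remember when it happened.
            best_w, best_loss, best_iter = w.copy(), val_loss, t
        elif t - best_iter > patience:
            break
        gradient = 2 * train_x.T @ (train_x @ w - train_y)
        accum += gradient ** 2
        w = w - learning_rate * gradient / np.sqrt(accum + eps)
    return best_w, best_loss, best_iter

# Synthetic regression problem (illustrative only): bias column plus 3 features.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
design = np.concatenate((np.ones((200, 1)), features), axis=1)
true_w = np.array([[1.0], [2.0], [-1.0], [0.5]])
targets = design @ true_w + 0.1 * rng.normal(size=(200, 1))

initial_loss = rmse(design[160:], np.zeros((4, 1)), targets[160:])
best_w, best_loss, best_iter = adagrad_early_stopping(
    design[:160], targets[:160], design[160:], targets[160:])
print(best_iter, best_loss)
```

Returning `best_w` means the saved model corresponds to the lowest validation loss seen, not to the weights after the patience window has run out.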

### Predict

In [56]:
w = np.load('./../model/weight6.npy')
ans_y = np.dot(test_x, w)
ans_y

array([[  6.89839091],
       [ 17.80732458],
       [ 23.35903227],
       [  6.65259305],
       [ 26.60200308],
       [ 21.91880916],
       [ 24.31587656],
       [ 29.20919715],
       [ 17.35989067],
       [ 58.55763806],
       [ 13.07625969],
       [ 10.99661671],
       [ 62.45782123],
       [ 52.16025926],
       [ 22.02575563],
       [ 11.48025265],
       [ 32.79680815],
       [ 67.67635096],
       [  1.86215716],
       [ 17.05167628],
       [ 41.74642672],
       [ 71.79602678],
       [  9.56587921],
       [ 18.6696507 ],
       [ 13.35989984],
       [ 37.30228019],
       [ 13.00475953],
       [ 74.41071602],
       [  7.44635216],
       [ 56.0640257 ],
       [ 23.84192887],
       [  8.1996752 ],
       [  3.10891629],
       [ 19.01255615],
       [ 28.61154273],
       [ 37.64023407],
       [ 43.05208949],
       [ 30.83768433],
       [ 42.18475609],
       [ 34.92112649],
       [  8.19725729],
       [ 39.35014673],
       [ 32.75518478],
       [ 51

### Save

In [57]:
import csv
with open('./../results/submit6.csv', mode='w', newline='') as submit_file:
    csv_writer = csv.writer(submit_file)
    header = ['id', 'value']
    print(header)
    csv_writer.writerow(header)
    for i in range(240):
        row = ['id_' + str(i), ans_y[i][0]]
        csv_writer.writerow(row)
        print(row)

['id', 'value']
['id_0', 6.898390909664564]
['id_1', 17.807324577918884]
['id_2', 23.359032270639098]
['id_3', 6.652593048071527]
['id_4', 26.60200307533507]
['id_5', 21.918809162044084]
['id_6', 24.31587655807283]
['id_7', 29.209197150869233]
['id_8', 17.35989067012718]
['id_9', 58.55763806261701]
['id_10', 13.07625968525129]
['id_11', 10.996616707930018]
['id_12', 62.457821231931874]
['id_13', 52.16025925885989]
['id_14', 22.02575562983203]
['id_15', 11.480252649635478]
['id_16', 32.796808147018126]
['id_17', 67.67635095613338]
['id_18', 1.8621571629467013]
['id_19', 17.051676280483115]
['id_20', 41.74642672050495]
['id_21', 71.79602678094233]
['id_22', 9.565879214830083]
['id_23', 18.66965069688074]
['id_24', 13.359899836300201]
['id_25', 37.302280187210705]
['id_26', 13.004759530315495]
['id_27', 74.41071601841767]
['id_28', 7.446352163369912]
['id_29', 56.06402570244599]
['id_30', 23.841928873021494]
['id_31', 8.19967520325889]
['id_32', 3.108916287551363]
['id_33', 19.01255614967