# Exercise Sheet 4: Machine Learning Fundamentals & Linear Regression (Deadline: 01 Dec 23:59)

# ML Fundamentals(7 points)
For theoretical tasks you are encouraged to write in $\\LaTeX$. Jupyter notebooks support them by default. For reference, please have a look at the examples in this short excellent guide: [Typesetting Equations](http://nbviewer.jupyter.org/github/ipython/ipython/blob/3.x/examples/Notebook/Typesetting%20Equations.ipynb)

Alternatively, you can upload the solutions in the written form as images and paste them inside the cells. But if you do this, **make sure** that the images are of high quality, so that we can read them without any problems.

###### 1. Sigmoid Function (1.5 points)
The special case of the logistic function is the *sigmoid function* which is defined as:

\begin{equation*}
  \sigma(a) = \frac{1}{1 + e^{-a}}
\end{equation*}

a) Compute its gradient analytically. (0.5 points)

b) What are the inherent properties that you observe from the above computed gradient? (0.5 points) <br />
   *Hint: Think about how would the gradient signal be for the whole domain of the sigmoid function*

c) Prove that the sigmoid function is symmetric. (0.5 points)

###### 2. Regularization (3.5 points)

In the lecture, we've seen that we can add a *regularizer* to our cost function to avoid *over or underfitting*. For example, consider the following training criterion for linear regression:

\begin{equation*}
  J(\textbf{w}) = \frac{1}{m}\sum_{i=1}^{m} \Vert\hat{y}^{(i)} - y^{(i)}\Vert^{2} + \lambda\Omega(\textbf{w})
\end{equation*}
where $\Omega(\textbf{w}) = \textbf{w}^{T}\textbf{w}$ is the regularizer.

a) In the above criterion, what is the role of the regularization parameter $\lambda$ on the regularizer (i.e. parameters of our model) while minimizing $J(\textbf{w})$? (1.0 point)

b) Is $\lambda$ the model parameter or a hyperparameter? Justify.(0.5 points)

c) Derive the closed form solution for the weights ($\textbf{w}$) in the above criterion.(2.0 points)

###### 3. Maximum Likelihood Estimation (MLE) (2 points)
Consider the density function of a ***univariate Gaussian distribution***


\begin{equation*}
 p(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}exp\left(-\frac{1}{2\sigma^2}(x-\mu)^{2}\right)
\end{equation*}
where $\mu$ is the $\textit{mean}$ and $\sigma^{2}$ is the $\textit{variance}$. 

Let's say you're given *N* samples (i.e. $x_1, x_2, x_3, ..., x_N$) which are drawn from the above stated distribution. Also, you can assume that these samples are **i.i.d** (i.e. [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables)).

Now, please derive the *MLE step-by-step* for:

a) *mean* $(\mu)$. (1.0 point)

b) *variance* $(\sigma^2)$. (1.0 point)

# Multiple Linear Regression (13 points)

#### 1. Introduction
As we have seen in first assignment sheet, when we have one independent (or explanatory) variable and a scalar dependent variable, it is called **simple linear regression**.
But, when there are more than one explanatory variable (i.e. $x^{(1)}, x^{(2)}, ...,x^{(k)}$), and a single scalar dependent variable (*y*), then it's called $\textit{multiple linear regression}$. (Please don't confuse this with *multivariate linear regression* where we predict more than one (correlated) dependent variable.)

Here, we will implement a **multiple linear regression** model in Python/NumPy using the *Gradient Descent* algorithm. Particularly, we will be using $\textit{stochastic gradient descent}$ (*SGD*) where one performs the update step using a small set of training samples of size *batch_size* which we will set to 64. This is again a hyperparameter but in this exercise we will just use a fixed batch-size of *64* (i.e. we go through the training samples sampling 64 at a time and perform gradient descent.) Such a procedure is sometimes called *mini-batch gradient descent* in the deep learning community.

Going through all the training samples *once* is called an **epoch**. Ideally, the algorithm has to go through multiple epochs over the training samples, each time shuffling it, until a convergence criterion has been satisfied. <br />

Here, we will set a *tolerance value* for the difference in error (i.e. change in MSE values between subsequent epochs) that we will accept. Once this difference falls below the *tolerance value*, we terminate our training phase and return the parameters. 

We repeat the above training procedure for all possible hyperparameter combinations. Later on, using these parameters (*i.e. weight vectors*), we compute the prediction for validation data and the corresponding MSE values. And then, we pick the hyperparameter combination which yielded the least MSE.

As a next step, we will combine training data and validation data and make it as our *new training data*. We keep the test data as it is. Using the hyperparameter combination (for the least MSE) that we found above, we train the model again with the *new training data* and obtain the parameter (*i.e. weight vector*) after convergence according to our *tolerance value*.

Phew! That will be our much desired *weight vector*. This is then used on the *test data*, which has not been seen by our algorithm so far, to make a prediction. The resulting MSE value will be the so-called [*generalization error*](https://en.wikipedia.org/wiki/Generalization_error).

It is this *generalization error* that we want it to be as low as possible for *unseen data* (implies that we can achieve higher accuracy).

#### 2. Dataset
For our task, we will be using the *Wine Quality* dataset and predict the quality of white wine based on 11 features such as acidity, citric acid content, residual sugar etc. .

In [86]:
%matplotlib inline
import itertools
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# get data
data_url = 'http://mlr.cs.umass.edu/ml/machine-learning-databases/wine-quality/winequality-white.csv'
data = pd.read_csv(data_url, sep=';')

# inspect data
print(data.head())
#print(data.shape)

# data as np array
data_npr = data.values

print(data_npr)

   fixed acidity  volatile acidity  citric acid  residual sugar  chlorides  \
0            7.0              0.27         0.36            20.7      0.045   
1            6.3              0.30         0.34             1.6      0.049   
2            8.1              0.28         0.40             6.9      0.050   
3            7.2              0.23         0.32             8.5      0.058   
4            7.2              0.23         0.32             8.5      0.058   

   free sulfur dioxide  total sulfur dioxide  density    pH  sulphates  \
0                 45.0                 170.0   1.0010  3.00       0.45   
1                 14.0                 132.0   0.9940  3.30       0.49   
2                 30.0                  97.0   0.9951  3.26       0.44   
3                 47.0                 186.0   0.9956  3.19       0.40   
4                 47.0                 186.0   0.9956  3.19       0.40   

   alcohol  quality  
0      8.8        6  
1      9.5        6  
2     10.1        6 

#### 3. Loss function
We will use a *regularized* form of the MSE loss function. In matrix form it can be written as follows:

\begin{equation*}
    J(\textbf{w}) = \frac{1}{2} \Vert{X\textbf{w}-\textbf{y}}\Vert^{2} + \frac{\lambda}{2}\Vert{\textbf{w}}\Vert^{2}
\end{equation*}

It's important to note that, in the above equation, $X$, called *design matrix*, is the horizontal concatenation of shape *(batch_size, num_features)* according to the *order* of the polynomial. To make things easier, you can add the *bias* term as the first column of $X$. Take care to have the *weight* vector $\textbf{w}$ with matching dimensions.

$\textit{Hint}$: see [Design_matrix#Multiple_regression](https://en.wikipedia.org/wiki/Design_matrix#Multiple_regression) for how $X$ with 2 features looks like for $1^{st}$ degree polynomial.

a) Derive the gradient (w.r.t $\textbf{w}$) for the regularized loss function given in **3**. (1.0 point)

\begin{equation*}
    J(\textbf{w}) = \frac{1}{2m} ({X\textbf{w}-\textbf{y}})^{T}({X\textbf{w}-\textbf{y}}) + \frac{\lambda}{2m}\sum_{j=1}^{n}{\textbf{w}_j}^{2}
\end{equation*}
\begin{equation*}
\nabla_wJ(\textbf{w}) = X^T({X\textbf{w}-\textbf{y}})+ \lambda\textbf{w}
\end{equation*}

#### 4. Matrix format for higher order polynomial

Written in matrix form, linear regression model for second order would look like: <br />
$$\hat{\textbf{y}} = X\textbf{w}_{1} + X^{2}\textbf{w}_{2} + \textbf{b}$$

where $X^{2}$ is the element-wise squaring of the original design matrix $X$, $\textbf{w}_1$ and $\textbf{w}_2$ are the *weight* vectors, and **b** is the *bias* vector.

a) Now, please write down the matrix format for a $9^{th}$ order linear regression model (0.5 points)
$$\hat{\textbf{y}} =\textbf{b} + \sum_{i=1}^{9} w_{i}x^{i} $$

#### 5. Hyperparameters
we will experiment with three hyperparameters:

i) regularization parameter $\lambda$ <br />
ii) learning rate $\epsilon$ <br />
iii) order of polynomial *p*

And do a grid search over the values that these hyperparameters can take in order to select the best combination (i.e. the one that achieves lowest test error). This approach is called **hyperparameter optimization or tuning**.

In [71]:
polynomial_order = [1, 5, 9]
learning_rates = [1e-5, 1e-8]
lambdas = [0.1, 0.8]

#hyperparams combination
comb_gen = itertools.product(*(polynomial_order, learning_rates, lambdas))
hparams_comb = list(comb_gen)

batch_size = 64

#### 6. Normalization
First of all, inspect the data, and understand its structure and features. Ideally, before starting to train our learning algorithm, we would want the data to be normalized. Here, we normalize the data (i.e. normalize each column) using the formula:

\begin{equation*}
  norm\_x_i = \frac{x_i - min(x)}{max(x) - min(x)}
\end{equation*}
where $x_i$ is the $i^{th}$ sample in feature $x$

a) Complete the following function which performs normalization (i.e. normalizes columns of $X$). (0.5 points)

In [75]:
def data_normalization(data):
    # TODO: implement
    newDF = pd.DataFrame() #creates a new dataframe that's empty
    for column in data:
        min_col = np.amin(data[column])
        max_col = np.amax(data[column])
        new_col = (data[column] - min_col)/(max_col - min_col)
        newDF = pd.concat([newDF, new_col], axis=1) 
    
    data_normalized = newDF.values
    return data_normalized

# perform data normalization
data_normalized = data_normalization(data)
data_npr = data_normalized
print (data_npr)

[[ 0.30769231  0.18627451  0.21686747 ...,  0.26744186  0.12903226  0.5       ]
 [ 0.24038462  0.21568627  0.20481928 ...,  0.31395349  0.24193548  0.5       ]
 [ 0.41346154  0.19607843  0.24096386 ...,  0.25581395  0.33870968  0.5       ]
 ..., 
 [ 0.25961538  0.15686275  0.11445783 ...,  0.27906977  0.22580645  0.5       ]
 [ 0.16346154  0.20588235  0.18072289 ...,  0.18604651  0.77419355
   0.66666667]
 [ 0.21153846  0.12745098  0.22891566 ...,  0.11627907  0.61290323  0.5       ]]


In [76]:
def split_data(data_npr):
    # (in-place) shuffling of data_npr along axis 0
    np.random.shuffle(data_npr)

    n_tr = 3898
    n_va = n_tr + 500
    n_te = n_va + 500
    
    X_train = data_npr[0:n_tr, 0:-1]
    Y_train = data_npr[0:n_tr, -1]
    
    X_val = data_npr[n_tr:n_va, 0:-1]
    Y_val = data_npr[n_tr:n_va, -1]
    
    X_test = data_npr[n_va:, 0:-1]
    Y_test = data_npr[n_va:, -1]
    
    return [(X_train, Y_train), (X_val, Y_val), (X_test, Y_test)]


# shuffle only the training data along axis 0
def shuffle_train_data(X_train, Y_train):
    """called after each epoch"""
    perm = np.random.permutation(len(Y_train))
    Xtr_shuf = X_train[perm]
    Ytr_shuf = Y_train[perm]
    
    return Xtr_shuf, Ytr_shuf

###### 7. Implementation of required functions

Complete the following function which computes the MSE value. (0.5 point) <br />
(i.e. just a vanilla version of it.) That is, you can ignore the regularization term and also the constants $\frac{1}{2}$

In [106]:
def compute_mse(prediction, ground_truth):
    # TODO: implement
    residual = np.subtract(prediction, ground_truth)
    squared = residual**2
    sum_of_squared = np.sum(squared)
    mse = (1/ground_truth.size)*sum_of_squared
    return mse

prediction = np.array([1, 3, 6])
truth = np.array([0, 0, 0])
mse = compute_mse(prediction, truth)
print (mse)

15.3333333333


Implement a function which computes the prediction of your model. (0.5 point)

In [107]:
def get_prediction(X, W):
    # TODO: implement
    Yhat = np.matmul(X, W)
    return Yhat

dataset = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]
coef = [0.4, 0.8]
for row in dataset:
    yhat = get_prediction(dataset, np.transpose(coef))
    
print (yhat)

[ 1.2  3.2  4.   2.8  6. ]


Implement a function which computes the gradient of your loss function. (1.0 point) <br />
*Hint: Just implementing the gradient computed in **3.** (a)*

In [120]:
def compute_gradient(X, Y, Yhat, W, lambda_):
    # TODO: implement
    X_T = np.transpose(X)
    pre = np.matmul(X,W)
    gradient = X_T*(pre - Y) + W * lambda_
    return gradient

X = np.array([1, 1])
coef = np.transpose(np.array([0.4, 0.8]))
truth = np.array([1])
Yhat = np.array([1.2])
print (compute_gradient(X, 1.0, Yhat, coef, 1.0))


[ 0.6  1. ]


Implement a function which performs a single update step of SGD. (0.5 point)

In [80]:
# Hint: avoid in-place modification
def sgd(gradient, lr, cur_W):
    # TODO: implement
    new_W = cur_W - (lr * gradient)
    return new_W

Complete the following function which reformats your data as a design matrix. (0.5 point)

In [94]:
# concatenate X acc. to order of polynomial; likewise do it for W
# where X is design matrix, W is the corresponding weight vector
# [1 X X^2 X^3], [1 W1 W2 W3].T

#this function is not done yet
def prepare_data_matrix(X, W, order):
    # TODO: implement
    for i in range(0, order+1):
        if(i==0):
            X_mat = np.ones(X.shape)
            W_vec = np.ones(W.shape)
        else:
            X_mat = np.concatenate((X_mat, X**(i)), axis=1)
            W_vec = np.concatenate((W_vec, W**(i)))
    
#     X_mat = None
    W_vec = np.transpose(W_vec)
    return X_mat, W_vec


# X = np.array([[1, 2, 3], [4, 5, 6]], np.int32)
# W = np.array([[1, 2, 3]], np.int32)
# x_mat, w_vec = prepare_data_matrix(X, W, 3)
# print (x_mat)
# print (w_vec)
# prediction = get_prediction(x_mat, w_vec)

###### 8. Training
Complete the code in the following cell such that it performs **mini-batch gradient descent** on the training data for all possible hyperparameter combinations. (4.0 points)

Note: You can also define a function, named appropriately, which performs training. But, take care to do correct bookkeeping of hyperparameter combinations, weight vectors, and the MSE values.

In [129]:
splits = split_data(data_npr)
X_train, Y_train, X_val, Y_val, X_test, Y_test = itertools.chain(*splits)

tolerance = 1e-3
start = 1

# initialize weight vector from normal distribution
# TODO: implement
w_shape = X_train.shape[1]
W_init = np.random.randn(w_shape)

# cache weights for each hyperparam combination
# TODO: implement
weights_hist = {}
for order in polynomial_order:
    for lr in learning_rates:
        for lamb in lambdas:
            weights_hist[(order, lr, lamb)] = W_init

# keep track of MSE for each hparam combination. will be useful for plotting
# TODO: implement
mse_hist = {}
for order in polynomial_order:
    for lr in learning_rates:
        for lamb in lambdas:
            mse_hist[(order, lr, lamb)] = 0.0

# find optimal hyperparameters
for order in polynomial_order:
    for lr in learning_rates:
        for lamb in lambdas:
            # initialize necessary stuffs
            # TODO: implement
            W = weights_hist[(order, lr, lamb)]
            
            # design matrix needed at this point
            # use the function that we defined above
            # TODO: implement
            X_mat, W_vec = prepare_data_matrix(X_train, W, order)

            epochs = 1
            # goes through multiple epochs
            while True:
                # good idea to shuffle the train data
                # TODO: implement
                X_mat, Y_train = shuffle_train_data(X_mat, Y_train)
                
                # some more initialization
                # TODO: implement
                bs = 0
                nsamples = X_train.shape[0]
                prediction = np.empty(Y_train.shape)
                # goes through 1 epoch
                while bs < nsamples:
                    x = X_mat[bs]
                    prediction[bs] = get_prediction(x, W_vec)
                    gradient = compute_gradient(x, Y_train[bs], prediction[bs], W_vec, lamb)
                    W_vec = sgd(gradient, lr, W_vec)
#                     print("prediction: {} , ground truth: {} ".format(prediction[bs], Y_train[bs]))
                    bs = bs + 1
                    # complete code for 1 epoch
                    # TODO: implement
                    
                # after each epoch
                # get prediction for whole X_train
                # compute the MSE
                # might need to do bookkeeping of mse values as well
                mse = compute_mse(prediction, Y_train)
                mse_hist[(order, lr, lamb)] = mse

                # stopping/convergence criterion
                # check whether diff-in-mse < tolerance
                # TODO: implement
                if(mse < tolerance):
                    break
                epochs += 1
                weights_hist[(order, lr, lamb)] = W_vec
                    # cache weight vector for later use
                    # but we also need the hparam combination
                    # TODO: implement
                print("order: {} , learning rate: {} , regularizer: {} ".format(order, lr, lamb))
                print("Convergence after epoch {} with MSE {}".format(epochs, mse), "\n")
            

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2 with MSE 28.325647337410512 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3 with MSE 1.4556499732824661 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4 with MSE 1.0431315315464287 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5 with MSE 0.9951038571454035 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6 with MSE 0.9547749920055263 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7 with MSE 0.9550365134463659 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8 with MSE 0.9484224551681265 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 9 with MSE 0.9482105680607978 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 10 with MSE 0.9354620385251489 

order: 1 , learning rate: 1

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 81 with MSE 0.804424706620353 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 82 with MSE 0.794475756612034 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 83 with MSE 0.8110279569885431 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 84 with MSE 0.8034430274990151 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 85 with MSE 0.8106210173291772 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 86 with MSE 0.7974785783434499 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 87 with MSE 0.7965098025994953 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 88 with MSE 0.8029498557493399 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 89 with MSE 0.8040051619935702 

order: 1 , learning r

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 160 with MSE 0.7658145205989733 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 161 with MSE 0.7594663424850554 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 162 with MSE 0.7644450116352363 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 163 with MSE 0.7541039411866935 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 164 with MSE 0.7689455605633688 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 165 with MSE 0.7668551979443827 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 166 with MSE 0.7573989042644808 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 167 with MSE 0.7544093953889386 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 168 with MSE 0.765114632602041 

order: 1 , 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 235 with MSE 0.7318471319504887 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 236 with MSE 0.742491301768388 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 237 with MSE 0.7350743927056405 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 238 with MSE 0.727712126513259 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 239 with MSE 0.7363007022810456 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 240 with MSE 0.7443088805131199 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 241 with MSE 0.7397565802819326 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 242 with MSE 0.7378909897024981 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 243 with MSE 0.7353091390876773 

order: 1 , l

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 314 with MSE 0.7308140772653696 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 315 with MSE 0.7413982228104319 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 316 with MSE 0.7319794067161918 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 317 with MSE 0.723727227573517 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 318 with MSE 0.7219309166501342 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 319 with MSE 0.7341742558498103 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 320 with MSE 0.7209121131193144 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 321 with MSE 0.7311341013001788 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 322 with MSE 0.7403300534235947 

order: 1 , 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 393 with MSE 0.7327228231499008 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 394 with MSE 0.718352884549687 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 395 with MSE 0.7157690479014847 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 396 with MSE 0.727609979574524 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 397 with MSE 0.7308668490484029 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 398 with MSE 0.7240904933946838 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 399 with MSE 0.7132808568345794 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 400 with MSE 0.7251317124756518 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 401 with MSE 0.7219176200479097 

order: 1 , l

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 468 with MSE 0.717151463727585 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 469 with MSE 0.7108455014188373 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 470 with MSE 0.7142033192390256 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 471 with MSE 0.7131080020786088 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 472 with MSE 0.707458067248205 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 473 with MSE 0.7124276758962791 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 474 with MSE 0.716655413271431 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 475 with MSE 0.7254294994937406 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 476 with MSE 0.7118291670954277 

order: 1 , le

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 545 with MSE 0.7062367898236807 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 546 with MSE 0.723646125088567 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 547 with MSE 0.7117408660477859 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 548 with MSE 0.7057561507820487 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 549 with MSE 0.717586725753932 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 550 with MSE 0.7177175197152026 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 551 with MSE 0.7045475349379712 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 552 with MSE 0.7139383233911968 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 553 with MSE 0.7100983849963922 

order: 1 , l

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 623 with MSE 0.7114241850758736 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 624 with MSE 0.7134850939220316 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 625 with MSE 0.70831086624645 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 626 with MSE 0.7255378712271844 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 627 with MSE 0.707619518666556 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 628 with MSE 0.7114280134715988 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 629 with MSE 0.7184541823831018 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 630 with MSE 0.7204850665367728 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 631 with MSE 0.7234681769806023 

order: 1 , le

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 698 with MSE 0.7188382391967061 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 699 with MSE 0.7120296221701828 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 700 with MSE 0.7078131300895368 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 701 with MSE 0.7145773626365466 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 702 with MSE 0.7105011343947953 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 703 with MSE 0.7202019404558653 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 704 with MSE 0.7081438725336534 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 705 with MSE 0.6986163798219667 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 706 with MSE 0.7076664796771014 

order: 1 ,

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 777 with MSE 0.6921929275784234 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 778 with MSE 0.7021858916837526 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 779 with MSE 0.716155122455961 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 780 with MSE 0.7068601579826892 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 781 with MSE 0.7182684596340454 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 782 with MSE 0.7211851377165621 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 783 with MSE 0.71889537271278 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 784 with MSE 0.7152041618949657 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 785 with MSE 0.7130116967689522 

order: 1 , le

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 854 with MSE 0.7100407724280838 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 855 with MSE 0.7094303511712942 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 856 with MSE 0.708241703346413 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 857 with MSE 0.7124471551874004 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 858 with MSE 0.7008577187337429 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 859 with MSE 0.7091852526710152 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 860 with MSE 0.7100970564600204 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 861 with MSE 0.7131686122830795 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 862 with MSE 0.7097835785300135 

order: 1 , 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 932 with MSE 0.7050602524097195 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 933 with MSE 0.7074100737353313 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 934 with MSE 0.6992258264404356 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 935 with MSE 0.7080006567933372 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 936 with MSE 0.7170508975456391 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 937 with MSE 0.7099744479665625 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 938 with MSE 0.7074232266834606 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 939 with MSE 0.7047606442131759 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 940 with MSE 0.7139099288128958 

order: 1 ,

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1008 with MSE 0.7176676150249949 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1009 with MSE 0.7178451832946127 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1010 with MSE 0.7109735574391375 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1011 with MSE 0.7126145539281016 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1012 with MSE 0.7091480571341172 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1013 with MSE 0.7139802454399043 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1014 with MSE 0.7116705291002967 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1015 with MSE 0.7123846917485489 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1016 with MSE 0.7079129377504773 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1084 with MSE 0.716661642247156 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1085 with MSE 0.7164397565535029 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1086 with MSE 0.7169266127539284 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1087 with MSE 0.7118166863686097 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1088 with MSE 0.7204906335708449 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1089 with MSE 0.7144785449753694 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1090 with MSE 0.7046805941620313 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1091 with MSE 0.7080343193713902 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1092 with MSE 0.7151919205130993 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1161 with MSE 0.7090863939409636 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1162 with MSE 0.7017330442811408 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1163 with MSE 0.715073203459975 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1164 with MSE 0.7161653422797943 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1165 with MSE 0.7118137032476874 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1166 with MSE 0.7242757306520528 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1167 with MSE 0.7106317207156386 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1168 with MSE 0.7015914653892062 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1169 with MSE 0.7213648040783798 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1239 with MSE 0.7188583238572497 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1240 with MSE 0.7076554606745468 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1241 with MSE 0.7153808193551874 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1242 with MSE 0.7104957015752152 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1243 with MSE 0.7154031025403291 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1244 with MSE 0.7013712059970771 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1245 with MSE 0.7008809346864602 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1246 with MSE 0.7053515023256092 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1247 with MSE 0.7042115378416849 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1315 with MSE 0.7019351045979858 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1316 with MSE 0.7088952298098569 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1317 with MSE 0.7069432156892544 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1318 with MSE 0.7169044278449097 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1319 with MSE 0.7059676221444642 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1320 with MSE 0.7061464265686722 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1321 with MSE 0.708477233600695 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1322 with MSE 0.7093334980753639 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1323 with MSE 0.7251136671447681 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1390 with MSE 0.7025037492941505 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1391 with MSE 0.7106937558753322 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1392 with MSE 0.7003916676394104 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1393 with MSE 0.714498055356297 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1394 with MSE 0.7040186449563769 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1395 with MSE 0.7099055621431416 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1396 with MSE 0.7100605585161766 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1397 with MSE 0.7162252544054073 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1398 with MSE 0.7125591506365043 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1468 with MSE 0.7079601443402472 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1469 with MSE 0.7081181851602145 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1470 with MSE 0.7110411336567218 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1471 with MSE 0.708168569249882 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1472 with MSE 0.7099895731825806 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1473 with MSE 0.7050240319961513 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1474 with MSE 0.7104229783433436 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1475 with MSE 0.7068809749050081 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1476 with MSE 0.7055728593027474 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1543 with MSE 0.7090355113761208 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1544 with MSE 0.6994970365687133 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1545 with MSE 0.7084417047965801 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1546 with MSE 0.714057036050429 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1547 with MSE 0.7047924426905531 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1548 with MSE 0.7209194143373323 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1549 with MSE 0.7154437187478423 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1550 with MSE 0.7105905275111893 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1551 with MSE 0.695171547656614 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1618 with MSE 0.7140105196292599 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1619 with MSE 0.718248899978676 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1620 with MSE 0.718357662055578 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1621 with MSE 0.7112752305492899 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1622 with MSE 0.7192362558752255 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1623 with MSE 0.7071347650317938 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1624 with MSE 0.713199080266178 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1625 with MSE 0.7041746116114452 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1626 with MSE 0.697215433237141 

order

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1696 with MSE 0.7137552351719996 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1697 with MSE 0.7120017501885951 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1698 with MSE 0.7146727829997663 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1699 with MSE 0.7081463089496461 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1700 with MSE 0.7124930114488106 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1701 with MSE 0.7117933894230105 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1702 with MSE 0.7056677751127003 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1703 with MSE 0.7099028409304308 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1704 with MSE 0.7040167324637736 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1775 with MSE 0.7117987202309565 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1776 with MSE 0.6973958668889345 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1777 with MSE 0.707500690282334 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1778 with MSE 0.7104837186100956 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1779 with MSE 0.7104474508516359 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1780 with MSE 0.7059012199709166 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1781 with MSE 0.7117970756663533 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1782 with MSE 0.7102908662793247 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1783 with MSE 0.6975784123455286 

or


order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1849 with MSE 0.7120591230115699 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1850 with MSE 0.7141102411406401 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1851 with MSE 0.7019365300434054 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1852 with MSE 0.709572584688213 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1853 with MSE 0.7188488161435539 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1854 with MSE 0.7038175956989906 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1855 with MSE 0.7165312130300707 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1856 with MSE 0.7129241576233919 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1857 with MSE 0.7101428518495798 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1926 with MSE 0.7145507699896275 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1927 with MSE 0.7136975627731936 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1928 with MSE 0.6999448163620776 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1929 with MSE 0.7063765166850802 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1930 with MSE 0.7171656963783367 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1931 with MSE 0.7142608666188178 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1932 with MSE 0.7187276008703135 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1933 with MSE 0.7207224732363773 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 1934 with MSE 0.7164443570112863 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2005 with MSE 0.7020165752889073 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2006 with MSE 0.7062805979165704 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2007 with MSE 0.7133370167491999 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2008 with MSE 0.7096253512173188 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2009 with MSE 0.7150085784951378 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2010 with MSE 0.7050619948098699 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2011 with MSE 0.7116745431540339 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2012 with MSE 0.7119546290595774 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2013 with MSE 0.7199226586282865 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2080 with MSE 0.7234826665701204 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2081 with MSE 0.7125751261579133 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2082 with MSE 0.6990569997264051 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2083 with MSE 0.6965361924262087 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2084 with MSE 0.7086000655485771 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2085 with MSE 0.6949703693439089 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2086 with MSE 0.7135683722664626 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2087 with MSE 0.7150919912463799 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2088 with MSE 0.7182182252856582 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2155 with MSE 0.7073997508432746 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2156 with MSE 0.723646181588313 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2157 with MSE 0.7003318983923543 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2158 with MSE 0.7106538552773024 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2159 with MSE 0.7097909150123186 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2160 with MSE 0.7064669621997435 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2161 with MSE 0.7100282483614379 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2162 with MSE 0.7177426411373161 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2163 with MSE 0.7027386989899325 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2234 with MSE 0.7214050712676144 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2235 with MSE 0.7067069828662209 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2236 with MSE 0.7108096693381482 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2237 with MSE 0.7121668134995717 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2238 with MSE 0.6988472473562927 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2239 with MSE 0.7131420274973802 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2240 with MSE 0.704330443314012 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2241 with MSE 0.7200603552087897 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2242 with MSE 0.7109738163654363 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2309 with MSE 0.7110511623322527 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2310 with MSE 0.7123273180349566 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2311 with MSE 0.7134906227365284 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2312 with MSE 0.6987736495661483 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2313 with MSE 0.7098998925421248 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2314 with MSE 0.7093735392484872 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2315 with MSE 0.7092480378829413 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2316 with MSE 0.70985048684265 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2317 with MSE 0.7155695819110819 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2384 with MSE 0.7145520641629376 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2385 with MSE 0.7130393580327162 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2386 with MSE 0.7158596167584673 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2387 with MSE 0.7175747168956025 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2388 with MSE 0.7151188657835399 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2389 with MSE 0.7078329361690343 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2390 with MSE 0.7097011766707474 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2391 with MSE 0.7013129007441457 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2392 with MSE 0.7063900931395002 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2459 with MSE 0.7099737779597405 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2460 with MSE 0.7246124160308226 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2461 with MSE 0.710243783212117 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2462 with MSE 0.7204040103924485 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2463 with MSE 0.7146097058974141 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2464 with MSE 0.7077086226653714 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2465 with MSE 0.7057760788963163 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2466 with MSE 0.7266199094141396 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2467 with MSE 0.7197026869150741 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2533 with MSE 0.7163626932190907 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2534 with MSE 0.7027122038301004 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2535 with MSE 0.7113812836525099 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2536 with MSE 0.7241115470669575 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2537 with MSE 0.7109661406420307 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2538 with MSE 0.7195730608282165 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2539 with MSE 0.7160102300575134 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2540 with MSE 0.7059555839431394 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2541 with MSE 0.7129387997939488 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2609 with MSE 0.7049029740992631 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2610 with MSE 0.7058002309916381 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2611 with MSE 0.7058713429059017 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2612 with MSE 0.7114624565442139 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2613 with MSE 0.7012692222316527 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2614 with MSE 0.7073286459060775 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2615 with MSE 0.7095534653855474 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2616 with MSE 0.7157835059843448 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2617 with MSE 0.7097883128497213 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2684 with MSE 0.7047280964619963 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2685 with MSE 0.7103040778090142 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2686 with MSE 0.7174320776441493 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2687 with MSE 0.7248028275978299 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2688 with MSE 0.6998630025039496 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2689 with MSE 0.7172175931256891 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2690 with MSE 0.7118354617125936 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2691 with MSE 0.7032999301714279 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2692 with MSE 0.7075714564995276 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2760 with MSE 0.7116854169720324 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2761 with MSE 0.7037327132444485 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2762 with MSE 0.7092865469371563 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2763 with MSE 0.7088197930391145 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2764 with MSE 0.7128918193631756 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2765 with MSE 0.7157254919406355 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2766 with MSE 0.7079759993412955 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2767 with MSE 0.705346844036185 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2768 with MSE 0.7031745459855534 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2838 with MSE 0.7168821088292321 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2839 with MSE 0.7022127679912346 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2840 with MSE 0.71712515668668 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2841 with MSE 0.7124810001519855 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2842 with MSE 0.7031172329395244 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2843 with MSE 0.7001465331425908 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2844 with MSE 0.7112073671270007 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2845 with MSE 0.7098785261519694 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2846 with MSE 0.7191909809980176 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2912 with MSE 0.721045482985977 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2913 with MSE 0.721122883447858 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2914 with MSE 0.7073015955616309 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2915 with MSE 0.7034802245972545 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2916 with MSE 0.698717740691194 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2917 with MSE 0.7003103257037787 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2918 with MSE 0.7107910842929753 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2919 with MSE 0.7074073206801803 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2920 with MSE 0.7119648969290391 

orde

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2986 with MSE 0.7010335400557564 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2987 with MSE 0.7149925774140439 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2988 with MSE 0.7216834935337817 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2989 with MSE 0.7156570427371614 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2990 with MSE 0.7039555770970541 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2991 with MSE 0.7144401059544602 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2992 with MSE 0.70468235310629 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2993 with MSE 0.7142785125864989 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 2994 with MSE 0.713742176412674 

orde

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3063 with MSE 0.7139360645701689 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3064 with MSE 0.7040019211667072 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3065 with MSE 0.703333529633178 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3066 with MSE 0.7112715844408753 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3067 with MSE 0.7014983228394516 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3068 with MSE 0.7152950198991207 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3069 with MSE 0.7045636518898373 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3070 with MSE 0.7113737689989812 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3071 with MSE 0.7167746170954588 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3139 with MSE 0.7157369310192537 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3140 with MSE 0.705342606175393 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3141 with MSE 0.6969850854158879 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3142 with MSE 0.7121091797881199 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3143 with MSE 0.7084469361312845 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3144 with MSE 0.7140621737400978 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3145 with MSE 0.7139215690652583 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3146 with MSE 0.7232253677775142 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3147 with MSE 0.715549725797036 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3213 with MSE 0.7020740758825967 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3214 with MSE 0.7057786219407479 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3215 with MSE 0.7153870177455588 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3216 with MSE 0.7136004709076214 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3217 with MSE 0.701637811236345 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3218 with MSE 0.7010759554011944 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3219 with MSE 0.7042944770322103 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3220 with MSE 0.7037564834213884 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3221 with MSE 0.6988428989568206 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3292 with MSE 0.7092480755138212 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3293 with MSE 0.7050565867006209 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3294 with MSE 0.6974000197021509 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3295 with MSE 0.7152585247623005 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3296 with MSE 0.704031532394585 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3297 with MSE 0.7038897100125832 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3298 with MSE 0.7258543758065559 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3299 with MSE 0.702682031300209 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3300 with MSE 0.7031501959167082 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3367 with MSE 0.7106923260180031 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3368 with MSE 0.71004054843087 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3369 with MSE 0.7104748261973914 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3370 with MSE 0.7069918860503136 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3371 with MSE 0.7106655266426636 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3372 with MSE 0.7122973740755825 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3373 with MSE 0.706048366655848 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3374 with MSE 0.712646097213178 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3375 with MSE 0.7139372115939115 

order

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3441 with MSE 0.7048722662858005 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3442 with MSE 0.7143407490556412 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3443 with MSE 0.7066343790019445 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3444 with MSE 0.7139977901503988 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3445 with MSE 0.7056028224963234 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3446 with MSE 0.7072424829410682 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3447 with MSE 0.7197076920360249 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3448 with MSE 0.712553842775886 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3449 with MSE 0.7220209882449489 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3519 with MSE 0.7087448652114317 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3520 with MSE 0.708770714602298 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3521 with MSE 0.7194545989177926 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3522 with MSE 0.7141848245519147 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3523 with MSE 0.7122092248033359 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3524 with MSE 0.6994958434853953 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3525 with MSE 0.715417967833001 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3526 with MSE 0.7187539610769331 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3527 with MSE 0.7003228064626111 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3594 with MSE 0.7058474144144831 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3595 with MSE 0.7143602931594623 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3596 with MSE 0.711942418037941 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3597 with MSE 0.7042104252912887 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3598 with MSE 0.7137804391823579 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3599 with MSE 0.7136123278702389 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3600 with MSE 0.712389756820457 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3601 with MSE 0.7102968857356389 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3602 with MSE 0.7205379474824469 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3668 with MSE 0.6965529932878117 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3669 with MSE 0.7095667690533505 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3670 with MSE 0.7127649333325184 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3671 with MSE 0.7094709872673346 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3672 with MSE 0.7116318588341802 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3673 with MSE 0.7150933829135966 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3674 with MSE 0.7151611744941224 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3675 with MSE 0.7062455085245283 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3676 with MSE 0.7034393541872237 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3746 with MSE 0.7051572643726723 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3747 with MSE 0.707107291674161 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3748 with MSE 0.7075492547759793 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3749 with MSE 0.7122227263979617 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3750 with MSE 0.7051506433076273 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3751 with MSE 0.7063758701503184 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3752 with MSE 0.7104969668261543 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3753 with MSE 0.709608860826268 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3754 with MSE 0.7082924507971893 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3822 with MSE 0.7076163735141254 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3823 with MSE 0.7122920131863051 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3824 with MSE 0.7086554058988109 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3825 with MSE 0.7167339514765491 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3826 with MSE 0.7098079907896848 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3827 with MSE 0.7059017248305283 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3828 with MSE 0.7032661527684582 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3829 with MSE 0.7081901975218368 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3830 with MSE 0.7102564153757885 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3900 with MSE 0.7110447167714617 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3901 with MSE 0.7120961873797054 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3902 with MSE 0.7126100705954805 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3903 with MSE 0.7089201152144445 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3904 with MSE 0.703673852316498 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3905 with MSE 0.7160798677251875 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3906 with MSE 0.6996278353473536 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3907 with MSE 0.7148457966017543 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3908 with MSE 0.7046037301387608 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3978 with MSE 0.7139843288794695 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3979 with MSE 0.714424726077155 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3980 with MSE 0.7168710678264496 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3981 with MSE 0.7121837334197779 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3982 with MSE 0.7087760323022515 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3983 with MSE 0.710444704594565 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3984 with MSE 0.7026594082508933 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3985 with MSE 0.7051531723680418 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 3986 with MSE 0.7149710795174299 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4054 with MSE 0.7043504118825747 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4055 with MSE 0.7077241698358916 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4056 with MSE 0.7197119205553903 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4057 with MSE 0.7037151888782067 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4058 with MSE 0.7071931534689341 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4059 with MSE 0.7160366534060646 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4060 with MSE 0.7035942309696098 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4061 with MSE 0.7074636307916011 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4062 with MSE 0.6987622033931038 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4132 with MSE 0.7122799499691447 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4133 with MSE 0.6943478547539191 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4134 with MSE 0.7065898222269561 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4135 with MSE 0.7044786240621297 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4136 with MSE 0.7068790726147024 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4137 with MSE 0.716968176946752 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4138 with MSE 0.7150006671958472 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4139 with MSE 0.7088734241671149 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4140 with MSE 0.7189257537789021 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4209 with MSE 0.6909615428813122 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4210 with MSE 0.7140273512205816 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4211 with MSE 0.704952939018429 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4212 with MSE 0.7175652291880138 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4213 with MSE 0.7071036070320693 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4214 with MSE 0.7048356887288044 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4215 with MSE 0.711155576043394 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4216 with MSE 0.7053792087858474 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4217 with MSE 0.7027114752002341 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4287 with MSE 0.7134471145572879 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4288 with MSE 0.7138891239002225 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4289 with MSE 0.7004147872659059 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4290 with MSE 0.6966147713559216 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4291 with MSE 0.7133183647415194 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4292 with MSE 0.7005765892121202 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4293 with MSE 0.7254415194120877 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4294 with MSE 0.7121928913906531 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4295 with MSE 0.7125528474963766 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4361 with MSE 0.7160769393343462 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4362 with MSE 0.711581599911891 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4363 with MSE 0.6988561344183524 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4364 with MSE 0.7087306574505068 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4365 with MSE 0.7142454414545919 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4366 with MSE 0.7058638702361949 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4367 with MSE 0.7020632300335721 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4368 with MSE 0.7098964039838269 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4369 with MSE 0.7155861776713831 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4436 with MSE 0.6986612425866836 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4437 with MSE 0.7173597941176457 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4438 with MSE 0.7135368063375991 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4439 with MSE 0.7154767818380477 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4440 with MSE 0.7084457533887696 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4441 with MSE 0.7166496703829107 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4442 with MSE 0.7145478123206201 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4443 with MSE 0.7052791122666493 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4444 with MSE 0.7178757641523248 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4512 with MSE 0.7087982941727453 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4513 with MSE 0.6983171997558997 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4514 with MSE 0.7024225889175736 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4515 with MSE 0.7101322158639244 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4516 with MSE 0.7002628328118389 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4517 with MSE 0.7167586167456707 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4518 with MSE 0.7109695531190151 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4519 with MSE 0.7115936375629424 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4520 with MSE 0.7024957513673139 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4586 with MSE 0.7044532092900299 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4587 with MSE 0.7049762867127013 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4588 with MSE 0.7036381299122038 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4589 with MSE 0.7226629158269486 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4590 with MSE 0.7045524408513837 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4591 with MSE 0.7119235668286015 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4592 with MSE 0.714643233579059 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4593 with MSE 0.7121480720119128 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4594 with MSE 0.7147142492415108 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4663 with MSE 0.7088493411543643 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4664 with MSE 0.7152126952130764 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4665 with MSE 0.7135656502009273 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4666 with MSE 0.7082708341939492 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4667 with MSE 0.7095009901880197 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4668 with MSE 0.7006988554943927 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4669 with MSE 0.7074986139697799 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4670 with MSE 0.7175938141074862 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4671 with MSE 0.7173130411192709 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4742 with MSE 0.7056070589986707 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4743 with MSE 0.7355340219214257 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4744 with MSE 0.7079751614087434 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4745 with MSE 0.7168565952675816 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4746 with MSE 0.7048084681712562 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4747 with MSE 0.7103615767768293 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4748 with MSE 0.7030531951023438 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4749 with MSE 0.7046540217602446 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4750 with MSE 0.7188025399895669 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4817 with MSE 0.701341303467123 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4818 with MSE 0.7145434937522449 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4819 with MSE 0.7142119333371807 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4820 with MSE 0.7051004438028309 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4821 with MSE 0.7014967932067985 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4822 with MSE 0.7146679926666768 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4823 with MSE 0.7296140804406303 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4824 with MSE 0.7082136082098907 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4825 with MSE 0.7043401304908827 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4892 with MSE 0.7142965676346174 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4893 with MSE 0.7219523825067233 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4894 with MSE 0.7139813688839369 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4895 with MSE 0.6993757424929825 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4896 with MSE 0.7099169375960377 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4897 with MSE 0.708417155731498 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4898 with MSE 0.7058396027299946 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4899 with MSE 0.7105957698729206 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4900 with MSE 0.7062449273979631 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4967 with MSE 0.7198902152226797 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4968 with MSE 0.7096584817835904 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4969 with MSE 0.7080543772649444 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4970 with MSE 0.7181660701933301 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4971 with MSE 0.716383532927569 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4972 with MSE 0.7110838943745684 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4973 with MSE 0.7140376886088143 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4974 with MSE 0.7227123491095432 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 4975 with MSE 0.7115722269214996 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5044 with MSE 0.7241772939404455 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5045 with MSE 0.7101104021425187 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5046 with MSE 0.6999047739477776 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5047 with MSE 0.7074930072485387 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5048 with MSE 0.7062308804907953 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5049 with MSE 0.7102558788823261 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5050 with MSE 0.7110013548132291 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5051 with MSE 0.7096162060224537 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5052 with MSE 0.7182627030869829 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5122 with MSE 0.7004711436827193 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5123 with MSE 0.723117717526395 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5124 with MSE 0.7069269503304658 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5125 with MSE 0.7130740857612572 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5126 with MSE 0.7219432739904947 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5127 with MSE 0.7033234261611124 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5128 with MSE 0.7030860171952107 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5129 with MSE 0.7060956714312816 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5130 with MSE 0.7100852749702541 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5198 with MSE 0.7334412896733211 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5199 with MSE 0.7036820491099403 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5200 with MSE 0.7041671043081666 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5201 with MSE 0.7195982278267917 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5202 with MSE 0.7069880760382645 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5203 with MSE 0.7014138440752686 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5204 with MSE 0.7088838623562885 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5205 with MSE 0.7074398968127364 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5206 with MSE 0.7141846252488985 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5275 with MSE 0.7053295522252628 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5276 with MSE 0.7127660176607146 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5277 with MSE 0.710673992527685 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5278 with MSE 0.7115726028582265 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5279 with MSE 0.7145447201662668 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5280 with MSE 0.7035940628096826 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5281 with MSE 0.7097272933130773 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5282 with MSE 0.7080060831309357 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5283 with MSE 0.706328728429453 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5352 with MSE 0.7125327066258791 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5353 with MSE 0.7158573906993843 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5354 with MSE 0.6997986022461928 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5355 with MSE 0.7142637408574466 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5356 with MSE 0.7029390060690939 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5357 with MSE 0.7121267789085999 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5358 with MSE 0.7010332702295973 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5359 with MSE 0.7129874816759485 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5360 with MSE 0.7187282226125317 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5428 with MSE 0.7079984978876068 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5429 with MSE 0.721932521391904 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5430 with MSE 0.7113773839868892 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5431 with MSE 0.7117596306427073 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5432 with MSE 0.6986377048633132 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5433 with MSE 0.7094532693339323 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5434 with MSE 0.7118807274157226 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5435 with MSE 0.7102190527193446 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5436 with MSE 0.7121891174117999 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5503 with MSE 0.7117329581268639 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5504 with MSE 0.7165379666984066 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5505 with MSE 0.7054390102526612 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5506 with MSE 0.711996423382174 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5507 with MSE 0.7110897228047045 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5508 with MSE 0.7143143295767198 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5509 with MSE 0.6980717033788965 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5510 with MSE 0.7035815384487474 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5511 with MSE 0.7067273387833588 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5577 with MSE 0.7144530453017319 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5578 with MSE 0.6971445993305552 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5579 with MSE 0.7116101985952757 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5580 with MSE 0.7061357574336227 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5581 with MSE 0.7272300351104644 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5582 with MSE 0.7098335008950343 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5583 with MSE 0.71408692082905 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5584 with MSE 0.7111454814451631 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5585 with MSE 0.7081946818862273 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5651 with MSE 0.7252860368769113 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5652 with MSE 0.7075537190265451 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5653 with MSE 0.7041136848726561 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5654 with MSE 0.7074291956527354 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5655 with MSE 0.7251337382049227 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5656 with MSE 0.7137674966827001 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5657 with MSE 0.6979481834281919 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5658 with MSE 0.7140815379169919 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5659 with MSE 0.7189465546618776 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5725 with MSE 0.6981630534076781 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5726 with MSE 0.7168595818382352 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5727 with MSE 0.7093597471363744 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5728 with MSE 0.7157289339999741 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5729 with MSE 0.7190245502070903 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5730 with MSE 0.7178393186579167 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5731 with MSE 0.7134973062162918 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5732 with MSE 0.7185059085324056 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5733 with MSE 0.7164800908898821 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5801 with MSE 0.7125755462522214 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5802 with MSE 0.7263146755444795 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5803 with MSE 0.7091333901674225 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5804 with MSE 0.7147281606102815 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5805 with MSE 0.703401216332162 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5806 with MSE 0.719198356483191 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5807 with MSE 0.7119863913571313 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5808 with MSE 0.7086593697618441 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5809 with MSE 0.7099133832622855 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5878 with MSE 0.7111949365905593 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5879 with MSE 0.7040473822011349 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5880 with MSE 0.7091427656211761 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5881 with MSE 0.7093534889817695 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5882 with MSE 0.710283001922113 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5883 with MSE 0.7042676037246108 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5884 with MSE 0.7070133423469718 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5885 with MSE 0.7017221731427459 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5886 with MSE 0.7029597106369911 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5957 with MSE 0.7097387664625555 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5958 with MSE 0.7123871272444461 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5959 with MSE 0.7112318296433546 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5960 with MSE 0.7190039559308568 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5961 with MSE 0.7073280650260386 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5962 with MSE 0.7137662774306728 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5963 with MSE 0.706250511538367 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5964 with MSE 0.7162816139760169 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 5965 with MSE 0.7119348619592075 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6033 with MSE 0.7275305510926797 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6034 with MSE 0.7178919333505671 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6035 with MSE 0.7070587686662096 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6036 with MSE 0.7088783317407801 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6037 with MSE 0.7044599129567224 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6038 with MSE 0.707907732545935 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6039 with MSE 0.718365490693482 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6040 with MSE 0.7090121259633079 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6041 with MSE 0.7108992071055837 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6110 with MSE 0.7175045971127982 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6111 with MSE 0.716042802318251 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6112 with MSE 0.7083254253713656 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6113 with MSE 0.7111437540446798 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6114 with MSE 0.7031964067806094 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6115 with MSE 0.7167485175999421 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6116 with MSE 0.7093470486470705 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6117 with MSE 0.7130408576954487 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6118 with MSE 0.7031416446586327 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6184 with MSE 0.7096244277441452 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6185 with MSE 0.7065865767978885 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6186 with MSE 0.7134560596370022 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6187 with MSE 0.7169832355520709 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6188 with MSE 0.7038518451975955 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6189 with MSE 0.7091263423806392 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6190 with MSE 0.7135373661004691 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6191 with MSE 0.7053280953968694 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6192 with MSE 0.7050441226783665 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6258 with MSE 0.7052975665146142 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6259 with MSE 0.7193430815479558 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6260 with MSE 0.7078978323172366 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6261 with MSE 0.7203040434298389 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6262 with MSE 0.7150047830143623 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6263 with MSE 0.7058716070548011 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6264 with MSE 0.7001853799341142 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6265 with MSE 0.7172850559754171 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6266 with MSE 0.7071863097213201 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6332 with MSE 0.7065726976931969 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6333 with MSE 0.6992744527863604 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6334 with MSE 0.6980434322522902 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6335 with MSE 0.7252741846740401 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6336 with MSE 0.7062240453779876 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6337 with MSE 0.7052079451786927 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6338 with MSE 0.7114690033311027 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6339 with MSE 0.695737550487566 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6340 with MSE 0.7121453802051282 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6406 with MSE 0.6988913562438106 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6407 with MSE 0.706128403296401 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6408 with MSE 0.7059567956584634 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6409 with MSE 0.7015315400752865 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6410 with MSE 0.7110405525775665 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6411 with MSE 0.7141495006533773 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6412 with MSE 0.707369705042266 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6413 with MSE 0.7131645824724974 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6414 with MSE 0.6962489848840346 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6484 with MSE 0.7060852516971243 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6485 with MSE 0.7188152225550928 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6486 with MSE 0.7123283617818379 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6487 with MSE 0.7101151854949722 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6488 with MSE 0.7125028267078359 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6489 with MSE 0.7061743676895108 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6490 with MSE 0.7127479063812445 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6491 with MSE 0.7144895892356806 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6492 with MSE 0.706004439459995 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6561 with MSE 0.7068266467389476 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6562 with MSE 0.7196337089670587 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6563 with MSE 0.705241181304071 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6564 with MSE 0.7030944808570784 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6565 with MSE 0.7081402625399786 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6566 with MSE 0.714080912754557 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6567 with MSE 0.7092083400556727 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6568 with MSE 0.721854023705012 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6569 with MSE 0.7077097500359935 

orde

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6638 with MSE 0.7134120553160856 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6639 with MSE 0.7023342946752636 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6640 with MSE 0.7125256245508927 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6641 with MSE 0.7195802335737603 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6642 with MSE 0.7023957044747298 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6643 with MSE 0.7173013154127004 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6644 with MSE 0.7136454259763135 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6645 with MSE 0.710723407722119 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6646 with MSE 0.7047425347281981 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6712 with MSE 0.7085077742773123 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6713 with MSE 0.7061140713287962 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6714 with MSE 0.7216437221320674 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6715 with MSE 0.7143127177546965 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6716 with MSE 0.7058061960409133 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6717 with MSE 0.7135798191141238 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6718 with MSE 0.7078122643350856 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6719 with MSE 0.7006264746239272 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6720 with MSE 0.7190826504315254 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6787 with MSE 0.7082934208591799 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6788 with MSE 0.7107722095512861 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6789 with MSE 0.7093099069662838 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6790 with MSE 0.7071873720763275 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6791 with MSE 0.7138834737956895 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6792 with MSE 0.7084169792490358 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6793 with MSE 0.7123720459188962 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6794 with MSE 0.7032436652009127 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6795 with MSE 0.706420153410095 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6863 with MSE 0.7107693885162156 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6864 with MSE 0.7142259603916936 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6865 with MSE 0.7087133730777466 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6866 with MSE 0.7252367364198994 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6867 with MSE 0.7245962778577779 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6868 with MSE 0.7098535248239645 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6869 with MSE 0.7026965716990331 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6870 with MSE 0.7115823171373414 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6871 with MSE 0.7064422097494969 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6941 with MSE 0.7185434065471042 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6942 with MSE 0.7034861734718336 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6943 with MSE 0.7128138660658248 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6944 with MSE 0.7049773145249766 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6945 with MSE 0.6969216860267093 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6946 with MSE 0.7179276239692779 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6947 with MSE 0.716628176934941 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6948 with MSE 0.7164502630518592 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 6949 with MSE 0.7029391650199737 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7016 with MSE 0.7055300002527117 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7017 with MSE 0.6983217967050178 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7018 with MSE 0.7068192851432702 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7019 with MSE 0.7163250305595171 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7020 with MSE 0.7145118397330595 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7021 with MSE 0.709047387224838 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7022 with MSE 0.6954644572240241 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7023 with MSE 0.712924030617335 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7024 with MSE 0.703166615107571 

orde

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7090 with MSE 0.7071742981596458 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7091 with MSE 0.7106579513189466 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7092 with MSE 0.6999301459949977 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7093 with MSE 0.713187356921252 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7094 with MSE 0.721854094554691 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7095 with MSE 0.7043281925376171 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7096 with MSE 0.7100860129052786 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7097 with MSE 0.7090042732029705 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7098 with MSE 0.7021567711732678 

ord

Convergence after epoch 7164 with MSE 0.7179622652759788 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7165 with MSE 0.7122393748528564 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7166 with MSE 0.703152009218271 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7167 with MSE 0.7032590041475367 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7168 with MSE 0.7087685044629461 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7169 with MSE 0.7089810011338613 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7170 with MSE 0.7008885228031355 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7171 with MSE 0.7128601436749479 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7172 with MSE 0.6998315382624268 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Co

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7240 with MSE 0.7146559472540435 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7241 with MSE 0.7157398175350473 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7242 with MSE 0.7066913592752576 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7243 with MSE 0.7044860357045302 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7244 with MSE 0.7166297674188556 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7245 with MSE 0.7065848850759034 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7246 with MSE 0.7102408663497298 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7247 with MSE 0.7090742803895124 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7248 with MSE 0.7109208039333678 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7314 with MSE 0.7080674719550778 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7315 with MSE 0.7069456299805283 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7316 with MSE 0.7157192132595195 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7317 with MSE 0.7165803823263339 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7318 with MSE 0.7082949848865557 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7319 with MSE 0.7162197437808012 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7320 with MSE 0.7233203597832044 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7321 with MSE 0.7189198817253981 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7322 with MSE 0.719017608551553 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7390 with MSE 0.7036570299134074 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7391 with MSE 0.7143268279979106 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7392 with MSE 0.7168500400159882 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7393 with MSE 0.7203039710880347 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7394 with MSE 0.7086460527346743 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7395 with MSE 0.7007483688739767 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7396 with MSE 0.7225212843746237 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7397 with MSE 0.7169293489757143 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7398 with MSE 0.7038394660390357 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7468 with MSE 0.7099754142261053 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7469 with MSE 0.7043064882715526 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7470 with MSE 0.7085022900877407 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7471 with MSE 0.6992466665248499 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7472 with MSE 0.7009195005401858 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7473 with MSE 0.7088942565676749 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7474 with MSE 0.6947790637380908 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7475 with MSE 0.7147736611025007 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7476 with MSE 0.7081329708174494 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7543 with MSE 0.6928761748470642 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7544 with MSE 0.7078228003446632 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7545 with MSE 0.7177776532503795 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7546 with MSE 0.7095441335897024 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7547 with MSE 0.702404293402897 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7548 with MSE 0.706248331096316 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7549 with MSE 0.7132527035998635 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7550 with MSE 0.7083633080982618 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7551 with MSE 0.7180954989279443 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7619 with MSE 0.7044658603270149 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7620 with MSE 0.7081212228943089 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7621 with MSE 0.712926139361301 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7622 with MSE 0.7154568543334742 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7623 with MSE 0.7027029390878465 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7624 with MSE 0.7103624999282896 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7625 with MSE 0.7148495128066054 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7626 with MSE 0.714188203907726 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7627 with MSE 0.721585245395441 

orde

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7695 with MSE 0.7227600739150699 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7696 with MSE 0.7040880243376086 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7697 with MSE 0.7048473061714482 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7698 with MSE 0.7098653297396351 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7699 with MSE 0.7105808252240611 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7700 with MSE 0.7048392424146068 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7701 with MSE 0.7101280845453226 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7702 with MSE 0.7077202940552918 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7703 with MSE 0.7096079943282922 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7770 with MSE 0.7114573863771784 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7771 with MSE 0.7051716812206025 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7772 with MSE 0.7085248082349839 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7773 with MSE 0.7038738059405153 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7774 with MSE 0.7079690307743549 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7775 with MSE 0.7141497950426107 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7776 with MSE 0.7044398644590824 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7777 with MSE 0.6961331323615454 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7778 with MSE 0.7028538672965648 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7846 with MSE 0.7038401984415329 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7847 with MSE 0.7113272184487208 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7848 with MSE 0.7127010905910038 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7849 with MSE 0.716026674114038 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7850 with MSE 0.7113208654457259 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7851 with MSE 0.7173292830673538 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7852 with MSE 0.7099321340617453 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7853 with MSE 0.7123636108346872 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7854 with MSE 0.6943678919969036 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7921 with MSE 0.7128842415918176 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7922 with MSE 0.710254861800535 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7923 with MSE 0.7101027572894271 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7924 with MSE 0.71152308681244 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7925 with MSE 0.7155018620470315 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7926 with MSE 0.7145106248595307 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7927 with MSE 0.7102978839915498 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7928 with MSE 0.7108124089515779 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7929 with MSE 0.7115992201198491 

orde

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 7999 with MSE 0.7150861108207722 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8000 with MSE 0.7018234574985717 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8001 with MSE 0.7152036600122222 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8002 with MSE 0.7184699889275217 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8003 with MSE 0.71005078051655 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8004 with MSE 0.7106966018385561 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8005 with MSE 0.698504954498396 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8006 with MSE 0.7100632995306787 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8007 with MSE 0.7032024907265622 

orde

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8077 with MSE 0.7178785974201471 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8078 with MSE 0.710069995013565 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8079 with MSE 0.7024051438096806 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8080 with MSE 0.7149646669242777 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8081 with MSE 0.6977456824571923 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8082 with MSE 0.704478614627726 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8083 with MSE 0.71955046954193 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8084 with MSE 0.7131530591446477 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8085 with MSE 0.7210267906716369 

order

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8152 with MSE 0.718418980708223 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8153 with MSE 0.7132885870389746 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8154 with MSE 0.7118516684997994 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8155 with MSE 0.7033794204883913 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8156 with MSE 0.7138219619157645 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8157 with MSE 0.7209227829303719 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8158 with MSE 0.7082880138988706 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8159 with MSE 0.6971445830574076 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8160 with MSE 0.7146000671911615 

or

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8227 with MSE 0.7175428613658289 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8228 with MSE 0.7039071367701916 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8229 with MSE 0.7064157664619058 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8230 with MSE 0.715022103975991 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8231 with MSE 0.7200940328870814 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8232 with MSE 0.71834235518817 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8233 with MSE 0.7100157390172926 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8234 with MSE 0.7190255441440899 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8235 with MSE 0.7101825092621863 

orde

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8302 with MSE 0.7123844872414283 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8303 with MSE 0.7134420352523064 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8304 with MSE 0.7161932311973516 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8305 with MSE 0.6987246336402847 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8306 with MSE 0.7081394559486829 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8307 with MSE 0.7097447032306813 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8308 with MSE 0.7196478045069947 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8309 with MSE 0.7096223501044938 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8310 with MSE 0.7045362663363295 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8378 with MSE 0.7076733350921905 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8379 with MSE 0.7135776110380891 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8380 with MSE 0.7130870795963551 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8381 with MSE 0.7104493243190898 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8382 with MSE 0.699657180362145 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8383 with MSE 0.721756832426252 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8384 with MSE 0.7049505223087535 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8385 with MSE 0.7069646620103008 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8386 with MSE 0.7183478544189269 

ord

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8455 with MSE 0.7081697086056435 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8456 with MSE 0.7083272089760996 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8457 with MSE 0.7075152018524923 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8458 with MSE 0.7035216500600499 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8459 with MSE 0.7171176293438731 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8460 with MSE 0.7070581252278478 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8461 with MSE 0.7158051865097351 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8462 with MSE 0.7172441773896583 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8463 with MSE 0.7083018161279778 

o

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8529 with MSE 0.7110838041272667 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8530 with MSE 0.7175366202087077 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8531 with MSE 0.7165556558577391 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8532 with MSE 0.709443659430537 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8533 with MSE 0.7031251094287037 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8534 with MSE 0.7121442168752448 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8535 with MSE 0.7161146852405129 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8536 with MSE 0.7084095562607965 

order: 1 , learning rate: 1e-05 , regularizer: 0.1 
Convergence after epoch 8537 with MSE 0.7056475200427106 

or

KeyboardInterrupt: 

Complete the following function which selects the best hyperparameter combination (i.e. the one that gives lowest MSE on **validation data**). (0.5 point)

In [None]:
# find hparams of minimum MSE on Validation data
def find_best_hparams(weights_hist):
    # TODO: implement
    hpm_best, mse_best = None
    return hpm_best, mse_best

best_hpm_combination = find_best_hparams(weights_hist)

###### 9. Re-Training on Train + Validation data
Complete the following function which does re-training on the combined training and validation data. (**1 point**)

In [2]:
# re-run the training on X_train + X_val combined
# Later test it on X_test; That will be our best possible MSE on test data
# this will be more or less the same training code as you did above
# but, here we just have only one value for each hyperparameter.

# TODO: implement

In [4]:
# plot the convergence of MSE values using matplotlib
# i.e. #epochs on X-axis and MSE values on Y-axis
# TODO: implement

###### 10. Evaluation on Test set
Evaluate your model on test data. (1.0 point)

**Please note that you should keep X_test undisturbed throughout this whole phase.** Else restart the kernel and start from beginning. The whole point of this exercise would not make sense if test data has been *seen in training*.

In [None]:
# finally!!!
# test it on X_test with the Weight vector that you found above
# this will be the generalization error of our model!!
# TODO: implement

#print("Finally!!! MSE achieved on X_test is : {}".format(round(mse_test, 6)))

###### 11. Results
Please report the following

a) MSE value on Test data. (0.5 points)

b) Which hyperparameter combination turned out to be the best? In your understanding, why do you think such a combination turned out to be the best for this task? (1.0 point)

# Bonus (2 points)

Now, please repeat the whole *training, validation, re-training, and testing* procedure that we talked about above with the following hyperparameter combination:

In [2]:
polynomial_order = [1]
learning_rates = [0.1]
lambdas = [0.1]

What are your observations during the training phase? Please explain why such a behaviour happened.

---

## Submission instructions
You should provide a single Jupyter notebook as the solution. The naming should include the assignment number and matriculation IDs of all members in your team in the following format:
**assignment-4_matriculation1_matriculation2_matriculation3.ipynb** (in case of 3 members in a team). 
Make sure to keep the order matriculation1_matriculation2_matriculation3 the same for all assignments.

Please submit the solution to your tutor (with **[NNIA][assignment-4]** in email subject):
1. Maksym Andriushchenko <s8mmandr@stud.uni-saarland.de>
2. Marius Mosbach <s9msmosb@stud.uni-saarland.de>
3. Rajarshi Biswas <rbisw17@gmail.com>
4. Marimuthu Kalimuthu <s8makali@stud.uni-saarland.de>

Note: **If you are in a team, please submit only 1 solution to only 1 tutor.**