# iLykei Lecture Series

# Advanced Machine Learning and Artificial Intelligence

# Reinforcement Learning

## Notebook 6: Learning Ms. Pac-Man with DQN

## Yuri Balasanov, Mihail Tselishchev, &copy; iLykei 2018

##### Main text: Hands-On Machine Learning with Scikit-Learn and TensorFlow, Aurelien Geron, &copy; Aurelien Geron 2017, O'Reilly Media, Inc


In [2]:
%matplotlib inline
import matplotlib.pyplot as plt


import numpy as np
import random
import time
import os
import gc

from keras.models import Sequential, clone_model
from keras.layers import Dense, Flatten, Conv2D, InputLayer
from keras.callbacks import CSVLogger, TensorBoard
from keras.optimizers import Adam
import keras.backend as K

import gym

plt.rcParams['figure.figsize'] = (9, 9)

Using TensorFlow backend.


# Deep Q-Learning of Ms. Pac-Man with Keras

This notebook shows how to use a deep neural network to train an agent to play the Atari game Ms. Pac-Man.


## Explore the game

Use the [Gym](https://gym.openai.com/) toolkit, which provides both the game environment and a convenient renderer of the game.

Create an environment.

In [3]:
env = gym.make("MsPacman-ram-v0")
env.action_space  # actions are integers from 0 to 8

Discrete(9)

Try playing the game with a random strategy:

In [3]:
env.reset()
done = False
score = 0
while not done:
    action = random.randrange(env.action_space.n)  # select random action
    obs, reward, done, info = env.step(action)     # make action and get results
    score += reward
    env.render()
    time.sleep(0.01)
    
env.close()
print('Score =', score)

Score = 160.0


### Observation

In this environment, the observation (i.e., the current state) is the RAM of the Atari machine, namely a vector of 128 bytes:

In [4]:
obs = env.reset()
print('obs shape =', obs.shape)
print('obs dtype =', obs.dtype)

obs shape = (128,)
obs dtype = uint8


Look at that vector:

In [5]:
print(obs)

[  0 112 114 115   0   3  88  88  88  88  88   0  80  80  80  50  98   0
   0   3   0   0   1   0   0   1   6   6 198   4  71   0  45   1   0 198
 198   0   0   0   0  16  52   0   0 120   0 100 130   0   0 134   1 222
   0   1   3   0   6  80 255 255   0 255 255  80 255 255  80 255 255  80
 255 255  80 191 191  80 191 191  80 191 191  80 255 255  80 255 255  80
 255 255  80 255 255   0 255 255  80 255 255  20 223  43 217 123 217 123
 217 123 217 123 217 123 217 221   0  63   0   0   0   0   0   2  66 240
 146 215]


Next, create a deep neural network that takes this byte vector as input and produces one Q-value per action for the given state.

## Creating a DQN-model using Keras

The following model is of the same general type as the one applied to the CartPole problem.

Use a vanilla multi-layer dense network with ReLU activations to approximate the Q-values $Q(s,a)$ of state $s$ and action $a$ (for some discount factor $\gamma$).
This neural network, denoted by $Q(s\ |\ \theta)$, takes the current state as input and produces a vector of Q-values, one for each of the 9 possible actions. The vector $\theta$ collects all trainable parameters.

In [7]:
# def create_dqn_model(input_shape, nb_actions, dense_layers, dense_units):
#     model = Sequential()
#     model.add(InputLayer(input_shape=input_shape))
#     for i in range(dense_layers):
#         model.add(Dense(units=dense_units, activation='relu'))
#     model.add(Dense(nb_actions, activation='linear'))
#     return model

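def create_dqn_model(input_shape, nb_actions, dense_layers, dense_units):
    """Build a dense Q-network: `dense_layers` hidden ReLU layers and one linear output per action."""
    model = Sequential()
    for i in range(dense_layers):
        if i == 0:
            # the first layer also declares the input shape
            model.add(Dense(units=dense_units, activation='relu', input_shape=input_shape))
        else:
            model.add(Dense(units=dense_units, activation='relu'))
    # linear output layer: one Q-value per action
    model.add(Dense(nb_actions, activation='linear'))
    return model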

Create a network with the specific input shape and action-space size. We call this network the *online* network.

In [8]:
input_shape = obs.shape
nb_actions = env.action_space.n  # 9
dense_layers = 5
dense_units = 256

online_network = create_dqn_model(input_shape, nb_actions, dense_layers, dense_units)
online_network.summary()




_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 256)               33024     
_________________________________________________________________
dense_2 (Dense)              (None, 256)               65792     
_________________________________________________________________
dense_3 (Dense)              (None, 256)               65792     
_________________________________________________________________
dense_4 (Dense)              (None, 256)               65792     
_________________________________________________________________
dense_5 (Dense)              (None, 256)               65792     
_________________________________________________________________
dense_6 (Dense)              (None, 9)                 2313      
Total params: 298,505
Trainable params: 298,505
Non-trainable params: 0
_________________________________________________________________
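The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The following is a minimal sketch of that arithmetic; the helper `dense_params` is a name introduced here for illustration and is not part of Keras.

```python
def dense_params(n_in, n_out):
    """Number of weights plus biases in a fully connected layer."""
    return n_in * n_out + n_out

# 128-byte input, five hidden layers of 256 units, nine Q-value outputs
layer_sizes = [128] + [256] * 5 + [9]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)      # one entry per Dense layer, in the order of the summary
print(total_params)   # should match "Total params" in the summary
```

The per-layer values should agree with the `Param #` column, and their sum with the reported total of 298,505.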


In [8]:
from keras.utils import plot_model
plot_model(online_network, to_file='online_DenseNetwork.png',show_shapes=True,show_layer_names=True)

The cell above plots the architecture of the network and saves it as *online_DenseNetwork.png*. (To see the plot, log in to iLykei.com, then rerun this cell.)

![Model plot](https://ilykei.com/api/fileProxy/documents%2FAdvanced%20Machine%20Learning%2FReinforced%20Learning%2Fonline_DenseNetwork.png)

This network is used to explore the states and rewards of the Markov decision process according to an $\varepsilon$-greedy exploration strategy:

In [9]:
def epsilon_greedy(q_values, epsilon, n_outputs):
    if random.random() < epsilon:
        return random.randrange(n_outputs)  # random action
    else:
        return np.argmax(q_values)          # q-optimal action
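To see how the rule behaves, the sketch below applies it to a made-up vector of four Q-values. With $\varepsilon = 0$ the greedy action is always returned; with $\varepsilon = 0.2$ and four actions, the greedy action should be chosen with probability $1 - 0.2 + 0.2/4 = 0.85$, because a random draw can also land on it. The Q-values, seed and trial count are illustrative assumptions.

```python
import random
import numpy as np

def epsilon_greedy(q_values, epsilon, n_outputs):
    # same rule as in the cell above, repeated so the sketch is self-contained
    if random.random() < epsilon:
        return random.randrange(n_outputs)   # random action
    return int(np.argmax(q_values))          # q-optimal action

q_values = np.array([0.1, 2.5, -1.0, 0.7])   # made-up Q-values for 4 actions

# epsilon = 0: random.random() is never below 0, so the choice is always greedy
greedy_action = epsilon_greedy(q_values, epsilon=0.0, n_outputs=4)

# epsilon = 0.2: estimate how often the greedy action (index 1) is picked
random.seed(0)
n_trials = 10000
picks = [epsilon_greedy(q_values, epsilon=0.2, n_outputs=4) for _ in range(n_trials)]
greedy_share = picks.count(1) / n_trials
print(greedy_action, greedy_share)
```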

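The transitions explored by the online network are stored in a *replay memory*, implemented as a double-ended queue (deque) with a maximum length. Once the deque is full, appending a new transition discards the oldest one.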

In [10]:
from collections import deque

replay_memory_maxlen = 1000000
replay_memory = deque([], maxlen=replay_memory_maxlen)
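The behaviour of a bounded deque is easiest to see with a tiny capacity. The sketch below uses `maxlen=3` and placeholder transitions (integers instead of RAM vectors); both are assumptions made only for illustration.

```python
import random
from collections import deque

memory = deque([], maxlen=3)   # tiny capacity, for illustration only
for t in range(5):
    # placeholder transition: (state, action, reward, next_state, done)
    memory.append((t, 0, 1.0, t + 1, False))

# only the three most recent transitions survive; the two oldest were evicted
stored_states = [transition[0] for transition in memory]

# uniform sampling without replacement, as used later to build minibatches
minibatch = random.sample(list(memory), 2)
print(stored_states, minibatch)
```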

So, the online network explores the game using the $\varepsilon$-greedy strategy and saves the experienced transitions in the replay memory.

To produce training targets for the online network, follow the proposal of the [original paper by Google DeepMind](https://www.nature.com/articles/nature14236) and use another network, called the *target network*, to calculate the "ground-truth" targets. The *target network* has the same architecture as the online network and is not trained directly. Instead, weights from the online network are periodically copied to the target network.

In [11]:
target_network = clone_model(online_network)
target_network.set_weights(online_network.get_weights())
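The periodic copy can be pictured without Keras. In the sketch below, plain NumPy arrays stand in for the two sets of weights, a constant increment stands in for a gradient update, and the copy period of 100 steps is an assumed value chosen for illustration.

```python
import numpy as np

copy_steps = 100                       # assumed copy period, in training steps
online_weights = [np.zeros(3)]         # stand-in for online_network.get_weights()
target_weights = [w.copy() for w in online_weights]

sync_steps = []
lag_before_sync = []
for step in range(1, 301):
    online_weights[0] += 0.01          # stand-in for one training update
    if step % copy_steps == 0:
        # how far the target network lagged behind just before the copy
        lag_before_sync.append(float(online_weights[0][0] - target_weights[0][0]))
        target_weights = [w.copy() for w in online_weights]
        sync_steps.append(step)

print(sync_steps, lag_before_sync)
```

Between copies the target weights stay frozen, so the targets used for training change only once every `copy_steps` updates.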









The target network uses past experience, in the form of randomly selected records from the replay memory, to predict targets for the online network:

- Select a random minibatch from the replay memory. Each record is a tuple $(\text{state},\text{action},\text{reward},\text{next_state})$, stored together with a flag indicating whether the game ended at that step.

- For every tuple $(\text{state},\text{action},\text{reward},\text{next_state})$ in the minibatch, the Q-value $Q(\text{state},\text{action}\ |\ \theta_{\text{online}})$ is trained on predictions of $Q(\text{next_state}, a\ |\ \theta_\text{target})$ according to a Bellman-type equation:

$$y_\text{target} = \text{reward} + \gamma \cdot \max_a Q(\text{next_state}, a\ |\ \theta_\text{target})$$
if the game continues, and $$ y_\text{target} = \text{reward}$$ if the game has ended.

Note that at this step the predictions are made by the target network. This helps prevent the situation in which the online network simultaneously predicts values and creates its own targets, which can make the training process unstable.

- For each record in the minibatch, a target needs to be calculated for only one output of the online network: the one corresponding to the recorded $\text{action}$. It is important to ignore all other outputs during optimization (when calculating gradients). So, predictions for every record in the minibatch are first calculated by the online network, and then the value corresponding to the action actually taken is replaced with the target $y_\text{target}$ computed from the target network.
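The steps above can be traced on a toy minibatch. In the sketch below, hand-written arrays stand in for the network outputs (no Keras model is involved), and the batch size, number of actions, rewards and $\gamma = 0.9$ are all illustrative assumptions.

```python
import numpy as np

gamma = 0.9
# made-up minibatch of 3 transitions with 4 possible actions
actions = np.array([2, 0, 3])
rewards = np.array([10.0, 0.0, 50.0])
dones = np.array([0, 0, 1])            # the last transition ended the game

# stand-in for target_network.predict(next_state)
q_next_target = np.array([[1.0, 4.0, 2.0, 0.0],
                          [3.0, 1.0, 0.0, 2.0],
                          [9.0, 9.0, 9.0, 9.0]])
# stand-in for online_network.predict(state)
q_online = np.zeros((3, 4))

# Bellman targets; (1 - dones) removes the bootstrap term for terminal transitions
y = rewards + (1 - dones) * gamma * q_next_target.max(axis=1)

# start from the online predictions and overwrite only the entry of the action taken,
# so the squared error (and hence the gradient) is zero for every other output
target = q_online.copy()
target[np.arange(3), actions] = y
print(y)
print(target)
```

For the terminal transition the target equals the reward alone, regardless of the values predicted for the next state.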

## Double DQN

The approach described in the previous section is called the **DQN** approach.

The DQN approach is very powerful and makes it possible to train agents in very complex, high-dimensional environments.

However, [it is known](https://arxiv.org/abs/1509.06461) to overestimate Q-values under certain conditions.

An alternative approach proposed in the [same paper](https://arxiv.org/abs/1509.06461) is called **Double DQN**.

Instead of taking the action that maximizes the Q-value of the target network, Double DQN selects the action that maximizes the Q-value of the online network and evaluates that action with the target network:

$$y_\text{target} = \text{reward} + \gamma \cdot Q\left(\text{next_state}, \arg\max_a Q\left(\text{next_state},a\ |\ \theta_\text{online}\right)\ |\ \theta_\text{target}\right).$$
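The difference between the two targets shows up on a small example. The arrays below are made-up stand-ins for the outputs of the online and target networks on `next_state`; nothing here comes from a trained model.

```python
import numpy as np

gamma = 0.9
rewards = np.array([1.0, 0.0])
dones = np.array([0, 0])

# stand-ins for the two networks' predictions on next_state (2 samples, 3 actions)
q_next_online = np.array([[1.0, 5.0, 2.0],
                          [0.5, 0.2, 0.9]])
q_next_target = np.array([[4.0, 3.0, 6.0],
                          [2.0, 7.0, 1.0]])

# DQN: the target network both selects and evaluates the best action
y_dqn = rewards + (1 - dones) * gamma * q_next_target.max(axis=1)

# Double DQN: the online network selects, the target network evaluates
best_actions = q_next_online.argmax(axis=1)
y_ddqn = rewards + (1 - dones) * gamma * q_next_target[np.arange(2), best_actions]

print(y_dqn, y_ddqn)
```

Because the Double DQN target evaluates one particular action rather than taking the maximum, it can never exceed the DQN target computed from the same target-network outputs.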


## Training DQN

First, define the hyperparameters (do not forget to change them before moving to a cluster):

In [12]:
name = 'MsPacman_DQN'  # used in naming files (weights, logs, etc)
n_steps = 10000        # total number of training steps (= n_epochs)
warmup = 1000          # start training after warmup iterations
training_interval = 4  # period (in actions) between training steps
save_steps = int(n_steps/10)  # period (in training steps) between storing weights to file
copy_steps = 100       # period (in training steps) between updating target_network weights
gamma = 0.9            # discount rate
skip_start = 90        # skip the start of every game (it's just freezing time before game starts)
batch_size = 64        # size of minibatch that is taken randomly from replay memory every training step
double_dqn = False     # whether to use Double-DQN approach or simple DQN (see above)
# eps-greedy parameters: we slowly decrease epsilon from eps_max to eps_min in eps_decay_steps
eps_max = 1.0
eps_min = 0.05
eps_decay_steps = int(n_steps/2)

learning_rate = 0.001
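With these values, $\varepsilon$ decays linearly from `eps_max` to `eps_min` over the first half of training and then stays constant. The sketch below evaluates the schedule at a few training steps; the helper name `epsilon_at` is introduced here for illustration.

```python
eps_max, eps_min = 1.0, 0.05
n_steps = 10000
eps_decay_steps = int(n_steps / 2)

def epsilon_at(step):
    """Linear decay from eps_max to eps_min over eps_decay_steps training steps."""
    return max(eps_min, eps_max - (eps_max - eps_min) * step / eps_decay_steps)

schedule = {s: epsilon_at(s) for s in (0, 2500, 5000, 10000)}
for s, eps in schedule.items():
    print(s, round(eps, 3))
```

Halfway through the decay (step 2500) the agent still acts randomly slightly more than half of the time.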

Compile the online network with the Adam optimizer, mean squared error loss and the `mean_q` metric, which measures the maximum predicted Q-value averaged over the samples in a minibatch (we expect it to increase during the training process).

In [13]:
def mean_q(y_true, y_pred):
    return K.mean(K.max(y_pred, axis=-1))


online_network.compile(optimizer=Adam(learning_rate), loss='mse', metrics=[mean_q])
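What `mean_q` computes can be checked with a NumPy equivalent on a made-up batch of predictions. The function name `mean_q_np` and the numbers are assumptions for illustration; the Keras metric above operates on backend tensors instead.

```python
import numpy as np

def mean_q_np(y_pred):
    """Largest predicted Q-value per sample, averaged over the batch."""
    return np.mean(np.max(y_pred, axis=-1))

# two samples, three actions
y_pred = np.array([[1.0, 3.0, 2.0],
                   [0.0, -1.0, 4.0]])
value = mean_q_np(y_pred)   # average of the row maxima 3.0 and 4.0
print(value)
```

Note that the metric ignores `y_true`: it summarizes the network's own predictions rather than measuring an error.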




Create folders for logs and trained weights:

In [14]:
if not os.path.exists(name):
    os.makedirs(name)
    
weights_folder = os.path.join(name, 'weights')
if not os.path.exists(weights_folder):
    os.makedirs(weights_folder)

Use standard callbacks:

In [15]:
csv_logger = CSVLogger(os.path.join(name, 'log.csv'), append=True, separator=';')
tensorboard = TensorBoard(log_dir=os.path.join(name, 'tensorboard'), write_graph=False, write_images=False)

The next chunk of code explores the game, trains the online network and periodically copies its weights to the target network, as explained above.

In [16]:
# counters:
step = 0          # training step counter (= epoch counter)
iteration = 0     # frames counter
episodes = 0      # game episodes counter
done = True       # indicator that env needs to be reset

episode_scores = []  # collect total scores in this list and log it later

while step < n_steps:
    if done:  # game over, restart it
        obs = env.reset()
        score = 0  # reset score for current episode
        for skip in range(skip_start):  # skip the start of each game (it's just freezing time before game starts)
            obs, reward, done, info = env.step(0)
            score += reward
        state = obs
        episodes += 1

    # Online network evaluates what to do
    iteration += 1
    q_values = online_network.predict(np.array([state]))[0]  # calculate q-values using online network
    # select epsilon (which linearly decreases over training steps):
    epsilon = max(eps_min, eps_max - (eps_max-eps_min) * step/eps_decay_steps)
    action = epsilon_greedy(q_values, epsilon, nb_actions)
    # Play:
    obs, reward, done, info = env.step(action)
    score += reward
    if done:
        episode_scores.append(score)
    next_state = obs
    # Let's memorize what just happened
    replay_memory.append((state, action, reward, next_state, done))
    state = next_state

    if iteration >= warmup and iteration % training_interval == 0:
        # learning branch
        step += 1
        minibatch = random.sample(replay_memory, batch_size)
        replay_state = np.array([x[0] for x in minibatch])
        replay_action = np.array([x[1] for x in minibatch])
        replay_rewards = np.array([x[2] for x in minibatch])
        replay_next_state = np.array([x[3] for x in minibatch])
        replay_done = np.array([x[4] for x in minibatch], dtype=int)

        # calculate targets (see above for details)
        if not double_dqn:
            # DQN
            target_for_action = replay_rewards + (1-replay_done) * gamma * \
                                    np.amax(target_network.predict(replay_next_state), axis=1)
        else:
            # Double DQN
            best_actions = np.argmax(online_network.predict(replay_next_state), axis=1)
            target_for_action = replay_rewards + (1-replay_done) * gamma * \
                                    target_network.predict(replay_next_state)[np.arange(batch_size), best_actions]

        target = online_network.predict(replay_state)  # targets coincide with predictions ...
        target[np.arange(batch_size), replay_action] = target_for_action  #...except for targets with actions from replay
        
        # Train online network
        online_network.fit(replay_state, target, epochs=step, verbose=2, initial_epoch=step-1,
                           callbacks=[csv_logger, tensorboard])

        # Periodically copy online network weights to target network
        if step % copy_steps == 0:
            target_network.set_weights(online_network.get_weights())
        # And save weights
        if step % save_steps == 0:
            online_network.save_weights(os.path.join(weights_folder, 'weights_{}.h5f'.format(step)))
            gc.collect()  # also clean the garbage





Epoch 1/1
 - 4s - loss: 811.9525 - mean_q: 40.0516

Epoch 2/2
 - 0s - loss: 181.3541 - mean_q: 57.7502
Epoch 3/3
 - 0s - loss: 164.1998 - mean_q: 49.9461
Epoch 4/4
 - 0s - loss: 74.0074 - mean_q: 44.9145
Epoch 5/5
 - 0s - loss: 47.5185 - mean_q: 44.7768
Epoch 6/6
 - 0s - loss: 27.8570 - mean_q: 41.0398
Epoch 7/7
 - 0s - loss: 19.6247 - mean_q: 39.5150
Epoch 8/8
 - 0s - loss: 11.9747 - mean_q: 38.3906
Epoch 9/9
 - 0s - loss: 10.1756 - mean_q: 38.3837
Epoch 10/10
 - 0s - loss: 8.0550 - mean_q: 37.6850
Epoch 11/11
 - 0s - loss: 6.3151 - mean_q: 37.6307
Epoch 12/12
 - 0s - loss: 6.2599 - mean_q: 38.1350
Epoch 13/13
 - 0s - loss: 9.2713 - mean_q: 38.4844
Epoch 14/14
 - 0s - loss: 4.2630 - mean_q: 38.6195
Epoch 15/15
 - 0s - loss: 4.8821 - mean_q: 38.5510
Epoch 16/16
 - 0s - loss: 3.6426 - mean_q: 38.5114
Epoch 17/17
 - 0s - loss: 4.3575 - mean_q: 38.4802
Epoch 18/18
 - 0s - loss: 3.7726 - mean_q: 39.9176
Epoch 19/19
 - 0s - loss: 4.2404 - mean_q: 38.6411
[... output truncated ...]

Epoch 142/142
 - 0s - loss: 0.6041 - mean_q: 35.8214
Epoch 143/143
 - 0s - loss: 2.4839 - mean_q: 36.3672
Epoch 144/144
 - 0s - loss: 0.4730 - mean_q: 35.6955
Epoch 145/145
 - 0s - loss: 1.2193 - mean_q: 36.4380
Epoch 146/146
 - 0s - loss: 1.1520 - mean_q: 35.8391
Epoch 147/147
 - 0s - loss: 0.8791 - mean_q: 36.3352
Epoch 148/148
 - 0s - loss: 0.6481 - mean_q: 35.8368
Epoch 149/149
 - 0s - loss: 0.9378 - mean_q: 35.9679
Epoch 150/150
 - 0s - loss: 0.7714 - mean_q: 35.9879
Epoch 151/151
 - 0s - loss: 1.0090 - mean_q: 36.1869
Epoch 152/152
 - 0s - loss: 0.9423 - mean_q: 35.9501
Epoch 153/153
 - 0s - loss: 1.0028 - mean_q: 36.2005
Epoch 154/154
 - 0s - loss: 0.8369 - mean_q: 35.7498
Epoch 155/155
 - 0s - loss: 1.6402 - mean_q: 36.0144
Epoch 156/156
 - 0s - loss: 0.8642 - mean_q: 36.0532
Epoch 157/157
 - 0s - loss: 1.0083 - mean_q: 36.3979
Epoch 158/158
 - 0s - loss: 1.1246 - mean_q: 36.4084
Epoch 159/159
 - 0s - loss: 1.2585 - mean_q: 36.1552
[... output truncated ...]

Epoch 296/296
 - 0s - loss: 0.7735 - mean_q: 35.1831
Epoch 297/297
 - 0s - loss: 0.7792 - mean_q: 34.2103
Epoch 298/298
 - 0s - loss: 0.9739 - mean_q: 34.9018
Epoch 299/299
 - 0s - loss: 2.7878 - mean_q: 33.7923
Epoch 300/300
 - 0s - loss: 2.7606 - mean_q: 35.0018
Epoch 301/301
 - 0s - loss: 1.8082 - mean_q: 33.7387
Epoch 302/302
 - 0s - loss: 0.8182 - mean_q: 33.9434
Epoch 303/303
 - 0s - loss: 0.7523 - mean_q: 33.3769
Epoch 304/304
 - 0s - loss: 0.8146 - mean_q: 34.2059
Epoch 305/305
 - 0s - loss: 0.9362 - mean_q: 32.5526
Epoch 306/306
 - 0s - loss: 0.9172 - mean_q: 33.2359
Epoch 307/307
 - 0s - loss: 1.2319 - mean_q: 32.2326
Epoch 308/308
 - 0s - loss: 0.4450 - mean_q: 33.0891
Epoch 309/309
 - 0s - loss: 0.3341 - mean_q: 32.2667
Epoch 310/310
 - 0s - loss: 1.4570 - mean_q: 32.9397
Epoch 311/311
 - 0s - loss: 0.5744 - mean_q: 32.0145
Epoch 312/312
 - 0s - loss: 3.4997 - mean_q: 32.6931
Epoch 313/313
 - 0s - loss: 0.6340 - mean_q: 32.1942
[... output truncated ...]

Epoch 451/451
 - 0s - loss: 0.7651 - mean_q: 31.0162
Epoch 452/452
 - 0s - loss: 0.3125 - mean_q: 31.4892
Epoch 453/453
 - 0s - loss: 0.5051 - mean_q: 30.8783
Epoch 454/454
 - 0s - loss: 1.6800 - mean_q: 30.7295
Epoch 455/455
 - 0s - loss: 0.4950 - mean_q: 30.8335
Epoch 456/456
 - 0s - loss: 0.7100 - mean_q: 31.1685
Epoch 457/457
 - 0s - loss: 1.0339 - mean_q: 31.5481
Epoch 458/458
 - 0s - loss: 0.3581 - mean_q: 30.9959
Epoch 459/459
 - 0s - loss: 0.6165 - mean_q: 31.1794
Epoch 460/460
 - 0s - loss: 0.3033 - mean_q: 30.1805
Epoch 461/461
 - 0s - loss: 0.4115 - mean_q: 30.7683
Epoch 462/462
 - 0s - loss: 1.8536 - mean_q: 30.2661
Epoch 463/463
 - 0s - loss: 1.1384 - mean_q: 30.9044
Epoch 464/464
 - 0s - loss: 1.6897 - mean_q: 29.9442
Epoch 465/465
 - 0s - loss: 0.5833 - mean_q: 30.8708
Epoch 466/466
 - 0s - loss: 0.8119 - mean_q: 30.7697
Epoch 467/467
 - 0s - loss: 0.4456 - mean_q: 30.7665
Epoch 468/468
 - 0s - loss: 0.9467 - mean_q: 30.6257
[... output truncated ...]

Epoch 606/606
 - 0s - loss: 0.3092 - mean_q: 27.4290
Epoch 607/607
 - 0s - loss: 0.3747 - mean_q: 27.0156
Epoch 608/608
 - 0s - loss: 0.3898 - mean_q: 27.3017
Epoch 609/609
 - 0s - loss: 0.1166 - mean_q: 27.0764
Epoch 610/610
 - 0s - loss: 0.4416 - mean_q: 27.0450
Epoch 611/611
 - 0s - loss: 0.6368 - mean_q: 27.6518
Epoch 612/612
 - 0s - loss: 0.3471 - mean_q: 26.7546
Epoch 613/613
 - 0s - loss: 0.4423 - mean_q: 26.9386
Epoch 614/614
 - 0s - loss: 0.5806 - mean_q: 27.2181
Epoch 615/615
 - 0s - loss: 0.3641 - mean_q: 27.8558
Epoch 616/616
 - 0s - loss: 2.1143 - mean_q: 26.9805
Epoch 617/617
 - 0s - loss: 0.5095 - mean_q: 27.2753
Epoch 618/618
 - 0s - loss: 0.4179 - mean_q: 26.7785
Epoch 619/619
 - 0s - loss: 0.5873 - mean_q: 26.9710
Epoch 620/620
 - 0s - loss: 0.4527 - mean_q: 27.1463
Epoch 621/621
 - 0s - loss: 0.9329 - mean_q: 27.0358
Epoch 622/622
 - 0s - loss: 0.7036 - mean_q: 27.4854
Epoch 623/623
 - 0s - loss: 0.7291 - mean_q: 26.9766
[... output truncated ...]

Epoch 761/761
 - 0s - loss: 0.4545 - mean_q: 24.9237
Epoch 762/762
 - 0s - loss: 0.7696 - mean_q: 25.8757
Epoch 763/763
 - 0s - loss: 0.5875 - mean_q: 25.0144
Epoch 764/764
 - 0s - loss: 0.9955 - mean_q: 25.8331
Epoch 765/765
 - 0s - loss: 0.4186 - mean_q: 24.9017
Epoch 766/766
 - 0s - loss: 0.5635 - mean_q: 25.4860
Epoch 767/767
 - 0s - loss: 0.9798 - mean_q: 25.1132
Epoch 768/768
 - 0s - loss: 0.8168 - mean_q: 26.2783
Epoch 769/769
 - 0s - loss: 1.1847 - mean_q: 25.5682
Epoch 770/770
 - 0s - loss: 0.4200 - mean_q: 25.7824
Epoch 771/771
 - 0s - loss: 0.3000 - mean_q: 24.8831
Epoch 772/772
 - 0s - loss: 0.4072 - mean_q: 25.9301
Epoch 773/773
 - 0s - loss: 0.2758 - mean_q: 25.5341
Epoch 774/774
 - 0s - loss: 0.7238 - mean_q: 25.9427
Epoch 775/775
 - 0s - loss: 0.6520 - mean_q: 25.2957
Epoch 776/776
 - 0s - loss: 1.1043 - mean_q: 26.1451
Epoch 777/777
 - 0s - loss: 0.7955 - mean_q: 25.6899
Epoch 778/778
 - 0s - loss: 0.2097 - mean_q: 26.3391
[... output truncated ...]

Epoch 916/916
 - 0s - loss: 0.6655 - mean_q: 23.1005
Epoch 917/917
 - 0s - loss: 0.5685 - mean_q: 23.7818
Epoch 918/918
 - 0s - loss: 0.3152 - mean_q: 23.7888
Epoch 919/919
 - 0s - loss: 1.0324 - mean_q: 23.8440
Epoch 920/920
 - 0s - loss: 0.3668 - mean_q: 23.4815
Epoch 921/921
 - 0s - loss: 0.3944 - mean_q: 24.1597
Epoch 922/922
 - 0s - loss: 0.2066 - mean_q: 23.3589
Epoch 923/923
 - 0s - loss: 0.1297 - mean_q: 23.3119
Epoch 924/924
 - 0s - loss: 0.6162 - mean_q: 23.9010
Epoch 925/925
 - 0s - loss: 0.6303 - mean_q: 24.1488
Epoch 926/926
 - 0s - loss: 0.9904 - mean_q: 23.0136
Epoch 927/927
 - 0s - loss: 0.4281 - mean_q: 23.9995
Epoch 928/928
 - 0s - loss: 0.5029 - mean_q: 23.2509
Epoch 929/929
 - 0s - loss: 0.5387 - mean_q: 24.0466
Epoch 930/930
 - 0s - loss: 0.3993 - mean_q: 23.5970
Epoch 931/931
 - 0s - loss: 0.0918 - mean_q: 23.6193
Epoch 932/932
 - 0s - loss: 0.1060 - mean_q: 24.1289
Epoch 933/933
 - 0s - loss: 0.5330 - mean_q: 23.5328
[... output truncated ...]

Epoch 1069/1069
 - 0s - loss: 0.5049 - mean_q: 22.9521
Epoch 1070/1070
 - 0s - loss: 0.7698 - mean_q: 23.8281
Epoch 1071/1071
 - 0s - loss: 0.6873 - mean_q: 23.1429
Epoch 1072/1072
 - 0s - loss: 0.7594 - mean_q: 23.6674
Epoch 1073/1073
 - 0s - loss: 0.5452 - mean_q: 22.3232
Epoch 1074/1074
 - 0s - loss: 0.6292 - mean_q: 23.1555
Epoch 1075/1075
 - 0s - loss: 0.4099 - mean_q: 22.6457
Epoch 1076/1076
 - 0s - loss: 0.6297 - mean_q: 22.8946
Epoch 1077/1077
 - 0s - loss: 0.7337 - mean_q: 22.1752
Epoch 1078/1078
 - 0s - loss: 0.6859 - mean_q: 22.4215
Epoch 1079/1079
 - 0s - loss: 0.8262 - mean_q: 22.2566
Epoch 1080/1080
 - 0s - loss: 1.0997 - mean_q: 22.5414
Epoch 1081/1081
 - 0s - loss: 1.0429 - mean_q: 22.2264
Epoch 1082/1082
 - 0s - loss: 0.6716 - mean_q: 22.9900
Epoch 1083/1083
 - 0s - loss: 0.8465 - mean_q: 22.0121
Epoch 1084/1084
 - 0s - loss: 0.4091 - mean_q: 22.4262
Epoch 1085/1085
 - 0s - loss: 0.5528 - mean_q: 22.0858
[... output truncated ...]

Epoch 1218/1218
 - 0s - loss: 0.3329 - mean_q: 20.7199
Epoch 1219/1219
 - 0s - loss: 0.2367 - mean_q: 21.0132
Epoch 1220/1220
 - 0s - loss: 0.4210 - mean_q: 20.4430
Epoch 1221/1221
 - 0s - loss: 0.2047 - mean_q: 19.8895
Epoch 1222/1222
 - 0s - loss: 0.3039 - mean_q: 20.1908
Epoch 1223/1223
 - 0s - loss: 0.3248 - mean_q: 20.4221
Epoch 1224/1224
 - 0s - loss: 0.2449 - mean_q: 20.4452
Epoch 1225/1225
 - 0s - loss: 0.2846 - mean_q: 21.1292
Epoch 1226/1226
 - 0s - loss: 0.4675 - mean_q: 20.9341
Epoch 1227/1227
 - 0s - loss: 0.7315 - mean_q: 20.4847
Epoch 1228/1228
 - 0s - loss: 0.2958 - mean_q: 20.2884
Epoch 1229/1229
 - 0s - loss: 0.7157 - mean_q: 20.3026
Epoch 1230/1230
 - 0s - loss: 0.1593 - mean_q: 20.4625
Epoch 1231/1231
 - 0s - loss: 0.7387 - mean_q: 20.1930
Epoch 1232/1232
 - 0s - loss: 0.4838 - mean_q: 20.6536
Epoch 1233/1233
 - 0s - loss: 0.3013 - mean_q: 21.0304
Epoch 1234/1234
 - 0s - loss: 0.3969 - mean_q: 20.7465
[... output truncated ...]

Epoch 1367/1367
 - 0s - loss: 0.5883 - mean_q: 19.6435
Epoch 1368/1368
 - 0s - loss: 0.4704 - mean_q: 19.6257
Epoch 1369/1369
 - 0s - loss: 0.4584 - mean_q: 20.0859
Epoch 1370/1370
 - 0s - loss: 0.5210 - mean_q: 20.5539
Epoch 1371/1371
 - 0s - loss: 0.4144 - mean_q: 20.1963
Epoch 1372/1372
 - 0s - loss: 0.1118 - mean_q: 19.9861
Epoch 1373/1373
 - 0s - loss: 0.2760 - mean_q: 20.4386
Epoch 1374/1374
 - 0s - loss: 0.3093 - mean_q: 20.1200
Epoch 1375/1375
 - 0s - loss: 1.0299 - mean_q: 21.5614
Epoch 1376/1376
 - 0s - loss: 1.0600 - mean_q: 19.2044
Epoch 1377/1377
 - 0s - loss: 1.0107 - mean_q: 20.2369
Epoch 1378/1378
 - 0s - loss: 1.0109 - mean_q: 18.6960
Epoch 1379/1379
 - 0s - loss: 0.8736 - mean_q: 19.4552
Epoch 1380/1380
 - 0s - loss: 0.7540 - mean_q: 18.8251
Epoch 1381/1381
 - 0s - loss: 0.6235 - mean_q: 20.4866
Epoch 1382/1382
 - 0s - loss: 0.9388 - mean_q: 19.2613
Epoch 1383/1383
 - 0s - loss: 1.1366 - mean_q: 20.4609
[... output truncated ...]

Epoch 1516/1516
 - 0s - loss: 0.1982 - mean_q: 17.8300
Epoch 1517/1517
 - 0s - loss: 0.6238 - mean_q: 18.0738
Epoch 1518/1518
 - 0s - loss: 0.2436 - mean_q: 18.2495
Epoch 1519/1519
 - 0s - loss: 0.5788 - mean_q: 18.5806
Epoch 1520/1520
 - 0s - loss: 0.2196 - mean_q: 18.2237
Epoch 1521/1521
 - 0s - loss: 0.3506 - mean_q: 18.7896
Epoch 1522/1522
 - 0s - loss: 0.4433 - mean_q: 18.1562
Epoch 1523/1523
 - 0s - loss: 0.5301 - mean_q: 18.3760
Epoch 1524/1524
 - 0s - loss: 0.2823 - mean_q: 18.4664
Epoch 1525/1525
 - 0s - loss: 0.2727 - mean_q: 18.2664
Epoch 1526/1526
 - 0s - loss: 0.2823 - mean_q: 18.4994
Epoch 1527/1527
 - 0s - loss: 0.4223 - mean_q: 18.1357
Epoch 1528/1528
 - 0s - loss: 0.7594 - mean_q: 18.5749
Epoch 1529/1529
 - 0s - loss: 0.4289 - mean_q: 19.3925
Epoch 1530/1530
 - 0s - loss: 0.3607 - mean_q: 18.2780
Epoch 1531/1531
 - 0s - loss: 0.3762 - mean_q: 18.6714
Epoch 1532/1532
 - 0s - loss: 0.4850 - mean_q: 18.5100
[... output truncated ...]

Epoch 1665/1665
 - 0s - loss: 0.6264 - mean_q: 17.9330
Epoch 1666/1666
 - 0s - loss: 0.3327 - mean_q: 17.6616
Epoch 1667/1667
 - 0s - loss: 0.7169 - mean_q: 18.1963
Epoch 1668/1668
 - 0s - loss: 0.3585 - mean_q: 18.0948
Epoch 1669/1669
 - 0s - loss: 0.7302 - mean_q: 17.8902
Epoch 1670/1670
 - 0s - loss: 0.2725 - mean_q: 18.3991
Epoch 1671/1671
 - 0s - loss: 0.5121 - mean_q: 18.5761
Epoch 1672/1672
 - 0s - loss: 0.3007 - mean_q: 17.2037
Epoch 1673/1673
 - 0s - loss: 0.6923 - mean_q: 18.1790
Epoch 1674/1674
 - 0s - loss: 0.4769 - mean_q: 17.3307
Epoch 1675/1675
 - 0s - loss: 0.6313 - mean_q: 18.1056
Epoch 1676/1676
 - 0s - loss: 0.4448 - mean_q: 18.4334
Epoch 1677/1677
 - 0s - loss: 0.3033 - mean_q: 17.7613
Epoch 1678/1678
 - 0s - loss: 0.2835 - mean_q: 17.2169
Epoch 1679/1679
 - 0s - loss: 0.5112 - mean_q: 17.4371
Epoch 1680/1680
 - 0s - loss: 0.6715 - mean_q: 18.3568
Epoch 1681/1681
 - 0s - loss: 0.2783 - mean_q: 17.8128
Epoch 1682/1682
 - 0s - loss: 0.4355 - mean_q: 17.9795
[... output truncated ...]

Epoch 1814/1814
 - 0s - loss: 0.1529 - mean_q: 17.1051
Epoch 1815/1815
 - 0s - loss: 0.4860 - mean_q: 16.7458
Epoch 1816/1816
 - 0s - loss: 0.7090 - mean_q: 16.6890
Epoch 1817/1817
 - 0s - loss: 0.4079 - mean_q: 16.4882
Epoch 1818/1818
 - 0s - loss: 1.0250 - mean_q: 17.3389
Epoch 1819/1819
 - 0s - loss: 0.3268 - mean_q: 16.5140
Epoch 1820/1820
 - 0s - loss: 0.4128 - mean_q: 17.5072
Epoch 1821/1821
 - 0s - loss: 0.5202 - mean_q: 16.7636
Epoch 1822/1822
 - 0s - loss: 0.6340 - mean_q: 17.9745
Epoch 1823/1823
 - 0s - loss: 0.6260 - mean_q: 17.1112
Epoch 1824/1824
 - 0s - loss: 0.6433 - mean_q: 16.5825
Epoch 1825/1825
 - 0s - loss: 0.2312 - mean_q: 16.3724
Epoch 1826/1826
 - 0s - loss: 0.4669 - mean_q: 17.3777
Epoch 1827/1827
 - 0s - loss: 0.1646 - mean_q: 16.4342
Epoch 1828/1828
 - 0s - loss: 0.2697 - mean_q: 17.0699
Epoch 1829/1829
 - 0s - loss: 0.6926 - mean_q: 17.0122
Epoch 1830/1830
 - 0s - loss: 0.2495 - mean_q: 17.4253
Epoch 1831/1831
 - 0s - loss: 0.5516 - mean_q: 17.0172
[... output truncated ...]

Epoch 1963/1963
 - 0s - loss: 0.3366 - mean_q: 15.6329
Epoch 1964/1964
 - 0s - loss: 0.7643 - mean_q: 16.2088
Epoch 1965/1965
 - 0s - loss: 0.6729 - mean_q: 16.2875
Epoch 1966/1966
 - 0s - loss: 0.3555 - mean_q: 17.3080
Epoch 1967/1967
 - 0s - loss: 0.4869 - mean_q: 16.1189
Epoch 1968/1968
 - 0s - loss: 0.3050 - mean_q: 16.4757
Epoch 1969/1969
 - 0s - loss: 0.5972 - mean_q: 16.7254
Epoch 1970/1970
 - 0s - loss: 0.5226 - mean_q: 17.0760
Epoch 1971/1971
 - 0s - loss: 0.5388 - mean_q: 16.4606
Epoch 1972/1972
 - 0s - loss: 0.5521 - mean_q: 17.0622
Epoch 1973/1973
 - 0s - loss: 0.4247 - mean_q: 17.0779
Epoch 1974/1974
 - 0s - loss: 0.3593 - mean_q: 16.0890
Epoch 1975/1975
 - 0s - loss: 0.5627 - mean_q: 16.2936
Epoch 1976/1976
 - 0s - loss: 0.4020 - mean_q: 16.6566
Epoch 1977/1977
 - 0s - loss: 0.4602 - mean_q: 16.9160
Epoch 1978/1978
 - 0s - loss: 0.3979 - mean_q: 15.9210
Epoch 1979/1979
 - 0s - loss: 0.5735 - mean_q: 16.7394
Epoch 1980/1980
 - 0s - loss: 0.5108 - mean_q: 16.1711
[... output truncated ...]

Epoch 2112/2112
 - 0s - loss: 0.7443 - mean_q: 16.4624
Epoch 2113/2113
 - 0s - loss: 0.5759 - mean_q: 14.7377
Epoch 2114/2114
 - 0s - loss: 0.6076 - mean_q: 15.9408
Epoch 2115/2115
 - 0s - loss: 0.5728 - mean_q: 14.8253
Epoch 2116/2116
 - 0s - loss: 0.3407 - mean_q: 15.1324
Epoch 2117/2117
 - 0s - loss: 0.7642 - mean_q: 15.5634
Epoch 2118/2118
 - 0s - loss: 0.6042 - mean_q: 15.5839
Epoch 2119/2119
 - 0s - loss: 0.2840 - mean_q: 15.7058
Epoch 2120/2120
 - 0s - loss: 0.4000 - mean_q: 15.1425
Epoch 2121/2121
 - 0s - loss: 0.1523 - mean_q: 14.7998
Epoch 2122/2122
 - 0s - loss: 1.3705 - mean_q: 15.5534
Epoch 2123/2123
 - 0s - loss: 0.6128 - mean_q: 15.3399
Epoch 2124/2124
 - 0s - loss: 0.7290 - mean_q: 15.4366
Epoch 2125/2125
 - 0s - loss: 0.6014 - mean_q: 15.1614
Epoch 2126/2126
 - 0s - loss: 0.3659 - mean_q: 14.9370
Epoch 2127/2127
 - 0s - loss: 0.1988 - mean_q: 14.7749
Epoch 2128/2128
 - 0s - loss: 0.2869 - mean_q: 15.4317
Epoch 2129/2129
 - 0s - loss: 0.1069 - mean_q: 14.2844
[... output truncated ...]

Epoch 2261/2261
 - 0s - loss: 0.3505 - mean_q: 15.2508
Epoch 2262/2262
 - 0s - loss: 0.3925 - mean_q: 14.8150
Epoch 2263/2263
 - 0s - loss: 0.2398 - mean_q: 15.1578
Epoch 2264/2264
 - 0s - loss: 0.6022 - mean_q: 14.4837
Epoch 2265/2265
 - 0s - loss: 0.5048 - mean_q: 15.2501
Epoch 2266/2266
 - 0s - loss: 0.4337 - mean_q: 14.0110
Epoch 2267/2267
 - 0s - loss: 0.9185 - mean_q: 15.0529
Epoch 2268/2268
 - 0s - loss: 0.4275 - mean_q: 14.1680
Epoch 2269/2269
 - 0s - loss: 0.5435 - mean_q: 14.7945
Epoch 2270/2270
 - 0s - loss: 0.5369 - mean_q: 14.4556
Epoch 2271/2271
 - 0s - loss: 1.0851 - mean_q: 14.9250
Epoch 2272/2272
 - 0s - loss: 1.0057 - mean_q: 14.3031
Epoch 2273/2273
 - 0s - loss: 0.6048 - mean_q: 15.5786
Epoch 2274/2274
 - 0s - loss: 0.8734 - mean_q: 15.4465
Epoch 2275/2275
 - 0s - loss: 0.9961 - mean_q: 15.0211
Epoch 2276/2276
 - 0s - loss: 1.0749 - mean_q: 13.5727
Epoch 2277/2277
 - 0s - loss: 0.9899 - mean_q: 15.0940
Epoch 2278/2278
 - 0s - loss: 0.5912 - mean_q: 14.1032
[... output truncated ...]

Epoch 2410/2410
 - 0s - loss: 0.0855 - mean_q: 14.4825
Epoch 2411/2411
 - 0s - loss: 0.4535 - mean_q: 14.5374
Epoch 2412/2412
 - 0s - loss: 0.6083 - mean_q: 14.9752
Epoch 2413/2413
 - 0s - loss: 0.5033 - mean_q: 14.6584
Epoch 2414/2414
 - 0s - loss: 0.4592 - mean_q: 14.5644
Epoch 2415/2415
 - 0s - loss: 0.2566 - mean_q: 14.7639
Epoch 2416/2416
 - 0s - loss: 0.6075 - mean_q: 14.1352
Epoch 2417/2417
 - 0s - loss: 0.6711 - mean_q: 15.3341
Epoch 2418/2418
 - 0s - loss: 0.3219 - mean_q: 15.1212
Epoch 2419/2419
 - 0s - loss: 4.2895 - mean_q: 14.1441
Epoch 2420/2420
 - 0s - loss: 0.8891 - mean_q: 15.4975
Epoch 2421/2421
 - 0s - loss: 0.5524 - mean_q: 15.1391
Epoch 2422/2422
 - 0s - loss: 0.6380 - mean_q: 14.8168
Epoch 2423/2423
 - 0s - loss: 0.4587 - mean_q: 15.2554
Epoch 2424/2424
 - 0s - loss: 0.3834 - mean_q: 14.4274
Epoch 2425/2425
 - 0s - loss: 0.3305 - mean_q: 15.6199
Epoch 2426/2426
 - 0s - loss: 0.3197 - mean_q: 15.8833
Epoch 2427/2427
 - 0s - loss: 0.8073 - mean_q: 14.9750
[... output truncated ...]

Epoch 2559/2559
 - 0s - loss: 0.3239 - mean_q: 14.1418
Epoch 2560/2560
 - 0s - loss: 0.2535 - mean_q: 14.2964
Epoch 2561/2561
 - 0s - loss: 0.5001 - mean_q: 14.2398
Epoch 2562/2562
 - 0s - loss: 0.3087 - mean_q: 14.3384
Epoch 2563/2563
 - 0s - loss: 0.4256 - mean_q: 14.8020
Epoch 2564/2564
 - 0s - loss: 0.5011 - mean_q: 14.5248
Epoch 2565/2565
 - 0s - loss: 0.9843 - mean_q: 15.4134
Epoch 2566/2566
 - 0s - loss: 0.5236 - mean_q: 13.9683
Epoch 2567/2567
 - 0s - loss: 0.6795 - mean_q: 14.8136
Epoch 2568/2568
 - 0s - loss: 0.3344 - mean_q: 14.2579
Epoch 2569/2569
 - 0s - loss: 0.3234 - mean_q: 13.8871
Epoch 2570/2570
 - 0s - loss: 0.3059 - mean_q: 14.0587
Epoch 2571/2571
 - 0s - loss: 0.5039 - mean_q: 14.3140
Epoch 2572/2572
 - 0s - loss: 0.4461 - mean_q: 13.7555
Epoch 2573/2573
 - 0s - loss: 0.5658 - mean_q: 14.0549
Epoch 2574/2574
 - 0s - loss: 0.5373 - mean_q: 14.9114
Epoch 2575/2575
 - 0s - loss: 0.2108 - mean_q: 14.3749
Epoch 2576/2576
 - 0s - loss: 0.1898 - mean_q: 13.7732
Epoch 2577

Epoch 2708/2708
 - 0s - loss: 0.5005 - mean_q: 12.9074
Epoch 2709/2709
 - 0s - loss: 1.1820 - mean_q: 14.2519
Epoch 2710/2710
 - 0s - loss: 1.0342 - mean_q: 13.6281
Epoch 2711/2711
 - 0s - loss: 1.0796 - mean_q: 14.4208
Epoch 2712/2712
 - 0s - loss: 0.9166 - mean_q: 12.6165
Epoch 2713/2713
 - 0s - loss: 1.1645 - mean_q: 13.7400
Epoch 2714/2714
 - 0s - loss: 1.0636 - mean_q: 12.6790
Epoch 2715/2715
 - 0s - loss: 0.8203 - mean_q: 14.3213
Epoch 2716/2716
 - 0s - loss: 0.7269 - mean_q: 13.1980
Epoch 2717/2717
 - 0s - loss: 1.0259 - mean_q: 14.4687
Epoch 2718/2718
 - 0s - loss: 0.6913 - mean_q: 13.3346
Epoch 2719/2719
 - 0s - loss: 0.9971 - mean_q: 14.2116
Epoch 2720/2720
 - 0s - loss: 0.6409 - mean_q: 12.3562
Epoch 2721/2721
 - 0s - loss: 0.3074 - mean_q: 14.0300
Epoch 2722/2722
 - 0s - loss: 0.4776 - mean_q: 13.0302
Epoch 2723/2723
 - 0s - loss: 0.3239 - mean_q: 13.4607
Epoch 2724/2724
 - 0s - loss: 0.4444 - mean_q: 12.9735
Epoch 2725/2725
 - 0s - loss: 0.3460 - mean_q: 14.0194
Epoch 2726

Epoch 2857/2857
 - 0s - loss: 0.6050 - mean_q: 13.4440
Epoch 2858/2858
 - 0s - loss: 0.2046 - mean_q: 12.7159
Epoch 2859/2859
 - 0s - loss: 0.5532 - mean_q: 13.5812
Epoch 2860/2860
 - 0s - loss: 0.5345 - mean_q: 12.9045
Epoch 2861/2861
 - 0s - loss: 0.6431 - mean_q: 12.7827
Epoch 2862/2862
 - 0s - loss: 0.4241 - mean_q: 12.7668
Epoch 2863/2863
 - 0s - loss: 0.4984 - mean_q: 12.7332
Epoch 2864/2864
 - 0s - loss: 0.4984 - mean_q: 13.0663
Epoch 2865/2865
 - 0s - loss: 0.6000 - mean_q: 13.2436
Epoch 2866/2866
 - 0s - loss: 0.7921 - mean_q: 13.4735
Epoch 2867/2867
 - 0s - loss: 0.2709 - mean_q: 12.7405
Epoch 2868/2868
 - 0s - loss: 0.2418 - mean_q: 13.3714
Epoch 2869/2869
 - 0s - loss: 0.2785 - mean_q: 12.4872
Epoch 2870/2870
 - 0s - loss: 0.3878 - mean_q: 12.3091
Epoch 2871/2871
 - 0s - loss: 0.9957 - mean_q: 13.2730
Epoch 2872/2872
 - 0s - loss: 0.5136 - mean_q: 13.1124
Epoch 2873/2873
 - 0s - loss: 0.6597 - mean_q: 13.1142
Epoch 2874/2874
 - 0s - loss: 0.6200 - mean_q: 13.1606
Epoch 2875

Epoch 3006/3006
 - 0s - loss: 0.2368 - mean_q: 14.0342
Epoch 3007/3007
 - 0s - loss: 0.2590 - mean_q: 13.6190
Epoch 3008/3008
 - 0s - loss: 0.3795 - mean_q: 12.4635
Epoch 3009/3009
 - 0s - loss: 0.2921 - mean_q: 13.4383
Epoch 3010/3010
 - 0s - loss: 0.4756 - mean_q: 14.6459
Epoch 3011/3011
 - 0s - loss: 0.8226 - mean_q: 12.9089
Epoch 3012/3012
 - 0s - loss: 0.4215 - mean_q: 13.7538
Epoch 3013/3013
 - 0s - loss: 0.3532 - mean_q: 12.9160
Epoch 3014/3014
 - 0s - loss: 0.5418 - mean_q: 13.9398
Epoch 3015/3015
 - 0s - loss: 0.4979 - mean_q: 12.4080
Epoch 3016/3016
 - 0s - loss: 0.3057 - mean_q: 13.7945
Epoch 3017/3017
 - 0s - loss: 0.5746 - mean_q: 13.6117
Epoch 3018/3018
 - 0s - loss: 0.7954 - mean_q: 14.0239
Epoch 3019/3019
 - 0s - loss: 0.5988 - mean_q: 11.9244
Epoch 3020/3020
 - 0s - loss: 0.2338 - mean_q: 12.8090
Epoch 3021/3021
 - 0s - loss: 0.6725 - mean_q: 13.1584
Epoch 3022/3022
 - 0s - loss: 0.3438 - mean_q: 12.9786
Epoch 3023/3023
 - 0s - loss: 0.3358 - mean_q: 12.2143
Epoch 3024

Epoch 3155/3155
 - 0s - loss: 0.6568 - mean_q: 12.1525
Epoch 3156/3156
 - 0s - loss: 0.9193 - mean_q: 12.2712
Epoch 3157/3157
 - 0s - loss: 0.6605 - mean_q: 12.0692
Epoch 3158/3158
 - 0s - loss: 0.3322 - mean_q: 12.9469
Epoch 3159/3159
 - 0s - loss: 0.5345 - mean_q: 12.9331
Epoch 3160/3160
 - 0s - loss: 0.5003 - mean_q: 12.5667
Epoch 3161/3161
 - 0s - loss: 0.4993 - mean_q: 12.6252
Epoch 3162/3162
 - 0s - loss: 0.4194 - mean_q: 14.0696
Epoch 3163/3163
 - 0s - loss: 0.3361 - mean_q: 11.8915
Epoch 3164/3164
 - 0s - loss: 0.5752 - mean_q: 13.1438
Epoch 3165/3165
 - 0s - loss: 0.3249 - mean_q: 12.6163
Epoch 3166/3166
 - 0s - loss: 0.2362 - mean_q: 12.3243
Epoch 3167/3167
 - 0s - loss: 0.1687 - mean_q: 12.8697
Epoch 3168/3168
 - 0s - loss: 0.4190 - mean_q: 12.0436
Epoch 3169/3169
 - 0s - loss: 0.0559 - mean_q: 11.9979
Epoch 3170/3170
 - 0s - loss: 0.3288 - mean_q: 12.5300
Epoch 3171/3171
 - 0s - loss: 0.2456 - mean_q: 12.1421
Epoch 3172/3172
 - 0s - loss: 0.1691 - mean_q: 12.4592
Epoch 3173

Epoch 3304/3304
 - 0s - loss: 0.6638 - mean_q: 12.9920
Epoch 3305/3305
 - 0s - loss: 0.5385 - mean_q: 12.8963
Epoch 3306/3306
 - 0s - loss: 0.3611 - mean_q: 12.1258
Epoch 3307/3307
 - 0s - loss: 0.9031 - mean_q: 13.5857
Epoch 3308/3308
 - 0s - loss: 0.7845 - mean_q: 11.4479
Epoch 3309/3309
 - 0s - loss: 0.6937 - mean_q: 12.1281
Epoch 3310/3310
 - 0s - loss: 0.3413 - mean_q: 11.6610
Epoch 3311/3311
 - 0s - loss: 0.7514 - mean_q: 11.7508
Epoch 3312/3312
 - 0s - loss: 0.3873 - mean_q: 11.8887
Epoch 3313/3313
 - 0s - loss: 0.1227 - mean_q: 12.5835
Epoch 3314/3314
 - 0s - loss: 0.3770 - mean_q: 11.9725
Epoch 3315/3315
 - 0s - loss: 0.2426 - mean_q: 12.2621
Epoch 3316/3316
 - 0s - loss: 0.4191 - mean_q: 12.0849
Epoch 3317/3317
 - 0s - loss: 0.4744 - mean_q: 12.0656
Epoch 3318/3318
 - 0s - loss: 0.1711 - mean_q: 13.5939
Epoch 3319/3319
 - 0s - loss: 0.3607 - mean_q: 11.4486
Epoch 3320/3320
 - 0s - loss: 0.3195 - mean_q: 11.3785
Epoch 3321/3321
 - 0s - loss: 0.3041 - mean_q: 12.4969
Epoch 3322

Epoch 3453/3453
 - 0s - loss: 0.4771 - mean_q: 11.4070
Epoch 3454/3454
 - 0s - loss: 0.1793 - mean_q: 12.0478
Epoch 3455/3455
 - 0s - loss: 0.6660 - mean_q: 12.0857
Epoch 3456/3456
 - 0s - loss: 0.3110 - mean_q: 11.8592
Epoch 3457/3457
 - 0s - loss: 4.6820 - mean_q: 12.0082
Epoch 3458/3458
 - 0s - loss: 0.1963 - mean_q: 12.0120
Epoch 3459/3459
 - 0s - loss: 0.6400 - mean_q: 12.0269
Epoch 3460/3460
 - 0s - loss: 0.3637 - mean_q: 11.5043
Epoch 3461/3461
 - 0s - loss: 0.7417 - mean_q: 12.2615
Epoch 3462/3462
 - 0s - loss: 0.6727 - mean_q: 11.7974
Epoch 3463/3463
 - 0s - loss: 0.4680 - mean_q: 12.4266
Epoch 3464/3464
 - 0s - loss: 0.7055 - mean_q: 11.2156
Epoch 3465/3465
 - 0s - loss: 0.9414 - mean_q: 12.0758
Epoch 3466/3466
 - 0s - loss: 0.3828 - mean_q: 12.0895
Epoch 3467/3467
 - 0s - loss: 0.8268 - mean_q: 12.4990
Epoch 3468/3468
 - 0s - loss: 0.7170 - mean_q: 11.6344
Epoch 3469/3469
 - 0s - loss: 0.7499 - mean_q: 12.7671
Epoch 3470/3470
 - 0s - loss: 0.6915 - mean_q: 11.6322
Epoch 3471

Epoch 3602/3602
 - 0s - loss: 0.5201 - mean_q: 12.2625
Epoch 3603/3603
 - 0s - loss: 0.6550 - mean_q: 12.1071
Epoch 3604/3604
 - 0s - loss: 0.1124 - mean_q: 12.2049
Epoch 3605/3605
 - 0s - loss: 0.1753 - mean_q: 12.1928
Epoch 3606/3606
 - 0s - loss: 0.4186 - mean_q: 12.1638
Epoch 3607/3607
 - 0s - loss: 0.6638 - mean_q: 12.0076
Epoch 3608/3608
 - 0s - loss: 0.2148 - mean_q: 11.8637
Epoch 3609/3609
 - 0s - loss: 0.9870 - mean_q: 11.7750
Epoch 3610/3610
 - 0s - loss: 0.1954 - mean_q: 12.8189
Epoch 3611/3611
 - 0s - loss: 0.9660 - mean_q: 11.3006
Epoch 3612/3612
 - 0s - loss: 0.5973 - mean_q: 12.4144
Epoch 3613/3613
 - 0s - loss: 0.5789 - mean_q: 11.9523
Epoch 3614/3614
 - 0s - loss: 0.6288 - mean_q: 12.2147
Epoch 3615/3615
 - 0s - loss: 0.4466 - mean_q: 12.4138
Epoch 3616/3616
 - 0s - loss: 0.0462 - mean_q: 12.1115
Epoch 3617/3617
 - 0s - loss: 0.8112 - mean_q: 11.9847
Epoch 3618/3618
 - 0s - loss: 0.4407 - mean_q: 12.6743
Epoch 3619/3619
 - 0s - loss: 0.6449 - mean_q: 12.1534
Epoch 3620

Epoch 3751/3751
 - 0s - loss: 0.7081 - mean_q: 14.9572
Epoch 3752/3752
 - 0s - loss: 0.6030 - mean_q: 15.5988
Epoch 3753/3753
 - 0s - loss: 0.2939 - mean_q: 16.6026
Epoch 3754/3754
 - 0s - loss: 0.6890 - mean_q: 16.4363
Epoch 3755/3755
 - 0s - loss: 0.2320 - mean_q: 16.0460
Epoch 3756/3756
 - 0s - loss: 0.3629 - mean_q: 14.6313
Epoch 3757/3757
 - 0s - loss: 0.5692 - mean_q: 15.1924
Epoch 3758/3758
 - 0s - loss: 0.2678 - mean_q: 14.7652
Epoch 3759/3759
 - 0s - loss: 0.4289 - mean_q: 16.0441
Epoch 3760/3760
 - 0s - loss: 0.6817 - mean_q: 16.0204
Epoch 3761/3761
 - 0s - loss: 0.6658 - mean_q: 16.8972
Epoch 3762/3762
 - 0s - loss: 4.6040 - mean_q: 15.6706
Epoch 3763/3763
 - 0s - loss: 0.3908 - mean_q: 15.2026
Epoch 3764/3764
 - 0s - loss: 1.4833 - mean_q: 18.9040
Epoch 3765/3765
 - 0s - loss: 0.9593 - mean_q: 14.9535
Epoch 3766/3766
 - 0s - loss: 0.7731 - mean_q: 17.7726
Epoch 3767/3767
 - 0s - loss: 0.7791 - mean_q: 13.6958
Epoch 3768/3768
 - 0s - loss: 0.7271 - mean_q: 15.1924
Epoch 3769

Epoch 3900/3900
 - 0s - loss: 0.3650 - mean_q: 14.8822
Epoch 3901/3901
 - 0s - loss: 0.5049 - mean_q: 15.0161
Epoch 3902/3902
 - 0s - loss: 0.1917 - mean_q: 15.0705
Epoch 3903/3903
 - 0s - loss: 0.3535 - mean_q: 14.6257
Epoch 3904/3904
 - 0s - loss: 0.4135 - mean_q: 15.5476
Epoch 3905/3905
 - 0s - loss: 0.2877 - mean_q: 13.9538
Epoch 3906/3906
 - 0s - loss: 0.5744 - mean_q: 14.5418
Epoch 3907/3907
 - 0s - loss: 0.5556 - mean_q: 13.1348
Epoch 3908/3908
 - 0s - loss: 0.2636 - mean_q: 12.0891
Epoch 3909/3909
 - 0s - loss: 0.7766 - mean_q: 15.3827
Epoch 3910/3910
 - 0s - loss: 0.4540 - mean_q: 13.0442
Epoch 3911/3911
 - 0s - loss: 0.0754 - mean_q: 13.8930
Epoch 3912/3912
 - 0s - loss: 0.4140 - mean_q: 14.2475
Epoch 3913/3913
 - 0s - loss: 0.5667 - mean_q: 15.2099
Epoch 3914/3914
 - 0s - loss: 0.4567 - mean_q: 13.8569
Epoch 3915/3915
 - 0s - loss: 0.6090 - mean_q: 14.7623
Epoch 3916/3916
 - 0s - loss: 0.4436 - mean_q: 13.8646
Epoch 3917/3917
 - 0s - loss: 0.7818 - mean_q: 13.6480
Epoch 3918

Epoch 4049/4049
 - 0s - loss: 1.0712 - mean_q: 11.5661
Epoch 4050/4050
 - 0s - loss: 1.2002 - mean_q: 13.4857
Epoch 4051/4051
 - 0s - loss: 0.9079 - mean_q: 13.2405
Epoch 4052/4052
 - 0s - loss: 0.3439 - mean_q: 12.9200
Epoch 4053/4053
 - 0s - loss: 0.1469 - mean_q: 13.6325
Epoch 4054/4054
 - 0s - loss: 1.1122 - mean_q: 14.0804
Epoch 4055/4055
 - 0s - loss: 1.2009 - mean_q: 13.6118
Epoch 4056/4056
 - 0s - loss: 0.5977 - mean_q: 14.6477
Epoch 4057/4057
 - 0s - loss: 0.6940 - mean_q: 14.1790
Epoch 4058/4058
 - 0s - loss: 0.5358 - mean_q: 14.6134
Epoch 4059/4059
 - 0s - loss: 0.4717 - mean_q: 14.1768
Epoch 4060/4060
 - 0s - loss: 0.3944 - mean_q: 14.3746
Epoch 4061/4061
 - 0s - loss: 0.4808 - mean_q: 14.4424
Epoch 4062/4062
 - 0s - loss: 0.5979 - mean_q: 14.1862
Epoch 4063/4063
 - 0s - loss: 0.7536 - mean_q: 13.7283
Epoch 4064/4064
 - 0s - loss: 0.3701 - mean_q: 13.6766
Epoch 4065/4065
 - 0s - loss: 0.5193 - mean_q: 13.8429
Epoch 4066/4066
 - 0s - loss: 0.5525 - mean_q: 13.2309
Epoch 4067

Epoch 4198/4198
 - 0s - loss: 0.7544 - mean_q: 14.1110
Epoch 4199/4199
 - 0s - loss: 0.4773 - mean_q: 13.1607
Epoch 4200/4200
 - 0s - loss: 0.2738 - mean_q: 13.3006
Epoch 4201/4201
 - 0s - loss: 0.7353 - mean_q: 12.3825
Epoch 4202/4202
 - 0s - loss: 0.3251 - mean_q: 12.7774
Epoch 4203/4203
 - 0s - loss: 0.9659 - mean_q: 13.6946
Epoch 4204/4204
 - 0s - loss: 0.4245 - mean_q: 13.6852
Epoch 4205/4205
 - 0s - loss: 0.3570 - mean_q: 12.4244
Epoch 4206/4206
 - 0s - loss: 0.4551 - mean_q: 14.2953
Epoch 4207/4207
 - 0s - loss: 0.4524 - mean_q: 13.6032
Epoch 4208/4208
 - 0s - loss: 0.4471 - mean_q: 13.7051
Epoch 4209/4209
 - 0s - loss: 0.9323 - mean_q: 13.0196
Epoch 4210/4210
 - 0s - loss: 0.4434 - mean_q: 15.2547
Epoch 4211/4211
 - 0s - loss: 0.2410 - mean_q: 14.1220
Epoch 4212/4212
 - 0s - loss: 0.4256 - mean_q: 13.0783
Epoch 4213/4213
 - 0s - loss: 0.4381 - mean_q: 13.5133
Epoch 4214/4214
 - 0s - loss: 0.3761 - mean_q: 11.8849
Epoch 4215/4215
 - 0s - loss: 407.6046 - mean_q: 16.7162
Epoch 42

Epoch 4347/4347
 - 0s - loss: 3.0250 - mean_q: 15.7557
Epoch 4348/4348
 - 0s - loss: 5.4398 - mean_q: 21.9075
Epoch 4349/4349
 - 0s - loss: 2.4454 - mean_q: 14.4395
Epoch 4350/4350
 - 0s - loss: 1.8671 - mean_q: 17.2845
Epoch 4351/4351
 - 0s - loss: 7.0383 - mean_q: 18.9553
Epoch 4352/4352
 - 0s - loss: 4.8597 - mean_q: 17.4681
Epoch 4353/4353
 - 0s - loss: 2.2167 - mean_q: 18.9562
Epoch 4354/4354
 - 0s - loss: 0.8956 - mean_q: 15.3103
Epoch 4355/4355
 - 0s - loss: 2.0176 - mean_q: 19.5830
Epoch 4356/4356
 - 0s - loss: 1.4182 - mean_q: 17.0227
Epoch 4357/4357
 - 0s - loss: 0.9828 - mean_q: 16.4612
Epoch 4358/4358
 - 0s - loss: 1.8388 - mean_q: 17.4790
Epoch 4359/4359
 - 0s - loss: 3.5305 - mean_q: 16.6800
Epoch 4360/4360
 - 0s - loss: 7.6310 - mean_q: 16.1140
Epoch 4361/4361
 - 0s - loss: 68.6363 - mean_q: 17.4815
Epoch 4362/4362
 - 0s - loss: 1.3087 - mean_q: 15.0579
Epoch 4363/4363
 - 0s - loss: 0.5960 - mean_q: 15.5388
Epoch 4364/4364
 - 0s - loss: 0.8130 - mean_q: 15.5613
Epoch 436

Epoch 4496/4496
 - 0s - loss: 0.5679 - mean_q: 11.7276
Epoch 4497/4497
 - 0s - loss: 0.8790 - mean_q: 12.4289
Epoch 4498/4498
 - 0s - loss: 0.2179 - mean_q: 10.9568
Epoch 4499/4499
 - 0s - loss: 0.5078 - mean_q: 11.5881
Epoch 4500/4500
 - 0s - loss: 0.3174 - mean_q: 11.9023
Epoch 4501/4501
 - 0s - loss: 0.1956 - mean_q: 11.6900
Epoch 4502/4502
 - 0s - loss: 0.5896 - mean_q: 11.9402
Epoch 4503/4503
 - 0s - loss: 0.7259 - mean_q: 11.3969
Epoch 4504/4504
 - 0s - loss: 0.4133 - mean_q: 11.5005
Epoch 4505/4505
 - 0s - loss: 0.2608 - mean_q: 11.3059
Epoch 4506/4506
 - 0s - loss: 0.5333 - mean_q: 11.2085
Epoch 4507/4507
 - 0s - loss: 0.1739 - mean_q: 10.5032
Epoch 4508/4508
 - 0s - loss: 0.4811 - mean_q: 10.3372
Epoch 4509/4509
 - 0s - loss: 0.2847 - mean_q: 11.8269
Epoch 4510/4510
 - 0s - loss: 0.5058 - mean_q: 10.7177
Epoch 4511/4511
 - 0s - loss: 0.2310 - mean_q: 11.6440
Epoch 4512/4512
 - 0s - loss: 0.3679 - mean_q: 11.7379
Epoch 4513/4513
 - 0s - loss: 0.5891 - mean_q: 10.3092
Epoch 4514

Epoch 4645/4645
 - 0s - loss: 0.4292 - mean_q: 10.7836
Epoch 4646/4646
 - 0s - loss: 0.4873 - mean_q: 12.5244
Epoch 4647/4647
 - 0s - loss: 4.8322 - mean_q: 10.6931
Epoch 4648/4648
 - 0s - loss: 0.4393 - mean_q: 10.8564
Epoch 4649/4649
 - 0s - loss: 0.5500 - mean_q: 11.2690
Epoch 4650/4650
 - 0s - loss: 0.6996 - mean_q: 10.7181
Epoch 4651/4651
 - 0s - loss: 0.5220 - mean_q: 11.1520
Epoch 4652/4652
 - 0s - loss: 0.3291 - mean_q: 11.0684
Epoch 4653/4653
 - 0s - loss: 0.3696 - mean_q: 12.1824
Epoch 4654/4654
 - 0s - loss: 0.4248 - mean_q: 12.3049
Epoch 4655/4655
 - 0s - loss: 0.4495 - mean_q: 10.6281
Epoch 4656/4656
 - 0s - loss: 0.5044 - mean_q: 12.2002
Epoch 4657/4657
 - 0s - loss: 0.8596 - mean_q: 11.0376
Epoch 4658/4658
 - 0s - loss: 0.4421 - mean_q: 11.9677
Epoch 4659/4659
 - 0s - loss: 0.2118 - mean_q: 11.0020
Epoch 4660/4660
 - 0s - loss: 0.6027 - mean_q: 10.5007
Epoch 4661/4661
 - 0s - loss: 0.3488 - mean_q: 12.1804
Epoch 4662/4662
 - 0s - loss: 65.4118 - mean_q: 11.6368
Epoch 466

Epoch 4794/4794
 - 0s - loss: 0.3872 - mean_q: 12.4307
Epoch 4795/4795
 - 0s - loss: 0.6142 - mean_q: 12.5237
Epoch 4796/4796
 - 0s - loss: 0.3515 - mean_q: 11.8574
Epoch 4797/4797
 - 0s - loss: 0.7886 - mean_q: 13.1839
Epoch 4798/4798
 - 0s - loss: 0.5269 - mean_q: 12.3544
Epoch 4799/4799
 - 0s - loss: 0.6492 - mean_q: 12.4559
Epoch 4800/4800
 - 0s - loss: 0.7904 - mean_q: 13.9554
Epoch 4801/4801
 - 0s - loss: 0.7529 - mean_q: 12.6122
Epoch 4802/4802
 - 0s - loss: 0.3744 - mean_q: 11.8249
Epoch 4803/4803
 - 0s - loss: 0.4322 - mean_q: 12.2523
Epoch 4804/4804
 - 0s - loss: 0.2087 - mean_q: 13.2324
Epoch 4805/4805
 - 0s - loss: 0.5334 - mean_q: 12.7228
Epoch 4806/4806
 - 0s - loss: 0.6180 - mean_q: 12.7346
Epoch 4807/4807
 - 0s - loss: 0.3989 - mean_q: 12.6930
Epoch 4808/4808
 - 0s - loss: 0.6415 - mean_q: 12.3812
Epoch 4809/4809
 - 0s - loss: 0.4413 - mean_q: 12.1914
Epoch 4810/4810
 - 0s - loss: 0.4673 - mean_q: 12.9103
Epoch 4811/4811
 - 0s - loss: 5.5861 - mean_q: 13.7063
Epoch 4812

Epoch 4943/4943
 - 0s - loss: 0.4527 - mean_q: 11.4905
Epoch 4944/4944
 - 0s - loss: 0.4417 - mean_q: 12.2362
Epoch 4945/4945
 - 0s - loss: 0.3137 - mean_q: 10.4870
Epoch 4946/4946
 - 0s - loss: 0.1415 - mean_q: 12.0164
Epoch 4947/4947
 - 0s - loss: 0.2748 - mean_q: 11.6373
Epoch 4948/4948
 - 0s - loss: 0.3380 - mean_q: 12.1488
Epoch 4949/4949
 - 0s - loss: 390.5205 - mean_q: 14.0476
Epoch 4950/4950
 - 0s - loss: 80.0448 - mean_q: 19.5567
Epoch 4951/4951
 - 0s - loss: 9.8045 - mean_q: 12.1705
Epoch 4952/4952
 - 0s - loss: 2.7191 - mean_q: 13.7433
Epoch 4953/4953
 - 0s - loss: 1.6811 - mean_q: 14.4039
Epoch 4954/4954
 - 0s - loss: 1.7560 - mean_q: 15.8547
Epoch 4955/4955
 - 0s - loss: 1.8188 - mean_q: 16.6157
Epoch 4956/4956
 - 0s - loss: 2.4395 - mean_q: 15.1304
Epoch 4957/4957
 - 0s - loss: 2.1067 - mean_q: 18.7217
Epoch 4958/4958
 - 0s - loss: 2.2348 - mean_q: 17.0196
Epoch 4959/4959
 - 0s - loss: 69.3579 - mean_q: 16.4871
Epoch 4960/4960
 - 0s - loss: 1.6027 - mean_q: 21.0898
Epoch 

Epoch 5092/5092
 - 0s - loss: 0.5466 - mean_q: 10.7320
Epoch 5093/5093
 - 0s - loss: 0.3714 - mean_q: 12.0756
Epoch 5094/5094
 - 0s - loss: 0.4110 - mean_q: 11.3452
Epoch 5095/5095
 - 0s - loss: 0.4742 - mean_q: 11.3514
Epoch 5096/5096
 - 0s - loss: 0.4456 - mean_q: 11.8753
Epoch 5097/5097
 - 0s - loss: 0.5296 - mean_q: 11.5272
Epoch 5098/5098
 - 0s - loss: 0.7281 - mean_q: 10.5597
Epoch 5099/5099
 - 0s - loss: 0.5107 - mean_q: 10.1419
Epoch 5100/5100
 - 0s - loss: 0.2668 - mean_q: 12.0865
Epoch 5101/5101
 - 0s - loss: 0.2876 - mean_q: 12.4932
Epoch 5102/5102
 - 0s - loss: 0.5032 - mean_q: 11.8090
Epoch 5103/5103
 - 0s - loss: 0.0867 - mean_q: 10.5157
Epoch 5104/5104
 - 0s - loss: 0.4087 - mean_q: 11.0833
Epoch 5105/5105
 - 0s - loss: 5.7236 - mean_q: 10.7129
Epoch 5106/5106
 - 0s - loss: 0.5552 - mean_q: 12.0009
Epoch 5107/5107
 - 0s - loss: 0.6630 - mean_q: 11.8091
Epoch 5108/5108
 - 0s - loss: 0.4023 - mean_q: 11.5911
Epoch 5109/5109
 - 0s - loss: 0.4965 - mean_q: 12.3193
Epoch 5110

Epoch 5241/5241
 - 0s - loss: 0.5099 - mean_q: 11.0945
Epoch 5242/5242
 - 0s - loss: 0.4981 - mean_q: 10.3451
Epoch 5243/5243
 - 0s - loss: 0.3151 - mean_q: 11.0214
Epoch 5244/5244
 - 0s - loss: 0.4197 - mean_q: 10.1526
Epoch 5245/5245
 - 0s - loss: 0.4541 - mean_q: 10.4104
Epoch 5246/5246
 - 0s - loss: 0.4882 - mean_q: 9.9430
Epoch 5247/5247
 - 0s - loss: 0.3851 - mean_q: 9.8780
Epoch 5248/5248
 - 0s - loss: 0.4190 - mean_q: 11.1613
Epoch 5249/5249
 - 0s - loss: 0.5636 - mean_q: 10.2493
Epoch 5250/5250
 - 0s - loss: 0.5712 - mean_q: 10.1307
Epoch 5251/5251
 - 0s - loss: 0.4039 - mean_q: 10.6010
Epoch 5252/5252
 - 0s - loss: 0.2102 - mean_q: 9.9870
Epoch 5253/5253
 - 0s - loss: 0.5181 - mean_q: 11.5650
Epoch 5254/5254
 - 0s - loss: 0.5462 - mean_q: 10.8515
Epoch 5255/5255
 - 0s - loss: 0.4450 - mean_q: 10.3993
Epoch 5256/5256
 - 0s - loss: 0.4609 - mean_q: 10.8693
Epoch 5257/5257
 - 0s - loss: 0.3033 - mean_q: 10.2768
Epoch 5258/5258
 - 0s - loss: 0.5630 - mean_q: 10.8364
Epoch 5259/52

 - 0s - loss: 0.1730 - mean_q: 9.7463
Epoch 5391/5391
 - 0s - loss: 0.5847 - mean_q: 10.9718
Epoch 5392/5392
 - 0s - loss: 0.5344 - mean_q: 10.2893
Epoch 5393/5393
 - 0s - loss: 0.4165 - mean_q: 10.2672
Epoch 5394/5394
 - 0s - loss: 0.5230 - mean_q: 11.1045
Epoch 5395/5395
 - 0s - loss: 0.2568 - mean_q: 10.9637
Epoch 5396/5396
 - 0s - loss: 0.7610 - mean_q: 11.4185
Epoch 5397/5397
 - 0s - loss: 0.5434 - mean_q: 11.4561
Epoch 5398/5398
 - 0s - loss: 0.4183 - mean_q: 11.2829
Epoch 5399/5399
 - 0s - loss: 0.3945 - mean_q: 10.7625
Epoch 5400/5400
 - 0s - loss: 0.5824 - mean_q: 10.7809
Epoch 5401/5401
 - 0s - loss: 0.5225 - mean_q: 10.9638
Epoch 5402/5402
 - 0s - loss: 0.5525 - mean_q: 10.6264
Epoch 5403/5403
 - 0s - loss: 0.1076 - mean_q: 9.9172
Epoch 5404/5404
 - 0s - loss: 0.6064 - mean_q: 11.9782
Epoch 5405/5405
 - 0s - loss: 0.6247 - mean_q: 11.6658
Epoch 5406/5406
 - 0s - loss: 65.9446 - mean_q: 10.1925
Epoch 5407/5407
 - 0s - loss: 0.6647 - mean_q: 12.2343
Epoch 5408/5408
 - 0s - los

Epoch 5540/5540
 - 0s - loss: 0.3363 - mean_q: 10.4938
Epoch 5541/5541
 - 0s - loss: 1.0035 - mean_q: 11.1709
Epoch 5542/5542
 - 0s - loss: 0.8632 - mean_q: 10.6465
Epoch 5543/5543
 - 0s - loss: 0.3828 - mean_q: 10.6177
Epoch 5544/5544
 - 0s - loss: 0.7112 - mean_q: 10.0689
Epoch 5545/5545
 - 0s - loss: 0.6699 - mean_q: 10.3937
Epoch 5546/5546
 - 0s - loss: 0.4485 - mean_q: 10.7196
Epoch 5547/5547
 - 0s - loss: 0.6751 - mean_q: 11.3832
Epoch 5548/5548
 - 0s - loss: 0.5313 - mean_q: 11.7124
Epoch 5549/5549
 - 0s - loss: 0.8499 - mean_q: 11.7621
Epoch 5550/5550
 - 0s - loss: 0.7744 - mean_q: 11.2425
Epoch 5551/5551
 - 0s - loss: 0.9858 - mean_q: 11.9579
Epoch 5552/5552
 - 0s - loss: 0.7439 - mean_q: 11.2483
Epoch 5553/5553
 - 0s - loss: 0.5779 - mean_q: 10.6467
Epoch 5554/5554
 - 0s - loss: 0.1370 - mean_q: 10.3402
Epoch 5555/5555
 - 0s - loss: 0.3103 - mean_q: 10.4796
Epoch 5556/5556
 - 0s - loss: 0.2183 - mean_q: 11.0939
Epoch 5557/5557
 - 0s - loss: 0.6383 - mean_q: 11.2027
Epoch 5558

 - 0s - loss: 0.6919 - mean_q: 13.1849
Epoch 5690/5690
 - 0s - loss: 0.5881 - mean_q: 13.3184
Epoch 5691/5691
 - 0s - loss: 1.3134 - mean_q: 12.2963
Epoch 5692/5692
 - 0s - loss: 0.3635 - mean_q: 13.1331
Epoch 5693/5693
 - 0s - loss: 0.9108 - mean_q: 12.9257
Epoch 5694/5694
 - 0s - loss: 0.6576 - mean_q: 13.1388
Epoch 5695/5695
 - 0s - loss: 0.2491 - mean_q: 12.7908
Epoch 5696/5696
 - 0s - loss: 0.4403 - mean_q: 11.8183
Epoch 5697/5697
 - 0s - loss: 0.4410 - mean_q: 11.3038
Epoch 5698/5698
 - 0s - loss: 0.8001 - mean_q: 10.8633
Epoch 5699/5699
 - 0s - loss: 0.3696 - mean_q: 11.5816
Epoch 5700/5700
 - 0s - loss: 0.7002 - mean_q: 11.6318
Epoch 5701/5701
 - 0s - loss: 0.3454 - mean_q: 10.9261
Epoch 5702/5702
 - 0s - loss: 0.4207 - mean_q: 10.6182
Epoch 5703/5703
 - 0s - loss: 0.7983 - mean_q: 12.0742
Epoch 5704/5704
 - 0s - loss: 0.9211 - mean_q: 12.2584
Epoch 5705/5705
 - 0s - loss: 0.3773 - mean_q: 10.6326
Epoch 5706/5706
 - 0s - loss: 0.3616 - mean_q: 12.1929
Epoch 5707/5707
 - 0s - lo

 - 0s - loss: 0.7316 - mean_q: 13.0830
Epoch 5839/5839
 - 0s - loss: 0.8060 - mean_q: 12.0631
Epoch 5840/5840
 - 0s - loss: 0.4780 - mean_q: 14.0889
Epoch 5841/5841
 - 0s - loss: 0.4909 - mean_q: 11.5049
Epoch 5842/5842
 - 0s - loss: 0.8017 - mean_q: 13.4019
Epoch 5843/5843
 - 0s - loss: 0.6061 - mean_q: 12.0586
Epoch 5844/5844
 - 0s - loss: 1.0658 - mean_q: 12.2203
Epoch 5845/5845
 - 0s - loss: 0.7216 - mean_q: 13.6130
Epoch 5846/5846
 - 0s - loss: 0.9258 - mean_q: 12.5322
Epoch 5847/5847
 - 0s - loss: 0.7553 - mean_q: 12.2753
Epoch 5848/5848
 - 0s - loss: 0.7769 - mean_q: 12.6607
Epoch 5849/5849
 - 0s - loss: 0.3780 - mean_q: 11.1579
Epoch 5850/5850
 - 0s - loss: 0.1582 - mean_q: 13.0740
Epoch 5851/5851
 - 0s - loss: 5.0566 - mean_q: 12.1852
Epoch 5852/5852
 - 0s - loss: 0.7315 - mean_q: 11.3799
Epoch 5853/5853
 - 0s - loss: 0.6405 - mean_q: 12.4139
Epoch 5854/5854
 - 0s - loss: 1.2602 - mean_q: 12.4913
Epoch 5855/5855
 - 0s - loss: 0.5186 - mean_q: 12.1995
Epoch 5856/5856
 - 0s - lo

Epoch 5988/5988
 - 0s - loss: 0.2732 - mean_q: 10.8654
Epoch 5989/5989
 - 0s - loss: 0.4047 - mean_q: 10.0640
Epoch 5990/5990
 - 0s - loss: 0.3285 - mean_q: 10.7867
Epoch 5991/5991
 - 0s - loss: 0.5011 - mean_q: 9.6606
Epoch 5992/5992
 - 0s - loss: 0.2325 - mean_q: 10.4287
Epoch 5993/5993
 - 0s - loss: 0.5952 - mean_q: 10.3297
Epoch 5994/5994
 - 0s - loss: 0.4183 - mean_q: 10.5921
Epoch 5995/5995
 - 0s - loss: 0.7931 - mean_q: 10.4847
Epoch 5996/5996
 - 0s - loss: 0.6312 - mean_q: 11.2636
Epoch 5997/5997
 - 0s - loss: 0.6762 - mean_q: 10.3304
Epoch 5998/5998
 - 0s - loss: 0.7346 - mean_q: 12.6866
Epoch 5999/5999
 - 0s - loss: 0.2649 - mean_q: 11.4367
Epoch 6000/6000
 - 0s - loss: 0.1964 - mean_q: 11.1156
Epoch 6001/6001
 - 0s - loss: 0.7758 - mean_q: 10.4772
Epoch 6002/6002
 - 0s - loss: 0.5187 - mean_q: 10.3721
Epoch 6003/6003
 - 0s - loss: 0.6198 - mean_q: 10.3694
Epoch 6004/6004
 - 0s - loss: 0.4153 - mean_q: 10.5152
Epoch 6005/6005
 - 0s - loss: 0.4908 - mean_q: 9.1576
Epoch 6006/6

 - 0s - loss: 0.6576 - mean_q: 10.5231
Epoch 6138/6138
 - 0s - loss: 14.3439 - mean_q: 10.3726
Epoch 6139/6139
 - 0s - loss: 0.3545 - mean_q: 10.1088
Epoch 6140/6140
 - 0s - loss: 0.4730 - mean_q: 10.6182
Epoch 6141/6141
 - 0s - loss: 0.6493 - mean_q: 10.5960
Epoch 6142/6142
 - 0s - loss: 0.2839 - mean_q: 11.5311
Epoch 6143/6143
 - 0s - loss: 0.5833 - mean_q: 10.4522
Epoch 6144/6144
 - 0s - loss: 4.6884 - mean_q: 11.2943
Epoch 6145/6145
 - 0s - loss: 0.6401 - mean_q: 11.3922
Epoch 6146/6146
 - 0s - loss: 5.4850 - mean_q: 11.3531
Epoch 6147/6147
 - 0s - loss: 0.3274 - mean_q: 11.6839
Epoch 6148/6148
 - 0s - loss: 0.4042 - mean_q: 9.9546
Epoch 6149/6149
 - 0s - loss: 0.7473 - mean_q: 11.9699
Epoch 6150/6150
 - 0s - loss: 0.6108 - mean_q: 12.1049
Epoch 6151/6151
 - 0s - loss: 0.7313 - mean_q: 10.7648
Epoch 6152/6152
 - 0s - loss: 0.7956 - mean_q: 11.3826
Epoch 6153/6153
 - 0s - loss: 0.2461 - mean_q: 11.1528
Epoch 6154/6154
 - 0s - loss: 0.3831 - mean_q: 10.4114
Epoch 6155/6155
 - 0s - lo

Epoch 6287/6287
 - 0s - loss: 0.3258 - mean_q: 11.1697
Epoch 6288/6288
 - 0s - loss: 0.8974 - mean_q: 10.6185
Epoch 6289/6289
 - 0s - loss: 0.4809 - mean_q: 12.1656
Epoch 6290/6290
 - 0s - loss: 0.3764 - mean_q: 12.1003
Epoch 6291/6291
 - 0s - loss: 0.3399 - mean_q: 12.1934
Epoch 6292/6292
 - 0s - loss: 0.4903 - mean_q: 11.5083
Epoch 6293/6293
 - 0s - loss: 0.2123 - mean_q: 11.2335
Epoch 6294/6294
 - 0s - loss: 0.4761 - mean_q: 11.1939
Epoch 6295/6295
 - 0s - loss: 0.6981 - mean_q: 11.1590
Epoch 6296/6296
 - 0s - loss: 0.3236 - mean_q: 10.7024
Epoch 6297/6297
 - 0s - loss: 0.6023 - mean_q: 12.1640
Epoch 6298/6298
 - 0s - loss: 0.4770 - mean_q: 12.3979
Epoch 6299/6299
 - 0s - loss: 5.1017 - mean_q: 11.4224
Epoch 6300/6300
 - 0s - loss: 0.5262 - mean_q: 10.5176
Epoch 6301/6301
 - 0s - loss: 0.6785 - mean_q: 11.4592
Epoch 6302/6302
 - 0s - loss: 0.5180 - mean_q: 10.5270
Epoch 6303/6303
 - 0s - loss: 0.4640 - mean_q: 10.2207
Epoch 6304/6304
 - 0s - loss: 0.4724 - mean_q: 10.8831
Epoch 6305

 - 0s - loss: 0.3376 - mean_q: 10.2474
Epoch 6437/6437
 - 0s - loss: 0.7428 - mean_q: 11.8295
Epoch 6438/6438
 - 0s - loss: 0.5019 - mean_q: 10.7060
Epoch 6439/6439
 - 0s - loss: 0.3670 - mean_q: 10.5379
Epoch 6440/6440
 - 0s - loss: 0.4646 - mean_q: 11.0724
Epoch 6441/6441
 - 0s - loss: 0.3688 - mean_q: 11.2394
Epoch 6442/6442
 - 0s - loss: 0.4202 - mean_q: 10.1352
Epoch 6443/6443
 - 0s - loss: 0.4322 - mean_q: 10.3832
Epoch 6444/6444
 - 0s - loss: 0.6271 - mean_q: 11.5866
Epoch 6445/6445
 - 0s - loss: 0.6672 - mean_q: 11.0850
Epoch 6446/6446
 - 0s - loss: 0.7755 - mean_q: 10.6277
Epoch 6447/6447
 - 0s - loss: 0.8057 - mean_q: 10.8197
Epoch 6448/6448
 - 0s - loss: 0.5735 - mean_q: 11.4016
Epoch 6449/6449
 - 0s - loss: 0.4814 - mean_q: 9.4706
Epoch 6450/6450
 - 0s - loss: 0.2771 - mean_q: 11.5780
Epoch 6451/6451
 - 0s - loss: 0.6372 - mean_q: 9.9734
Epoch 6452/6452
 - 0s - loss: 0.3824 - mean_q: 10.5097
Epoch 6453/6453
 - 0s - loss: 0.3939 - mean_q: 10.5089
Epoch 6454/6454
 - 0s - loss

Epoch 6586/6586
 - 0s - loss: 65.6127 - mean_q: 11.6223
Epoch 6587/6587
 - 0s - loss: 1.4240 - mean_q: 11.6448
Epoch 6588/6588
 - 0s - loss: 1.5439 - mean_q: 11.9117
Epoch 6589/6589
 - 0s - loss: 0.8533 - mean_q: 10.4508
Epoch 6590/6590
 - 0s - loss: 0.4990 - mean_q: 11.7279
Epoch 6591/6591
 - 0s - loss: 0.6347 - mean_q: 12.9891
Epoch 6592/6592
 - 0s - loss: 0.9155 - mean_q: 11.2474
Epoch 6593/6593
 - 0s - loss: 0.6173 - mean_q: 11.1635
Epoch 6594/6594
 - 0s - loss: 0.3556 - mean_q: 11.6786
Epoch 6595/6595
 - 0s - loss: 0.6868 - mean_q: 12.3801
Epoch 6596/6596
 - 0s - loss: 0.6126 - mean_q: 12.5624
Epoch 6597/6597
 - 0s - loss: 0.6729 - mean_q: 10.9336
Epoch 6598/6598
 - 0s - loss: 0.2979 - mean_q: 11.0680
Epoch 6599/6599
 - 0s - loss: 0.5846 - mean_q: 11.5836
Epoch 6600/6600
 - 0s - loss: 0.5809 - mean_q: 10.8122
Epoch 6601/6601
 - 0s - loss: 0.5733 - mean_q: 10.3590
Epoch 6602/6602
 - 0s - loss: 0.4918 - mean_q: 10.3844
Epoch 6603/6603
 - 0s - loss: 0.1899 - mean_q: 10.3504
Epoch 660

Epoch 6736/6736
 - 0s - loss: 0.4822 - mean_q: 10.4299
Epoch 6737/6737
 - 0s - loss: 0.3613 - mean_q: 11.0205
Epoch 6738/6738
 - 0s - loss: 0.6614 - mean_q: 8.8957
Epoch 6739/6739
 - 0s - loss: 0.9558 - mean_q: 11.3477
Epoch 6740/6740
 - 0s - loss: 0.6623 - mean_q: 10.2287
Epoch 6741/6741
 - 0s - loss: 0.5523 - mean_q: 11.0836
Epoch 6742/6742
 - 0s - loss: 65.0317 - mean_q: 10.6526
Epoch 6743/6743
 - 0s - loss: 1.2873 - mean_q: 11.8161
Epoch 6744/6744
 - 0s - loss: 1.4286 - mean_q: 12.7829
Epoch 6745/6745
 - 0s - loss: 0.7132 - mean_q: 13.3805
Epoch 6746/6746
 - 0s - loss: 5.0685 - mean_q: 12.9677
Epoch 6747/6747
 - 0s - loss: 0.8297 - mean_q: 12.1542
Epoch 6748/6748
 - 0s - loss: 0.8802 - mean_q: 13.0969
Epoch 6749/6749
 - 0s - loss: 4.7429 - mean_q: 11.6328
Epoch 6750/6750
 - 0s - loss: 0.6141 - mean_q: 10.8454
Epoch 6751/6751
 - 0s - loss: 4.7381 - mean_q: 12.3965
Epoch 6752/6752
 - 0s - loss: 0.5085 - mean_q: 12.1479
Epoch 6753/6753
 - 0s - loss: 0.6168 - mean_q: 12.2120
Epoch 6754

 - 0s - loss: 0.1106 - mean_q: 9.5455
Epoch 6886/6886
 - 0s - loss: 0.3797 - mean_q: 11.1773
Epoch 6887/6887
 - 0s - loss: 0.4568 - mean_q: 9.8207
Epoch 6888/6888
 - 0s - loss: 0.7114 - mean_q: 11.6661
Epoch 6889/6889
 - 0s - loss: 0.4428 - mean_q: 10.9207
Epoch 6890/6890
 - 0s - loss: 0.5530 - mean_q: 11.1116
Epoch 6891/6891
 - 0s - loss: 0.8142 - mean_q: 11.6634
Epoch 6892/6892
 - 0s - loss: 5.2554 - mean_q: 11.2083
Epoch 6893/6893
 - 0s - loss: 0.2894 - mean_q: 10.6286
Epoch 6894/6894
 - 0s - loss: 0.4883 - mean_q: 11.5191
Epoch 6895/6895
 - 0s - loss: 0.2688 - mean_q: 10.8436
Epoch 6896/6896
 - 0s - loss: 0.3803 - mean_q: 10.1617
Epoch 6897/6897
 - 0s - loss: 0.1756 - mean_q: 9.7997
Epoch 6898/6898
 - 0s - loss: 0.7335 - mean_q: 10.4255
Epoch 6899/6899
 - 0s - loss: 0.6432 - mean_q: 10.9506
Epoch 6900/6900
 - 0s - loss: 0.4930 - mean_q: 11.8399
Epoch 6901/6901
 - 0s - loss: 0.5488 - mean_q: 10.7114
Epoch 6902/6902
 - 0s - loss: 0.5261 - mean_q: 10.3012
Epoch 6903/6903
 - 0s - loss:

Epoch 7035/7035
 - 0s - loss: 5.7314 - mean_q: 11.3824
Epoch 7036/7036
 - 0s - loss: 0.6284 - mean_q: 11.1863
Epoch 7037/7037
 - 0s - loss: 0.1960 - mean_q: 10.4609
Epoch 7038/7038
 - 0s - loss: 0.5889 - mean_q: 11.0171
Epoch 7039/7039
 - 0s - loss: 0.5694 - mean_q: 12.0190
Epoch 7040/7040
 - 0s - loss: 0.6459 - mean_q: 10.8394
Epoch 7041/7041
 - 0s - loss: 0.3396 - mean_q: 11.3484
Epoch 7042/7042
 - 0s - loss: 0.4070 - mean_q: 10.7964
Epoch 7043/7043
 - 0s - loss: 0.3836 - mean_q: 11.3566
Epoch 7044/7044
 - 0s - loss: 0.6907 - mean_q: 11.9510
Epoch 7045/7045
 - 0s - loss: 0.1889 - mean_q: 10.9577
Epoch 7046/7046
 - 0s - loss: 0.5071 - mean_q: 10.1219
Epoch 7047/7047
 - 0s - loss: 0.2881 - mean_q: 11.4265
Epoch 7048/7048
 - 0s - loss: 0.5775 - mean_q: 11.6349
Epoch 7049/7049
 - 0s - loss: 0.5201 - mean_q: 11.2490
Epoch 7050/7050
 - 0s - loss: 0.3911 - mean_q: 11.2196
Epoch 7051/7051
 - 0s - loss: 0.8216 - mean_q: 10.8348
Epoch 7052/7052
 - 0s - loss: 5.7593 - mean_q: 11.9220
Epoch 7053

 - 0s - loss: 0.7114 - mean_q: 11.1577
Epoch 7185/7185
 - 0s - loss: 0.2698 - mean_q: 12.2714
Epoch 7186/7186
 - 0s - loss: 0.5202 - mean_q: 11.4326
Epoch 7187/7187
 - 0s - loss: 0.9335 - mean_q: 12.7423
Epoch 7188/7188
 - 0s - loss: 0.3977 - mean_q: 12.0504
Epoch 7189/7189
 - 0s - loss: 0.6299 - mean_q: 11.2088
Epoch 7190/7190
 - 0s - loss: 0.3746 - mean_q: 10.3836
Epoch 7191/7191
 - 0s - loss: 1.0542 - mean_q: 12.0998
Epoch 7192/7192
 - 0s - loss: 0.4862 - mean_q: 11.3331
Epoch 7193/7193
 - 0s - loss: 0.5657 - mean_q: 10.7731
Epoch 7194/7194
 - 0s - loss: 0.8335 - mean_q: 9.9336
Epoch 7195/7195
 - 0s - loss: 0.2663 - mean_q: 10.8989
Epoch 7196/7196
 - 0s - loss: 0.5223 - mean_q: 11.2890
Epoch 7197/7197
 - 0s - loss: 0.5494 - mean_q: 11.9664
Epoch 7198/7198
 - 0s - loss: 0.7094 - mean_q: 11.0606
Epoch 7199/7199
 - 0s - loss: 0.5345 - mean_q: 11.1363
Epoch 7200/7200
 - 0s - loss: 0.6041 - mean_q: 10.5483
Epoch 7201/7201
 - 0s - loss: 0.4517 - mean_q: 10.5209
Epoch 7202/7202
 - 0s - los

Epoch 7334/7334
 - 0s - loss: 0.6697 - mean_q: 11.5031
Epoch 7335/7335
 - 0s - loss: 0.6133 - mean_q: 10.6997
Epoch 7336/7336
 - 0s - loss: 0.3042 - mean_q: 10.1888
Epoch 7337/7337
 - 0s - loss: 0.4298 - mean_q: 11.6082
Epoch 7338/7338
 - 0s - loss: 0.7744 - mean_q: 11.7791
Epoch 7339/7339
 - 0s - loss: 0.7091 - mean_q: 10.6401
Epoch 7340/7340
 - 0s - loss: 0.4551 - mean_q: 11.0449
Epoch 7341/7341
 - 0s - loss: 0.8182 - mean_q: 10.7080
Epoch 7342/7342
 - 0s - loss: 0.4277 - mean_q: 10.5243
Epoch 7343/7343
 - 0s - loss: 0.4496 - mean_q: 10.6536
Epoch 7344/7344
 - 0s - loss: 0.9951 - mean_q: 11.7209
Epoch 7345/7345
 - 0s - loss: 0.4673 - mean_q: 11.2262
Epoch 7346/7346
 - 0s - loss: 0.3430 - mean_q: 10.5892
Epoch 7347/7347
 - 0s - loss: 0.6327 - mean_q: 9.9355
Epoch 7348/7348
 - 0s - loss: 0.5442 - mean_q: 10.0067
Epoch 7349/7349
 - 0s - loss: 0.2141 - mean_q: 10.2602
Epoch 7350/7350
 - 0s - loss: 0.2063 - mean_q: 10.0086
Epoch 7351/7351
 - 0s - loss: 0.4296 - mean_q: 11.1308
Epoch 7352/

Epoch 7483/7483
 - 0s - loss: 0.4834 - mean_q: 12.1286
Epoch 7484/7484
 - 0s - loss: 0.5392 - mean_q: 10.1728
Epoch 7485/7485
 - 0s - loss: 0.2847 - mean_q: 10.7156
Epoch 7486/7486
 - 0s - loss: 0.1943 - mean_q: 11.3069
Epoch 7487/7487
 - 0s - loss: 0.1464 - mean_q: 10.1078
Epoch 7488/7488
 - 0s - loss: 0.4000 - mean_q: 11.1331
Epoch 7489/7489
 - 0s - loss: 0.3527 - mean_q: 11.3717
Epoch 7490/7490
 - 0s - loss: 0.4025 - mean_q: 10.1195
Epoch 7491/7491
 - 0s - loss: 5.5209 - mean_q: 10.8322
Epoch 7492/7492
 - 0s - loss: 0.5614 - mean_q: 11.7060
Epoch 7493/7493
 - 0s - loss: 0.7928 - mean_q: 11.5263
Epoch 7494/7494
 - 0s - loss: 0.3290 - mean_q: 11.9979
Epoch 7495/7495
 - 0s - loss: 0.5103 - mean_q: 11.9987
Epoch 7496/7496
 - 0s - loss: 0.4516 - mean_q: 10.4363
Epoch 7497/7497
 - 0s - loss: 0.4290 - mean_q: 11.0061
Epoch 7498/7498
 - 0s - loss: 0.2436 - mean_q: 10.6646
Epoch 7499/7499
 - 0s - loss: 0.1857 - mean_q: 10.6157
Epoch 7500/7500
 - 0s - loss: 0.3110 - mean_q: 10.2833
Epoch 7501

 - 0s - loss: 0.2722 - mean_q: 11.2277
Epoch 7633/7633
 - 0s - loss: 0.8843 - mean_q: 12.1975
Epoch 7634/7634
 - 0s - loss: 0.3746 - mean_q: 11.8666
Epoch 7635/7635
 - 0s - loss: 0.5292 - mean_q: 12.8339
Epoch 7636/7636
 - 0s - loss: 0.7724 - mean_q: 12.2207
Epoch 7637/7637
 - 0s - loss: 1.2303 - mean_q: 12.0452
Epoch 7638/7638
 - 0s - loss: 0.5775 - mean_q: 11.3782
Epoch 7639/7639
 - 0s - loss: 0.5773 - mean_q: 11.5006
Epoch 7640/7640
 - 0s - loss: 5.6553 - mean_q: 11.0903
Epoch 7641/7641
 - 0s - loss: 0.2953 - mean_q: 11.1924
Epoch 7642/7642
 - 0s - loss: 0.5497 - mean_q: 10.5130
Epoch 7643/7643
 - 0s - loss: 0.3465 - mean_q: 10.6041
Epoch 7644/7644
 - 0s - loss: 0.6074 - mean_q: 11.0338
Epoch 7645/7645
 - 0s - loss: 0.4563 - mean_q: 10.1563
Epoch 7646/7646
 - 0s - loss: 0.6727 - mean_q: 11.7770
Epoch 7647/7647
 - 0s - loss: 0.5215 - mean_q: 11.2625
Epoch 7648/7648
 - 0s - loss: 0.2308 - mean_q: 11.4620
Epoch 7649/7649
 - 0s - loss: 0.3200 - mean_q: 11.0628
Epoch 7650/7650
 - 0s - lo

Epoch 7782/7782
 - 0s - loss: 0.6511 - mean_q: 12.7685
Epoch 7783/7783
 - 0s - loss: 0.5902 - mean_q: 12.1866
Epoch 7784/7784
 - 0s - loss: 0.2637 - mean_q: 12.5046
Epoch 7785/7785
 - 0s - loss: 0.2250 - mean_q: 12.0162
Epoch 7786/7786
 - 0s - loss: 0.2741 - mean_q: 11.7746
Epoch 7787/7787
 - 0s - loss: 0.1871 - mean_q: 11.7874
Epoch 7788/7788
 - 0s - loss: 0.5788 - mean_q: 11.3744
Epoch 7789/7789
 - 0s - loss: 0.8366 - mean_q: 11.7398
Epoch 7790/7790
 - 0s - loss: 0.6137 - mean_q: 10.7161
Epoch 7791/7791
 - 0s - loss: 0.6157 - mean_q: 11.4205
Epoch 7792/7792
 - 0s - loss: 0.6396 - mean_q: 11.5831
Epoch 7793/7793
 - 0s - loss: 0.2777 - mean_q: 11.2029
Epoch 7794/7794
 - 0s - loss: 0.4278 - mean_q: 11.8418
Epoch 7795/7795
 - 0s - loss: 0.8078 - mean_q: 11.2225
Epoch 7796/7796
 - 0s - loss: 0.2690 - mean_q: 11.5537
Epoch 7797/7797
 - 0s - loss: 0.3874 - mean_q: 10.4654
Epoch 7798/7798
 - 0s - loss: 0.3591 - mean_q: 12.2997
Epoch 7799/7799
 - 0s - loss: 0.5832 - mean_q: 10.6987
Epoch 7800

 - 0s - loss: 5.3901 - mean_q: 11.3310
Epoch 7932/7932
 - 0s - loss: 0.5168 - mean_q: 11.2756
Epoch 7933/7933
 - 0s - loss: 0.2996 - mean_q: 11.4349
Epoch 7934/7934
 - 0s - loss: 0.4058 - mean_q: 10.9507
Epoch 7935/7935
 - 0s - loss: 0.4737 - mean_q: 10.9619
Epoch 7936/7936
 - 0s - loss: 0.4805 - mean_q: 10.6673
Epoch 7937/7937
 - 0s - loss: 0.6965 - mean_q: 11.0999
Epoch 7938/7938
 - 0s - loss: 0.5938 - mean_q: 11.3070
Epoch 7939/7939
 - 0s - loss: 0.2510 - mean_q: 10.5041
Epoch 7940/7940
 - 0s - loss: 0.3876 - mean_q: 11.7099
Epoch 7941/7941
 - 0s - loss: 0.4141 - mean_q: 9.8775
Epoch 7942/7942
 - 0s - loss: 0.5697 - mean_q: 9.6144
Epoch 7943/7943
 - 0s - loss: 0.5426 - mean_q: 10.2212
Epoch 7944/7944
 - 0s - loss: 0.6721 - mean_q: 10.9468
Epoch 7945/7945
 - 0s - loss: 0.4212 - mean_q: 11.4531
Epoch 7946/7946
 - 0s - loss: 0.5742 - mean_q: 11.4114
Epoch 7947/7947
 - 0s - loss: 0.4357 - mean_q: 10.4847
Epoch 7948/7948
 - 0s - loss: 0.1516 - mean_q: 9.3578
Epoch 7949/7949
 - 0s - loss:

Epoch 8081/8081
 - 0s - loss: 0.3369 - mean_q: 10.8473
Epoch 8082/8082
 - 0s - loss: 0.4288 - mean_q: 9.6510
Epoch 8083/8083
 - 0s - loss: 0.3605 - mean_q: 9.9422
Epoch 8084/8084
 - 0s - loss: 0.4447 - mean_q: 11.3536
Epoch 8085/8085
 - 0s - loss: 0.4409 - mean_q: 10.9258
Epoch 8086/8086
 - 0s - loss: 0.7031 - mean_q: 10.2361
Epoch 8087/8087
 - 0s - loss: 0.3574 - mean_q: 10.7027
Epoch 8088/8088
 - 0s - loss: 1.0501 - mean_q: 10.4357
Epoch 8089/8089
 - 0s - loss: 0.5535 - mean_q: 10.5631
Epoch 8090/8090
 - 0s - loss: 0.3940 - mean_q: 11.6935
Epoch 8091/8091
 - 0s - loss: 0.3522 - mean_q: 9.7481
Epoch 8092/8092
 - 0s - loss: 0.2958 - mean_q: 10.8192
Epoch 8093/8093
 - 0s - loss: 0.4665 - mean_q: 11.4350
Epoch 8094/8094
 - 0s - loss: 0.2330 - mean_q: 10.4784
Epoch 8095/8095
 - 0s - loss: 0.5158 - mean_q: 10.4485
Epoch 8096/8096
 - 0s - loss: 0.5754 - mean_q: 11.2366
Epoch 8097/8097
 - 0s - loss: 0.9635 - mean_q: 11.0940
Epoch 8098/8098
 - 0s - loss: 0.4256 - mean_q: 11.5784
Epoch 8099/80

 - 0s - loss: 0.3618 - mean_q: 11.5518
Epoch 8231/8231
 - 0s - loss: 0.3486 - mean_q: 11.3805
Epoch 8232/8232
 - 0s - loss: 0.6160 - mean_q: 11.1153
Epoch 8233/8233
 - 0s - loss: 0.4507 - mean_q: 10.5547
Epoch 8234/8234
 - 0s - loss: 0.7878 - mean_q: 11.1963
Epoch 8235/8235
 - 0s - loss: 0.3016 - mean_q: 11.4825
Epoch 8236/8236
 - 0s - loss: 0.2836 - mean_q: 11.2980
Epoch 8237/8237
 - 0s - loss: 0.2062 - mean_q: 10.3923
Epoch 8238/8238
 - 0s - loss: 0.4294 - mean_q: 10.0431
Epoch 8239/8239
 - 0s - loss: 0.7459 - mean_q: 10.1770
Epoch 8240/8240
 - 0s - loss: 0.3673 - mean_q: 10.9674
Epoch 8241/8241
 - 0s - loss: 0.4923 - mean_q: 10.3922
Epoch 8242/8242
 - 0s - loss: 0.7953 - mean_q: 10.8389
Epoch 8243/8243
 - 0s - loss: 0.0906 - mean_q: 10.8204
Epoch 8244/8244
 - 0s - loss: 0.5378 - mean_q: 10.7689
Epoch 8245/8245
 - 0s - loss: 0.6800 - mean_q: 10.9313
Epoch 8246/8246
 - 0s - loss: 0.7432 - mean_q: 10.5339
Epoch 8247/8247
 - 0s - loss: 0.3823 - mean_q: 10.8551
Epoch 8248/8248
 - 0s - lo

KeyboardInterrupt: 

Save last weights:

In [None]:
online_network.save_weights(os.path.join(weights_folder, 'weights_last.h5f'))

In [None]:
# Dump all scores to a text file, one score per line
with open(os.path.join(name, 'episode_scores.txt'), 'w') as f:
    for item in episode_scores:
        f.write("{}\n".format(item))

print(episode_scores)
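
Once the scores are on disk, they can be read back and smoothed to see the training trend more clearly. The following is a minimal sketch, assuming the file holds one score per line as written above; `load_scores` and `moving_average` are hypothetical helper names, not part of the notebook.

```python
def load_scores(path):
    """Read one float per line, skipping blank lines."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]


def moving_average(values, window):
    """Return the mean of each full window of consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # drop the value that just left the window
            running_sum -= values[i - window]
        if i >= window - 1:
            averages.append(running_sum / window)
    return averages
```

For example, `moving_average(load_scores(path), 100)` gives a curve that is much less noisy than the raw per-episode scores, provided at least 100 episodes were recorded.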

Don't forget to check TensorBoard for detailed statistics on loss and metrics. In a terminal, run

`tensorboard --logdir=tensorboard`

after navigating to the folder that contains the created `tensorboard` folder: 

In [None]:
weights_folder

Then visit http://localhost:6006/

## Testing model

Finally, create a function to evaluate the trained network. 
Note that we still use an $\varepsilon$-greedy strategy here to prevent the agent from getting stuck. 
`test_dqn` returns a list of scores for the specified number of games.

In [None]:
def test_dqn(env, n_games, model, nb_actions, skip_start, eps=0.05, render=False, sleep_time=0.01):
    """Play n_games with an eps-greedy policy and return the list of final scores."""
    scores = []
    for _ in range(n_games):
        obs = env.reset()
        score = 0
        done = False
        # skip the start of each game (the screen is frozen before play begins)
        for _ in range(skip_start):
            obs, reward, done, info = env.step(0)
            score += reward
        while not done:
            # Q-values of the current state, predicted on a batch of size 1
            q_values = model.predict(np.array([obs]))[0]
            action = epsilon_greedy(q_values, eps, nb_actions)
            obs, reward, done, info = env.step(action)
            score += reward
            if render:
                env.render()
                time.sleep(sleep_time)
                if done:
                    time.sleep(1)  # pause briefly on the final frame
        scores.append(score)
    return scores
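
`test_dqn` relies on the `epsilon_greedy` function defined earlier in the notebook. As a reminder of the idea, here is a minimal sketch of such a function, assuming the same argument order as the call above. The name `epsilon_greedy_sketch` and the optional `rng` argument are illustrative assumptions, not a copy of the earlier cell.

```python
import random


def epsilon_greedy_sketch(q_values, epsilon, n_actions, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)  # explore
    # exploit: index of the largest Q-value (the first one wins on ties)
    return max(range(n_actions), key=lambda a: q_values[a])
```

With `eps=0.05`, roughly one action in twenty is random, which is usually enough to push the agent out of a corner where the greedy action never changes the state.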

In [None]:
ngames = 5
scores = test_dqn(env, ngames, online_network, nb_actions, skip_start, render=True)
print(scores)
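
A raw list of scores is hard to compare between runs, so it may help to reduce it to a few summary numbers. This is a small sketch using only the standard library; `summarize_scores` is a hypothetical helper name, not part of the notebook.

```python
import statistics


def summarize_scores(scores):
    """Return basic statistics for a non-empty list of game scores."""
    return {
        "games": len(scores),
        "mean": statistics.mean(scores),
        "std": statistics.pstdev(scores),  # population standard deviation
        "min": min(scores),
        "max": max(scores),
    }
```

Calling `summarize_scores(scores)` on the list returned by `test_dqn` gives one dictionary per evaluation run. With only a handful of games the standard deviation is a rough indication at best.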

In [None]:
env.close()

Results are pretty poor since the training was too short. 

Try to train the DQN on a cluster. You might want to adjust some hyperparameters: increase `n_steps`, `warmup`, `copy_steps` and `eps_decay_steps`; gradually decrease the learning rate during training; select an appropriate `batch_size` to fit GPU memory; adjust `gamma`; switch on the double DQN approach; and so on. 
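
To illustrate the double DQN idea mentioned above: the online network chooses the best next action, and the target network supplies the value of that action. The sketch below works on plain NumPy arrays of already predicted Q-values. The function name and its arguments are assumptions made for illustration and do not reproduce the notebook's training code.

```python
import numpy as np


def double_dqn_targets(rewards, dones, gamma, q_next_online, q_next_target):
    """Compute Double DQN targets for a batch of transitions.

    q_next_online, q_next_target: arrays of shape (batch, n_actions) with the
    Q-values of the next states from the online and target networks.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    not_done = 1.0 - np.asarray(dones, dtype=np.float64)
    # the online network chooses the next action ...
    best_actions = np.argmax(q_next_online, axis=1)
    # ... and the target network evaluates it
    next_values = np.asarray(q_next_target)[np.arange(len(rewards)), best_actions]
    # terminal transitions keep only the immediate reward
    return rewards + not_done * gamma * next_values
```

Plain DQN would instead take the maximum over `q_next_target` directly, which tends to overestimate Q-values because the same network both selects and evaluates the action.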

You can even try to make the network deeper and/or use more than one observation as the input of the neural network. For instance, using a few consecutive game observations should improve the results, since together they contain helpful information such as the directions in which the monsters are moving. Turning off the TensorBoard callback on a cluster would be a good idea too.
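
One possible way to feed several consecutive observations to the network is to keep the most recent ones in a fixed-length buffer and concatenate them into a single vector. The class below is a sketch under that assumption; `ObservationStack` and `n_frames` are hypothetical names, and the input layer of the network would have to be resized to `128 * n_frames` to match.

```python
from collections import deque

import numpy as np


class ObservationStack:
    """Keep the last `n_frames` observations and expose them as one vector."""

    def __init__(self, n_frames):
        self.frames = deque(maxlen=n_frames)

    def reset(self, obs):
        # fill the buffer with copies of the first observation
        for _ in range(self.frames.maxlen):
            self.frames.append(np.asarray(obs))
        return self.state()

    def push(self, obs):
        # the oldest observation is dropped automatically
        self.frames.append(np.asarray(obs))
        return self.state()

    def state(self):
        # oldest observation first, newest last
        return np.concatenate(list(self.frames))
```

In a game loop, `stack.reset(env.reset())` would start an episode and `stack.push(obs)` would follow each `env.step(...)`. The returned vector then plays the role of the state passed to the network.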

In [17]:
online_network.save(os.path.join(weights_folder, 'ram_model_4kk.h5f'))