In [1]:
import random
import gym
import numpy as np
from collections import deque
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
import tensorflow as tf
import pandas as pd
import seaborn as sns
sns.set()

# Deep $Q$ networks (DQN)

In previous notebooks, we have seen how we can use `tensorflow` and automatic differentiation to do tabular $Q$-learning, framed as a regression problem. While this technique is powerful for environments with small (finite) observation spaces $\mathcal{S}$ and action spaces $\mathcal{A}$, we run into problems when the observation space is continuous (or even just large!).

Tabular $Q$-learning is only guaranteed to converge if all state-action pairs are visited infinitely many times. In practice, this means each pair must be visited a very large number of times to get a reasonable approximation of the $Q$-function. However, when the observation space becomes large (for example, with image inputs), we are likely to encounter each state-action pair at most once, so the convergence guarantee no longer helps us.

Instead, we want a technique that can estimate $Q$-values such that similar states produce similar outputs. This would allow us to learn from some state-action pairs, and then generalize to other unseen state-action pairs. By using a **differentiable function approximator**, we get this kind of behaviour. Recall that $Q$-learning is a *regression* problem, meaning any kind of regression model could work - even a linear regression. However, the most popular model used by *deep* reinforcement learning researchers is the *deep* neural network.
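
To make the generalization argument concrete, here is a minimal NumPy sketch of a *linear* $Q$-function approximator. The weights `W`, the states, the learning rate, and the target value are all made-up illustrations, not part of the agent built in this notebook. Fitting the value of one state-action pair also moves the estimate for a nearby state that was never trained on.

```python
import numpy as np

state_dim, num_actions = 4, 2
W = np.zeros((state_dim, num_actions))  # hypothetical linear model: Q(s) = s @ W

def q_values(weights, state):
    """Return one Q-value per action for a single state."""
    return state @ weights

s_seen = np.array([0.1, -0.2, 0.05, 0.3])   # state we train on
s_unseen = s_seen + 0.01                    # similar state we never train on
action, target, lr = 0, 1.0, 1.0            # illustrative regression target

for _ in range(50):
    error = target - q_values(W, s_seen)[action]
    # Gradient step on 0.5 * error**2 with respect to the weights of `action`.
    W[:, action] += lr * error * s_seen

q_seen = q_values(W, s_seen)[action]
q_unseen = q_values(W, s_unseen)[action]
print(q_seen, q_unseen)
```

The two printed values end up close to each other, even though only `s_seen` was ever used in an update. A table indexed by state would have left the entry for `s_unseen` untouched.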

If you are familiar with supervised deep learning, you may know techniques like dropout, batch normalization, and activity regularization. So far, these kinds of techniques have not proven especially useful in reinforcement learning. Instead, small fully-connected neural networks with rectified linear unit (ReLU) activations tend to perform best. ([Pieter Abbeel](https://www.youtube.com/watch?v=l-mYLq6eZPY) notes that linear feedback control can perform well on simple problems, even in complex environments. Fully-connected neural networks with ReLU activations act as multi-step piecewise linear feedback controllers, hence their success.)

When using neural networks, rather than passing many state-action pairs to the network and predicting a scalar $Q(s_t, a_t)$, we pass only the state $s_t$ and produce a vector output $\vec{Q}(s_t)$, where each entry corresponds to the predicted $Q$-value of one action available to the agent. Note that this requires $\mathcal{A}$ to be finite (and generally small), a limitation of DQN that we will overcome later in the section on policy gradients.

![image of deep-q-network mapping single state to multiple outputs](../images/q-network.png)
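
As a rough illustration of the "state in, one $Q$-value per action out" layout, the sketch below runs a tiny ReLU network in plain NumPy. The layer sizes and random weights (`W1`, `W2`) are placeholders chosen for the example, not the network trained later in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, hidden, num_actions = 4, 24, 2

# Hypothetical, untrained weights for a one-hidden-layer network.
W1 = rng.normal(scale=0.5, size=(state_dim, hidden))
W2 = rng.normal(scale=0.5, size=(hidden, num_actions))

def q_network(states):
    """Map a batch of states (B, state_dim) to Q-values (B, num_actions)."""
    hidden_act = np.maximum(states @ W1, 0.0)  # ReLU
    return hidden_act @ W2                     # linear output, one column per action

states = rng.normal(size=(3, state_dim))
q = q_network(states)        # one row of Q-values per state
greedy = q.argmax(axis=1)    # greedy action index for each state
print(q.shape, greedy)
```

A single forward pass yields the values of every action, so picking the greedy action is just an `argmax` over the last axis.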

In this notebook, we make use of `tensorflow`'s `keras` API to build neural networks. We also take advantage of **batched environments** to accelerate data collection. The `keras` API builds neural networks that process inputs in **batches**. This means that if the observation space of a single environment has shape $84 \times 84 \times 3$, then the network expects inputs of shape $B \times 84 \times 84 \times 3$, where $B$ is the number of inputs in the batch.
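
The shape bookkeeping can be checked with a short NumPy sketch. The zero-filled arrays stand in for real observations, and $B = 32$ is an arbitrary choice for the example.

```python
import numpy as np

obs_shape = (84, 84, 3)
B = 32

# One observation per environment, stacked along a new leading batch axis.
observations = [np.zeros(obs_shape) for _ in range(B)]
batch = np.stack(observations)

# A single observation needs an explicit batch axis of length 1.
single = np.zeros(obs_shape)
single_as_batch = single[np.newaxis, ...]

print(batch.shape, single_as_batch.shape)
```

With $B$ environments stepping in lockstep, the stacked observations already have the leading batch axis the network expects.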

When running the tabular $Q$-learning agent in `tensorflow` in the previous notebook, runtime was considerably slower than for the simple NumPy-based agent. Evaluating and updating the policy dominated the time for a single step/update of the agent, compared to the time spent simulating a step in the environment. Ideally, the time spent should be 50% policy evaluation and 50% environment stepping. By using batched environments, we can even this out. Furthermore, we get more data per unit of wall-clock time, which will accelerate learning. This differs from most tutorials that use `keras`, where a single environment is used and its inputs are reshaped so that `keras` treats them like a batch.

Increasing the number of environments can stabilize training by diversifying the collected data over time.

In [2]:
class ReplayBuffer:
    def __init__(self, size=1000000):
        self.memory = deque(maxlen=size)
        
    def remember(self, s_t, a_t, r_t, s_t_next, d_t):
        self.memory.append((s_t, a_t, r_t, s_t_next, d_t))
        
    def sample(self, num=32):
        num = min(num, len(self.memory))
        return random.sample(self.memory, num)
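
The following sketch exercises the buffer on toy data to show two behaviours that are easy to miss: the `deque` silently evicts the oldest entries once `maxlen` is reached, and `sample` never returns more items than are stored. The class is repeated so the snippet runs on its own; the integer "states" are placeholders.

```python
import random
from collections import deque

class ReplayBuffer:
    """Same logic as the cell above, repeated so this snippet is self-contained."""
    def __init__(self, size=1000000):
        self.memory = deque(maxlen=size)

    def remember(self, s_t, a_t, r_t, s_t_next, d_t):
        self.memory.append((s_t, a_t, r_t, s_t_next, d_t))

    def sample(self, num=32):
        num = min(num, len(self.memory))
        return random.sample(self.memory, num)

buffer = ReplayBuffer(size=3)
for step in range(5):
    # Toy transition: the "state" is just the step index.
    buffer.remember(step, 0, 1.0, step + 1, False)

stored_states = [transition[0] for transition in buffer.memory]
batch = buffer.sample(num=10)   # asks for 10, but only 3 are available
print(stored_states, len(batch))
```

Only the three most recent transitions survive, and the oversized request is capped at the buffer's current length.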

In [3]:
class Agent:
    def __init__(self, state_shape, num_actions, num_envs, alpha=0.001, gamma=0.95, epsilon_i=1.0, epsilon_f=0.01, n_epsilon=0.1, hidden_sizes = []):
        self.epsilon_i = epsilon_i
        self.epsilon_f = epsilon_f
        self.n_epsilon = n_epsilon
        self.epsilon = epsilon_i
        self.gamma = gamma

        self.num_actions = num_actions
        self.num_envs = num_envs

        self.Q = Sequential()
        for size in hidden_sizes:
            # use_bias must be a boolean: the string 'false' is truthy and would enable the bias.
            self.Q.add(Dense(size, activation='relu', use_bias=False, kernel_initializer='he_uniform', dtype='float64'))
        self.Q.add(Dense(self.num_actions, activation="linear", use_bias=False, kernel_initializer='zeros', dtype='float64'))
#         self.optimizer = tf.keras.optimizers.SGD(alpha)
        self.optimizer = tf.keras.optimizers.Adam(alpha)        

    def act(self, s_t):
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.num_actions, size=self.num_envs)
        return np.argmax(self.Q(s_t), axis=1)
    
    def decay_epsilon(self, n):
        self.epsilon = max(
            self.epsilon_f, 
            self.epsilon_i - (n/self.n_epsilon)*(self.epsilon_i - self.epsilon_f))

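    def update(self, s_t, a_t, r_t, s_t_next, d_t):
        with tf.GradientTape() as tape:
            # Bootstrapped value of the next state; stop_gradient treats the target as a constant.
            Q_next = tf.stop_gradient(tf.reduce_max(self.Q(s_t_next), axis=1))
            # Select Q(s_t, a_t) for the action actually taken, using a one-hot mask.
            Q_pred = tf.reduce_sum(self.Q(s_t)*tf.one_hot(a_t, self.num_actions, dtype=tf.float64), axis=1)
            # Mean squared TD error; (1-d_t) removes the bootstrap term when the episode has ended.
            loss = tf.reduce_mean(0.5*(r_t + (1-d_t)*self.gamma*Q_next - Q_pred)**2)
        grads = tape.gradient(loss, self.Q.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.Q.trainable_variables))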
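
The loss in `update` can be traced by hand on a two-environment batch. The NumPy sketch below uses made-up $Q$-values in place of network outputs and mirrors the same three steps: bootstrap target, one-hot selection of $Q(s_t, a_t)$, and the mean squared TD error.

```python
import numpy as np

gamma = 0.95
num_actions = 2

# Made-up network outputs for a batch of two environments.
q_next = np.array([[1.0, 2.0], [0.5, 0.0]])   # stands in for Q(s_{t+1})
q_curr = np.array([[0.2, 0.4], [1.0, 3.0]])   # stands in for Q(s_t)
a_t = np.array([1, 0])
r_t = np.array([1.0, 1.0])
d_t = np.array([False, True])                 # second environment just terminated

not_done = 1.0 - d_t.astype(np.float64)
target = r_t + not_done * gamma * q_next.max(axis=1)   # no bootstrap when done
one_hot = np.eye(num_actions)[a_t]
q_pred = (q_curr * one_hot).sum(axis=1)                 # Q-value of the action taken
loss = np.mean(0.5 * (target - q_pred) ** 2)

print(target, q_pred, loss)
```

The first environment bootstraps from the best next action ($1 + 0.95 \cdot 2 = 2.9$), while the terminated one uses the reward alone, so its TD error here is zero.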

In [4]:
class DiscreteToBoxWrapper(gym.ObservationWrapper):
    def __init__(self, env):
        super().__init__(env)
        assert isinstance(env.observation_space, gym.spaces.Discrete), \
            "Should only be used to wrap Discrete envs."
        self.n = self.observation_space.n
        self.observation_space = gym.spaces.Box(0, 1, (self.n,))
    
    def observation(self, obs):
        new_obs = np.zeros(self.n)
        new_obs[obs] = 1
        return new_obs
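
The `observation` method is a one-hot encoding. Stripped of the `gym` wrapper machinery, the transformation looks like the sketch below, where the helper name `one_hot` and the space size `n = 5` are chosen only for illustration.

```python
import numpy as np

def one_hot(obs, n):
    """Encode the integer observation `obs` from a Discrete(n) space as a length-n vector."""
    encoded = np.zeros(n)
    encoded[obs] = 1
    return encoded

encoded = one_hot(2, 5)
print(encoded)
```

Each discrete state becomes a vector the network can consume. With a single bias-free linear layer on top, this reduces to a lookup table with one row of weights per state.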

In [5]:
class VectorizedEnvWrapper(gym.Wrapper):
    def __init__(self, make_env, num_envs=1):
        super().__init__(make_env())
        self.num_envs = num_envs
        self.envs = [make_env() for env_index in range(num_envs)]
    
    def reset(self):
        return np.asarray([env.reset() for env in self.envs])
    
    def reset_at(self, env_index):
        return self.envs[env_index].reset()
    
    def step(self, actions):
        next_states, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            next_state, reward, done, info = env.step(action)
            next_states.append(next_state)
            rewards.append(reward)
            dones.append(done)
            infos.append(info)
        return np.asarray(next_states), np.asarray(rewards), \
            np.asarray(dones), np.asarray(infos)
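
To see what the batched `step` returns without installing an environment, the sketch below steps four copies of a toy environment in lockstep. `CountingEnv` is a hypothetical stand-in that follows the same `reset`/`step` convention as the environments used here (a 4-tuple from `step`), and is not part of `gym`.

```python
import numpy as np

class CountingEnv:
    """Hypothetical toy env: the state counts steps and the episode ends after `horizon` steps."""
    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return np.array([0.0])

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return np.array([float(self.t)]), 1.0, done, {}

envs = [CountingEnv() for _ in range(4)]
states = np.asarray([env.reset() for env in envs])

# Step every environment once and regroup the per-env tuples into per-field arrays.
results = [env.step(action) for env, action in zip(envs, [0, 1, 0, 1])]
next_states, rewards, dones, infos = (np.asarray(field) for field in zip(*results))

print(states.shape, next_states.shape, rewards.shape, dones.shape)
```

Every returned array has the number of environments as its leading axis, which is what lets the agent treat one step of all environments as a single batch.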

In [6]:
def plot(data, window=100):
    sns.lineplot(
        data=data.rolling(window=window).mean()[window-1::window]
    )
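
The slicing in `plot` keeps one rolling mean per window, which amounts to averaging over non-overlapping blocks of episodes. A small sketch on made-up rewards, with a window of 3 picked for readability:

```python
import pandas as pd

rewards = list(range(9))      # made-up episode returns: 0, 1, ..., 8
data = pd.DataFrame(rewards)
window = 3

# Rolling mean, then keep every `window`-th row starting at the first full window.
smoothed = data.rolling(window=window).mean()[window-1::window]
block_means = smoothed[0].tolist()
print(block_means)
```

The kept rows are the means of episodes 0 to 2, 3 to 5, and 6 to 8, so the plotted curve has one point per block rather than one per episode.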

In [7]:
def train(env_name, T=20000, num_envs=32, batch_size=32, hidden_sizes=[24, 24], alpha=0.001, gamma=0.95):
    env = VectorizedEnvWrapper(lambda: gym.make(env_name), num_envs)
    state_shape = env.observation_space.shape
    num_actions = env.action_space.n
    agent = Agent(state_shape, num_actions, num_envs, alpha=alpha, hidden_sizes=hidden_sizes, gamma=gamma)
    rewards = []
    buffer = ReplayBuffer()
    episode_rewards = np.zeros(num_envs)  # running return for each environment
    s_t = env.reset()
    for t in range(T):
        a_t = agent.act(s_t)
        s_t_next, r_t, d_t, info = env.step(a_t)
        buffer.remember(s_t, a_t, r_t, s_t_next, d_t)
        s_t = s_t_next
        for batch in buffer.sample(batch_size):
            agent.update(*batch)
        agent.decay_epsilon(t/T)
        episode_rewards += r_t

        for i in range(env.num_envs):
            if d_t[i]:
                print(f'{100*t/T:2.1f}% \t eps={agent.epsilon:1.2f} \t {episode_rewards[i]}')
                rewards.append(episode_rewards[i])
                episode_rewards[i] = 0
                s_t[i] = env.reset_at(i)
            
    plot(pd.DataFrame(rewards), window=10)
    return agent
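
Because `train` calls `agent.decay_epsilon(t/T)` and the agent's default is `n_epsilon=0.1`, exploration is annealed linearly over the first 10% of training and then held at `epsilon_f`. The standalone function below (the name `epsilon_at` is made up for this sketch) reproduces the formula from `decay_epsilon` so the schedule can be inspected directly.

```python
def epsilon_at(progress, epsilon_i=1.0, epsilon_f=0.01, n_epsilon=0.1):
    """Epsilon after a fraction `progress` (between 0 and 1) of training has elapsed."""
    return max(epsilon_f, epsilon_i - (progress / n_epsilon) * (epsilon_i - epsilon_f))

for progress in (0.0, 0.012, 0.05, 0.5):
    print(f"{100 * progress:4.1f}%  eps={epsilon_at(progress):.2f}")
```

Halfway through the decay window (5% of training) epsilon is about 0.5, and from 10% onward it stays at 0.01.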

In [10]:
train("CartPole-v0", T=20000, num_envs=32, batch_size=1)

0.0% 	 eps=1.00 	 9.0
0.0% 	 eps=1.00 	 10.0
0.0% 	 eps=1.00 	 10.0
0.1% 	 eps=1.00 	 11.0
0.1% 	 eps=1.00 	 11.0
0.1% 	 eps=0.99 	 12.0
0.1% 	 eps=0.99 	 12.0
0.1% 	 eps=0.99 	 13.0
0.1% 	 eps=0.99 	 13.0
0.1% 	 eps=0.99 	 14.0
0.1% 	 eps=0.99 	 15.0
0.1% 	 eps=0.99 	 15.0
0.1% 	 eps=0.99 	 16.0
0.1% 	 eps=0.99 	 17.0
0.1% 	 eps=0.99 	 17.0
0.1% 	 eps=0.99 	 17.0
0.1% 	 eps=0.99 	 18.0
0.1% 	 eps=0.99 	 19.0
0.1% 	 eps=0.99 	 19.0
0.1% 	 eps=0.99 	 20.0
0.1% 	 eps=0.99 	 20.0
0.1% 	 eps=0.99 	 20.0
0.1% 	 eps=0.99 	 22.0
0.1% 	 eps=0.99 	 22.0
0.1% 	 eps=0.99 	 23.0
0.1% 	 eps=0.99 	 23.0
0.1% 	 eps=0.99 	 24.0
0.1% 	 eps=0.99 	 25.0
0.1% 	 eps=0.99 	 11.0
0.1% 	 eps=0.99 	 17.0
0.1% 	 eps=0.99 	 27.0
0.1% 	 eps=0.99 	 27.0
0.1% 	 eps=0.99 	 16.0
0.1% 	 eps=0.99 	 17.0
0.1% 	 eps=0.99 	 13.0
0.1% 	 eps=0.99 	 11.0
0.1% 	 eps=0.99 	 17.0
0.1% 	 eps=0.99 	 14.0
0.1% 	 eps=0.99 	 31.0
0.2% 	 eps=0.98 	 12.0
0.2% 	 eps=0.98 	 15.0
0.2% 	 eps=0.98 	 14.0
0.2% 	 eps=0.98 	 22.0
0.2% 	 eps=0

1.1% 	 eps=0.89 	 17.0
1.1% 	 eps=0.89 	 27.0
1.1% 	 eps=0.89 	 12.0
1.2% 	 eps=0.89 	 11.0
1.2% 	 eps=0.89 	 15.0
1.2% 	 eps=0.89 	 12.0
1.2% 	 eps=0.89 	 11.0
1.2% 	 eps=0.89 	 13.0
1.2% 	 eps=0.88 	 20.0
1.2% 	 eps=0.88 	 17.0
1.2% 	 eps=0.88 	 10.0
1.2% 	 eps=0.88 	 12.0
1.2% 	 eps=0.88 	 12.0
1.2% 	 eps=0.88 	 26.0
1.2% 	 eps=0.88 	 12.0
1.2% 	 eps=0.88 	 11.0
1.2% 	 eps=0.88 	 10.0
1.2% 	 eps=0.88 	 42.0
1.2% 	 eps=0.88 	 12.0
1.2% 	 eps=0.88 	 17.0
1.2% 	 eps=0.88 	 24.0
1.2% 	 eps=0.88 	 20.0
1.2% 	 eps=0.88 	 25.0
1.2% 	 eps=0.88 	 23.0
1.2% 	 eps=0.88 	 31.0
1.2% 	 eps=0.88 	 9.0
1.2% 	 eps=0.88 	 12.0
1.2% 	 eps=0.88 	 11.0
1.2% 	 eps=0.88 	 17.0
1.2% 	 eps=0.88 	 15.0
1.2% 	 eps=0.88 	 12.0
1.2% 	 eps=0.88 	 11.0
1.2% 	 eps=0.88 	 12.0
1.2% 	 eps=0.88 	 10.0
1.2% 	 eps=0.88 	 10.0
1.2% 	 eps=0.88 	 13.0
1.2% 	 eps=0.88 	 12.0
1.2% 	 eps=0.88 	 39.0
1.2% 	 eps=0.88 	 11.0
1.2% 	 eps=0.88 	 14.0
1.2% 	 eps=0.88 	 10.0
1.2% 	 eps=0.88 	 14.0
1.2% 	 eps=0.88 	 16.0
1.2% 	 eps=0

2.2% 	 eps=0.79 	 16.0
2.2% 	 eps=0.79 	 17.0
2.2% 	 eps=0.79 	 9.0
2.2% 	 eps=0.78 	 9.0
2.2% 	 eps=0.78 	 17.0
2.2% 	 eps=0.78 	 22.0
2.2% 	 eps=0.78 	 14.0
2.2% 	 eps=0.78 	 18.0
2.2% 	 eps=0.78 	 21.0
2.2% 	 eps=0.78 	 17.0
2.2% 	 eps=0.78 	 15.0
2.2% 	 eps=0.78 	 15.0
2.2% 	 eps=0.78 	 27.0
2.2% 	 eps=0.78 	 14.0
2.2% 	 eps=0.78 	 16.0
2.2% 	 eps=0.78 	 17.0
2.2% 	 eps=0.78 	 15.0
2.2% 	 eps=0.78 	 15.0
2.2% 	 eps=0.78 	 14.0
2.2% 	 eps=0.78 	 15.0
2.2% 	 eps=0.78 	 21.0
2.2% 	 eps=0.78 	 14.0
2.2% 	 eps=0.78 	 14.0
2.2% 	 eps=0.78 	 18.0
2.2% 	 eps=0.78 	 30.0
2.2% 	 eps=0.78 	 11.0
2.2% 	 eps=0.78 	 17.0
2.2% 	 eps=0.78 	 19.0
2.2% 	 eps=0.78 	 12.0
2.2% 	 eps=0.78 	 16.0
2.2% 	 eps=0.78 	 23.0
2.2% 	 eps=0.78 	 19.0
2.3% 	 eps=0.78 	 16.0
2.3% 	 eps=0.78 	 20.0
2.3% 	 eps=0.78 	 9.0
2.3% 	 eps=0.78 	 11.0
2.3% 	 eps=0.78 	 14.0
2.3% 	 eps=0.78 	 25.0
2.3% 	 eps=0.78 	 11.0
2.3% 	 eps=0.77 	 11.0
2.3% 	 eps=0.77 	 23.0
2.3% 	 eps=0.77 	 20.0
2.3% 	 eps=0.77 	 25.0
2.3% 	 eps=0.7

3.2% 	 eps=0.69 	 17.0
3.2% 	 eps=0.69 	 21.0
3.2% 	 eps=0.69 	 16.0
3.2% 	 eps=0.69 	 19.0
3.2% 	 eps=0.69 	 17.0
3.2% 	 eps=0.69 	 13.0
3.2% 	 eps=0.68 	 30.0
3.2% 	 eps=0.68 	 13.0
3.2% 	 eps=0.68 	 19.0
3.2% 	 eps=0.68 	 9.0
3.2% 	 eps=0.68 	 38.0
3.2% 	 eps=0.68 	 16.0
3.2% 	 eps=0.68 	 10.0
3.2% 	 eps=0.68 	 12.0
3.2% 	 eps=0.68 	 16.0
3.2% 	 eps=0.68 	 17.0
3.2% 	 eps=0.68 	 9.0
3.2% 	 eps=0.68 	 16.0
3.2% 	 eps=0.68 	 15.0
3.2% 	 eps=0.68 	 26.0
3.2% 	 eps=0.68 	 12.0
3.2% 	 eps=0.68 	 12.0
3.2% 	 eps=0.68 	 17.0
3.3% 	 eps=0.68 	 20.0
3.3% 	 eps=0.68 	 14.0
3.3% 	 eps=0.68 	 9.0
3.3% 	 eps=0.68 	 15.0
3.3% 	 eps=0.68 	 24.0
3.3% 	 eps=0.68 	 13.0
3.3% 	 eps=0.68 	 21.0
3.3% 	 eps=0.68 	 12.0
3.3% 	 eps=0.68 	 28.0
3.3% 	 eps=0.68 	 26.0
3.3% 	 eps=0.68 	 23.0
3.3% 	 eps=0.67 	 25.0
3.3% 	 eps=0.67 	 20.0
3.3% 	 eps=0.67 	 20.0
3.3% 	 eps=0.67 	 24.0
3.3% 	 eps=0.67 	 15.0
3.3% 	 eps=0.67 	 48.0
3.3% 	 eps=0.67 	 37.0
3.3% 	 eps=0.67 	 13.0
3.3% 	 eps=0.67 	 16.0
3.3% 	 eps=0.6

4.1% 	 eps=0.59 	 16.0
4.1% 	 eps=0.59 	 10.0
4.1% 	 eps=0.59 	 17.0
4.1% 	 eps=0.59 	 11.0
4.1% 	 eps=0.59 	 11.0
4.1% 	 eps=0.59 	 17.0
4.1% 	 eps=0.59 	 16.0
4.1% 	 eps=0.59 	 12.0
4.2% 	 eps=0.59 	 10.0
4.2% 	 eps=0.59 	 9.0
4.2% 	 eps=0.59 	 12.0
4.2% 	 eps=0.59 	 14.0
4.2% 	 eps=0.59 	 17.0
4.2% 	 eps=0.59 	 13.0
4.2% 	 eps=0.59 	 24.0
4.2% 	 eps=0.59 	 12.0
4.2% 	 eps=0.59 	 13.0
4.2% 	 eps=0.59 	 14.0
4.2% 	 eps=0.59 	 21.0
4.2% 	 eps=0.59 	 16.0
4.2% 	 eps=0.59 	 29.0
4.2% 	 eps=0.59 	 16.0
4.2% 	 eps=0.59 	 21.0
4.2% 	 eps=0.59 	 15.0
4.2% 	 eps=0.58 	 18.0
4.2% 	 eps=0.58 	 13.0
4.2% 	 eps=0.58 	 9.0
4.2% 	 eps=0.58 	 19.0
4.2% 	 eps=0.58 	 14.0
4.2% 	 eps=0.58 	 14.0
4.2% 	 eps=0.58 	 20.0
4.2% 	 eps=0.58 	 11.0
4.2% 	 eps=0.58 	 15.0
4.2% 	 eps=0.58 	 12.0
4.2% 	 eps=0.58 	 22.0
4.2% 	 eps=0.58 	 15.0
4.2% 	 eps=0.58 	 14.0
4.2% 	 eps=0.58 	 12.0
4.2% 	 eps=0.58 	 21.0
4.2% 	 eps=0.58 	 13.0
4.2% 	 eps=0.58 	 9.0
4.2% 	 eps=0.58 	 15.0
4.2% 	 eps=0.58 	 10.0
4.2% 	 eps=0.5

4.9% 	 eps=0.51 	 10.0
4.9% 	 eps=0.51 	 10.0
4.9% 	 eps=0.51 	 23.0
4.9% 	 eps=0.51 	 9.0
4.9% 	 eps=0.51 	 23.0
4.9% 	 eps=0.51 	 12.0
4.9% 	 eps=0.51 	 17.0
4.9% 	 eps=0.51 	 13.0
4.9% 	 eps=0.51 	 24.0
4.9% 	 eps=0.51 	 11.0
5.0% 	 eps=0.51 	 12.0
5.0% 	 eps=0.51 	 11.0
5.0% 	 eps=0.51 	 21.0
5.0% 	 eps=0.51 	 14.0
5.0% 	 eps=0.51 	 10.0
5.0% 	 eps=0.51 	 10.0
5.0% 	 eps=0.51 	 13.0
5.0% 	 eps=0.51 	 11.0
5.0% 	 eps=0.51 	 12.0
5.0% 	 eps=0.51 	 12.0
5.0% 	 eps=0.51 	 14.0
5.0% 	 eps=0.51 	 14.0
5.0% 	 eps=0.51 	 13.0
5.0% 	 eps=0.51 	 12.0
5.0% 	 eps=0.51 	 12.0
5.0% 	 eps=0.51 	 14.0
5.0% 	 eps=0.51 	 10.0
5.0% 	 eps=0.51 	 18.0
5.0% 	 eps=0.51 	 12.0
5.0% 	 eps=0.51 	 10.0
5.0% 	 eps=0.50 	 18.0
5.0% 	 eps=0.50 	 12.0
5.0% 	 eps=0.50 	 13.0
5.0% 	 eps=0.50 	 15.0
5.0% 	 eps=0.50 	 9.0
5.0% 	 eps=0.50 	 21.0
5.0% 	 eps=0.50 	 9.0
5.0% 	 eps=0.50 	 14.0
5.0% 	 eps=0.50 	 14.0
5.0% 	 eps=0.50 	 20.0
5.0% 	 eps=0.50 	 15.0
5.0% 	 eps=0.50 	 17.0
5.0% 	 eps=0.50 	 22.0
5.0% 	 eps=0.5

5.7% 	 eps=0.43 	 12.0
5.7% 	 eps=0.43 	 9.0
5.7% 	 eps=0.43 	 9.0
5.7% 	 eps=0.43 	 10.0
5.7% 	 eps=0.43 	 9.0
5.7% 	 eps=0.43 	 18.0
5.7% 	 eps=0.43 	 11.0
5.7% 	 eps=0.43 	 14.0
5.7% 	 eps=0.43 	 16.0
5.7% 	 eps=0.43 	 15.0
5.7% 	 eps=0.43 	 11.0
5.7% 	 eps=0.43 	 19.0
5.7% 	 eps=0.43 	 8.0
5.7% 	 eps=0.43 	 11.0
5.7% 	 eps=0.43 	 11.0
5.7% 	 eps=0.43 	 11.0
5.7% 	 eps=0.43 	 14.0
5.8% 	 eps=0.43 	 14.0
5.8% 	 eps=0.43 	 10.0
5.8% 	 eps=0.43 	 22.0
5.8% 	 eps=0.43 	 10.0
5.8% 	 eps=0.43 	 19.0
5.8% 	 eps=0.43 	 17.0
5.8% 	 eps=0.43 	 12.0
5.8% 	 eps=0.43 	 14.0
5.8% 	 eps=0.43 	 16.0
5.8% 	 eps=0.43 	 13.0
5.8% 	 eps=0.43 	 9.0
5.8% 	 eps=0.43 	 11.0
5.8% 	 eps=0.43 	 11.0
5.8% 	 eps=0.43 	 8.0
5.8% 	 eps=0.43 	 15.0
5.8% 	 eps=0.43 	 15.0
5.8% 	 eps=0.43 	 16.0
5.8% 	 eps=0.43 	 13.0
5.8% 	 eps=0.43 	 10.0
5.8% 	 eps=0.43 	 8.0
5.8% 	 eps=0.43 	 13.0
5.8% 	 eps=0.43 	 9.0
5.8% 	 eps=0.43 	 12.0
5.8% 	 eps=0.43 	 12.0
5.8% 	 eps=0.43 	 14.0
5.8% 	 eps=0.43 	 18.0
5.8% 	 eps=0.43 	 1

6.4% 	 eps=0.37 	 16.0
6.4% 	 eps=0.37 	 17.0
6.4% 	 eps=0.37 	 19.0
6.4% 	 eps=0.37 	 15.0
6.4% 	 eps=0.37 	 31.0
6.4% 	 eps=0.37 	 18.0
6.4% 	 eps=0.37 	 16.0
6.4% 	 eps=0.37 	 14.0
6.4% 	 eps=0.37 	 19.0
6.4% 	 eps=0.37 	 17.0
6.4% 	 eps=0.36 	 17.0
6.4% 	 eps=0.36 	 19.0
6.4% 	 eps=0.36 	 16.0
6.4% 	 eps=0.36 	 23.0
6.4% 	 eps=0.36 	 21.0
6.4% 	 eps=0.36 	 21.0
6.4% 	 eps=0.36 	 26.0
6.5% 	 eps=0.36 	 25.0
6.5% 	 eps=0.36 	 21.0
6.5% 	 eps=0.36 	 30.0
6.5% 	 eps=0.36 	 22.0
6.5% 	 eps=0.36 	 16.0
6.5% 	 eps=0.36 	 16.0
6.5% 	 eps=0.36 	 18.0
6.5% 	 eps=0.36 	 30.0
6.5% 	 eps=0.36 	 29.0
6.5% 	 eps=0.36 	 31.0
6.5% 	 eps=0.35 	 19.0
6.5% 	 eps=0.35 	 37.0
6.6% 	 eps=0.35 	 25.0
6.6% 	 eps=0.35 	 28.0
6.6% 	 eps=0.35 	 31.0
6.6% 	 eps=0.35 	 37.0
6.6% 	 eps=0.35 	 40.0
6.6% 	 eps=0.35 	 37.0
6.6% 	 eps=0.35 	 17.0
6.6% 	 eps=0.35 	 40.0
6.6% 	 eps=0.35 	 45.0
6.6% 	 eps=0.35 	 15.0
6.6% 	 eps=0.34 	 47.0
6.7% 	 eps=0.34 	 57.0
6.7% 	 eps=0.34 	 53.0
6.7% 	 eps=0.34 	 37.0
6.7% 	 eps=

9.6% 	 eps=0.05 	 138.0
9.6% 	 eps=0.05 	 43.0
9.6% 	 eps=0.05 	 37.0
9.6% 	 eps=0.05 	 44.0
9.6% 	 eps=0.05 	 43.0
9.6% 	 eps=0.05 	 58.0
9.6% 	 eps=0.05 	 38.0
9.6% 	 eps=0.05 	 76.0
9.7% 	 eps=0.04 	 68.0
9.7% 	 eps=0.04 	 66.0
9.7% 	 eps=0.04 	 52.0
9.7% 	 eps=0.04 	 53.0
9.7% 	 eps=0.04 	 59.0
9.7% 	 eps=0.04 	 72.0
9.7% 	 eps=0.04 	 44.0
9.7% 	 eps=0.04 	 45.0
9.7% 	 eps=0.04 	 51.0
9.7% 	 eps=0.04 	 60.0
9.7% 	 eps=0.04 	 55.0
9.8% 	 eps=0.03 	 66.0
9.8% 	 eps=0.03 	 49.0
9.8% 	 eps=0.03 	 63.0
9.8% 	 eps=0.03 	 84.0
9.8% 	 eps=0.03 	 43.0
9.8% 	 eps=0.03 	 91.0
9.8% 	 eps=0.03 	 59.0
9.8% 	 eps=0.03 	 49.0
9.8% 	 eps=0.03 	 43.0
9.8% 	 eps=0.03 	 93.0
9.8% 	 eps=0.03 	 49.0
9.8% 	 eps=0.02 	 62.0
9.8% 	 eps=0.02 	 50.0
9.9% 	 eps=0.02 	 51.0
9.9% 	 eps=0.02 	 91.0
9.9% 	 eps=0.02 	 57.0
9.9% 	 eps=0.02 	 45.0
9.9% 	 eps=0.02 	 46.0
9.9% 	 eps=0.02 	 45.0
9.9% 	 eps=0.02 	 71.0
9.9% 	 eps=0.02 	 71.0
9.9% 	 eps=0.02 	 57.0
9.9% 	 eps=0.02 	 35.0
9.9% 	 eps=0.02 	 36.0
9.9% 	 eps

12.9% 	 eps=0.01 	 47.0
12.9% 	 eps=0.01 	 57.0
12.9% 	 eps=0.01 	 53.0
13.0% 	 eps=0.01 	 61.0
13.0% 	 eps=0.01 	 57.0
13.0% 	 eps=0.01 	 59.0
13.0% 	 eps=0.01 	 58.0
13.0% 	 eps=0.01 	 42.0
13.0% 	 eps=0.01 	 37.0
13.0% 	 eps=0.01 	 55.0
13.0% 	 eps=0.01 	 40.0
13.0% 	 eps=0.01 	 35.0
13.0% 	 eps=0.01 	 59.0
13.0% 	 eps=0.01 	 40.0
13.0% 	 eps=0.01 	 68.0
13.0% 	 eps=0.01 	 85.0
13.1% 	 eps=0.01 	 60.0
13.1% 	 eps=0.01 	 74.0
13.1% 	 eps=0.01 	 74.0
13.1% 	 eps=0.01 	 63.0
13.1% 	 eps=0.01 	 62.0
13.1% 	 eps=0.01 	 51.0
13.1% 	 eps=0.01 	 79.0
13.2% 	 eps=0.01 	 107.0
13.2% 	 eps=0.01 	 73.0
13.2% 	 eps=0.01 	 57.0
13.2% 	 eps=0.01 	 65.0
13.2% 	 eps=0.01 	 52.0
13.2% 	 eps=0.01 	 112.0
13.2% 	 eps=0.01 	 82.0
13.2% 	 eps=0.01 	 60.0
13.3% 	 eps=0.01 	 51.0
13.3% 	 eps=0.01 	 68.0
13.3% 	 eps=0.01 	 39.0
13.3% 	 eps=0.01 	 42.0
13.3% 	 eps=0.01 	 44.0
13.3% 	 eps=0.01 	 61.0
13.3% 	 eps=0.01 	 61.0
13.3% 	 eps=0.01 	 54.0
13.3% 	 eps=0.01 	 74.0
13.3% 	 eps=0.01 	 61.0
13.3% 	 eps=0.

16.9% 	 eps=0.01 	 87.0
16.9% 	 eps=0.01 	 80.0
16.9% 	 eps=0.01 	 80.0
16.9% 	 eps=0.01 	 87.0
16.9% 	 eps=0.01 	 91.0
16.9% 	 eps=0.01 	 89.0
16.9% 	 eps=0.01 	 90.0
16.9% 	 eps=0.01 	 56.0
16.9% 	 eps=0.01 	 60.0
16.9% 	 eps=0.01 	 68.0
17.0% 	 eps=0.01 	 97.0
17.0% 	 eps=0.01 	 83.0
17.0% 	 eps=0.01 	 75.0
17.0% 	 eps=0.01 	 106.0
17.0% 	 eps=0.01 	 90.0
17.0% 	 eps=0.01 	 115.0
17.0% 	 eps=0.01 	 153.0
17.1% 	 eps=0.01 	 51.0
17.1% 	 eps=0.01 	 149.0
17.1% 	 eps=0.01 	 58.0
17.1% 	 eps=0.01 	 124.0
17.1% 	 eps=0.01 	 108.0
17.1% 	 eps=0.01 	 66.0
17.1% 	 eps=0.01 	 121.0
17.1% 	 eps=0.01 	 64.0
17.1% 	 eps=0.01 	 74.0
17.2% 	 eps=0.01 	 57.0
17.2% 	 eps=0.01 	 66.0
17.2% 	 eps=0.01 	 58.0
17.2% 	 eps=0.01 	 151.0
17.2% 	 eps=0.01 	 53.0
17.2% 	 eps=0.01 	 58.0
17.2% 	 eps=0.01 	 70.0
17.2% 	 eps=0.01 	 93.0
17.2% 	 eps=0.01 	 87.0
17.2% 	 eps=0.01 	 57.0
17.3% 	 eps=0.01 	 81.0
17.3% 	 eps=0.01 	 80.0
17.3% 	 eps=0.01 	 74.0
17.3% 	 eps=0.01 	 63.0
17.3% 	 eps=0.01 	 53.0
17.3% 	 

23.1% 	 eps=0.01 	 167.0
23.1% 	 eps=0.01 	 145.0
23.1% 	 eps=0.01 	 154.0
23.1% 	 eps=0.01 	 144.0
23.1% 	 eps=0.01 	 182.0
23.2% 	 eps=0.01 	 173.0
23.2% 	 eps=0.01 	 148.0
23.2% 	 eps=0.01 	 168.0
23.3% 	 eps=0.01 	 168.0
23.3% 	 eps=0.01 	 159.0
23.3% 	 eps=0.01 	 160.0
23.4% 	 eps=0.01 	 160.0
23.5% 	 eps=0.01 	 171.0
23.5% 	 eps=0.01 	 200.0
23.5% 	 eps=0.01 	 172.0
23.5% 	 eps=0.01 	 200.0
23.6% 	 eps=0.01 	 184.0
23.6% 	 eps=0.01 	 184.0
23.6% 	 eps=0.01 	 185.0
23.6% 	 eps=0.01 	 165.0
23.7% 	 eps=0.01 	 161.0
23.7% 	 eps=0.01 	 172.0
23.8% 	 eps=0.01 	 167.0
23.8% 	 eps=0.01 	 166.0
23.8% 	 eps=0.01 	 200.0
23.9% 	 eps=0.01 	 200.0
23.9% 	 eps=0.01 	 200.0
23.9% 	 eps=0.01 	 188.0
23.9% 	 eps=0.01 	 182.0
23.9% 	 eps=0.01 	 199.0
23.9% 	 eps=0.01 	 194.0
24.0% 	 eps=0.01 	 174.0
24.0% 	 eps=0.01 	 200.0
24.0% 	 eps=0.01 	 190.0
24.0% 	 eps=0.01 	 163.0
24.1% 	 eps=0.01 	 184.0
24.1% 	 eps=0.01 	 200.0
24.1% 	 eps=0.01 	 179.0
24.1% 	 eps=0.01 	 168.0
24.1% 	 eps=0.01 	 200.0


KeyboardInterrupt: 