# **Lunar Lander Environment**

**About the problem**

The task is to build an agent that can safely land a lunar rover by controlling its thrusters and main engine

**Import libraries**

In [7]:
import os
# Avoid reinstalling packages that are available on edstem
if not os.getenv("ED_COURSE_ID"):
    !pip install tensorflow gym keras keras-rl2

import sys
sys.path.append('./.local/lib/python3.9-packages')
import gym
import random

**Set up the environment**

In [50]:
# use gym's make method to generate the lunar lander environment
env = gym.make('LunarLander-v2')

# extract states and actions from environment
states = env.observation_space.shape[0]
actions = env.action_space.n

**Testing Random Actions**

In [83]:
# Trigger Ed's X display
#!xdpyinfo

episodes = 10
# Repeat process 10 times
for episode in range(1, episodes+1):
    # Each time, reset the environment
    state = env.reset()
    done = False
    score = 0 
    
    while not done:
        # render the environment so that it remains visible on the screen
        env.render()  
        # use action sample space    
        action = env.action_space.sample()
        # apply the action to the environment and collect feedback
        n_state, reward, done, info = env.step(action) 
        # Add the reward to the cummulative score
        score+=reward 
    # End of loop: print out the maximum score
    print('Episode:{} Score:{}'.format(episode, score))

Episode:1 Score:-106.30324357113311
Episode:2 Score:-247.77462608429434
Episode:3 Score:-477.435013064628
Episode:4 Score:-123.12532936816025
Episode:5 Score:-315.50604237601135
Episode:6 Score:-108.71368564720947
Episode:7 Score:-107.97177062799771
Episode:8 Score:-155.33190309144462
Episode:9 Score:-76.99679799523831
Episode:10 Score:-280.4376710114186


# 2. Create a Deep Learning Model with Keras

In [74]:
# Import dependencies needed for this step from numpy and keras
import numpy as np
from keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam

In [75]:
# Define a function that builds a model so that we can reuse it multiple times
# To build a model, the function needs the available states and actions
def build_model(states, actions):
    model = Sequential()
    model.add(Flatten(input_shape=(1,states)))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(actions, activation='linear'))
    return model

In [76]:
# Create an instance of a model by calling the build_model function
model = build_model(states, actions)
# TODO: Write code to inspect the built model by outputting the summary
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 8)                 0         
                                                                 
 dense (Dense)               (None, 24)                216       
                                                                 
 dense_1 (Dense)             (None, 24)                600       
                                                                 
 dense_2 (Dense)             (None, 4)                 100       
                                                                 
Total params: 916
Trainable params: 916
Non-trainable params: 0
_________________________________________________________________


2022-04-25 08:22:12.963218: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-04-25 08:22:12.964368: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.


# 3. Build Agent with Keras-RL

In [77]:
# Import dependencies to build an agent
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

In [78]:
# Define a function to build a DQN agent given the model and the set of actions
def build_agent(model, actions):
    policy = BoltzmannQPolicy()
    memory = SequentialMemory(limit=50000, window_length=1)
    dqn = DQNAgent(model=model, memory=memory, policy=policy, 
                  nb_actions=actions, nb_steps_warmup=10, target_model_update=1e-2)
    return dqn

Use the DQN agent to train the reinforcement learning model.

Note that this step takes a few minutes. Move to the next steps after the button on the left changes from 'stop' to show that the run is complete. Do not worry about the 'too much output' warning halfways through the run.

**To test this step**, please use the *Run All* button instead of running this section alone.

In [79]:
# Create an instance of an agent 
dqn = build_agent(model, actions)
# Compile the model
dqn.compile(Adam(learning_rate=1e-3), metrics=['mae'])
# Fit the model
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)

AttributeError: 'Sequential' object has no attribute '_compile_time_distribution_strategy'