# Tensorforce

### Why Tensorforce?
We've seen how to create a model using TensorFlow (Keras). That works well, but it can be done more easily.
This is where Tensorforce comes in. Tensorforce is an open-source deep reinforcement learning framework with an emphasis on modular, flexible library design and straightforward usability for applications in research and practice. It is built on top of Google’s TensorFlow framework (version 2.0) and is compatible with Python 3. Models created with Tensorforce can be used from any language for which a TensorFlow API is available.

In [1]:
# Imports
import os
import logging

import tensorflow as tf

from tensorforce.agents import Agent
from tensorforce.environments import Environment
from tensorforce.execution import Runner

import numpy as np

In [2]:
# Set logging settings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
logger = tf.get_logger()
logger.setLevel(logging.ERROR)

### Assignment
Use the Tensorforce library to create a model similar to the one from the previous exercise, using the CartPole-v1 environment from OpenAI Gym. All required imports are already in place.  
Documentation for creating the agent and environment:
https://tensorforce.readthedocs.io/en/0.5.4/basics/getting-started.html

In [4]:
# Create an OpenAI-Gym environment using the imported `Environment` class from Tensorforce.
environment = Environment.create(environment='gym', level='CartPole-v1')

In [5]:
# Create a PPO (Proximal Policy Optimization) agent using the imported `Agent` class from Tensorforce.
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10, learning_rate=1e-3
)

In [6]:
# Initialize the runner
runner = Runner(agent=agent, environment=environment)

In [7]:
# Start the runner
runner.run(num_episodes=300)
runner.close()

Episodes: 100%|██████████| 300/300 [01:11, reward=268.00, ts/ep=268, sec/ep=0.30, ms/ts=1.1, agent=92.2%]
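
The `Runner` automates the interaction loop between agent and environment: the agent acts on the current state, the environment executes the action, and the agent observes the resulting reward. The sketch below illustrates that loop in plain Python. `ToyEnvironment`, `RandomAgent`, and `run_episodes` are hypothetical stand-ins written for this example and are not part of Tensorforce; only the method names (`reset`, `execute`, `act`, `observe`) and the `(states, terminal, reward)` return order are modeled on the Tensorforce getting-started guide.

```python
import random


class ToyEnvironment:
    """Hypothetical stand-in for an environment (not part of Tensorforce)."""

    def __init__(self, max_timesteps=5):
        self.max_timesteps = max_timesteps
        self.timestep = 0

    def reset(self):
        self.timestep = 0
        return 0.0  # initial state

    def execute(self, actions):
        # Return order (states, terminal, reward) is modeled on Tensorforce.
        self.timestep += 1
        terminal = self.timestep >= self.max_timesteps
        reward = 1.0  # CartPole-style: +1 for every timestep survived
        return float(self.timestep), terminal, reward


class RandomAgent:
    """Hypothetical stand-in for an agent; acts randomly and learns nothing."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.num_observed = 0

    def act(self, states):
        return self.rng.choice([0, 1])  # 0 = push left, 1 = push right

    def observe(self, terminal, reward):
        # A real agent would store the experience and update its policy here.
        self.num_observed += 1


def run_episodes(agent, environment, num_episodes):
    """Run the act-execute-observe loop and return the reward per episode."""
    episode_rewards = []
    for _ in range(num_episodes):
        states = environment.reset()
        terminal = False
        total_reward = 0.0
        while not terminal:
            actions = agent.act(states=states)
            states, terminal, reward = environment.execute(actions=actions)
            agent.observe(terminal=terminal, reward=reward)
            total_reward += reward
        episode_rewards.append(total_reward)
    return episode_rewards


agent = RandomAgent(seed=0)
environment = ToyEnvironment(max_timesteps=5)
episode_rewards = run_episodes(agent, environment, num_episodes=3)
print(episode_rewards)  # [5.0, 5.0, 5.0]
```

Because the toy environment always ends after five timesteps, every episode earns a reward of 5.0 regardless of the actions. With a real agent and CartPole, the episode length (and therefore the reward) grows as the agent learns to balance the pole.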


In [8]:
# Print statistics
print(f"Learning finished. Total episodes: {runner.episodes}. Average reward of last 100 episodes: {np.mean(runner.episode_rewards[-100:])}")

Learning finished. Total episodes: 300. Average reward of last 100 episodes: 260.05
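
A single average over the last 100 episodes hides how the reward developed during training. As a minimal sketch, the helper below computes a rolling mean over a list of episode rewards, which could be applied to `runner.episode_rewards`. The function name `rolling_mean` and the sample rewards are assumptions made for this illustration; they are not part of Tensorforce.

```python
def rolling_mean(rewards, window=100):
    """Return the mean of every consecutive `window`-sized slice of `rewards`."""
    if window <= 0:
        raise ValueError("window must be positive")
    means = []
    running_sum = 0.0
    for i, reward in enumerate(rewards):
        running_sum += reward
        if i >= window:
            # Drop the reward that just fell out of the window.
            running_sum -= rewards[i - window]
        if i >= window - 1:
            means.append(running_sum / window)
    return means


# Made-up rewards, used only to demonstrate the helper.
sample_rewards = [10, 20, 30, 40, 50]
print(rolling_mean(sample_rewards, window=3))  # [20.0, 30.0, 40.0]
```

The last element of `rolling_mean(runner.episode_rewards, window=100)` would equal the average printed above, while the earlier elements show how quickly the agent improved.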


In [9]:
# Evaluate and visualize the trained model.
# With evaluation=True the agent is only evaluated, not trained further.
environment.visualize = True
runner.run(num_episodes=100, evaluation=True)

Episodes: 100%|██████████| 100/100 [02:49, reward=440.00, ts/ep=440, sec/ep=0.78, ms/ts=1.8, agent=58.0%]

### References
- Tensorforce. (n.d.). Getting started — Tensorforce 0.5.4 documentation. Retrieved March 3, 2020, from https://tensorforce.readthedocs.io/en/0.5.4/basics/getting-started.html
- Tensorforce (0.5.4). (2020). Retrieved March 3, 2020, from https://github.com/tensorforce/tensorforce