# Lab: Modeling with Dask and Ray

To keep things simple while still giving you a chance to try something hands-on, we'll look at:

* Linear modeling with Dask on a different dataset
* A Ray RLlib example using PPO, a more powerful algorithm than the one we used earlier

## Dask and Powerplant Output

We'll use the UC Irvine ML repository's Combined Cycle Power Plant Data Set (https://archive.ics.uci.edu/ml/datasets/Combined+Cycle+Power+Plant).

This dataset consists of 9,568 records of measurements from a combined cycle power plant. Each record contains:

* Temperature (AT) in the range 1.81°C to 37.11°C
* Ambient Pressure (AP) in the range 992.89 to 1033.30 millibar
* Relative Humidity (RH) in the range 25.56% to 100.16%
* Exhaust Vacuum (V) in the range 25.36 to 81.56 cm Hg
* Net hourly electrical energy output (PE) in the range 420.26 to 495.76 MW

We want to model the power output as a function of the other parameters.

In [None]:
import dask.dataframe as ddf

df = ddf.read_csv('data/powerplant.csv', sample=False)
df

In [None]:
df.head()

In [25]:
# Feel free to copy-paste-and-modify from the example to get a model and predictions!

In [26]:
# Call your test set y_test and your predictions y_predicted, to score your model with the next cell
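
If you get stuck, here is one possible approach, offered as a sketch rather than the reference solution. It assumes the CSV columns are named `AT`, `V`, `AP`, `RH` and `PE` as in the dataset description, and that a plain `LinearRegression` from `dask_ml` is an acceptable model. The helper name `fit_and_predict`, the 80/20 split and the random seed are illustrative choices, not part of the original example.

```python
# Assumed column names, taken from the dataset description above.
FEATURES = ["AT", "V", "AP", "RH"]
TARGET = "PE"


def fit_and_predict(df, test_size=0.2, random_state=42):
    """Fit a linear model on a Dask DataFrame and return (y_test, y_predicted).

    Hypothetical helper: `df` is expected to be the Dask DataFrame loaded above.
    """
    # Imported lazily so the constants above can be inspected without dask-ml.
    from dask_ml.linear_model import LinearRegression
    from dask_ml.model_selection import train_test_split

    # dask-ml estimators work on Dask arrays with known chunk sizes,
    # so compute the partition lengths while converting.
    X = df[FEATURES].to_dask_array(lengths=True)
    y = df[TARGET].to_dask_array(lengths=True)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )

    model = LinearRegression()
    model.fit(X_train, y_train)

    # Predictions stay lazy (a Dask array) until something computes them.
    y_predicted = model.predict(X_test)
    return y_test, y_predicted


# Usage in the notebook (not executed here):
# y_test, y_predicted = fit_and_predict(df)
```

The returned names match what the scoring cell expects, so the result can be passed straight to `mean_squared_error`.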

In [None]:
from dask_ml.metrics import mean_squared_error
from math import sqrt

sqrt(mean_squared_error(y_test, y_predicted))
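
The score above is the root mean squared error (RMSE), which is in the same units as the target (MW here). As a rough illustration of what that expression computes, here is a plain-Python sketch; the function name `rmse` and the sample values are made up for the example and are not taken from the dataset.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values in MW, chosen only to show the arithmetic:
# residuals are -3, 4 and 0, so the mean squared error is 25 / 3.
actual = [450.0, 460.0, 470.0]
predicted = [453.0, 456.0, 470.0]
print(rmse(actual, predicted))
```

A lower value means the predictions sit closer to the measured output on average; because the errors are squared, a few large misses weigh more heavily than many small ones.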

## Ray RLlib and PPO

PPO, or Proximal Policy Optimization, is a more powerful (and more complicated) algorithm than the DQN we've looked at.

But thanks to Ray's implementations, you can swap it in easily.

Note that we import `ppo` from `ray.rllib.agents`.

By replacing "DQN" with "PPO" you can quickly get better results.
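
To make the swap concrete, here is a minimal sketch of what a PPO training loop might look like. It assumes the older RLlib trainer API that matches the `ray.rllib.agents` import used in this lab, and it assumes the environment is `CartPole-v1`, which is a guess based on the 500-reward target. The helper names `ppo_config` and `train_ppo` and the single-worker setting are illustrative, not taken from the earlier example.

```python
def ppo_config(num_workers=1):
    """Config overrides for the trainer (hypothetical helper).

    RLlib merges this partial dict with PPO's default settings.
    """
    return {"num_workers": num_workers}


def train_ppo(env_name="CartPole-v1", iterations=10):
    """Train PPO for a few iterations and return the mean reward per iteration."""
    # Imported lazily so ppo_config can be inspected without starting Ray.
    import ray
    import ray.rllib.agents.ppo as ppo

    # Safe to call even if Ray was already initialized in an earlier cell.
    ray.init(ignore_reinit_error=True)

    # The DQN version would build a DQN trainer here; only this line changes.
    trainer = ppo.PPOTrainer(config=ppo_config(), env=env_name)

    rewards = []
    for i in range(iterations):
        result = trainer.train()
        rewards.append(result["episode_reward_mean"])
        print(f"iteration {i + 1}: mean episode reward = {rewards[-1]:.1f}")
    return rewards


# Usage in the notebook (not executed here):
# rewards = train_ppo()
```

The rest of the loop is the same as it would be for DQN, which is why swapping algorithms is mostly a matter of changing the import and the trainer class.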

>
> Interested in PPO details? Check out this writeup: https://jonathan-hui.medium.com/rl-proximal-policy-optimization-ppo-explained-77f014ec3f12
>

In [None]:
import ray
import ray.rllib.agents.ppo as ppo

ray.shutdown()
ray.init()

In [28]:
# Copy the code from the example, but replace references to DQN with references to PPO

# HINT: try 10 iterations -- that will be plenty for PPO to solve the problem

In [None]:
# You should be able to hit "500" reward by the end of the training loop

In [None]:
ray.shutdown()