# Environment installation
Install the package and all of its dependencies by running `pip install -e .` in the repository root.

*See the README for further details.*

# Environment states

The environment monitors the following variables:
* (HA)2(org)
* H+ Extraction
* H+ Scrub
* H+ Strip
* OA Extraction
* OA Scrub
* OA Strip
* Recycle
* Extraction stages
* Scrub stages
* Strip stages


## Using the module
1. Make the necessary imports.
2. Create an environment instance using `gym.make()`.

In [1]:
#manage imports
import gym
import gym_solventx 

In [2]:
env = gym.make('gym_solventx-v0')

## Options that can be passed while creating the environment
1. `goals_list`: a list of goals (options include 'Purity', 'Recovery', 'Stages', 'OA Extraction', 'OA Scrub', 'OA Strip', 'Recycle', 'Profit')
2. `DISCRETE_REWARD`: enables the discrete reward
3. `bounds_file`: a file that can restrict the environment's upper and lower limits
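
Since the goal names are plain strings, a misspelled goal is easy to miss. The sketch below checks a goals list against the names listed above before it is handed to `gym.make()`. The helper `check_goals` and the constant `ALLOWED_GOALS` are hypothetical and not part of `gym_solventx`; how the environment itself treats an unknown goal is not covered here.

```python
# Goal names copied from the options list above.
# ALLOWED_GOALS and check_goals are hypothetical helpers, not gym_solventx API.
ALLOWED_GOALS = {
    'Purity', 'Recovery', 'Stages', 'OA Extraction',
    'OA Scrub', 'OA Strip', 'Recycle', 'Profit',
}

def check_goals(goals_list):
    """Return a copy of goals_list, or raise ValueError naming unknown goals."""
    unknown = [goal for goal in goals_list if goal not in ALLOWED_GOALS]
    if unknown:
        raise ValueError(f'Unknown goals: {unknown}')
    return list(goals_list)

goals = check_goals(['Purity', 'Recovery', 'Recycle'])
print(goals)
```

The checked list can then be passed as `goals_list=goals` when creating the environment.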

In [None]:
env = gym.make('gym_solventx-v0', 
      goals_list=['Purity', 'Recovery'], 
      bounds_file='gym_solventx/envs/methods/input/bounds.csv')

In [None]:
env = gym.make('gym_solventx-v0',
      DISCRETE_REWARD=True,
      goals_list=['Purity', 'Recovery'], 
      bounds_file='gym_solventx/envs/methods/input/bounds.csv')

In [None]:
#additional goals list options include 'Purity', 'Recovery', 'Stages', 'OA Extraction', 'OA Scrub', 'OA Strip', 'Recycle', 'Profit'
env = gym.make('gym_solventx-v0', 
      goals_list=['Purity', 'Recovery', 'Recycle'])

## Reset environment and perform actions

1. There are 23 discrete actions.  
    0 - Increase (HA)2(org)  
    1 - Decrease (HA)2(org)  
    2 - Increase H+ Extraction  
    3 - Decrease H+ Extraction  
    4 - Increase H+ Scrub  
    5 - Decrease H+ Scrub  
    6 - Increase H+ Strip  
    7 - Decrease H+ Strip  
    8 - Increase OA Extraction  
    9 - Decrease OA Extraction  
    10 - Increase OA Scrub  
    11 - Decrease OA Scrub  
    12 - Increase OA Strip  
    13 - Decrease OA Strip  
    14 - Increase Recycle  
    15 - Decrease Recycle  
    16 - Increase Extraction Stages  
    17 - Decrease Extraction Stages  
    18 - Increase Scrub Stages  
    19 - Decrease Scrub Stages  
    20 - Increase Strip Stages  
    21 - Decrease Strip Stages  
    22 - Do Nothing
2. Before we can apply an action, we need to call the `reset()` method.
3. An action can be applied by calling the `step()` method.
4. The result of the action can be observed using the `render()` method.
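
The action table follows a regular pattern: even indices increase a variable, odd indices decrease it, and 22 does nothing. As a reading aid, the sketch below turns an action index into its label from the table. The names `VARIABLES`, `DO_NOTHING` and `describe_action` are hypothetical helpers and not part of `gym_solventx`.

```python
# Variable order copied from the action table above.
# These names are hypothetical helpers, not gym_solventx API.
VARIABLES = [
    '(HA)2(org)', 'H+ Extraction', 'H+ Scrub', 'H+ Strip',
    'OA Extraction', 'OA Scrub', 'OA Strip', 'Recycle',
    'Extraction Stages', 'Scrub Stages', 'Strip Stages',
]
DO_NOTHING = 22

def describe_action(action):
    """Map an action index (0-22) to its label in the action table."""
    if not 0 <= action <= DO_NOTHING:
        raise ValueError(f'action must be in 0..{DO_NOTHING}, got {action}')
    if action == DO_NOTHING:
        return 'Do Nothing'
    verb = 'Increase' if action % 2 == 0 else 'Decrease'
    return f'{verb} {VARIABLES[action // 2]}'

for action in [22, 0, 1]:
    print(action, '-', describe_action(action))
```

This can make logs of sampled actions easier to read than raw integers.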


In [None]:
actions = [22,0,1] #do nothing, increase (HA)2(org), decrease (HA)2(org)
envstate =  env.reset()
env.render() #observe initial configuration

for action in actions:
    observation, reward, done, _ = env.step(action)
    env.render()

## Invalid actions
Some actions would move the environment into an invalid state. In such cases the `step()` method still executes, but no change is applied to the environment.

In [None]:
actions = [17,17,17,17,17,17] #decrease extraction stages repeatedly
envstate =  env.reset()
env.render() #observe initial configuration

for action in actions:
    observation, reward, done, _ = env.step(action)
    env.render()
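
One way to spot ignored actions programmatically, rather than by reading the rendered output, is to compare consecutive observations. The sketch below is a minimal illustration under two assumptions: that each observation is a flat sequence of numbers, and that it fully reflects the environment configuration. The helper `unchanged_steps` is hypothetical, and the observation values are made up for illustration, not real environment output.

```python
# Hypothetical helper, not gym_solventx API.
def unchanged_steps(observations):
    """Given [obs_after_reset, obs_after_step_1, ...], return the 1-based
    step numbers whose observation equals the previous observation."""
    return [
        step for step in range(1, len(observations))
        if list(observations[step]) == list(observations[step - 1])
    ]

# Made-up observations: the first value drops once, then stops changing.
toy_observations = [[6, 1.0], [5, 1.0], [5, 1.0], [5, 1.0]]
print(unchanged_steps(toy_observations))
```

With the real environment, the list would be built from the return value of `env.reset()` followed by the `observation` returned by each `env.step(action)`. Note that action 22 (Do Nothing) also leaves the observation unchanged, so an unchanged step is not always an invalid one.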

## Sampling from environment action space
1. We can randomly sample from the discrete action space.
2. The code below takes 15 random actions:

In [None]:
all_rewards = []
for _ in range(1):  #epoch number
  done = False
  envstate = env.reset()
  for index in range(15): #action count
    action = env.action_space.sample()
    observation, reward, done, _ = env.step(action)
    env.render()
    all_rewards.append(reward)
    
  print('All Rewards:', all_rewards)
  all_rewards = []
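
A list of rewards collected as in the loop above is often easier to read as a short summary than as a raw printout. The sketch below assumes each reward is a plain number. The helper `summarize_rewards` is hypothetical and not part of `gym_solventx`, and the reward values are made up for illustration.

```python
# Hypothetical helper, not gym_solventx API.
def summarize_rewards(rewards):
    """Return step count, total, mean and best reward for one episode."""
    if not rewards:
        raise ValueError('rewards must not be empty')
    total = sum(rewards)
    return {
        'steps': len(rewards),
        'total': total,
        'mean': total / len(rewards),
        'best': max(rewards),
    }

# Made-up rewards standing in for all_rewards.
toy_rewards = [0.0, 1.0, -1.0, 2.0]
print(summarize_rewards(toy_rewards))
```

In the loop above, `summarize_rewards(all_rewards)` would be called just before `all_rewards` is cleared.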

Test environment to completion
---

The maximum action count for the environment is 500 actions.

***Note: this takes a significant amount of time***

If you would still like to run an environment to completion, here is example code (*note: the outputs folder is created at the end of an episode*):

In [None]:
all_rewards = []
for _ in range(1):  #epoch number
  done = False
  action_count = 0
  envstate = env.reset()
  env.render() #observe initial configuration
  while not done: #take action
    action_count += 1
    action = env.action_space.sample()
    observation, reward, done, _ = env.step(action)
    
    if action_count % 25 == 0: #render every 25 steps
        env.render()
    all_rewards.append(reward)
    
  print('All Rewards:', all_rewards)
  all_rewards = []

Action Stats
===

1. To get the stats of the actions taken during an episode, call the `get_stats()` function. It returns a dictionary with the stats.

2. To get a visual representation instead, pass `SHOW_PLOT=True` to `get_stats()`.

3. You can also get the best reward the environment reached during an episode by reading `env.best_reward`.

In [None]:
stats = env.get_stats()
print('Stats:', stats)
print('Actions:', sum(stats.values()), end='\n\n')

stats = env.get_stats(SHOW_PLOT=True)

print('Best reward:', env.best_reward)
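
The cell above sums the dictionary values to get the number of actions, which suggests the values are per-action counts. Under that assumption, the sketch below converts the counts into shares of the episode. The helper `action_shares` is hypothetical, and the dictionary keys and counts shown are made up; the real key format returned by `get_stats()` may differ.

```python
# Hypothetical helper, not gym_solventx API.
def action_shares(stats):
    """Convert a {action: count} dictionary into {action: fraction of total}."""
    total = sum(stats.values())
    if total == 0:
        return {action: 0.0 for action in stats}
    return {action: count / total for action, count in stats.items()}

# Made-up counts standing in for the result of env.get_stats().
toy_stats = {'Do Nothing': 1, 'Increase Recycle': 3}
for action, share in action_shares(toy_stats).items():
    print(f'{action}: {share:.0%}')
```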

Render options
===  
To generate graphs during testing, pass the option `create_graph_every=n` to the `render()` method. A graph is then created and saved every n steps/actions.

Additionally, to make the environment render quietly, change the mode to 'file' (the default is 'human') like so:

In [None]:
outputs  = []
envstate = env.reset()

actions = [0, 1]
for action in actions:
    env.step(action)
    output = env.render(mode='file', create_graph_every=1) #graph generation in ./output/graphs
    
    #store output for later demonstration
    outputs.append(output)
    
print('Done.')

#print outputs
for log in outputs:
    print(log)

Graph Options
===
1. To render a single graph, call the `create_graph()` function directly. It models the current configuration.

2. `create_graph()` also accepts `render=True`, which immediately opens the graph in your default PDF viewer (***note: the open file must be closed before a new graph can be rendered; the PDF is saved regardless***), and `filename=desired_name_of_file`, which gives the graph a specific name.

In [None]:
envstate = env.reset()
env.create_graph() #graph generated in './output/graphs/ssgraph.pdf'

In [None]:
envstate = env.reset()
env.create_graph(render=True, filename='demo_graph') #graph generated in './output/graphs/demo_graph.pdf'