# Actions

<img src="images/cartpole.png" width="750"/>

In [1]:
import gym

env = gym.make("CartPole-v1")
obs = env.reset()  # reset() returns the initial observation
print(obs)

[-0.02182618 -0.03077107 -0.03747389 -0.03904124]


<img src="images/api/1.png" width="500"/>

## The next step is to make a decision and send it to `gym`

- The entity making the decision is called the **agent**.
- The decision is called the **action**.

<img src="images/api/2.png" width="600"/>

### General nature of actions

1. **Continuous** or **discrete**?
2. **Scalar**, **vector** or **tensor**?
3. **Range** of the elements?
4. **Python datatype** of elements?

In [2]:
env.action_space

Discrete(2)

- `Discrete(n)` is a `gym` space representing the **integers from `0` to `n - 1`**.

| Question | Answer |
| --- | --- |
| Continuous or discrete? | `Discrete` $\implies$ discrete |
| Scalar, vector or tensor? | `Discrete` $\implies$ integer $\implies$ scalar |
| Range of the elements? | `Discrete(2)` $\implies$ allowed values: `0` and `1` |
| Python datatype of elements? | `int` |
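
The answers in the table can be checked in code. The sketch below uses `DiscreteSketch`, a hypothetical stand-in written only for illustration (it is not `gym`'s implementation), so that it runs without `gym` installed. On the real `env.action_space`, the attribute `n` and the methods `contains()` and `sample()` should serve the same purpose.

```python
import random


class DiscreteSketch:
    """Hypothetical stand-in that mimics the behaviour of gym's Discrete(n)."""

    def __init__(self, n):
        self.n = n  # number of allowed actions: 0, 1, ..., n - 1

    def contains(self, x):
        # An action is valid if it is an int in the range [0, n)
        return isinstance(x, int) and 0 <= x < self.n

    def sample(self):
        # Pick one of the allowed actions uniformly at random
        return random.randrange(self.n)


space = DiscreteSketch(2)
print(space.n)                                 # number of allowed actions
print(space.contains(0), space.contains(1))    # both are valid actions
print(space.contains(2))                       # outside the allowed range
print(type(space.sample()))                    # sampled actions are plain ints
```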

### Meaning of action choices

Places to look up what each action means:

1. [Environment page in Gym's website](https://gym.openai.com/envs/CartPole-v1)
    - This has been replaced by the [new documentation site](https://www.gymlibrary.ml/environments/classic_control/cart_pole/)
2. [Wiki in Gym's GitHub repo](https://github.com/openai/gym/wiki/CartPole-v0)
    - This is now old and probably unmaintained. The latest information is [here](https://www.gymlibrary.ml/environments/classic_control/cart_pole/)
3. [Environment source code](https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py)
4. [Research paper linked in the environment page in Gym's website](https://gym.openai.com/envs/CartPole-v1/#barto83)
    - This has been replaced by https://www.gymlibrary.ml/environments/classic_control/cart_pole/#description
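
Once the meaning of each action is known, it can help to give the integers readable names. The sketch below assumes the mapping used in this notebook (`0` pushes the cart to the left, `1` pushes it to the right). The name `CartPoleAction` is invented here for illustration and is not part of `gym`.

```python
from enum import IntEnum


class CartPoleAction(IntEnum):
    """Hypothetical readable names for the two CartPole actions."""

    PUSH_LEFT = 0
    PUSH_RIGHT = 1


action = CartPoleAction.PUSH_RIGHT
print(action.name, int(action))

# Convert back to a plain int before handing the action to the environment,
# e.g. env.step(int(action))
```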

### Sending an action to `gym` with `env.step()`

- Accepts one argument, which must be a valid action

In [4]:
new_obs, _, _, _ = env.step(0)  # action 0: push cart to the left
print(new_obs)

[-0.01667347 -0.42041793  0.05241162  0.61282417]


<img src="images/api/3.png" width="600"/>

### Sending another action `1`: push cart to the right

In [5]:
env.step(1)

(array([-0.02508183, -0.22606609,  0.0646681 ,  0.3370987 ]), 1.0, False, {})
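
The tuple printed above has four elements. Assuming the 4-tuple form of `env.step()` used in this notebook, they are the new observation, the reward, a flag telling whether the episode is over, and a dictionary with extra information. The sketch below gives them names with a `namedtuple`. `StepResult` is a hypothetical helper, and the values are copied from the output above, with the array written as a plain list.

```python
from collections import namedtuple

# Hypothetical container naming the four values returned by env.step()
StepResult = namedtuple("StepResult", ["observation", "reward", "done", "info"])

# Values copied from the output above (array shown as a plain list)
raw = ([-0.02508183, -0.22606609, 0.0646681, 0.3370987], 1.0, False, {})

result = StepResult(*raw)
print(result.observation)  # the new observation
print(result.reward)       # reward for this step
print(result.done)         # whether the episode has ended
print(result.info)         # extra diagnostic information
```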

### Sending an illegal action `2` leads to an error

In [6]:
env.step(2)

AssertionError: 2 (<class 'int'>) invalid
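
One way to avoid this error is to check the action against the action space before sending it. The sketch below is a minimal illustration, not part of `gym`: `checked_step` is a hypothetical helper, and `_StubSpace` and `_StubEnv` are stand-ins so the example runs without `gym`. With a real environment, `env.action_space.contains()` should play the same role.

```python
class _StubSpace:
    """Stand-in for a Discrete(n) action space (illustration only)."""

    def __init__(self, n):
        self.n = n

    def contains(self, x):
        return isinstance(x, int) and 0 <= x < self.n


class _StubEnv:
    """Stand-in environment that returns a fixed 4-tuple from step()."""

    action_space = _StubSpace(2)

    def step(self, action):
        return ([0.0, 0.0, 0.0, 0.0], 1.0, False, {})


def checked_step(env, action):
    """Send the action only if the action space accepts it."""
    if not env.action_space.contains(action):
        raise ValueError(f"invalid action {action!r} for {env.action_space.n} actions")
    return env.step(action)


stub_env = _StubEnv()
print(checked_step(stub_env, 1))  # valid action, forwarded to step()

try:
    checked_step(stub_env, 2)     # invalid action, rejected before step()
except ValueError as err:
    print(err)
```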

## Seeing the simulation progress visually

### Taking many actions in a loop

- `reset()` the environment
- Take action `0` in a loop (push cart to the left)
- Call `render()` after every step

In [11]:
obs = env.reset()
for _ in range(30):
    print(obs)
    obs, _, _, _ = env.step(0)  # action 0: push cart to the left
    env.render()  # draw the current state

[-0.02388159  0.04795998  0.03902257 -0.03665129]
[-0.02292239 -0.1476992   0.03828955  0.26808378]
[-0.02587638 -0.34334609  0.04365122  0.57259338]
[-0.0327433  -0.53905204  0.05510309  0.87870206]
[-0.04352434 -0.7348777   0.07267713  1.18818683]
[-0.05822189 -0.93086265  0.09644087  1.50273648]
[-0.07683915 -1.12701393  0.1264956   1.82390586]
[-0.09937942 -1.32329269  0.16297371  2.15306224]
[-0.12584528 -1.5195988   0.20603496  2.49132202]
[-0.15623725 -1.71575318  0.2558614   2.83947714]
[-0.19055232 -1.91147851  0.31265094  3.19791276]
[-0.22878189 -2.10637979  0.3766092   3.56652105]
[-0.27090948 -2.29992702  0.44793962  3.94461956]
[-0.31690802 -2.49144407  0.52683201  4.33088707]
[-0.36673691 -2.68010847  0.61344975  4.72333298]
[-0.42033908 -2.86496804  0.70791641  5.11931665]
[-0.47763844 -3.0449797   0.81030274  5.51562806]
[-0.53853803 -3.21907393  0.92061531  5.90862913]
[-0.60291951 -3.38624489  1.03878789  6.29443608]
[-0.67064441 -3.5456619   1.16467661  6.66910239]
