# FUFI environment
The environment for our real-life Cart-Pole.

## Description
A pole is attached by an un-actuated joint to a cart, which moves along a track with friction.
The pendulum is placed upright on the cart and the goal is to balance the pole by applying forces in the left and right direction on the cart.

****
                                    Action Space

The action is an `ndarray` with shape `(1,)` whose value lies in the interval `[-F_max, +F_max]` and gives the signed force the cart is pushed with. The action space is discrete: the lowest value is `-F_max` and the highest is `+F_max`, where `F_max` is the maximum force our engine can deliver. Consecutive values are separated by the minimum force our engine can produce, let's say 0.1 N for the time being. The sign follows our system of reference: negative values push the cart to the left, positive values to the right.

    | Num  | Action                              |
    |------|-------------------------------------|
    | -10  | Push cart to the left with F=10 N   |
    | -0.9 | Push cart to the left with F=0.9 N  |
    | ---- | ----------------------------------- |
    | ---- | ----------------------------------- |
    | 0.9  | Push cart to the right with F=0.9 N |
    | 10   | Push cart to the right with F=10 N  |
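
As a minimal sketch of how this grid of forces could be enumerated, assuming `F_max = 10` N and a step of 0.1 N as quoted above: the names `build_force_grid`, `F_MAX` and `STEP` are hypothetical and are not part of the FUFI class.

```python
from typing import List

F_MAX = 10.0   # N, maximum force magnitude (value quoted in the text)
STEP = 0.1     # N, assumed smallest force increment


def build_force_grid(f_max: float = F_MAX, step: float = STEP) -> List[float]:
    """Return every admissible force from -f_max to +f_max, spaced by `step`."""
    n_side = round(f_max / step)  # number of increments on each side of zero
    # Use integer multiples of `step` so rounding errors do not accumulate.
    return [round(k * step, 10) for k in range(-n_side, n_side + 1)]


forces = build_force_grid()
n_actions = len(forces)  # number of discrete actions in the grid
```

With these values the grid holds one entry per multiple of 0.1 N, so a discrete index can be turned into a signed force with a plain list lookup such as `forces[index]`.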
    
****
                                    Observation Space

The observation is an `ndarray` with shape `(4,)` with the values corresponding to the following positions and velocities:

    |Num|   Observation  |    Min     |    Max    |
    |---|----------------|------------|-----------|
    | 0 |  Pole a_t      | -F_max/m_p | F_max/m_p |
    | 1 | Pole Theta     | ~ (-24°)   | ~ (24°)   |
    | 2 | Pole Theta dot |    -Inf    |   Inf     |

*Note:* While the ranges above denote the possible values of each element of the observation space, they do not reflect the values allowed in the state space of an unterminated episode. In particular:

- The cart x-position (index 0) can take values in `(-4.8, 4.8)`, but the episode terminates if the cart leaves the `(-2.4, 2.4)` range, which means it has hit the end of the track.

- The pole angle can be observed in `(-.418, .418)` radians (±24°), but the episode terminates if the pole angle is not in the range `(-.0349, .0349)` radians (±2°).
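
The degree-to-radian figures above can be checked with a short sketch. The constant names and the helper `is_observable_but_terminal` are hypothetical and only illustrate the gap between what can be observed and what keeps an episode alive.

```python
import math

X_OBS_LIMIT = 4.8                        # observable cart position bound
X_TERM_LIMIT = 2.4                       # cart position bound before termination
THETA_OBS_LIMIT = math.radians(24.0)     # observable angle bound, in radians
THETA_TERM_LIMIT = math.radians(2.0)     # angle bound before termination, in radians


def is_observable_but_terminal(x: float, theta: float) -> bool:
    """True when a state is inside the observation bounds but ends the episode."""
    inside_obs = abs(x) < X_OBS_LIMIT and abs(theta) < THETA_OBS_LIMIT
    inside_term = abs(x) < X_TERM_LIMIT and abs(theta) < THETA_TERM_LIMIT
    return inside_obs and not inside_term
```

For example, a cart at `x = 3.0` with an upright pole is still a valid observation, yet the episode has already terminated.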

****
                                    Rewards

Since the goal is to keep the pole upright for as long as possible, a reward of `+1` is allotted for every step taken, including the termination step. The threshold for rewards is raised to 1000.
****
                                  Manner
Two modes are implemented in this code:
1. *Real world*: we take the real FUFI and attach it to the Arduino. The states are measured and imported from a `.txt` file.

2. *Ideas world*: simulation mode. The states are computed using the equations of motion of the system.
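
To give an idea of what the simulation mode computes, here is a minimal sketch of one explicit Euler update. It is an assumption, not FUFI's own code: it uses the frictionless equations of the classic Gym CartPole, the state order `(x, x_dot, theta, theta_dot)`, and the parameter values set in the constructor further down. A track with friction would need an extra term that is not modeled here.

```python
import math
from typing import Tuple

# Parameter values as set in the FUFI constructor of this notebook.
GRAVITY = 9.8                          # m/s**2
MASS_CART = 1.0                        # kg
MASS_POLE = 0.1                        # kg
TOTAL_MASS = MASS_CART + MASS_POLE
LENGTH = 0.5                           # m, half the pole's length
POLEMASS_LENGTH = MASS_POLE * LENGTH
TAU = 0.02                             # s between state updates

State = Tuple[float, float, float, float]  # assumed order: x, x_dot, theta, theta_dot


def euler_step(state: State, force: float) -> State:
    """Advance the frictionless cart-pole by one explicit Euler step."""
    x, x_dot, theta, theta_dot = state
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    temp = (force + POLEMASS_LENGTH * theta_dot ** 2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / TOTAL_MASS)
    )
    x_acc = temp - POLEMASS_LENGTH * theta_acc * cos_t / TOTAL_MASS
    # Explicit Euler: positions use the old velocities, velocities use the accelerations.
    return (
        x + TAU * x_dot,
        x_dot + TAU * x_acc,
        theta + TAU * theta_dot,
        theta_dot + TAU * theta_acc,
    )


next_state = euler_step((0.0, 0.0, 0.0, 0.0), 10.0)
```

Starting from rest, a push to the right accelerates the cart to the right while the pole starts rotating in the opposite direction, which is the behavior the agent has to compensate for.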
****
                                  Starting State

In simulation mode, all observations are assigned a uniformly random value in `(-0.05, 0.05)`. In real world mode, we take the initial measurements as the initial state.
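
The simulation-mode initialization can be sketched as follows. The function name `sample_initial_state` is hypothetical, and the size of 4 is assumed from the observation shape `(4,)` stated earlier.

```python
import random
from typing import List, Optional


def sample_initial_state(
    seed: Optional[int] = None,
    low: float = -0.05,
    high: float = 0.05,
    size: int = 4,
) -> List[float]:
    """Draw each state component independently and uniformly between low and high."""
    rng = random.Random(seed)  # a seeded generator makes the draw reproducible
    return [rng.uniform(low, high) for _ in range(size)]


state0 = sample_initial_state(seed=0)
```

Passing the same seed twice returns the same initial state, which is handy when comparing runs.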
****
                                  Episode End

The episode ends if any one of the following occurs:
1. Termination: the pole angle leaves the ±2° range;
2. Termination: the cart position leaves the ±2.4 range, that is, the center of the cart reaches the edge of the display;
3. Truncation: the episode length is greater than 1500.
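
The three conditions above can be sketched as a single check. The function `episode_status` and its argument names are hypothetical, and the comparison `step_count > max_steps` follows the wording of the list literally.

```python
import math
from typing import Tuple


def episode_status(
    x: float,
    theta: float,
    step_count: int,
    x_threshold: float = 2.4,
    theta_threshold: float = math.radians(2.0),
    max_steps: int = 1500,
) -> Tuple[bool, bool]:
    """Return (terminated, truncated) for the current cart position, angle and step count."""
    terminated = abs(x) > x_threshold or abs(theta) > theta_threshold
    truncated = step_count > max_steps
    return terminated, truncated
```

An episode is over as soon as either flag is `True`; keeping the two flags separate lets a learning algorithm tell a real failure apart from a time limit.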
****
                                Arguments
1. `gym.make('Fufi')`

In [None]:
# importing things
import math
from typing import Optional, Union

import numpy as np

import gym
from gym import logger, spaces
from gym.envs.classic_control import utils
from gym.error import DependencyNotInstalled


In [None]:
class FUFI(gym.Env[np.ndarray, Union[int, np.ndarray]]):
  metadata = {
      "render_modes": ["human", "rgb_array"],
      "render_fps": 50,
  }

  def __init__(self, render_mode: Optional[str] = None):
    ## ---------------------------------------- FUFI parameters --------------------------------------- ##
    ## To be changed according to the instrument

    self.gravity = 9.8          # m/s**2
    self.masscart = 1.0         # kg
    self.masspole = 0.1         # kg
    self.total_mass = self.masspole + self.masscart
    self.length = 0.5           # m. Actually half the pole's length
    self.polemass_length = self.masspole * self.length
    self.max_force_mag = 10.0    # N. Maximum force the engine can produce
    self.tau = 0.02             # seconds between state updates
    self.kinematics_integrator = "euler"
    self.sensibility = 0.1      # N. Minimum force magnitude our engine can produce.
                                # Assumed to be 0.1 for the time being

    # Angle at which to fail the episode
    self.theta_threshold_radians = 2 * 2 * math.pi / 360
    # Hits the wall
    self.x_threshold = 2.4

    # Angle limit set to 2 * theta_threshold_radians so failing observation
    # is still within bounds.
    high = np.array(
        [
            self.x_threshold * 2,
            np.finfo(np.float32).max,
            self.theta_threshold_radians * 2,
            np.finfo(np.float32).max,
        ],
        dtype=np.float32,
    )

    self.action_space = spaces.Discrete(2)
    self.observation_space = spaces.Box(-high, high, dtype=np.float32)

    self.render_mode = render_mode

    self.screen_width = 600
    self.screen_height = 400
    self.screen = None
    self.clock = None
    self.isopen = True
    self.state = None

    self.steps_beyond_terminated = None
