# GARL

### Documentation

Interesting problems for reinforcement learning:
 * Gymnasium: https://gymnasium.farama.org/environments/box2d/

## Installation

!pip install gymnasium  
!pip install gymnasium[box2d] 

## Additional steps

### On macOS

pip uninstall swig  
xcode-select --install  # only if the command line tools are not installed yet  
pip install swig  # or, with MacPorts: sudo port install swig-python  
pip install 'gymnasium[box2d]'  # zsh requires the quotes  

### On Windows

If the installation fails, the usual cause is that the required version of Microsoft Visual C++ Build Tools, a Box2D dependency, is missing. To fix it:  
 * Download Microsoft Visual C++ Build Tools from https://visualstudio.microsoft.com/visual-cpp-build-tools/.
 * In the installer, select the "C++ build tools" option and install it.
 * Restart your Jupyter Notebook session.
 * Run the !pip install gymnasium[box2d] command again in a notebook cell.
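
After the platform-specific steps above, a quick check of which packages Python can actually find may save a failed run later. The sketch below uses only the standard library; the helper name `check_modules` is hypothetical, and the import names (`gymnasium`, `Box2D`, `pygame`) are assumed to be the ones installed by `gymnasium[box2d]`.

```python
import importlib.util


def check_modules(names=("gymnasium", "Box2D", "pygame")):
    """Return {module_name: bool} telling whether each top-level module can be found.

    The default names are an assumption about what gymnasium[box2d] installs.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report the status of each module without importing it.
for name, found in check_modules().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

A module reported as MISSING suggests the corresponding install step did not complete in the environment the notebook kernel is using.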

In [1]:
import gymnasium as gym
import numpy as np

## Human

In [3]:
# Try Lunar Lander as a human player
import pygame
import gymnasium.utils.play


# Note: Gymnasium 1.0 and later register this environment as "LunarLander-v3".
env = gym.make("LunarLander-v2", render_mode="rgb_array")


lunar_lander_keys = {
    (pygame.K_UP,): 2,
    (pygame.K_LEFT,): 1,
    (pygame.K_RIGHT,): 3,
}
gymnasium.utils.play.play(env, zoom=3, keys_to_action=lunar_lander_keys, noop=0)

## Agent


### Genetic Algorithm

In [None]:
import sys

sys.path.append("..")
import NEAT.genome_serde as gen_serde
import LunarLander.fitness as fit
import NEAT.feed_forward_nn as ffnn
from RNG.random import Rng
import LunarLander.controllers as ctrl
import numpy as np

best = gen_serde.deserialize_genome(
    "../output/winner.pkl"
)
fitness = []
rounds = 100
for r in range(rounds):
    with fit.create_environment(False) as env:
        observation, info = env.reset()
        nn = ffnn.FeedForwardNeuralNetwork.create_from_genome(best)
        controller = ctrl.AIController(env, nn, Rng(42), False, False)
        fitness_round = 0.0

        done = False
        while not done:
            action = controller.get_action(observation)
            observation, reward, terminated, truncated, info = env.step(action)
            fitness_round += reward
            done = terminated or truncated

        fitness.append(fitness_round)

print(np.max(fitness))
print(np.min(fitness))
print(np.mean(fitness))
print(np.std(fitness))

306.6054315589405
9.644317640446374
227.13774075100847
63.59033873074514
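
The four numbers above are the maximum, minimum, mean and standard deviation of the episode returns. A complementary view is the fraction of episodes that reach the 200-point mark that the Gymnasium documentation uses as the "solved" criterion for Lunar Lander. The sketch below is standard-library only; `summarize` is a hypothetical helper, and the sample returns are illustrative values, not results from this notebook.

```python
import statistics

# Gymnasium's Lunar Lander docs count an episode as a solution at 200+ points.
SOLVED_THRESHOLD = 200.0


def summarize(returns, threshold=SOLVED_THRESHOLD):
    """Summarize per-episode returns; std is the population std, like np.std."""
    if not returns:
        raise ValueError("returns must contain at least one episode")
    return {
        "max": max(returns),
        "min": min(returns),
        "mean": statistics.fmean(returns),
        "std": statistics.pstdev(returns),
        "solved_rate": sum(r >= threshold for r in returns) / len(returns),
    }


# Illustrative values only (not measured).
example_returns = [250.0, 150.0, 200.0, 100.0]
print(summarize(example_returns))
```

In the notebook, the same helper could be called as `summarize(fitness)` after the evaluation loop.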


In [14]:
import networkx as nx
from networkx.drawing.nx_pydot import read_dot
import matplotlib.pyplot as plt
import pygraphviz as pgv
import NEAT.formatter as fmtr

# Build the graph from the best genome (no .dot file is read here).
G = fmtr.ForAnimationFormatterFn(best).to_networkx_graph()
# Lay out the graph, one column per layer
pos = nx.multipartite_layout(G, subset_key="layer")

node_colors = {0: "#FBE96A", 1: "#C3BD92", 2: "#69E9FA"}
node_color_map = [node_colors[data["layer"]] for node_id, data in G.nodes(data=True)]
edge_color_map = ["green" if G[u][v]["weight"] > 0 else "red" for u, v in G.edges()]

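weights = [G[u][v]["weight"] for u, v in G.edges()]
abs_weights = [abs(w) for w in weights]
# max() raises ValueError on an empty sequence (the error recorded after this
# cell), so guard against a graph with no edges and against all-zero weights.
max_weight = max(abs_weights, default=0.0) or 1.0
widths = [5 * w / max_weight for w in abs_weights]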

# Draw nodes and edges
plt.figure(figsize=(15, 10))
nx.draw_networkx(
    G,
    pos,
    with_labels=True,
    node_color=node_color_map,
    edge_color=edge_color_map,
    width=widths,
)
plt.show()

ValueError: max() arg is an empty sequence

# Still want more?

Try controlling Flappy Bird: https://github.com/markub3327/flappy-bird-gymnasium

pip install flappy-bird-gymnasium

import flappy_bird_gymnasium  
env = gym.make("FlappyBird-v0")

State (12 variables):
  * the last pipe's horizontal position
  * the last top pipe's vertical position
  * the last bottom pipe's vertical position
  * the next pipe's horizontal position
  * the next top pipe's vertical position
  * the next bottom pipe's vertical position
  * the next next pipe's horizontal position
  * the next next top pipe's vertical position
  * the next next bottom pipe's vertical position
  * player's vertical position
  * player's vertical velocity
  * player's rotation

Actions:
  * 0 -> do nothing
  * 1 -> flap
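
As a starting point before training an agent, the state and actions listed above are enough for a simple hand-written baseline: flap whenever the bird is below the center of the next gap. This is a minimal sketch, not the package's own controller. It assumes the observation is a 12-element sequence in the order listed above and that the vertical axis grows downward (screen coordinates); the function name, the index constants and the `margin` parameter are hypothetical.

```python
# Indices follow the order of the state list (assumption, not checked against the package).
NEXT_TOP_Y = 4      # the next top pipe's vertical position
NEXT_BOTTOM_Y = 5   # the next bottom pipe's vertical position
PLAYER_Y = 9        # player's vertical position

NOOP, FLAP = 0, 1   # action ids as listed in the document


def heuristic_action(observation, y_grows_downward=True, margin=0.0):
    """Return FLAP when the player is below the center of the next gap, else NOOP."""
    gap_center = (observation[NEXT_TOP_Y] + observation[NEXT_BOTTOM_Y]) / 2.0
    player_y = observation[PLAYER_Y]
    if y_grows_downward:
        # Larger y means lower on the screen.
        below_gap = player_y > gap_center + margin
    else:
        below_gap = player_y < gap_center - margin
    return FLAP if below_gap else NOOP


# Synthetic observation: gap between 0.3 and 0.6, player at 0.8 (below the gap).
obs = [0.0] * 12
obs[NEXT_TOP_Y], obs[NEXT_BOTTOM_Y], obs[PLAYER_Y] = 0.3, 0.6, 0.8
print(heuristic_action(obs))
```

If the environment turns out to use the opposite vertical convention, passing `y_grows_downward=False` flips the comparison without touching the rest of the logic.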