Partially-observable taxi environment, with internal vectorization

DavidSlayback/gym-po-taxi

gym-po-taxi

Partially-observable taxi environment, with internal vectorization

Links to look at for own implementation

Gymnax: JAX implementations of classic control, bsuite, MinAtar, FourRooms, MetaMaze, PointRobot, and bandit environments. Supports the Podracer architecture

  • Most interesting environments are probably MemoryChain, FourRooms, MetaMaze, PointRobot

ROOMS and C-ROOMS: for reference

  • Velocity-based vs just position
  • Fixed layouts ahead of time. Random agent spawn. Fixed or set or random goal
  • Discrete action (8 or 4 cardinal directions) vs Continuous (2D)
    • Two forms of action failure: a 0.2 chance of taking a random action (cardinal) or flipping signs (continuous), and Gaussian movement noise with a standard deviation of 0.2
  • What to do for walls?
    • Discrete case is easy. Don't move.
    • Continuous case could be the same. Alternatively, draw the vector, stop right at wall.
  • Observation?
    • Non-continuous:
      • Fully observable: grid discrete state. Goal state if random?
      • Partially observable: 4D Hansen (adjacent), 8D Hansen, n×n grid
    • Continuous:
      • Fully observable: (x, y) coordinate; need (dx, dy) if velocity-based. Goal state if random?
      • Partially observable:
        • (x,y) w/o velocity, (x,y) downsampled to grid
        • 4/8D Hansen (0/1 walls in range 1M), 4/8D walls (distance of closest wall)
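
The design questions above (action failure, wall handling, Hansen observations) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ROOMS/C-ROOMS code and not this repository's implementation: the layout, the function names (`step_discrete`, `step_continuous`, `hansen_4d`), the `(row, col)` convention, and the reading of "flipping signs" as negating both action components are all hypothetical. The continuous step walks along the movement vector in small increments and keeps the last free point, so it stops just short of the wall rather than exactly at it.

```python
import math
import random

# Hypothetical fixed layout: '#' is a wall, '.' is free space.
LAYOUT = [
    "#####",
    "#..##",
    "#...#",
    "#####",
]

# Four cardinal moves as (d_row, d_col): up, right, down, left.
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def is_wall(layout, row, col):
    """Return True for wall cells; anything outside the layout counts as a wall."""
    if not (0 <= row < len(layout) and 0 <= col < len(layout[0])):
        return True
    return layout[row][col] == "#"


def step_discrete(layout, pos, action, rng, fail_prob=0.2):
    """Move one cell; with probability fail_prob a random action is taken instead."""
    if rng.random() < fail_prob:
        action = rng.randrange(len(MOVES))
    d_row, d_col = MOVES[action]
    new_pos = (pos[0] + d_row, pos[1] + d_col)
    # Discrete wall rule from the notes: a blocked move leaves the agent in place.
    return pos if is_wall(layout, *new_pos) else new_pos


def hansen_4d(layout, pos):
    """4D observation: 1 if the adjacent cell (up, right, down, left) is a wall."""
    return tuple(int(is_wall(layout, pos[0] + dr, pos[1] + dc)) for dr, dc in MOVES)


def step_continuous(layout, xy, action, rng, fail_prob=0.2, noise_std=0.2, n_sub=20):
    """Continuous (x, y) move with sign-flip failure and Gaussian noise (assumed reading)."""
    dx, dy = action
    if rng.random() < fail_prob:
        dx, dy = -dx, -dy  # assumption: "flipping signs" negates both components
    dx += rng.gauss(0.0, noise_std)
    dy += rng.gauss(0.0, noise_std)
    x, y = xy
    # Walk along the movement vector in n_sub increments; keep the last free point.
    for i in range(1, n_sub + 1):
        nx = xy[0] + dx * i / n_sub
        ny = xy[1] + dy * i / n_sub
        if is_wall(layout, math.floor(ny), math.floor(nx)):
            break
        x, y = nx, ny
    return (x, y)
```

For example, `hansen_4d(LAYOUT, (1, 1))` reports walls above and to the left of the agent, and `step_discrete(LAYOUT, (1, 1), 0, random.Random(0), fail_prob=0.0)` leaves the agent in place because the cell above is a wall. An 8D variant would add the four diagonal offsets to `MOVES`.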

Pocman/Pacman: Fully/partially-observable pocman from POMCP

Battleship: Partially observable battleship

Rocksample: Also has battleship

Isaacverse: GPU physics control

Mo-Gym: Multi-objective. Fancy FourRooms, Reacher with more objectives

gym-sokoban: pixel-based though...

CARL: Context-adaptive RL, reconfigure envs (Mario, Brax, control)

highway-env: Must infer behaviors of others

Other

  • SpaceRobot: Non-actuated base space robot
  • Learn2Race: Needs GPU. Eh...
  • tmrl: TrackMania racing, 19-D LIDAR option
  • ShinRL: Future reference, interesting
