env.step(env.action_space.sample()) vs env.pos #27

MegaCreater · 2021-03-20T16:36:56Z

Hello,
I am very new to this. I want to use TensorFlow-agents to train models, as I don't know much about stable-baselines3. I created an environment using TakeoffAviary.py.
```python
>>> from gym_pybullet_drones.envs.single_agent_rl.TakeoffAviary import TakeoffAviary
>>> env = TakeoffAviary(gui=False)
>>> env.step(env.action_space.sample())  # returns (obs, reward, done, info)
(array([-4.51520228e-08, -3.96662917e-08, 2.25000240e-02, 3.55740654e-04,
-6.93952996e-04, -6.59155518e-05, -3.26110701e-05, -2.43708535e-05,
-4.18719402e-05, 4.62792489e-01, -8.66369966e-01, -1.87686425e-01]), -0.8888879398403151, False, {'answer': 42})
>>> env.pos
array([[-6.77280342e-07, -5.94994376e-07, 1.12500120e-01]])
>>> env.rpy
array([[ 0.00111759, -0.00218012, -0.00020708]])
>>> env.ang_v
array([[ 0.06450283, -0.12075242, -0.02615925]])
>>> env.vel
array([[-9.78332104e-05, -7.31125605e-05, -1.25615821e-04]])
```

Given obs, reward, done, info = env.step(env.action_space.sample()), why is obs[:3] != env.pos[0] and obs[3:6] != env.rpy[0], and likewise for the velocity and angular velocity? Is the obs space Box(12,), i.e. x, y, z, r, p, y, vx, vy, vz, wx, wy, wz respectively, in the world frame? What do env.pos, env.rpy, env.ang_v, and env.vel return?

JacopoPan · 2021-03-20T16:56:00Z

Hi @MegaCreater,

I should double-check your numbers and slicing, but one definite reason why the obs in TakeoffAviary (and all other env classes meant for RL) does not contain the raw, world-frame values is that it is clipped and normalized to the [-1, +1] range:

gym-pybullet-drones/gym_pybullet_drones/envs/single_agent_rl/TakeoffAviary.py

Line 114 in 7688e72

def _clipAndNormalizeState(self,

You can use CtrlAviary for unbounded, raw obs.
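
As a rough illustration of what clipping and normalizing does to a position, here is a minimal sketch. The bounds `MAX_XY` and `MAX_Z` and the helper `clip_and_normalize_position` are hypothetical, chosen only for this example; they are not the constants or the method used in TakeoffAviary.

```python
import numpy as np

# Hypothetical bounds, for illustration only (not the values used by TakeoffAviary).
MAX_XY = 10.0  # x and y are clipped to [-MAX_XY, MAX_XY]
MAX_Z = 2.0    # z is clipped to [0, MAX_Z]


def clip_and_normalize_position(pos):
    """Clip a raw (x, y, z) position to the bounds, then divide by the bounds."""
    xy = np.clip(pos[:2], -MAX_XY, MAX_XY) / MAX_XY  # ends up in [-1, 1]
    z = np.clip(pos[2], 0.0, MAX_Z) / MAX_Z          # ends up in [0, 1]
    return np.array([xy[0], xy[1], z])


raw = np.array([0.5, -20.0, 1.0])  # y lies outside the bounds, so it gets clipped
normalized = clip_and_normalize_position(raw)
print(normalized)
```

With bounds like these, a raw position and its normalized counterpart differ by a constant factor whenever the raw value is inside the bounds, which is why the two arrays are not expected to match numerically.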

MegaCreater · 2021-03-20T17:04:38Z

Hi @JacopoPan ,
I understand why the values are between -1 and 1 (or 0 and 1 for z); that is not the issue. What I want to know is why the numerical values of obs returned by obs, reward, done, info = env.step(env.action_space.sample()) are not equal to the env.pos, env.vel, or env.ang_v attributes.

JacopoPan · 2021-03-20T17:23:52Z

Because kinematic information is retrieved from PyBullet in:

gym-pybullet-drones/gym_pybullet_drones/envs/BaseAviary.py

Line 503 in 7688e72

def _updateAndStoreKinematicInformation(self):

Which is called at every step:

gym-pybullet-drones/gym_pybullet_drones/envs/BaseAviary.py

Line 272 in 7688e72

def step(self,

And those values are used to build a state:

gym-pybullet-drones/gym_pybullet_drones/envs/BaseAviary.py

Line 536 in 7688e72

def _getDroneStateVector(self,

But then that state is clipped to max, min values and divided by ranges in:

gym-pybullet-drones/gym_pybullet_drones/envs/single_agent_rl/BaseSingleAgentAviary.py

Line 355 in 7688e72

obs = self._clipAndNormalizeState(self._getDroneStateVector(0))

Before being returned as an obs.
If you spot any numerical mistake, please let me know, but I wouldn't expect obs and env.pos to contain the same values.
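
One way to check the explanation against the numbers in the first post is to divide each raw attribute by the matching slice of obs. The sketch below uses only the values posted above and plain NumPy. The posted numbers come out at roughly 15 for x and y, 5 for z, π for roll, pitch, and yaw, and 3 for the linear velocities, while obs[9:12] matches env.ang_v scaled to unit length. These factors are inferred from this one sample only and should be checked against `_clipAndNormalizeState` before relying on them.

```python
import numpy as np

# Values copied from the first post in this thread.
obs = np.array([-4.51520228e-08, -3.96662917e-08, 2.25000240e-02, 3.55740654e-04,
                -6.93952996e-04, -6.59155518e-05, -3.26110701e-05, -2.43708535e-05,
                -4.18719402e-05, 4.62792489e-01, -8.66369966e-01, -1.87686425e-01])
pos = np.array([-6.77280342e-07, -5.94994376e-07, 1.12500120e-01])    # env.pos[0]
rpy = np.array([0.00111759, -0.00218012, -0.00020708])                # env.rpy[0]
vel = np.array([-9.78332104e-05, -7.31125605e-05, -1.25615821e-04])   # env.vel[0]
ang_v = np.array([0.06450283, -0.12075242, -0.02615925])              # env.ang_v[0]

# Ratio of raw value to normalized value, per block of the observation.
pos_ratio = pos / obs[0:3]
rpy_ratio = rpy / obs[3:6]
vel_ratio = vel / obs[6:9]

# The angular velocity block looks like the raw vector divided by its own norm.
ang_v_unit = ang_v / np.linalg.norm(ang_v)

print("pos / obs[0:3]:", pos_ratio)
print("rpy / obs[3:6]:", rpy_ratio)
print("vel / obs[6:9]:", vel_ratio)
print("ang_v / |ang_v|:", ang_v_unit)
print("obs[9:12]:      ", obs[9:12])
```

Because every value in this sample is far inside its bound, no clipping takes effect and each block is a pure rescaling of the raw attribute.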

MegaCreater · 2021-03-21T02:37:30Z

Thanks @JacopoPan,
for your fast and detailed explanation. I got it: obs returns normalised values, whereas env.pos returns the original values. Is there any way to get the max(x, y, z) and min(x, y, z) in original (not normalised) units?
If you want, I can make Google Colaboratory examples for your example (non-GUI) and for TensorFlow as well. (With gui=True it shows the error "cannot connect to server x", the same as for cv2.imshow. We would just have to make changes or add an attribute like gui_show=False, so that it doesn't show the GUI but one can still record a video that can later be visualised in Google Colab.)

JacopoPan added the question label on Mar 20, 2021

MegaCreater closed this as completed Mar 21, 2021
