Remove latest observation from CommonRLEnv POMDP state #516

johannes-fischer · 2023-08-23T14:29:54Z

I find it inconsistent that state(::MDPCommonRLEnv) returns the state (converted to type RLO) but state(::POMDPCommonRLEnv) returns a state-observation tuple, without converting anything. Since the observation is not really part of the state, I think it makes sense to not provide it as part of state(::POMDPCommonRLEnv). Additionally, the state should also be optionally converted to another type, just like the observation.

This PR implements the proposed changes.

Add RLS parameter for optional state type conversion. Return only converted state instead of state-observation tuple in `RL.state()`
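To make the proposed change concrete, here is a minimal sketch that uses a toy struct rather than the actual `POMDPCommonRLEnv`. The names `ToyPOMDPEnv`, `old_state`, and `new_state` are hypothetical, and the integer state and observation fields are assumptions chosen for illustration only.

```julia
# Toy stand-in for a POMDP-backed RL environment (NOT the POMDPs.jl type).
# RLS and RLO are the target types for converted states and observations.
mutable struct ToyPOMDPEnv{RLS, RLO}
    s::Int   # true underlying state
    o::Int   # latest observation, cached inside the env
end

# Behavior described as inconsistent: a state-observation tuple, unconverted.
old_state(env::ToyPOMDPEnv) = (env.s, env.o)

# Proposed behavior: only the state, converted to the RLS type.
new_state(env::ToyPOMDPEnv{RLS}) where {RLS} = convert(RLS, env.s)

env = ToyPOMDPEnv{Float64, Float64}(3, 4)
old_state(env)   # (3, 4)
new_state(env)   # 3.0
```

The observation stays cached in the struct in both cases; only what the state accessor exposes changes.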

zsunberg · 2023-08-24T19:57:00Z

The original reason for including the observation in the env state is this: if the observation is not included, someone can call observe(env) multiple times and get different results, and by calling observe many times could get a better estimate of the state than is actually "allowed" by the POMDP.
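The concern can be illustrated with a small self-contained sketch. It assumes a hypothetical Gaussian observation model (`sample_obs`) and a toy environment (`CachedObsEnv`); neither is part of POMDPs.jl or CommonRLInterface.

```julia
using Random

# Toy environment: hidden state plus the single observation drawn for it.
mutable struct CachedObsEnv
    s::Float64   # hidden state
    o::Float64   # cached observation
end

# Hypothetical noisy observation model: state plus unit Gaussian noise.
sample_obs(rng, s) = s + randn(rng)

# Cached variant: every call returns the same stored observation.
observe_cached(env::CachedObsEnv) = env.o

# Resampling variant: every call draws fresh noise.
observe_resampled(rng, env::CachedObsEnv) = sample_obs(rng, env.s)

rng = MersenneTwister(1)
env = CachedObsEnv(2.0, sample_obs(rng, 2.0))

# Repeated calls to the cached variant reveal nothing new.
@assert observe_cached(env) == observe_cached(env)

# Averaging many resampled observations recovers the hidden state closely.
n = 10_000
est = sum(observe_resampled(rng, env) for _ in 1:n) / n
```

With resampling, `est` ends up very close to the hidden state `2.0`, which is more information than a single observation is supposed to provide.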

I suppose that perhaps this is more confusing than beneficial though.

Does hearing the original reason change your opinion at all @johannes-fischer ?

The best path forward would probably be to make it an option.

johannes-fischer · 2023-08-25T08:49:50Z

I get the reason for including the observation in the POMDPCommonRLEnv struct, which I left unchanged. This allows returning the same observation for subsequent calls to observe(env). However, this does not require also returning the observation as part of state(env), does it?

With this PR, subsequent calls to observe will still return the same observation. Getting different observations for the same env state can only happen with:

observe(env)
s = state(env)
setstate!(env, s) # calls `initialobs`
observe(env)

I don't think this will be called accidentally, so I would not worry about this potential state information leak. The same information leak is accessible in the current implementation by using o = rand(initialobs(env.m, state(env)[1])).
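The round trip described above can be sketched with a toy environment. All names here (`ToyEnv`, `toy_observe`, `toy_state`, `toy_setstate!`, `toy_initialobs`) are hypothetical stand-ins, and the Gaussian observation noise is an assumption; this is not the package's implementation.

```julia
using Random

mutable struct ToyEnv
    s::Int       # hidden state
    o::Float64   # cached observation
end

# Hypothetical initial-observation model: state plus Gaussian noise.
toy_initialobs(rng, s) = s + randn(rng)

toy_observe(env::ToyEnv) = env.o   # returns the cached observation
toy_state(env::ToyEnv) = env.s     # state only, no observation attached

# Setting the state has no observation to restore, so a new one is drawn.
function toy_setstate!(rng, env::ToyEnv, s)
    env.s = s
    env.o = toy_initialobs(rng, s)
    return env
end

rng = MersenneTwister(42)
env = ToyEnv(1, toy_initialobs(rng, 1))

o1 = toy_observe(env)
@assert toy_observe(env) == o1   # plain repeated calls stay consistent

s = toy_state(env)
toy_setstate!(rng, env, s)       # explicit round trip redraws the observation
o2 = toy_observe(env)            # generally differs from o1
```

Only the explicit state round trip produces a second observation, which is why the leak is unlikely to be triggered by accident.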

zsunberg · 2023-09-09T22:29:15Z

Sorry for the delay - the semester started up last week.

I see what you're saying now and you're definitely right. Thanks for the fix! The only thing I changed is that you no longer need CommonRLInterface.@provide. Will merge once the tests pass!

johannes-fischer · 2023-09-10T12:50:27Z

No worries, sounds good!

johannes-fischer added 3 commits August 23, 2023 16:17

Change state representation in POMDPCommonRLEnv

ec1d1cf

Add RLS parameter for optional state type conversion. Return only converted state instead of state-observation tuple in `RL.state()`

Add convenience conversions in CommonRLEnv

b58909c

Adapt test

8006b6b

johannes-fischer force-pushed the common_rl_pomdp_state branch from f934c7c to 8006b6b Compare August 23, 2023 14:55

johannes-fischer changed the title ~~Remove last observation from CommonRLEnv POMDP state~~ Remove latest observation from CommonRLEnv POMDP state Aug 23, 2023

removed RL.@provide

e53cc83

zsunberg merged commit 851d26e into JuliaPOMDP:master Sep 11, 2023
4 checks passed

johannes-fischer deleted the common_rl_pomdp_state branch September 21, 2023 12:26
