make env optional arg while creating from buffers #137

avjmachine · 2023-08-31T18:45:00Z

Description

Make the env argument optional in the create_dataset_from_buffers() function in utils.

Fixes # (issue), Depends on # (pull request)

Type of change

Please delete options that are not relevant.

Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.
To upload images to a PR -- simply drag and drop or copy paste.

Checklist:

I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
I have run pytest -v and no errors are present.
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

balisujohn

Thanks for your work on this so far. I requested changes in two places. I also think there should be a new test which creates a dataset without providing an env argument, and makes sure that everything still works as intended. Once that is all done, I will review again.

minari/utils.py

balisujohn · 2023-10-09T07:28:38Z

Thanks for your work on this; it looks almost ready! I think it would be good to add a test to test_dataset_creation where a dataset is created from buffers without the env being set, and then also tries to load that dataset after it is created. Additionally, it looks like pre-commit is failing. Once those two things are done, it should be ready to merge.

minari/utils.py

younik · 2023-10-22T03:15:58Z

minari/utils.py

-            dataset.spec.env_spec.max_episode_steps for dataset in datasets_to_combine
+            dataset.spec.env_spec.max_episode_steps
+            for dataset in datasets_to_combine  # pyright: ignore[reportGeneralTypeIssues]


Actually, since the datasets can now have env_spec=None, we should also address this case.
You can create the list of env_spec and filter out None values.

As of now, I'm raising an error if any of the datasets to be combined have no env_spec. Please do let me know if you'd prefer some other specific action instead of this.
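The filtering suggestion above can be sketched as follows. This is only an illustration: `EnvSpecStub` and `combined_max_episode_steps` are hypothetical names, not Minari or Gymnasium API, and taking the maximum as the combined value is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EnvSpecStub:
    """Hypothetical stand-in for an environment spec (illustration only)."""

    id: str
    max_episode_steps: Optional[int] = None


def combined_max_episode_steps(
    env_specs: List[Optional[EnvSpecStub]],
) -> Optional[int]:
    """Return the largest max_episode_steps among the specs that define one."""
    steps = [
        spec.max_episode_steps
        for spec in env_specs
        # Skip datasets without a spec, and specs without a step limit.
        if spec is not None and spec.max_episode_steps is not None
    ]
    return max(steps) if steps else None
```

Filtering keeps the computation well defined when some entries are None; raising an error instead, as described in the reply above, is the stricter alternative.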

minari/utils.py

younik · 2023-10-22T03:22:41Z

minari/utils.py

    if observation_space is None:
-        observation_space = env.observation_space
+        observation_space = (
+            env.observation_space  # pyright: ignore[reportOptionalMemberAccess]


Why do you need this? There is no way to reach this point when env is None. Did you get errors from pre-commit?

Yes, I got errors from pre-commit.
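For context, one common way to satisfy a type checker in this situation without an ignore comment is an assert that narrows the Optional type. A minimal sketch, assuming an earlier check has already rejected the `env is None` case; `EnvStub` and `pick_observation_space` are hypothetical names used only for illustration.

```python
from typing import Optional


class EnvStub:
    """Hypothetical stand-in for gym.Env with just the attribute used here."""

    def __init__(self, observation_space: str) -> None:
        self.observation_space = observation_space


def pick_observation_space(
    env: Optional[EnvStub], observation_space: Optional[str]
) -> str:
    if observation_space is None:
        # An earlier check is assumed to guarantee env is set in this branch.
        # The assert documents that invariant and narrows Optional[EnvStub].
        assert env is not None, "env must be set when observation_space is omitted"
        observation_space = env.observation_space
    return observation_space
```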

minari/utils.py

tests/utils/test_dataset_combine.py

younik

I had to play around a bit to understand how to solve it, so I just pushed the code to your branch; it should be good now.
One last thing before merging: can you add a test for combining datasets without env_spec?

avjmachine · 2023-10-24T04:51:24Z

@younik Okay, so using assert you could handle the pyright optional member access issue! Thanks a lot for helping with that!
I'll start working on that test, but before that I have a question.

I see that the "validate_datasets_to_combine" has the check "Tests if the datasets were created with the same environment" at L184-186 in utils.py removed. Is this intentional? Don't we need to compare if all the datasets have exactly the same env_spec? Am I missing something here?

younik · 2023-10-24T05:33:20Z

minari/utils.py

+        env_spec = dataset.spec.env_spec
+        if env_spec is not None:
+            assert (
+                common_env_spec is not None


Oh my bad, you are right, thanks! Can you fix it? You cannot simply assert env_spec == common_env_spec though, as max_episode_steps may differ.

It would be nice to also add a test on combining datasets with different env_spec (an error is expected).

okay, sure. I'll add the fix and the test as well.
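A comparison that tolerates differing max_episode_steps could look roughly like this. It is a sketch under stated assumptions: `EnvSpecStub` is a hypothetical dataclass standing in for the real spec type, and `same_env_ignoring_steps` is an illustrative helper name.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnvSpecStub:
    """Hypothetical stand-in for an environment spec (illustration only)."""

    id: str
    max_episode_steps: Optional[int] = None
    reward_threshold: Optional[float] = None


def same_env_ignoring_steps(a: EnvSpecStub, b: EnvSpecStub) -> bool:
    """Compare two specs field by field, except for max_episode_steps."""
    b_copy = copy.deepcopy(b)  # never mutate the caller's spec
    b_copy.max_episode_steps = a.max_episode_steps
    return a == b_copy
```

Copying before overwriting the field keeps the original spec untouched, so the check has no side effects on the datasets being compared.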

I've added the fix. Please verify if this is fine. Also, I wanted to know if we could use assert in non-testing code, as I have read that assert statements are unsafe in production environments because they can be disabled when the Python interpreter runs in optimized mode (quoting one source here). They recommend using assert only in testing and debugging code, but I've continued to use assert here as our repo already has many asserts in non-testing code.

The test case and the pyright issue are still work in progress, though I've done some refactoring to make use of the dataset generation with no env in the combining dataset test cases too.

Edit: Just noticed that we need a separate for loop statement for updating the max_episode_steps after line 186. I missed that. Will correct it. But please do advise me on the assert statement.

minari/utils.py

younik · 2023-10-27T20:45:55Z

tests/common.py

+    if dataset.env_spec is None:
+        raise ValueError("Recovering environment is not possible when env_spec is None")


This is not necessary, I believe, as recover_environment will throw an error itself if env_spec is None.

removed the redundant raise error

younik · 2023-10-27T20:56:07Z

minari/utils.py

+        env_spec = dataset.spec.env_spec
+        if env_spec is not None:
+            assert (
+                common_env_spec is not None


Thanks for the fix! I left a couple of comments.
You are right about the assert. The difference is: when you believe something cannot happen, you can use assert (e.g. the asserts on lines 401 and 404 of utils.py, as there is the check above). When it is something that can happen, but you want to let the user know that it is not allowed, you should raise an error.
Unfortunately, in Minari we often use asserts as exceptions. I would be happy to merge another PR that addresses it.

Edit: Just noticed that we need a separate for loop statement for updating the max_episode_steps after line 186. I missed that. Will correct it. But please do advise me on the assert statement.

What do you mean? I don't see it.
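The assert-versus-exception distinction described above can be illustrated with a small sketch. `mean_reward` is a hypothetical function written for this example and is not part of Minari.

```python
from typing import Optional, Sequence


def mean_reward(rewards: Optional[Sequence[float]]) -> float:
    # A caller can really get this wrong, so report it with an exception.
    # Exceptions still fire when Python runs with -O, which strips asserts.
    if rewards is None or len(rewards) == 0:
        raise ValueError("rewards must be a non-empty sequence")
    total = sum(rewards)
    count = len(rewards)
    # This cannot fail given the check above; it only documents the invariant.
    assert count > 0
    return total / count
```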

avjmachine · 2023-10-30T04:14:38Z

@younik I'm currently working on the merge conflicts with the eval_env changes that got merged yesterday. I had one question regarding this. Could there be a scenario where there is an eval_env even when env is None? I mean, can a user provide an eval_env for a dataset generated with buffers without an env?

younik · 2023-10-30T05:39:46Z

Thank you, good point. I don't see any common case where this would happen, but also no reason to constrain it. I would simply do what is easier in code, which is no constraint, I guess.

avjmachine · 2023-10-30T13:36:14Z

minari/utils.py

+    if isinstance(env, (str, EnvSpec, type(None))) and (observation_space is None or action_space is None):
+        raise ValueError("Both observation space and action space must be provided, if env is str|EnvSpec|None")


I'm assuming here that action_space and observation_space cannot be extracted unless env is of type gym.Env.

Uhm, if it is str or EnvSpec, you can recover the environment with gym.make; I would do that

So, if it is a str or EnvSpec, there would be 2 steps to extract the action_space and observation_space?

1. recover env as gym.Env with gym.make;
2. recover action_space and observation_space from this gym.Env.

Is this correct?

Yes, exactly

implemented the above logic
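The two steps agreed on above can be sketched like this. To keep the example self-contained, the environment factory is passed in as a `make` argument, which in real code would presumably be gym.make; `EnvStub` and `resolve_spaces` are hypothetical names and not the code merged in this PR.

```python
from typing import Any, Callable, Optional, Tuple


class EnvStub:
    """Hypothetical stand-in for gym.Env (illustration only)."""

    def __init__(self, observation_space: Any, action_space: Any) -> None:
        self.observation_space = observation_space
        self.action_space = action_space


def resolve_spaces(
    env: Any,
    observation_space: Optional[Any],
    action_space: Optional[Any],
    make: Callable[[Any], EnvStub],
) -> Tuple[Any, Any]:
    # Nothing to recover when both spaces were given explicitly.
    if observation_space is not None and action_space is not None:
        return observation_space, action_space
    if env is None:
        raise ValueError(
            "Both observation space and action space must be provided if env is None"
        )
    if not isinstance(env, EnvStub):
        # Step 1: env is an id string or a spec, so build the environment.
        env = make(env)
    # Step 2: read whichever space is still missing from the environment.
    if observation_space is None:
        observation_space = env.observation_space
    if action_space is None:
        action_space = env.action_space
    return observation_space, action_space
```

With this shape, an error is needed only when env is None and a space is missing; a string or spec is enough to recover both spaces.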

younik · 2023-11-02T23:23:29Z

minari/utils.py

-    ), "The datasets to be combined have different values for `env_spec` attribute."
+    # checking equivalence of all datasets with an env
+    for dataset in datasets_to_combine:
+        if dataset.spec.env_spec is not None:


They are required to be not None; it shouldn't be possible to combine datasets where some have a None env_spec and some don't.

okay, I think this answers my question in the previous comment.
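The rule about mixing datasets with and without an env_spec can be sketched as a small check. `check_env_spec_presence` is a hypothetical helper, and allowing the case where every env_spec is None is an assumption made for this illustration.

```python
from typing import Optional, Sequence


def check_env_spec_presence(env_specs: Sequence[Optional[object]]) -> bool:
    """Return True if every dataset has an env_spec, False if none has one.

    Raises ValueError when only some of the datasets have an env_spec.
    """
    present = [spec is not None for spec in env_specs]
    if any(present) and not all(present):
        raise ValueError(
            "Cannot combine datasets when only some of them have an env_spec."
        )
    return all(present)
```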

younik · 2023-11-02T23:24:02Z

minari/utils.py

+            env_spec_copy = copy.deepcopy(dataset.spec.env_spec)
+            env_spec_copy.max_episode_steps = common_env_spec.max_episode_steps
+            if env_spec_copy != common_env_spec:
+                raise ValueError(
+                    "The datasets to be combined have different values for `env_spec` attribute."
+                )


can you just move this check in the other for loop and remove this one?

minari/utils.py

younik · 2023-11-02T23:32:11Z

minari/utils.py

    if eval_env is None:
-        warnings.warn(
-            f"`eval_env` is set to None. If another environment is intended to be used for evaluation please specify corresponding Gymnasium environment (gym.Env | gym.envs.registration.EnvSpec).\
+        if env_spec is not None:
+            warnings.warn(
+                f"`eval_env` is set to None. If another environment is intended to be used for evaluation please specify corresponding Gymnasium environment (gym.Env | gym.envs.registration.EnvSpec).\
                  If None the environment used to collect the data (`env={env_spec}`) will be used for this purpose.",
-            UserWarning,
-        )
+                UserWarning,
+            )
+        else:
+            warnings.warn(
+                "Since both `eval_env` and `env` used to collect the data are None, no environment can be used for evaluation",
+                UserWarning,
+            )


I would simplify this by having two independent checks: one for env_spec, with a warning if it was not declared, and the other for eval_env (without checking what env_spec is).

made the 2 checks independent
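Two independent checks of this kind might look like the sketch below. The function name `warn_missing_envs` and the warning texts are hypothetical and are not the wording used in Minari.

```python
import warnings
from typing import Optional


def warn_missing_envs(env_spec: Optional[object], eval_env: Optional[object]) -> None:
    # Check 1: looks only at env_spec.
    if env_spec is None:
        warnings.warn(
            "`env` is None, so the collection environment cannot be recovered.",
            UserWarning,
        )
    # Check 2: looks only at eval_env and does not depend on env_spec.
    if eval_env is None:
        warnings.warn(
            "`eval_env` is None, so no separate evaluation environment is stored.",
            UserWarning,
        )
```

Because neither check inspects the other argument, each warning can be read and tested on its own.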

younik · 2023-11-03T03:57:29Z

minari/utils.py

        eval_env (Optional[str|gym.Env|EnvSpec]): Gymnasium environment(gym.Env)/environment id(str)/environment spec(EnvSpec) to use for evaluation with the dataset. After loading the dataset, the environment can be recovered as follows: `MinariDataset.recover_environment(eval_env=True).
-                                                If None the `env` used to collect the buffer data should be used for evaluation.
+                                                If None, and if the `env` used to collect the buffer data is available, latter should be used for evaluation.


not related to your PR, but should -> will

younik

LGTM; thanks for implementing this!

younik · 2023-11-09T00:52:28Z

Actually, can you fix pre-commit @avjmachine

Looks like you should use env_spec in line 527 & 530 in utils, as env can be just a string.
For the documentation, it is not related to your PR, so you can ignore it

avjmachine · 2023-11-16T05:49:22Z

In line 524, we are overwriting the env variable if it's just a string, using "env = gym.make(env)". So, will this work, or is there a risk of env not having an observation_space or action_space attribute even after using gym.make?

younik · 2023-11-16T05:57:52Z

Yes, it is

No, there isn't; however, pyright is not able to infer it and throws an error.
Changing assert env is not None to assert isinstance(env, gym.Env) should work
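The isinstance-based assert suggested above can be sketched with stand-ins. `EnvStub` and `make_stub` are hypothetical replacements for gym.Env and gym.make, and whether a type checker actually needs the isinstance form depends on how the surrounding branches are written.

```python
from typing import Union


class EnvStub:
    """Hypothetical stand-in for gym.Env (illustration only)."""

    def __init__(self, observation_space: str) -> None:
        self.observation_space = observation_space


def make_stub(env_id: str) -> EnvStub:
    """Hypothetical stand-in for an environment factory such as gym.make."""
    return EnvStub(f"space-of-{env_id}")


def observation_space_of(env: Union[str, EnvStub, None]) -> str:
    if isinstance(env, str):
        env = make_stub(env)
    # `assert env is not None` rules out only None. Asserting the class
    # narrows the whole union to EnvStub before the attribute access.
    assert isinstance(env, EnvStub)
    return env.observation_space
```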

avjmachine · 2023-11-16T06:45:44Z

Done. I didn't get this pyright error on my local env. Hence, got confused.
Also, fixed additional pyright errors which I got while committing. You can review those changes. I changed int64 to integer for type check based on info here and here

old review

make env optional arg while creating from buffers

1a9c761

balisujohn previously requested changes Sep 1, 2023

Merge branch 'main' into env-optional-create-ds-buffers

28403a7

avjmachine force-pushed the env-optional-create-ds-buffers branch from b115336 to 28403a7 Compare September 18, 2023 02:14

resolve review comments on PR#137

3bd0795

raise error if obs space or action space not provided when env is none
remove unnecessary if conditions as obs & action space can't be none

Merge branch 'main' into env-optional-create-ds-buffers

3661142

younik reviewed Oct 12, 2023

avjmachine added 3 commits October 19, 2023 12:11

Merge branch 'main' into env-optional-create-ds-buffers

a4fdfa6

reduce complexity in raise error, ignore pyright

f49b89f

Merge branch 'main' into env-optional-create-ds-buffers

5835868

younik reviewed Oct 22, 2023

rodrigodelazcano mentioned this pull request Oct 22, 2023

Add evaluation environment specs to dataset metadata #155

Merged

7 tasks

avjmachine and others added 4 commits October 22, 2023 12:32

handle env_spec, scores data when env is None

2e12d14

make env_spec optional, handle cases when None

1b50d31

modify test cases when env, env_spec is None

20be8e8

fix pre-commit

5824961

younik force-pushed the env-optional-create-ds-buffers branch from 1a61d32 to 5824961 Compare October 23, 2023 03:53

younik reviewed Oct 23, 2023

add assert message

eeaa6bf

younik reviewed Oct 24, 2023

younik mentioned this pull request Oct 24, 2023

[Proposal] let Minari Dataset's attribute ref_min/max score be attribute of environment #141

Closed

avjmachine added 2 commits October 27, 2023 19:09

fix env_spec equality check for combining datasets

e4dd0ae

refactor generate test dataset w/o env into common

8b35aff

younik reviewed Oct 27, 2023

rodrigodelazcano mentioned this pull request Oct 29, 2023

Move metadata to JSON and add data_format #156

Merged

Merge branch 'main' into env-optional-create-ds-buffers

ec0d376

avjmachine commented Oct 30, 2023

avjmachine added 4 commits October 30, 2023 20:09

fix method to validate datasets to combine

d6566ec

remove redundant exception while recovering env when env_spec None

6e5a4c3

recover action & observation space through gym.make from env as str |…

993381e

… EnvSpec

correct wrong position of assert env not None to when obs & action sp…

7ce14bd

…ace None

younik reviewed Nov 3, 2023

avjmachine added 3 commits November 5, 2023 02:03

correct args description for eval_env in create_dataset_from_buffers …

938fdf7

…docstring

made warning checks for no env_spec and no eval_env_spec independent …

9ca7af1

…during metadata creation

fix and optimize validating combine datasets

34b8a0a

raise error when some datasets have env_spec, and others don't
combine two for-loops that check equality & update episodes for common_env_spec into one

younik approved these changes Nov 9, 2023

Merge branch 'main' into env-optional-create-ds-buffers

ca21ecd

avjmachine added 2 commits November 16, 2023 12:04

fix, bypass pyright precommit errors

f7d05ef

assert env is a gym env before accessing obs & action space, to avoid…

62d84a8

… pyright issues

younik added 3 commits November 16, 2023 11:36

update pre-commit

4d1c60e

add fix to pre-commit

90591f3

fix pre-commit

d6a8fd3

younik merged commit ff11da1 into Farama-Foundation:main Nov 17, 2023
6 checks passed
