small simple bug-fix: update lost infos in function create_dataset function #144

Merged
merged 8 commits into from Oct 10, 2023

Conversation

im-Kitsch
Contributor

Description

This is a simple bug fix for the function create_dataset_from_collector_env().

  1. Add docstring for parameters
  2. Add metadata attributes to the dataset. Metadata such as the algorithm name, author, and author email is not saved even though it is passed into the function.

For the function create_dataset_from_collector_env(), the ref_min/ref_max score is not saved to the metadata; the problem is easy to spot.

I think these are clearly bugs; please take a look.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have run pytest -v and no errors are present.
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@balisujohn
Collaborator

balisujohn commented Sep 11, 2023

Good catches. Thanks for your contribution! All these fixes look correct to me. I think it would be good to make the values for author_name, author_email, code_permalink only be added if the corresponding argument is not equal to None. Once that's done I will merge this PR after a final review. As for minari version as an argument, I'm wondering if we should make it non-optional, but we can sort that out in a future PR.
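
The "only add it if it is not None" idea can be sketched as follows. This is a minimal illustration rather than the code in this PR: the helper name build_optional_metadata is hypothetical, and the key names are assumed from the attributes mentioned in this conversation.

```python
from typing import Any, Dict, Optional


def build_optional_metadata(
    author: Optional[str] = None,
    author_email: Optional[str] = None,
    code_permalink: Optional[str] = None,
    algorithm_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Return only the optional metadata entries that were actually provided.

    Hypothetical helper; the key names are assumptions based on this thread.
    """
    candidates = {
        "author": author,
        "author_email": author_email,
        "code_permalink": code_permalink,
        "algorithm_name": algorithm_name,
    }
    # Drop every key whose value is None so it is never written as metadata.
    return {key: value for key, value in candidates.items() if value is not None}


metadata = build_optional_metadata(author="WillDudley", code_permalink=None)
print(metadata)
```

The resulting dictionary could then be written key by key into the dataset file's attributes, so that a missing argument simply leaves no attribute behind instead of storing None.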

@im-Kitsch
Contributor Author

im-Kitsch commented Sep 11, 2023

Hi,

I pushed an update that saves author_name, author_email, code_permalink and the algorithm name only when they are not None. Additionally, I changed the combine dataset code so that it can handle optional parameters that were not saved.

I agree that we need to consider whether the minari version should be optional or mandatory.

Furthermore, I think the dataset's structure should be clearly specified and kept stable.

E.g. here in combine_dataset, if author or author_email is not saved, it will throw an error.

Minari/minari/utils.py

Lines 277 to 290 in ea40978

combined_data_file.attrs.modify(
    "total_episodes", last_episode_id + dataset.total_episodes
)
combined_data_file.attrs.modify(
    "total_steps",
    combined_data_file.attrs["total_steps"] + dataset.spec.total_steps,
)
# TODO: list of authors, and emails
with h5py.File(dataset.spec.data_path, "r") as dataset_file:
    combined_data_file.attrs.modify("author", dataset_file.attrs["author"])
    combined_data_file.attrs.modify(
        "author_email", dataset_file.attrs["author_email"]
    )

There are probably still small errors like this one. I think they are caused by attributes being added over time without a stable specification.
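
One way to avoid the error described above is to copy an optional attribute only when the source actually has it. The sketch below is an assumption-laden illustration, not the merged implementation: copy_optional_attrs and OPTIONAL_KEYS are hypothetical names, and plain dictionaries stand in for the attribute collections of the HDF5 files.

```python
from typing import Any, Iterable, List, Mapping, MutableMapping

# Assumed set of optional keys, taken from the attributes named in this thread.
OPTIONAL_KEYS = ("author", "author_email", "code_permalink", "algorithm_name")


def copy_optional_attrs(
    source: Mapping[str, Any],
    target: MutableMapping[str, Any],
    keys: Iterable[str] = OPTIONAL_KEYS,
) -> List[str]:
    """Copy each key that exists in source into target and skip the rest.

    Returns the list of keys that were copied.
    """
    copied = []
    for key in keys:
        # Checking membership first avoids a KeyError for unsaved attributes.
        if key in source:
            target[key] = source[key]
            copied.append(key)
    return copied


# Plain dicts stand in for the source and combined file attributes.
source_attrs = {"author": "WillDudley", "total_steps": 100}
combined_attrs = {"total_steps": 250}
copied_keys = copy_optional_attrs(source_attrs, combined_attrs)
print(copied_keys, combined_attrs)
```

Because the function only relies on membership tests, item lookup, and item assignment, the same loop should also work on dict-like attribute objects, though that is an assumption to verify against the library in use.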

Collaborator

@balisujohn balisujohn left a comment

The code seems almost ready to merge. Please remove the commented-out lines, and add a basic test for the behavior when the optional dataset_metadata keys modified in this PR are provided when creating an instance of MinariDataset.

minari/utils.py Outdated
combined_data_file.attrs.modify(
    optional_parameter, dataset_file.attrs[optional_parameter]
)
# combined_data_file.attrs.modify("author", dataset_file.attrs["author"])
Collaborator

please remove commented out lines

@im-Kitsch
Contributor Author

Hi,
I added one test case for combine dataset and modified the create dataset test case. Could you take a look?

I think these tests should be revised again in the future, since currently this additional information cannot be obtained directly from a MinariDataset instance; it has to be read manually from the h5py file.

Comment on lines +277 to +279
assert dt_file.attrs["code_permalink"] == _final_code_link
assert dt_file.attrs["author"] == "WillDudley" + str(n_data - 1)
assert dt_file.attrs["author_email"] == "wdudley@farama.org" + str(n_data - 1)
Member

btw, this behavior is a little weird in combine dataset (you just save the attributes of the last one)
Anyway, it is fixed in #133, which we will merge afterwards, so it is okay for now

Contributor Author

Ah, yes, I don't think saving only the attributes of the last dataset is a good choice, but it is just to stay consistent with the combine_dataset() implementation. It will definitely change in the future, but this PR is only meant to save the forgotten metadata code_permalink and algorithm_name.

Member

@younik younik left a comment

Just fix the pre-commit, and then looks good to me

@im-Kitsch
Contributor Author

Just fix the pre-commit, and then looks good to me

Hi, I think the pre-commit is fixed now.

@younik younik merged commit c43a612 into Farama-Foundation:main Oct 10, 2023
10 checks passed