Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Robust handling of inconsistent TabularInput keys #45

Closed
naeioi opened this issue Mar 11, 2020 · 1 comment
Closed

Robust handling of inconsistent TabularInput keys #45

naeioi opened this issue Mar 11, 2020 · 1 comment

Comments

@naeioi
Copy link
Member

naeioi commented Mar 11, 2020

Currently, CsvOutput emits a warning if the keys of a TabularInput change after the first call to logger.log(TabularInput). A new key not seen before will be ignored and an old key not presented will be left blank. In other words, CsvOutput conservatively handles dynamic fieldnames.

This behaviour of CsvOutput makes it tricky to log performance of Multi- and Meta- ML algorithms, where there are usually per-task fields but not every task is presented in every iteration, resulting in missing of logs for some tasks.

The desired behaviour to handle inconsistent keys should be

  • When a new key is encountered
    • Expand header with the new key.
    • Expand old rows with empty cells for the new key.
  • If the value of any key is missing, leave the cell blank.
@avnishn
Copy link
Member

avnishn commented May 19, 2020

Introduction

Dowel is a tool that the garage Team uses for logging results from our various Reinforcement learning experiments.

Dowel can be used to log different types of data such as floats or strings. The logs can be logged to stdout (the console), CSV files, and Tensorboard.

You can check out an example of how Dowel is used here. In fact, almost all parts of the Dowel API are used in this example.

The problem

After statistics such as loss have been logged, and a call to logger.dump_all() is made for the first time, new tabular data can’t be written to a CSV output. This is because currently data cannot be inconsistently logged to CSV, meaning that on every single call to dump_all, the same logger keys must appear. Data that is inconsistently logged will not appear in the CSV output. This is a design flaw that we have been able to work around but affects our workflows.

Your goal is to solve the problem as well as introduce tests into our testing framework in order to verify your solution.

Some General Instructions

  1. Fork Dowel and install all necessary dependencies.
  2. Take a look at this toy example which when run exposes the bug and the accompanying issue mentioned above.
  3. When you have finished writing your solution and tests, upload a PR onto your fork, not onto the upstream repository.
  4. When you are done email us back with the link to your pull request.

If you have any questions, open an issue in your fork, and tag @avnishn and @zequnyu. Our preferred mode of communication on any questions that you have is through github issues and pull requests, as this is how the Garage team communicates generally. For this reason, we won’t respond to any direct emails with regards to help with your project. We will however respond to any other questions that you have via email (interview scheduling, etc).

Best of luck, and let us know if there are any issues as early on as possible

suprememichael added a commit to suprememichael/dowel that referenced this issue May 20, 2020
terickson87 added a commit to terickson87/dowel that referenced this issue May 20, 2020
Adhyyan1252 added a commit to Adhyyan1252/dowel that referenced this issue May 21, 2020
Before this commit, adding a new key in the tabular type
after the first call would lead to a warning.

Now, if a new key is added after the initial call
CSV_output will rewrite the CSV file to include
additional columns corresponding to the new keys.

This allows for dynamic keys when using Tabulars.

Resolves rlworkgroup#45
nicolengsy added a commit to nicolengsy/dowel that referenced this issue May 22, 2020
GuanyangLuo added a commit to GuanyangLuo/dowel that referenced this issue Jun 2, 2020
This commit adds robust handling of inconsistent TabularInput keys,
with two implementations. The current behavior of ignoring new keys is
kept as the default, but the users can now optionally specify how to
record new keys and the corresponding values. (They must consider the
trade-off between the two implementations.)
@avnishn avnishn closed this as completed Jan 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants