Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added SUM as aggregation type for custom statistics #4816

Merged
merged 21 commits into from
Jan 8, 2021

Conversation

brccabral
Copy link
Contributor

@brccabral brccabral commented Jan 3, 2021

Added SUM as aggregation type for custom statistics

@brccabral brccabral changed the title log number of rewards in cmd summary Added SUM as aggregation type for custom statistics Jan 3, 2021
@chriselion chriselion self-requested a review January 6, 2021 01:51
@chriselion chriselion self-assigned this Jan 6, 2021
Copy link
Contributor

@chriselion chriselion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think this is a good change and I can see how it would be useful. I have some feedback on the particular implementation though. There are two (mostly mutually exclusive) ways to go from here:

Option 1

Instead of renaming StatsSummary.mean, you could add a sum field, and also add an aggregation_method field. Then in the StatsWriter implementation (for example

self.summary_writers[category].add_scalar(f"{key}", value.mean, step)
) you can use aggregation_method to decide whether to use mean or sum.

You could also add an aggregated_value property to StatsSummary like

@property
def aggregated_value(self):
  return self.sum if aggregated_value == StatsAggregationMethod.SUM else self.mean

Option 2

Add a method to StatsReporter like increment_stat, and call that from record_environment_stats() when agg_type == SUM. That would look something like

    def increment_stat(self, key: str, value: float) -> None:
        with StatsReporter.lock:
            if StatsReporter.stats_dict[self.category][key]:
                # Add to the last stat. If we're always using increment_stat, this
                # should be the only element in the list. 
                StatsReporter.stats_dict[self.category][key][-1] += value
            else:
                # New list with the value as the only element
                StatsReporter.stats_dict[self.category][key] = [value]

(but make sure you test it 😃 )


I think I prefer the second approach because it's more consistent with the existing idea of "aggregating" in the StatsReporter and it doesn't require storing the AggregationMethod, but the first approach is more flexible (especially since we're planning to let users add their own StatsWriters soon, see the plugins PR).

Do you have a preference for either one?

@@ -35,6 +38,7 @@ def __init__(self) -> None:
def on_message_received(self, msg: IncomingMessage) -> None:
"""
Receive the message from the environment, and save it for later retrieval.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the extra whitespaces that were added in the docstrings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the extra spaces are necessary for PyCharm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try to find the settings to disable them; I use PyCharm as do several other team members, and this hasn't been a problem before. At a minimum, you need to revert files that only contain the whitespace changes:

  • ml-agents/mlagents/trainers/ppo/trainer.py
  • ml-agents/mlagents/trainers/torch/components/reward_providers/gail_reward_provider.py
  • ml-agents/mlagents/trainers/trainer/rl_trainer.py

On the other hand, if you know of a way to automate this from the command line (and enforce that the :param ... names match the args), I'd love that in another PR.

@@ -272,8 +272,8 @@ def test_agent_manager_stats():
manager.record_environment_stats(env_stats, worker_id=0)

expected_stats = {
"averaged": StatsSummary(mean=2.0, std=mock.ANY, num=2),
"most_recent": StatsSummary(mean=4.0, std=0.0, num=1),
"averaged": StatsSummary(stats_value=2.0, std=mock.ANY, num=2),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also extend this test to include StatsAggregationMethod.SUM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New commit to extend it

log_info.append(f"Std of Reward: {stats_summary.std:0.3f}")
log_info.append(f"Num of Reward: {stats_summary.num:0.3f}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you should add this; it's not useful for training.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it was. If you have a Decision Request script attached, the expected value may change.
In my case, MaxStep = 3000, DecisionRequest = 6, but I wasn't sure how this works at that time.
My custom stat was 0.3 every step.
0.3 * 3000 = 900
but instead I was getting 150, and I didn't know why.
Only after I added the "Num of Reward" I noticed that I was getting 500 Num.
That is because 3000 / 6 = 500.
So,
0.3 * 3000 / 6 = 150.
Because I didn't know how DecisionRequest worked, it took me a while to figure this out.
With the "Num of Rewards" I think it would have saved me a few hours.
Well, that was my experience, I won't mind if you want to remove it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd like you to remove it. You'll be able to add your own StatsWriter that gets called soon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this (and updated the tests)

@@ -75,17 +76,24 @@ def test_run_training(
learn.run_training(0, options)
mock_init.assert_called_once_with(
trainer_factory_mock.return_value,
"results/ppo",
os.path.join("results", "ppo"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a bad change, but does it need to happen in this PR? Does this test fail on a certain platform (Windows?) without it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it fails in Windows (I am using Windows)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'll make a note for us to run pytest on Windows too.

@brccabral
Copy link
Contributor Author

Hi @chriselion
In my commit "b7328ba" I did just as you mentioned in Option 1.

class StatsSummary(NamedTuple):
    mean: float
    std: float
    sum: float
    num: int
    aggregation: StatsAggregationMethod

    @staticmethod
    def empty() -> "StatsSummary":
        return StatsSummary(0.0, 0.0, 0.0, 0, StatsAggregationMethod.AVERAGE)

but I would have to rewrite all the writers classes (ConsoleWriter, GaugeWriter, TensorboardWriter) as they all have their own write_stats() method.
By renaming it to "stats_value", all three classes get the implementation without major changes.
Also, if the aggregation is SUM, there is no need to store AVG data in stats_dict[].
And, in the future, I could use other aggregations, MAX/MIN/PERCENTILE.
Would the class StatsSummary have more and more fields?

Option 2 would not give much flexibility to add other aggregations, we are writing our own implementations, but numpy already have them implemented.

@chriselion
Copy link
Contributor

I would have to rewrite all the writers classes (ConsoleWriter, GaugeWriter, TensorboardWriter) as they all have their own write_stats() method.

If you add the property I suggested, you wouldn't need to add any extra logic in the writers. You could even call the property stats_value.

Also, if the aggregation is SUM, there is no need to store AVG data in stats_dict[].

Maybe, but I'd rather not make the logic any more complicated.

And, in the future, I could use other aggregations, MAX/MIN/PERCENTILE. Would the class StatsSummary have more and more fields?

We can add them as we need them.

std=np.std(StatsReporter.stats_dict[self.category][key]),
num=len(StatsReporter.stats_dict[self.category][key]),
)
if (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not your fault, but this method is getting messy and should be cleaned up. Something like

stat_values = StatsReporter.stats_dict[self.category][key]
if len(stat_values) == 0:
    return StatsSummary.empty()

return StatsSummary(
    mean=np.mean(stat_values),
    sum=np.sum(stat_values),
    std=np.std(stat_values),
    num=len(stat_values),
    aggregation_method = StatsReporter.stats_aggregation[self.category][key],
)

should be a bit cleaner.

@brccabral
Copy link
Contributor Author

brccabral commented Jan 7, 2021

Hi @chriselion ,
I did the changes you requested, except the docstring, someone already have a tracking issue in jetbrains
https://youtrack.jetbrains.com/issue/PY-26281

This is what happens in my case:
image

@@ -95,8 +108,8 @@ def write_stats(
) -> None:
for val, stats_summary in values.items():
set_gauge(
GaugeWriter.sanitize_string(f"{category}.{val}.mean"),
float(stats_summary.mean),
GaugeWriter.sanitize_string(f"{category}.{val}.aggregated_value"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to break some of our internal processes (that live outside of this repo).

Can you make it:

set_gauge(GaugeWriter.sanitize_string(f"{category}.{val}.mean"), float(stats_summary.mean))
set_gauge(GaugeWriter.sanitize_string(f"{category}.{val}.sum"), float(stats_summary.sum))

instead?

@chriselion
Copy link
Contributor

Thanks, I think it's almost there! I left a few final comments but otherwise it looks pretty good.

Sorry to keep harping on the newlines, but I don't think a bug in pycharm's display is a worthwhile reason to change. I'm OK with them in the files that you have other changes in, but files where those are the only changes should be reverted (I can do this for you in git, as long as you have the setting on the PR to allow repo owners to push changes).

@brccabral
Copy link
Contributor Author

All done, @chriselion

self,
step: Union[
TerminalStep, DecisionStep
], # pylint: disable=unsubscriptable-object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -37,14 +40,23 @@ def _dict_to_str(param_dict: Dict[str, Any], num_tabs: int) -> str:
)


class StatsSummary(NamedTuple):
class StatsSummary(NamedTuple): # pylint: disable=inherit-non-class
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chriselion
Copy link
Contributor

Thanks, I made a few small cleanups and reverts. Will merge this when tests pass.

@chriselion chriselion merged commit f813287 into Unity-Technologies:master Jan 8, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 9, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants