Allow computing emissions and dumping results periodically for other output modes #448

Open
miquelmarti opened this issue Sep 7, 2023 · 1 comment
Labels
enhancement New feature or request P2 Priority 2

Comments

@miquelmarti

Currently, in the CSV output mode and the other output modes, emissions are computed and results persisted only at the end of the run. I would rather have the results dumped to disk periodically, in the same way that the API output mode computes emissions and persists partial results periodically:

    if (
        self._cc_api__out is not None or self._cc_prometheus_out is not None
    ) and self._api_call_interval != -1:
        if self._measure_occurrence >= self._api_call_interval:
            emissions = self._prepare_emissions_data(delta=True)
            logger.info(
                f"{emissions.emissions_rate * 1000:.6f} g.CO2eq/s mean an estimation of "
                + f"{emissions.emissions_rate*3600*24*365:,} kg.CO2eq/year"
            )
            if self._cc_api__out:
                self._cc_api__out.out(emissions)
            if self._cc_prometheus_out:
                self._cc_prometheus_out.out(emissions)
            self._measure_occurrence = 0

This way:

  • If a run crashes unexpectedly, partial results are still saved
  • I can visualize the outputs during training
  • I can monitor emissions by reading the outputs and, for example, preempt runs if emissions increase

I think the easiest way would be to allow configuring which output modes should appear in the if statements in the lines above, but ideally one could configure different rates for different outputs.
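The "different rates for different outputs" idea could look roughly like the sketch below. It is not codecarbon code: `PeriodicOutput` and `on_measure` are hypothetical names, and the only thing borrowed from the snippet above is the convention that an output handler is called with the prepared emissions data and that an interval of -1 disables periodic output.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class PeriodicOutput:
    """Hypothetical pairing of an output handler with its own interval."""

    out: Callable[[Any], None]  # e.g. the `out(emissions)` method of a handler
    interval: int  # emit every `interval` measurements; -1 disables
    _count: int = field(default=0, init=False, repr=False)

    def tick(self, prepare: Callable[[], Any]) -> bool:
        """Count one measurement; emit and reset once the interval is reached."""
        if self.interval == -1:
            return False
        self._count += 1
        if self._count >= self.interval:
            self.out(prepare())
            self._count = 0
            return True
        return False


def on_measure(outputs: List[PeriodicOutput], prepare: Callable[[], Any]) -> None:
    """Call once per measurement; each output decides on its own when to emit."""
    for output in outputs:
        output.tick(prepare)


# Demo with plain lists standing in for a CSV handler and an API handler.
csv_rows, api_rows = [], []
outputs = [
    PeriodicOutput(out=csv_rows.append, interval=2),
    PeriodicOutput(out=api_rows.append, interval=5),
]
for step in range(1, 11):
    on_measure(outputs, prepare=lambda: {"step": step})
```

One caveat with this design: the snippet above prepares emissions with `delta=True`, so if each output ran at its own rate, each would presumably need its own baseline for the delta. The sketch sidesteps that by treating `prepare` as an opaque callable.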

@miquelmarti changed the title from "Allow periodic flush of results for other output modes than API" to "Allow computing emissions and dumping results periodically for other output modes" on Sep 8, 2023
@benoit-cty benoit-cty added the enhancement New feature or request label Sep 13, 2023
@benoit-cty
Contributor

Hello,

Yes, it would be nice to write the same data to a CSV as to the API.

We have a flush() method that acts more like a checkpoint: it does not reset the data, it only stores it, to avoid losing it if there is a crash and to give a view of progress when training for a very long time.

See #438 (comment)
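As a rough illustration of using flush() as a periodic checkpoint from a training loop, here is a minimal sketch. It assumes only that the tracker object exposes a `flush()` method, as described above. `PeriodicFlusher` and `FakeTracker` are hypothetical names, and the fake clock is there so the example runs instantly.

```python
import time


class PeriodicFlusher:
    """Hypothetical helper: calls `tracker.flush()` at most once per `every_s` seconds."""

    def __init__(self, tracker, every_s: float, clock=time.monotonic):
        self._tracker = tracker
        self._every_s = every_s
        self._clock = clock
        self._last = clock()

    def maybe_flush(self) -> bool:
        """Flush if at least `every_s` seconds have passed since the last flush."""
        now = self._clock()
        if now - self._last >= self._every_s:
            self._tracker.flush()
            self._last = now
            return True
        return False


class FakeTracker:
    """Stand-in for a real tracker; it only counts flush() calls."""

    def __init__(self):
        self.flushes = 0

    def flush(self):
        self.flushes += 1


fake_time = [0.0]
tracker = FakeTracker()
flusher = PeriodicFlusher(tracker, every_s=10.0, clock=lambda: fake_time[0])
for step in range(30):
    fake_time[0] += 1.0  # pretend each training step takes one second
    flusher.maybe_flush()
```

In a real run the stand-in would be replaced by the actual tracker and the default monotonic clock, with `maybe_flush()` called once per training step or batch.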

There is also an on_csv_write parameter:

  • update: overwrite the existing run_id row (erasing former data)
  • append: add a new row to the CSV file (default)
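To make the difference between the two modes concrete, here is a small standalone sketch of the semantics as described in the list above. It is not codecarbon's implementation: `write_row`, `FIELDS` and the two-column layout are assumptions made for illustration, and the real CSV has more columns.

```python
import csv
import os
import tempfile

FIELDS = ["run_id", "emissions"]  # assumed minimal layout for the example


def write_row(path, row, mode="append"):
    """Write `row` to the CSV at `path` and return all rows now in the file."""
    rows = []
    if os.path.exists(path):
        with open(path, newline="") as f:
            rows = list(csv.DictReader(f))
    if mode == "update":
        # Drop any earlier row for this run_id, so the new one replaces it.
        rows = [r for r in rows if r["run_id"] != row["run_id"]]
    elif mode != "append":
        raise ValueError(f"unknown mode: {mode!r}")
    rows.append({key: str(value) for key, value in row.items()})
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows


# Three periodic checkpoints of the same run, written in each mode.
with tempfile.TemporaryDirectory() as tmp:
    appended = os.path.join(tmp, "append.csv")
    updated = os.path.join(tmp, "update.csv")
    for value in (0.1, 0.2, 0.3):
        row = {"run_id": "run-1", "emissions": value}
        n_append = len(write_row(appended, row, mode="append"))
        final_update = write_row(updated, row, mode="update")
```

With periodic checkpoints, append keeps a history of every checkpoint, while update leaves a single row per run holding the latest values.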

So it seems we already have all the parts needed for what you ask. Feel free to propose a PR.

@benoit-cty benoit-cty added the P2 Priority 2 label Mar 10, 2024