-
Notifications
You must be signed in to change notification settings - Fork 19
Add external data logging selection #176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add external data logging selection #176
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces selective external data logging functionality, allowing users to control which external data channels are written to HDF5 output files while keeping all channels available to controllers during simulation.
Key Changes:
- New
external_datadictionary format withlog_channelsparameter to selectively log external signals - Backward compatibility for the deprecated top-level
external_data_fileformat (with deprecation warning) - New example (Example 05) demonstrating LMP-based battery control with selective logging
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
hercules/utilities.py |
Adds validation and normalization for new external_data structure with backward compatibility handling |
hercules/hercules_model.py |
Implements selective logging logic based on log_channels configuration |
tests/hercules_model_test.py |
Adds comprehensive test coverage for new selective logging feature |
hercules/grid/grid_utilities.py |
Code formatting improvements (spacing around operators) |
tests/grid_utilities_test.py |
Code formatting improvements (spacing and line continuations) |
examples/05_wind_and_storage_with_lmp/ |
New example demonstrating LMP-based control with selective external data logging |
docs/hercules_input.md |
Documents new external data configuration format and selective logging feature |
docs/h_dict.md |
Updates h_dict documentation with external_data structure details |
docs/examples_overview.md |
Adds new example to the examples list |
docs/_toc.yml |
Includes new example in documentation table of contents |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…rcules into pr/paulf81/176
misi9170
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @paulf81 , This mostly looks good but I've left a handful of minor documentation-type comments. I think once those are addressed (or if you're happy how they are, since none of them are really critical, this can be merged).
docs/h_dict.md
Outdated
| ### External Data (`external_data`) | ||
| | Key | Type | Description | Default | | ||
| |-----|------|-------------|---------| | ||
| | `external_data_file` | str | Path to CSV file with external time series data | Required | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this Required? Some simulations don't need external data files? Perhaps I'm not fully understanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subtle, it is required if you say there is external_data (include this section), better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, interesting. So, having the top-level external_data field with nothing in it is not allowed, is that correct? I wonder if there is a protocol here; it seems that there might be cases when it's nice to be able to have "empty" top levels (say, an empty wind_farm), just as placeholders; but if I'm understanding correctly, that is not currently supported. In that case, I see your point, and I suppose this is fine (Required if external_data is provided at all)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also allow that having this blank is ``external_data_fileis equivalent to havingexternal_data` blank
docs/hercules_input.md
Outdated
|
|
||
| The CSV file must contain: | ||
| - A `time_utc` column with UTC timestamps in ISO 8601 format | ||
| - One or more data columns with external signals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, we could say that the names of the channels other than time_utc are arbitrary---anything will be carried forward and interpolated (so, also perhaps worth saying they all need to be floats, since they are going to be interpolated). Similarly, the channels to log are arbitrary. However, some controllers and plotting utilities that work on external signals may require certain names for channels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a shot!
| lmp_data_path = Path("external_data_lmp.csv") | ||
| if not lmp_data_path.exists(): | ||
| print("Generating LMP data file...") | ||
| # Create 4 hours of data at 5-minute intervals | ||
| # Start time matching the example: 2024-06-24T16:59:08Z | ||
| start_time = pd.Timestamp("2024-06-24T16:59:08Z") | ||
| # 4 hours = 240 minutes, 5-minute intervals = 49 time points (0, 5, 10, ..., 240) | ||
| time_points = pd.date_range(start=start_time, periods=49, freq="5min") | ||
|
|
||
| # Create the dataframe | ||
| df = pd.DataFrame( | ||
| { | ||
| "time_utc": time_points, | ||
| "lmp_da": np.full(49, 10.0), # Constant $10 | ||
| "lmp_rt": np.linspace(0, 50, 49), # Ramp from $0 to $50 | ||
| } | ||
| ) | ||
|
|
||
| # Save to CSV | ||
| df.to_csv(lmp_data_path, index=False) | ||
| print(f"Created {lmp_data_path}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this code very fast to run? Not sure it's worth having the if not exists business---just run it every time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thing
hercules/hercules_model.py
Outdated
| else: | ||
| raise ValueError( | ||
| f"Unsupported file format for '{filename}'. " | ||
| "Supported formats: CSV (.csv), Feather (.feather), Pickle (.pkl, .pickle)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add .ftr, .p here? We seem to mostly used .ftr, so this message seems confusing since it makes it look like .ftr is not supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch
hercules/utilities.py
Outdated
|
|
||
| # If old-style external_data_file is used at top level, convert to new structure with warning | ||
| if "external_data_file" in h_dict: | ||
| import warnings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this import to the top of the file? Seems it will likely be useful elsewhere, and it doesn't seem like it's necessary to only import it in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
|
Note that also included in this PR is a reversion to a format of |
This Pull Request adds the ability to select which external data channels to log (previously only logging all allowed). Since its possible the external_data_file includes channels that are time shifts of other channels to facilitate control this option avoids bloat in the logged output.
Backwards compatiblity (with deprecation is warning) is included, such that if the log_channels are not identified then the current behavior of logging all channels is applied.