-
Notifications
You must be signed in to change notification settings - Fork 330
Add function to export datasets to NetCDF and create metadata-only database #7213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
…Is, improve error handling Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
…attern Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
…nd write_metadata_to_db API Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
@astafan8 Unfortunately I hit an unexpected error while processing your comment. I've automatically reported this to GitHub. You can ask me to try again later by mentioning me in a new comment. If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: Sorry for the inconvenience! |
…, improve error logging, and convert all tests to use tmp_path fixture Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
…emove duplicate logging Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
Fixed test failures by addressing implementation issues: replaced private |
@copilot there are CI failures in tests and/or typing, and also in building documentation, could you address all of those issues? |
Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
Fixed all CI failures including typing and linting errors, and documentation building issues. Key fixes:
All syntax checks pass and the code now follows proper QCoDeS coding standards. 3519da4 |
exp_attrs["start_time"], | ||
exp_attrs["end_time"], | ||
with atomic(target_conn) as target_conn_atomic: | ||
target_exp = load_or_create_experiment( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any changes to extract_runs_into_db must be reverted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted all changes to extract_runs_into_db function to its original state using _create_exp_if_needed instead of load_or_create_experiment. 912cd49
Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
Co-authored-by: astafan8 <15662810+astafan8@users.noreply.github.com>
is_run_id_in_database, | ||
) | ||
|
||
if TYPE_CHECKING: | ||
from pathlib import Path | ||
log = logging.getLogger(__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In new modules it would be best if this variable is called _LOG
to mark that it is private and that it's a constant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed the logging variable to _LOG
to follow QCoDeS naming conventions for private constants. 0ce7f43
try: | ||
dataset.export("netcdf", path=export_path) | ||
netcdf_export_path = dataset.export_info.export_paths.get("nc") | ||
assert netcdf_export_path is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not use assert outside test code so this should be rewritten to explicitly raise a more descriptive error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced assert with proper error handling that raises RuntimeError with descriptive error message when NetCDF export path is None. This provides better debugging information and follows best practices for production code. 6374e2f
|
||
|
||
@pytest.fixture | ||
def dataset_factory(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add types to this function and the inner function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add types to the whole file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comprehensive type hints to all functions in the test file including fixtures and test functions. Updated imports to use proper type-checking blocks and quoted type references following QCoDeS coding standards. 6374e2f
Co-authored-by: jenshnielsen <548266+jenshnielsen@users.noreply.github.com>
Co-authored-by: jenshnielsen <548266+jenshnielsen@users.noreply.github.com>
This PR implements a new function
export_datasets_and_create_metadata_db()
that addresses the common issue of data duplication when users have both database files with raw data and exported NetCDF files.Problem
When running measurements with QCoDeS, users typically have:
This results in duplicate data storage, with the database file becoming large due to raw data that's already available in the more portable NetCDF format.
Solution
The new function:
Usage Example
Key Features
Implementation Details
qcodes.dataset.database_extract_runs
module alongside related functionalityDataSet.export()
) and database operations_add_run_to_runs_table()
without_populate_results_table()
for metadata-only storageTesting
The implementation includes extensive testing:
This enables users to maintain lightweight databases for metadata while relying on NetCDF files for actual data analysis, significantly reducing storage requirements for large experimental datasets.
Fixes #7212.