Stricter build_path / better end_of_period #237

Merged
aulemahal merged 6 commits into main from strict-build-path-better-period-end on Aug 15, 2023

Conversation

aulemahal
Collaborator

@aulemahal aulemahal commented Aug 11, 2023

Pull Request Checklist:

  • This PR addresses an already opened issue (for bug fixes / features)
    • This PR fixes #xyz
  • (If applicable) Documentation has been added / updated (for bug fixes / features).
  • (If applicable) Tests have been added.
  • This PR does not seem to break the templates.
  • HISTORY.rst has been updated (with summary of main changes).
    • Link to issue (:issue:number) and pull request (:pull:number) has been added.

What kind of change does this PR introduce?

Two different things, both of which I noticed while updating the catalogs.

Stricter build_path

When I migrated from miranda, I relaxed the "structure" code because it felt too restrictive and I wanted to simplify the logic. However, this wasn't a good idea. When moving the ESPO-R5-E5L indicators, I forgot to include the "experiment" field somewhere and used build_path to copy the files. Result: one scenario overwrote the other and I lost half of the data.

This PR changes things a bit. The main change: all facets, except those marked optional, are required. build_path will FAIL if any is missing.

And:

  • New way to specify a folder level in the schema: "()" (with parentheses). This facet is marked as optional and, if it is missing from the data, the level is skipped.
  • Removal of the "option: " structure from the folder schema. This was only used to put "[bias_adjust_project version]" if the former was non-null. Instead, the schemas are duplicated: one for the "raw" case and one for the "biasadjusted" case. (And similarly for derived data.)
  • The previous point allowed me to rewrite _get_needed_fields without the funky magic needed before.
  • Removal of the "strict" keyword. It is always strict. The previous strict=True was overly strict because of caveats of _get_needed_fields. Those are now fixed and strict=False shouldn't be needed.
  • Some syntax in the yml file changed.
  • Passing a dataframe/catalog will now also add a "new_path_type" column to the output, so one can make sure all entries have been constructed from the same schema.
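
A minimal sketch of the strict behaviour described in the list above, not the actual xscen implementation. The function name `build_path_sketch`, the list-based schema, the facet values, and the convention of writing an optional level as a facet name wrapped in parentheses are all assumptions made for illustration.

```python
def build_path_sketch(facets: dict, schema: list) -> str:
    """Join folder levels from `facets`, following `schema`.

    Hypothetical convention: a level written as "(name)" is optional and is
    skipped when the facet is missing; any other level is required.
    """
    levels = []
    for level in schema:
        optional = level.startswith("(") and level.endswith(")")
        name = level[1:-1] if optional else level
        value = facets.get(name)
        if value is None:
            if optional:
                continue  # optional level: silently skipped
            # required level: fail instead of building a shorter path
            raise ValueError(f"Missing required facet: {name!r}")
        levels.append(str(value))
    return "/".join(levels)


# Hypothetical schema and facets, for illustration only.
schema = ["type", "source", "experiment", "(member)", "variable"]
facets = {
    "type": "simulation",
    "source": "ModelA",
    "experiment": "ssp245",
    "variable": "tas",
}

print(build_path_sketch(facets, schema))  # simulation/ModelA/ssp245/tas

# Forgetting a required facet now raises instead of producing a path
# that could collide with another dataset.
incomplete = {k: v for k, v in facets.items() if k != "experiment"}
try:
    build_path_sketch(incomplete, schema)
except ValueError as err:
    print(err)
```

Failing early on a missing required facet is what prevents two datasets that differ only in that facet from being written to the same location.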

Better end_of_period

When I updated to pandas 2, I modified date_parser, which changed how "end_of_period" was handled.

date_parser('2020', end_of_period=True)
# Initial xscen: "2020-12-31 23:00"
# Current xscen : "2020-12-31 00:00:00"
# This PR: "2020-12-31 23:59:59"

Thus, when searching for a coverage, the error due to the hour of the period end will be reduced.
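
To illustrate the intended result, here is a rough standard-library sketch of "end of period" resolved to the last second. It is not xscen's date_parser: the helper name `end_of_period_sketch` is hypothetical, and it only handles "YYYY", "YYYY-MM" and "YYYY-MM-DD" strings.

```python
import calendar
from datetime import datetime


def end_of_period_sketch(date: str) -> datetime:
    """Return the last second of the year, month or day named by `date`."""
    parts = [int(p) for p in date.split("-")]
    year = parts[0]
    # A bare year ends in December.
    month = parts[1] if len(parts) > 1 else 12
    # Without an explicit day, use the last day of the month.
    day = parts[2] if len(parts) > 2 else calendar.monthrange(year, month)[1]
    return datetime(year, month, day, 23, 59, 59)


print(end_of_period_sketch("2020"))     # 2020-12-31 23:59:59
print(end_of_period_sketch("2020-02"))  # 2020-02-29 23:59:59

# Difference between a midnight end and a last-second end for the same day.
gap = end_of_period_sketch("2020") - datetime(2020, 12, 31)
print(gap.total_seconds())  # 86399.0
```

With an end at midnight, almost the whole final day falls outside the period; ending at 23:59:59 reduces that gap to under one second.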

Does this PR introduce a breaking change?

build_path is now always strict.

Other information:

Do you agree?

I could re-implement strict=False if needed. It would mark all fields as optional on-the-fly. The default would stay "True".

aulemahal and others added 2 commits August 14, 2023 17:54
Co-authored-by: RondeauG <38501935+RondeauG@users.noreply.github.com>
Contributor

@juliettelavoie juliettelavoie left a comment

LGTM! I don't think a strict=False option is necessary. Users can always pass their own schema.

aulemahal and others added 2 commits August 15, 2023 14:06
Co-authored-by: juliettelavoie <juliette.lavoie@hotmail.ca>
Co-authored-by: RondeauG <38501935+RondeauG@users.noreply.github.com>
@aulemahal aulemahal merged commit e1083a4 into main Aug 15, 2023
9 checks passed
@aulemahal aulemahal deleted the strict-build-path-better-period-end branch August 15, 2023 18:17

3 participants