Conversation
Applying to the NUBASE set of data to avoid unnecessary dependencies.
Columns have been added over time. Rather than have copies of very similar lists, create one that contains all columns and remove as appropriate for each year.
We setup to help our future selves if columns every changed. We know they currently don't so there is no need to check the year if we will always do the same thing. Use the help function we created to convert Z to symbol, rather than the raw dictionary access.
The original parser class still reads a year's worth of data, with this new class doing the aggregation.
Not sure why they are both present, and some changes applied in the current branch were causing them to argue so lets stop using isort.
Convention seems to be to use the src directory so we will follow that.
A refactor somewhere else opened up quite the can of worms in relation to the data types output by importlib.resources and mypy checking. We can now use read_fwf for all the types that are used by the core parsing and are hopefully robust if a user tries to parse their own file.
To make things as modular as possible, there is now the class to parse and individual file as well as these new classes to manage the different data sets as a whole.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #24 +/- ##
==========================================
- Coverage 99.86% 99.85% -0.01%
==========================================
Files 10 12 +2
Lines 737 710 -27
==========================================
- Hits 736 709 -27
Misses 1 1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Add some tests related to the `dir()` method.
Make us of parametrization for simpler code.
We use the inbuilt import sort checking in ruff so have removed the use of isort.
We need to be careful to not simply replicate the functionality of a pandas dataframe, however I think a user may expect to be able to access columns wit square brackets so we will make use of those.
Let's not reinvent the wheel when accessing the final merged dataframe with all the data. Give the user access to the dataframe and let them do what they need from there.
We no longer index on the year and the main dataframe is accessed via a different member name.
Owner
Author
|
Having spend far too long on this I realised that the goal of this PR was to reinvent the wheel in relation to dataframe access and slicing. I therefore backed out a large fraction of the code that was changed. The refactoring of the parsing functionality to separate AME and NUBASE into their own classes is still useful which is why I haven't just dropped this branch entirely. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This started out with the idea to be able to chain filtering of the data with simple getter functions, e.g.
but in getting that to work, it became a fairly major restructure of the code.
The parsing of the files and how they are combined has not changed, but the higher level structure to gather and store the details for NUBASE and AME has changed significantly.