Allow filters of the underlying dataframe to be chained#24

Merged
php1ic merged 26 commits into main from chaining
Apr 5, 2026

@php1ic
Owner

@php1ic php1ic commented Apr 1, 2026

This started out with the idea to be able to chain filtering of the data with simple getter functions, e.g.

data = MassTable().full_data
myfilter = data.get_A(100).getSymbol("Ag")

but in getting that to work, it became a fairly major restructure of the code.

The parsing of the files and how they are combined has not changed, but the higher-level structure that gathers and stores the details for NUBASE and AME has changed significantly.
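
A minimal sketch of what chainable getters of this kind could look like, assuming a thin wrapper around a pandas DataFrame in which every getter returns a new wrapper. The class name `FilteredTable`, the method names, and the column names `A` and `Symbol` are hypothetical and are not taken from the nuclearmasses code.

```python
import pandas as pd


class FilteredTable:
    """Hypothetical wrapper: each getter returns a new wrapper, so calls chain."""

    def __init__(self, df: pd.DataFrame) -> None:
        self.df = df

    def get_A(self, a: int) -> "FilteredTable":
        # Keep only rows with the requested mass number.
        return FilteredTable(self.df[self.df["A"] == a])

    def get_symbol(self, symbol: str) -> "FilteredTable":
        # Keep only rows with the requested element symbol.
        return FilteredTable(self.df[self.df["Symbol"] == symbol])


# Placeholder rows for illustration only; column names are assumptions.
table = FilteredTable(
    pd.DataFrame({"A": [100, 100, 101], "Symbol": ["Ag", "Cd", "Ag"]})
)
chained = table.get_A(100).get_symbol("Ag")
```

Because each getter wraps a filtered copy rather than mutating the original, `table.df` is left untouched and the filtered rows are available as `chained.df`.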

php1ic added 17 commits March 29, 2026 17:42
Applying to the NUBASE set of data to avoid unnecessary dependencies.
Columns have been added over time. Rather than have copies of very
similar lists, create one that contains all columns and remove as
appropriate for each year.
We set this up to help our future selves if columns ever changed. We know
they currently don't so there is no need to check the year if we will
always do the same thing.

Use the helper function we created to convert Z to symbol, rather than the
raw dictionary access.
The original parser class still reads a year's worth of data, with this
new class doing the aggregation.
Not sure why they are both present, and some changes applied in the
current branch were causing them to argue, so let's stop using isort.
Convention seems to be to use the src directory so we will follow that.
A refactor somewhere else opened up quite the can of worms in relation
to the data types output by importlib.resources and mypy checking. We
can now use read_fwf for all the types that are used by the core parsing
and are hopefully robust if a user tries to parse their own file.
To make things as modular as possible, there is now the class to parse
an individual file as well as these new classes to manage the different
data sets as a whole.
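
The "one list that contains all columns" idea from the commit messages above could be sketched roughly as follows. The column names, the years, and the helper `columns_for_year` are all hypothetical placeholders; they do not describe the real NUBASE or AME layouts.

```python
# Hypothetical master list of every column that appears in any year.
ALL_COLUMNS = ["A", "Z", "Symbol", "MassExcess", "ExtraColumn1", "ExtraColumn2"]

# Hypothetical mapping of year -> columns that year's table lacks.
MISSING_BY_YEAR = {
    2003: {"ExtraColumn1", "ExtraColumn2"},
    2012: {"ExtraColumn2"},
    2020: set(),
}


def columns_for_year(year: int) -> list[str]:
    """Return the master list minus the columns that year lacks, keeping order."""
    missing = MISSING_BY_YEAR.get(year, set())
    return [name for name in ALL_COLUMNS if name not in missing]
```

Keeping a single ordered list and subtracting from it avoids maintaining several near-identical lists that can drift apart.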
@codecov-commenter

codecov-commenter commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 99.27536% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 99.85%. Comparing base (3b53023) to head (1658d02).

Files with missing lines Patch % Lines
src/nuclearmasses/utils/converter.py 93.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #24      +/-   ##
==========================================
- Coverage   99.86%   99.85%   -0.01%     
==========================================
  Files          10       12       +2     
  Lines         737      710      -27     
==========================================
- Hits          736      709      -27     
  Misses          1        1              

@php1ic php1ic changed the title from "All filters of the undelying dataframe to be chained" to "Allow filters of the undelying dataframe to be chained" Apr 1, 2026
php1ic added 9 commits April 4, 2026 15:14
Add some tests related to the `dir()` method.
Make use of parametrization for simpler code.
We use the inbuilt import sort checking in ruff so have removed the use
of isort.
We need to be careful to not simply replicate the functionality of a
pandas dataframe, however I think a user may expect to be able to access
columns with square brackets so we will make use of those.
Let's not reinvent the wheel when accessing the final merged dataframe
with all the data. Give the user access to the dataframe and let them do
what they need from there.
We no longer index on the year and the main dataframe is accessed via a
different member name.
@php1ic php1ic changed the title from "Allow filters of the undelying dataframe to be chained" to "Allow filters of the underlying dataframe to be chained" Apr 5, 2026
@php1ic
Owner Author

php1ic commented Apr 5, 2026

Having spent far too long on this, I realised that the goal of this PR was to reinvent the wheel in relation to dataframe access and slicing. I therefore backed out a large fraction of the code that was changed.

The refactoring of the parsing functionality to separate AME and NUBASE into their own classes is still useful which is why I haven't just dropped this branch entirely.
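
For context, the kind of filtering discussed here is already available on a plain pandas DataFrame. The sketch below uses placeholder rows and assumes columns named `A` and `Symbol`; the actual column names in the merged dataframe may differ.

```python
import pandas as pd

# Placeholder rows for illustration only; column names are assumptions.
df = pd.DataFrame({"A": [100, 100, 101], "Symbol": ["Ag", "Cd", "Ag"]})

# Boolean masks combine with &; each comparison needs its own parentheses.
by_mask = df[(df["A"] == 100) & (df["Symbol"] == "Ag")]

# DataFrame.query expresses the same filter as a string expression.
by_query = df.query("A == 100 and Symbol == 'Ag'")
```

Both forms return an ordinary DataFrame, so further filters can be applied to the result without any wrapper class.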

@php1ic php1ic merged commit 78fb744 into main Apr 5, 2026
13 checks passed
@php1ic php1ic deleted the chaining branch April 5, 2026 18:43