Allow filters of the underlying dataframe to be chained by php1ic · Pull Request #24 · php1ic/nuclearmasses

php1ic · 2026-04-01T20:08:14Z

This started out with the idea to be able to chain filtering of the data with simple getter functions, e.g.

    data = MassTable().full_data
    myfilter = data.get_A(100).getSymbol("Ag")

but in getting that to work, it became a fairly major restructure of the code.
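The chaining idea can be sketched without pandas: each getter returns a new table object, so calls compose. Everything below is hypothetical (the class `ChainableTable`, the `Nuclide` fields, the method name `get_symbol`, and the toy rows); it is a minimal stand-in for a dataframe wrapper, not the code from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nuclide:
    """One row of a toy mass table (hypothetical fields)."""
    A: int
    Z: int
    symbol: str


class ChainableTable:
    """Toy stand-in for a dataframe wrapper whose filters can be chained."""

    def __init__(self, rows):
        self.rows = list(rows)

    def _where(self, predicate):
        # Each filter returns a new ChainableTable, which is what allows chaining.
        return ChainableTable(r for r in self.rows if predicate(r))

    def get_A(self, a):
        return self._where(lambda r: r.A == a)

    def get_symbol(self, symbol):
        return self._where(lambda r: r.symbol == symbol)


# Toy rows for illustration only.
data = ChainableTable([
    Nuclide(100, 47, "Ag"),
    Nuclide(100, 48, "Cd"),
    Nuclide(107, 47, "Ag"),
])
my_filter = data.get_A(100).get_symbol("Ag")
print(my_filter.rows)
```

Because every filter builds a new object, the original `data` is left untouched and can be filtered again in a different way.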

The parsing of the files and how they are combined has not changed, but the higher-level structure that gathers and stores the details for NUBASE and AME has changed significantly.

We set this up to help our future selves if the columns ever changed. We know they currently don't, so there is no need to check the year if we will always do the same thing. Use the helper function we created to convert Z to symbol, rather than raw dictionary access.
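As an illustration of the last point, a small accessor can hide the lookup table so that callers never index it directly. The names `Z_TO_SYMBOL` and `z_to_symbol` are hypothetical and the table is truncated; the helper in the package itself may look different.

```python
from typing import Optional

# Truncated, illustrative lookup table; a real one would cover every element.
Z_TO_SYMBOL = {1: "H", 2: "He", 47: "Ag", 82: "Pb"}


def z_to_symbol(z: int) -> Optional[str]:
    """Return the element symbol for proton number z, or None if z is unknown."""
    return Z_TO_SYMBOL.get(z)


print(z_to_symbol(47))   # known Z
print(z_to_symbol(999))  # unknown Z gives None rather than raising KeyError
```

Routing every lookup through one function means the handling of unknown values lives in a single place instead of at each call site.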

A refactor somewhere else opened up quite the can of worms in relation to the data types output by `importlib.resources` and mypy checking. We can now use `read_fwf` for all the types that are used by the core parsing, and it is hopefully robust if a user tries to parse their own file.

codecov-commenter · 2026-04-01T20:08:59Z

Codecov Report

❌ Patch coverage is 99.27536% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 99.85%. Comparing base (3b53023) to head (1658d02).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/nuclearmasses/utils/converter.py | 93.33% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #24      +/-   ##
==========================================
- Coverage   99.86%   99.85%   -0.01%     
==========================================
  Files          10       12       +2     
  Lines         737      710      -27     
==========================================
- Hits          736      709      -27     
  Misses          1        1

php1ic · 2026-04-05T17:33:34Z

Having spent far too long on this, I realised that the goal of this PR was to reinvent the wheel in relation to dataframe access and slicing. I therefore backed out a large fraction of the code that was changed.

The refactoring of the parsing functionality to separate AME and NUBASE into their own classes is still useful, which is why I haven't just dropped this branch entirely.

php1ic added 17 commits March 29, 2026 17:42

Refactor filtering to follow DRY

101b11e

Use multiple inheritance rather than a chain

292c4f0

Applying to the NUBASE set of data to avoid unnecessary dependencies.

Tidy up the NUBASE column labelling

e81454c

Columns have been added over time. Rather than have copies of very similar lists, create one that contains all columns and remove as appropriate for each year.
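A minimal sketch of that approach: keep one master list and subtract the columns a given year lacks. The column labels, the years, and the `MISSING_BY_YEAR` mapping are all invented for illustration and do not describe the real NUBASE file layouts.

```python
# Hypothetical master list; placeholder labels, not the real NUBASE columns.
ALL_COLUMNS = ["A", "Z", "HalfLife", "ExtraColumn1", "ExtraColumn2"]

# Columns absent in a given year. Invented mapping; not the real file layouts.
MISSING_BY_YEAR = {
    2003: {"ExtraColumn1", "ExtraColumn2"},
    2012: {"ExtraColumn2"},
}


def columns_for(year):
    """Return the master list minus any columns absent in the given year."""
    missing = MISSING_BY_YEAR.get(year, set())
    # Preserve the order of the master list so column positions stay predictable.
    return [name for name in ALL_COLUMNS if name not in missing]


print(columns_for(2003))
print(columns_for(2020))
```

Years without an entry fall back to the full list, so adding a new year only needs a change when its layout differs.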

AME mass file inheritance refactor

9547733

AME reaction file 1 inheritance refactor

132073d

AME reaction file 2 inheritance refactor

5f98806

Use the function rather than raw dictionary access

f182c82

As with c229570 remove redundant checks

181b922

Refactor NUBASE parsing into a dedicated class

ff2cf90

The original parser class still reads a year's worth of data, with this new class doing the aggregation.
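The split described here can be sketched as two small classes: one reads a single year, the other combines the years. The class names, the whitespace-separated toy format, and the rows are assumptions made for this example; the real parser reads fixed-width files and builds dataframes.

```python
class YearParserSketch:
    """Stand-in for the per-year parser: returns the rows for a single year."""

    def __init__(self, year, lines):
        self.year = year
        self.lines = lines

    def read(self):
        # Toy format: "A Z symbol" separated by whitespace.
        rows = []
        for line in self.lines:
            a, z, symbol = line.split()
            rows.append({"year": self.year, "A": int(a), "Z": int(z), "symbol": symbol})
        return rows


class NubaseSketch:
    """Stand-in for the aggregating class: combines every year's rows."""

    def __init__(self, sources):
        # sources maps a year to the raw lines for that year.
        self.rows = []
        for year, lines in sorted(sources.items()):
            self.rows.extend(YearParserSketch(year, lines).read())


table = NubaseSketch({2020: ["100 47 Ag"], 2012: ["100 47 Ag", "100 48 Cd"]})
print(len(table.rows))
```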

Remove isort config in favour of ruff

a48f11a

Not sure why they are both present, and some changes applied in the current branch were causing them to argue, so let's stop using isort.

Update ruff sorting config

3cfc7de

Convention seems to be to use the src directory so we will follow that.

Add tests against the relative error calculations

0356c0d

Top level classes to deal with all AME and NUBASE data

602b43a

To make things as modular as possible, there is now the class to parse an individual file as well as these new classes to manage the different data sets as a whole.

Update the MassTable class to make use of the new structure

31f5b6a

Add tests for the new top level classes

5e8bb70

php1ic changed the title ~~All filters of the undelying dataframe to be chained~~ Allow filters of the undelying dataframe to be chained Apr 1, 2026

php1ic added 9 commits April 4, 2026 15:14

Add some tests to the MassTable after the refactor

40abc4d

Add more test coverage for MassTable

bdb16e6

Correct index testing

2ea5d98

Add some tests related to the `dir()` method.

Update unit checking with more non-time units

f0b3547

Make use of parametrization for simpler code.

Remove isort check from linting command

a2d317e

We use the inbuilt import sort checking in ruff so have removed the use of isort.

Allow a MassTable instance to be subscriptable

f1eb793

We need to be careful not to simply replicate the functionality of a pandas dataframe; however, I think a user may expect to be able to access columns with square brackets, so we will make use of those.
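Square-bracket access of this kind is usually provided by defining `__getitem__` and forwarding to the wrapped data. In the sketch below the class name `SubscriptableTable` is hypothetical and a dict of lists stands in for the merged dataframe; it is not the implementation from this commit.

```python
class SubscriptableTable:
    """Toy wrapper that forwards square-bracket access to the data it holds."""

    def __init__(self, columns):
        # A dict of column name -> list of values stands in for the dataframe.
        self._columns = columns

    def __getitem__(self, name):
        # Delegate to the underlying container instead of re-implementing lookup.
        return self._columns[name]


table = SubscriptableTable({"A": [100, 107], "Symbol": ["Ag", "Ag"]})
print(table["A"])
```

Delegating keeps the wrapper thin: an unknown column raises whatever error the underlying container raises, with no extra logic to maintain.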

Revert all functionality based around top level dataframe access

39e763a

Let's not reinvent the wheel when accessing the final merged dataframe with all the data. Give the user access to the dataframe and let them do what they need from there.

Update usage examples in the README

ad48107

We no longer index on the year and the main dataframe is accessed via a different member name.

Rename file so it matches the class within

1658d02

php1ic changed the title ~~Allow filters of the undelying dataframe to be chained~~ Allow filters of the underlying dataframe to be chained Apr 5, 2026

php1ic merged commit 78fb744 into main Apr 5, 2026
13 checks passed

php1ic deleted the chaining branch April 5, 2026 18:43

php1ic merged 26 commits into main from chaining
