Implement new calculate_population_statistics method. #453
for more information, see https://pre-commit.ci
clang-tidy made some suggestions
A few small suggestions and a question, but otherwise looks good.
I tried reasoning through the code for calculating the indices but it made my head hurt after a while 😆. It didn't look obviously wrong but I'm not confident that I've got it right either! How about making a separate standalone function for calculating indices for an n-dimensional matrix and writing a unit test for it (using some hand-calculated values)?
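As a rough illustration of that suggestion, here is a minimal sketch of a standalone index function with hand-calculated checks. The names `flat_index` and `check_flat_index_examples` are hypothetical and do not come from the PR, and the sketch assumes row-major ordering with one bin index per dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not from the PR): map one bin index per dimension to a
// position in a flat, row-major vector. dims[d] is the number of bins in
// dimension d; idx[d] is the bin chosen in that dimension.
std::size_t flat_index(const std::vector<std::size_t> &idx,
                       const std::vector<std::size_t> &dims) {
    if (idx.size() != dims.size()) {
        throw std::invalid_argument("idx and dims must have the same length");
    }
    std::size_t flat = 0;
    for (std::size_t d = 0; d < dims.size(); ++d) {
        if (idx[d] >= dims[d]) {
            throw std::out_of_range("bin index exceeds dimension length");
        }
        // Horner-style accumulation: the last dimension varies fastest.
        flat = flat * dims[d] + idx[d];
    }
    return flat;
}

// Hand-calculated values for a 2 x 3 x 4 matrix (24 cells).
void check_flat_index_examples() {
    const std::vector<std::size_t> dims{2, 3, 4};
    assert(flat_index({0, 0, 0}, dims) == 0);
    assert(flat_index({0, 0, 3}, dims) == 3);
    assert(flat_index({0, 1, 0}, dims) == 4);   // one step in dim 1 skips 4 cells
    assert(flat_index({1, 0, 0}, dims) == 12);  // one step in dim 0 skips 3 * 4 cells
    assert(flat_index({1, 2, 3}, dims) == 23);  // last cell: 12 + 8 + 3
}
```

Keeping the index arithmetic in a free function like this means it can be tested without constructing the analysis module at all.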
Starting to take shape 👍. Some fixing up to do still.
Thanks for the feedback @jamesturner246 @alexdewar. This should be good for another review now. I don't really understand the wall of text output for why the build is failing on Windows. I think it doesn't like this (healthgps/src/HealthGPS/analysis_module.cpp, lines 68 to 69 in b596d0b), but I'm not really sure what it wants me to do. Any ideas?
Humm.. Try: `size_t total_num_bins = std::accumulate(factor_bins_.cbegin(), factor_bins_.cend(), 1u, std::multiplies<>());` Note
This looks good. 👍
Re. the Windows bug: looks like a pedantic casting warning. Try using `1u` as mentioned.
LGTM (except for that weird MSVC failure).
Yep, well done @jamesturner246, that was the problem. I ended up having to make the accumulator type in
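For context on the accumulator type: `std::accumulate` carries its running result in the type of the initial value, so that argument decides whether the product is narrowed. The sketch below assumes the bin counts are stored as `std::size_t`, which this conversation does not confirm. `total_bins` and `check_total_bins_examples` are hypothetical names, and this may not match the fix that was actually committed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the PR's bin counts; the element type is an assumption.
std::size_t total_bins(const std::vector<std::size_t> &factor_bins) {
    // std::size_t{1} makes the accumulator a std::size_t. With 1u the running
    // product would be an unsigned int, so every multiplication result would be
    // converted from std::size_t back down to unsigned int.
    return std::accumulate(factor_bins.cbegin(), factor_bins.cend(),
                           std::size_t{1}, std::multiplies<>());
}

void check_total_bins_examples() {
    assert(total_bins({2, 3, 4}) == 24);
    assert(total_bins({7}) == 7);
}
```

Spelling the initial value as `std::size_t{1}` keeps the accumulator and the element type in agreement, which is usually enough to silence this kind of conversion warning.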
This begins to tackle #420, but does not completely close that issue because I wanted to keep the PRs at a reasonably digestible size. More PRs to come if/when this gets approved. Happy to discuss this PR in person as it's a bit weird.
- Move `factors` to the header file as a data member and rename to `factors_to_calculate_`. Eventually this will be set elsewhere anyway.
- Move `factor_bins_` to the header file as a data member. This is because it needs to be accessed elsewhere in the class.
- Add `factor_bin_widths_` as a data member. This helps us calculate which bin a given factor value is in later on.
- Add a `calculate_population_statistics` function with the `calculated_factors_` vector in the signature instead of the `series` object. We can remove the original function when we have completed implementation of this new way of calculating the analysis.

The new, overloaded `calculate_population_statistics` function is the main focus of this PR. At the moment all it is doing is calculating the correct index in the `calculated_factors_` vector for a given entity. This is necessary because we are representing a matrix of arbitrary dimensions in a flat vector. The number of dimensions is equal to the size of `factors_to_calculate_`.

The calculation for fetching the correct index is based on the following idea for calculating the index in a 3D, row-major ordered matrix with dimensions of length `L`, `M`, and `N`.

but extended to be able to deal with an arbitrary number of dimensions (I think I have done this right, really this is the bit of the PR that I'm most interested in being reviewed).
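For reference, the standard row-major index formula can be written as follows. This is a sketch and may not match the exact expression originally shown in the PR. The index symbols $i$, $j$, $k$, $i_d$ and the dimension lengths $n_d$ are notation introduced here, and the last dimension is assumed to vary fastest.

$$
\text{index}_{3D} = i \cdot M \cdot N + j \cdot N + k = (i \cdot M + j) \cdot N + k,
\qquad 0 \le i < L,\; 0 \le j < M,\; 0 \le k < N
$$

$$
\text{index}_{D} = \sum_{d=1}^{D} i_d \prod_{e=d+1}^{D} n_e,
\qquad 0 \le i_d < n_d
$$

Setting $D = 3$ with $(n_1, n_2, n_3) = (L, M, N)$ recovers the 3D case. For example, with $L = 2$, $M = 3$, $N = 4$, the cell $(1, 2, 3)$ maps to $1 \cdot 12 + 2 \cdot 4 + 3 = 23$, the last of the 24 entries.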