Add var, std, median #3378

Merged
jl-wynen merged 15 commits into main from var-std-median on Feb 5, 2024

Conversation

jl-wynen
Member

Fixes #3357

Member

Should probably update the following:

reduction_ops = (
    'sum',
    'mean',
    'all',
    'any',
    'min',
    'max',
    'nansum',
    'nanmean',
    'nanmin',
    'nanmax',
)

        }
    )
    if isinstance(x, DataGroup):
        return data_group_nary(sc_func, x, dim=dim, **kwargs)
Member

Why sc_func and not _reduce_with_numpy with the same args used above for dataset?

Member Author

Because it's shorter. I could actually do the same for Dataset. It has a tiny performance penalty, but that shouldn't matter.

src/scipp/core/reduction.py
Comment on lines +271 to +279
    ddof:
        'Delta degrees of freedom'.
        For sample variances, set ``ddof=1`` to obtain an unbiased estimator.
        For normally distributed variables, set ``ddof=0`` to obtain a maximum
        likelihood estimate.
        See :func:`numpy.var` for more details.
Member

Do you want to add a comment (or even a specific error) explaining why this is required by Scipp? Otherwise we risk a future developer coming along and changing this.
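
For context on the docstring above, here is a minimal sketch of what ``ddof`` changes. It is plain Python and not Scipp's implementation; the helper name `variance` and the sample data are assumptions made for illustration. The divisor is `N - ddof`, the same convention that the docstring attributes to `numpy.var`.

```python
import statistics


def variance(values, ddof):
    """Variance of ``values`` using the divisor ``N - ddof``.

    Hypothetical helper for illustration only, not part of Scipp.
    """
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than `ddof` values")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


data = [1.0, 2.0, 4.0, 7.0]

# ddof=0 divides by N (maximum likelihood estimate for normal data).
population = variance(data, ddof=0)
# ddof=1 divides by N - 1 (unbiased estimator of the sample variance).
sample = variance(data, ddof=1)

# Cross-check against the standard library's two variance functions.
assert abs(population - statistics.pvariance(data)) < 1e-12
assert abs(sample - statistics.variance(data)) < 1e-12
```

The two results differ by a factor of `N / (N - 1)`, so a silent default would pick one estimator on the caller's behalf. That is one reading of why the argument is left without a default.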

Comment on lines 411 to 412
This function computes the standard deviation of the input values which is *not*
the related to the ``x.variances`` property but instead defined as
Member

Grammar.

Comment on lines 497 to 498
This function computes the standard deviation of the input values which is *not*
the related to the ``x.variances`` property but instead defined as
Member

Grammar

# Yes, using identical with floats.
# It works and allclose doesn't support datasets and data groups.
sc.testing.assert_identical(
    sc.var(x, 'xx', ddof=0),
Member

I suggest adding separate tests to ensure that ddof is required, so that a future developer who is unaware of our intentions does not change it.
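
A rough sketch of what such a test could look like, using only the standard library. The function `var` below is a hypothetical stand-in with a keyword-only `ddof` and no default; it is not Scipp's `sc.var`, and the helper `requires_ddof` is likewise an assumption made for illustration. In the real test suite the same checks would target the Scipp functions directly.

```python
import inspect


def var(values, *, ddof):
    """Hypothetical stand-in for a reduction whose ``ddof`` has no default."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


def requires_ddof(func):
    """Return True if ``func`` has a keyword-only ``ddof`` without a default."""
    param = inspect.signature(func).parameters["ddof"]
    return (
        param.kind is inspect.Parameter.KEYWORD_ONLY
        and param.default is inspect.Parameter.empty
    )


def test_ddof_is_required():
    # Static check: the signature itself must not offer a default.
    assert requires_ddof(var)
    # Dynamic check: omitting ddof must fail instead of silently picking a value.
    try:
        var([1.0, 2.0, 3.0])
    except TypeError:
        pass
    else:
        raise AssertionError("var() accepted a call without ddof")


test_ddof_is_required()
```

Checking both the signature and the call gives the test two ways to fail if someone later adds a default, which is the regression the comment above is concerned with.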

@jl-wynen jl-wynen merged commit 6a40604 into main Feb 5, 2024
4 checks passed
@jl-wynen jl-wynen deleted the var-std-median branch February 5, 2024 08:34
Development

Successfully merging this pull request may close these issues.

Add std and var reductions
2 participants