fix: minor basic stats quality fixes #2521

Merged

Conversation

ethanglaser
Contributor

@ethanglaser commented Jun 9, 2025

Description

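A few small corrections: remove a duplicated raw_result assignment in fit, and pass rank instead of size to generate_data in the SPMD test so that each rank gets different data.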


PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with update and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added a respective label(s) to PR if I have a permission for that.
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

@ethanglaser added the enhancement label Jun 9, 2025

codecov bot commented Jun 9, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Flag Coverage Δ
azure 79.93% <100.00%> (+0.01%) ⬆️
github 71.62% <100.00%> (+<0.01%) ⬆️

Files with missing lines Coverage Δ
onedal/basic_statistics/basic_statistics.py 96.29% <100.00%> (ø)

... and 2 files with indirect coverage changes

Contributor

@icfaust left a comment

Requires a 'black' formatting fix; otherwise good to go.

@@ -157,7 +157,7 @@ def fit(self, data, sample_weight=None, queue=None):
data_table, weights_table = to_table(data, sample_weight, queue=queue)

dtype = data_table.dtype
-raw_result = raw_result = self._compute_raw(
+raw_result = self._compute_raw(
Contributor

yikes

@@ -48,7 +48,7 @@ def generate_data(par, size, seed=777):

params_spmd = {"ns": 19, "nf": 31}

-data, weights = generate_data(params_spmd, size)
+data, weights = generate_data(params_spmd, rank)
Contributor

Shouldn't this be made to generate different data for each rank?

Contributor Author

Yes - that was the original mistake here. size is the same for every rank, so the same data is generated. rank is different on every rank, so different data is generated.
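
To illustrate the point, here is a minimal sketch. It assumes a hypothetical generator, make_data, whose third argument is mixed into the random seed; this is not the test's actual generate_data, and the world size and block shape are made up. The MPI ranks are simulated with a plain loop rather than a real communicator.

```python
import random


def make_data(n_rows, n_cols, key, seed=777):
    """Hypothetical generator (an assumption): `key` is mixed into the seed."""
    rng = random.Random(seed + key)
    return [[rng.random() for _ in range(n_cols)] for _ in range(n_rows)]


world_size = 4  # the communicator size is the same value on every rank

# Keyed on the communicator size: every simulated rank builds identical data.
by_size = [make_data(3, 2, world_size) for rank in range(world_size)]

# Keyed on the rank: each simulated rank builds its own data.
by_rank = [make_data(3, 2, rank) for rank in range(world_size)]

all_same = all(block == by_size[0] for block in by_size)
all_distinct = len({str(block) for block in by_rank}) == world_size
print(all_same, all_distinct)
```

With identical blocks on every rank, a distributed result cannot be meaningfully compared against a batch result on the gathered data, which is presumably why the test needs per-rank variation.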

Contributor

But it's still being generated in a loop where each rank contains the data from the previous one.

Contributor Author

Although maybe that would be reflected in the seed parameter and this should be tweaked further. The data generation function here is pretty wonky. I'll take a closer look tomorrow.
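
One way the seed-based variant could look is sketched below, under stated assumptions: generate_block and block_for_rank are hypothetical helpers, not the repository's generate_data, and the base seed of 777 is borrowed from the default shown in the hunk header. The block shape comes from params_spmd, and only the seed changes from rank to rank.

```python
import random


def generate_block(par, seed):
    """Hypothetical stand-in: one rank's (ns x nf) block plus per-row weights."""
    rng = random.Random(seed)
    data = [
        [rng.gauss(0.0, 1.0) for _ in range(par["nf"])]
        for _ in range(par["ns"])
    ]
    weights = [rng.random() for _ in range(par["ns"])]
    return data, weights


params_spmd = {"ns": 19, "nf": 31}
base_seed = 777  # assumption: reuse the default seed as a base


def block_for_rank(rank):
    # Offset the base seed by the rank so each rank draws a different block
    # while the result stays reproducible for a given rank.
    return generate_block(params_spmd, seed=base_seed + rank)


data0, weights0 = block_for_rank(0)
data1, weights1 = block_for_rank(1)
```

Keeping the shape in the parameters and the variation in the seed avoids overloading a single argument with both roles.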

@ethanglaser marked this pull request as draft June 10, 2025 18:34
@ethanglaser marked this pull request as ready for review June 10, 2025 23:28
@Alexsandruss merged commit b742d86 into uxlfoundation:main Jun 16, 2025
27 of 28 checks passed
david-cortes-intel pushed a commit to david-cortes-intel/scikit-learn-intelex that referenced this pull request Jun 18, 2025
* fix: minor basic stats quality fixes

* blacked

* vary seed by rank instead of size
Labels
enhancement New feature or request
4 participants