Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BasicStatistics API change #1644

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

olegkkruglov
Copy link
Contributor

@olegkkruglov olegkkruglov commented Jan 10, 2024

Changes proposed in this pull request:

  • Refactored basic_statistics.cpp file
  • Changed API for BasicStatistics class to save results to class attribute instead of returning dict with result to be consistent with IncrementalBasicStatistics and other estimators. Saved attributes are arrays if input data is 2d and numbers if input data is 1d (like it is done in numpy)
  • Updated tests
  • Added support of n_jobs and usm_ndarray to BasicStatistics
  • Rename compute_raw method to _compute_raw since it is private and designed for internal use

onedal/basic_statistics/basic_statistics.py Show resolved Hide resolved
onedal/basic_statistics/basic_statistics.py Outdated Show resolved Hide resolved
onedal/basic_statistics/tests/test_basic_statistics.py Outdated Show resolved Hide resolved
setup_sklearnex.py Outdated Show resolved Hide resolved
sklearnex/basic_statistics/basic_statistics.py Outdated Show resolved Hide resolved
onedal/basic_statistics/basic_statistics.py Outdated Show resolved Hide resolved

tol = fp32tol if res.dtype == np.float32 else fp64tol
assert_allclose(gtr, res, rtol=tol)

@pytest.mark.parametrize("queue", get_queues())
@pytest.mark.parametrize("dtype", [np.float32, np.float64])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason behind this reordering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean by reordering? i have just added a new test test_multiple_options_uniform to make sure it works well if result_options is a list type

onedal/spmd/basic_statistics/basic_statistics.py Outdated Show resolved Hide resolved
sklearnex/basic_statistics/basic_statistics.py Outdated Show resolved Hide resolved
onedal/basic_statistics/tests/test_basic_statistics.py Outdated Show resolved Hide resolved
@olegkkruglov
Copy link
Contributor Author

/intelci: run

@icfaust
Copy link
Contributor

icfaust commented Jan 30, 2024

I'm wondering if BasicStatistics and IncrementalBasicStatistics shouldn't be two functions, possibly located in 'metrics' which returns a dict. @KulikovNikita @olegkkruglov @samir-nasibli

@olegkkruglov olegkkruglov force-pushed the bs-api-change branch 2 times, most recently from d78b82a to 8162e95 Compare April 18, 2024 11:31
@olegkkruglov
Copy link
Contributor Author

/intelci: run

1 similar comment
@olegkkruglov
Copy link
Contributor Author

/intelci: run

@olegkkruglov
Copy link
Contributor Author

/intelci: run

@ethanglaser
Copy link
Contributor

This would drop basic statistics compute completely and break API - I guess we would need to deprecate it first?

@olegkkruglov
Copy link
Contributor Author

This would drop basic statistics compute completely and break API - I guess we would need to deprecate it first?

I would say that it is not breaking but more like fixing and making more consistent. As far as I understand, we have to use fit instead of compute in all estimators and compute was used here accidentally. But if we still need to use deprecation in such case then do we have any guidelines about how should it be done?

@ethanglaser
Copy link
Contributor

This would drop basic statistics compute completely and break API - I guess we would need to deprecate it first?

I would say that it is not breaking but more like fixing and making more consistent. As far as I understand, we have to use fit instead of compute in all estimators and compute was used here accidentally. But if we still need to use deprecation in such case then do we have any guidelines about how should it be done?

Agreed that it is fixing, but that doesn't mean its not breaking - anyone that is using our BasicStatistics API will be impacted by this change. @Alexsandruss how do we generally handle this? (ie should we add deprecation warning message but leave compute() and just call fit() from it initially or can we remove compute() immediately)

I would also expect that this may require an accompanying docs update, spmd interfaces, etc.

@samir-nasibli
Copy link
Contributor

This would drop basic statistics compute completely and break API - I guess we would need to deprecate it first?

I would say that it is not breaking but more like fixing and making more consistent. As far as I understand, we have to use fit instead of compute in all estimators and compute was used here accidentally. But if we still need to use deprecation in such case then do we have any guidelines about how should it be done?

Agreed that it is fixing, but that doesn't mean its not breaking - anyone that is using our BasicStatistics API will be impacted by this change. @Alexsandruss how do we generally handle this? (ie should we add deprecation warning message but leave compute() and just call fit() from it initially or can we remove compute() immediately)

I would also expect that this may require an accompanying docs update, spmd interfaces, etc.

We can leave compute as well with deprecation warning.

@samir-nasibli
Copy link
Contributor

I'm wondering if BasicStatistics and IncrementalBasicStatistics shouldn't be two functions, possibly located in 'metrics' which returns a dict. @KulikovNikita @olegkkruglov @samir-nasibli

I think this is good idea to discuss. In case if we will accept this, deprecation warnings also needed here.
As an option we can also add function for BS in metrics, just reusing original BS.
Adding just function for IncrementalBasicStatistics idea will be harder.

Personally I am good with current solution as well.

@olegkkruglov
Copy link
Contributor Author

I would also expect that this may require an accompanying docs update, spmd interfaces, etc.

@ethanglaser deprecation is added, spmd part is updated, but I can't see any docs containing information about old interfaces which should be updated according to these changes. Am I missing something?

@olegkkruglov olegkkruglov force-pushed the bs-api-change branch 2 times, most recently from 68ee924 to d43a50d Compare May 10, 2024 13:26
@olegkkruglov
Copy link
Contributor Author

/intelci: run

@samir-nasibli
Copy link
Contributor

@olegkkruglov please rebase you branch

@olegkkruglov
Copy link
Contributor Author

/intelci: run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants