MeanThreshold, MedianThreshold, and other threshold support in GenericUnivariateSelect #21699

Open
charlesbmi opened this issue Nov 17, 2021 · 3 comments · May be fixed by #27722
Labels
help wanted, Moderate, module:feature_selection, New Feature


charlesbmi commented Nov 17, 2021

Describe the workflow you want to enable

I would like to select features by thresholding their mean value (i.e., mean-across-samples), similar to how VarianceThreshold selects features by thresholding their variance-across-samples.
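A minimal sketch of that workflow, modeled on how `VarianceThreshold` is used. `MeanThreshold` here is a hypothetical class, not part of scikit-learn, and both the `threshold` parameter and the strict `>` comparison are assumptions made for illustration:

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.feature_selection import SelectorMixin
from sklearn.utils.validation import check_array, check_is_fitted


class MeanThreshold(SelectorMixin, BaseEstimator):
    """Hypothetical selector: keep features whose mean across samples
    is strictly greater than `threshold` (not a scikit-learn class)."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def fit(self, X, y=None):
        X = check_array(X)
        self.n_features_in_ = X.shape[1]
        # Per-feature mean across samples (rows).
        self.means_ = X.mean(axis=0)
        return self

    def _get_support_mask(self):
        # SelectorMixin builds get_support() and transform() on this mask.
        check_is_fitted(self, "means_")
        return self.means_ > self.threshold


# Toy data: three samples, three non-negative features.
X = np.array([[0, 5, 1], [0, 7, 3], [1, 6, 2]], dtype=float)
selector = MeanThreshold(threshold=1.0).fit(X)
support = selector.get_support()
X_selected = selector.transform(X)
```

With this toy data the column means are 1/3, 6, and 2, so only the first feature falls below the threshold of 1.0 and is dropped.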

Describe your proposed solution

Two possible options:

Describe alternatives you've considered, if relevant

Another alternative, although this seems counter to how these functions are designed

Additional context

Setting a MeanThreshold would be useful when working with non-negative features, such as pixel intensity in images. For example, we might want to exclude pixels that are regularly saturated in our dataset, as they may be less informative.

Specifically, in my research field of neuroscience (single-neuron recordings), our "features" are the (non-negative) action-potential counts for each neuron. We often exclude neurons with very low firing rates to minimize discretization error. Here are a few examples of neuroscience papers that set a MeanThreshold per neuron (i.e., feature):

@DuarteSJ
Contributor

Is anyone working on this issue? If not, may I take it?

@adrinjalali
Member

I think improving GenericUnivariateSelect sounds like the best path forward here.
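For context, a rough sketch of what `GenericUnivariateSelect` can already do with a custom `score_func`: it can rank features by their mean and keep the top `k`, but none of its existing modes (`percentile`, `k_best`, `fpr`, `fdr`, `fwe`) applies an absolute threshold to the scores. The `mean_score` helper and the placeholder `y` are assumptions made for this illustration:

```python
import numpy as np
from sklearn.feature_selection import GenericUnivariateSelect


def mean_score(X, y=None):
    """Hypothetical score function: per-feature mean across samples.
    `y` is ignored; it is accepted only to match the score_func signature."""
    return np.asarray(X).mean(axis=0)


X = np.array([[0, 5, 1], [0, 7, 3], [1, 6, 2]], dtype=float)
# Placeholder target: the scores do not depend on it.
y = np.zeros(X.shape[0])

# Keep the 2 features with the highest mean. This is a rank-based
# selection, not a threshold on the mean itself.
selector = GenericUnivariateSelect(score_func=mean_score, mode="k_best", param=2)
selector.fit(X, y)
support = selector.get_support()
```

Because the selection is rank-based, the number of retained features is fixed in advance, which is the gap a threshold mode would fill.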

@adrinjalali added the Moderate and help wanted labels on Apr 26, 2024
@glemaitre
Member

Does PR #27722 solve this issue?
