Improving feature names #42

jbschiratti · 2018-07-12T10:12:51Z

This PR aims at having more meaningful feature names when extract_features is called with return_df=True.

agramfort · 2018-07-12T11:29:50Z

mne_features/feature_extraction.py

+            elif _params['ratios'] == 'only':
+                return ratios_names
+            else:
+                return pow_names + ratios_names


This is not really clean. These gymnastics should be handled by the function pow_freq_bands. You have logic in a function-agnostic class that percolates up from a custom function. You should attach this logic to the pow_freq_bands callable.

I agree, this is not clean. This first commit was just to allow @l-omar-chehab to move on (and have meaningful feature names for compute_pow_freq_bands). Now, we can think about making it nice.

jbschiratti · 2018-07-17T16:53:36Z

Here is one idea on how the feature names could be "attached" to the feature functions.

If called with ratios=None, the feature function compute_pow_freq_bands will return an array with shape (n_channels * n_freq_bands,). In this case, possible [meaningful] feature names could be ch0_band0, ch0_band1, ch0_band2..., ch1_band0,... To "attach" this information to compute_pow_freq_bands we could use a decorator:

@with_feature_names('ch[0]_band[1]')
def compute_pow_freq_bands(sfreq, data, ...)
    ...

So that, in FeatureFunctionTransformer, self.func.feat_names_pattern gives 'ch[0]_band[1]'. Then, in the get_feature_names method of FeatureFunctionTransformer, we could have something like this:

pattern = self.func.feat_names_pattern  # 'ch[0]_band[1]'
feature_names = self._get_feature_names_helper(pattern, self.out_shape)
return feature_names

where out_shape corresponds to X_out.shape with X_out, the array returned by the feature function. For this to work, feature functions will need to return multidimensional ndarrays. This is not a big issue since the transform method of FeatureFunctionTransformer could be changed to return X_out.ravel() instead of X_out.

In the code above, _get_feature_names_helper would transform 'ch[0]_band[1]' into the list ['ch0_band0', 'ch0_band1', ..., 'ch1_band0', 'ch1_band1', ...] using the following rationale: [0] in the input string means iterating from 0 to out_shape[0] - 1, and [1] means iterating from 0 to out_shape[1] - 1. This way, 'ch[0]_band[1]' would be equivalent to:

['ch%s_band%s' % (i, j) for i in range(out_shape[0]) for j in range(out_shape[1])]

... you get the idea!

If a feature function does not have a feat_names_pattern attribute, nothing changes.

agramfort · 2018-07-18T07:27:24Z

I would do it like this:

import numpy as np


def pow_freq_bands(X, bands):
    return np.random.randn(X.shape[0], len(bands))


def _pow_freq_bands_feature_names(X, bands):
    return ['ch%s_band%s' % (i, j) for i in range(X.shape[0]) for j in range(len(bands))]


pow_freq_bands.get_feature_names = _pow_freq_bands_feature_names

if __name__ == '__main__':
    func = pow_freq_bands
    X = np.random.randn(2, 20)
    bands = [(8, 12), (18, 22)]
    if hasattr(func, 'get_feature_names'):
        feature_names = func.get_feature_names(X, bands)

    print(feature_names)

by adding an attribute to the function. func.get_feature_names should expect the same parameters as func to make things easy.

clear?

codecov · 2018-09-20T16:59:25Z

Codecov Report

Merging #42 into master will increase coverage by 0.38%.
The diff coverage is 95.29%.

@@            Coverage Diff             @@
##           master      #42      +/-   ##
==========================================
+ Coverage   92.74%   93.12%   +0.38%     
==========================================
  Files          10       10              
  Lines        1089     1164      +75     
==========================================
+ Hits         1010     1084      +74     
- Misses         79       80       +1

Impacted Files	Coverage Δ
mne_features/tests/test_feature_extraction.py	`88.67% <100%> (ø)`	⬆️
mne_features/univariate.py	`97.26% <100%> (+0.48%)`	⬆️
mne_features/utils.py	`91.78% <83.33%> (-0.76%)`	⬇️
mne_features/feature_extraction.py	`95.27% <91.66%> (+1.01%)`	⬆️
mne_features/tests/test_univariate.py	`84.54% <94.11%> (+1.74%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 461a8a8...8748c87.

agramfort · 2018-09-20T21:00:52Z

mne_features/feature_extraction.py

+        _feature_func = _get_python_func(self.func)
+        _params = self.get_params()
+        if hasattr(_feature_func, 'get_feature_names'):
+            self.feature_names = _feature_func.get_feature_names(X, **_params)


I don't like that calling transform affects the state of the object. Only fit is allowed to do this. Can you see a way out? Also, the name of any data-dependent attribute should end with an underscore (_).

I see. Then, what about

def fit(self, X, y=None):
    """Fit the FeatureFunctionTransformer (does not extract features).

    Parameters
    ----------
    X : ndarray, shape (n_channels, n_times)

    y : ignored

    Returns
    -------
    self
    """
    self._check_input(X)
    _feature_func = _get_python_func(self.func)
    _params = self.get_params()
    if hasattr(_feature_func, 'get_feature_names'):
        self.feature_names_ = _feature_func.get_feature_names(X, **_params)
    return self

in FeatureFunctionTransformer?

yes, this is OK in terms of API, but _params = self.get_params() could be moved inside the if block
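To make the outcome of this thread concrete, here is a minimal, self-contained sketch of the pattern being discussed: feature names are computed in fit, stored in an attribute ending with an underscore, and get_params() is only called inside the if block. FeatureFunctionTransformerSketch and its params argument are hypothetical stand-ins, not the mne_features class, and pow_freq_bands is a placeholder feature function.

```python
import numpy as np


def pow_freq_bands(X, bands):
    # Placeholder feature function; the values are irrelevant here.
    return np.zeros((X.shape[0], len(bands)))


def _pow_freq_bands_feature_names(X, bands):
    return ['ch%s_band%s' % (i, j)
            for i in range(X.shape[0]) for j in range(len(bands))]


# The names function takes the same parameters as the feature function.
pow_freq_bands.get_feature_names = _pow_freq_bands_feature_names


class FeatureFunctionTransformerSketch:
    """Simplified stand-in used only to illustrate the fit/transform split."""

    def __init__(self, func, params=None):
        self.func = func
        self.params = {} if params is None else params

    def get_params(self):
        return dict(self.params)

    def fit(self, X, y=None):
        if hasattr(self.func, 'get_feature_names'):
            # Parameters are fetched only when they are actually needed.
            _params = self.get_params()
            self.feature_names_ = self.func.get_feature_names(X, **_params)
        return self

    def transform(self, X):
        # transform does not modify the state of the object.
        return self.func(X, **self.get_params()).ravel()


X = np.zeros((2, 20))
tr = FeatureFunctionTransformerSketch(
    pow_freq_bands, {'bands': [(8, 12), (18, 22)]})
feats = tr.fit(X).transform(X)
print(tr.feature_names_)
```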

jbschiratti · 2018-09-27T10:21:53Z

@agramfort If you're OK and if Travis is OK, it's good to go !

agramfort · 2018-09-27T11:25:57Z

there is no test to check that the feature names are correct. Please add one. thx

jbschiratti · 2018-09-27T12:42:23Z

Test added !

agramfort · 2018-09-27T13:02:27Z

mne_features/tests/test_univariate.py

+    fb = np.array([[4., 8.], [30., 70.]])
+    ratios_col_names = ['ch0_0_1', 'ch0_1_0', 'ch1_0_1', 'ch1_1_0',
+                        'ch2_0_1', 'ch2_1_0']
+    pow_col_names = ['ch0_0', 'ch0_1', 'ch1_0', 'ch1_1', 'ch2_0', 'ch2_1']


no way to have more explicit names like alpha, beta etc?

We could improve that but, at some point, the user would need to name the frequency bands they wish to use. What about allowing the freq_bands parameter in compute_pow_freq_bands to be a dict like the one below?

freq_bands = {'delta': [0.5, 4], 'theta': [4, 8], 'alpha': [8, 13], 'beta': [13, 30], 'low-gamma': [30, 70], 'high-gamma': [70, 100]}

agramfort · 2018-09-27T15:15:01Z

yes +1 !

jbschiratti · 2018-09-28T16:45:06Z

The last commit improves the feature names when freq_bands is a dict such as:

freq_bands = {'delta': [0.5, 4], 
              'theta': [4, 8], 
              'alpha': [8, 13], 
              'beta': [13, 30], 
              'low-gamma': [30, 70], 
              'high-gamma': [70, 100]}
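As a rough illustration of how band names could feed into the feature names, here is a minimal sketch. The helper pow_freq_bands_feature_names is hypothetical, and the naming scheme for dict input (for example ch0_alpha) is an assumption, since the thread does not spell it out. For array input, the sketch reproduces the index-based names used in the test shown earlier in the review.

```python
import numpy as np


def pow_freq_bands_feature_names(n_channels, freq_bands, ratios=None):
    """Hypothetical helper: build names from band labels or band indices."""
    if isinstance(freq_bands, dict):
        # Keys are the band names; dicts keep insertion order (Python 3.7+).
        band_labels = list(freq_bands)
    else:
        band_labels = [str(j) for j in range(len(freq_bands))]
    pow_names = ['ch%s_%s' % (ch, b)
                 for ch in range(n_channels) for b in band_labels]
    # One ratio name per ordered pair of distinct bands, per channel.
    ratios_names = ['ch%s_%s_%s' % (ch, b1, b2)
                    for ch in range(n_channels)
                    for b1 in band_labels for b2 in band_labels
                    if b1 != b2]
    if ratios is None:
        return pow_names
    elif ratios == 'only':
        return ratios_names
    else:
        return pow_names + ratios_names


fb = np.array([[4., 8.], [30., 70.]])
print(pow_freq_bands_feature_names(3, fb))
print(pow_freq_bands_feature_names(3, fb, ratios='only'))
print(pow_freq_bands_feature_names(1, {'alpha': [8, 13], 'beta': [13, 30]}))
```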

agramfort · 2018-09-28T16:56:47Z

Thanks

Attempt at improving get_feature_names (so that extract_features returns a dataframe with a meaningful multiindex).

064225b

agramfort reviewed Jul 12, 2018

jbschiratti mentioned this pull request Jul 12, 2018

evocative feature names for power band ratios #41

Closed

* Specific feature names for compute_pow_freq_bands: added `_compute_pow_freq_bands_feat_names` in univariate.py + changes in feature_extraction.py.
* Minor changes in tests and examples to get rid of several warnings (issue mne-tools#44).

1fa9f34

jbschiratti mentioned this pull request Sep 20, 2018

Warnings with Scikit-Learn >= 0.20 #44

Open

Fixes for Travis.

bf9d818

agramfort reviewed Sep 20, 2018

Added fit method to FeatureFunctionExtractor (does not extract features; only used to get feature names when this is possible).

80dfa84

Added test for feature names with compute_pow_freq_bands.

a71fe05

agramfort reviewed Sep 27, 2018

Allow freq_bands parameter (in compute_pow_freq_bands and `compute_energy_freq_bands`) to be a dict with band names as keys. Added feature names for `compute_energy_freq_bands` + updated tests.

8748c87

agramfort merged commit 911d868 into mne-tools:master Sep 28, 2018

jbschiratti deleted the feature_names branch September 28, 2018 17:06

paulroujansky mentioned this pull request Oct 16, 2020

Improving feature names 2 #60

Merged
