added mean, min, and max feature extraction methods #20

Nathaniel-Haines · 2018-02-07T03:58:28Z

Let me know what you all think of this method of feature extraction. The output format is the same as the boft extractor (1 row, and a column for each feature), and specifying the 'by' argument allows users to group observations by other features in the data before summarizing (e.g. by subjects, trials, or whatever). By default, the functions will summarize data across all rows.

This is all default pandas functionality too, so it is quick and easy.
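For readers following along, here is a minimal sketch of the behavior described above, written against a plain pandas DataFrame. The function name `extract_summary`, its `stat` parameter, and the `mean_`/`min_`/`max_` column prefixes are assumptions made for illustration; this is not the code added in this pull request.

```python
import pandas as pd


def extract_summary(df, stat="mean", by=None):
    """Hypothetical helper: summarize the numeric columns of `df`.

    Illustration only; the actual methods in feat/data.py may differ.
    """
    if stat not in {"mean", "min", "max"}:
        raise ValueError(f"unsupported statistic: {stat}")
    if by is None:
        # Summarize across all rows, then transpose the resulting Series
        # into a single-row DataFrame with one column per feature.
        out = getattr(df, stat)(numeric_only=True).to_frame().T
    else:
        # Summarize within each group defined by `by` (one row per group).
        out = getattr(df.groupby(by), stat)(numeric_only=True)
    # Prefix the column names so the statistic is visible in the output.
    out.columns = [f"{stat}_{col}" for col in out.columns]
    return out
```

Called without `by`, the sketch returns one row; called with `by="subject"` (a made-up column name), it returns one row per subject.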

ljchang · 2018-02-07T04:08:35Z

I think it is fine to start with these. It's pretty easy to do this using the built-in functionality:

fex.groupby(column_name).mean()

I think the only thing that yours adds is the renaming of the columns.

At some point we need to figure out how to be able to chain or pass a list of features to extract, or possibly a pipeline. I know I sound like a broken record, but I really like how pliers solves this problem. It can deal with multiple features in one line and merges them all together in the output automatically.
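As a rough illustration of passing a list of features in one line, plain pandas can already compute several summaries at once and merge them into one table. This is only a sketch using pandas' own `agg`; it is not how pliers works, and the column names (`subject`, `AU01`, `AU02`) and the flattened naming scheme are made up for the example.

```python
import pandas as pd

# Toy data standing in for a Fex-like table (column names are hypothetical).
df = pd.DataFrame(
    {
        "subject": ["s1", "s1", "s2", "s2"],
        "AU01": [0.1, 0.3, 0.5, 0.7],
        "AU02": [1.0, 2.0, 3.0, 4.0],
    }
)

# Request several summary statistics at once; pandas returns one row per
# group with MultiIndex columns of the form (feature, statistic).
summary = df.groupby("subject").agg(["mean", "min", "max"])

# Flatten ("AU01", "mean") into "mean_AU01" so the output is a flat table.
summary.columns = [f"{stat}_{col}" for col, stat in summary.columns]
```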

Nathaniel-Haines · 2018-02-07T04:14:22Z

It renames the columns and also transposes the output when not grouping, but yes, the added functionality is not much. That said, I think it would be nice for all feature extraction methods to follow the standard format 'Fex.extract_*' so that users do not need to know which are pandas defaults and which are specific to feat. I agree that a pipeline would be best. I can begin looking into how Pliers does this tomorrow and see what I can do to implement something similar.


coveralls · 2018-02-07T04:15:08Z

Pull Request Test Coverage Report for Build 67

21 of 24 (87.5%) changed or added relevant lines in 2 files are covered.
2 unchanged lines in 1 file lost coverage.
Overall coverage increased (+0.08%) to 86.495%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
feat/data.py	18	21	85.71%

Files with Coverage Reduction	New Missed Lines	%
feat/tests/test_feat.py	2	95.65%

Totals
Change from base Build 66:	0.08%
Covered Lines:	269
Relevant Lines:	311

💛 - Coveralls

ljchang · 2018-02-07T04:20:03Z

So pliers has stimulus, extractor, and transformer classes. It's a pretty cool architecture that makes it extensible for almost anything, but it might be overkill for our purposes. They also have really cool functionality built into their transformer class that treats a bunch of features and pipelines as graphs and parallelizes them. Lots to learn from their code.

It would be great to have feature extractors as methods on our Fex data class. However, because they output data in a different format, in some ways it might make more sense in the long term to have an extractor class that follows the sklearn API style: each algorithm or extractor is its own class with a consistent API (e.g., fit, transform). We could do something like pliers, where the results can be output to a flat dataframe that can then be used for analyses.
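To make the sklearn-style idea concrete, here is a minimal sketch under stated assumptions: the class `SummaryExtractor` and the helper `extract_all` are hypothetical names, not part of feat, pliers, or scikit-learn, and the only thing borrowed from sklearn is the `fit`/`transform` convention.

```python
import pandas as pd


class SummaryExtractor:
    """Hypothetical extractor with an sklearn-like fit/transform interface."""

    def __init__(self, stat="mean"):
        self.stat = stat  # name of a pandas aggregation, e.g. "mean", "min", "max"

    def fit(self, X, y=None):
        # A plain summary statistic has nothing to learn; return self so
        # that calls can be chained, as sklearn estimators do.
        return self

    def transform(self, X):
        # Aggregate every column, giving a one-row DataFrame.
        out = X.agg([self.stat])
        out.columns = [f"{self.stat}_{col}" for col in out.columns]
        return out.reset_index(drop=True)


def extract_all(X, extractors):
    """Run each extractor and merge the outputs into one flat, single-row DataFrame."""
    return pd.concat([e.fit(X).transform(X) for e in extractors], axis=1)
```

With this shape, `extract_all(data, [SummaryExtractor("mean"), SummaryExtractor("max")])` would return a single row containing both sets of features, which is one possible way to get the flat output mentioned above.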

ljchang · 2018-02-07T04:21:25Z

I would say we are still in the exploration phase, so we should try things a few different ways and see if there are any designs that really feel natural for a variety of use cases. I would recommend merging, but we might later remove all of these methods if we come up with a better and cleaner way to do this.

Nathaniel-Haines · 2018-02-07T12:27:43Z

Sounds good! I agree that there will be a lot to try. I will merge this for now and then start looking into some of the ideas that you mentioned, @ljchang.

added mean, min, and max feature extraction methods

d9b2a41

Nathaniel-Haines merged commit 9a380b8 into cosanlab:master Feb 7, 2018

TiankangXie pushed a commit to TiankangXie/feat that referenced this pull request Feb 15, 2021

Merge pull request cosanlab#20 from Nathaniel-Haines/master

3370b18

added mean, min, and max feature extraction methods

ejolly pushed a commit that referenced this pull request Apr 26, 2022

Merge pull request #20 from Nathaniel-Haines/master

a4ded60

added mean, min, and max feature extraction methods
