added mean, min, and max feature extraction methods #20

Merged
merged 1 commit into from Feb 7, 2018

Conversation

Nathaniel-Haines
Collaborator

Let me know what you all think of this method of feature extraction. The output format matches the boft extractor (one row, with a column for each feature). Specifying the 'by' argument lets users group observations by other features in the data before summarizing (e.g., by subject or trial). By default, the functions summarize data across all rows.

This is all default pandas functionality too, so it is quick and easy.
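As a rough illustration of the behavior described above, here is a minimal sketch in plain pandas. The function name `extract_mean`, the `mean_` column prefix, and the example columns are assumptions made for illustration; they are not taken from the merged code.

```python
import pandas as pd

def extract_mean(data, by=None):
    """Hypothetical helper: summarize numeric columns by their mean."""
    if by is None:
        # Summarize across all rows, giving a single-row frame
        out = data.mean(numeric_only=True).to_frame().T
    else:
        # Group observations first, giving one row per group
        out = data.groupby(by).mean(numeric_only=True)
    # Rename columns so the summary type is visible in the output
    return out.add_prefix("mean_")

# Made-up example data: two feature columns and a grouping column
df = pd.DataFrame({
    "subject": ["a", "a", "b"],
    "AU1": [1.0, 3.0, 5.0],
    "AU2": [0.0, 2.0, 4.0],
})

overall = extract_mean(df)                   # one row, one column per feature
per_subject = extract_mean(df, by="subject") # one row per subject
```

Swapping `mean` for `min` or `max` would give the other two summaries in the same shape.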

@ljchang
Member

ljchang commented Feb 7, 2018

I think it is fine to start with these. It's pretty easy to do this using the built-in functionality:

fex.groupby(column_name).mean()

I think the only thing that yours adds is the renaming of the column names.

At some point we need to figure out how to be able to chain or pass a list of features to extract, or possibly a pipeline. I know I sound like a broken record, but I really like how pliers solves this problem. It can deal with multiple features in one line and merges them all together in the output automatically.
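One way the "pass a list of features" idea could look, sketched in plain pandas rather than with pliers. The function `extract_features` and its `methods` parameter are hypothetical, and the sketch assumes every column other than the grouping column is numeric.

```python
import pandas as pd

def extract_features(data, methods=("mean", "min", "max"), by=None):
    """Hypothetical helper: apply several summaries and merge them into one flat frame."""
    pieces = []
    for name in methods:
        if by is None:
            # Summarize numeric columns across all rows (single-row frame)
            summary = data.select_dtypes("number").agg(name).to_frame().T
        else:
            # Summarize within each group (one row per group)
            summary = data.groupby(by).agg(name)
        # Prefix columns with the summary name so merged columns stay unique
        pieces.append(summary.add_prefix(f"{name}_"))
    # Merge all summaries side by side on the shared index
    return pd.concat(pieces, axis=1)

# Made-up example data
df = pd.DataFrame({
    "subject": ["a", "a", "b"],
    "AU1": [1.0, 3.0, 5.0],
    "AU2": [0.0, 2.0, 4.0],
})

flat = extract_features(df)                    # 1 row, 6 columns
grouped = extract_features(df, by="subject")   # 1 row per subject, 6 columns
```

This only covers simple column-wise summaries; extractors with different output shapes would need a real pipeline design.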

@Nathaniel-Haines
Collaborator Author

Nathaniel-Haines commented Feb 7, 2018 via email

@coveralls
Collaborator

coveralls commented Feb 7, 2018

Pull Request Test Coverage Report for Build 67

  • 21 of 24 (87.5%) changed or added relevant lines in 2 files are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.08%) to 86.495%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
feat/data.py | 18 | 21 | 85.71%

Files with Coverage Reduction | New Missed Lines | %
feat/tests/test_feat.py | 2 | 95.65%

Totals
Change from base Build 66: 0.08%
Covered Lines: 269
Relevant Lines: 311

💛 - Coveralls

@ljchang
Member

ljchang commented Feb 7, 2018

So pliers has stimulus, extractor, and transformer classes. It's a pretty cool architecture that makes it extensible to almost anything, but it might be overkill for our purposes. Their transformer class also has a really cool feature that treats a set of features and pipelines as a graph and parallelizes them. Lots to learn from their code.

It would be great to have feature extractors as methods on our Fex data class. However, because they output data in a different format, in the long term it might make more sense to have an extractor class in the style of the sklearn API: each algorithm or extractor is its own class with a consistent API (e.g., fit, transform). We could do something like pliers, where the output is a flat dataframe that can then be used for analyses.
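A minimal sketch of what an sklearn-style extractor might look like, under the assumption that it only needs `fit` and `transform`. The class name `SummaryExtractor` and its parameters are hypothetical. It does not inherit from any scikit-learn base class, so it only imitates the convention.

```python
import pandas as pd

class SummaryExtractor:
    """Hypothetical extractor that follows the fit/transform convention."""

    def __init__(self, method="mean", by=None):
        self.method = method  # name of a pandas aggregation, e.g. "mean"
        self.by = by          # optional grouping column

    def fit(self, X, y=None):
        # Nothing is learned for a plain summary; just record the feature columns
        self.columns_ = [
            c for c in X.select_dtypes("number").columns if c != self.by
        ]
        return self

    def transform(self, X):
        if self.by is None:
            # Single-row summary across all observations
            out = X[self.columns_].agg(self.method).to_frame().T
        else:
            # One summary row per group
            out = X.groupby(self.by)[self.columns_].agg(self.method)
        # Flat dataframe with the summary name in each column label
        return out.add_prefix(f"{self.method}_")

# Made-up example data
df = pd.DataFrame({
    "subject": ["a", "a", "b"],
    "AU1": [1.0, 3.0, 5.0],
    "AU2": [0.0, 2.0, 4.0],
})

max_by_subject = SummaryExtractor("max", by="subject").fit(df).transform(df)
mean_overall = SummaryExtractor("mean").fit(df).transform(df)
```

Because every extractor would share the same two methods, several of them could be applied in a loop and their flat outputs concatenated for analysis.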

@ljchang
Member

ljchang commented Feb 7, 2018

I would say we are still in the exploration phase, so we should try things a few different ways and see whether any designs feel natural across a variety of use cases. I would recommend merging, but we might later remove all of these methods if we come up with a better and cleaner way to do this.

@Nathaniel-Haines
Collaborator Author

Sounds good! I agree that there will be a lot to try. I will merge this for now and then start looking into some of the ideas that you mentioned, @ljchang.

@Nathaniel-Haines Nathaniel-Haines merged commit 9a380b8 into cosanlab:master Feb 7, 2018
TiankangXie pushed a commit to TiankangXie/feat that referenced this pull request Feb 15, 2021
added mean, min, and max feature extraction methods
ejolly pushed a commit that referenced this pull request Apr 26, 2022
added mean, min, and max feature extraction methods

3 participants