-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor WormFeatures by feature type instead of category #133
Comments
Yesterday I was thinking of nixing this issue. Today I think we should make it a higher priority. I would specify the issue differently. I'd like to be able to process certain features so that I can test features which are not working see #114 . Currently we can't ignore features. This will also become important with handling skeleton only inputs. Here's the new layout, I welcome feedback. Somewhere we have a list of features. This currently exists in feature_metadata. We might create another list and then a unit test that makes sure entries in 1 exist in the other. In the new list, we have a list of features and their dependencies. My preference would be to have the list order determine computation order rather than trying to determine this in code. The dependencies list would allow us to check that previous entries were enabled and enable them if necessary. i.e. something like:
We then compute 1, then 2, etc. If we only want 1 & 4 it is relatively trivial to determine that we need 3 as well. This is less difficult that creating the order in code. WormFeatures get's rewritten as just an enumeration over the list of features. I'm not sure where to store the resulting features. Perhaps WormFeatures gets renamed and then features becomes a dictionary ... (Feedback needed) Each feature would be rewritten as a class with relevant methods (comparison, saving, loading, explaining/plotting, etc) In the beginning of each constructor, all relevant dependencies would be extracted from the WormFeatures (or new container) class. This would allow the constructor to grab the relevant data, but would ensure that we don't nest dependencies deep in the calculation code. I have a class called FeatureProcessingOptions which I would then instrument with methods like ignoreCategory('Morphology') or onlyRun(['morphology.length','morphology.area_per_length']) in addition to other options that currently exist. I'm thinking of starting a branch for this work. Thoughts? |
Hello, Talking with Andre about the failed tests, the mismatches in "locomotion.turns.upsilons.start_frames" and in "locomotion.turns", might be because a different version of the code was used in the database creation. If you guys thing this could be the reason, I could take care of it. About the reorganisation. I like the idea to have the features as a dictionary. This is more similar to the MATLAB structures since they are dynamic data types, and we could even introduce new features without having to change the class definition. Obviously, there is the chance that spurious keys could appear if one is not careful, but I think this is a lesser concern (it could have happen too with the original MATLAB structures). Are we using nested dictionaries, or the keys on the dictionary would be something like 'locomotion.turns.upsilons.start_frames'? I would prefer the later. I really like the idea of having each feature as a separated class. I think it makes more sense. For example, let's say I have a tracker that only gives me the worm positions. It would be nice to call a single class/function to extract the path features from the x-y coordinates. |
The values were correct at some point. Either a bug was introduced at some point or the translation from my Matlab code was wrong and code was introduced in the Python version that made the error that was always there known. I'll hopefully have it fixed soon. The dictionary would go directly to the feature, although it wouldn't go as deep as you are suggesting. start_frames is an attribute of the upsilons feature, not a feature in itself. Thus the dictionary entry would be: I had originally discouraged this refactoring during our code translation from Matlab to Python. Now that the feature code is relatively well established I think it is time to make it a bit more user friendly. The goal will be to make what you are suggesting regarding path features relatively easy. |
@JimHokanson p.s. I think you know this but I found an example of getting a class instance to behave like a dict: the Pandas dataframe. The "key" (no pun intended) is to define a You referenced this StackOverflow question to me in our recent conversation so I'll just throw it in here for posterity. |
I just realized that another benefit of having each of the features "know" its type is that it simplifies the histograms = HistogramManager([wf1, wf2, wf3]) Then Instead if |
I worked on this a bit tonight. I'm adding the code directly to the master branch so that things stay up to date. There is a new test file in the examples that creates the morphology features via iteration. It still needs a lot of work and will probably need some refactoring once completed but this should make a lot of things cleaner. |
An update: The new code can be run at: The code is incomplete but it should give some idea of where things are headed. Outline: Benefits:
Big issues left to handle:
@clinzy This new code will hopefully be more amenable to documentation ... I've reused classes where possible but some of the old classes are going to be thrown away. I can help with moving code if necessary to bring things into the new approach. |
@JimHokanson is it safe to close this issue or are the "Big Issues left to handle" still holding this up? I think the fact that the features are calculated in this new way should not change downstream processing much, am I right? Thanks |
Right now our code breaks down features by "category":
But these are really just labels we attach to group the features for reporting purposes. As far as actual calculation, what's more important is the "type":
One can imagine that code common to all event-based features would be best grouped into one class, rather than right now, when coils are in the LocomotionFeature class and omegas, upsilons, and motion_events are in PostureFeature class. Perhaps all four of the event-based features belong in an EventBasedFeature class.
See the individual features and which bucket they fit into in these two schemas in the supplemental documentation.
This is not urgent, or even necessary, but in the future it may be better to group these functionally rather than by category.
The text was updated successfully, but these errors were encountered: