-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add: psd_aggregated #825
base: main
Are you sure you want to change the base?
Add: psd_aggregated #825
Conversation
Features based on power spectral density: "mean_frequency", "median_frequency", "peak_frequency", "mean_power". The scipy's periodogram is used to calculate the base signal for the features. This features are useful in frequency domain analisys for EMG signals. The main reference for this features are: [1] Phinyomark, A., Phukpattaranont, P., & Limsakul, C. (2012). Feature reduction and selection for EMG signal classification. Expert systems with applications, 39(8), 7420-7431.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done, thank you for your first PRs to tsfresh and welcome!
I had a few questions about the overall setting.
Could you also have a look why the unit tests are failing and could you add unittests for your new feature? We try to aim a high test coverage at least for the feature calculators.
:return: the value of the feature | ||
:return type: float | ||
""" | ||
return np.sum(power_spectrum)/len(power_spectrum) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also a np.mean function - could you use that?
#the array has only positive values, if there is no better distance, we have found the index | ||
if best_distance == previous_distance: | ||
return frequency[i] | ||
previous_distance = distance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you implemented the median by yourself and did not just use the median implementation (or quantile) of numpy? Typically, those a very fast and will also cover some edge cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On page 7424 in [1] is described the equation for the median frequency and is said: "MDF is a frequency at which the spectrum is divided into two regions with equal amplitude", I'm not sure if that definition is the same as a traditional median (I'm conceptually not that strong to make that conclusion), maybe using the median yields the same results as this algorithm, but I replicated the equation from the paper in my original reference.
:return: the value of the feature | ||
:return type: float | ||
""" | ||
return np.sum(frequency * power_spectrum) / np.sum(power_spectrum) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here it might make sense to use np.mean
Are you aware of https://docs.scipy.org/doc/scipy/reference/signal.html? There is a wide range of signal processing techniques that is already implemented. I unfortunately do not have much experience with signal processing |
@theinem - is there anything I can help you to finish with this PR? |
I would like to make the improvements you asked for, but for several weeks I have had serious problems with available time and it will continue like this for several more weeks/months 😓 |
@theinem - this is not a problem from our side. We are happy to get your code in also in a few weeks. If you feel you can not make it (at all), just tell use and we will try to take over! |
Thanks, @nils-braun. I think it would be better if the community uses that code and improves it. It is definitely very complicated for me due to my current responsibilities. |
Features based on power spectral density: "mean_frequency", "median_frequency", "peak_frequency", "mean_power". The Scipy's periodogram is used to calculate the base signal for the features.
This features are useful in frequency domain analisys for EMG signals. The main reference for this features are: [1] Phinyomark, A., Phukpattaranont, P., & Limsakul, C. (2012). Feature reduction and selection for EMG signal classification. Expert systems with applications, 39(8), 7420-7431.
I'm a newbie to contributing to projects on Github, but I tried to follow the steps in your documentation.