The ultimate method for R-peak detection? #222
Interesting. This is conceptually similar to ensemble learning in predictive modelling. It would be interesting to see whether an ensemble detector performs significantly better than the individual detectors (on a benchmark dataset that is as diverse and large as possible). Benchmarking all the detectors (with their default parameters) would be interesting anyway. I think we had a similar idea (#78) a while ago.
Yeah, this could be a super useful little study. The hard part is finding/creating the benchmark dataset, but with the simulators and the …
Wow 😃
@TiagoTostas as my understanding of the specificities of the different methods is limited, could you maybe help remove the methods that are, in your opinion, useless (that would do more harm than good) from this meta-method: NeuroKit/neurokit2/ecg/ecg_findpeaks.py, lines 145 to 154 at commit 7a96f37
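(For illustration only, here is a hypothetical rendering of such a pluggable detector list, not the actual contents of those lines; `nk.ecg_peaks` and the method names are from the NeuroKit2 API:)

```python
import neurokit2 as nk

# Hypothetical sketch, not the actual contents of ecg_findpeaks.py:
# if the meta-method iterates over a configurable list of detectors,
# dropping a weak method is just removing its name from the list.
CANDIDATE_METHODS = [
    "neurokit", "pantompkins1985", "hamilton2002", "christov2004",
    "gamboa2008", "elgendi2010", "engzeemod2012", "kalidas2017",
]

def collect_candidate_peaks(ecg, sampling_rate, methods=CANDIDATE_METHODS):
    """Run every candidate detector and collect its R-peak indices."""
    peak_lists = []
    for method in methods:
        _, info = nk.ecg_peaks(ecg, sampling_rate=sampling_rate, method=method)
        peak_lists.append(info["ECG_R_Peaks"])
    return peak_lists
```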
As for the preprocessing study, you might be right. I assumed it would be simpler to generate the data ourselves, as we could then really control all of the parameters (sampling rate, heart rate, and distortion) and have a virtually unlimited set, but we might indeed be reinventing the wheel here (moreover, one of the main limitations would be the absence of biological artefacts like ectopic beats etc., as our simulator only generates a "healthy" signal). If the databases on which we could test different algorithms are accessible in open access, maybe it would then be useful to facilitate their usage as a testing framework. For instance, we could either directly store/sanitize/format/combine them in some repo, or create a function that downloads them. Then, we could have a function to which we pass a preprocessing function (…
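Hypothetically (all names below are made up for illustration), such a testing function could look like this: you pass any detector callable together with an annotated recording, and get back standard detection metrics within a tolerance window.

```python
import numpy as np

def benchmark_detector(detector, ecg, sampling_rate, true_peaks, tolerance=0.05):
    """Score an R-peak detector against reference annotations.

    `detector` is any callable taking (ecg, sampling_rate) and returning
    R-peak sample indices; `tolerance` is the matching window in seconds.
    Matching is greedy here; a strict benchmark would enforce one-to-one
    matching between detections and annotations.
    """
    detected = np.asarray(detector(ecg, sampling_rate))
    true_peaks = np.asarray(true_peaks)
    window = tolerance * sampling_rate
    # A true peak counts as found if any detection lies within the window.
    tp = sum(np.any(np.abs(detected - t) <= window) for t in true_peaks)
    fn = len(true_peaks) - tp
    fp = len(detected) - tp
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * sensitivity * precision / (sensitivity + precision)
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}
```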
(-) The methods based on moving averages and thresholding are usually not very specific (Pan-Tompkins, Christov, and Hamilton). I verified this when doing some tests previously, but it is better explained in the reference above if you are curious.
Yeah, I think that's the case :( But usually this is done by each researcher individually (I believe), and that's why it leads to these inconsistencies. PhysioNet is well documented in terms of importing the annotations into Python, and we can also get some inspiration from the link, so I think that adding this validation block to the pipeline would be a great/useful addition.
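For reference, pulling a record and its annotations from PhysioNet with wfdb is only a couple of calls (MIT-BIH record 100 as an example; non-beat annotation symbols may still need filtering):

```python
import wfdb

# Fetch the ECG signal and reference annotations for MIT-BIH record 100
# directly from the PhysioNet servers.
record = wfdb.rdrecord("100", pn_dir="mitdb")
annotation = wfdb.rdann("100", "atr", pn_dir="mitdb")

ecg = record.p_signal[:, 0]    # first channel (MLII for this record)
sampling_rate = record.fs      # 360 Hz for the MIT-BIH database
r_peaks = annotation.sample    # annotated sample indices
symbols = annotation.symbol    # annotation symbols (beats and non-beats)
```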
@DominiqueMakowski, this is essentially what the wfdb package does with the PhysioNet databases. However, the problem is that some of these are poorly annotated, as pointed out by @berndporr in this paper. I believe this is what led his team to create the Glasgow University Database (GUDB), which is quite nice (I use it to benchmark the biopeaks/NeuroKit R-peak detector). I feel like we should start by benchmarking our own detectors to provide users with some clear indication of (relative) performance. Making general statements about detector performance is quite tricky, as pointed out by @TiagoTostas, since performance is heavily influenced by idiosyncratic preprocessing steps, implementation details of the detector, and, importantly, also the evaluation criteria (e.g., the tolerance for a match between manually annotated peaks and peaks identified by the detectors). We could use the GUDB for that and further distort the ECG (although the database already includes "running" and "handbike" conditions).
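For the distortion part, NeuroKit's own signal tools could be reused; a rough sketch (the parameter values are arbitrary examples, not a proposed protocol):

```python
import neurokit2 as nk

# Stand-in for a benchmark recording (e.g., a GUDB record).
ecg = nk.ecg_simulate(duration=60, sampling_rate=250, heart_rate=70)

# Add increasing amounts of broadband and powerline noise to probe how
# gracefully each detector degrades.
for noise in [0.05, 0.1, 0.2, 0.5]:
    distorted = nk.signal_distort(
        ecg,
        sampling_rate=250,
        noise_amplitude=noise,
        noise_frequency=10,
        powerline_amplitude=noise / 2,
    )
    # ...run each detector on `distorted` and score it against the annotations
```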
I've looked into setting up a pipeline for benchmarking our detectors. The problem with the GUDB is that the data need to be requested and downloaded manually. This means that we'd have to host the database ourselves somewhere. In contrast, with PhysioNet the data can be fetched from their servers using the wfdb API. For now, we can set up a benchmark using wfdb and PhysioNet and then see how we can improve this.
I have the database on a Git server at the university here as well, but it's not public. The reason the data needs to be requested is that central IT haven't got enough space, as far as I know. I'm setting up an HTTP server at the moment anyway for another project and could then point an API to it. I need to talk to IT regarding this, but I do see the appeal of having the dataset available via an API. We have exam season here, so I can look into this next week.
Hi @berndporr, thanks for chiming in!
That makes sense. Would it be an option to host the dataset without the MP4 files (or host them separately)? I believe those create most of the bulk.
That would be amazing!
Hi all,
…and then have a play with the usage example. The datasets are simply on this GitHub as gh-pages. The API then does an HTTP request and parses them on the fly.
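Loading a record then looks roughly like this (class and attribute names as recalled from the GUDB repository's usage example; treat them as assumptions and double-check against the package docs):

```python
from ecg_gudb_database import GUDb

# Participant 0, 'walking' condition; the loader fetches the data over
# HTTP from the gh-pages hosting mentioned above.
record = GUDb(0, "walking")
ecg = record.einthoven_II        # Einthoven lead II from the cable ECG
sampling_rate = record.fs        # 250 Hz
if record.anno_cs_exists:        # chest-strap ground-truth R-peak annotations
    r_peaks = record.anno_cs
```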
There are many, many R-peak detection methods and algorithms, with no clear guidelines on which one to use or which one is the best. It is likely that they all have some strengths and weaknesses.
For some reason, I woke up with this idea in mind: what if we found a way of combining all these methods of R-peak detection? The tricky part is that the methods return peak indices, whose combination into a probabilistic statement is not straightforward.
So I thought: why not consider each peak detected by a method as the peak of a (probability) distribution, like a normal curve whose width approximately covers a QRS segment. If we convolve each peak with this normal distribution, we obtain a sort of continuous pseudo-probabilistic signal. We can then combine the results from all the methods by summing them.
Once we have the combination of these convolved peaks, we can select its maxima as the most probable peaks. I added a first draft here.
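In code, a minimal NumPy/SciPy sketch of that pipeline could look like this (the kernel width and consensus threshold are assumptions, not values taken from the draft):

```python
import numpy as np
import scipy.signal

def ensemble_peaks(peak_lists, signal_length, sampling_rate=1000):
    """Combine R-peak candidates from several detectors.

    Each detector's peaks become a spike train that is smoothed with a
    Gaussian kernel roughly as wide as a QRS complex (~100 ms); summing
    the smoothed trains yields a pseudo-probabilistic signal whose maxima
    are the consensus peaks.
    """
    sd = int(0.025 * sampling_rate)  # kernel SD, ~1/4 of a 100 ms QRS
    kernel = scipy.signal.windows.gaussian(8 * sd, std=sd)
    combined = np.zeros(signal_length)
    for peaks in peak_lists:
        spikes = np.zeros(signal_length)
        spikes[np.asarray(peaks, dtype=int)] = 1
        combined += np.convolve(spikes, kernel, mode="same")
    # Keep maxima supported by at least half of the detectors.
    consensus, _ = scipy.signal.find_peaks(combined, height=len(peak_lists) / 2)
    return consensus, combined
```

The height threshold effectively implements majority voting; a minimum peak distance could be added to avoid double-counting within a single QRS.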
Does that make sense to you? Could it be improved? Or is it a bad idea?
@JanCBrammer @TiagoTostas