Discussion of correctness of IClabel-python vs EEGLab #5
Comments
Okay let us know when they are on board and we can schedule another time to chat.
Yes, this order is correct. So last time @jacobf18 and I spoke, we identified this issue as well. There are three main components to the ICLabel pipeline:
1. generation of the features from the raw data,
2. formatting of the features for the network, and
3. the network itself (architecture and weights).
So according to Jacob, he verified that the network weights match those of EEGLab by running through various examples and confirming the outputs match numerically. However, a unit test that verifies this would be nice. He also manually tested the formatting part. We unit tested the generation of features on simulated data + raw loaded EEGLab data, and all are matching numerically with EEGLab except for … I believe the differences in output can stem from any of these three parts, so we need to be really systematic in testing each of the three components of the pipeline against EEGLab in order to pinpoint exactly where the differences are.
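A minimal sketch of what such a component-wise regression test could look like (the helper name and tolerance here are assumptions, not the actual mne-icalabel API; the MATLAB outputs would be exported to disk beforehand):

```python
import numpy as np

def assert_matches_matlab(python_out, matlab_out, atol=1e-6):
    """Compare a Python pipeline stage against the exported EEGLab/MATLAB
    output, reporting the maximum absolute deviation on failure."""
    diff = float(np.max(np.abs(np.asarray(python_out, dtype=float)
                               - np.asarray(matlab_out, dtype=float))))
    assert diff <= atol, f"max abs deviation {diff:.3e} exceeds atol={atol:.1e}"
```

The same helper would be applied to each of the three stages (features, formatted features, network output) in turn, so the first stage that diverges is immediately visible.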
Yes that was an incorrect specification of the sampling rate, which I fixed. |
Alright, I think the first task for @anandsaini024 will be to run the network in Python and MATLAB, compare the outputs, and familiarize himself with the code. He can also work on the unit test to verify that the network weights match and to verify the formatting part (if you don't beat him to it 😉). As you said, we have to be very systematic in our testing approach and compare MATLAB and Python output at every stage. At the moment, he is still waiting for his work permit, but he will pick up his laptop and a MATLAB license this week or at the beginning of next week. |
Notes 3/23/22: cc: @jacobf18

Short term goals (to aim to complete by summer)

Our short term goal is to replicate all aspects of EEGLab as exactly as possible. This would result in … It is possible again that the two runs through MATLAB/Python do not match. For example, a unit test for network weights, feature generation and the data pipeline would be enough to convince us that all differences between the two networks are solely from a "randomness" injected in the …
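The network-weights unit test mentioned above could be sketched roughly like this (layer names and how the MATLAB weights are loaded are hypothetical; real code would read them from the EEGLab `.mat` files):

```python
import numpy as np

def compare_weight_dicts(python_weights, matlab_weights, atol=1e-7):
    """Return the names of layers whose weights differ beyond `atol`
    or are missing on the MATLAB side. An empty list means they match."""
    mismatched = []
    for name, w in python_weights.items():
        m = matlab_weights.get(name)
        if m is None or not np.allclose(w, m, atol=atol):
            mismatched.append(name)
    return mismatched
```

Running this once per exported layer would turn "the weights match" from a manual spot check into a reproducible test.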
Longer term goals

The longer term goal is to really improve the ICLabeler. This would involve two efforts: improving the model and improving the benchmarking. The model needs to be benchmarked, so the benchmark is the bulk of the effort. The benchmarking needs to account for real-world variability and requires high-quality training/testing data. The result of this would be …
Datasets available: NK (I think 1020), EGI, ... Other "semi-labeled" datasets could come from Alexandre G. from https://swipe4ica.github.io/#/. Reference: |
@agramfort we are thinking of leveraging https://swipe4ica.github.io/#/ to build out an independent and bigger dataset to improve the ICLabel ported over from EEGLab. I was thinking of having a high-school student interested in working with me contribute to this.
- How easy is it to get access to this dataset from the web app?
- Is it ready for usage?
@mscheltienne in addition, you mentioned you have some external data. How difficult would it be to convert that data to BIDS? Then we could have someone (i.e. this high-school student) just run through all the data and label the IC components by hand. The main work for me/us would be to set up an almost tutorial-like IPython notebook for him to understand and conceptualize what he's doing. |
> How easy is it to get access to this dataset from the web app?

The EEG data is not public, but the rendered images from the app are. The source code is here: https://github.com/Swipe4ICA/swipe4ica.github.io
|
All our datasets have to be converted to a common format anyway, so let's go with BIDS. For the ANT Neuro and EGI datasets I can provide, I don't think it will be too much work to convert them to BIDS and to save the ICA decomposition in a BIDS format. |
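For reference, the BIDS layout we would be converting toward looks like the path below (a sketch using only the standard library; an actual conversion would go through mne-bids' `write_raw_bids`, and the subject/task names here are placeholders):

```python
from pathlib import Path

def bids_eeg_path(root, subject, task, ext=".set"):
    """Build the expected BIDS path for one EEG recording, e.g.
    bids_root/sub-01/eeg/sub-01_task-rest_eeg.set (sketch only)."""
    sub = f"sub-{subject}"
    return Path(root) / sub / "eeg" / f"{sub}_task-{task}_eeg{ext}"
```

Having the target layout written down makes it easy to sanity-check the converted ANT Neuro and EGI datasets against the same structure.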
Note for @adam2392 I need to push up the datafiles under |
FYI: @mscheltienne and @anandsaini024 #11 |
FYI @jacobf18 and I talked briefly today. He found an issue that he'll push up a PR for in the |
Now that this is "correct", should we release v0.1 on pypi? |
I guess we can. Just one question: do we want to be backward compatible / use a deprecation cycle from release 0.1, or can we keep some flexibility until a later release, e.g. when moved to …? I'm thinking about the output of the main |
Fair point. Let's finalize this API issue first before moving to v0.1. Do you know where EEGLab documents the output order of the predicted class probabilities? I think the one I wrote in the examples is right, but it would be nice to double check this and have a reference to something. Re how to output:
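For what it's worth, the class order I believe the EEGLab ICLabel plugin documents is the one below; a small helper mapping probabilities to labels would make the double-check easy (the helper name is just an illustration, not an existing function):

```python
import numpy as np

# Class order as documented for the EEGLab ICLabel plugin (worth
# double-checking against the EEGLab docs, per the discussion above).
ICLABEL_CLASSES = [
    "Brain", "Muscle", "Eye", "Heart",
    "Line Noise", "Channel Noise", "Other",
]

def label_components(proba):
    """Map an (n_components, 7) probability array to string labels."""
    proba = np.asarray(proba, dtype=float)
    return [ICLABEL_CLASSES[i] for i in proba.argmax(axis=1)]
```

Once we have a reference for the order, this list is the single place the examples and the docs would need to agree on.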
|
Does this count as documentation? It's probably where I found it the first time… I'll have a look at the |
Ah yes that counts :p |
@mscheltienne reposting here as an issue to begin discussion again.
Quick update: our intern should have started on the 1st of March. But he did not yet get his work permit (mandatory in Switzerland), thus his start has been delayed and we hope he will be on board next week.
I finally took the time to test it! It looks very promising, but it seems to heavily favor 'Other'.
Am I correct in assuming that the classes outputted are in the order:
I ran it on this file quickly:
Eye and heartbeat are correct. The last time we talked, you told me it failed at classifying even simple blinks, did you find what caused this?
Originally posted by @mscheltienne in #4 (comment)