-
Notifications
You must be signed in to change notification settings - Fork 3
Fix the function to extract last 32bits from number #24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: fix_cnt_better_auto_data_format
Are you sure you want to change the base?
Fix the function to extract last 32bits from number #24
Conversation
Codecov Report
@@ Coverage Diff @@
## fix_cnt_better_auto_data_format #24 +/- ##
===================================================================
+ Coverage 89.11% 89.12% +<.01%
===================================================================
Files 416 416
Lines 74865 74868 +3
Branches 12340 12340
===================================================================
+ Hits 66718 66728 +10
+ Misses 5307 5302 -5
+ Partials 2840 2838 -2 |
1 similar comment
Codecov Report
@@ Coverage Diff @@
## fix_cnt_better_auto_data_format #24 +/- ##
===================================================================
+ Coverage 89.11% 89.12% +<.01%
===================================================================
Files 416 416
Lines 74865 74868 +3
Branches 12340 12340
===================================================================
+ Hits 66718 66728 +10
+ Misses 5307 5302 -5
+ Partials 2840 2838 -2 |
massich
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I don't fully get is the logic we are following to detect the new position.
I mean if the distance is 0 it means that event_table_pos was right in the first place. How does it handle the overflow?
If I understand properly we start searching for this event_table_pos at whatever is in the SETUP and we keep incrementing by unit32. What if its 16?
What if we try to replicate the process that generate the value of the SETUP? If I understand properly the value depends on n_bytes, sfreq, n_samples and n_channels. We have them all but n_bytes.
We can carry the computation twice one with int32 and once with np.int64 or c_longlong then we detect the overflow and compare the result of the overflow with whatever is in the event_table_pos. If they match it means that we found it.
Just thinking out loud
|
|
||
|
|
||
| def _compute_robust_event_table_position(fid): | ||
| def _compute_robust_event_table_position(fid, data_format): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need the data_format? we could always test agains [2, 4] and who ever matches we return it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand properly we start searching for this event_table_pos at whatever is in the SETUP and we keep incrementing by unit32. What if its 16?
Yes, but why 16? For example, if the value of readed event table pos is 3 , and the file size is 30 bytes, the maximum of uint32 is 10. The real event table pos must in [3, 13, 23], right? And now, n_bytes is unknown, if cal_event_table=5, when n_bytes=2 and cal_event_table=13, when n_bytes=4. Then we can say the overflow occurs, real event table pos is 13, n_bytes=4.
What if we try to replicate the process that generate the value of the SETUP? If I understand properly the value depends on n_bytes, sfreq, n_samples and n_channels. We have them all but n_bytes.
Yes, tha's the reason, we will try both n_bytes =2 and n_bytes = 4.
We can carry the computation twice one with int32 and once with np.int64 or c_longlong then we detect the overflow and compare the result of the overflow with whatever is in the event_table_pos. If they match it means that we found it.
Yes, and Maybe int16 and int32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need the
data_format? we could always test agains[2, 4]and who ever matches we return it.
The inference is based on the fact n_samples is reliable. But sometimes it not true (the 914flankers.cnt case), which makes the inference not so reliable. At that time, use the data format from user, could help use find out the correct event table pos correctly.
That was the problem all along, that the solution was breaking the current test suite 'cos the computation was not identifying those cases. i.e: https://github.com/mne-tools/mne-python/pull/6550/checks?check_run_id=165686581 |
|
Then we need to find a way around it. But we need to make both scenarios work. Any ideas on what to do if we can't trust |
|
I guess the question is: If so, it means that overflow or not we would match. If we don't match, we have to assume that either |
|
WDYT? |
Definetly, I think. At first, there is little possible that the calc_event_table_pos could be wrong if n_samples is correct, n_channels has little possible to be wrong and overflow. And we also don't need to consider the situation n_samples overflows ( this takes a long experiment time, approximatly hundreds of hours).
Yes, In my code, the n_samples will be considered as wrong instead of the event_table_pos. And the n_samples will returned directly without correction. Because I am not sure whether the data in the gap between n_samples and TEEG has any meaning, and should considered as sample point. |
|
did we finish all this? now I'm confused. can you refresh me @AimerLee |
In previous version, the way to obtain the last 32 bits of the event_table_pos is in fact wrong, which is fix in this PR.
As for the file mentioned in the Issue #6550, named 914flankers.cnt, is in fact a special file which has no overflow problem, but the calculated event table pos and the readed version from SETUP section just unmatch. That means reading this file will still throw an exception. One possible reason is there is the number of samples is not correct, and I will give another PR to fix this problem (one simple solution is not to throw an exception, but return the readed value)
Before that, I have test all the cnt file in my hand, if possible, could try the new version on your dataset agian?