Fix the function to extract last 32bits from number #24

AimerLee · 2019-07-10T16:24:13Z

In previous version, the way to obtain the last 32 bits of the event_table_pos is in fact wrong, which is fix in this PR.

As for the file mentioned in the Issue #6550, named 914flankers.cnt, is in fact a special file which has no overflow problem, but the calculated event table pos and the readed version from SETUP section just unmatch. That means reading this file will still throw an exception. One possible reason is there is the number of samples is not correct, and I will give another PR to fix this problem (one simple solution is not to throw an exception, but return the readed value)

Before that, I have test all the cnt file in my hand, if possible, could try the new version on your dataset agian?

codecov · 2019-07-11T02:50:59Z

Codecov Report

Merging #24 into fix_cnt_better_auto_data_format will increase coverage by <.01%.
The diff coverage is 83.33%.

@@                         Coverage Diff                         @@
##           fix_cnt_better_auto_data_format      #24      +/-   ##
===================================================================
+ Coverage                            89.11%   89.12%   +<.01%     
===================================================================
  Files                                  416      416              
  Lines                                74865    74868       +3     
  Branches                             12340    12340              
===================================================================
+ Hits                                 66718    66728      +10     
+ Misses                                5307     5302       -5     
+ Partials                              2840     2838       -2

codecov · 2019-07-11T02:50:59Z

Codecov Report

Merging #24 into fix_cnt_better_auto_data_format will increase coverage by <.01%.
The diff coverage is 83.33%.

@@                         Coverage Diff                         @@
##           fix_cnt_better_auto_data_format      #24      +/-   ##
===================================================================
+ Coverage                            89.11%   89.12%   +<.01%     
===================================================================
  Files                                  416      416              
  Lines                                74865    74868       +3     
  Branches                             12340    12340              
===================================================================
+ Hits                                 66718    66728      +10     
+ Misses                                5307     5302       -5     
+ Partials                              2840     2838       -2

massich

What I don't fully get is the logic we are following to detect the new position.

I mean if the distance is 0 it means that event_table_pos was right in the first place. How does it handle the overflow?

If I understand properly we start searching for this event_table_pos at whatever is in the SETUP and we keep incrementing by unit32. What if its 16?

What if we try to replicate the process that generate the value of the SETUP? If I understand properly the value depends on n_bytes, sfreq, n_samples and n_channels. We have them all but n_bytes.

We can carry the computation twice one with int32 and once with np.int64 or c_longlong then we detect the overflow and compare the result of the overflow with whatever is in the event_table_pos. If they match it means that we found it.

Just thinking out loud

massich · 2019-07-11T08:45:47Z

mne/io/cnt/_utils.py



-def _compute_robust_event_table_position(fid):
+def _compute_robust_event_table_position(fid, data_format):


why do we need the data_format? we could always test agains [2, 4] and who ever matches we return it.

If I understand properly we start searching for this event_table_pos at whatever is in the SETUP and we keep incrementing by unit32. What if its 16?

Yes, but why 16? For example, if the value of readed event table pos is 3 , and the file size is 30 bytes, the maximum of uint32 is 10. The real event table pos must in [3, 13, 23], right? And now, n_bytes is unknown, if cal_event_table=5, when n_bytes=2 and cal_event_table=13, when n_bytes=4. Then we can say the overflow occurs, real event table pos is 13, n_bytes=4.

What if we try to replicate the process that generate the value of the SETUP? If I understand properly the value depends on n_bytes, sfreq, n_samples and n_channels. We have them all but n_bytes.
Yes, tha's the reason, we will try both n_bytes =2 and n_bytes = 4.

We can carry the computation twice one with int32 and once with np.int64 or c_longlong then we detect the overflow and compare the result of the overflow with whatever is in the event_table_pos. If they match it means that we found it.
Yes, and Maybe int16 and int32.

why do we need the data_format? we could always test agains [2, 4] and who ever matches we return it.

The inference is based on the fact n_samples is reliable. But sometimes it not true (the 914flankers.cnt case), which makes the inference not so reliable. At that time, use the data format from user, could help use find out the correct event table pos correctly.

massich · 2019-07-11T09:15:19Z

As for the file mentioned in the Issue mne-tools#6550, named 914flankers.cnt, is in fact a special file which has no overflow problem, but the calculated event table pos and the readed version from SETUP section just unmatch. That means reading this file will still throw an exception. One possible reason is there is the number of samples is not correct, and I will give another PR to fix this problem (one simple solution is not to throw an exception, but return the readed value)

That was the problem all along, that the solution was breaking the current test suite 'cos the computation was not identifying those cases. i.e: https://github.com/mne-tools/mne-python/pull/6550/checks?check_run_id=165686581

massich · 2019-07-11T12:31:40Z

Then we need to find a way around it. But we need to make both scenarios work.

Any ideas on what to do if we can't trust n_samples?

massich · 2019-07-11T12:36:37Z

I guess the question is:
If n_samples is right do we always find a match between even_table_pos and the computation?

If so, it means that overflow or not we would match. If we don't match, we have to assume that either even_table_pos or n_samples are wrong. In which chase we can assume that the wrong one is n_samples.

massich · 2019-07-11T12:36:56Z

WDYT?

AimerLee · 2019-07-11T12:57:36Z

I guess the question is:
If n_samples is right do we always find a match between even_table_pos and the computation?

Definetly, I think. At first, there is little possible that the calc_event_table_pos could be wrong if n_samples is correct, n_channels has little possible to be wrong and overflow. And we also don't need to consider the situation n_samples overflows ( this takes a long experiment time, approximatly hundreds of hours).

If so, it means that overflow or not we would match. If we don't match, we have to assume that either >even_table_pos or n_samples are wrong. In which chase we can assume that the wrong one is >n_samples.

Yes, In my code, the n_samples will be considered as wrong instead of the event_table_pos. And the n_samples will returned directly without correction. Because I am not sure whether the data in the gap between n_samples and TEEG has any meaning, and should considered as sample point.

                                  Meaningless data?
|                               |xxxxxxxxxxxxxxxxxx|                     |
                          n_samples            Event_table_pos

massich · 2019-08-05T14:42:14Z

did we finish all this? now I'm confused. can you refresh me @AimerLee

ArandaLee added 3 commits July 11, 2019 00:08

fix iss6550

757d096

delete one useless line

874fe92

a better way to solve the read problem

2e64a54

massich reviewed Jul 11, 2019

View reviewed changes



		def _compute_robust_event_table_position(fid):
		def _compute_robust_event_table_position(fid, data_format):

Fix the function to extract last 32bits from number #24

Are you sure you want to change the base?

Fix the function to extract last 32bits from number #24

Uh oh!

Conversation

AimerLee commented Jul 10, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Jul 11, 2019

Codecov Report

Uh oh!

codecov bot commented Jul 11, 2019

Codecov Report

Uh oh!

massich left a comment

Choose a reason for hiding this comment

Uh oh!

massich Jul 11, 2019

Choose a reason for hiding this comment

Uh oh!

AimerLee Jul 11, 2019

Choose a reason for hiding this comment

Uh oh!

AimerLee Jul 11, 2019

Choose a reason for hiding this comment

Uh oh!

massich commented Jul 11, 2019

Uh oh!

massich commented Jul 11, 2019

Uh oh!

massich commented Jul 11, 2019

Uh oh!

massich commented Jul 11, 2019

Uh oh!

AimerLee commented Jul 11, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

massich commented Aug 5, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

AimerLee commented Jul 10, 2019 •

edited

Loading

AimerLee commented Jul 11, 2019 •

edited

Loading