Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MRG FIX read raw curry #8400

Merged
merged 16 commits into from Oct 23, 2020
Merged

MRG FIX read raw curry #8400

merged 16 commits into from Oct 23, 2020

Conversation

todflak
Copy link
Contributor

@todflak todflak commented Oct 21, 2020

Reference issue

Fixes #8391
Fixes #8398

What does this implement/fix?

Fixes problem loading some Curry data files in which a Reference electrode is denoted in the list of channel labels, but that channel is not actually present in the data file. Revised code examines the info file (DAP file for Curry 7 format), looking in CHAN_IN_FILE section for channels that are not actually saved into the corresponding raw data file.

While I was at it, I added code to populate info['meas_date'] property with the acquisition start time, which is extracted from the info file.

Additional information

I have tested only with Curry 7 format datasets. I would be happy to test with Curry 8 format files if someone could provide them.

…ved Reference channel.

Also ensures that the channel order that may be explicitly declared in the DAP file is respected when loading the data.
fixes mne-tools#8391
I finally learned how to use flake8 linter in my IDE!
@todflak todflak changed the title Fix read raw curry WIP Fix read raw curry Oct 21, 2020
@agramfort
Copy link
Member

@todflak
Copy link
Contributor Author

todflak commented Oct 21, 2020 via email

…to deal with many possible problems.

(For example, in the test files there was "StartSec =  11280", which seems like a mistake in creation of the test file)

Also fixed a problem with the existing test_curry.py
@todflak
Copy link
Contributor Author

todflak commented Oct 22, 2020

Hi @agramfort , I am finally figuring out how to do the unit tests on my local machine instead of relying on the automated CI system. I found a problem in the testing script (test_curry.py) which was causing some failures.
But I see now that maybe I should include some additional tests in test_curry.py, and corresponding data files, to test my revisions. Do you agree that I should do that?

_mock_info_file(src=in_base_name + 'rs3', dst=out_base_name + 'rs3',
sfreq=sfreq, time_step=time_step)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was simply missing, and the mock RS3 file was empty. This revision simply copies the original RS3 to the testing data folder.

@larsoner
Copy link
Member

But I see now that maybe I should include some additional tests in test_curry.py, and corresponding data files, to test my revisions. Do you agree that I should do that?

If you can make some new tiny (1 sec?) test files and add them to mne-testing-data that would be great. Or alternatively if you know how you could modify some existing test file to have the information you need to test your new functionality, we can either modify an existing file in mne-testing-data or just modify it on the fly by copying to a temporary directory and inserting the line(s) where necessary, e.g.:

https://github.com/mne-tools/mne-python/blob/master/mne/io/edf/tests/test_edf.py#L136-L146

int(param_dict["startmin"]),
int(param_dict["startsec"]),
int(param_dict["startmillisec"]) * 1000,
datetime.now().astimezone().tzinfo)\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line implies that someone reading the file in Paris will think the recording started at a different absolute/UTC time than someone reading the file in New York. This seems like it can't be correct. Hopefully the file format either stores in UTC, or says what time zone it's stored in? If not, then I think we should just set it to UTC and know that it's wrong if the data were recorded in a different time zone. That way at least anyone who reads the file from any physical location will get the same wrong time zone, instead of all getting potentially different ones (and it only being correct in the time zone where the data were collected).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough.... I thought about that, and I guess it is the better approach. I will change it.
From what I can see, the times in the file (StartHour) must be local times. I have 16 recordings I'm looking at, and the time don't make sense if you interpret them as UTC.
That being said, I wish I had access to an official description of the Curry file formats. Everything I am doing is based on my best guesses about what I see in my example files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some files have a StartOffsetUTCMin, perhaps this could be used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I assume this is the UTC offset in minutes, so you would just add this to the startmin above and set the tzinfo to be UTC)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... actually in some of our test files this value is 9999 even though the other time values seem reasonable, so now I'm not so sure. But maybe this was hacked/changed? I would check with your own recorded files locally to see if this parameter is reasonable, and if it is and it gives you the right recording onset time, assume our test files are weird/broken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange. I don't have StartOffsetUTCMin in any of my example files. And 9999 certainly seems like a strange value.
I noticed some other strange values in some of the test files, such as StartSec = 11280 in file test_bdf_stim_channel Curry 7.dap (which was also blowing up my datetime construction until I added the try/except). So I question whether all the test files are exactly as they would be coming from the real Curry software.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DiGyt do you remember where the Curry files came from? They have a weird StartOffsetUTCMin=9999 value in them.

if chanidx > 0: # if chanidx was explicitly declared to be ' 0',
# it means the channel is not actually saved in the data file
# (e.g. the "Ref" channel), so don't add it to our list.
# Git issue #8391
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if chanidx > 0: # if chanidx was explicitly declared to be ' 0',
# it means the channel is not actually saved in the data file
# (e.g. the "Ref" channel), so don't add it to our list.
# Git issue #8391
if chanidx <= 0: # if chanidx was explicitly declared to be ' 0',
# it means the channel is not actually saved in the data file
# (e.g. the "Ref" channel), so don't add it to our list.
# Git issue #8391
continue

Then you can unindent everything below. It makes the diff and git blame nicer, and makes it so there is more PEP8-compatible space to use, and makes it immediately clear it's a short-circuit / ignore branch (don't need to scroll down to look for an else to know)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks

for i in range(0, ch_count):
for j in range(0, ch_count - i - 1):
if (all_chans[j]["ch_idx"] > all_chans[j + 1]["ch_idx"]):
temp = all_chans[j]
all_chans[j] = all_chans[j + 1]
all_chans[j + 1] = temp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for i in range(0, ch_count):
for j in range(0, ch_count - i - 1):
if (all_chans[j]["ch_idx"] > all_chans[j + 1]["ch_idx"]):
temp = all_chans[j]
all_chans[j] = all_chans[j + 1]
all_chans[j + 1] = temp
all_chans = sorted(all_chans, key=lambda ch: ch['ch_idx'])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent! I would have done a lambda function like that in .Net, but had no idea how to do it in Python! Thank you!

@todflak
Copy link
Contributor Author

todflak commented Oct 22, 2020

Thanks @larsoner , I understand what you are saying... there is already an example in the existing test_curry.py code that modifies test files on the fly in a temp folder. I'll see if I can manage that.
In any event, I realize now that I must include new tests in order to hit the code coverage check. As it is currently, some of my revised code is not executed at all in the existing tests. So, that will be my goal for the day.
(This is the first time I've ever contributed to an open source project, so this process is a bit unfamiliar to me. Thanks for helpin ng me through it.)

@larsoner
Copy link
Member

Thanks for helpin ng me through it.)

No problem, there is indeed a lot to learn, including (as you've seen) how picky our CIs are. Thanks for working through it!

@todflak
Copy link
Contributor Author

todflak commented Oct 22, 2020

I have only one failed check remaining, the "codecov". It would be nice to be able to get an immediate check instead of waiting for the whole CI process to give me another "fail" on that one. Is it possible for me to run the "codecov" test locally, before I push again? Or would that be too difficult for me to set up?

@larsoner
Copy link
Member

Codecov just combines all coverage reports, so you can do something similar locally along these lines:

$ pytest --cov=mne --cov-report=html mne/io/curry/
...
$ google-chrome htmlcov/mne_io_curry_curry_py.html

Screenshot from 2020-10-22 09-35-14

…a timezone marker, decided to create the info['meas_date'] as UTC instead of local time.

Incorporated code revisions suggested by @larsoner.
Added 5 tests for new features into test_curry.py
@todflak
Copy link
Contributor Author

todflak commented Oct 22, 2020

I created a few new files to add into mne-testing-data/curry, and I created a separate pull request on that repo for those files.
I'm not clear on how those can get added into the codebase for testing in this PR... does someone have to first allow that PR on the files and then they will be available for the CI here?

@agramfort
Copy link
Member

@todflak tell me when to have a look. Ideally when CIs are green

@todflak
Copy link
Contributor Author

todflak commented Oct 22, 2020

I believe I need some help now. I'm not sure why my new tests are working on my local system, but failing on the remote during CI. It seems likely that the failures are due to the new files I've added to the mne-testing-data/curry not being available. I say this because I have 5 new tests; of these, 3 tests depend on the new data files, while 2 tests are using data files that were already existing in the mne-testing-data/curry --- the tests that are using the previously existing data files are not reporting errors.
I'm confused, because @larsoner executed my pull request on mne-tools/mne-testing-data, and I can see my new files listed in the repository. So I think I am just not understanding the environment in which the CI code is executing, and why it's not seeing these files (if that is, indeed, the cause of the problem).

@larsoner
Copy link
Member

I think you missed the comment here to follow the rest of the steps (you're on step 5 now) here:

https://github.com/mne-tools/mne-testing-data#add-or-change-files-in-the-repo-for-use-with-mne

Basically you need to update a couple of values in mne/datasets/utils.py at this point

@todflak
Copy link
Contributor Author

todflak commented Oct 22, 2020 via email

@todflak
Copy link
Contributor Author

todflak commented Oct 23, 2020

Finally looking good, just still waiting for Travis CI build. After that, should be ready for review.
(Sorry that I did not do all this in "draft mode" ... I forgot that part of the instructions.)
Should I change the topic title to "MRG" ?

@larsoner
Copy link
Member

(Sorry that I did not do all this in "draft mode" ... I forgot that part of the instructions.)

No problem, I often forget and it's rarely used by us so far ...

Should I change the topic title to "MRG" ?

Yes once you're happy and think it's good to go (green CIs, tests, working, etc.) it's good to do this. I'll take a look shortly!

@todflak
Copy link
Contributor Author

todflak commented Oct 23, 2020 via email

@todflak todflak changed the title WIP Fix read raw curry MRG FIX read raw curry Oct 23, 2020
@todflak
Copy link
Contributor Author

todflak commented Oct 23, 2020

I'm content with this, OK to review and merge if approved.

Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than one minor issue, LGTM! Feel free to push a commit to deal with my complaint, and I'll open a follow-up PR to ensure info['chs'] has the correct set of keys.

mne/io/curry/curry.py Show resolved Hide resolved
@larsoner larsoner merged commit f4077c0 into mne-tools:master Oct 23, 2020
0 of 2 checks passed
@larsoner
Copy link
Member

Thanks @todflak !

@larsoner
Copy link
Member

Argh I realized I forgot to ask you to update doc/changes/latest.inc -- can you open a quick PR to add an entry to the BUG section?

@todflak
Copy link
Contributor Author

todflak commented Oct 23, 2020

Other than one minor issue, LGTM! Feel free to push a commit to deal with my complaint, and I'll open a follow-up PR to ensure info['chs'] has the correct set of keys.

I'm sorry to be dense, but I don't really understand what the issue is, or how to deal with it.... or, did you already deal with it?

@todflak
Copy link
Contributor Author

todflak commented Oct 23, 2020

Argh I realized I forgot to ask you to update doc/changes/latest.inc -- can you open a quick PR to add an entry to the BUG section?

Again, sorry for my not following the instructions. Can I do this in my current fork? Or do I need to update my fork from the master upstream? Or do I need to go through the process again of creating a fork, and making the modification in that? I hate to ask such beginner questions, but I'm new to the Git-world, and I've gotten myself into hours of hell trying to undo some minor screw up with other code control systems (SVN)

@larsoner
Copy link
Member

I'm sorry to be dense, but I don't really understand what the issue is, or how to deal with it.... or, did you already deal with it?

Yes I just commited it directly then merged

Can I do this in my current fork? Or do I need to update my fork from the master upstream? Or do I need to go through the process again of creating a fork, and making the modification in that?

The general procedure is to keep master on your local system and master on your fork up to date with upstream/master. So something like this should work (assuming you have a clean branch / no pending changes):

git checkout master  # change to local master branch
git remote add upstream git://github.com/mne-tools/mne-python.git  # add remote ref
git fetch upstream  # fetch remote changes
git pull upstream/master  # make local copy up to date with remote changes
git push origin master  # make remote fork up to date with remote upstream
git checkout -b latest
... # hack away at latest.inc
git commit -am 'DOC: Update latest.inc [skip travis] [skip github]'
git push origin -u latest  # push "latest" branch from local to remote fork, setting it to track so "git push" just works next time

@todflak todflak deleted the fix_read_raw_curry branch October 23, 2020 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

read_raw_curry() does not populate info['meas_date'] read_raw_curry() with preload=True produces error
3 participants