[MRG, API] Start of automatic methods section with create_methods_paragraph #457

adam2392 · 2020-06-22T18:45:22Z

PR Description

Addresses preliminarily: #347

A summary of the MEEG (MEG, EEG, iEEG) report:
I don't want this to be too much of a headache to start with, so I figured the easiest, most robust summary we can provide is modularized as follows:

dataset description: subject, session, kinds, and dataset_description.json file
participants.tsv file summary per subject: age, sex, and hand, as supported in mne-bids. Note this file is only RECOMMENDED. I put a lot of effort into getting this to work without making the report look ugly, because I figured this is one of the most crucial summaries every study should have; but since its structure is not strictly imposed by BIDS, it's hard to summarize consistently.
modality-agnostic-summary: per session scans, and their length, sfreq, channel counts, etc.
modality-specific-summary: adding iEEG channel counts (e.g. SEEG, ECoG, etc.). Similarly, I suppose if someone wants to add MEG/EEG, it can be a relatively short summary here. (tabled to future)
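
As a rough illustration of the participants.tsv step in the list above, here is a minimal sketch using only the standard library. The helper name `summarize_participants` and the returned keys are assumptions made for illustration, not the code in this PR. The sketch treats every column as optional and relies on the BIDS convention that missing values are written as `n/a`.

```python
import csv
import io
import statistics
from collections import Counter


def summarize_participants(tsv_text):
    """Summarize age, sex and hand columns of a participants.tsv string.

    Hypothetical helper: columns are optional, and "n/a" marks missing values.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    ages = []
    for row in rows:
        value = row.get("age", "n/a")
        try:
            ages.append(float(value))
        except (TypeError, ValueError):
            pass  # skip "n/a" and anything that is not a number
    return {
        "n_participants": len(rows),
        "sex": dict(Counter(row.get("sex", "n/a") for row in rows)),
        "hand": dict(Counter(row.get("hand", "n/a") for row in rows)),
        "n_unknown_age": len(rows) - len(ages),
        "mean_age": statistics.mean(ages) if ages else None,
    }


# Made-up example content, not taken from any real dataset.
example = "participant_id\tage\tsex\thand\nsub-01\t25\tM\tR\nsub-02\tn/a\tF\tR\n"
summary = summarize_participants(example)
print(summary)
```

Because the file is only RECOMMENDED, a real implementation would also need to handle the case where it is absent altogether.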

TODO:

Update convert_group_studies to use create_methods_paragraph.
Have maintainers first review the groundwork to make sure this is in the right direction.
Add documentation to the docstrings
Add a summary function for channels.tsv and sidecar.json files ~~kind=ieeg data (idk how to add for 'eeg', or 'meg', so would prefer someone else add that functionality in)~~
Add REQUIRED elements from dataset_description.json and the reference/DOI for mne-bids.
Add example outputs from OpenNeuro datasets (i.e. iEEG)
How to deal w/ emptyroom subjects? I don't work w/ these, so skipping them for now. Assuming this is okay, I added a XXX to the inline comments for someone else to fix.
How to add MEG/iEEG/EEG specific data, such as "Gradiometers" and "Magnetometers" for MEG data?
Adding a scan-level summary even without the *_scans.tsv files, which are considered "RECOMMENDED" and not "REQUIRED".

Example Output from Local/OpenNeuro datasets
These are the datasets I ran the method generation w/:

a local dataset of mine that will be put onto OpenNeuro
ds001779
ds002778
ds002904
ds000246 (https://github.com/OpenNeuroDatasets/ds000246)
ds000248 (https://github.com/OpenNeuroDatasets/ds000248)
ds000117 (https://github.com/OpenNeuroDatasets/ds000117)
ds001810 (https://github.com/OpenNeuroDatasets/ds001810)
ds001971 (https://github.com/OpenNeuroDatasets/ds001971)
somato

Ran datasets w/ the following code:

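from mne_bids.report import create_methods_paragraph  # assumed path: mne_bids/report.py

methods_paragraph = create_methods_paragraph(bids_root)  # bids_root: root of the BIDS dataset
print(methods_paragraph)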

See output on PR here: ./report.txt.

Merge checklist

Maintainer, please confirm the following before merging:

All comments resolved
This is not your own PR
All CIs are happy
PR title starts with [MRG]
whats_new.rst is updated
PR description includes phrase "closes <#issue-number>"
Commit history does not contain any merge commits

jasmainak · 2020-06-22T19:15:58Z

nice, can you share an example paragraph generated by this report?

jasmainak · 2020-06-22T19:17:22Z

maybe update the ds000117 example?

mne_bids/tests/test_report.py

adam2392 · 2020-06-23T14:57:24Z

> maybe update the ds000117 example?

Done! lmk what you think.

> nice, can you share an example paragraph generated by this report?

It is copied into the PR description.

codecov-commenter · 2020-06-23T15:00:07Z

Codecov Report

Merging #457 into master will decrease coverage by 1.03%.
The diff coverage is 83.06%.

@@            Coverage Diff             @@
##           master     #457      +/-   ##
==========================================
- Coverage   93.70%   92.66%   -1.04%     
==========================================
  Files          11       13       +2     
  Lines        1762     1950     +188     
==========================================
+ Hits         1651     1807     +156     
- Misses        111      143      +32

Impacted Files	Coverage Δ
mne_bids/commands/mne_bids_report.py	`0.00% <0.00%> (ø)`
mne_bids/write.py	`96.73% <ø> (ø)`
mne_bids/report.py	`91.17% <91.17%> (ø)`
mne_bids/config.py	`95.55% <100.00%> (+0.10%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

jasmainak · 2020-06-23T15:13:40Z

> There are 2 datasets (364.69 +/- 116.31 seconds) with sampling rates 1000.0 (n=1), 999.0 (n=1).

I don't get this part.

Can you make the text almost like a copy-paste for publication? I would also add information about:

1. Manufacturer of device (e.g., Vectorview)
2. Number of channels
3. Filtering information
4. Sampling frequency

what's the output of pybids for the same dataset? hopefully not the same?

jasmainak · 2020-06-23T15:16:41Z

This is the output for ds000117:

> The dataset consists of 1 patients with 1 sessions (01) consisting of 1 kinds of data (meg). The dataset consists of 1 subjects (10.0 +/- 0.0; 1 right ; 1 male ). There are 6 datasets (447.33 +/- 106.20 seconds) with sampling rates 1100.0 (n=6)

This seems off. There aren't "6 datasets". We should test on a few datasets to ensure it gives something reasonable. I would pick 10 ephys datasets at random from openneuro and try on them.

adam2392 · 2020-06-23T15:50:53Z

> There are 2 datasets (364.69 +/- 116.31 seconds) with sampling rates 1000.0 (n=1), 999.0 (n=1).
>
> Can you make the text almost like a copy-paste for publication?

What would be the copy-paste version of this part? Do you have a specific structure in mind?

> 1. Manufacturer of device (e.g., Vectorview)
> 2. Number of channels
> 3. Filtering information
> 4. Sampling frequency

By filtering info, I suppose you mean the SoftwareFilters in sidecar.json? Items 1, 2, and 4 (manufacturer, number of channels, sampling frequency) can be added.
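
For context, a minimal sketch of how these fields could be pulled from a sidecar JSON file. The field names (`Manufacturer`, `SamplingFrequency`, `PowerLineFrequency`, `SoftwareFilters`) come from the BIDS specification; the helper name `describe_sidecar`, the example values, and the output wording are assumptions for illustration only.

```python
import json

# Example sidecar content; the field names follow the BIDS specification,
# the values are made up for illustration.
sidecar_text = json.dumps({
    "Manufacturer": "Elekta",
    "SamplingFrequency": 1100,
    "PowerLineFrequency": 50,
    "SoftwareFilters": {"SpatialCompensation": {"GradientOrder": 0}},
})


def describe_sidecar(text):
    """Return a short recording description (hypothetical helper and wording)."""
    sidecar = json.loads(text)
    filters = sidecar.get("SoftwareFilters", "n/a")
    # BIDS allows the string "n/a" when no software filters were applied.
    filter_names = sorted(filters) if isinstance(filters, dict) else []
    parts = [
        f"{sidecar.get('Manufacturer', 'unknown')} system",
        f"sampled at {sidecar.get('SamplingFrequency', 'n/a')} Hz",
    ]
    if filter_names:
        parts.append("software filters: " + ", ".join(filter_names))
    return ", ".join(parts)


description = describe_sidecar(sidecar_text)
print(description)
```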

> what's the output of pybids for the same dataset? hopefully not the same?

pybids doesn't give any output at all. See the output pasted in the corresponding issue of this PR. I think it's due to the fact that the pybids-report still only supports nifti files.

examples/convert_group_studies.py

adam2392 · 2020-06-23T17:46:57Z

> This is the output for ds000117:
>
> This seems off. There aren't "6 datasets". We should test on a few datasets to ensure it gives something reasonable. I would pick 10 ephys datasets at random from openneuro and try on them.

https://mne.tools/mne-bids/stable/auto_examples/convert_group_studies.html#sphx-glr-auto-examples-convert-group-studies-py

There are 6 .fif files tho, so aren't there 6 datasets?

agramfort · 2020-06-23T19:36:52Z

@adam2392 please paste here what you obtain on various datasets to get a feeling of how it reads. Thx

jasmainak · 2020-06-23T19:40:57Z

> There are 6 .fif files tho, so aren't there 6 datasets?

I think they are 6 runs, not datasets. To me, a dataset is everything inside bids_root
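
To make the wording question concrete, here is a sketch of how per-run durations and sampling rates could be turned into a sentence that talks about runs rather than datasets. The function name `describe_runs` and the sentence wording are illustrative assumptions, not the template used in this PR, and the input numbers are made up.

```python
import statistics
from collections import Counter


def describe_runs(durations, sfreqs):
    """Build a sentence about runs (hypothetical wording, not the PR's template)."""
    mean = statistics.mean(durations)
    # Sample standard deviation needs at least two values.
    std = statistics.stdev(durations) if len(durations) > 1 else 0.0
    counts = Counter(sfreqs)
    rates = ", ".join(f"{sfreq:g} Hz (n={n})" for sfreq, n in sorted(counts.items()))
    return (
        f"There are {len(durations)} runs "
        f"(duration {mean:.2f} +/- {std:.2f} seconds) "
        f"with sampling rates {rates}."
    )


sentence = describe_runs([300.0, 420.0], [1000.0, 999.0])
print(sentence)
```

Formatting with `:.2f` also keeps the reported statistics at two decimal places.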

adam2392 · 2020-06-23T21:31:18Z

A few thoughts:

participants.tsv summary is usually desired (from all pubs I've seen and written), but consistency is difficult to achieve since it is a RECOMMENDED file + format, so we should optionally be able to generate the report w/o it.
the BIDS spec version is currently 1.4.0, so that can be updated in the code
MEG_TEMPLATE and EEG_TEMPLATE can be pretty easily added at the end of create_methods_paragraph(). If someone wants to add that in, the naive thing to do would be just grab different channels from the sidecar.json file I suppose?
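
One possible starting point for modality-specific counts is sketched below. It reads channels.tsv rather than sidecar.json, on the assumption that the per-channel `type` column (e.g. `MEGMAG`, `MEGGRADPLANAR`) is what distinguishes magnetometers from gradiometers. The helper name `count_channels` and the choice to treat a missing `status` as good are assumptions for illustration.

```python
import csv
import io
from collections import Counter

# Made-up channels.tsv content using column names from the BIDS specification.
channels_tsv = (
    "name\ttype\tunits\tstatus\n"
    "MEG0111\tMEGMAG\tT\tgood\n"
    "MEG0112\tMEGGRADPLANAR\tT/m\tgood\n"
    "MEG0113\tMEGGRADPLANAR\tT/m\tbad\n"
)


def count_channels(tsv_text):
    """Count channels per type and per status (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    by_type = Counter(row["type"] for row in rows)
    # "status" is an optional column; a missing value is treated as good here.
    by_status = Counter(row.get("status") or "good" for row in rows)
    return by_type, by_status


by_type, by_status = count_channels(channels_tsv)
print(dict(by_type), dict(by_status))
```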

jasmainak · 2020-06-23T21:40:08Z

@adam2392 can you put this comment in the PR description?

adam2392 · 2020-06-25T14:33:56Z

> @adam2392 can you put this comment in the PR description?

@jasmainak Besides the current breaking of the bids-validator, I was wondering if you could lmk how the direction currently feels.

Some notes:

I can fix rounding very easily to two decimal places.
We could add a dynamic check on participants.tsv file since participants is always a desired summary. But currently, we have no way of determining if the file complies with how we assume the formatting to be (e.g. M vs male vs man etc.). Idk how desirable this is, versus just "switching off the summary of participants because we lack control".
I added the total # of scans (e.g. total # runs summed across entire bids_root), but didn't update the summaries yet.
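
The "M vs male vs man" problem mentioned above could be handled with a small lookup table instead of extra branching. This is only a sketch: the alias table and the function name `normalize_sex` are hypothetical, and anything not in the table is reported as unknown rather than guessed.

```python
# Hypothetical mapping; the accepted spellings are an assumption.
SEX_ALIASES = {
    "m": "male", "male": "male", "man": "male",
    "f": "female", "female": "female", "woman": "female",
}


def normalize_sex(value):
    """Map a free-form sex entry to 'male', 'female' or 'unknown'."""
    if value is None:
        return "unknown"
    return SEX_ALIASES.get(str(value).strip().lower(), "unknown")


normalized = [normalize_sex(v) for v in ["M", "male", " Woman ", "n/a", None]]
print(normalized)
```

Falling back to "unknown" keeps the summary usable even when the file does not follow the assumed formatting.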

jasmainak · 2020-06-26T02:16:33Z

> "switching off the summary of participants because we lack control".

not sure I understand this point. But I would say, don't add too much code complexity and branching. It will make life harder for future developers.

can you add a couple more examples in the description? At least 6 or 7 in total? Just to get a sense of how the methods paragraph might be useful/handy for researchers. I'll ask a couple of my colleagues to provide feedback on what else might be useful to include.

mne_bids/report.py

mne_bids/tests/test_report.py

jasmainak · 2020-06-26T02:50:29Z

> Dataset was created with BIDS version 1.2.2
> using MNE-BIDS

We don't really know if it was MNE-BIDS? If so, we should leave it out

adam2392 · 2020-06-26T15:14:22Z

> > Dataset was created with BIDS version 1.2.2
> > using MNE-BIDS
>
> We don't really know if it was MNE-BIDS? If so, we should leave it out

Is this still true in the context of #460 ?

mne_bids/config.py

mne_bids/report.py

jasmainak · 2020-06-26T20:05:07Z

@adam2392 take a look at the datasets tested in MNE-study-template. Would you mind posting the description for these datasets as well? If we have a substantial number (around 10), we can start to see if this looks good. And then in the study template, you could add a line to add the generated paragraph to the MNE report using add_htmls_to_section so there is an additional layer of testing for MNE-BIDS.

adam2392 · 2020-06-27T18:15:51Z

> @adam2392 take a look at the datasets tested in MNE-study-template. Would you mind posting the description for these datasets as well? If we have a substantial number (around 10), we can start to see if this looks good. And then in the study template, you could add a line to add the generated paragraph to the MNE report using add_htmls_to_section so there is an additional layer of testing for MNE-BIDS.

Okay, so I took those datasets, ran them locally, and pasted the output into the PR description. Also, FYI some of the descriptions aren't up to date w/ some of the minor changes, but I don't want to re-download the datasets locally and redo them (the changes are mainly spelling or grammar fixes, or logic in the template string itself, which have been updated). If this is absolutely needed, I can re-download, but it takes a bit of time on my current home internet :p.

I don't know what you mean by adding it into the study template. Do you mean in the circleci of mne-study-template? If so, which file? Logistically, do I simply add mne-bids to the install, generate the report using this PR (once it's merged), and then call add_htmls_to_section with the mne-bids generated report?

A problem I had w/ a large dataset
_summarize_sidecars and _summarize_channels make summary generation very slow for larger datasets (e.g. I have 102 patients). What's the opinion on parallelization?
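
If reading many small files is the bottleneck, one low-complexity option is a thread pool from the standard library. The sketch below is an assumption about how that could look, not a change made in this PR: `read_sfreq` and `read_all_sfreqs` are hypothetical helpers, and the demo writes its own temporary sidecar files so it is self-contained.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_sfreq(path):
    """Read the sampling frequency from one JSON sidecar."""
    with open(path, encoding="utf-8") as fid:
        return json.load(fid).get("SamplingFrequency")


def read_all_sfreqs(paths, max_workers=4):
    """Read many sidecars concurrently; results keep the order of `paths`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_sfreq, paths))


# Small self-contained demo with three temporary sidecar files.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for idx, sfreq in enumerate([1000, 1000, 512]):
        path = Path(tmp) / f"sub-{idx:02d}_task-rest_ieeg.json"
        path.write_text(json.dumps({"SamplingFrequency": sfreq}), encoding="utf-8")
        paths.append(path)
    sfreqs = read_all_sfreqs(paths)

print(sfreqs)
```

Threads mainly help when the work is I/O-bound, so it would be worth profiling the summary functions before adding this.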

agramfort · 2020-06-27T19:49:07Z

@adam2392 please paste here examples of results so we don't all need to test ourselves to give you more feedback. Any dataset is fine but the more the better. we just need to read some examples.

adam2392 · 2020-07-20T23:47:14Z

> almost good to go from my end. Can you make the PR title to MRG when all green? thx @adam2392

Not sure why the docs are failing... Do you know if it's something I can fix?

mne_bids/report.py

mne_bids/tests/test_report.py

Co-authored-by: Stefan Appelhoff <stefan.appelhoff@mailbox.org>

agramfort · 2020-07-21T13:54:24Z

I like what Richard suggests

adam2392 · 2020-07-21T14:06:21Z

One issue I'm having is... is there a way one can "add" Template objects together? It would make handling participants string easier.
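
`string.Template` does not define `+`, but each instance keeps its source string in the `.template` attribute, so one workaround is to concatenate the sources and build a new `Template`. The sketch below assumes the templates are `string.Template` objects; the helper name `concat_templates` and the placeholder names are made up for illustration.

```python
from string import Template


def concat_templates(*templates, sep=" "):
    """Join several Template objects into one by concatenating their sources."""
    return Template(sep.join(t.template for t in templates))


dataset_part = Template("The dataset consists of $n_participants participants")
participants_part = Template("($participants_summary).")

combined = concat_templates(dataset_part, participants_part)
text = combined.substitute(n_participants=16, participants_summary="9 men and 7 women")
print(text)
```

When the participants summary is unavailable, the second template can simply be left out of the call.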

…to automethods

jasmainak · 2020-07-21T18:52:15Z

please go ahead and incorporate @hoechenberger and @sappelhoff 's suggestions :-) I am lagging behind ...

adam2392 · 2020-07-22T17:49:58Z

Okay I added in the feedback and I think things are all good to go.

agramfort · 2020-07-22T20:39:19Z

here is what I get

(base) alex@:mne-bids(automethods)$ mne_bids report --bids_root ~/mne_data/ds000117/
Summarizing participants.tsv /Users/alex/mne_data/ds000117/participants.tsv...
Summarizing scans.tsv files [PosixPath('/Users/alex/mne_data/ds000117/sub-13/ses-meg/sub-13_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-14/ses-meg/sub-14_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-emptyroom/ses-20090506/sub-emptyroom_ses-20090506_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-emptyroom/ses-20090601/sub-emptyroom_ses-20090601_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-emptyroom/ses-20090515/sub-emptyroom_ses-20090515_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-emptyroom/ses-20091208/sub-emptyroom_ses-20091208_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-emptyroom/ses-20090409/sub-emptyroom_ses-20090409_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-emptyroom/ses-20091126/sub-emptyroom_ses-20091126_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-emptyroom/ses-20090518/sub-emptyroom_ses-20090518_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-emptyroom/ses-20090511/sub-emptyroom_ses-20090511_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-15/ses-meg/sub-15_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-12/ses-meg/sub-12_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-08/ses-meg/sub-08_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-01/ses-meg/sub-01_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-06/ses-meg/sub-06_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-07/ses-meg/sub-07_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-09/ses-meg/sub-09_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-10/ses-meg/sub-10_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-11/ses-meg/sub-11_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-16/ses-meg/sub-16_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-05/ses-meg/sub-05_ses-meg_scans.tsv'), 
PosixPath('/Users/alex/mne_data/ds000117/sub-02/ses-meg/sub-02_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-03/ses-meg/sub-03_ses-meg_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000117/sub-04/ses-meg/sub-04_ses-meg_scans.tsv')]...
The participant template found: comprised of 9 men and 7 women;
handedness were all unknown; ages ranged from 23.0 to 31.0 (mean = 26.38, std = 2.76; 1 with unknown age)
------------------------------------ REPORT ------------------------------------
The Multisubject, multimodal face processing dataset was created with BIDS
version 1.0.2 by Wakeman, DG, and Henson, RN. This report was generated with
MNE-BIDS (https://doi.org/10.21105/joss.01896). The dataset consists of 16
participants (comprised of 9 men and 7 women; handedness were all unknown; ages
ranged from 23.0 to 31.0 (mean = 26.38, std = 2.76; 1 with unknown age))and 2
recording sessions: meg, and mri. Data was recorded using a MEG system
(Elekta/Neuromag manufacturer) sampled at 1100 Hz with line noise at 50 Hz using
SpatialCompensation. There were 104 scans in total. For each dataset, there were
on average 404.0 (std = 0.0) recording channels per scan, out of which 404.0
(std = 0.0) were used in analysis (0.0 +/- 0.0 were removed from analysis).
(base) alex@:mne-bids(automethods)$ mne_bids report --bids_root ~/mne_data/ds000248
Summarizing scans.tsv files [PosixPath('/Users/alex/mne_data/ds000248/sub-01/sub-01_scans.tsv')]...
------------------------------------ REPORT ------------------------------------
The ds000248 dataset was created with BIDS version 1.2 by Alexandre Gramfort,
and Matti S Hämäläinen. This report was generated with MNE-BIDS
(https://doi.org/10.21105/joss.01896). The dataset consists of 1 participants
(). Data was recorded using a MEG system (Elekta manufacturer) sampled at 600.61
Hz with line noise at 60 Hz. There was 1 scan in total. For each dataset, there
were on average 376.0 (std = 0.0) recording channels per scan, out of which
374.0 (std = 0.0) were used in analysis (2.0 +/- 0.0 were removed from
analysis).
(base) alex@:mne-bids(automethods)$ mne_bids report --bids_root ~/mne_data/ds000246
Summarizing participants.tsv /Users/alex/mne_data/ds000246/participants.tsv...
Summarizing scans.tsv files [PosixPath('/Users/alex/mne_data/ds000246/sub-emptyroom/sub-emptyroom_scans.tsv'), PosixPath('/Users/alex/mne_data/ds000246/sub-0001/sub-0001_scans.tsv')]...
The participant template found: sex were all unknown;
handedness were all unknown; ages ranged from 25.0 to 25.0 (mean = 25.0, std = 0.0; 1 with unknown age)
------------------------------------ REPORT ------------------------------------
The MEG-BIDS Brainstorm data sample dataset was created with BIDS version 1.0.2
by Elizabeth Bock, Peter Donhauser, Francois Tadel, Guiomar Niso, and Sylvain
Baillet. This report was generated with MNE-BIDS
(https://doi.org/10.21105/joss.01896). The dataset consists of 1 participants
(sex were all unknown; handedness were all unknown; ages ranged from 25.0 to
25.0 (mean = 25.0, std = 0.0; 1 with unknown age)). Data was recorded using a
MEG system (CTF manufacturer) sampled at 2400 Hz with line noise at 60 Hz using
SpatialCompensation with parameters 3rd GradientOrder. There were 3 scans in
total. Recording durations ranged from 360 to 360 seconds (mean = 360.0, std =
0.0), for a total of 720 seconds of data recorded over all scans. For each
dataset, there were on average 340.0 (std = 0.0) recording channels per scan,
out of which 340.0 (std = 0.0) were used in analysis (0.0 +/- 0.0 were removed
from analysis).

jasmainak · 2020-07-22T20:43:14Z

Looks pretty fair!

jasmainak

I'm +1 on merging this!

jasmainak · 2020-07-22T20:44:54Z

We can keep tweaking more in future PRs but this is good for a start.

agramfort · 2020-07-22T20:54:24Z

ok for you @hoechenberger ?

adam2392 · 2020-07-30T14:35:40Z

Any remaining changes needed here? Don't want this to go stale and we forget what happened :p.

sappelhoff · 2020-07-30T16:00:21Z

I can't make enough time to review this right now, sorry :|

agramfort · 2020-07-30T16:05:25Z

maybe we merge this and improve later?

sappelhoff · 2020-07-30T16:07:25Z

> maybe we merge this and improve later?

+1, I hate stale PRs and love iterations :-)

jasmainak · 2020-07-30T20:10:14Z

Merged, thanks a ton @adam2392 !!! This is great :-)

hoechenberger · 2020-07-30T21:33:08Z

Argh I was just working on a review… but yes, let's iterate, then! :)

jasmainak · 2020-07-30T22:58:50Z

oops, sorry :)


adam2392 changed the title ~~Automethods~~ [WIP, API] Start of automatic methods section with create_methods_paragraph Jun 22, 2020

sappelhoff reviewed Jun 23, 2020

mne_bids/tests/test_report.py (outdated)

sappelhoff reviewed Jun 23, 2020

examples/convert_group_studies.py (outdated)

adam2392 mentioned this pull request Jun 26, 2020

Recent addition to bids-validator breaks our tests #459

Closed