# Making sure your data is BIDS-compliant on Flywheel

Many pre-processing pipelines require BIDS-compliant input. Correct metadata ensures that the pre-processing handles your data correctly, and incorrect BIDS naming can cause errors that are difficult to debug.

For these reasons, we cover in depth how to make sure your data is BIDS-compliant.

First, let's look at an example subject. 

<img src="images/initial_bids_error.png" alt="drawing" width="600"/>

These red flags mean that there were errors converting the image names to BIDS-compliant names. Here you have two options:

  1. Click the "Info" button on the scan and make sure the BIDS section is completed correctly
  2. Use the Flywheel API to change the BIDS properties


## The Flywheel data model

Flywheel stores zip archives of the DICOM data and NIfTI files. The names of the NIfTI files are usually determined by `dcm2niix`, and that name sticks with the file for the rest of its life. I believe these files are managed using MongoDB's GridFS framework, so there are some non-intuitive things that can happen.

  - More than one file can exist with the exact same name. The files are indexed by a BSON id, which is
    unique for each file. The field containing the file name is not necessarily unique.

  - Unless the data was uploaded in BIDS format, the actual BIDS-named data does not exist in Flywheel.
    Instead, when a job requires BIDS input, the job uses the Flywheel API to download and rename
    the data to be BIDS-compliant.

    - The renaming and JSON sidecar creation use information stored in Flywheel's MongoDB. Therefore, to
      change the BIDS filename, you simply update the file's document (BSON); the next time a gear is run,
      this new metadata will be used to create the BIDS directory.
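
The two points above can be illustrated with a small offline sketch. It does not talk to Flywheel: `FileRecord` and `bids_export_path` are hypothetical names invented here, and the `Folder` and `Filename` keys are modeled on the `info['BIDS']` dictionaries that show up in the SDK printouts later in this notebook. The real renaming logic lives in Flywheel's BIDS tooling and may well differ.

```python
from dataclasses import dataclass, field
from typing import Optional
import posixpath


@dataclass
class FileRecord:
    """Hypothetical stand-in for one stored file and its metadata document."""
    file_id: str   # unique identifier, plays the role of the BSON id
    name: str      # name assigned at conversion time; NOT guaranteed unique
    info: dict = field(default_factory=dict)


def bids_export_path(record: FileRecord) -> Optional[str]:
    """Build a BIDS path from metadata, or return None if no usable BIDS info exists."""
    bids = record.info.get("BIDS")
    # Uncurated files may hold a placeholder string such as 'NA' instead of a dict.
    if not isinstance(bids, dict) or not bids.get("Filename"):
        return None
    return posixpath.join(bids.get("Folder", ""), bids["Filename"])


# Two records share a name but remain distinguishable by id (all values are made up).
a = FileRecord("id-001", "fmap.nii.gz",
               {"BIDS": {"Folder": "fmap", "Filename": "sub-01_phasediff.nii.gz"}})
b = FileRecord("id-002", "fmap.nii.gz", {"BIDS": "NA"})

# Editing the metadata changes the exported name; the stored name is untouched.
print(bids_export_path(a))
a.info["BIDS"]["Filename"] = "sub-01_magnitude1.nii.gz"
print(bids_export_path(a), a.name)
print(bids_export_path(b))
```

The point of the design is that the BIDS name is derived on demand from metadata, so fixing a name means editing the metadata and never touching the stored file.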

The metadata associated with each file can be edited through the Flywheel web interface or through the API. The web interface is incomplete and still a little buggy (e.g. not all UIs contain BIDS fields). The SDK is fairly complete and documented, so it will be the focus of this notebook.

        

## Editing BIDS info from the web interface

You can change the information on your scans by clicking the info icon in the file's row.

<img src="images/select_info.png" alt="drawing" width="600"/>


The fields can then be edited manually. This is the initial interface presented:

<img src="images/basic_file_info.png" alt="drawing" width="600"/>

where basic options about the contents of the file are editable. This will always be available when clicking the Info button. If the `BIDS Curation` gear has been run, you can scroll down and see the BIDS fields:

<img src="images/bids_info.png" alt="drawing" width="600"/>

It is unclear to me if the `BIDS Curation` gear needs to be re-run for these changes to be committed to MongoDB.

## Editing BIDS info from the API

The Flywheel SDK can be installed from PyPI using

```bash
pip install flywheel-sdk
```

The documentation for this package can be found [here](https://flywheel-io.github.io/core/branches/master/python/index.html). 

In [1]:
import flywheel

# Create client
fw = flywheel.Client()

Note that the documentation is out of date: the constructor for `flywheel.Client` does not take any arguments. Instead, you have to log in with your API key using the command line (`fw login`). Verify that your login worked and that you can retrieve data from Flywheel:

In [2]:
fw.get_current_user()['email']

'tinashemtapera@gmail.com'

Now we need to find the document that represents this scanning session. This can be done through a search:

In [3]:
# How do you find the session?
project = fw.projects.find_first('label=Reward2018')
print(project)

{'analyses': None,
 'created': datetime.datetime(2018, 12, 19, 15, 21, 10, 217000, tzinfo=tzutc()),
 'description': None,
 'files': [{'classification': {},
            'created': datetime.datetime(2019, 1, 9, 19, 48, 52, 13000, tzinfo=tzutc()),
            'hash': 'v0-sha384-1c77469675dad45012de0cbf1da5053e27918713ad822de64502a711d0238e56bc7b3f94946ba1e8d1921d1c8d68d1d4',
            'id': '66463c4e-c46b-4975-868d-fb48c45039e8',
            'info': {u'BIDS': {u'Filename': u'',
                               u'Folder': u'',
                               u'Path': u'',
                               u'error_message': u"Filename u'' is too short",
                               u'ignore': False,
                               u'template': u'project_file',
                               u'valid': False}},
            'info_exists': True,
            'mimetype': 'text/plain',
            'modality': None,
            'modified': datetime.datetime(2019, 1, 9, 22, 38, 19, 579000, tzinfo=tzutc

You'll notice that there is an `'_id'` field in all of these documents. This is the BSON id that is used to identify files in Flywheel. Everything (files, subjects, sessions, analyses) has one of these, and it can be used for direct access. Confusingly, you can't access it with `project['id']`; you'll need to use `project['_id']`.

In [4]:
project['_id']

'5c1a61e69011bd0011368884'
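
The `info['BIDS']` entries in the project printout above carry a `valid` flag and an `error_message`. A minimal sketch for collecting the files that failed curation is given here. It works on plain dictionaries shaped like that printout, so `invalid_bids_files` is a hypothetical helper and not part of the SDK. Whether SDK objects can be passed in unchanged is an assumption to check against your SDK version.

```python
def invalid_bids_files(files):
    """Return (name, error_message) pairs for files whose BIDS info is marked invalid.

    `files` is a list of dicts shaped like the entries under 'files' in the printout.
    """
    problems = []
    for f in files:
        bids = (f.get("info") or {}).get("BIDS")
        # Skip files with no BIDS info, or with a placeholder string such as 'NA'.
        if not isinstance(bids, dict):
            continue
        # Skipping files flagged 'ignore' is a choice made for this sketch.
        if bids.get("ignore"):
            continue
        if not bids.get("valid", False):
            problems.append((f.get("name"), bids.get("error_message", "")))
    return problems


example_files = [
    # 'example.txt' is a made-up name; the BIDS dict mirrors the printout above.
    {"name": "example.txt",
     "info": {"BIDS": {"Filename": "", "Folder": "", "Path": "",
                       "error_message": "Filename u'' is too short",
                       "ignore": False, "template": "project_file", "valid": False}}},
    {"name": "B0map_onesizefitsall_v4.dicom.zip", "info": {"BIDS": "NA"}},
]

for name, message in invalid_bids_files(example_files):
    print(f"{name}: {message}")
```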

In [5]:
sessions = project.sessions.find('created=2018-03-22')

In [6]:
sessions

[]
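
One possible explanation for the empty result, and it is only a guess, is that `created` holds a full timestamp (see the `datetime` values in the printout above), so an equality test against a bare date matches nothing. The sketch below filters by calendar day on the client side instead. `sessions_on_day` is a hypothetical helper; it assumes only that each session object exposes a `created` attribute holding a `datetime`, as the printed objects in this notebook appear to.

```python
from datetime import date, datetime, timezone
from types import SimpleNamespace


def sessions_on_day(sessions, day):
    """Keep sessions whose `created` timestamp falls on the calendar day `day`.

    The comparison uses the date in whatever timezone `created` carries.
    """
    return [s for s in sessions
            if s.created is not None and s.created.date() == day]


# Stand-in objects with made-up timestamps, so the sketch runs without a login.
fake_sessions = [
    SimpleNamespace(label="neff2",
                    created=datetime(2018, 12, 19, 17, 41, 30, tzinfo=timezone.utc)),
    SimpleNamespace(label="other",
                    created=datetime(2019, 1, 9, 19, 48, 52, tzinfo=timezone.utc)),
]

matches = sessions_on_day(fake_sessions, date(2018, 12, 19))
print([s.label for s in matches])
```

With a live client the first argument could be the project's full session list, assuming the sessions finder can be called the same way `project.subjects()` is.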

The search comes back empty, so let's list the project's subjects and work down from there instead.

In [24]:
for subject in project.subjects():
    print('%s: %s' % (subject.id, subject.label))

5c352b991de80b0024480dda: 10180
5c352a081de80b0024480dcc: 102102
5c3535641de80b0024480e41: 104059
5c3527fe1de80b0024480db7: 10410
5c3541301de80b0024480ebc: 105168
5c353ab01de80b00198acafb: 105272
5c3542de1de80b00198acb34: 105490
5c35186f1de80b00198ac9f9: 105634
5c3539ea1de80b001c0da9c8: 106573
5c353e151de80b0024480ea0: 107055
5c3545391de80b00198acb42: 109741
5c3511cd1de80b0024480c9f: 11010
5c3532511de80b0024480e26: 11176
5c351c2f1de80b00156d3d76: 11186
5c1a8b619011bd0011369953: 11242
5c1a8b249011bd001436aa02: 11305
5c1a8acd9011bd0015369b48: 113220
5c1a89509011bd001436a68d: 11399
5c1a88959011bd0013369c49: 11419
5c1a876b9011bd0015369675: 11569
5c1a87089011bd0013369978: 11588
5c1a85f79011bd0014369b3b: 11599
5c1a8b2d9011bd0015369ba6: 11647
5c1a8a939011bd0013369f79: 116531
5c1a8a2c9011bd0011369848: 11706
5c1a89b69011bd001436a741: 117256
5c1a88cf9011bd001436a556: 11762
5c1a87df9011bd001436a2f3: 11801
5c1a86b89011bd0015369557: 11866
5c1a863c9011bd00113693cc: 118990
5c1a85ae9011bd0014369914: 1

In [26]:
subject = fw.get("5c1a82ca9011bd0015368ee6")

In [29]:
for ses in subject.sessions():
    print('%s: %s' % (ses.id, ses.label))

5c1a82ca9011bd0015368ee7: neff2


In [30]:
ses1 = fw.get("5c1a82ca9011bd0015368ee7")

In [32]:
ses1.acquisitions()

[{'analyses': None,
  'collections': None,
  'created': datetime.datetime(2018, 12, 19, 17, 41, 30, 537000, tzinfo=tzutc()),
  'files': [{'classification': {},
             'created': datetime.datetime(2018, 12, 19, 17, 43, 36, 832000, tzinfo=tzutc()),
             'hash': '',
             'id': '39b3bce9-dec6-4dfa-b246-8d4b5db67436',
             'info': {u'BIDS': u'NA'},
             'info_exists': True,
             'mimetype': 'application/zip',
             'modality': 'MR',
             'modified': datetime.datetime(2018, 12, 19, 21, 57, 49, 216000, tzinfo=tzutc()),
             'name': 'B0map_onesizefitsall_v4.dicom.zip',
             'origin': {'id': 'mattcieslak@gmail.com',
                        'method': None,
                        'name': None,
                        'type': 'user',
                        'via': None},
             'replaced': None,
             'size': 2946086,
             'tags': [],
             'type': 'dicom',
             'zip_member_count': Non

B0map_onesizefitsall_v4.dicom.zip
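
The manual walk above (project, subject, session, acquisition) can be folded into one generator. This is a sketch, not an SDK feature: `iter_acquisition_files` is a hypothetical helper that relies only on the `subjects()`, `sessions()` and `acquisitions()` calls used in the cells above, plus the assumption that each acquisition exposes its files as a `files` list whose entries have a `name`, as the printout suggests.

```python
from types import SimpleNamespace


def iter_acquisition_files(project):
    """Yield (subject_label, session_label, acquisition_label, file_name) tuples."""
    for subject in project.subjects():
        for session in subject.sessions():
            for acquisition in session.acquisitions():
                # `files` may be None or empty for an acquisition without uploads.
                for f in acquisition.files or []:
                    yield subject.label, session.label, acquisition.label, f.name


# Minimal stand-ins that mimic the call pattern, so the sketch runs offline.
# The subject and acquisition labels are made up.
_file = SimpleNamespace(name="B0map_onesizefitsall_v4.dicom.zip")
_acq = SimpleNamespace(label="fieldmap", files=[_file])
_ses = SimpleNamespace(label="neff2", acquisitions=lambda: [_acq])
_sub = SimpleNamespace(label="00000", sessions=lambda: [_ses])
fake_project = SimpleNamespace(subjects=lambda: [_sub])

for row in iter_acquisition_files(fake_project):
    print(" / ".join(row))
```

Against a real project this makes one request per container, so it can be slow on large projects.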

In [134]:
f1 = fw.get("5c1a82ca9011bd0011368ed1")

In [135]:
f1.label

'foo'

In [136]:
newname = {"label":"foo2"}
f1.update(newname)

{'modified': 1}

In [137]:
f1

{'analyses': [],
 'collections': None,
 'created': datetime.datetime(2018, 12, 19, 17, 41, 30, 537000, tzinfo=tzutc()),
 'files': [{'classification': {u'Custom': [],
                               u'Intent': [u'Fieldmap'],
                               u'Measurement': [u'B0']},
            'created': datetime.datetime(2018, 12, 19, 17, 43, 36, 832000, tzinfo=tzutc()),
            'hash': '',
            'id': '39b3bce9-dec6-4dfa-b246-8d4b5db67436',
            'info': {u'name': u'foo'},
            'info_exists': None,
            'mimetype': 'application/zip',
            'modality': 'MR',
            'modified': datetime.datetime(2019, 2, 14, 21, 38, 26, 320000, tzinfo=tzutc()),
            'name': 'B0map_onesizefitsall_v4.dicom.zip',
            'origin': {'id': 'mattcieslak@gmail.com',
                       'method': None,
                       'name': None,
                       'type': 'user',
                       'via': None},
            'replaced': None,
            'siz

In [132]:
f1.reload()

{'analyses': [],
 'collections': None,
 'created': datetime.datetime(2018, 12, 19, 17, 41, 30, 537000, tzinfo=tzutc()),
 'files': [{'classification': {u'Custom': [],
                               u'Intent': [u'Fieldmap'],
                               u'Measurement': [u'B0']},
            'created': datetime.datetime(2018, 12, 19, 17, 43, 36, 832000, tzinfo=tzutc()),
            'hash': '',
            'id': '39b3bce9-dec6-4dfa-b246-8d4b5db67436',
            'info': {u'name': u'foo'},
            'info_exists': None,
            'mimetype': 'application/zip',
            'modality': 'MR',
            'modified': datetime.datetime(2019, 2, 14, 21, 38, 26, 320000, tzinfo=tzutc()),
            'name': 'B0map_onesizefitsall_v4.dicom.zip',
            'origin': {'id': 'mattcieslak@gmail.com',
                       'method': None,
                       'name': None,
                       'type': 'user',
                       'via': None},
            'replaced': None,
            'siz
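
The same update pattern can, in principle, be pointed at a file's BIDS metadata. The sketch below is hedged accordingly. `build_bids_info` and `apply_bids_changes` are hypothetical helpers, and the call `acquisition.update_file_info(file_name, info)` is an assumption about the SDK that should be confirmed in its documentation before use. The `BIDS` keys are copied from the printouts earlier in this notebook. Whether the `valid` flag is refreshed without re-running curation is not addressed here.

```python
import copy


def build_bids_info(current_info, **changes):
    """Return a new info payload whose 'BIDS' entry has `changes` applied.

    The whole 'BIDS' dictionary is rebuilt rather than sending only the changed
    keys, on the assumption that an update replaces top-level keys wholesale.
    """
    bids = current_info.get("BIDS") if isinstance(current_info, dict) else None
    # A placeholder such as 'NA' is discarded and replaced by a fresh dict.
    new_bids = copy.deepcopy(bids) if isinstance(bids, dict) else {}
    new_bids.update(changes)
    return {"BIDS": new_bids}


def apply_bids_changes(acquisition, file_name, **changes):
    """Send rebuilt BIDS info for one file of an acquisition.

    ASSUMPTION: the acquisition offers `update_file_info(file_name, info)` and
    lists its files under `files`, each entry having `name` and `info`.
    File names are not guaranteed unique; the first match is used.
    """
    for f in acquisition.files:
        if f.name == file_name:
            payload = build_bids_info(f.info, **changes)
            return acquisition.update_file_info(file_name, payload)
    raise KeyError(f"no file named {file_name!r} in this acquisition")


# Offline demonstration of the payload only; the new names are made up.
old_info = {"BIDS": {"Filename": "", "Folder": "", "ignore": False, "valid": False}}
print(build_bids_info(old_info, Filename="sub-01_phasediff.nii.gz", Folder="fmap"))
```

After a real update, calling `reload()` on the container, as done above, would be the natural way to check that the change was stored.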