# Making sure your data is BIDS-compliant on Flywheel

Many pre-processing pipelines require BIDS-compliant input. Correct metadata ensures that the pipeline handles your data correctly, while incorrect BIDS naming can cause errors that are difficult to debug.

For these reasons, we cover in depth how to make sure your data is BIDS-compliant.

First, let's look at an example subject. 

<img src="images/initial_bids_error.png" alt="drawing" width="600"/>

These red flags mean that there were errors converting the images to BIDS-compliant names. Here you have two options:

  1. Click the "Info" button on the scan and make sure the BIDS section is completed correctly
  2. Use the Flywheel API to change the BIDS properties


## The Flywheel data model

Flywheel stores zip archives of the DICOM data along with NIfTI files. The names of the NIfTI files are usually determined by `dcm2niix`, and that name sticks with the file for the rest of its life. I believe these files are managed using MongoDB's GridFS framework, so some non-intuitive things can happen.

  - More than one file can exist with the exact same name. The files are indexed by a BSON id, which is
    unique for each file. The field containing the file name is not necessarily unique.

  - Unless the data was uploaded in BIDS format, the actual BIDS-named data does not exist in Flywheel.
    Instead, when a job requires BIDS input, the job uses the Flywheel API to download and rename
    the data to be BIDS-compliant.

    - The renaming and JSON sidecar creation use information stored in Flywheel's MongoDB. Therefore, to
      change the BIDS filename, you simply update the file's document (BSON), and the next time a gear
      is run, this new metadata will be used to create the BIDS directory.
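
The points in this list can be sketched with plain Python dictionaries standing in for Flywheel file documents. This is only an illustration and calls no Flywheel API: the `info['BIDS']` keys (`Folder`, `Filename`, `ignore`) are the ones Flywheel shows in a file's metadata, but the documents, the ids, and the helpers `set_bids_info` and `bids_destination` are hypothetical, and the real renaming logic may differ.

```python
from copy import deepcopy
import posixpath

# Two hypothetical file documents that share a name; only the id is unique.
files = [
    {"_id": "aaa111", "name": "task_rest.nii.gz",
     "info": {"BIDS": {"Folder": "", "Filename": "", "ignore": False}}},
    {"_id": "bbb222", "name": "task_rest.nii.gz",
     "info": {"BIDS": {"Folder": "func",
                       "Filename": "sub-01_task-rest_run-01_bold.nii.gz",
                       "ignore": False}}},
]

# Index by id rather than by name, because names can collide.
by_id = {doc["_id"]: doc for doc in files}


def set_bids_info(doc, **fields):
    """Return a copy of doc with info['BIDS'] updated by the given fields."""
    updated = deepcopy(doc)
    updated.setdefault("info", {}).setdefault("BIDS", {}).update(fields)
    return updated


def bids_destination(doc, subject_dir):
    """Path the file would get in a BIDS tree, or None if it cannot be named."""
    bids = doc.get("info", {}).get("BIDS", {})
    if bids.get("ignore") or not bids.get("Filename"):
        return None
    return posixpath.join(subject_dir, bids.get("Folder", ""), bids["Filename"])


# Changing the BIDS name means changing metadata, not the stored file.
fixed = set_bids_info(by_id["aaa111"], Folder="func",
                      Filename="sub-01_task-rest_run-02_bold.nii.gz")

print(bids_destination(by_id["aaa111"], "sub-01"))  # None: Filename is empty
print(bids_destination(fixed, "sub-01"))
```

The stored name (`task_rest.nii.gz`) never changes in this sketch. Only the metadata does, which is why the BIDS name can be corrected without re-uploading any data.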

The metadata associated with each file can be edited through the Flywheel web interface or through the API. The web interface is incomplete and still a little buggy (e.g. not all views contain BIDS fields). The SDK is fairly complete and documented, so it will be the focus of this notebook.

        

## Editing BIDS info from the web interface

You can change the information on your scans by clicking the info icon in the file's row.

<img src="images/select_info.png" alt="drawing" width="600"/>


The fields can then be edited manually. This is the initial interface presented:

<img src="images/basic_file_info.png" alt="drawing" width="600"/>

where basic options about the contents of the file are editable. This will always be available when clicking the Info button. If the `BIDS Curation` gear has been run, you can scroll down and see the BIDS fields:

<img src="images/bids_info.png" alt="drawing" width="600"/>

It is unclear to me if the `BIDS Curation` gear needs to be re-run for these changes to be committed to MongoDB.

## Editing BIDS info from the API

The Flywheel SDK can be installed from PyPI using

```bash
pip install flywheel-sdk
```

The documentation for this package can be found [here](https://flywheel-io.github.io/core/branches/master/python/index.html). 

```python
import flywheel

# Create client
fw = flywheel.Client()
```

Note that the documentation is out of date: the constructor for `flywheel.Client` does not take any arguments. Instead, you have to log in with your API key using the command-line tool (`fw login`). Verify that your login worked and that you can retrieve data from Flywheel:

```python
fw.get_current_user()['email']
```

```
'mcieslak@upenn.edu'
```

Now we need to find the document that represents this scanning session. This can be done through a search:

```python
# How do you find the session?
project = fw.projects.find_first('label=Reward2018')
print(project)
```

```
{'analyses': None,
 'created': datetime.datetime(2018, 12, 19, 15, 21, 10, 217000, tzinfo=tzutc()),
 'description': None,
 'files': [{'classification': {},
            'created': datetime.datetime(2019, 1, 9, 19, 48, 52, 13000, tzinfo=tzutc()),
            'hash': 'v0-sha384-1c77469675dad45012de0cbf1da5053e27918713ad822de64502a711d0238e56bc7b3f94946ba1e8d1921d1c8d68d1d4',
            'id': '66463c4e-c46b-4975-868d-fb48c45039e8',
            'info': {'BIDS': {'Filename': '',
                              'Folder': '',
                              'Path': '',
                              'error_message': "Filename u'' is too short",
                              'ignore': False,
                              'template': 'project_file',
                              'valid': False}},
            'info_exists': True,
            'mimetype': 'text/plain',
            'modality': None,
            'modified': datetime.datetime(2019, 1, 9, 22, 38, 19, 579000, tzinfo=tzutc()),
            'n
```

You'll notice that there is an `'_id'` field in all of these documents. This is the BSON id that is used to identify files in Flywheel. Everything (files, subjects, sessions, analyses) has one of these, and it can be used for direct access. Confusingly, you can't access it with `project['id']`: you'll need to use `project['_id']`.

```python
project['_id']
```

```
'5c1a61e69011bd0011368884'
```

```python
sessions = project.sessions.find('created=2018-03-22')
sessions
```

```
[]
```
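
The empty list may simply mean that no session has a `created` value exactly equal to the filter, since `created` is stored as a full timestamp. One thing to try is a range filter covering the whole day. This is a guess rather than confirmed behavior, and the sketch below only builds the filter string. The helper `day_range_filter` is hypothetical, and the comma-separated comparison syntax (`>=`, `<`) is an assumption that should be checked against the SDK's finder documentation.

```python
from datetime import date, timedelta


def day_range_filter(day, field="created"):
    """Build a filter string covering one calendar day (end-exclusive)."""
    next_day = day + timedelta(days=1)
    return f"{field}>={day.isoformat()},{field}<{next_day.isoformat()}"


query = day_range_filter(date(2018, 3, 22))
print(query)

# With a logged-in client, the string could then be tried in place of the
# equality filter (assumption: the finder accepts range comparisons):
# sessions = project.sessions.find(query)
```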