# Update Metadata in Flywheel

Welcome! This introductory worksheet explores how to use the Flywheel SDK to read and update metadata (and data) in Flywheel.

**Date modified:** 02/16/2025<br>
**Authors:** Amy Hegarty, Intermountain Neuroimaging Consortium

**Sections:**
1. IMPORT STATEMENTS
2. FLYWHEEL LOGIN
3. ACQUISITION RENAMING
4. DELETE UNWANTED FILES
5. UPDATE INTENDEDFOR SETS


**NOTE**: Studies not collected using the `reproin` naming convention should apply acquisition renaming (Workbook __Section 3__) to all data before running bids-* workflows.

-----

Before starting...
1. Be sure you have configured your conda environment to view ics managed conda environments and packages. If you haven't, get started [here](https://inc-documentation.readthedocs.io/en/latest/pl_and_blanca_basics.html#setting-up-conda-environments).

2. Be sure to select the `incenv` kernel from the list of available kernels. If you don't see the `incenv` kernel, contact Amy Hegarty <Amy.Hegarty@colorado.edu> or follow the instructions [here](https://inc-documentation.readthedocs.io/en/latest/pl_and_blanca_basics.html#setting-up-conda-environments) to set up a new kernel in a shared conda environment.

## __IMPORT STATEMENTS__
Here we will load all packages used in the worksheet.

In [2]:
# Standard library packages come first
import os
from io import StringIO
# Third party packages come second
import pandas as pd
import flywheel
from flywheel import ApiException

## __FLYWHEEL LOGIN__
Be sure you have first logged in to Flywheel using the command line interface. Once your API key is stored, you will not need to log in again. Follow the instructions [here](https://inc-documentation.readthedocs.io/en/latest/cli_basics.html#cli-from-blanca-compute-node).

In [4]:
fw = flywheel.Client()

## __ACQUISITION RENAMING__
It is often useful to programmatically update acquisition names within a project, most commonly when the original data was not collected using the `reproin` naming convention. The example code here shows how to store a map, `acquisition_label_remapping.csv`, in a Flywheel project and then apply the remapping to a given session. Note that if acquisitions have duplicate `seriesDescription` labels, the second, third, fourth, and later instances of the acquisition are given a numeric suffix.
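
The mapping-plus-suffix behavior can be sketched without a Flywheel connection. This is a minimal illustration, not the worksheet's implementation: `load_mapping`, `relabel`, and the CSV column names `old_label` and `new_label` are hypothetical, and the CSV is assumed to have a header row followed by old/new label pairs.

```python
import csv
from io import StringIO


def load_mapping(csv_text):
    """Parse a two-column CSV (header row, then old,new pairs) into a dict."""
    reader = csv.reader(StringIO(csv_text))
    next(reader, None)  # skip the header row
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


def relabel(labels, mapping):
    """Return new labels in order, suffixing repeated labels with _1, _2, ..."""
    used = set()
    result = []
    for label in labels:
        base = mapping.get(label, label)  # unmapped labels are kept as-is
        candidate, i = base, 0
        while candidate in used:
            i += 1
            candidate = f"{base}_{i}"
        used.add(candidate)
        result.append(candidate)
    return result


# Hypothetical CSV content and session labels (already in series-number order)
csv_text = """old_label,new_label
ABCD_QA_fMRI,func-bold_task-abcdqa
localizer_32ch,localizer_32ch_ignore-BIDS
"""
mapping = load_mapping(csv_text)
labels = ["localizer_32ch", "ABCD_QA_fMRI", "ABCD_QA_fMRI", "PhoenixZIPReport"]
new_labels = relabel(labels, mapping)
print(new_labels)
```

Because suffixes are assigned in list order, the labels should be sorted by series number first so that the suffix reflects acquisition order.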

### helper functions

In [28]:
def get(obj):
    "reload the full container (including files and info) by id"
    return fw.get(obj.id)

def acquisitions_ordered_by_number(session):
    "return a dataframe of acquisitions ordered by series number..."
    
    ordered_list = {'SeriesNumber':[],'SeriesDescription':[],'acquisition':[], 'acq_id':[]}
    for acq in get(session).acquisitions():
        # use the first dicom file (if any) to read the series metadata
        file = next((f for f in get(acq).files if f['type'] == 'dicom'), None)
        if file is None:
            continue
        try:
            series_description = file.info["SeriesDescription"]
            series_number = file.info["SeriesNumber"]
        except (KeyError, TypeError):
            # skip acquisitions without dicom header metadata
            continue
        ordered_list['SeriesDescription'].append(series_description)
        ordered_list['SeriesNumber'].append(series_number)
        ordered_list['acquisition'].append(acq.label)
        ordered_list['acq_id'].append(acq.id)

    df = pd.DataFrame.from_dict(ordered_list)
    return df.sort_values('SeriesNumber',ignore_index=True)

### Example: apply acquisition renaming to single session (add suffix for duplicates)

In [None]:
# rename_acqisitions = {
# 'localizer_32ch': 'localizer_32ch_ignore-BIDS',
#  'localizer_32ch_uncombined': 'localizer_32ch_uncombined_ignore-BIDS',
#  'localizer_32ch_uncombined_1': 'localizer_32ch_uncombined_ignore-BIDS',
#  'Combined_Image': 'Combined_Image_ignore-BIDS',
#  'GFactor': 'GFactor_ignore-BIDS',
#  'SNRMap': 'SNRMap_ignore-BIDS',
#  'NoiseCovariance': 'NoiseCovariance_ignore-BIDS',
#  'ABCD_QA_fMRI': 'func-bold_task-abcdqa_run-01',
#  'ABCD_QA_fMRI_1':'func-bold_task-abcdqa_run-02',
#  'ABCD_QA_dMRI': 'dwi_acq-abcdqa_run-01',
#  'FBIRN_QA_fMRI_flip77': 'func-bold_task-abcdqa_acq-flip77_run-01',
#  'FBIRN_QA_fMRI_flip10': 'func-bold_task-abcdqa_acq-flip10_run-01',
#  'PhoenixZIPReport': 'PhoenixZIPReport_ignore-BIDS'
# }

# !!!SESSION!!!
qa_session = fw.lookup('<path-to-session>')

# STEP 1: pull mapping labels from flywheel
try:
    project = fw.get_container(qa_session.parents["project"])
    
    # assume acquisition labeling key is called "acquisition_label_remapping.csv"
    sourcefile = project.get_file("acquisition_label_remapping.csv")
    
    # read file directly to memory
    data_str = sourcefile.read().decode('utf-8')
    
    # import as dictionary
    # first column holds the current label, second column the new label
    rename_acqisitions = {row.iloc[0] : row.iloc[1] for _, row in pd.read_csv(StringIO(data_str)).iterrows()}
    
except Exception as e:
    print("unable to load acquisition labeling from flywheel...")
    raise e

# start by putting the acquisitions in series number order!!! -- Doesn't always start this way!
ordered_list = acquisitions_ordered_by_number(qa_session)
reset = False

# loop through acquisitions in order, relabelling each one
for index, row in ordered_list.iterrows():
    acq = fw.get(row["acq_id"])
    
    # find new acquisition label from dictionary (keep the current label if unmapped)
    new_label = rename_acqisitions.get(acq.label, acq.label)
    
    # This is for troubleshooting only, resets the labels to the default from the scanner import
    if reset:
        new_label=row["SeriesDescription"]
    
    # if duplicates exist (update command fails), append a numeric suffix and retry
    for i in range(10):
        suffix = f"_{i}" if i > 0 else ""
        # build each candidate from the base label so suffixes do not accumulate
        candidate = new_label + suffix
        try:
            acq.update({'label': candidate})
            print(f"updating: {acq.label} ---> {candidate}")
            break
        except ApiException:
            # label already in use... increase counter and try again
            print("ignoring ApiException")
       

## __DELETE UNWANTED FILES (E.G. DERIVATIVE LOCALIZER FILES)__
Removing data from Flywheel should not be taken lightly! Always run in `dry_run` mode before executing. Specific situations may warrant programmatically deleting files, such as removing extraneous derivative files.

__NEVER DELETE SOURCE DATA!!__  
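
The dry-run pattern can be sketched on its own, independent of Flywheel. In this illustration `FakeFile`, `select_files`, and `delete_files` are hypothetical stand-ins, and `delete_fn` represents whatever call performs the real deletion. The goal is only to show that the list of affected files is identical in both modes, so the dry-run output can be reviewed before anything is removed.

```python
from dataclasses import dataclass


@dataclass
class FakeFile:
    """Hypothetical stand-in for a file entry with a name and a type."""
    name: str
    type: str


def select_files(files, file_type="nifti"):
    """Keep only files of the given type (e.g. derivative NIfTI files)."""
    return [f for f in files if f.type == file_type]


def delete_files(files, delete_fn, dry_run=True):
    """Report every file that would be deleted; only call delete_fn if not dry_run."""
    names = []
    for f in files:
        names.append(f.name)
        if not dry_run:
            delete_fn(f)
    return names


files = [
    FakeFile("localizer.dicom.zip", "dicom"),  # source data: never selected
    FakeFile("localizer.nii.gz", "nifti"),
    FakeFile("localizer_i00002.nii.gz", "nifti"),
]

deleted = []  # records calls to the (fake) delete function
preview = delete_files(select_files(files), deleted.append, dry_run=True)
print(preview)   # names that would be removed
print(deleted)   # still empty: nothing was deleted in dry-run mode
```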

In [None]:
session  = fw.lookup('<path-to-session>')
dry_run = True
for acq in session.acquisitions():
    if "localizer" in acq.label:
        full_acq = fw.get(acq.id)
        for f in full_acq.files:
            if f.type == 'nifti':
                print(f.name)
                if not dry_run:
                    fw.delete_file(f.file_id)
                

## __UPDATE INTENDEDFOR SETS WHEN MULTIPLE FIELDMAP PAIRS EXIST__
On some occasions we may need to re-assign which fMRI / dMRI files should be distortion corrected with a given fieldmap pair. Files are generally matched to a fieldmap pair with the same geometric dimensions, acquired as close in time as possible.
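
The scan-order matching rule can be illustrated with plain Python. This is a simplified sketch under stated assumptions, not the worksheet's helper: `intended_for` and the example BIDS paths are hypothetical, scans are assumed to be sorted by series number, and matching by geometric dimensions is not modeled. Each fieldmap claims the `func` and `dwi` scans that follow it, up to the next fieldmap with the same phase-encoding direction.

```python
def intended_for(scans, fmap_index, targets=("func", "dwi")):
    """List BIDS paths of target scans acquired after a fieldmap and before
    the next fieldmap with the same phase-encoding direction."""
    fmap = scans[fmap_index]
    selected = []
    for scan in scans[fmap_index + 1:]:
        if scan["modality"] == "fmap":
            if scan["direction"] == fmap["direction"]:
                break      # the next fieldmap pair starts here
            continue       # opposite-direction fieldmap: skip it
        if scan["modality"] in targets:
            selected.append(scan["bids"])
    return selected


# Hypothetical session, already sorted by series number
scans = [
    {"modality": "fmap", "direction": "AP", "bids": "ses-01/fmap/dir-AP_run-01_epi.nii.gz"},
    {"modality": "fmap", "direction": "PA", "bids": "ses-01/fmap/dir-PA_run-01_epi.nii.gz"},
    {"modality": "func", "direction": None, "bids": "ses-01/func/task-rest_bold.nii.gz"},
    {"modality": "anat", "direction": None, "bids": "ses-01/anat/T1w.nii.gz"},
    {"modality": "fmap", "direction": "AP", "bids": "ses-01/fmap/dir-AP_run-02_epi.nii.gz"},
    {"modality": "fmap", "direction": "PA", "bids": "ses-01/fmap/dir-PA_run-02_epi.nii.gz"},
    {"modality": "dwi", "direction": None, "bids": "ses-01/dwi/dwi.nii.gz"},
]

for i, scan in enumerate(scans):
    if scan["modality"] == "fmap":
        print(scan["bids"], "->", intended_for(scans, i))
```

In this sketch both members of a fieldmap pair receive the same list, and the anatomical scan is excluded by the modality filter.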

### helper functions

In [13]:
def build_lookup(full_session):
    sdescrp=[]; snum=[]; acq_name=[]; acq_id=[]; bids_label=[]; task=[]; folder=[]; direction=[]
    
    for acq in full_session.acquisitions():
        # load the full acquisition (including file info)
        full_acq = fw.get_acquisition(acq.id)
        
        if "ignore-BIDS" not in acq.label:
            for file in full_acq['files']:
                if file['type'] == 'nifti':
                    if "SeriesDescription" not in file.info:
                        continue
                    sdescrp.append(file.info["SeriesDescription"])
                    snum.append(file.info["SeriesNumber"])
                    acq_name.append(acq.label)
                    acq_id.append(acq.id)
                    
                    # BIDS info may be missing or a non-dict placeholder; treat both as empty
                    bids = file.info.get("BIDS")
                    if not isinstance(bids, dict):
                        bids = {}
                    
                    if "Folder" in bids and "Filename" in bids:
                        bids_label.append("ses-"+full_session.label+"/"+bids["Folder"]+"/"+bids["Filename"])
                    else:
                        bids_label.append(None)
                    
                    # missing keys are recorded as None
                    task.append(bids.get("Task"))
                    direction.append(bids.get("Dir"))
                    folder.append(bids.get("Folder"))
                        
                    break
                
    df = pd.DataFrame({"SeriesNumber": snum, 
                       "SeriesName": sdescrp, 
                       "Acquisition":acq_name, 
                       "ID": acq_id, 
                       "BIDS": bids_label,
                       "Modality": folder,
                       "Task": task,
                       "Direction": direction})
    df = df.sort_values("SeriesNumber", ignore_index=True)
    return df


def list_to_dict(rlist):
    rlist = [x.strip(' ') for x in rlist]
    return dict(map(lambda s : s.split(': '), rlist))


# automatically determine which scans go with which fieldmap (using scan order and filters if given...)
def assign_intendedfors(acq_name, lookup_table, mod_filter=None, task_filter=None, force_acq_order=True):
    
    df = lookup_table
    
    # find direction order to ignore: reverse current direction order (used for reverse phase encoding)
    ignore_dir = df.loc[df['Acquisition'] == acq_name, 'Direction'].iat[0][::-1]  
    
    # limit lookup table to sequences after the fieldmap of interest
    df = df[(df['Acquisition'] == acq_name).idxmax():]
    
    # ignore fieldmaps of opposing directions.
    index_ignore = df[(df['Direction']==ignore_dir) & (df['Modality']=="fmap")].index
    if not index_ignore.empty:
        df = df.drop(index_ignore)
     
    # finally...generate list of all acqs after current fieldmap, before next fieldmap is collected
    if force_acq_order:
        if "fmap" in df['Modality'][1:].values:
            df = df.loc[: df[(df['Modality'] == 'fmap')].index[1], :]
        
    # if task filter is given reduce lookup table to only desired tasks
    if task_filter:
        df = df[df['Task'].isin(task_filter)]
        
    # if modality filter is given reduce lookup table to only desired modalities
    if mod_filter:
        df = df[df['Modality'].isin(mod_filter)]
    
    return df
    

### Example: update intendedFors for single session

In [None]:
dry_run = True

ses = fw.get_session("<session-id>")

tab = build_lookup(ses)

# generate new intendedfor sets:
for index, row in tab[tab['Modality'] == "fmap"].iterrows():

    df = assign_intendedfors(tab['Acquisition'].iat[index],tab,mod_filter=["func","dwi"])

    pull_acq = fw.get_acquisition(tab['ID'].iat[index])
    print(tab['Acquisition'].iat[index])
    
    for ffile in pull_acq.files:
        if ffile.type in ['nifti']:
            # Get info object
            ffile_info=ffile.info
            old_intendedfor = ffile_info.get('IntendedFor')
            ffile_info['IntendedFor']=df["BIDS"].values.tolist()

            print(old_intendedfor)
            print('----------------->>>>>>')
            print(df["BIDS"].values.tolist())
            print('=======================')

            # write back per file, and only when dry_run is disabled
            if not dry_run:
                ffile.replace_info(ffile_info)