# Download Analyses from Flywheel

Welcome! This is an introductory worksheet to explore how we can select and download analysis files from Flywheel. 

**Date modified:** 11/12/2024<br>
**Authors:** Amy Hegarty, Intermountain Neuroimaging Consortium

**Sections:**
1. USER INPUTS
2. IMPORT STATEMENTS
3. FLYWHEEL LOGIN
4. QUICK COMMANDS
5. ANALYSIS TABLES
6. RUN ANALYSIS BY TEMPLATE
7. DOWNLOAD ANALYSIS FILES
-----

Before starting...
1. Be sure you have configured your conda environment to view ics managed conda environments and packages. If you haven't, get started [here](https://inc-documentation.readthedocs.io/en/latest/pl_and_blanca_basics.html#setting-up-conda-environments).

2. Be sure to select the `incenv` kernel from the list of available kernels. If you don't see the `incenv` kernel, contact Amy Hegarty <Amy.Hegarty@colorado.edu> or follow the instructions [here](https://inc-documentation.readthedocs.io/en/latest/pl_and_blanca_basics.html#setting-up-conda-environments) to set up a new kernel in a shared conda environment.

## __USER INPUTS__
Gather all user-defined variables for the worksheet.

In [None]:
user_inputs = {
    "group": "<group>",
    "project": "<project>", 
    "download-path": "<path-to-download-directory>"
}
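
Before running later cells, it can help to confirm that the placeholders above were actually replaced. The sketch below is illustrative only: `find_unset_inputs` is a hypothetical helper, not part of the worksheet's helper functions, and it assumes placeholders keep the `<...>` form used above. The example values are made up.

```python
def find_unset_inputs(inputs):
    """Return keys whose values are empty or still look like '<placeholder>'."""
    unset = []
    for key, value in inputs.items():
        text = str(value).strip()
        if not text or (text.startswith("<") and text.endswith(">")):
            unset.append(key)
    return unset


# Example values are invented for illustration.
example_inputs = {
    "group": "mygroup",
    "project": "<project>",
    "download-path": "/tmp/flywheel-downloads",
}
unset_keys = find_unset_inputs(example_inputs)
print(unset_keys)
```

An empty list would mean every input has been filled in.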

## __IMPORT STATEMENTS__
Here we load all packages used in the worksheet, including the custom helper functions imported from `_helper_functions`.

In [None]:
from pathlib import Path
import os
import flywheel
import pandas as pd
import sys
import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s')
log = logging.getLogger('main')

# import custom helper functions; first add their location to the system path...
#      use the current directory inside jupyter notebooks,
#      or the __file__ attribute in a script
try:
    sys.path.insert(0, os.path.dirname(__file__))
except NameError:
    sys.path.insert(0, os.path.dirname(os.getcwd()))
from _helper_functions import tables, fileIO, gears


# set default permissions
os.umask(0o002);
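
The `os.umask(0o002)` call removes only the "other-write" bit from newly created files, so files written to a shared directory stay group-writable. A minimal sketch of the effect, assuming a POSIX file system (the temporary file exists only for illustration):

```python
import os
import stat
import tempfile

previous = os.umask(0o002)  # set the mask; the previous mask is returned
try:
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "example.txt")
        # open() requests mode 0o666; the umask strips the masked bits
        with open(path, "w") as handle:
            handle.write("test")
        mode = stat.S_IMODE(os.stat(path).st_mode)
        print(oct(mode))
finally:
    os.umask(previous)  # restore the mask that was active before the sketch
```

With this mask, a new file requested as `0o666` ends up as `0o664` (read and write for owner and group, read-only for others).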

## __FLYWHEEL LOGIN__
Be sure you have first logged into Flywheel using the command line interface. Once your API key is stored, you will not need to log in again. Follow the instructions [here](https://inc-documentation.readthedocs.io/en/latest/cli_basics.html#cli-from-blanca-compute-node).

In [None]:
# get flywheel client
fw = flywheel.Client('')

project = fw.projects.find_one('label='+user_inputs["project"])

## __GENERATE TABLE__
Use the custom function `get_table` from the helper functions to walk through all project analyses, locate those that meet our matching criteria, and output their ids in a table.

## __QUICK COMMANDS__

In [None]:
sessions = fw.projects.find_one('label='+user_inputs["project"]).sessions.find()  # get session objects
session_ids = [s.id for s in sessions if "pilot" not in " ".join(s.tags)]         # get session ids for non pilot sessions
print(session_ids)
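
The filter above joins each session's tags into one string and excludes any session where "pilot" appears anywhere in it, so a tag such as `pilot-2` is excluded as well. A sketch of that behavior on stand-in objects (the session ids and tags are made up; real sessions come from the Flywheel client):

```python
from types import SimpleNamespace

# Stand-ins for Flywheel session objects; only `id` and `tags` are modeled.
mock_sessions = [
    SimpleNamespace(id="ses-a", tags=["complete"]),
    SimpleNamespace(id="ses-b", tags=["pilot"]),
    SimpleNamespace(id="ses-c", tags=["qc", "pilot-2"]),
    SimpleNamespace(id="ses-d", tags=[]),
]

# Same substring test as the worksheet cell.
substring_ids = [s.id for s in mock_sessions if "pilot" not in " ".join(s.tags)]

# Stricter alternative: exclude only sessions carrying the exact tag "pilot".
exact_ids = [s.id for s in mock_sessions if "pilot" not in s.tags]

print(substring_ids)
print(exact_ids)
```

Which variant is appropriate depends on how tags are used in your project.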

## __ANALYSIS TABLES__
It can be useful to construct tables of all analyses for a specified project. These tables can be organized in two ways:
1. __`gear_table`__: all instances of a specific gear (e.g. fmriprep)
2. __`session_table`__: all sessions within the project, with analysis ids for all auto workflow analyses

Inspect these tables to confirm that analyses are complete for all sessions, and use them as input for downloads.

In [None]:
# get table (all analyses for specified gear...)
gearname = 'bids-fmriprep'
gear_table = tables.get_table_by_gearname(user_inputs, gearname)

# display table
print(gear_table.info())

# path for table output
label=user_inputs["project"].lower()+"."+gearname+".table.csv"

# save table for future use
gear_table.to_csv(label,index=False)
log.info("Analysis Spreadsheet saved: %s", os.path.join(os.getcwd(),label))

In [None]:
# get a list of all analysis ids used for our download methods...
list(gear_table.loc[gear_table['analysis.label'].str.contains('lower-motor NR24'),"analysis.id"])
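
Note that `Series.str.contains` treats its pattern as a regular expression by default, so labels containing characters such as `(` or `.` may need `regex=False`, and missing labels need `na=False` to keep NaN out of the mask. A sketch on a toy table (the labels and ids are invented; only the column names `analysis.label` and `analysis.id` are taken from the cell above):

```python
import pandas as pd

toy_table = pd.DataFrame(
    {
        "analysis.label": ["fmriprep (v1) run A", "fmriprep (v2) run B", None],
        "analysis.id": ["id-001", "id-002", "id-003"],
    }
)

# Literal match, treating missing labels as non-matching.
mask = toy_table["analysis.label"].str.contains("(v1)", regex=False, na=False)
matching_ids = toy_table.loc[mask, "analysis.id"].tolist()
print(matching_ids)
```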

In [None]:
# get table (all analyses for each project session)
session_table = tables.get_table_by_template(user_inputs, template_file_name="gear_template.json")

# display table
print(session_table.info())

# path for table output
label=user_inputs["project"].lower()+".bysession.table.csv"

# save table for future use
session_table.to_csv(label,index=False)
log.info("Analysis Spreadsheet saved: %s", os.path.join(os.getcwd(),label))

## __RUN ANALYSIS BY TEMPLATE__
After reviewing your analysis checks, if there are any sessions with incomplete workflows, you may want to run any remaining gears using the auto workflow methods. Inspect each session first before running the workflow to confirm if any prior gears failed, and if so, why.

In [None]:
# for more information on running analyses using the auto-workflow template, visit `2_gear_workflow` section of this repo
gears.run_auto_gear('<session_id>', template_file_name="gear_template.json")

## __DOWNLOAD ANALYSIS FILES__
Use the analysis table to download all analysis files to the selected path.

In [None]:
# option 1: download directly from analysis table
os.makedirs(user_inputs["download-path"], exist_ok=True)
analysis_id = gear_table.loc[gear_table['subject.label'] == '102', 'analysis.id'].values[0]
fileIO.download_session_analyses_byid(analysis_id, user_inputs["download-path"])
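
Indexing with `.values[0]` raises an `IndexError` when no row matches, for example if subject labels are stored as integers rather than strings. A sketch of a guarded lookup on a toy table (`first_analysis_id` is a hypothetical helper and the rows are invented):

```python
import pandas as pd


def first_analysis_id(table, subject_label):
    """Return the first analysis id for a subject, or None if nothing matches."""
    # Compare as strings so that labels stored as integers still match.
    matches = table.loc[
        table["subject.label"].astype(str) == str(subject_label), "analysis.id"
    ]
    return None if matches.empty else matches.iloc[0]


toy_table = pd.DataFrame(
    {"subject.label": [101, 102], "analysis.id": ["id-aaa", "id-bbb"]}
)
found = first_analysis_id(toy_table, "102")
missing = first_analysis_id(toy_table, "999")
print(found, missing)
```

Checking for `None` before calling the download function avoids a confusing traceback when a subject has no matching analysis.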

In [None]:
# option 2: download directly from list of analysis ids
analysis_ids = [
    '<analysis-id-1>',
    '<analysis-id-2>',
]
os.makedirs(user_inputs["download-path"], exist_ok=True)
for aid in analysis_ids:
    fileIO.download_session_analyses_byid(aid,user_inputs["download-path"])
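
For longer lists it may be useful to keep going when a single download fails and to report the failures at the end. The sketch below shows that pattern with a stand-in download function: `fake_download` and `download_all` are hypothetical and only imitate a call taking the same two arguments as `fileIO.download_session_analyses_byid`.

```python
import logging

log = logging.getLogger("download-sketch")


def fake_download(analysis_id, download_path):
    """Stand-in for the real download call; fails for one made-up id."""
    if analysis_id == "bad-id":
        raise ValueError(f"analysis not found: {analysis_id}")
    return f"{download_path}/{analysis_id}"


def download_all(analysis_ids, download_path, download_func):
    """Call download_func for each id and collect the ids that raised."""
    failed = []
    for aid in analysis_ids:
        try:
            download_func(aid, download_path)
        except Exception as err:  # broad on purpose: record and continue
            log.warning("Download failed for %s: %s", aid, err)
            failed.append(aid)
    return failed


failed_ids = download_all(["id-1", "bad-id", "id-2"], "/tmp/example", fake_download)
print(failed_ids)
```

In the worksheet, the real download function would be passed in place of `fake_download`, assuming it raises an exception on failure.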