# UPenn Flywheel Data Transfer to ASC FMRISrv 

This notebook was shared by Dr Nicole Cooper from CNLab, who referenced it as an example for Flywheel MURI scan downloads. The same steps should work for CNLab & AHALab Flywheel projects.

+ 02/21/2020 - [José Carreras-Tartak](mailto:jcarreras@falklab.org) original author
+ 04/28/2021 - [Etienne Jacquot](mailto:etienne.jacquot@asc.upenn.edu) revisited


## *Getting Started w/ [UPenn Flywheel](https://upenn.flywheel.io/) Python-SDK*:

The AHA Lab does not yet have a project on Flywheel, so some of the steps below may not match exactly. A project will eventually be in place. For now, let us test against a specific session ID.

- Navigate here to log in with your PennKey: https://upenn.flywheel.io/
- You need an **API key**. Treat it as a secret and be careful where you store it


In [None]:
import flywheel
import tarfile
import os
import time
import zipfile
from zipfile import ZipFile

import configparser

### Create your Flywheel API secret config file


- If the check cell below returns `True`, you already have a config file with an API key of some kind. If it returns `False`, or if your API key needs to be changed, run the cell after it to write your API key, or manually create your [configs/config.ini](./configs/config.ini) (_we add *.ini to the .gitignore_)



_____

```python
config['UPENN-FLYWHEEL'] = {'apikey':your_api_key}  # <-- define your api key

with open(home_dir + '/configs/config.ini', 'w') as configFile: # <-- write to file!
    config.write(configFile)

```
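
A minimal sketch of the same write step that also creates the `configs` folder when it is missing and restricts the file to its owner. The helper name `write_flywheel_config` and the owner-only file mode are assumptions added for illustration; they are not part of the original notebook.

```python
import configparser
import os
import stat


def write_flywheel_config(home_dir, api_key):
    """Write the API key to <home_dir>/configs/config.ini (hypothetical helper)."""
    config_dir = os.path.join(home_dir, "configs")
    os.makedirs(config_dir, exist_ok=True)  # create configs/ if it is missing

    config = configparser.ConfigParser()
    config["UPENN-FLYWHEEL"] = {"apikey": api_key}

    config_path = os.path.join(config_dir, "config.ini")
    with open(config_path, "w") as config_file:
        config.write(config_file)

    # Assumption: owner-only read/write is a sensible default for a secret.
    os.chmod(config_path, stat.S_IRUSR | stat.S_IWUSR)
    return config_path
```

Calling `write_flywheel_config(home_dir, your_api_key)` would then replace the manual write shown above.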

_______

### Read Flywheel API secret into Python w/ ConfigParser

- Log in and navigate to https://upenn.flywheel.io/#/profile to find your API key


In [None]:
# Check whether the API key exists. If the output is False, run the next cell to create it
home_dir = "/home/YOUR_JANUS_USERNAME@asc.upenn.edu"

config = configparser.ConfigParser()
config.read(home_dir + "/configs/config.ini")
config.has_option('UPENN-FLYWHEEL', 'apikey')

In [None]:
# write your API key to the config file in your home directory (home_dir is set in the cell above)

your_api_key = "upenn.flywheel.io:YOUR_API_KEY"

config = configparser.ConfigParser()
config['UPENN-FLYWHEEL'] = {'apikey':your_api_key}
with open(home_dir + '/configs/config.ini', 'w') as configFile:
    config.write(configFile)

In [None]:
# read the UPenn Flywheel section of your config.ini into a dictionary
fw_cred = {}
config = configparser.ConfigParser()

config.read(home_dir + '/configs/config.ini') 
for item,value in config['UPENN-FLYWHEEL'].items():
    fw_cred[item]=value

In [None]:
# read your API key
api = fw_cred['apikey']
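
The read step can fail quietly, because `ConfigParser.read()` skips files that do not exist. The sketch below wraps the lookup so that a missing file or missing key raises a clear error. The helper name `load_api_key` is hypothetical.

```python
import configparser


def load_api_key(config_path, section="UPENN-FLYWHEEL", option="apikey"):
    """Return the API key from config_path or raise a descriptive error."""
    config = configparser.ConfigParser()
    # read() returns the list of files it actually parsed; an empty list
    # means the file was missing or unreadable.
    if not config.read(config_path):
        raise FileNotFoundError("No config file at {}".format(config_path))
    if not config.has_option(section, option):
        raise KeyError("Missing [{}] {} in {}".format(section, option, config_path))
    return config.get(section, option)
```

With this helper, `api = load_api_key(home_dir + '/configs/config.ini')` would stand in for the two cells above.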

### Confirm your access to Flywheel via python SDK

- The `fw.get_current_user()` command is a quick way to ensure you have established a secure connection to UPenn Flywheel

In [None]:
# Create client using your API key
fw = flywheel.Client(api)

In [None]:
# print your flywheel information & confirm it works as expected
user = fw.get_current_user()   # <--- named `user` rather than `self` to avoid confusion with method arguments
print('UPenn Flywheel User: %s %s (%s)' % 
      (user.firstname, user.lastname, user.email))

_______

## Navigate to Flywheel and note the identifiers in the URL

In this example, the notebook tests against a known session ID associated with Dr Lydon-Staley's AHA Lab:

- https://upenn.flywheel.io/#/projects/5ba2913fe849c300150d02ed/sessions/6088730ee6de2e3066bd7249
    - where the session ID is in the URL --> `6088730ee6de2e3066bd7249`
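
As a sketch of how the identifiers can be pulled out of such a URL, assuming Flywheel IDs are 24-character hexadecimal strings as in the example above. The helper name `parse_flywheel_url` is hypothetical.

```python
import re

# Assumption: project and session IDs are 24 lowercase hex characters.
_URL_PATTERN = re.compile(
    r"projects/([0-9a-f]{24})(?:/sessions/([0-9a-f]{24}))?"
)


def parse_flywheel_url(url):
    """Return the project ID and (if present) session ID found in a Flywheel URL."""
    match = _URL_PATTERN.search(url)
    if match is None:
        raise ValueError("No Flywheel project ID found in: {}".format(url))
    return {"project_id": match.group(1), "session_id": match.group(2)}


ids = parse_flywheel_url(
    "https://upenn.flywheel.io/#/projects/5ba2913fe849c300150d02ed"
    "/sessions/6088730ee6de2e3066bd7249"
)
print(ids["session_id"])
```

The extracted session ID could then be handed to the SDK (for example `fw.get_session(...)`, if your SDK version provides it) instead of a label-based lookup.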



### Set your Flywheel Project Container & Corresponding Local Out Project


In [None]:
# replace with name of Flywheel project container (e.g. "geoscan")
# in_project = "geoscan"
in_project = ''

# replace with output project folder name in fMRI server (e.g. "geoscanR01")
# out_project = "GS"
out_project = ""

### Set your session specific ID & corresponding out ID

- `opID` names the downloaded tarball and the participant folder on the server, so it can differ from the Flywheel label `ipID`

In [None]:
## MODIFY BELOW
# replace with ppt ID as listed on Flywheel (e.g. for geoscan, typically "gsXXX")
#ipID = "gs004"
ipID = ""

# replace with ppt ID as it will be stored in the server (e.g. "GSXXX")
#opID = "GS004"
opID = '' # <--- I think this could be whatever, so long as this is unique on the FMRI host
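
A small sketch for keeping the two IDs consistent, assuming the server-side ID is simply the upper-cased Flywheel ID, as in the `gs004` and `GS004` example above. The helper name `make_output_id` is hypothetical, and your lab's naming rule may differ.

```python
def make_output_id(flywheel_id):
    """Derive the server-side ID from the Flywheel participant ID."""
    cleaned = flywheel_id.strip()
    if not cleaned:
        # An empty ID would later produce paths like '<outpath>//' and '.tar'
        raise ValueError("Participant ID is empty; set ipID first.")
    # Assumption: the server convention is the upper-cased Flywheel label.
    return cleaned.upper()


print(make_output_id("gs004"))
```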

### Verify that output directory in the server is accurate

- You may need to create this directory ahead of time...

In [None]:
outpath = '/fmriDataRaw/fmri_data_raw/{PROJECT}'.format(PROJECT=out_project)

os.listdir(outpath)
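
If the directory does not exist yet, `os.listdir` raises an error. The sketch below creates it when needed and checks that the current user can write to it. The helper name `ensure_writable_dir` is hypothetical, and creating the folder yourself assumes you have permission on the parent directory.

```python
import os


def ensure_writable_dir(path):
    """Create path if needed, confirm it is writable, and return its contents."""
    os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    if not os.access(path, os.W_OK | os.X_OK):
        raise PermissionError("Cannot write to {}".format(path))
    return sorted(os.listdir(path))
```

Usage would be `ensure_writable_dir(outpath)` in place of the bare `os.listdir(outpath)` call.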

________

## Proceed with looking up your subject data & downloading Dicom tarball

**NOTE:** DICOMs on the server are stored under:

    - `/fmriDataRaw/fmri_data_raw/{PROJECT}/`

That is, untar the appropriate folder to `/fmriDataRaw/fmri_data_raw/{PROJECT}/`


### Flywheel uses `Group / Project / Subject / Session` to identify a scan

- the **group** is `falklab`

- the **project** is `bbprime` *(fw://unknown/Unsorted)*

- the **subject** is `bpp00` *(probably a default for the unsorted group)*

- the **session** is `CAMRIS^Falk`

#### Thus our lookup string is --> `'falklab/bbprime/bpp00/CAMRIS^Falk'` 

In [None]:
#group_label = 'falklab'
group_label = 'falklab'

#project_label = 'bbprime'
project_label = in_project # <-- values are set early on in the notebook... maybe that isn't helpful though?

#subject_label = 'bpp00'
subject_label = ipID # <-- values are set early on in the notebook... maybe that isn't helpful though?

session_label = 'CAMRIS^Falk'

######################################################

lookup_string = '{}/{}/{}/{}'.format(group_label,project_label,subject_label,session_label)
lookup_string
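
Because `in_project` and `ipID` start out as empty strings, it is easy to build a lookup string with empty segments. The sketch below joins the labels and rejects blanks. The helper name `build_lookup_path` is hypothetical.

```python
def build_lookup_path(*labels):
    """Join Flywheel labels with '/' and refuse empty segments."""
    blanks = [index for index, label in enumerate(labels) if not str(label).strip()]
    if blanks:
        raise ValueError("Empty label at position(s): {}".format(blanks))
    return "/".join(str(label) for label in labels)


print(build_lookup_path("falklab", "bbprime", "bpp00", "CAMRIS^Falk"))
```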

### Proceed with looking up the known session in the *Unsorted* project

Create a `session` object by looking up the session of interest, then confirm that its metadata is accurate.

- For a helpful video overview on finding your data on Flywheel w/ Python-SDK, I strongly encourage you to visit here:
https://docs.flywheel.io/hc/en-us/articles/360048440933-Webinar-Series-Finding-your-stuff-in-Flywheel-with-the-Python-SDK

*TODO --> CONTACT UPENN FLYWHEEL ADMIN TEAM TO FIGURE OUT LAB PROJECTS!*

In [None]:
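# the four-part lookup_string (group/project/subject/session) resolves a single session:
#session = fw.lookup('{}'.format(lookup_string))
# NOTE: the three-part path below stops at the subject level, so `session` holds the subject container
session = fw.lookup('{group}/{proj}/{pid}'.format(group=group_label,proj=in_project,pid=ipID))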
session
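
To confirm the metadata without reading the full object dump, a few fields can be collected into a dictionary. This is a sketch that assumes the returned container exposes `container_type`, `id`, `label`, and `timestamp` attributes; any that are missing come back as `None`. The helper name `summarize_container` is hypothetical.

```python
def summarize_container(container):
    """Collect a few identifying fields from a looked-up container."""
    # Assumption: these attribute names exist on the SDK object;
    # getattr falls back to None when one does not.
    fields = ("container_type", "id", "label", "timestamp")
    return {name: getattr(container, name, None) for name in fields}
```

Calling `summarize_container(session)` after the lookup gives a quick way to check that you resolved the container you expected.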

### Download the Flywheel Session tarball to FMRISrv

- Once we have the tarball, we can extract our DICOMs to the network storage


- *On running for Dr Lydon-Staley test subject, this tarball file is nearly 1GB*

#### What about the `./working_data` directory? 

*TODO --> Where does working data directory go? Is that just in the jupyterhub environment? does the tarball get deleted after or saved to the network in raw data?* 
*working data is still here! -AR*

In [None]:
!mkdir -p working_data

In [None]:
fw.download_tar(session,'./working_data/{opID}.tar'.format(opID=opID))
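
Since the tarball can approach 1 GB, it may be worth checking that the download finished before extracting. The sketch below confirms that the file exists, is a readable tar archive, and reports its size and member count. The helper name `check_tar_download` is hypothetical.

```python
import os
import tarfile


def check_tar_download(tar_path):
    """Return (size in MB, number of members) for a downloaded tarball."""
    if not os.path.isfile(tar_path):
        raise FileNotFoundError("No tarball at {}".format(tar_path))
    if not tarfile.is_tarfile(tar_path):
        raise ValueError("{} is not a readable tar archive".format(tar_path))
    size_mb = os.path.getsize(tar_path) / 1e6
    with tarfile.open(tar_path, mode="r") as tar:
        member_count = len(tar.getnames())
    return size_mb, member_count
```

For example, `check_tar_download('./working_data/{opID}.tar'.format(opID=opID))` can be run right after the download cell.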


## Extract contents of Flywheel tar download:

In the following cells, you will:

1. Load tarball into jupyterhub notebook memory space

2. Set your dicom out directory and confirm permissions

3. Loop through tarball `.getmembers()` and then extract zipped dicoms
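
The three steps above can be sketched as a single function that uses context managers so the tar and zip handles are always closed. It keeps the notebook's rule of extracting only members whose name contains `dicom.zip`. The helper name `extract_dicom_zips` is hypothetical, and the sketch assumes an uncompressed tar on disk.

```python
import os
import tarfile
import zipfile


def extract_dicom_zips(tar_path, output_dir):
    """Extract every zipped DICOM series in tar_path into output_dir."""
    extracted = []
    os.makedirs(output_dir, exist_ok=True)
    with tarfile.open(tar_path, mode="r:") as tar:
        for member in tar.getmembers():
            # Skip directories and anything that is not a zipped DICOM series
            if not member.isfile() or "dicom.zip" not in member.name:
                continue
            # extractfile() returns a seekable file-like object for an
            # uncompressed tar, which is what ZipFile needs.
            with tar.extractfile(member) as zipped, zipfile.ZipFile(zipped) as dicom_zip:
                dicom_zip.extractall(output_dir)
            extracted.append(member.name)
    return extracted
```

The returned list of member names gives a simple record of which series were unpacked.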

### Load into Memory:

In [None]:
f = open("working_data/{opID}.tar".format(opID=opID), 'rb') # <--- Flywheel download as Read Bytes
print('Opening tar as:',f,'\n')
tar_data = tarfile.open(fileobj=f, mode='r:') # <--- Open the uncompressed tar for reading (members are read on demand)

### Set and Create your Out Directory:

- JupyterHub does not respect secondary group permissions, so a directory I create defaults to the group FMRISrvUser1@asc.upenn.edu instead of FMRISrvAHAUsers@asc.upenn.edu. I correct this manually

In [None]:
output_dicom_dir = '{outpath}/{opID}/'.format(outpath=outpath,opID=opID)
print(output_dicom_dir)

In [None]:
# Create the directory if it does not exist
if not os.path.exists(os.path.dirname(output_dicom_dir)):
    try:
        print('makedirs --> {}'.format(output_dicom_dir))
        os.makedirs(os.path.dirname(output_dicom_dir))
    except OSError as err:   # <--- e.g. permission denied; avoid a bare except that hides other errors
        print('oops! failed to create --> {} ({})'.format(output_dicom_dir, err))

## Confirm permissions for out directory

### Had to make the outdir permission 777 -R

- Secondary group permissions are not respected in JupyterHub, so I had to manually change the group on the folder my user created:

```bash
sudo chgrp fmrisrvahausers@asc.upenn.edu -R /AHAData/fmri_data_raw/
```

In [None]:
ls -la $output_dicom_dir
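
The same check can be done from Python, which makes it easy to compare the group against the one you expect. This is a sketch for Unix-like hosts; the helper name `describe_permissions` is hypothetical.

```python
import grp
import os
import stat


def describe_permissions(path):
    """Return (mode string such as 'drwxrwx---', group name) for path."""
    info = os.stat(path)
    try:
        group_name = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        # The numeric group ID has no entry in the local group database
        group_name = str(info.st_gid)
    return stat.filemode(info.st_mode), group_name
```

Running `describe_permissions(output_dicom_dir)` would show whether the directory still carries your primary group.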

## EXTRACT YOUR TARBALL DICOM TO FMRISRV NETWORK STORAGE

In [None]:
for member in tar_data.getmembers():
    
    if 'dicom.zip' in member.name:       # <--- Only extract files with 'dicom.zip' 
        
        print('Extracting: {}\n'.format(member.name))
        
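        tfile = tar_data.extractfile(member)
        with zipfile.ZipFile(tfile, mode='r') as dicom_zip:   # <--- closes each zip once it is extracted
            dicom_zip.extractall(output_dicom_dir)
tar_data.close()
f.close()   # <--- also close the underlying file handle opened earlier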

### You have now successfully downloaded the dicom data from Flywheel to ASC servers

- this goes to `/fmriDataRaw/fmri_data_raw/bbprime/BPP00/`

In [119]:
os.listdir('{}'.format(output_dicom_dir))


['1.3.12.2.1107.5.2.43.66044.2021080512571528606660516.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080512581939991561266.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080512582373350662367.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080512582373389062373.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080512582373410162377.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080513014931893163223.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080513190841242164653.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080513270887350458671.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080513271927378459079.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080513361889387753097.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080513362938632153505.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.2021080513451211828847523.0.0.0.dicom',
 '1.3.12.2.1107.5.2.43.66044.30000021080414490281300000161.dicom']
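
To go one step beyond the listing, the sketch below counts the files under each extracted entry. It assumes each `.dicom` entry is a folder of image files, which is not confirmed by the output above; plain files are counted as one. The helper name `count_files_per_entry` is hypothetical.

```python
import os


def count_files_per_entry(root):
    """Map each top-level entry in root to the number of files it contains."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            # Walk the folder and add up the files at every level
            counts[entry] = sum(len(files) for _, _, files in os.walk(path))
        else:
            counts[entry] = 1
    return counts
```

An entry with a count of zero would be a hint that a series did not extract as expected.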

## Delete .tar files?

If you would like to delete the .tar files, open a terminal, navigate to this notebook's directory, and run the command below to delete the `working_data` folder and everything in it. You can either keep the tarballs or download them again from Flywheel as needed. *BE CAREFUL*: make sure you are pointing to the correct directory! This recursively deletes every file in the directory!

```bash
rm -r [path/to/DICOMS/working_data/]
```
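
A more conservative alternative, sketched in Python, removes only `*.tar` files directly inside the working directory and defaults to a dry run that just lists them. The helper name `remove_tar_files` and its `dry_run` flag are hypothetical.

```python
from pathlib import Path


def remove_tar_files(working_dir, dry_run=True):
    """List (and, when dry_run is False, delete) the .tar files in working_dir."""
    targets = sorted(Path(working_dir).glob("*.tar"))  # top level only, no recursion
    for tar_path in targets:
        if not dry_run:
            tar_path.unlink()
    return [str(tar_path) for tar_path in targets]
```

Call `remove_tar_files('working_data')` first to review the list, then repeat with `dry_run=False` to delete.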