# Downloading a full project's data from XNAT

#### Maria Yanez Lopez 2018 (maria.yanez-lopez@imperial.ac.uk)
#### ~ adapted for full project download Niall Bourke Feb 2019 (n.bourke@imperial.ac.uk)
### Documentation: 

https://github.com/pyxnat/pyxnat/blob/master/pyxnat/core/downloadutils.py

https://groups.google.com/forum/#!topic/xnat_discussion/K8h4VP4CBMg

https://gist.github.com/mattsouth/db8f2d09acf3c57ba605fa93c4e8d03e

https://ubuntuforums.org/showthread.php?t=786879

https://wiki.imperial.ac.uk/pages/viewpage.action?spaceKey=HPC&title=Jupyter

Version 2.0 ~ Niall Bourke  
Updated 28/02/2019  
 


This script downloads DICOM data from XNAT according to the user's specifications.


Pre-requisites:

- Think carefully about which data you need from XNAT
- Pyxnat 1.0.1 needs to be installed (with conda)

Scripts needed for it to work:

- The XDC function is required locally to download CSV files for indexing (this could also be done manually)
- The BIDS formatting is done with a bash function, saved in dependencies


### Import python libraries

In [1]:
import sys, os, getpass                           
from pyxnat import Interface

### Enter your XNAT login details (same as college credentials) and project folder (can use BIGBUCKET)

In [39]:
userName = raw_input('Type XNAT User Name: ')
passWord = getpass.getpass('Type XNAT Password: ')
projectID = raw_input('Type XNAT Project ID: ')
server = 'http://cif-xnat.hh.med.ic.ac.uk'

In [40]:
print 'INPUT'
print 'Server: ', server
print 'Username: ', userName
print 'Password: ', ''.join(['*']*len(passWord))
print 'ProjectID: ', projectID 


INPUT
Server:  http://cif-xnat.hh.med.ic.ac.uk
Username:  nbourke
Password:  *********
ProjectID:  TauTBI


### Create PYXNAT interface

In [41]:
central = Interface(server=server, user=userName, password=passWord)
subjects = central.select.project(projectID).subjects().get()
allSessions = []
number_subjects = 0

### Browse through project, collect subjects/sessions/scans and print subject labels

In [42]:
for i, subject in enumerate(subjects):
    label = central.select.project(projectID).subject(subject).label()
    print label, ('%i/%i' % (i+1, len(subjects)))
    sessions = central.select.project(projectID).subjects(subject).experiments().get()
    allSessions.append(sessions)

CIF1501 1/35
CIF1502 2/35
CIF1411 3/35
CIF1446 4/35
CIF1464 5/35
CIF0876 6/35
CIF0910 7/35
CIF0957 8/35
CIF0958 9/35
CIF0971 10/35
CIF1012 11/35
CIF1013 12/35
CIF1101 13/35
CIF1072 14/35
CIF1156 15/35
CIF1154 16/35
CIF1164 17/35
AJ_BRAINSTIM 18/35
CIF1389 19/35
CIF1467 20/35
CIF1474 21/35
CIF1476 22/35
CIF1492 23/35
CIF1230 24/35
CIF1257 25/35
CIF1285 26/35
CIF1286 27/35
CIF1324 28/35
CIF1331 29/35
CIF1348 30/35
CIF1512 31/35
CIF1374 32/35
CIF1375 33/35
CIF1379 34/35
CIF1551 35/35
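
The loop above keeps sessions in `allSessions`, a list that lines up with `subjects` by position only. As a minimal sketch, assuming the labels are also collected in a list, the sessions can be keyed by subject label so that later cells do not depend on list order. The helper names `index_sessions` and `subjects_without_sessions` are illustrative and not part of the original notebook, and the second subject in the example is made up to show the empty case.

```python
def index_sessions(labels, sessions_per_subject):
    """Map each subject label to its list of session IDs.

    Both arguments are parallel lists: labels[i] owns sessions_per_subject[i].
    """
    if len(labels) != len(sessions_per_subject):
        raise ValueError('labels and sessions_per_subject differ in length')
    return dict(zip(labels, sessions_per_subject))


def subjects_without_sessions(index):
    """Return the labels that have no session at all, sorted."""
    return sorted(label for label, sessions in index.items() if not sessions)


# Illustrative values only (hypothetical second subject with no sessions).
example_index = index_sessions(['CIF1501', 'EXAMPLE01'], [['CIF_E03899'], []])
missing = subjects_without_sessions(example_index)
```

Subjects listed in `missing` would be skipped silently by the download cell, so it may be worth checking them before starting a long download.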


## Modify the output directory, where the datasets from XNAT will be saved

In [43]:
dirName = os.path.join('/rds/general/project/c3nl_djs_imaging_data/live/data/raw/', projectID)
#print dirName

# Create the target directory if it doesn't exist
if not os.path.exists(dirName):
    os.mkdir(dirName)
    print("Directory " , dirName ,  " Created ")
else:    
    print("Directory " , dirName ,  " already exists")
    
Results_Dir = dirName # needs to exist or next cell will throw error

# The path is always the TBI group raw directory; data is downloaded to a folder named after the project


('Directory ', '/rds/general/project/c3nl_djs_imaging_data/live/data/raw/TauTBI', ' Created ')
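
The cell above uses `os.mkdir`, which fails if a parent directory is missing. A hedged alternative, sketched below, uses `os.makedirs` and treats an already existing directory as success. The helper name `ensure_dir` is an assumption for illustration, and the demo writes to a temporary directory rather than the RDS path. It is written to run under both Python 2.7 and Python 3.

```python
import errno
import os
import tempfile


def ensure_dir(path):
    """Create `path` and any missing parents; return True if it was created."""
    try:
        os.makedirs(path)
        return True
    except OSError as err:
        # An existing directory is fine; anything else (permissions, a file
        # in the way) is re-raised.
        if err.errno == errno.EEXIST and os.path.isdir(path):
            return False
        raise


# Demo in a throwaway location, not the real project directory.
demo_root = tempfile.mkdtemp()
demo_dir = os.path.join(demo_root, 'raw', 'TauTBI')
created_first = ensure_dir(demo_dir)   # parents are created as needed
created_again = ensure_dir(demo_dir)   # second call finds it already there
```

Catching the error instead of checking `os.path.exists` first also avoids a race if two notebooks create the same folder at once.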


### Download datasets
This cell loops over the subjects of the project selected above and downloads their scans. Check the printed output for duplicates and incomplete datasets.

In [44]:
for s, subjectID in enumerate(subjects):
    subjectLabel = central.select.project(projectID).subject(subjectID).label()

    for experimentID in allSessions[s]:
        try:
            print '\n%s %s' % (subjectLabel, experimentID)
            # Select every scan in this session and download it into Results_Dir
            scans = central.select.project(projectID).subject(subjectID).experiment(experimentID).scans()
            scans.download(Results_Dir, type='ALL', extract=False, removeZip=True)
            number_subjects += 1  # count the session only once its download succeeded
            # Stop after the first session that downloads successfully;
            # remove this `break` to download every session of the subject
            break
        except LookupError:
            # Raised when the session has no scans; try the subject's next session
            print("There are no scans to download")

print "The total number of scanning sessions downloaded is = " + str(number_subjects)



CIF1501 CIF_E03899

CIF1502 CIF_E03901

CIF1411 CIF_E03525

CIF1446 CIF_E03664

CIF1464 CIF_E03722

CIF0876 CIF_E01818

CIF0910 CIF_E01846

CIF0957 CIF_E02025

CIF0958 CIF_E02033

CIF0971 CIF_E02116

CIF1012 CIF_E02245

CIF1013 CIF_E02246

CIF1101 CIF_E02550

CIF1072 CIF_E02420

CIF1156 CIF_E02699

CIF1154 CIF_E02693

CIF1164 CIF_E02741

AJ_BRAINSTIM CIF_E02858

CIF1389 CIF_E03214

CIF1467 CIF_E03739

CIF1474 CIF_E03760

CIF1476 CIF_E03770

CIF1492 CIF_E03844

CIF1230 CIF_E02973

CIF1257 CIF_E03040

CIF1285 CIF_E03129

CIF1286 CIF_E03133

CIF1324 CIF_E03277

CIF1331 CIF_E03296

CIF1348 CIF_E03401

CIF1512 CIF_E03249

CIF1374 CIF_E02908

CIF1375 CIF_E02922

CIF1379 CIF_E03018

CIF1551 CIF_E03919
The total number of scanning sessions downloaded is = 35
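
Scanning the printed output by eye for duplicates gets tedious for larger projects. The sketch below, assuming the output has been copied into a string, parses the `LABEL SESSION` lines and reports any subject label or session ID that appears more than once. The function names are illustrative, and the third line of the sample log is a made-up repeat added to trigger the check. It covers duplicates only; incomplete datasets still need a manual look.

```python
from collections import Counter


def parse_log(text):
    """Turn 'LABEL SESSION' lines from the printed output into pairs."""
    pairs = []
    for line in text.splitlines():
        parts = line.split()
        # Only two-token lines are subject/session lines; status messages
        # and blank lines are skipped.
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
    return pairs


def find_repeats(pairs):
    """Return (repeated_labels, repeated_sessions), each sorted."""
    label_counts = Counter(label for label, _ in pairs)
    session_counts = Counter(session for _, session in pairs)
    repeated_labels = sorted(l for l, n in label_counts.items() if n > 1)
    repeated_sessions = sorted(s for s, n in session_counts.items() if n > 1)
    return repeated_labels, repeated_sessions


# Sample log: the last subject/session line is a hypothetical duplicate.
sample_log = (
    "CIF1501 CIF_E03899\n\n"
    "CIF1502 CIF_E03901\n\n"
    "CIF1501 CIF_E03899\n"
    "The total number of scanning sessions downloaded is = 3\n"
)
sample_pairs = parse_log(sample_log)
repeats = find_repeats(sample_pairs)
```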


### Sweet now we're rolling! 
To make life easy, all our lab's notebooks are going to assume a BIDS format.
As data curating can be a pain in the derrière, let's run a nice little function to sort that for us ;)

#### A CIF_config.json has been created to match MRI acquisitions and label them in the correct format. 


### The following XDC function can be used to pull project and subject information from XNAT
I have done this locally and saved an indexing function in c3nl_tools

## Extracting and indexing data from XNAT

#### 1: bids_preproc.sh  
    Indexes files downloaded from XNAT with more meaningful labels such as participant ID and scan session.  
    This sets up the initial file structure to run the conversion to BIDS  
    
#### 2: bidsProc2.sh  
    Loops over all subjects->sessions->modalities->scans and converts DICOMs to NIfTI.   
    The labels for each of the scans on the scan card are then converted to match the BIDS format and file structure  
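
The two bash scripts are not shown here, so the following is only a sketch of the general BIDS naming convention they target, not of what the scripts do. BIDS labels may contain letters and digits only, so a label such as `AJ_BRAINSTIM` has to be cleaned first. The function names, and the choice of the XNAT experiment ID as the session label, are assumptions made for illustration.

```python
import re


def bids_label(raw):
    """Strip everything except letters and digits, as BIDS labels require."""
    label = re.sub(r'[^0-9A-Za-z]', '', raw)
    if not label:
        raise ValueError('no alphanumeric characters in %r' % raw)
    return label


def bids_path(subject, session, datatype, suffix, ext='.nii.gz'):
    """Build the relative BIDS path for one converted scan."""
    sub = 'sub-' + bids_label(subject)
    ses = 'ses-' + bids_label(session)
    filename = '%s_%s_%s%s' % (sub, ses, suffix, ext)
    # BIDS layout: sub-<label>/ses-<label>/<datatype>/<filename>
    return '/'.join([sub, ses, datatype, filename])


# Hypothetical mapping of one session to an anatomical T1-weighted image.
example_path = bids_path('AJ_BRAINSTIM', 'CIF_E02858', 'anat', 'T1w')
```

Here `example_path` becomes `sub-AJBRAINSTIM/ses-CIFE02858/anat/sub-AJBRAINSTIM_ses-CIFE02858_T1w.nii.gz`. Which acquisition maps to which datatype and suffix is what `CIF_config.json` is described as defining.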