# Downloading a full project's data from XNAT

#### Maria Yanez Lopez 2018 (maria.yanez-lopez@imperial.ac.uk)
#### ~ adapted for full project download Niall Bourke Feb 2019 (n.bourke@imperial.ac.uk)
### Documentation: 

https://github.com/pyxnat/pyxnat/blob/master/pyxnat/core/downloadutils.py

https://groups.google.com/forum/#!topic/xnat_discussion/K8h4VP4CBMg

https://gist.github.com/mattsouth/db8f2d09acf3c57ba605fa93c4e8d03e

https://ubuntuforums.org/showthread.php?t=786879

https://wiki.imperial.ac.uk/pages/viewpage.action?spaceKey=HPC&title=Jupyter

Version 2.0 ~ Niall Bourke  
Updated 28/02/2019  
 
This script downloads DICOM data from XNAT according to the user's specifications.


### Import python libraries

In [None]:
import sys, os, getpass                           
from pyxnat import Interface

### Introduce your XNAT login details (same as college credentials) and project folder

In [None]:
userName = raw_input('Type XNAT User Name: ')
passWord = getpass.getpass('Type XNAT Password: ')
projectID = raw_input('Type XNAT Project ID: ')
server = 'http://cif-xnat.hh.med.ic.ac.uk'

In [None]:
print 'INPUT'
print 'Server: ', server
print 'Username: ', userName
print 'Password: ', '*' * len(passWord)
print 'ProjectID: ', projectID 


### Create pyxnat interface

In [None]:
central = Interface(server=server, user=userName, password=passWord)
subjects = central.select.project(projectID).subjects().get()
allSessions = []
number_subjects = 0

### Browse through project, collect subjects/sessions/scans and print subject labels

In [None]:
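for i, subject in enumerate(subjects):
    # Select the subject once and reuse it for both the label and the session list
    subjectObj = central.select.project(projectID).subject(subject)
    label = subjectObj.label()
    print label, ('%i/%i' % (i+1, len(subjects)))
    # Session (experiment) IDs for this subject; allSessions[i] matches subjects[i]
    sessions = subjectObj.experiments().get()
    allSessions.append(sessions)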
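
The browsing step above can also be wrapped in a small helper that returns the subject IDs, labels and session IDs together. This is a minimal sketch and not part of the original notebook: `collect_sessions` is a hypothetical name, and it assumes only the pyxnat calls already used above (`select.project`, `subjects().get()`, `subject().label()`, `experiments().get()`). It is written to run under Python 2 or 3.

```python
def collect_sessions(interface, project_id):
    """Return a list of (subject_id, subject_label, session_ids) tuples.

    `interface` is assumed to behave like the pyxnat Interface created above.
    """
    project = interface.select.project(project_id)
    collected = []
    for subject_id in project.subjects().get():
        subject = project.subject(subject_id)
        # One entry per subject, so the label and the sessions stay paired
        collected.append((subject_id, subject.label(), subject.experiments().get()))
    return collected
```

For example, `collect_sessions(central, projectID)` would return one tuple per subject, which avoids relying on two parallel lists staying in the same order.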

## Modify the output directory, where the datasets from XNAT will be saved

In [None]:
dirName = os.path.join('/rds/general/project/c3nl_djs_imaging_data/live/data/raw/', projectID)
#print dirName

# Create the target directory if it doesn't exist
if not os.path.exists(dirName):
    os.mkdir(dirName)
    print('Directory %s created' % dirName)
else:
    print('Directory %s already exists' % dirName)

Results_Dir = dirName # must exist, or the download cell will throw an error

# The path is always the TBI group raw directory; data are downloaded to a folder named after the project
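
`os.mkdir` only creates the last path component and fails if a parent folder is missing. As a hedged alternative, the sketch below creates any missing parents and tolerates the directory already existing. `ensure_dir` is a hypothetical helper name, not something defined elsewhere in this notebook, and the code is written to run under Python 2 or 3.

```python
import errno
import os

def ensure_dir(path):
    """Create `path` and any missing parents; return True if it was created."""
    try:
        os.makedirs(path)
        return True
    except OSError as exc:
        # An existing directory is fine; anything else is a real error
        if exc.errno == errno.EEXIST and os.path.isdir(path):
            return False
        raise
```

Calling `ensure_dir(dirName)` before setting `Results_Dir` would then replace the `if`/`else` check.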


### Download datasets
This cell loops over the subjects and sessions of the project specified above. Check the printed output for duplicates and incomplete datasets.

In [None]:
for s, subjectID in enumerate(subjects):
    subjectLabel = central.select.project(projectID).subject(subjectID).label()

    for experimentID in allSessions[s]:
        try:
            scans = central.select.project(projectID).subject(subjectID).experiment(experimentID).scans()
            print('\n%s %s' % (subjectLabel, experimentID))
            # Download all scans for this session into Results_Dir
            scans.download(Results_Dir, type='ALL', extract=False, removeZip=True)
            # Count the session only once its download has succeeded
            number_subjects += 1
            # Stop after the first session that downloads for this subject
            break
        except LookupError:
            # No scans found for this session; try the next one
            print('There are no scans to download')


print('The total number of scanning sessions downloaded is = ' + str(number_subjects))
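
To make it easier to spot incomplete datasets, the download step can also record which sessions were skipped. The sketch below is an illustration under stated assumptions, not the notebook's own method: `download_all_sessions` is a hypothetical name, it uses the same `scans().download(...)` call and arguments as the cell above, and unlike that cell it attempts every session rather than stopping at the first success for each subject. It is written to run under Python 2 or 3.

```python
def download_all_sessions(interface, project_id, subject_sessions, dest_dir):
    """Try to download every session; return (downloaded, skipped) lists.

    `subject_sessions` is an iterable of (subject_id, [experiment_id, ...]) pairs.
    """
    downloaded, skipped = [], []
    project = interface.select.project(project_id)
    for subject_id, experiment_ids in subject_sessions:
        for experiment_id in experiment_ids:
            scans = project.subject(subject_id).experiment(experiment_id).scans()
            try:
                scans.download(dest_dir, type='ALL', extract=False, removeZip=True)
            except LookupError:
                # Same condition the cell above reports as "no scans to download"
                skipped.append((subject_id, experiment_id))
            else:
                downloaded.append((subject_id, experiment_id))
    return downloaded, skipped
```

With the variables defined earlier, `download_all_sessions(central, projectID, zip(subjects, allSessions), Results_Dir)` would return both lists, and the `skipped` list shows which sessions need a closer look.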


## Sweet, now we're rolling!
To make life easy, all our lab's notebooks are going to assume a BIDS format.
As data curation can be a pain in the derrière, let's run a nice little function to sort that for us ;)

## Extracting and indexing data from xnat

#### 1: bids_1_preproc -i project
    Indexes files downloaded from XNAT with more meaningful labels such as participant ID and scan session.  
    This sets up the initial file structure to run the conversion to BIDS  
    
#### 2: bids_2_proc -i project  
    Loops over all subjects->sessions->modalities->scans and converts DICOMs to NIfTI.   
    The labels for each of the scans on the scan card are then converted to match the BIDS format and file structure  
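
To give a feel for the target layout, here is a minimal sketch of how a BIDS path is assembled from a subject label, session label, data type and suffix. It is not the implementation of `bids_2_proc`; `bids_path` is a hypothetical helper, and the example labels are made up. BIDS labels must be alphanumeric, so other characters are dropped. The code is written to run under Python 2 or 3.

```python
import re

def _clean(label):
    """Keep only the alphanumeric characters, as BIDS labels require."""
    return re.sub(r'[^A-Za-z0-9]', '', label)

def bids_path(subject, session, datatype, suffix, ext='.nii.gz'):
    """Build a relative BIDS path: sub-<label>/ses-<label>/<datatype>/<file>."""
    sub = 'sub-%s' % _clean(subject)
    ses = 'ses-%s' % _clean(session)
    filename = '%s_%s_%s%s' % (sub, ses, suffix, ext)
    return '/'.join([sub, ses, datatype, filename])
```

For instance, `bids_path('01', '01', 'anat', 'T1w')` gives `sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz`.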
    
## Dependencies

##### A CIF_config.json has been created to match MRI acquisitions and label them in the correct format. 
This may need to be updated if new sequences are being collected. 
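
The idea of matching acquisitions to BIDS labels can be sketched as a pattern lookup. The config layout below is purely hypothetical and is not the actual structure of `CIF_config.json`; the patterns, the keys (`pattern`, `datatype`, `suffix`) and the `match_series` helper are all assumptions made for illustration. The code is written to run under Python 2 or 3.

```python
import fnmatch
import json

# Hypothetical config: one entry per sequence, matched on the series description
EXAMPLE_CONFIG = json.loads('''
{
  "descriptions": [
    {"pattern": "*t1*mprage*", "datatype": "anat", "suffix": "T1w"},
    {"pattern": "*flair*", "datatype": "anat", "suffix": "FLAIR"},
    {"pattern": "*rest*", "datatype": "func", "suffix": "bold"}
  ]
}
''')

def match_series(series_description, config):
    """Return (datatype, suffix) for the first matching pattern, else None."""
    name = series_description.lower()
    for entry in config["descriptions"]:
        if fnmatch.fnmatchcase(name, entry["pattern"]):
            return entry["datatype"], entry["suffix"]
    return None
```

In a scheme like this, a newly collected sequence would return `None` until a matching entry is added, which is the kind of update the note above refers to.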

##### Index files
I have used XDC to pull metadata about scan labels from XNAT. This requires local setup. I have a copy of this function that runs through the TBI list. If your data is not in the imaging directory, let me know and I can add it and run it through.

The following XDC function can be used to pull project and subject information from XNAT.
I have done this locally and saved an indexing function in c3nl_tools.
