Commit

Merge pull request #112 from neuroscout/fetch_utils
ENH: Add high-level predictor fetching utilities
adelavega committed Dec 20, 2022
2 parents 56bd1a5 + 2d08071 commit a84996f
Showing 13 changed files with 426 additions and 9 deletions.
6 changes: 6 additions & 0 deletions docs/source/_autosummary/pyns.fetch_utils.fetch_images.rst
@@ -0,0 +1,6 @@
pyns.fetch\_utils.fetch\_images
===============================

.. currentmodule:: pyns.fetch_utils

.. autofunction:: fetch_images
@@ -0,0 +1,6 @@
pyns.fetch\_utils.fetch\_predictors
===================================

.. currentmodule:: pyns.fetch_utils

.. autofunction:: fetch_predictors
6 changes: 6 additions & 0 deletions docs/source/_autosummary/pyns.fetch_utils.get_paths.rst
@@ -0,0 +1,6 @@
pyns.fetch\_utils.get\_paths
============================

.. currentmodule:: pyns.fetch_utils

.. autofunction:: get_paths
6 changes: 6 additions & 0 deletions docs/source/_autosummary/pyns.fetch_utils.install_dataset.rst
@@ -0,0 +1,6 @@
pyns.fetch\_utils.install\_dataset
==================================

.. currentmodule:: pyns.fetch_utils

.. autofunction:: install_dataset
33 changes: 33 additions & 0 deletions docs/source/_autosummary/pyns.fetch_utils.rst
@@ -0,0 +1,33 @@
pyns.fetch\_utils
=================

.. automodule:: pyns.fetch_utils







.. rubric:: Functions

.. autosummary::
:toctree:

fetch_images
fetch_predictors
get_paths
install_dataset













3 changes: 2 additions & 1 deletion docs/source/api.rst
@@ -7,4 +7,5 @@ API
:recursive:

pyns.api
pyns.endpoints
pyns.endpoints
pyns.fetch_utils
115 changes: 115 additions & 0 deletions docs/source/fetching.rst
@@ -0,0 +1,115 @@
Fetching predictors & images
=============================

To facilitate creating custom analysis workflows, `pyNS` provides a number of high-level utilities for fetching
predictors from the Neuroscout API, and the corresponding images from the preprocessed BIDS dataset.

.. note::

Analysis pipelines created using these utilities will not be centrally registered on Neuroscout, and
will not be available to other users by the Neuroscout API or web interface.

If your analysis type is supported by `Neuroscout-CLI <https://neuroscout-cli.readthedocs.io/en/latest/>`_
(e.g. summary statistics GLM), it is recommended to use the
web interface to create your analysis, or to follow the guide for :doc:`analyses` using pyNS.

If you use these data in a publication, please cite the following paper:

Alejandro de la Vega, Roberta Rocca, Ross W Blair, Christopher J Markiewicz, Jeff Mentch, James D Kent, Peer Herholz, Satrajit S Ghosh, Russell A Poldrack, Tal Yarkoni (2022). *Neuroscout, a unified platform for generalizable and reproducible fMRI research*. eLife 11:e79277
https://doi.org/10.7554/eLife.79277

In addition, please cite the original dataset(s), and the predictor extractors you use.


--------------------------------------
Fetching & re-sampling predictor data
--------------------------------------

The function :func:`pyns.fetch_utils.fetch_predictors` can be used to fetch predictor data,
resample it to the TR of the images, and return it as a pandas DataFrame.

You only need two things: a list of predictors, and the name of the BIDS dataset.
Optionally, you can also restrict the data to a subset of subjects, runs or tasks (recommended for testing).

.. code-block:: python

    from pyns.fetch_utils import fetch_predictors

    fetch_predictors(predictor_names=['speech', 'rms'], dataset_name='Budapest',
                     subject='sid000005', run=[1, 2], resample=True, rescale=False)

+----+---------+------------+--------------+--------------+-------+-----------+----------+
| | onset | duration | speech | rms | run | subject | run_id |
+====+=========+============+==============+==============+=======+===========+==========+
| 0 | 0 | 1 | 9.5801e-06 | 6.18876e-07 | 1 | sid000005 | 1433 |
+----+---------+------------+--------------+--------------+-------+-----------+----------+
| 2 | 1 | 1 | -2.57011e-05 | -1.49298e-06 | 1 | sid000005 | 1433 |
+----+---------+------------+--------------+--------------+-------+-----------+----------+
| 4 | 2 | 1 | 6.755e-05 | 3.50004e-06 | 1 | sid000005 | 1433 |
+----+---------+------------+--------------+--------------+-------+-----------+----------+
| 6 | 3 | 1 | -0.000173993 | -7.91888e-06 | 1 | sid000005 | 1433 |
+----+---------+------------+--------------+--------------+-------+-----------+----------+
| 8 | 4 | 1 | 0.000439006 | 1.70871e-05 | 1 | sid000005 | 1433 |
+----+---------+------------+--------------+--------------+-------+-----------+----------+


This returns a pandas DataFrame with the predictors resampled to the TR (in this case 0.33s),
along with `onset` and `duration` columns. In addition, columns for the entities identifying each row
(e.g. `subject`, `run`) are included.
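
As a rough illustration of working with a table of this shape, the sketch below splits a long-format DataFrame into one design matrix per subject and run. The toy values are invented for the example and only mimic the column layout shown above; this is one possible approach, not part of pyNS itself.

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by fetch_predictors
# (values are made up; only the column layout follows the docs).
predictors = pd.DataFrame({
    "onset": [0, 1, 2, 0, 1, 2],
    "duration": [1, 1, 1, 1, 1, 1],
    "speech": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "rms": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "run": [1, 1, 1, 2, 2, 2],
    "subject": ["sid000005"] * 6,
})

# One design matrix per (subject, run), ordered by onset,
# keeping only the predictor columns.
design_matrices = {
    key: group.sort_values("onset")[["speech", "rms"]].reset_index(drop=True)
    for key, group in predictors.groupby(["subject", "run"])
}
```

Each value in `design_matrices` has one row per volume, so it can be paired with the image for the same subject and run.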

Note that you can choose to rescale the predictors to have a mean of 0 and standard deviation of 1, by setting
`rescale=True`. This operation will occur prior to densification and resampling of variables.
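
For intuition, rescaling to a mean of 0 and a standard deviation of 1 can be sketched with plain pandas as below. This only illustrates the arithmetic on a toy table: it does not reproduce the order of operations used by `rescale=True`, and the choice of the population standard deviation (`ddof=0`) is an assumption made for the example.

```python
import pandas as pd

# Toy predictor values (invented for illustration).
df = pd.DataFrame({
    "speech": [0.0, 1.0, 1.0, 0.0],
    "rms": [0.1, 0.4, 0.3, 0.2],
})
predictor_cols = ["speech", "rms"]

# Subtract each column's mean and divide by its standard deviation.
# ddof=0 (population SD) is an assumption of this sketch.
means = df[predictor_cols].mean()
stds = df[predictor_cols].std(ddof=0)
standardized = (df[predictor_cols] - means) / stds
```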

It's also possible to retrieve a `BIDSRunVariableCollection` (`return_type='collection'`), which can be used to
apply further transformations to the data.

.. note::

To learn about low-level utilities for fetching predictors, see the :doc:`querying` documentation.

-----------------------------
Fetching preprocessed images
-----------------------------

.. note::

DataLad is required to download images. See the `DataLad documentation <https://handbook.datalad.org>`_
for installation instructions.

The function :func:`pyns.fetch_utils.fetch_images` facilitates downloading preprocessed images from the
Neuroscout datasets. It can be used to download images for a single subject, or for all subjects in a
dataset.

Simply provide a directory where Neuroscout datasets should be installed, and the dataset name.
Optionally, you can also restrict the data to a subset of subjects, runs or tasks (recommended for testing).

.. code-block:: python

    from pyns.fetch_utils import fetch_images

    preproc_dir, img_paths = fetch_images('Budapest', '/tmp/', subject='sid000005')
    img_paths[0]
    # <BIDSImageFile filename='/tmp/Budapest/fmriprep/sub-sid000005/func/sub-sid000005_task-movie_run-1_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz'>

:func:`pyns.fetch_utils.fetch_images` installs the dataset using datalad, and returns the path to the
preprocessed dataset, as well as a list of `BIDSImageFile` objects for each image.

The `BIDSImageFile` objects can be used to load the images into memory using `nibabel <https://nipy.org/nibabel/>`_,
and can be used to extract metadata about the image, such as the associated entities:

.. code-block:: python

    target = img_paths[0]
    img = target.get_image()
    target.get_entities()
    {'datatype': 'func',
     'desc': 'preproc',
     'extension': '.nii.gz',
     'run': 1,
     'space': 'MNI152NLin2009cAsym',
     'subject': 'sid000005',
     'suffix': 'bold',
     'task': 'movie'}

Using these methods you can easily create custom analysis workflows.
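
One way to connect the two utilities is to use an image's entities to select the matching predictor rows. The sketch below uses a plain dictionary and a toy DataFrame in place of real pyNS output; the helper `rows_for_image` is hypothetical and written only for this illustration.

```python
import pandas as pd

# Entities as returned by get_entities() in the example above (abridged).
entities = {"subject": "sid000005", "run": 1, "task": "movie", "suffix": "bold"}

# Toy predictor table (values invented for illustration).
predictors = pd.DataFrame({
    "onset": [0, 1, 0, 1],
    "speech": [0.1, 0.2, 0.3, 0.4],
    "run": [1, 1, 2, 2],
    "subject": ["sid000005"] * 4,
})


def rows_for_image(df, image_entities, keys=("subject", "run")):
    """Hypothetical helper: keep rows whose entity columns match the image."""
    mask = pd.Series(True, index=df.index)
    for key in keys:
        # Only compare on entities present in both the table and the image.
        if key in df.columns and key in image_entities:
            mask &= df[key] == image_entities[key]
    return df[mask]


matched = rows_for_image(predictors, entities)
```

Here `matched` holds only the rows for subject `sid000005`, run 1, which can then be aligned with the volumes of that image.
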
18 changes: 12 additions & 6 deletions docs/source/index.rst
@@ -1,15 +1,20 @@
Welcome to pyNS's documentation!
===================================
pyNS: Neuroscout API client documentation
=========================================

.. image:: neuroscout-logo.svg
:width: 400
:alt: Neuroscout Logo


**pyNS** is the Python client library for accessing the `Neuroscout API <https://neuroscout.org/api>`_

**pyNS** enables advanced used cases not supported by the `neuroscout.org <https://neuroscout.org>`_
web-based analysis builder, such as batch-creation of analyses, or meta-analytic applications.
**pyNS** is the Python client for the `Neuroscout API <https://neuroscout.org/api>`_, allowing users
to programmatically query and interact with the Neuroscout database. This allows users to
create analyses, query for analyses, and download analysis results.

**pyNS** provides a number of high-level functions for common tasks (that would typically require
multiple API calls), such as creating and registering analyses, and fetching predictor and imaging data directly.

Advanced use cases include: batch-creation of analyses (e.g. for meta-analysis) and the
creation of custom analysis pipelines.

**pyNS** mirrors the official Neuroscout API with a Pythonic interface.
Note that the best reference for the API is the official `API docs <https://neuroscout.org/api>`_
@@ -30,4 +35,5 @@ Contents
quickstart
querying
analyses
fetching
api
4 changes: 4 additions & 0 deletions docs/source/querying.rst
@@ -84,6 +84,10 @@ Under the hood, `pyNS` looks up the ``dataset_id`` and ``task_id`` for the given
Getting the predictor data
----------------------------------

.. note::

High-level utilities are available to facilitate this process. See the :doc:`fetching` documentation.

An important aspect of `pyNS` is the ability to retrieve moment by moment events for specific predictors.

The simplest way is to simply use ``predictor_id`` to query for a specific Predictor, for a specific ``run_id``:
2 changes: 2 additions & 0 deletions optional_requirements.txt
@@ -0,0 +1,2 @@
pybids
pandas
2 changes: 1 addition & 1 deletion pyns/__init__.py
@@ -8,7 +8,7 @@
from .api import Neuroscout
from . import endpoints

__all__ = ['Neuroscout', 'endpoints']
__all__ = ['Neuroscout', 'endpoints', 'fetch_utils']

__author__ = ['Alejandro de la Vega']
__license__ = 'MIT'
2 changes: 1 addition & 1 deletion pyns/endpoints/base.py
@@ -152,7 +152,7 @@ def _id_to_entities(df):
else:
names = {
r: endpoint.get(r)['name']
for r in df[col].unique()
for r in df[col].dropna().unique()
}
df[col.replace('_id', '_name')] = df[col].map(names)
return df
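
The effect of the `.dropna()` change can be sketched in isolation. In the example below, `lookup_name` is a hypothetical stand-in for `endpoint.get(r)['name']`, assumed here to fail on a missing ID; the real endpoint's behavior on such input is not shown in this diff.

```python
import numpy as np
import pandas as pd


def lookup_name(record_id):
    """Hypothetical stand-in for endpoint.get(r)['name']."""
    if pd.isna(record_id):
        raise ValueError("cannot look up a missing ID")
    return {1: "movie", 2: "rest"}[int(record_id)]


df = pd.DataFrame({"task_id": [1.0, 2.0, np.nan, 1.0]})

# unique() keeps NaN as one of the values, so a lookup would be attempted for it.
ids_with_nan = df["task_id"].unique()

# Dropping missing values first restricts the lookups to real IDs.
ids_clean = df["task_id"].dropna().unique()

names = {r: lookup_name(r) for r in ids_clean}

# Rows with a missing ID simply get a missing name after mapping.
df["task_name"] = df["task_id"].map(names)
```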