# Python packages

# TOC

* Introduction to Python packages
* The Python standard library
* Python packages for (neuro)science
* Anaconda distribution
* Installing packages

# Introduction to Python packages
* If you start a console, only very limited functionality is available (the built-in functions and types); everything else, including the rest of [the Python standard library](https://docs.python.org/3/library/), has to be imported
* Python packages/modules: 
    * Packages are either in the standard library (installed when python is installed) or 3rd party (need to be installed - see later)
    * Need to be loaded with `import` statements (usually at the beginning of a file)
    * **Namespace:** Functions from packages are called with `<package_name>.<function_name>` (cf. R)


## Using packages

In [1]:
import os
os.listdir()

['04_Basic_python.ipynb',
 '10_Plotting.slides.html',
 'requirements.txt',
 '01_Demos.slides.html',
 '09_Pandas_intro_detailed.ipynb',
 'images',
 '01_Demos.ipynb',
 '00_Resources.ipynb',
 '03_How_to_run_code.slides.html',
 '02_Packages.ipynb',
 '05_Debugging.slides.html',
 '06_Paths_and_files.ipynb',
 'README.md',
 '06_Paths_and_files.slides.html',
 '05_Debugging.ipynb',
 '04_Basic_python.slides.html',
 '07_Why_python.slides.html',
 '09_Pandas_intro.slides.html',
 '11_Play.ipynb',
 '.ipynb_checkpoints',
 '08_Advanced_basic_python.slides.html',
 '08_Advanced_basic_python.ipynb',
 '03_How_to_run_code.ipynb',
 '07_Why_python.ipynb',
 '11_Play.slides.html',
 'data',
 '09_Pandas_intro.ipynb',
 '10_Plotting.ipynb',
 '02_Packages.slides.html']

In [2]:
import numpy
numpy.array([1,2,3])

array([1, 2, 3])

### We can define aliases

In [3]:
import numpy as np
np.array([1,2,3])

array([1, 2, 3])

### We can import objects from a module

In [4]:
from numpy import median
median([1,1,2,3,4])

2.0

### This only imports the given object

**Running**

```
array([1,2,3])
```

**will result in**
```
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-11-27d393022d9f> in <module>()
----> 1 array([1,2,3])

NameError: name 'array' is not defined
```
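
The same behaviour can be checked without a traceback by catching the `NameError`. This is a minimal sketch that assumes the standard-library `statistics` module as a stand-in for `numpy`, so it runs without third-party packages. After `from statistics import median`, neither the sibling function `mean` nor the module name `statistics` is defined.

```python
from statistics import median  # binds only the name `median`

result = median([1, 1, 2, 3, 4])
print(result)  # 2

errors = []

try:
    mean([1, 2, 3])  # another function from the same module, never imported
except NameError as err:
    errors.append(str(err))

try:
    statistics.mean([1, 2, 3])  # the module name itself was not imported either
except NameError as err:
    errors.append(str(err))

print(errors)
```

To use both names, either import them explicitly (`from statistics import median, mean`) or import the module and call `statistics.mean(...)`.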

# The Python standard library
Contains all kinds of good stuff. For example:
* os: operating system tools
* re: regular expressions
* collections: useful data structures
* multiprocessing: simple parallelization tools
* pickle: serialization
* json: reading and writing JSON
* argparse: command-line argument parsing
* functools: functional programming tools
* datetime: date and time functions

etc. etc. etc.
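
As a rough illustration of how far the standard library alone goes, the sketch below combines three of the modules listed above. The trial labels and dates are made-up example data, not taken from the course material.

```python
import json
from collections import Counter
from datetime import date

# Hypothetical data: one condition label per trial
trials = ["go", "nogo", "go", "go", "nogo"]

# collections: count how often each label occurs
counts = Counter(trials)

# json: serialize the counts to a string (Counter is a dict subclass)
summary = json.dumps(counts, sort_keys=True)

# datetime: subtracting two dates gives a timedelta
gap = date(2024, 3, 1) - date(2024, 2, 1)

print(counts.most_common(1))  # [('go', 3)]
print(summary)                # {"go": 3, "nogo": 2}
print(gap.days)               # 29
```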

## Knowing a language well is a quick path to programming competence
* One boring `import` statement can save you hours of algorithmic brilliance
* E.g., regular expressions vs. plain string methods
    * Find all occurrences of subject_id: three uppercase letters immediately followed by three digits -- but only if they're not followed by additional digits
    * You can be a master of string parsing, or you can learn some regex basics
    
[Tal Yarkoni's slides](https://github.com/neurohackweek/python-tips-and-tricks/blob/master/python-tips-and-tricks.ipynb)

### Painful and unreliable approach that uses no imports.
### Note that we're cheating by using additional information that's in this particular string but may not generalize to others.

In [5]:
target = "Exp: 20344@L234342; begin{subjects}:AAA001--NaN//YYY843828--75%//GNEFE82--82%//BOO444--45%"

sub_part = target.split(':')[2]
subs = sub_part.split('//')
keep = []
for s in subs:
    s = s.split('--')[0]
    if len(s) >= 6 and s[:3].isalpha() and s[3:6].isdigit():
        # s[6] is the first character after the three digits
        if len(s) > 6 and s[6].isdigit():
            continue
        keep.append(s[3:6])

print(keep)

['001', '444']


### The correct approach--use regular expressions!

In [6]:
target = "Exp: 20344@L234342; begin{subjects}:AAA001--NaN//YYY843828--75%//GNEFE82--82%//BOO444--45%"

import re
print(re.findall(r'[A-Z]{3}(\d{3})(?!\d)', target))

['001', '444']
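
To make the pattern easier to read, it can be compiled with `re.VERBOSE`, which allows whitespace and comments inside the pattern. The sketch below is one possible way to write it. The negative lookahead `(?!\d)` is used here to express "not followed by another digit", and the name `subject_id` is only illustrative.

```python
import re

target = "Exp: 20344@L234342; begin{subjects}:AAA001--NaN//YYY843828--75%//GNEFE82--82%//BOO444--45%"

subject_id = re.compile(
    r"""
    [A-Z]{3}    # three uppercase letters
    (\d{3})     # capture exactly three digits
    (?!\d)      # negative lookahead: the next character must not be a digit
    """,
    re.VERBOSE,
)

ids = subject_id.findall(target)
print(ids)  # ['001', '444']
```

`YYY843828` is rejected because `843` is followed by another digit, and `GNEFE82` never has three digits directly after three letters.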


# Python packages for (neuro)science

## Scientific Python packages
- [`NumPy`](http://numpy.org) provides efficient storage and computation for multi-dimensional data arrays.
- [`SciPy`](http://scipy.org) contains a wide array of numerical tools such as numerical integration and interpolation.
- [`Pandas`](http://pandas.pydata.org) provides a DataFrame object along with a powerful set of methods to manipulate, filter, group, and transform data.
- [`Matplotlib`](http://matplotlib.org) provides a useful interface for creation of publication-quality plots and figures.
- [`Seaborn`](http://seaborn.pydata.org) builds on Matplotlib to make even prettier statistical plots.
- [`Altair`](https://altair-viz.github.io) makes other pretty plots from a declarative specification.
- [`Plotly`](https://plotly.com/python/) makes interactive pretty plots.
- [`Scikit-learn`](http://scikit-learn.org) provides a uniform toolkit for applying common machine learning algorithms to data.
- [`Jupyter`](http://jupyter.org) provides an enhanced terminal and an interactive notebook environment that is useful for exploratory analysis, as well as creation of interactive, executable documents.

[A Whirlwind Tour of Python. Introduction.](https://github.com/jakevdp/WhirlwindTourOfPython/blob/master/00-Introduction.ipynb)

## Neuroimaging Python packages
**Python as glue**

Reducing the amount of switching you do between languages reduces switching costs!


### Analysis

* [`Nipype`](http://nipype.readthedocs.io/en/latest/): Neuroimaging in Python: Pipelines and Interfaces
* [`Dipy`](https://dipy.org): Diffusion Imaging In Python
* [`NiBabel`](http://nipy.org/nibabel/): Access a cacophony of neuroimaging file formats
* [`MNE`](http://mne-tools.github.io/mne-python-intro/): M/EEG analysis

### Machine learning

* [`Nilearn`](http://nilearn.github.io): Machine learning for Neuro-Imaging in Python
* [`PyMVPA`](http://www.pymvpa.org): Multivariate Pattern Analysis in Python

### Visualization

* [`PySurfer`](http://pysurfer.github.io): visualizing cortical surfaces
* [`pycortex`](https://gallantlab.github.io/pycortex/): visualize fMRI or other volumetric neuroimaging data on cortical surfaces
* [`niwidgets`](http://nipy.org/niwidgets/): interactive neuroimaging plots


### Stimulus delivery & feature extraction

* [`PsychoPy`](http://www.psychopy.org): Psychology software in Python 
* [`pliers`](https://github.com/tyarkoni/pliers): Automated feature extraction

# Anaconda distribution
* Is a scientific Python distribution
* Includes the Python standard library
* Comes with many scientific packages (numpy, pandas, seaborn...), so we rarely need to install packages ourselves
* Can also install other languages, e.g. R
* Includes Anaconda Navigator, a graphical interface for launching applications and managing packages

# Installing packages

* Anaconda comes with package managers 
    * `conda`: python and others (e.g. R)
    * `pip`: python only
    
* Not all pip packages can be installed with conda, and vice versa.
* If a package is available via `conda`, I'd suggest using `conda`; else, use `pip`
 

## `pip`
* `pip install <package_name>`
* `pip install nibabel`


## `conda`
* `conda install <package_name>`
* `conda install scikit-learn`
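
After installing, it can be useful to check from within Python whether a package is actually importable in the current environment. A minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `is_installed` is made up for this example. Note that the import name can differ from the install name: `scikit-learn` is imported as `sklearn`.

```python
from importlib.util import find_spec


def is_installed(import_name):
    """Return True if a top-level module with this name can be imported."""
    return find_spec(import_name) is not None


# `os` is part of the standard library; the other two are third-party
for name in ["os", "nibabel", "sklearn"]:
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a package shows up as missing right after installation, the console is probably running a different Python environment than the one `conda` or `pip` installed into.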

# Exercise

* Open your terminal (or the Anaconda Prompt on Windows) and
* use `conda` to install the `nibabel` package

[Hint](https://www.google.com/search?q=conda+install+nibabel)