Merge remote-tracking branch 'origin/master'

* origin/master:
  DOC: Bump version to 2.2.0
  DOC: Changelog draft for 2.2.0 release.
  BF+DOC: Add missing modules to the module reference.
  DOC: Changelog draft for 2.2.0 release.
  DOC: Integrate IPython notebooks into the web page.
  ENH: Verbosity for Mr. H.
  DOC: A new start for the stats tutorial.
  ENH: More debug
  ENH: Allow for access of fitted distributions in MCNullDist
  RF: Do not use deprecated function (self-deprecated even!)
  BF+DOC: Minor TODO in MDP-adaptor code addressed
  DOC: Advertise contributions oportunity on frontpage
  DOC: Added new dataset sources to the modref.
  NF: Wrap sklearn's dataset generators/loaders
  DOC: Few more bits for the tutorial
  DOC: Various fixes/changes to the tutorial.
2 parents 549056f + 258a72f commit 967dcf6c9888f0075d58ef8d2353e748286644a2 @yarikoptic yarikoptic committed Sep 16, 2012
@@ -31,23 +31,9 @@ tracking systems accordingly and can be queried by visiting the URLs::
Releases
========
-* upcoming ()
+* 2.2.0 (Sun, Sep 16 2012)
- * Fixes
-
- - HDF5 now properly stores object-type ndarray, where it the array shape
- was unintentionally modified on-load before (Fixes #84).
- - HDF5 can now reconstruct 'builtin' objects (Fixes #86).
-
- * Enhancements
-
- - Initial Python 3 compatibility (spear-headed by Tiziano Zito).
- - Bayesian hypothesis testing with
- :class:`~mvpa2.clfs.transerror.BayesConfusionHypothesis` now supports
- literal hypotheses specification, custom hypotheses subsets, and
- computing of posterior probabilities.
-
- * New functionality
+ * New functionality (14 commits)
- New HDF5-based storage backend for
   :class:`~mvpa2.measures.searchlight.Searchlight` that can significantly
@@ -57,7 +43,57 @@ Releases
:class:`~mvpa2.measures.nnsearchlight.M1NNSearchlight` (and
helper :func:`~mvpa2.measures.nnsearchlight.sphere_m1nnsearchlight`) to
run mean-1-nearest-neighbor searchlights.
+ - New mappers for adding an axis to a dataset
+ (:class:`~mvpa2.mappers.shape.AddAxisMapper`), and for transposing a
+ dataset (:class:`~mvpa2.mappers.shape.TransposeMapper`).
+ - Improved implementation of SciPy's :func:`~mvpa2.misc.stats.ttest_1samp`
+ with support for masked arrays and alternative hypotheses.
+ - Individual tutorial chapters are now available for download as IPython
+ notebooks. A ``rst2ipynb`` converter is available in ``tools/``.
+ - New ``pymvpa2-tutorial`` command line utility to start a PyMVPA tutorial
+ session, either in a console IPython session, or using the IPython
+ notebook server.
+ - New wrapper functions for data generators/loaders in ``sklearn.datasets``,
+ available in :mod:`mvpa2.datasets.sources.sklearn_data`.
+
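The improved ``ttest_1samp`` listed above computes the standard one-sample t statistic; as a quick reference, here is a NumPy-only sketch of that statistic (an illustration, not PyMVPA's actual implementation):

```python
import numpy as np

def t_1samp(x, popmean=0.0):
    """One-sample t statistic: t = (mean(x) - popmean) / (s / sqrt(n)),
    where s is the unbiased sample standard deviation."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s = x.std(ddof=1)                     # ddof=1 -> unbiased estimate
    return (x.mean() - popmean) / (s / np.sqrt(n))
```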
+ * Enhancements (89 commits)
+
+ - Initial Python 3 compatibility (spear-headed by Tiziano Zito).
+ - Bayesian hypothesis testing with
+ :class:`~mvpa2.clfs.transerror.BayesConfusionHypothesis` now supports
+ literal hypotheses specification, custom hypotheses subsets, and
+ computing of posterior probabilities.
+ - Allow for accessing fitted distributions in
+ :class:`~mvpa2.clfs.stats.MCNullDist`.
+ - Extensions and improvements to the tutorial chapter on statistical
+ evaluation.
+ - Expose distance function as a property `dfx` of
+ :class:`~mvpa2.clfs.knn.kNN`.
+ - Extended :class:`~mvpa2.generators.base.Sifter` with ability to discard
+ unbalanced partitions.
+ - :func:`~mvpa2.base.hdf5.h5save` now creates missing directories
+ automatically by default.
+ - Dedicated training for
+ :class:`~mvpa2.algorithms.hyperalignment.Hyperalignment`, and new
+ auto-train capability.
+ - :class:`~mvpa2.clfs.transerror.BayesConfusionHypothesis` now computes
+ optional posterior probabilities, and supports hypothesis definitions
+ using literal labels.
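The ``h5save`` enhancement above (creating missing directories automatically) follows a common pattern; a minimal stand-alone sketch with a hypothetical helper name, not PyMVPA's actual code:

```python
import os

def save_text(path, text):
    """Create any missing parent directories, then write the file
    (mirrors the default h5save behavior described above)."""
    parent = os.path.dirname(path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    with open(path, 'w') as f:
        f.write(text)
```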
+
+  * API changes
+
+ - All command line tools have been renamed to have a consistent 'pymvpa2-'
+ prefix.
+
+ * Fixes (77 commits)
+
+  - HDF5 now properly stores object-type ndarrays, where the array shape
+    was unintentionally modified on load before (Fixes #84).
+ - HDF5 can now reconstruct 'builtin' objects (Fixes #86).
+ - Check value data type and convert to float when collecting performance
+ statistics to avoid numerical problems.
+ - Do not fail in :class:`~mvpa2.clfs.transerror.BayesConfusionHypothesis`
+ when a dataset does not provide class labels.
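The float-conversion fix listed above guards against the classic integer-division pitfall (Python 2's ``/`` truncates for ints); a minimal illustration:

```python
# Under Python 2 semantics, 7 / 10 with ints truncates to 0, silently
# zeroing out accuracy statistics; casting to float first avoids it.
correct, total = 7, 10
truncated = correct // total          # integer division: 0
accuracy = float(correct) / total     # 0.7
```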
* 2.1.0 (Fri, June 29 2012)
@@ -7,7 +7,7 @@ DOC_DIR=$(CURDIR)/doc
TUT_DIR=$(CURDIR)/datadb/tutorial_data/tutorial_data
DOCSRC_DIR=$(DOC_DIR)/source
DOCBUILD_DIR=$(BUILDDIR)/doc
-NOTEBOOKBUILD_DIR=$(BUILDDIR)/notebooks
+NOTEBOOKBUILD_DIR=$(HTML_DIR)/notebooks
MAN_DIR=$(BUILDDIR)/man
APIDOC_DIR=$(HTML_DIR)/api
PDF_DIR=$(BUILDDIR)/pdf
@@ -201,7 +201,7 @@ mpl-stamp: build
echo "backend : Agg" >| $(CURDIR)/build/matplotlibrc
touch $@
-htmldoc: examples2rst build pics mpl-stamp
+htmldoc: examples2rst build pics mpl-stamp tutorial2notebooks
@echo "I: Creating an HTML version of documentation"
cd $(DOC_DIR) && MVPA_EXTERNALS_RAISE_EXCEPTION=off \
PYTHONPATH=$(CURDIR):$(PYTHONPATH) \
@@ -240,13 +240,15 @@ examples2rst-stamp: mkdir-DOCBUILD_DIR
touch $@
tutorial2notebooks: tutorial2notebooks-stamp
-tutorial2notebooks-stamp: mkdir-NOTEBOOKBUILD_DIR
+tutorial2notebooks-stamp:
+ mkdir -p $(NOTEBOOKBUILD_DIR)
tools/rst2ipnbpy \
--baseurl http://pymvpa.org \
--apiref_baseurl http://pymvpa.org/generated \
--glossary_baseurl http://pymvpa.org/glossary.html \
--outdir $(NOTEBOOKBUILD_DIR) \
--exclude doc/source/tutorial_prerequisites.rst \
+ --verbose \
doc/source/tutorial_*.rst
touch $@
@@ -17,7 +17,7 @@
This example will demonstrate how to embed MDP_'s flows_ into a PyMVPA-based
analysis. We will perform a classification of a large number of images of
handwritten digits from the :ref:`MNIST <datadb_mnist>` database. To get a
-better sense on how MDP blends into PyMVPA, we will do the same analysis with
+better sense of how MDP blends into PyMVPA, we will do the same analysis with
MDP only first, and then redo it in PyMVPA -- only using particular bits from
MDP.
@@ -88,6 +88,18 @@ News
</script>
+Contributing
+============
+
+We welcome all kinds of contributions, and you do **not need to be a
+programmer** to contribute! If you have a missing feature in mind, an example
+use case that you want to share, a typo you spotted in the documentation, or
+an idea how to improve the user experience altogether -- do not hesitate to
+:ref:`contact us <chap_support>`. We will then figure out how your
+contribution can be best incorporated. Any contributor will be acknowledged
+and will appear in the list of people who have helped to develop PyMVPA on
+the front page of `pymvpa.org <http://www.pymvpa.org>`_.
+
License
=======
@@ -64,6 +64,7 @@ Datasets: Input, Output, Storage and Preprocessing
datasets.formats
datasets.mri
datasets.miscfx
+ datasets.sources.sklearn_data
Mappers: Data Transformations
@@ -85,6 +86,7 @@ Mappers: Data Transformations
mappers.procrustean
mappers.projection
mappers.prototype
+ mappers.shape
mappers.slicing
mappers.som
mappers.svd
@@ -99,6 +101,7 @@ Generators: Repetitive Data Processing
:toctree: generated
generators
+ generators.base
generators.partition
generators.permutation
generators.resampling
@@ -37,7 +37,9 @@ Through the course of the tutorial we would analyze :ref:`real BOLD fMRI data
tutorial, you need to download the :ref:`corresponding data from the PyMVPA
website <datadb_tutorial_data>`. Once downloaded, extract the tarball, open a
terminal, go into the directory with the extracted tarball content and run:
-:command:`./start_tutorial_session.sh`.
+:command:`./start_tutorial_session.sh`. On a NeuroDebian-enabled system,
+the tutorial data is also available from the ``python-mvpa2-tutorialdata``
+package.
If you want to prevent yourself from re-typing all code snippets into the
terminal window, you might want to investigate IPython's ``%cpaste``
@@ -14,6 +14,12 @@
Part 4: Classifiers -- All Alike, Yet Different
***********************************************
+.. note::
+
+ This tutorial part is also available for download as an `IPython notebook
+ <http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html>`_:
+ [`ipynb <notebooks/tutorial_classifiers.ipynb>`_]
+
This is already the second time that we will engage in a classification
analysis, so let's first recap what we did before in the :ref:`first tutorial
part <chap_tutorial_start>`:
@@ -36,7 +42,7 @@ a cross-validated classification analysis.
5. Inspect results
Our previous choice of the classifier was guided by the intention to
-replicate the :ref:`Haxby et al. (2001) <HGF+01>`, but what if we want to
+replicate :ref:`Haxby et al. (2001) <HGF+01>`, but what if we want to
try a different algorithm? In this case a nice feature of PyMVPA comes into
play. All classifiers implement a common interface that makes them easily
exchangeable without the need to adapt any other part of the analysis code.
@@ -50,7 +56,7 @@ If, for example, we want to try the popular :mod:`support vector machine <mvpa2.
0.1875
Instead of k-nearest-neighbor, we create a linear SVM classifier,
-internally using popular LIBSVM library (note that PyMVPA provides
+internally using the popular LIBSVM library (note that PyMVPA provides
additional SVM implementations). The rest of the code remains identical.
SVM with its default settings seems to perform slightly worse than the
simple kNN-classifier. We'll get back to the classifiers shortly. Let's
@@ -148,7 +154,7 @@ the results of the cross-validation analysis a bit further.
The returned value is actually a `~mvpa2.datasets.base.Dataset` with the
results for all cross-validation folds. Since our error function computes
-only a single scalar value for each fold the dataset only contain a single
+only a single scalar value for each fold, the dataset only contains a single
feature (in this case the accuracy), and one sample per fold.
..
@@ -14,6 +14,12 @@
Part 2: Dataset Basics and Concepts
***********************************
+.. note::
+
+ This tutorial part is also available for download as an `IPython notebook
+ <http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html>`_:
+ [`ipynb <notebooks/tutorial_datasets.ipynb>`_]
+
A `~mvpa2.datasets.base.Dataset` is the basic data container in PyMVPA. It
serves as the primary form of input data storage, but also as container for
more complex results returned by some algorithm. In this tutorial part we will
@@ -253,8 +259,8 @@ array([[ 1, 1, -1],
.. _NumPy documentation: http://docs.scipy.org/doc/
-All three slicing-styles equally applicable to the selection of feature subsets
-within a dataset. Remember, features are represented on the second axis
+All three slicing-styles are equally applicable to the selection of feature
+subsets within a dataset. Remember, features are represented on the second axis
of a dataset.
>>> ds[:, [1,2]].samples
@@ -266,7 +272,7 @@ array([[ 1, -1],
By applying a selection by indices to the second axis, we can easily get
the last two features of our example dataset. Please note the `:` is supplied
as first axis slicing. This is the Python way to indicate *take everything
-along this axis*, hence take all samples.
+along this axis*, hence including all samples.
As you can guess, it is also possible to select subsets of samples and
features at the same time.
@@ -284,7 +290,7 @@ array([1, 0])
The above code applies the same slicing directly to the NumPy array with
the samples, and the result is fundamentally different. For NumPy arrays
-the style of slicing allows to select specific elements by their indices on
+this style of slicing selects specific elements by their indices on
each axis of an array. For PyMVPA's datasets this mode is not very useful,
instead we typically want to select rows and columns, i.e. samples and
features given by their indices.
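The difference can be demonstrated with plain NumPy (using a small stand-in array rather than the tutorial dataset): fancy indexing with two lists pairs the indices element-wise, while ``np.ix_`` builds the row-by-column selection that dataset slicing performs:

```python
import numpy as np

samples = np.array([[ 1,  1, -1],
                    [-1,  1,  1],
                    [ 1, -1,  1]])

# Element-wise pairing: picks elements (0, 1) and (1, 2) -> a 1-D result.
elementwise = samples[[0, 1], [1, 2]]

# Open-mesh selection: rows 0, 1 crossed with columns 1, 2 -> a 2x2 block,
# analogous to selecting samples and features in a dataset.
block = samples[np.ix_([0, 1], [1, 2])]
```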
@@ -376,7 +382,7 @@ explore this dataset a little further.
Besides samples, the dataset offers a number of attributes that enhance the
data with information that is present in the NIfTI image header of the file. Each sample has
-information about its volume id in the time series and the actual acquisition
+information about its volume ID in the time series and the actual acquisition
time (relative to the beginning of the file). Moreover, the original voxel
index (sometimes referred to as ``ijk``) for each feature is available too.
Finally, the dataset also contains information about the dimensionality
@@ -14,6 +14,12 @@
Part 7: "When Is The Signal" -- Event-related Data Analysis
***********************************************************
+.. note::
+
+ This tutorial part is also available for download as an `IPython notebook
+ <http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html>`_:
+ [`ipynb <notebooks/tutorial_eventrelated.ipynb>`_]
+
In all previous tutorial parts we have analyzed the same fMRI data. We
analyzed it using a number of different strategies, but they all had one
thing in common: A sample in each dataset was always a single fMRI volume.
@@ -312,7 +318,7 @@ voxel features``.
That was it. Perhaps you are scared by the amount of code. Please note that
it could have been done more compactly, but this way allows plotting any other voxel
-coordinate combination as well. matplotlib allows to stored this figure in
+coordinate combination as well. matplotlib allows for saving this figure in
SVG_ format that allows for convenient post-processing in Inkscape_ -- a
publication quality figure is only minutes away.
@@ -368,7 +374,7 @@ correct this is a spatio-temporal searchlight. The searchlight focus
travels along all possible locations in our ventral temporal ROI, but at
the same time also along the peristimulus time segment covered by the
events. The spatial searchlight extent is the center voxel and its
-immediate neighbors and the temporal dimension comprises two time-points in
+immediate neighbors, and the temporal dimension comprises two additional time-points in
each direction. The result is again a dataset. Its shape is compatible
with the mapper of ``evds``, hence it can also be back-projected into the
original 4D fMRI brain space.
@@ -14,6 +14,12 @@
Part 3: Mappers -- The Swiss Army Knife
***************************************
+.. note::
+
+ This tutorial part is also available for download as an `IPython notebook
+ <http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html>`_:
+ [`ipynb <notebooks/tutorial_mappers.ipynb>`_]
+
In the :ref:`previous tutorial part <chap_tutorial_datasets>` we have discovered a
magic ingredient of datasets: a mapper. Mappers are probably the most
powerful concept in PyMVPA, and there is little one would do without them.
@@ -153,7 +159,7 @@ acquired. However, there are more complicated scenarios which we will look
at later on. Chunks of independent data correspond to what fMRI volumes are
assumed to be independent. The properties of the MRI acquisition process
cause subsequently acquired volumes to be *very* similar, hence they cannot
-be considered as independent. Ideally, the experiment is split into several
+be considered independent. Ideally, the experiment is split into several
acquisition sessions, where the sessions define the corresponding data
chunks.
@@ -195,8 +201,8 @@ directly:
We got the dataset that we already know from the last part, but this time
it also has information about chunks and targets.
-The next step is to extract the *patterns of activation* that we are
-interested in from the dataset. But wait! We know that fMRI data is
+The next step is to extract the *patterns of activation* from the dataset
+that we are interested in. But wait! We know that fMRI data is
typically contaminated with a lot of noise, or actually *information* that
we are not interested in. For example, there are temporal drifts in the
data (the signal tends to increase when the scanner is warming up). We
@@ -317,7 +323,7 @@ independently.
To achieve this, we first add a new sample attribute to assign a
corresponding label to each sample in the dataset, indicating to which of
-both run-types is belongs to:
+both run-types it belongs to:
>>> rnames = {0: 'even', 1: 'odd'}
>>> fds.sa['runtype'] = [rnames[c % 2] for c in fds.sa.chunks]
@@ -47,9 +47,9 @@ What Do I Need To Get Python Running
------------------------------------
PyMVPA code is compatible with the Python 2.X series (more precisely >= 2.4).
-Python 3.x is not yet supported. For most stable performance we recommend
-Python 2.6 since that is the version we are using for the development, but,
-once again, anything from 2.5 to 2.7 should be fine.
+Python 3.x is supported as well, but not as widely used (yet), and many
+3rd-party Python modules are still lacking Python 3 support. For now, we
+recommend Python 2.7 for production, but Python 2.6 should work equally well.
Any machine which has Python 2.X available can be used for PyMVPA-based
processing (see :ref:`Download section <chap_download>` on how to deploy
@@ -79,9 +79,9 @@ as AFNI_ and FSL_.
For those who just want to quickly try PyMVPA, or do not want to deal with
installing multiple software packages, we recommend the `NeuroDebian Virtual
-Machine`_. This is a virtual Debian installation that can be ran on Linux,
-Windows, and MacOS X. It includes many Python packages, PyMVPA, and other
-neuroscience software (including AFNI_ and FSL_).
+Machine`_. This is a virtual Debian installation that can be deployed on Linux,
+Windows, and MacOS X in a matter of minutes. It includes many Python packages,
+PyMVPA, and other neuroscience software (including AFNI_ and FSL_).
.. _NeuroDebian Virtual Machine: http://neuro.debian.net/vm.html
.. _AFNI: http://afni.nimh.nih.gov/afni
@@ -168,8 +168,8 @@ to explore an enhanced interactive environment for Python -- IPython_.
http://fperez.org/papers/ipython07_pe-gr_cise.pdf
- An article from the author of IPython in the Computing in Science and Engineering
- journal, describing goals and basic features of IPython.
+ An article from one of the authors of IPython in the *Computing in Science and
+ Engineering* journal, describing goals and basic features of IPython.
http://showmedo.com/videotutorials/series?name=CnluURUTV
