major documentation improvement
François Laurent committed Mar 6, 2018
1 parent 78ca5ac commit 2d48ed5
Showing 34 changed files with 860 additions and 423 deletions.
180 changes: 180 additions & 0 deletions doc/commandline.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,180 @@
.. _commandline:

Command-line starter
====================

Input data
----------

The input data are tracking data of single molecules, stored in a text file.

The extension of input data files will usually be |txt|, |xyt| or |trxyt| but this is not required.
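
As an illustration of what such a text file may look like, here is a minimal sketch that parses rows into tuples. The column layout (trajectory index, x, y, time, separated by whitespace) is an assumption made for this example, and ``read_trxyt`` is a hypothetical helper, not a function of the package.

```python
import io


def read_trxyt(stream):
    """Parse whitespace-delimited rows into (n, x, y, t) tuples.

    Assumed layout (not confirmed by the documentation above):
    trajectory index, x, y, time.
    """
    rows = []
    for line in stream:
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        n, x, y, t = fields[:4]
        rows.append((int(float(n)), float(x), float(y), float(t)))
    return rows


# A tiny in-memory stand-in for an example file
sample = io.StringIO("1\t0.10\t0.20\t0.05\n1\t0.12\t0.21\t0.10\n2\t1.00\t1.50\t0.05\n")
locations = read_trxyt(sample)
```

In practice the ``tramway`` commands read the file themselves; the sketch only makes the expected tabular content explicit.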

We will first import an example |trxyt| file and sample the molecule locations in a step referred to as "tessellation", before we perform the inference and feature extraction.

Note that there is no standalone import.
Molecule locations and trajectories must always be sampled in one way or another.

All the analyses deriving from the same dataset are stored in a single |rwa| file.

.. seealso::

    :ref:`datamodel`


.. _commandline_tessellation:

Partitioning the data points
----------------------------

The first step consists of tessellating the space and partitioning the data points into the cells of the tessellation::

    > tramway tessellate -h
    > tramway tessellate kmeans -i example.trxyt -l kmeans

``-h`` shows the help message for the ``tessellate`` command.

``-i`` takes the input data file, the ``kmeans`` subcommand is the tessellation method and ``-l`` specifies a label to attach to the analysis for further reference.

The available methods are:

* ``grid``: regular grid with equally sized square or cubic areas.
* ``kdtree``: kd-tree tessellation with midpoint splits.
* ``kmeans``: tessellation based on the k-means clustering algorithm.
* ``gwr`` (or ``gas``): tessellation based on the Growing-When-Required self-organizing gas.
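
To make the notion of partitioning data points into cells concrete, the simplest of the above methods, a regular grid, can be sketched as follows. This is an illustrative toy and not the package's implementation; ``grid_partition`` and ``cell_size`` are names chosen for this example.

```python
import math
from collections import defaultdict


def grid_partition(points, cell_size):
    """Assign each 2D point to the square grid cell that contains it.

    Returns a dict mapping (column, row) cell keys to lists of point indices.
    """
    cells = defaultdict(list)
    for index, (x, y) in enumerate(points):
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append(index)
    return dict(cells)


points = [(0.1, 0.2), (0.4, 0.3), (1.2, 0.1), (1.7, 1.9)]
cells = grid_partition(points, cell_size=1.0)
```

The other methods differ in how the cells are grown, but the outcome has the same shape: a set of cells and, for each cell, the points associated with it.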

The ``kmeans`` and ``gwr`` methods run an optimization on the data and are consequently sensitive to the numerical scale of the data.
It is recommended to scale your data by adding the ``-w`` option.

A key parameter is ``--knn`` (or ``-n`` for short).
It combines with any of the above methods and imposes a lower bound on the number of points (or nearest neighbors) associated with each cell of the mesh, independently of the way the mesh has been grown::

    > tramway tessellate gwr -i example.rwa -w -n 50 -l gwr*

Note that in the above example the *example.rwa* file already exists and the new analysis is added to it.

The ``*`` symbol will be replaced by the lowest available natural integer, starting from 0.
This prevents overwriting an existing analysis with the same label, if any.
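
The label substitution rule can be sketched in a few lines. ``resolve_label`` is a hypothetical helper written for this illustration; it only mirrors the behaviour described above and is not taken from the package.

```python
def resolve_label(pattern, existing_labels):
    """Replace '*' in a label by the lowest natural integer that is still free."""
    if '*' not in pattern:
        return pattern
    counter = 0
    while pattern.replace('*', str(counter)) in existing_labels:
        counter += 1
    return pattern.replace('*', str(counter))


first = resolve_label('gwr*', {'kmeans'})
second = resolve_label('gwr*', {'kmeans', 'gwr0'})
```

With a file that already contains ``kmeans``, the first ``gwr*`` analysis is labelled ``gwr0`` and the next one ``gwr1``.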

Other key parameters are ``--distance`` (shorter ``-d``) and ``--min-location-count`` (shorter ``-s``).

The ``--distance`` parameter drives how the cells scale, especially in dense areas. By default, it is set to the average translocation distance.
A lower value will yield smaller cells.
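
For reference, an average translocation distance can be estimated as the mean Euclidean distance between successive locations of the same trajectory. The sketch below assumes rows of the form (trajectory index, x, y, t); the function name is hypothetical and the package may compute this quantity differently.

```python
import math
from collections import defaultdict


def mean_translocation_distance(rows):
    """Mean distance between consecutive locations within each trajectory."""
    by_trajectory = defaultdict(list)
    for n, x, y, t in rows:
        by_trajectory[n].append((t, x, y))
    distances = []
    for locations in by_trajectory.values():
        locations.sort()  # order the locations of a trajectory by time
        for (_, x0, y0), (_, x1, y1) in zip(locations, locations[1:]):
            distances.append(math.hypot(x1 - x0, y1 - y0))
    if not distances:
        raise ValueError("no translocation found")
    return sum(distances) / len(distances)


rows = [(1, 0.0, 0.0, 0.0), (1, 3.0, 4.0, 0.1), (1, 3.0, 5.0, 0.2), (2, 1.0, 1.0, 0.0)]
average = mean_translocation_distance(rows)
```

Trajectories with a single location, such as trajectory 2 here, contribute no translocation.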

The ``--min-location-count`` parameter discards the cells that would contain fewer locations than the specified number.
This filter applies before ``knn``::

    > tramway tessellate gwr -i example.rwa -d 0.1 -s 10 -n 30 -l gwr*
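
The order of the two filters can be illustrated with a toy sketch: cells below the minimum location count are discarded first, then the remaining cells that hold fewer than ``knn`` points are extended to the ``knn`` nearest neighbours of their center. This is one possible reading of the description above, with hypothetical names; it is not the package's code.

```python
import math


def filter_and_extend(points, centers, assignment, min_location_count, knn):
    """Drop under-populated cells, then enforce a lower bound of knn points.

    assignment[i] is the index of the cell that point i initially falls in.
    Returns a dict mapping kept cell indices to sorted lists of point indices.
    """
    counts = [0] * len(centers)
    for cell in assignment:
        counts[cell] += 1
    selection = {}
    for cell, count in enumerate(counts):
        if count < min_location_count:
            continue  # this filter applies before knn
        members = [i for i, a in enumerate(assignment) if a == cell]
        if len(members) < knn:
            cx, cy = centers[cell]
            nearest = sorted(
                range(len(points)),
                key=lambda i: math.hypot(points[i][0] - cx, points[i][1] - cy),
            )
            members = sorted(nearest[:knn])
        selection[cell] = members
    return selection


points = [(0, 0), (0.1, 0), (0.2, 0), (1, 0), (1.1, 0), (5, 5)]
centers = [(0.1, 0), (1.05, 0), (5, 5)]
assignment = [0, 0, 0, 1, 1, 2]
selection = filter_and_extend(points, centers, assignment, min_location_count=2, knn=3)
```

Note that after the ``knn`` step a point may be associated with several cells, as point 2 is in this example.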

You can check the content of the *example.rwa* file::

    > tramway dump -i example.rwa

    in example.rwa:
        <class 'pandas.core.frame.DataFrame'>
            'kmeans' <class 'tramway.tessellation.base.CellStats'>
            'gwr0' <class 'tramway.tessellation.base.CellStats'>
            'gwr1' <class 'tramway.tessellation.base.CellStats'>

.. seealso::

    :ref:`tessellation`


Visualizing the partition
-------------------------

To visualize spatial 2D tessellations::

    > tramway draw cells -i example.rwa -L kmeans

To print the figure in an image file::

    > tramway draw cells -i example.rwa -L gwr0 -p png

This will generate an *example.png* file.

To overlay the Delaunay graph instead of the Voronoi graph::

    > tramway draw cells -i example.rwa -L gwr1 -D


.. _commandline_inference:

Inferring diffusivity and other parameters
------------------------------------------

Inferring diffusivity and force with the *DF* mode::

    > tramway infer df -i example.rwa -L kmeans -l df-map*

Other inference modes are *D* (``d``), *DD* (``dd``) and *DV* (``dv``).

*DV* is notably more time-consuming than the other inference modes and generates diffusivity and potential energy maps::

    > tramway infer dv -i example.rwa -L gwr1 -l dv-map*


.. seealso::

    :ref:`inference`


Visualizing maps
----------------

2D maps can be plotted with::

    > tramway draw map -i example.rwa -L gwr1,dv-map0

One can overlay the locations as white dots with high transparency over maps colored with one of the *matplotlib* supported colormaps (see also https://matplotlib.org/users/colormaps.html)::

    > tramway draw map -i example.rwa -L kmeans,df-map0 -cm jet -P size=1,color='w',alpha=.05


.. _commandline_feature:

Extracting features
-------------------

The only feature available for now is curl for 2D force maps::

    > tramway extract curl -i example.rwa -L kmeans,df-map0 --radius 2 -l curl_2

For each cell, if a contour of successively adjacent cells can be found, the curl is calculated along this contour. A map of local curl values can thus be extracted.
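
One common way to estimate a curl along a closed contour is to divide the circulation of the force field along the contour by the enclosed area (Stokes' theorem). The sketch below applies this to an ordered list of cell centers and the force vectors inferred at these cells. It is an illustration under these assumptions, with a hypothetical function name, and does not claim to reproduce the package's exact calculation.

```python
def curl_along_contour(centers, forces):
    """Estimate the curl as circulation / enclosed area for a closed contour.

    centers: ordered (x, y) positions of the contour cells.
    forces: (fx, fy) force vectors at the same cells, in the same order.
    """
    count = len(centers)
    circulation = 0.0
    twice_area = 0.0
    for i in range(count):
        j = (i + 1) % count  # the contour is closed
        (x0, y0), (x1, y1) = centers[i], centers[j]
        (fx0, fy0), (fx1, fy1) = forces[i], forces[j]
        # trapezoidal approximation of the line integral of F along the edge
        circulation += 0.5 * (fx0 + fx1) * (x1 - x0) + 0.5 * (fy0 + fy1) * (y1 - y0)
        # shoelace formula for the signed area
        twice_area += x0 * y1 - x1 * y0
    return circulation / (0.5 * twice_area)


# Rotational field F(x, y) = (-y, x), whose curl is 2 everywhere
contour = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
field = [(-y, x) for x, y in contour]
estimate = curl_along_contour(contour, field)
```

Because both the circulation and the area are signed, the estimate does not depend on the direction in which the contour is traversed.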

The optional ``radius`` argument drives the radius of the contour in number of cells.
At radius ``1`` the contour is formed by cells that are immediately adjacent to the center cell.
At radius ``2`` the contour is formed by cells that are adjacent to the radius-1 cells.
And so on.

Note that at higher radii the contours may partly consist of segments of lower-radii contours.
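
The notion of radius can be sketched as a distance in the cell adjacency graph: the candidate contour cells at radius ``r`` are those reached in exactly ``r`` adjacency steps from the center cell. The breadth-first search below is a simplified reading of the description above; it neither orders the cells along the contour nor handles the fallback to lower-radius segments, and all names are hypothetical.

```python
from collections import deque


def ring(adjacency, center, radius):
    """Return the cells at graph distance exactly `radius` from `center`."""
    distance = {center: 0}
    queue = deque([center])
    while queue:
        cell = queue.popleft()
        if distance[cell] == radius:
            continue  # no need to explore beyond the requested radius
        for neighbour in adjacency[cell]:
            if neighbour not in distance:
                distance[neighbour] = distance[cell] + 1
                queue.append(neighbour)
    return sorted(cell for cell, d in distance.items() if d == radius)


# Toy mesh: 5x5 grid of cells, adjacent if they share an edge or a corner
size = 5
adjacency = {
    (i, j): [
        (i + di, j + dj)
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
        if (di, dj) != (0, 0) and 0 <= i + di < size and 0 <= j + dj < size
    ]
    for i in range(size)
    for j in range(size)
}
radius_1 = ring(adjacency, (2, 2), 1)
radius_2 = ring(adjacency, (2, 2), 2)
```

On this toy mesh, the radius-1 ring has 8 cells and the radius-2 ring has 16 cells.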

The extracted map can be plotted just like any map::

    > tramway draw map -i example.rwa -L kmeans,df-map0,curl_2


Final analysis tree
-------------------

To sum up this primer, the content of the *example.rwa* file that results from all the above steps is dumped below::

    > tramway dump -i example.rwa

    in example.rwa:
        <class 'pandas.core.frame.DataFrame'>
            'kmeans' <class 'tramway.tessellation.base.CellStats'>
                'df-map0' <class 'tramway.inference.base.Maps'>
                    'curl_2' <class 'tramway.inference.base.Maps'>
            'gwr0' <class 'tramway.tessellation.base.CellStats'>
            'gwr1' <class 'tramway.tessellation.base.CellStats'>
                'dv-map0' <class 'tramway.inference.base.Maps'>



.. |txt| replace:: *.txt*
.. |xyt| replace:: *.xyt*
.. |trxyt| replace:: *.trxyt*
.. |rwa| replace:: *.rwa*

Binary file added doc/concepts.jpg
24 changes: 24 additions & 0 deletions doc/concepts.rst
@@ -0,0 +1,24 @@
.. _concepts:

Concepts
========

|tramway| generates maps of the dynamic parameters that dictate the motion of single molecules.
These parameters may include diffusivity, forces (directional biases), interaction (potential) energies and drift.

|tramway| takes molecule locations or trajectories as input data.

.. image:: concepts.*

A preliminary processing step before generating maps consists of segmenting space and time into cells that accommodate enough molecule locations for the inference to generate reliable estimates and that are, on the other hand, small enough for relevant spatial(-temporal) variations in the dynamic parameters to be observed.

This step is referred to as "`tessellation <commandline.html#tessellation>`_" but may consist of temporal windowing for example.
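
As an example of temporal windowing, locations can be grouped into time windows of fixed duration that are shifted by a fixed step. The following sketch is only meant to illustrate the idea; ``sliding_windows``, ``duration`` and ``shift`` are names chosen for this example and do not refer to the package's API.

```python
def sliding_windows(times, duration, shift):
    """Group time stamps into windows [start, start + duration).

    Successive windows start `shift` apart; they overlap if shift < duration.
    Returns a dict mapping (lower, upper) bounds to lists of indices in `times`.
    """
    first, last = min(times), max(times)
    windows = {}
    k = 0
    while first + k * shift <= last:
        lower = first + k * shift
        upper = lower + duration
        windows[(lower, upper)] = [i for i, t in enumerate(times) if lower <= t < upper]
        k += 1
    return windows


times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
windows = sliding_windows(times, duration=2.0, shift=2.0)
```

Each window plays the same role as a spatial cell: it must hold enough locations for the inference while remaining short enough to resolve temporal variations.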

The central and often most time-consuming step consists of `inferring <commandline.html#inference>`_ the value of the parameters in each cell.

As such, maps of these parameters can readily exhibit descriptive information.
A further step consists of extracting features from these maps.


.. |tramway| replace:: **TRamWAy**

3 changes: 2 additions & 1 deletion doc/conf.py
@@ -133,7 +133,8 @@ def __getattr__(cls, name):

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'nature'
#html_theme = 'nature'
html_theme = 'sphinx_rtd_theme'

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
61 changes: 46 additions & 15 deletions doc/datamodel.rst
@@ -13,7 +13,7 @@ The term *analysis* here refers to a process that takes a single data input and

Artefacts are thus organized in a tree structure such that artefacts are nodes and analyses are edges.

A typical *.rwa* file or :class:`~tramway.core.analyses.Analyses` object will contain an array of molecule locations or trajectories as topmost data element.
A typical *.rwa* file or :class:`~tramway.core.analyses.base.Analyses` object will contain an array of molecule locations or trajectories as topmost data element.
A first level of analyses will consist of spatial tessellations (or data partitions) with resulting :class:`~tramway.tessellation.base.CellStats` partition objects (one per analysis).
A second level of analyses will consist of inferences with resulting :class:`~tramway.inference.base.Maps` map objects (again, one per analysis).
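
The tree organization described above can be pictured with a toy structure in which each node holds one artefact and each labelled child corresponds to one analysis. The ``Node``, ``Partition`` and ``MapSet`` classes below are stand-ins written for this illustration; they are not the package's ``Analyses``, ``CellStats`` or ``Maps`` classes.

```python
class Partition:
    """Stand-in for a partition artefact."""


class MapSet:
    """Stand-in for an inferred-maps artefact."""


class Node:
    """Toy analysis tree node: one artefact plus labelled sub-analyses."""

    def __init__(self, data):
        self.data = data      # the artefact stored at this node
        self.children = {}    # label -> Node, one entry per analysis

    def add(self, label, data):
        if label in self.children:
            raise KeyError("label already in use: %r" % (label,))
        self.children[label] = Node(data)
        return self.children[label]

    def __getitem__(self, label):
        # dict-like access to sub-analyses by label
        return self.children[label]

    def describe(self, depth=0):
        lines = []
        for label, child in self.children.items():
            lines.append("    " * depth + "%r <%s>" % (label, type(child.data).__name__))
            lines.extend(child.describe(depth + 1))
        return lines


tree = Node([(0.1, 0.2), (0.3, 0.4)])      # topmost element: the locations
kmeans = tree.add('kmeans', Partition())   # first level: a tessellation
maps = kmeans.add('df-map0', MapSet())     # second level: an inference
tree.add('gwr0', Partition())
summary = tree.describe()
```

Chained label lookups such as ``tree['kmeans']['df-map0'].data`` then walk down one branch of the tree to the artefact of interest.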

@@ -33,32 +33,63 @@ A single dataset can be split in several files.
Analyses *.rwa* files
---------------------

In Python, an |rwa| file can be loaded as follows::
In Python, an |rwa| file can be loaded as follows:

from tramway.io import HDF5Store
.. code-block:: python
hdf = HDF5Store(path_to_rwa_file)
analyses = hdf.peek('analyses')
hdf.close()
from tramway.core import *
or in a slightly shorter way::
analyses = load_rwa(path_to_rwa_file)
from tramway.helper.analysis import *
analyses = find_analysis(path_to_rwa_file)
The :class:`~tramway.core.analyses.base.Analyses` object features a dict-like interface.

and if one needs a particular analysis chain, providing analysis labels::
In the REPL, the *analyses* object can be quickly inspected as follows:

analyses = find_analysis(path_to_rwa_file, labels=('my-mesh', 'my-df-maps'))
.. code-block:: python
A convenient way to browse the labels, comments and artefact types in a file is::
>>> print(analyses)
<class 'pandas.core.frame.DataFrame'>
'kmeans' <class 'tramway.tessellation.base.CellStats'>
'df-map0' <class 'tramway.inference.base.Maps'>
'curl_2' <class 'tramway.inference.base.Maps'>
'gwr0' <class 'tramway.tessellation.base.CellStats'>
'gwr1' <class 'tramway.tessellation.base.CellStats'>
'dv-map0' <class 'tramway.inference.base.Maps'>
>>> analyses['kmeans']['df-map0']
<tramway.core.analyses.lazy.Analyses object at 0x7fc41e5b5f08>
>>> analyses['kmeans']['df-map0'].data
<tramway.inference.base.Maps object at 0x7fc468359e10>
> tramway dump -i path_to_rwa_file
or in Python::
The above example shows that every analysis artefact is encapsulated in an :class:`~tramway.core.analyses.lazy.Analyses` object and can be accessed with the `data` (or `artefact`) attribute.

print(format_analyses(analyses))
To extract analysis artefacts of a particular type from an analysis tree with a single pathway:

.. code-block:: python
>>> print(analyses)
<class 'pandas.core.frame.DataFrame'>
'kmeans' <class 'tramway.tessellation.base.CellStats'>
'df-map0' <class 'tramway.inference.base.Maps'>
'curl_2' <class 'tramway.inference.base.Maps'>
>>> from tramway.tessellation import CellStats
>>> from tramway.inference import Maps
>>> cells, maps = find_artefacts(analyses, (CellStats, Maps))
Here `maps` will correspond to the *curl_2* label.
To select *df-map0* instead:

.. code-block:: python
>>> cells, maps = find_artefacts(analyses, (CellStats, Maps), quantifiers=('last', 'first'))
Quantifier '*last*' is the default one.

See also :func:`~tramway.core.analyses.lazy.find_artefacts` for more options.


.. |txt| replace:: *.txt*
2 changes: 1 addition & 1 deletion doc/quickstart.demos.rst → doc/examples.rst
@@ -1,4 +1,4 @@
.. _quickstart.demos:
.. _examples:

Demos
=====
