Commit

improving matrix io format docs
cornhundred committed Feb 16, 2017
1 parent 8b0484e commit 5b210e9
Showing 14 changed files with 31 additions and 16 deletions.
Binary file modified docs/_build_html/.doctrees/case_studies.doctree
Binary file not shown.
Binary file modified docs/_build_html/.doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/_build_html/.doctrees/getting_started.doctree
Binary file not shown.
Binary file modified docs/_build_html/.doctrees/matrix_format_io.doctree
Binary file not shown.
4 changes: 3 additions & 1 deletion docs/_build_html/_sources/case_studies.txt
@@ -6,10 +6,12 @@ Clustergrammer was built to visualize biological data but is applicable for visu

- `Cancer Cell Line Encyclopedia Gene Expression Data`_
- `Zika Virus RNA-seq Data Visualization`_
- `Single Cell RNA-seq Data Visualization`_
- `Iris flower dataset`_
- `MNIST Handwritten Digit Dataset`_

.. _`Cancer Cell Line Encyclopedia Gene Expression Data`: http://amp.pharm.mssm.edu/clustergrammer/CCLE/
.. _`Zika Virus RNA-seq Data Visualization`: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb
.. _`Iris flower dataset`: http://nbviewer.jupyter.org/github/MaayanLab/iris_clustergrammer_visualization/blob/master/Iris%20Dataset.ipynb
.. _`MNIST Handwritten Digit Dataset`: https://maayanlab.github.io/MNIST_heatmaps/
.. _`MNIST Handwritten Digit Dataset`: https://maayanlab.github.io/MNIST_heatmaps/
.. _`Single Cell RNA-seq Data Visualization`: http://nbviewer.jupyter.org/github/MaayanLab/single_cell_RNAseq_Visualization/blob/master/Single%20Cell%20RNAseq%20Visualization%20Example.ipynb
1 change: 1 addition & 0 deletions docs/_build_html/_sources/getting_started.txt
@@ -34,6 +34,7 @@ Clustergrammer produces highly interactive visualizations that enable intuitive
- :ref:`interactive_dim_reduction` (e.g. filter rows based on variance)
- :ref:`interactive_categories`
- :ref:`cropping`
- :ref:`search`

Press play or interact with the demo (see :ref:`interacting_with_viz` for more information):

11 changes: 7 additions & 4 deletions docs/_build_html/_sources/matrix_format_io.txt
@@ -7,7 +7,10 @@ Clustergrammer takes as input either:
- a tab-separated matrix file
- a Pandas DataFrame (using :ref:`clustergrammer_py`)

The tab-separated matrix file can take several formats, shown below, which can include row/column categories and name/category titles. In all cases, row and column names must be unique. Optional name/category titles will be shown as titles above row/column names or as names adjacent to row/column categories, respectively. The front-end :ref:`clustergrammer_js` library can visualize matrices of up to ~500,000 to ~1,000,000 cells, but very large matrices may take a long time to cluster using the :ref:`clustergrammer_py` library. Clustergrammer is also optimized to visualize matrices with more rows than columns. Users are encouraged to arrange their matrix with data points as columns and dimensions as rows, which lets them take advantage of Clustergrammer's :ref:`interactive_dim_reduction`.
The tab-separated matrix file can take several formats, shown below, which can include row/column categories and name/category titles. In all cases, row and column names must be unique. Users are encouraged to arrange their matrix with data points as columns and dimensions as rows, which lets them take advantage of Clustergrammer's :ref:`interactive_dim_reduction`.

The front-end :ref:`clustergrammer_js` library can visualize matrices of up to ~500,000 to ~1,000,000 cells and is optimized for matrices with more rows than columns. However, very large matrices may take a long time to cluster using the :ref:`clustergrammer_py` library.


Simple Matrix Format
====================
@@ -19,16 +22,16 @@ The simplest tab-separated file format is shown here:
Row-B 3.0 0.0 8.0
Row-C 0.2 0.1 2.5

The first line gives the column names and starts with a blank tab. The first column of each row gives the name of this row followed by the data in each row of the matrix. See `example_tsv.txt`_ for an example of this matrix format.
The first row gives the column names and starts with a blank tab. All other rows start with the row name followed by the row data. Row and column titles can be added by prefixing each row or column name with ``'Title: '`` (not shown in this example). See `example_tsv.txt`_ for an example of this matrix format.
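
As an illustration of the simple format described above, the following minimal sketch writes a small matrix as tab-separated text using only the Python standard library. The function name ``write_simple_matrix``, the column names, and the ``Row-A`` values are hypothetical and are not part of Clustergrammer; the only rules taken from the text are the leading blank tab in the header and the uniqueness of row and column names.

```python
import csv
import io


def write_simple_matrix(col_names, rows):
    """Return tab-separated text for a matrix in the simple format.

    rows is a list of (row_name, values) pairs. This helper is a
    hypothetical illustration, not a Clustergrammer function.
    """
    row_names = [name for name, _ in rows]
    # Row and column names must be unique.
    if len(set(row_names)) != len(row_names):
        raise ValueError("row names must be unique")
    if len(set(col_names)) != len(col_names):
        raise ValueError("column names must be unique")

    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    # The header starts with an empty field, so the line begins with a tab.
    writer.writerow([""] + list(col_names))
    for name, values in rows:
        # Each data row starts with the row name followed by the row data.
        writer.writerow([name] + list(values))
    return buf.getvalue()


# Illustrative data; column names and the first row are assumptions.
matrix_text = write_simple_matrix(
    ["Col-A", "Col-B", "Col-C"],
    [
        ("Row-A", [1.0, 2.0, 3.0]),
        ("Row-B", [3.0, 0.0, 8.0]),
        ("Row-C", [0.2, 0.1, 2.5]),
    ],
)
print(matrix_text)
```

The returned string can be saved to a ``.txt`` file; checking uniqueness before writing catches the most common formatting problem early.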

Simple Matrix-Category Format
=============================
Row and column categories can also be included in the matrix in the following way:
Row and column categories can be included in your data in two ways. The first, simple matrix-category format, is shown below:

.. image:: _static/cat_tsv.png
:width: 700px

This screenshot of an Excel spreadsheet shows a single row category being added as an additional column of strings (e.g. ``Type: Interesting``) and a single column category being added as an additional row of strings (e.g. ``Gender: Male``). Up to 15 categories can be added in a similar manner. Titles for row or column names or categories can be added by prefixing each string with ``'Title: '`` (note the space after the colon). For example, the title of the column names is ``Cell Line`` and the title of the column categories is ``Gender``. See `rc_two_cats.txt`_ for an example of this matrix format.
This simple format allows users to encode column categories as extra rows underneath the column labels and row categories as extra columns next to the row labels. The above screenshot of an Excel spreadsheet shows a single row category being added as an additional column of strings (e.g. ``Type: Interesting``) and a single column category being added as an additional row of strings (e.g. ``Gender: Male``). Up to 15 categories can be added in a similar manner. Titles for row or column names or categories can be added by prefixing each string with ``'Title: '`` (note the space after the colon). For example, the title of the column names is ``Cell Line`` and the title of the column categories is ``Gender``. See `rc_two_cats.txt`_ for an example of this matrix format. Titles, if given, will be shown as labels above row/column names or adjacent to row/column categories.
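
The layout described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper names ``with_title`` and ``category_matrix_tsv`` are hypothetical, the gene and cell-line names are made up, and the two leading empty cells in the header rows are inferred from the description (one above the row names, one above the row category), so the result should be compared against ``rc_two_cats.txt`` before relying on it. Only a single row category and a single column category are handled.

```python
def with_title(title, values):
    """Prefix each string with 'Title: ' (note the space after the colon)."""
    return [f"{title}: {value}" for value in values]


def category_matrix_tsv(col_names, col_cats, rows):
    """Build tab-separated text with one column category and one row category.

    rows is a list of (row_name, row_category, values) tuples.
    Hypothetical helper for illustration; not a Clustergrammer function.
    """
    if len(col_names) != len(col_cats):
        raise ValueError("each column needs exactly one category string")

    table = []
    # Assumed layout: two leading empty cells, one above the row names
    # and one above the row-category column.
    table.append(["", ""] + list(col_names))
    # The column category is an extra row underneath the column labels.
    table.append(["", ""] + list(col_cats))
    for name, category, values in rows:
        # The row category is an extra column next to the row label.
        table.append([name, category] + [str(v) for v in values])
    return "\n".join("\t".join(cells) for cells in table) + "\n"


# Illustrative names only.
columns = with_title("Cell Line", ["A549", "H1299"])
genders = with_title("Gender", ["Male", "Female"])
data_rows = [
    ("Gene: GENE1", "Type: Interesting", [1.0, 2.0]),
    ("Gene: GENE2", "Type: Not Interesting", [3.0, 0.5]),
]
print(category_matrix_tsv(columns, genders, data_rows))
```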

Tuple Matrix-Category Format
============================
1 change: 1 addition & 0 deletions docs/_build_html/case_studies.html
@@ -145,6 +145,7 @@
<ul class="simple">
<li><a class="reference external" href="http://amp.pharm.mssm.edu/clustergrammer/CCLE/">Cancer Cell Line Encyclopedia Gene Expression Data</a></li>
<li><a class="reference external" href="http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb">Zika Virus RNA-seq Data Visualization</a></li>
<li><a class="reference external" href="http://nbviewer.jupyter.org/github/MaayanLab/single_cell_RNAseq_Visualization/blob/master/Single%20Cell%20RNAseq%20Visualization%20Example.ipynb">Single Cell RNA-seq Data Visualization</a></li>
<li><a class="reference external" href="http://nbviewer.jupyter.org/github/MaayanLab/iris_clustergrammer_visualization/blob/master/Iris%20Dataset.ipynb">Iris flower dataset</a></li>
<li><a class="reference external" href="https://maayanlab.github.io/MNIST_heatmaps/">MNIST Handwritten Digit Dataset</a></li>
</ul>
1 change: 1 addition & 0 deletions docs/_build_html/getting_started.html
@@ -176,6 +176,7 @@ <h2>Interacting with Clustergrammer<a class="headerlink" href="#interacting-with
<li><a class="reference internal" href="interacting_with_viz.html#interactive-dim-reduction"><span class="std std-ref">Interactive Dimensionality Reduction</span></a> (e.g. filter rows based on variance)</li>
<li><a class="reference internal" href="interacting_with_viz.html#interactive-categories"><span class="std std-ref">Interactive Categories</span></a></li>
<li><a class="reference internal" href="interacting_with_viz.html#cropping"><span class="std std-ref">Cropping</span></a></li>
<li><a class="reference internal" href="interacting_with_viz.html#search"><span class="std std-ref">Row Searching</span></a></li>
</ul>
<p>Press play or interact with the demo (see <a class="reference internal" href="interacting_with_viz.html#interacting-with-viz"><span class="std std-ref">Interacting with the Visualization</span></a> for more information):</p>
<iframe id='iframe_preview' src="http://amp.pharm.mssm.edu/clustergrammer/demo/" frameBorder="0" style='height: 495px; width:730px; margin-bottom:15px;'></iframe><p>Clustergrammer also has biology-specific features including:</p>
11 changes: 6 additions & 5 deletions docs/_build_html/matrix_format_io.html
@@ -154,7 +154,8 @@
<li>a tab-separated matrix file</li>
<li>a Pandas DataFrame (using <a class="reference internal" href="clustergrammer_py.html#clustergrammer-py"><span class="std std-ref">Clustergrammer-PY</span></a>)</li>
</ul>
<p>The tab-separated matrix file can take several formats, shown below, which can include row/column categories and name/category titles. In all cases, row and column names must be unique. Optional name/category titles will be shown as titles above row/column names or as names adjacent to row/column categories, respectively. The front-end <a class="reference internal" href="clustergrammer_js.html#clustergrammer-js"><span class="std std-ref">Clustergrammer-JS</span></a> library can visualize matrices of up to ~500,000 to ~1,000,000 cells, but very large matrices may take a long time to cluster using the <a class="reference internal" href="clustergrammer_py.html#clustergrammer-py"><span class="std std-ref">Clustergrammer-PY</span></a> library. Clustergrammer is also optimized to visualize matrices with more rows than columns. Users are encouraged to arrange their matrix with data points as columns and dimensions as rows, which lets them take advantage of Clustergrammer&#8217;s <a class="reference internal" href="interacting_with_viz.html#interactive-dim-reduction"><span class="std std-ref">Interactive Dimensionality Reduction</span></a>.</p>
<p>The tab-separated matrix file can take several formats, shown below, which can include row/column categories and name/category titles. In all cases, row and column names must be unique. Users are encouraged to arrange their matrix with data points as columns and dimensions as rows, which lets them take advantage of Clustergrammer&#8217;s <a class="reference internal" href="interacting_with_viz.html#interactive-dim-reduction"><span class="std std-ref">Interactive Dimensionality Reduction</span></a>.</p>
<p>The front-end <a class="reference internal" href="clustergrammer_js.html#clustergrammer-js"><span class="std std-ref">Clustergrammer-JS</span></a> library can visualize matrices of up to ~500,000 to ~1,000,000 cells and is optimized for matrices with more rows than columns. However, very large matrices may take a long time to cluster using the <a class="reference internal" href="clustergrammer_py.html#clustergrammer-py"><span class="std std-ref">Clustergrammer-PY</span></a> library.</p>
<div class="section" id="simple-matrix-format">
<h2>Simple Matrix Format<a class="headerlink" href="#simple-matrix-format" title="Permalink to this headline"></a></h2>
<p>The simplest tab-separated file format is shown here:</p>
@@ -164,13 +165,13 @@ <h2>Simple Matrix Format<a class="headerlink" href="#simple-matrix-format" title
<span class="n">Row</span><span class="o">-</span><span class="n">C</span> <span class="mf">0.2</span> <span class="mf">0.1</span> <span class="mf">2.5</span>
</pre></div>
</div>
<p>The first line gives the column names and starts with a blank tab. The first column of each row gives the name of this row followed by the data in each row of the matrix. See <a class="reference external" href="https://github.com/MaayanLab/clustergrammer/blob/master/txt/example_tsv.txt">example_tsv.txt</a> for an example of this matrix format.</p>
<p>The first row gives the column names and starts with a blank tab. All other rows start with the row name followed by the row data. Row and column titles can be added by prefixing each row or column name with <code class="docutils literal"><span class="pre">'Title:</span> <span class="pre">'</span></code> (not shown in this example). See <a class="reference external" href="https://github.com/MaayanLab/clustergrammer/blob/master/txt/example_tsv.txt">example_tsv.txt</a> for an example of this matrix format.</p>
</div>
<div class="section" id="simple-matrix-category-format">
<h2>Simple Matrix-Category Format<a class="headerlink" href="#simple-matrix-category-format" title="Permalink to this headline"></a></h2>
<p>Row and column categories can also be included in the matrix in the following way:</p>
<p>Row and column categories can be included in your data in two ways. The first, simple matrix-category format, is shown below:</p>
<a class="reference internal image-reference" href="_images/cat_tsv.png"><img alt="_images/cat_tsv.png" src="_images/cat_tsv.png" style="width: 700px;" /></a>
<p>This screenshot of an Excel spreadsheet shows a single row category being added as an additional column of strings (e.g. <code class="docutils literal"><span class="pre">Type:</span> <span class="pre">Interesting</span></code>) and a single column category being added as an additional row of strings (e.g. <code class="docutils literal"><span class="pre">Gender:</span> <span class="pre">Male</span></code>). Up to 15 categories can be added in a similar manner. Titles for row or column names or categories can be added by prefixing each string with <code class="docutils literal"><span class="pre">'Title:</span> <span class="pre">'</span></code> (note the space after the colon). For example, the title of the column names is <code class="docutils literal"><span class="pre">Cell</span> <span class="pre">Line</span></code> and the title of the column categories is <code class="docutils literal"><span class="pre">Gender</span></code>. See <a class="reference external" href="https://github.com/MaayanLab/clustergrammer/blob/master/txt/rc_two_cats.txt">rc_two_cats.txt</a> for an example of this matrix format.</p>
<p>This simple format allows users to encode column categories as extra rows underneath the column labels and row categories as extra columns next to the row labels. The above screenshot of an Excel spreadsheet shows a single row category being added as an additional column of strings (e.g. <code class="docutils literal"><span class="pre">Type:</span> <span class="pre">Interesting</span></code>) and a single column category being added as an additional row of strings (e.g. <code class="docutils literal"><span class="pre">Gender:</span> <span class="pre">Male</span></code>). Up to 15 categories can be added in a similar manner. Titles for row or column names or categories can be added by prefixing each string with <code class="docutils literal"><span class="pre">'Title:</span> <span class="pre">'</span></code> (note the space after the colon). For example, the title of the column names is <code class="docutils literal"><span class="pre">Cell</span> <span class="pre">Line</span></code> and the title of the column categories is <code class="docutils literal"><span class="pre">Gender</span></code>. See <a class="reference external" href="https://github.com/MaayanLab/clustergrammer/blob/master/txt/rc_two_cats.txt">rc_two_cats.txt</a> for an example of this matrix format. Titles, if given, will be shown as labels above row/column names or adjacent to row/column categories.</p>
</div>
<div class="section" id="tuple-matrix-category-format">
<h2>Tuple Matrix-Category Format<a class="headerlink" href="#tuple-matrix-category-format" title="Permalink to this headline"></a></h2>
@@ -186,7 +187,7 @@ <h2>Tuple Matrix-Category Format<a class="headerlink" href="#tuple-matrix-catego
<div class="section" id="category-types-string-and-value">
<h2>Category Types: String and Value<a class="headerlink" href="#category-types-string-and-value" title="Permalink to this headline"></a></h2>
<p>Row and column categories can be of type string or value. If categories are given as strings (e.g. containing letters and not just numbers), then categories will be depicted using different colors. If categories are of type value (e.g. all categories contain only numbers and no letters), then categories will be depicted using two colors (gray for positive and orange for negative) and the value will be depicted as opacity (similar to how matrix cells are visually encoded).</p>
<p>Value-based categories can be useful for adding a dimension of data to your visualization (e.g. time) that you would like to compare to your other dimensions, but would not like to influence your clustering. Value-based and String-based categories can also be used to reorder your matrix (see <a class="reference internal" href="interacting_with_viz.html#interactive-categories"><span class="std std-ref">Categories</span></a>).</p>
<p>Value-based categories can be useful for adding a dimension of data to your visualization (e.g. time) that you would like to compare to your other dimensions, but would not like to influence your clustering. Value-based and String-based categories can also be used to reorder your matrix (see <a class="reference internal" href="interacting_with_viz.html#interactive-categories"><span class="std std-ref">Interactive Categories</span></a>).</p>
</div>
<div class="section" id="matrix-file-examples">
<h2>Matrix File Examples<a class="headerlink" href="#matrix-file-examples" title="Permalink to this headline"></a></h2>