Edited defining tensors section

stephenchouca · stephenchouca · commit 9f7684d41aaf · 2019-05-30T10:33:28.000-04:00
diff --git a/sdh_documentation/docs/pytensors.md b/sdh_documentation/docs/pytensors.md
@@ -1,209 +1,196 @@
 # Declaring Tensors
 
-`pytaco.Tensor` objects correspond to mathematical tensors. You can can declare a new tensor by specifying its name, a vector with the size of each dimension and the [storage format](pytensors.md#defining-tensor-formats) that will be used to store the tensor and a [datatype](pytensors.md#tensor-datatypes):
+`pytaco.tensor` objects, which represent mathematical tensors, form the core of
+the TACO Python library. You can declare a new tensor by specifying the
+size of each dimension, the [format](pytensors.md#defining-tensor-formats)
+that will be used to store the tensor, and the datatype of the tensor's
+nonzero elements:
 
 ```python
-# Import the pytaco library
+# Import the TACO Python library
 import pytaco as pt
-# Import the storage formats to save some typing
 from pytaco import dense, compressed
 
-# Declare a new tensor "A" of double-precision floats with dimensions 
+# Declare a new tensor of double-precision floats with dimensions 
 # 512 x 64 x 2048, stored as a dense-sparse-sparse tensor
-A = pt.tensor("A", [512, 64, 2048], pt.format([dense, compressed, compressed]), pt.float64)
-```
-
-The name of the tensor can be omitted, in which case taco will assign an arbitrary name to the tensor:
-```python
-import pytaco as pt
-from pytaco import dense, compressed
-
-# Declare a tensor with the same dimensions, storage format and type as before
 A = pt.tensor([512, 64, 2048], pt.format([dense, compressed, compressed]), pt.float64)
 ```
 
-The [datatype](pytensors.md#tensor-datatypes) can also be omitted in which case taco will default to using `pt.float32`:
-```python
-import pytaco as pt
-from pytaco import dense, compressed
+The datatype can be omitted, in which case TACO will default to using
+`pt.float32` to store the tensor's nonzero elements:
 
-# Declare a tensor with the same dimensions and storage format as before
+```python
+# Declare the same tensor as before
 A = pt.tensor([512, 64, 2048], pt.format([dense, compressed, compressed]))
 ```
 
-A single format can be given to create a tensor where all dimensions have that format:
-```python
-import pytaco as pt
-from pytaco import dense, compressed
+Instead of specifying a format that is tied to the number of dimensions that a
+tensor has, we can simply specify whether all dimensions are dense or sparse:
 
-# Declare a dense tensor
+```python
+# Declare a tensor where all dimensions are dense
 A = pt.tensor([512, 64, 2048], dense)
 
-# Declare a compressed tensor
+# Declare a tensor where all dimensions are sparse
 B = pt.tensor([512, 64, 2048], compressed)
 ```
 
-Scalars, which are treated as order-0 tensors, can be declared and initialized with some arbitrary value as demonstrated below:
-```python
-import pytaco as pt
-from pytaco import dense, compressed
+Scalars, which correspond to tensors with zero dimensions, can be declared
+and initialized with an arbitrary value as demonstrated below:
 
+```python
 # Declare a scalar
-aplha = pt.tensor(42.0)
+alpha = pt.tensor(42.0)
 ```
 
 # Defining Tensor Formats
 
-Conceptually, you can think of a tensor as a tree with each level (excluding the root) corresponding to a dimension of the tensor. Each path from the root to a leaf node represents a tensor coordinate and its corresponding value. Which dimension each level of the tree corresponds to is determined by the order in which dimensions of the tensor are stored.
+Conceptually, you can think of a tensor as a tree where each level (excluding
+the root) corresponds to a dimension of the tensor. Each path from the root
+to a leaf node represents the coordinates of a tensor element and its
+corresponding value. The order in which tensor dimensions are stored
+determines which dimension of the tensor each level of the tree corresponds
+to.
+
+TACO uses a novel scheme that can describe different storage formats for a
+tensor by specifying the order in which tensor dimensions are stored and
+whether each dimension is sparse or dense. A sparse (compressed) dimension
+stores only the subset of the dimension that contains nonzero values, using
+index arrays like those found in the compressed sparse row (CSR) matrix
+format. A dense dimension, on the other hand, conceptually stores both zeros
+and nonzeros. This scheme is flexible enough to express many commonly used
+tensor storage formats:
 
-taco uses a novel scheme that can describe different storage formats for any tensor by specifying the order in which tensor dimensions are stored and whether each dimension is sparse or dense. A sparse dimension stores only the subset of the dimension that contains non-zero values and is conceptually similar to the index arrays used in the compressed sparse row (CSR) matrix format, while a dense dimension stores both zeros and non-zeros. As demonstrated below, this scheme is flexibile enough to express many commonly-used matrix storage formats.
-
-You can define a new tensor storage format by creating a `pytaco.format` object. The constructor for `pytaco.format` takes as arguments a list specifying the type of each dimension and (optionally) a list specifying the order in which dimensions are to be stored, as seen below:
 ```python
 import pytaco as pt
-from pytaco import dense, compressed, format
-dm   = format([dense, dense])                   # (Row-major) dense matrix
-csr  = format([dense, compressed])              # Compressed sparse row matrix
-csc  = format([dense, compressed], [1, 0])      # Compressed sparse column matrix
-dcsr = format([compressed, compressed], [1, 0]) # Doubly compressed sparse column matrix
-```
-
-```pytaco``` provides common formats (csr, csc and csf) by default and can be used by simply typing ```pt.csr```, ```pt.csc``` or ```pt.csf```.
-
-# Tensor Datatypes
-
-Tensors can be of 10 different datatypes. The following are the possible tensor datatypes:
-
-Signed Integers:
-
-```pytaco.int8```
-
-```pytaco.int16```
-
-```pytaco.int32```
-
-```pytaco.int64```
-
-Unsigned Integers:
-
-```pytaco.uint8```
-
-```pytaco.uint16```
-
-```pytaco.uint32```
-
-```pytaco.uint64```
+from pytaco import dense, compressed
 
-Floating point precision: 
+dm   = pt.format([dense, dense])                        # (Row-major) dense matrix format
+csr  = pt.format([dense, compressed])                   # Compressed sparse row matrix format
+csc  = pt.format([dense, compressed], [1, 0])           # Compressed sparse column matrix format
+dcsc = pt.format([compressed, compressed], [1, 0])      # Doubly compressed sparse column matrix format
+csf  = pt.format([compressed, compressed, compressed])  # Compressed sparse fiber tensor format
+```
 
-```pytaco.float32``` 
+As demonstrated above, you can define a new tensor storage format by creating a
+`pytaco.format` object. This requires specifying whether each tensor dimension
+is dense or sparse as well as (optionally) the order in which dimensions should
+be stored. TACO also predefines some common tensor formats (including
+`pt.csr`, `pt.csc` and `pt.csf`) that you can use out of the box.
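
To make the dense/compressed distinction concrete, here is a minimal pure-Python sketch of what a dense-then-compressed (CSR-style) layout stores for a small matrix. It does not use or reproduce TACO's internals; the function name `pack_dense_compressed` and the array names `pos`, `crd` and `vals` are chosen for illustration only.

```python
def pack_dense_compressed(matrix):
    """Pack a list-of-lists matrix into CSR-style arrays.

    The row dimension is treated as dense: every row gets a slot in `pos`.
    The column dimension is treated as compressed: only nonzeros are kept.
    """
    pos = [0]   # pos[i]:pos[i + 1] delimits row i's entries in crd and vals
    crd = []    # column coordinate of each stored nonzero
    vals = []   # value of each stored nonzero
    for row in matrix:
        for j, value in enumerate(row):
            if value != 0:
                crd.append(j)
                vals.append(value)
        pos.append(len(crd))
    return pos, crd, vals


matrix = [
    [5.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 3.0, 0.0],
]
pos, crd, vals = pack_dense_compressed(matrix)
print(pos)   # [0, 2, 2, 4]
print(crd)   # [0, 3, 1, 2]
print(vals)  # [5.0, 1.0, 2.0, 3.0]
```

The empty middle row still occupies a slot in `pos` because the row dimension is dense, while the column dimension records only the four nonzero entries.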
 
-```pytaco.float```
+# Initializing Tensors
 
-Double precision: 
+You can initialize a tensor by calling its `insert` method to add a nonzero
+element to the tensor. The `insert` method takes two arguments: a list
+specifying the coordinates of the nonzero element to be added and the value
+to be inserted at those coordinates:
 
-```pytaco.float64``` 
+```python
+# Declare a sparse tensor
+A = pt.tensor([512, 64, 2048], compressed)
 
-```pytaco.double```
+# Set A(0, 1, 0) = 42.0
+A.insert([0, 1, 0], 42.0)
+```
 
-# Initializing Tensors
+If multiple elements are inserted at the same coordinates, they are summed 
+together:
 
-Tensors can be made by using python indexing syntax. For example, one may write the following:
 ```python
-import pytaco as pt
-from pytaco import dense, compressed
-
-# Declare a dense tensor
+# Declare a sparse tensor
 A = pt.tensor([512, 64, 2048], compressed)
 
-# Set location (0, 1, 0) in A to 42.0
-A[0, 1, 0] = 42.0
+# Set A(0, 1, 0) = 42.0 + 24.0 = 66.0
+A.insert([0, 1, 0], 42.0)
+A.insert([0, 1, 0], 24.0)
 ```
 
-The insert operator adds the inserted non-zeros to a temporary buffer. Before a tensor can actually be used in a computation, it is automatcally packed. 
-
-For most cases, this is not necessary but you may also invoke the `pack` method to compress the tensor into the storage format that was specified after all values have been inserted.
-
-NOTE: Multidimensional indexing (as used with lists) are NOT supported. For example, the following is invalid code:
+The `insert` method adds the inserted nonzero element to a temporary buffer.
+Before a tensor can be used in a computation, it must be packed into the
+storage format that was specified when the tensor was declared. TACO does
+this automatically immediately before the tensor is used in a computation,
+but you can also invoke the `pack` method manually if you need full control
+over when packing happens:
 
 ```python
-import pytaco as pt
-from pytaco import dense, compressed
+A.pack()
+```
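
The insert-then-pack behaviour described above can be mimicked with a toy class. This is only a sketch of the semantics stated in the text (buffered inserts, duplicates summed when packing); `BufferedTensor` is a hypothetical stand-in and not the `pytaco.tensor` implementation.

```python
from collections import defaultdict


class BufferedTensor:
    """Toy stand-in (not the pytaco class) that mimics insert-then-pack."""

    def __init__(self):
        self.buffer = []     # unordered (coordinates, value) pairs
        self.packed = None   # sorted, duplicate-free entries after pack()

    def insert(self, coords, value):
        self.buffer.append((tuple(coords), value))
        self.packed = None   # any earlier packing is now stale

    def pack(self):
        # Sum values that share coordinates, then sort by coordinates.
        sums = defaultdict(float)
        for coords, value in self.buffer:
            sums[coords] += value
        self.packed = sorted(sums.items())
        return self.packed


toy = BufferedTensor()
toy.insert([0, 1, 0], 42.0)
toy.insert([0, 1, 0], 24.0)
toy.insert([0, 0, 3], 1.5)
print(toy.pack())  # [((0, 0, 3), 1.5), ((0, 1, 0), 66.0)]
```

Deferring the sort and the duplicate summation to a single packing step keeps each insert cheap, which is one plausible reason for buffering inserts in this way.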
 
-# Declare a dense tensor
-A = pt.tensor([512, 64, 2048], compressed)
+You can then iterate over the nonzero elements of the tensor as follows:
 
-# INVALID STATEMENT
-A[0][1][0] = 42.0
+```python
+for elem in A:
+  print(elem)
 ```
 
-# Loading Tensors from File
+# File I/O
 
-Rather than manually invoking building a tensor, you can load tensors directly from file by calling `pytaco.read` as demonstrated below:
+Rather than manually constructing a tensor, you can load tensors directly from
+file by invoking the `pytaco.read` function:
 
 ```python
-import pytaco as pt
-from pytaco import dense, compressed, format
-
-# Load a dense-sparse-sparse tensor from file A.tns
-A = pt.read("A.tns", format([dense, compressed, compressed]))
+# Load a dense-sparse-sparse tensor from file "A.tns"
+A = pt.read("A.tns", pt.format([dense, compressed, compressed]))
 ```
 
-By default, `pytaco.read` returns a packed tensor. You can optionally pass a Boolean flag as an argument to indicate whether the returned tensor should be packed or not: 
+By default, `pytaco.read` returns a tensor that has already been packed into
+the specified storage format. You can optionally pass a Boolean flag as an
+argument to indicate whether the returned tensor should be packed or not: 
 
 ```python
-import pytaco as pt
-from pytaco import dense, compressed, format
-
-# Load an unpacked tensor from file A.tns
+# Load an unpacked tensor from file "A.tns"
-A = pt.read("A.tns", format([dense, compressed, compressed]), false)
+A = pt.read("A.tns", pt.format([dense, compressed, compressed]), False)
 ```
-NOTE: the tensor will be packed anyway before any computation is actually performed.
-
 
-Currently, taco supports loading from the following matrix and tensor file formats:
+The loaded tensor will then remain unpacked until the `pack` method is manually 
+invoked or a computation that uses the tensor is performed.
 
-* [Matrix Market (Coordinate) Format (.mtx)](http://math.nist.gov/MatrixMarket/formats.html#MMformat)
-* [Rutherford-Boeing Format (.rb)](https://www.cise.ufl.edu/research/sparse/matrices/DOC/rb.pdf)
-* [FROSTT Format (.tns)](http://frostt.io/tensors/file-formats.html)
-
-# Writing Tensors to Files
-
-You can also write a (packed) tensor directly to file by calling `pytaco.write`, as demonstrated below:
+You can also write a tensor directly to file by invoking the `pytaco.write`
+function:
 
 ```python
-import pytaco as pt
-
-A = pt.tensor([512, 64, 2048], compressed)
-A[0, 1, 0] = 42.0
-A[1, 1, 1] = 77
-pt.write("A.tns", A);  # Write tensor A to file A.tns
+# Write tensor A to file "A.tns"
+pt.write("A.tns", A)
 ```
 
-`pytaco.write` supports the same set of matrix and tensor file formats as `pytaco.read`.
+TACO supports loading tensors from and storing tensors to the following file
+formats:
+
+* [Matrix Market (Coordinate) Format (.mtx)](http://math.nist.gov/MatrixMarket/formats.html#MMformat)
+* [Rutherford-Boeing Format (.rb)](https://www.cise.ufl.edu/research/sparse/matrices/DOC/rb.pdf)
+* [FROSTT Format (.tns)](http://frostt.io/tensors/file-formats.html)
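
For readers unfamiliar with the `.tns` files mentioned above, the following sketch parses FROSTT-style text in plain Python. It assumes the layout described on the FROSTT page linked in the list: one nonzero per line, whitespace-separated 1-based coordinates followed by the value, with `#` starting a comment line. The helper `parse_tns` is hypothetical and is not part of pytaco; in practice `pytaco.read` does this work for you.

```python
def parse_tns(text):
    """Parse FROSTT-style text into {coordinates: value} with 0-based coordinates."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        *coords, value = line.split()
        # Assumption: coordinates in the file are 1-based; shift to 0-based.
        key = tuple(int(c) - 1 for c in coords)
        # Sum repeated coordinates, mirroring the insert semantics above.
        entries[key] = entries.get(key, 0.0) + float(value)
    return entries


sample = """# three nonzeros of an order-3 tensor
1 2 1 42.0
2 2 2 77.0
512 64 2048 1.0
"""
entries = parse_tns(sample)
print(entries[(0, 1, 0)])  # 42.0
print(len(entries))        # 3
```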
 
-# I/O with Numpy or Scipy
+# NumPy and SciPy I/O
 
-Tensors can be initialized with either numpy arrays or scipy sparse CSC or CSR matrices. As such, we can use the I/O from numpy and scipy and feed the data into pytaco by initializing a tensor.
+Tensors can also be initialized with either NumPy arrays or SciPy sparse (CSR 
+or CSC) matrices:
 
 ```python
 import pytaco as pt
 import numpy as np
 import scipy.sparse
 
-# Assuming matrix is CSR
+# Assuming SciPy matrix is stored in CSR
 sparse_matrix = scipy.sparse.load_npz('sparse_matrix.npz')
 
-# Pass data into taco for use
-taco_tensor = pt.from_scipy_csr(sparse_matrix)
+# Cast the matrix as a TACO tensor (also stored in CSR)
+taco_tensor = pt.from_sp_csr(sparse_matrix)
 
-# We can also load a numpy array
+# We can also load a NumPy array
 np_array = np.load('arr.npy')
 
-# And initialize a tensor from this array
-dense_tensor = pt.from_numpy_array(np_array)
+# And initialize a TACO tensor from this array
+dense_tensor = pt.from_array(np_array)
 ```
 
+We can also export TACO tensors to either NumPy arrays or SciPy sparse
+matrices:
 
+```python
+# Convert the tensor to a SciPy CSR matrix
+sparse_matrix = taco_tensor.to_sp_csr()
 
-
+# Convert the tensor to a NumPy array
+np_array = dense_tensor.to_array()
+```