Commit 9f7684d

Edited defining tensors section
1 parent 7ac0cdd commit 9f7684d

1 file changed

+114
-127
lines changed

@@ -1,209 +1,196 @@
11
# Declaring Tensors
22

3-
`pytaco.Tensor` objects correspond to mathematical tensors. You can can declare a new tensor by specifying its name, a vector with the size of each dimension and the [storage format](pytensors.md#defining-tensor-formats) that will be used to store the tensor and a [datatype](pytensors.md#tensor-datatypes):
3+
`pytaco.tensor` objects, which represent mathematical tensors, form the core of
4+
the TACO Python library. You can declare a new tensor by specifying the
5+
sizes of each dimension, the [format](pytensors.md#defining-tensor-formats)
6+
that will be used to store the tensor, and the
7+
[datatype](pytensors.md#tensor-datatypes) of the tensor's nonzero elements:
48

59
```python
6-
# Import the pytaco library
10+
# Import the TACO Python library
711
import pytaco as pt
8-
# Import the storage formats to save some typing
912
from pytaco import dense, compressed
1013

11-
# Declare a new tensor "A" of double-precision floats with dimensions
14+
# Declare a new tensor of double-precision floats with dimensions
1215
# 512 x 64 x 2048, stored as a dense-sparse-sparse tensor
13-
A = pt.tensor("A", [512, 64, 2048], pt.format([dense, compressed, compressed]), pt.float64)
14-
```
15-
16-
The name of the tensor can be omitted, in which case taco will assign an arbitrary name to the tensor:
17-
```python
18-
import pytaco as pt
19-
from pytaco import dense, compressed
20-
21-
# Declare a tensor with the same dimensions, storage format and type as before
2216
A = pt.tensor([512, 64, 2048], pt.format([dense, compressed, compressed]), pt.float64)
2317
```
2418

25-
The [datatype](pytensors.md#tensor-datatypes) can also be omitted in which case taco will default to using `pt.float32`:
26-
```python
27-
import pytaco as pt
28-
from pytaco import dense, compressed
19+
The datatype can be omitted, in which case TACO will default to using
20+
`pt.float32` to store the tensor's nonzero elements:
2921

30-
# Declare a tensor with the same dimensions and storage format as before
22+
```python
23+
# Declare the same tensor as before
3124
A = pt.tensor([512, 64, 2048], pt.format([dense, compressed, compressed]))
3225
```
3326

34-
A single format can be given to create a tensor where all dimensions have that format:
35-
```python
36-
import pytaco as pt
37-
from pytaco import dense, compressed
27+
Instead of specifying a format that is tied to the number of dimensions that a
28+
tensor has, we can simply specify whether all dimensions are dense or sparse:
3829

39-
# Declare a dense tensor
30+
```python
31+
# Declare a tensor where all dimensions are dense
4032
A = pt.tensor([512, 64, 2048], dense)
4133

42-
# Declare a compressed tensor
34+
# Declare a tensor where all dimensions are sparse
4335
B = pt.tensor([512, 64, 2048], compressed)
4436
```
4537

46-
Scalars, which are treated as order-0 tensors, can be declared and initialized with some arbitrary value as demonstrated below:
47-
```python
48-
import pytaco as pt
49-
from pytaco import dense, compressed
38+
Scalars, which correspond to tensors that have zero dimensions, can be declared
39+
and initialized with an arbitrary value as demonstrated below:
5040

41+
```python
5142
# Declare a scalar
5243
alpha = pt.tensor(42.0)
5344
```
5445

5546
# Defining Tensor Formats
5647

57-
Conceptually, you can think of a tensor as a tree with each level (excluding the root) corresponding to a dimension of the tensor. Each path from the root to a leaf node represents a tensor coordinate and its corresponding value. Which dimension each level of the tree corresponds to is determined by the order in which dimensions of the tensor are stored.
48+
Conceptually, you can think of a tensor as a tree where each level (excluding
49+
the root) corresponds to a dimension of the tensor. Each path from the root
50+
to a leaf node represents the coordinates of a tensor element and its
51+
corresponding value. Which dimension of the tensor each level of the tree
52+
corresponds to is determined by the order in which tensor dimensions are
53+
stored.
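
The tree view described above can be sketched in plain Python. This is a conceptual illustration only, not how TACO stores data internally; `nonzeros` and `build_tree` are hypothetical names that are not part of `pytaco`, and the example assumes dimensions are stored in their natural order:

```python
# Conceptual sketch only: a 2 x 3 matrix viewed as a coordinate tree.
# `nonzeros` and `build_tree` are hypothetical names, not pytaco API.
nonzeros = {(0, 0): 5.0, (0, 2): 7.0, (1, 1): 3.0}


def build_tree(entries):
    """Nest coordinates so that tree level k holds the k-th stored dimension."""
    root = {}
    for coords, value in entries.items():
        node = root
        # Walk (or create) one tree level per leading coordinate.
        for c in coords[:-1]:
            node = node.setdefault(c, {})
        # The last coordinate is the leaf level and maps to the value.
        node[coords[-1]] = value
    return root


tree = build_tree(nonzeros)
# Each root-to-leaf path spells out one coordinate and ends at its value.
print(tree)
```

Storing the dimensions in a different order would correspond to permuting each coordinate tuple before building the tree.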
54+
55+
TACO uses a novel scheme that can describe different storage formats for a
56+
tensor by specifying the order in which tensor dimensions are stored and
57+
whether each dimension is sparse or dense. A sparse (compressed) dimension
58+
stores only the subset of the dimension that contains non-zero values, using
59+
index arrays like those found in the compressed sparse row (CSR) matrix format.
60+
A dense dimension, on the other hand, conceptually stores both zeros and
61+
non-zeros. This scheme is flexible enough to express many commonly used
62+
tensor storage formats:
5863

59-
taco uses a novel scheme that can describe different storage formats for any tensor by specifying the order in which tensor dimensions are stored and whether each dimension is sparse or dense. A sparse dimension stores only the subset of the dimension that contains non-zero values and is conceptually similar to the index arrays used in the compressed sparse row (CSR) matrix format, while a dense dimension stores both zeros and non-zeros. As demonstrated below, this scheme is flexibile enough to express many commonly-used matrix storage formats.
60-
61-
You can define a new tensor storage format by creating a `pytaco.format` object. The constructor for `pytaco.format` takes as arguments a list specifying the type of each dimension and (optionally) a list specifying the order in which dimensions are to be stored, as seen below:
6264
```python
6365
import pytaco as pt
64-
from pytaco import dense, compressed, format
65-
dm = format([dense, dense]) # (Row-major) dense matrix
66-
csr = format([dense, compressed]) # Compressed sparse row matrix
67-
csc = format([dense, compressed], [1, 0]) # Compressed sparse column matrix
68-
dcsr = format([compressed, compressed], [1, 0]) # Doubly compressed sparse column matrix
69-
```
70-
71-
```pytaco``` provides common formats (csr, csc and csf) by default and can be used by simply typing ```pt.csr```, ```pt.csc``` or ```pt.csf```.
72-
73-
# Tensor Datatypes
74-
75-
Tensors can be of 10 different datatypes. The following are the possible tensor datatypes:
76-
77-
Signed Integers:
78-
79-
```pytaco.int8```
80-
81-
```pytaco.int16```
82-
83-
```pytaco.int32```
84-
85-
```pytaco.int64```
86-
87-
Unsigned Integers:
88-
89-
```pytaco.uint8```
90-
91-
```pytaco.uint16```
92-
93-
```pytaco.uint32```
94-
95-
```pytaco.uint64```
66+
from pytaco import dense, compressed
9667

97-
Floating point precision:
68+
dm = pt.format([dense, dense]) # (Row-major) dense matrix format
69+
csr = pt.format([dense, compressed]) # Compressed sparse row matrix format
70+
csc = pt.format([dense, compressed], [1, 0]) # Compressed sparse column matrix format
71+
dcsc = pt.format([compressed, compressed], [1, 0]) # Doubly compressed sparse column matrix format
72+
csf = pt.format([compressed, compressed, compressed]) # Compressed sparse fiber tensor format
73+
```
9874

99-
```pytaco.float32```
75+
As demonstrated above, you can define a new tensor storage format by creating a
76+
`pytaco.format` object. This requires specifying whether each tensor dimension
77+
is dense or sparse as well as (optionally) the order in which dimensions should
78+
be stored. TACO also predefines some common tensor formats (including
79+
```pt.csr```, ```pt.csc``` and ```pt.csf```) that you can use out of the box.
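
To make the role of the per-dimension types and the dimension ordering more concrete, here is a minimal pure-Python sketch of the index arrays that a (dense, compressed) format implies for a matrix. It does not use `pytaco`, and the helper name `pack_dense_compressed` and the array names `pos`, `crd` and `vals` are assumptions made for illustration. With the default ordering it yields CSR-style arrays, and with the ordering `(1, 0)` it yields CSC-style arrays:

```python
def pack_dense_compressed(entries, shape, mode_order=(0, 1)):
    """Build CSR-like arrays for a matrix stored as (dense, compressed).

    Hypothetical helper, not pytaco API. `mode_order` names which matrix
    dimension is stored first (outer, dense) and second (inner, compressed).
    """
    outer, inner = mode_order
    # Reorder each coordinate so the first stored dimension comes first.
    reordered = sorted(((c[outer], c[inner]), v) for c, v in entries.items())

    pos = [0] * (shape[outer] + 1)  # where each outer slice starts in crd
    crd, vals = [], []              # inner coordinates and their values
    for (o, i), v in reordered:
        pos[o + 1] += 1
        crd.append(i)
        vals.append(v)
    # Turn the per-slice counts into running offsets.
    for k in range(shape[outer]):
        pos[k + 1] += pos[k]
    return pos, crd, vals


entries = {(0, 0): 5.0, (0, 2): 7.0, (1, 1): 3.0}
csr_arrays = pack_dense_compressed(entries, (2, 3))          # row-major
csc_arrays = pack_dense_compressed(entries, (2, 3), (1, 0))  # column-major
print(csr_arrays)
print(csc_arrays)
```

The dense outer dimension needs no coordinate array of its own because every index in it is represented, whereas the compressed inner dimension records only the coordinates that hold values.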
10080

101-
```pytaco.float```
81+
# Initializing Tensors
10282

103-
Double precision:
83+
You can initialize a tensor by calling its `insert` method to
85+
add a nonzero element to the tensor. The `insert` method takes two arguments:
86+
a list specifying the coordinates of the nonzero element to be added and the
87+
value to be inserted at that coordinate:
10488

105-
```pytaco.float64```
89+
```python
90+
# Declare a sparse tensor
91+
A = pt.tensor([512, 64, 2048], compressed)
10692

107-
```pytaco.double```
93+
# Set A(0, 1, 0) = 42.0
94+
A.insert([0, 1, 0], 42.0)
95+
```
10896

109-
# Initializing Tensors
97+
If multiple elements are inserted at the same coordinates, they are summed
98+
together:
11099

111-
Tensors can be made by using python indexing syntax. For example, one may write the following:
112100
```python
113-
import pytaco as pt
114-
from pytaco import dense, compressed
115-
116-
# Declare a dense tensor
101+
# Declare a sparse tensor
117102
A = pt.tensor([512, 64, 2048], compressed)
118103

119-
# Set location (0, 1, 0) in A to 42.0
120-
A[0, 1, 0] = 42.0
104+
# Set A(0, 1, 0) = 42.0 + 24.0 = 66.0
105+
A.insert([0, 1, 0], 42.0)
106+
A.insert([0, 1, 0], 24.0)
121107
```
122108

123-
The insert operator adds the inserted non-zeros to a temporary buffer. Before a tensor can actually be used in a computation, it is automatcally packed.
124-
125-
For most cases, this is not necessary but you may also invoke the `pack` method to compress the tensor into the storage format that was specified after all values have been inserted.
126-
127-
NOTE: Multidimensional indexing (as used with lists) are NOT supported. For example, the following is invalid code:
109+
The `insert` method adds the inserted nonzero element to a temporary buffer.
110+
Before a tensor can actually be used in a computation though, the `pack` method
111+
must be invoked to pack the tensor into the storage format that was specified
112+
when the tensor was first declared. TACO will automatically do this
113+
immediately before the tensor is used in a computation. You can also manually
114+
invoke `pack` if you need full control over when exactly that is done:
128115

129116
```python
130-
import pytaco as pt
131-
from pytaco import dense, compressed
117+
A.pack()
118+
```
132119

133-
# Declare a dense tensor
134-
A = pt.tensor([512, 64, 2048], compressed)
120+
You can then iterate over the nonzero elements of the tensor as follows:
135121

136-
# INVALID STATEMENT
137-
A[0][1][0] = 42.0
122+
```python
123+
for elem in A:
124+
    print(elem)
138125
```
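
The insert, pack and iterate workflow can be modeled with a small pure-Python class. This is only a sketch of the behavior described above (buffered insertion, duplicates summed, packing on demand). `BufferedTensor` is a hypothetical stand-in rather than the `pytaco` implementation, and the `(coordinates, value)` pairs it yields are an assumption about what iteration could look like:

```python
class BufferedTensor:
    """Hypothetical model of a tensor with a temporary insertion buffer."""

    def __init__(self):
        self.buffer = []    # raw (coords, value) pairs, in insertion order
        self.packed = None  # sorted, duplicate-free entries after pack()

    def insert(self, coords, value):
        self.buffer.append((tuple(coords), value))
        self.packed = None  # any previously packed data is now stale

    def pack(self):
        # Sum values that were inserted at the same coordinates.
        summed = {}
        for coords, value in self.buffer:
            summed[coords] = summed.get(coords, 0.0) + value
        self.packed = sorted(summed.items())

    def __iter__(self):
        # Pack automatically if the caller has not done so already.
        if self.packed is None:
            self.pack()
        return iter(self.packed)


A = BufferedTensor()
A.insert([0, 1, 0], 42.0)
A.insert([0, 1, 0], 24.0)
A.insert([1, 1, 1], 77.0)
elements = list(A)
print(elements)
```

Deferring the work to `pack` keeps each insertion cheap, since the sorted, compressed layout is built once rather than updated after every inserted element.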
139126

140-
# Loading Tensors from File
127+
# File I/O
141128

142-
Rather than manually invoking building a tensor, you can load tensors directly from file by calling `pytaco.read` as demonstrated below:
129+
Rather than manually constructing a tensor, you can load tensors directly from
130+
a file by invoking the `pytaco.read` function:
143131

144132
```python
145-
import pytaco as pt
146-
from pytaco import dense, compressed, format
147-
148-
# Load a dense-sparse-sparse tensor from file A.tns
149-
A = pt.read("A.tns", format([dense, compressed, compressed]))
133+
# Load a dense-sparse-sparse tensor from file "A.tns"
134+
A = pt.read("A.tns", pt.format([dense, compressed, compressed]))
150135
```
151136

152-
By default, `pytaco.read` returns a packed tensor. You can optionally pass a Boolean flag as an argument to indicate whether the returned tensor should be packed or not:
137+
By default, `pytaco.read` returns a tensor that has already been packed into
138+
the specified storage format. You can optionally pass a Boolean flag as an
139+
argument to indicate whether the returned tensor should be packed or not:
153140

154141
```python
155-
import pytaco as pt
156-
from pytaco import dense, compressed, format
157-
158-
# Load an unpacked tensor from file A.tns
142+
# Load an unpacked tensor from file "A.tns"
159143
A = pt.read("A.tns", pt.format([dense, compressed, compressed]), False)
160144
```
161-
NOTE: the tensor will be packed anyway before any computation is actually performed.
162-
163145

164-
Currently, taco supports loading from the following matrix and tensor file formats:
146+
The loaded tensor will then remain unpacked until the `pack` method is manually
147+
invoked or a computation that uses the tensor is performed.
165148

166-
* [Matrix Market (Coordinate) Format (.mtx)](http://math.nist.gov/MatrixMarket/formats.html#MMformat)
167-
* [Rutherford-Boeing Format (.rb)](https://www.cise.ufl.edu/research/sparse/matrices/DOC/rb.pdf)
168-
* [FROSTT Format (.tns)](http://frostt.io/tensors/file-formats.html)
169-
170-
# Writing Tensors to Files
171-
172-
You can also write a (packed) tensor directly to file by calling `pytaco.write`, as demonstrated below:
149+
You can also write a tensor directly to file by invoking the `pytaco.write`
150+
function:
173151

174152
```python
175-
import pytaco as pt
176-
177-
A = pt.tensor([512, 64, 2048], compressed)
178-
A[0, 1, 0] = 42.0
179-
A[1, 1, 1] = 77
180-
pt.write("A.tns", A); # Write tensor A to file A.tns
153+
# Write tensor A to file "A.tns"
154+
pt.write("A.tns", A)
181155
```
182156

183-
`pytaco.write` supports the same set of matrix and tensor file formats as `pytaco.read`.
157+
TACO supports loading tensors from and storing tensors to the following file
158+
formats:
159+
160+
* [Matrix Market (Coordinate) Format (.mtx)](http://math.nist.gov/MatrixMarket/formats.html#MMformat)
161+
* [Rutherford-Boeing Format (.rb)](https://www.cise.ufl.edu/research/sparse/matrices/DOC/rb.pdf)
162+
* [FROSTT Format (.tns)](http://frostt.io/tensors/file-formats.html)
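
For a sense of what a `.tns` file contains, the sketch below writes and parses the FROSTT text layout in plain Python, assuming the commonly documented convention of one nonzero per line with 1-based coordinates followed by the value and `#` starting a comment line. The helpers `write_tns` and `read_tns` are hypothetical and are not a substitute for `pytaco.read` and `pytaco.write`:

```python
import io


def write_tns(stream, entries):
    """Write {coords: value} as FROSTT-style lines (hypothetical helper)."""
    for coords, value in sorted(entries.items()):
        # Shift from 0-based Python coordinates to 1-based file coordinates.
        fields = [str(c + 1) for c in coords] + [repr(float(value))]
        stream.write(" ".join(fields) + "\n")


def read_tns(stream):
    """Parse FROSTT-style lines back into {coords: value}."""
    entries = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        *coords, value = line.split()
        key = tuple(int(c) - 1 for c in coords)
        # Sum repeated coordinates, mirroring the insert semantics above.
        entries[key] = entries.get(key, 0.0) + float(value)
    return entries


buf = io.StringIO()
write_tns(buf, {(0, 1, 0): 42.0, (1, 1, 1): 77.0})
text = buf.getvalue()
round_trip = read_tns(io.StringIO(text))
print(text)
```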
184163

185-
# I/O with Numpy or Scipy
164+
# NumPy and SciPy I/O
186165

187-
Tensors can be initialized with either numpy arrays or scipy sparse CSC or CSR matrices. As such, we can use the I/O from numpy and scipy and feed the data into pytaco by initializing a tensor.
166+
Tensors can also be initialized with either NumPy arrays or SciPy sparse (CSR
167+
or CSC) matrices:
188168

189169
```python
190170
import pytaco as pt
191171
import numpy as np
192172
import scipy.sparse
193173

194-
# Assuming matrix is CSR
174+
# Assuming SciPy matrix is stored in CSR
195175
sparse_matrix = scipy.sparse.load_npz('sparse_matrix.npz')
196176

197-
# Pass data into taco for use
198-
taco_tensor = pt.from_scipy_csr(sparse_matrix)
177+
# Cast the matrix as a TACO tensor (also stored in CSR)
178+
taco_tensor = pt.from_sp_csr(sparse_matrix)
199179

200-
# We can also load a numpy array
180+
# We can also load a NumPy array
201181
np_array = np.load('arr.npy')
202182

203-
# And initialize a tensor from this array
204-
dense_tensor = pt.from_numpy_array(np_array)
183+
# And initialize a TACO tensor from this array
184+
dense_tensor = pt.from_array(np_array)
205185
```
206186

187+
We can also export TACO tensors to either NumPy arrays or SciPy sparse
188+
matrices:
207189

190+
```python
191+
# Convert the tensor to a SciPy CSR matrix
192+
sparse_matrix = taco_tensor.to_sp_csr()
208193

209-
194+
# Convert the tensor to a NumPy array
195+
np_array = dense_tensor.to_array()
196+
```
