Conversation
Tutorials look good :)
Added a few minor comments
docs/tutorials/sparse/rowsparse.md (outdated)

``dense[rsp.indices[i], :, :, :, ...] = rsp.data[i, :, :, :, ...]``

A RowSparseNDArray is typically used to represent non-zero row slices of a large NDArray of shape [LARGE0, D1, .. , Dn] where LARGE0 >> D0 and most row slices are zeros.
The naming of LARGE0 just felt slightly odd
Hmm. Any suggestion for better names?
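For reference, the row-slice expansion the quoted passage describes (`dense[rsp.indices[i]] = rsp.data[i]`) amounts to the following, sketched here in plain Python with made-up toy shapes (the helper name is illustrative, not MXNet API):

```python
def rowsparse_to_dense(data, indices, shape):
    """Expand a row-sparse representation: data[i] holds the full
    row slice stored at row indices[i] of the dense result."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for i, row_idx in enumerate(indices):
        dense[row_idx] = list(data[i])
    return dense

# Two stored row slices out of a 4-row array.
print(rowsparse_to_dense([[1, 2], [3, 4]], [0, 2], (4, 2)))
# -> [[1, 2], [0, 0], [3, 4], [0, 0]]
```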
docs/tutorials/sparse/rowsparse.md (outdated)

## Inspecting Arrays

Also as with `CSRNDArray`, you can inspect the contents of a `RowSparseNDArray` by filling
Also can be dropped
docs/tutorials/sparse/rowsparse.md (outdated)

```python
{'b is a': b is a, 'b.asnumpy()':b.asnumpy(), 'c.asnumpy()':c.asnumpy(), 'd.asnumpy()':d.asnumpy()}
```

If the storage types of source array and destination array do not match,
The normal copyto function requires the shapes of arrays to be same, right?
How does this rowSparse.copyto work when storage types are different since that would require different amounts of memory. I think it would be good to clarify what happens here.
The source array will be converted if its storage type does not match. I'll add some clarification here
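For readers of this thread, the semantics described in the reply — shapes must match, and the source is converted to the destination's storage type before the copy — can be sketched in plain Python (the function names here are made up for illustration, not MXNet API):

```python
def dense_to_csr(dense):
    # Build the (data, indices, indptr) triple for a dense 2-D list.
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                data.append(v)
                indices.append(j)
        indptr.append(len(data))
    return data, indices, indptr

def copyto_sketch(src_dense, dst_stype):
    # Mimics the described copyto behaviour: the source representation
    # is converted to the destination's storage type before copying.
    if dst_stype == 'csr':
        return dense_to_csr(src_dense)
    return [row[:] for row in src_dense]  # 'default' (dense) copy

print(copyto_sketch([[7, 0], [0, 9]], 'csr'))
# -> ([7, 9], [0, 1], [0, 1, 2])
```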
docs/tutorials/sparse/rowsparse.md (outdated)

* If sparse outputs are provided, MXNet will convert the dense outputs generated by the dense operator into the provided sparse format.

* For operators that don't specialize in sparse arrays, you can still use them with sparse inputs with some performance penalty.
The first point is repeated
Yeah I noticed that, too. Will remove it
Actually, they're different...
docs/tutorials/sparse/rowsparse.md (outdated)

* For operators that don't specialize in sparse arrays, you can still use them with sparse inputs with some performance penalty.
In MXNet, dense operators require all inputs and outputs to be in the dense format.
Merging of these two sentences as one bullet point is odd.
I'd actually suggest merging all points as a paragraph, because there is a flow between them like that.
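For readers following this thread: the fallback behaviour these bullets describe (densify sparse inputs, run the dense operator, convert the result back to the requested sparse format) looks roughly like this toy sketch, using 1-D (index, value) pairs rather than MXNet's actual internals:

```python
def to_dense(sparse_pairs, length):
    # sparse_pairs: list of (index, value) for the non-zeros.
    out = [0] * length
    for i, v in sparse_pairs:
        out[i] = v
    return out

def to_sparse(dense):
    return [(i, v) for i, v in enumerate(dense) if v != 0]

def fallback_add(a_sparse, b_sparse, length):
    # Densify inputs, apply the dense operator, re-sparsify the output.
    dense_out = [x + y for x, y in zip(to_dense(a_sparse, length),
                                       to_dense(b_sparse, length))]
    return to_sparse(dense_out)

print(fallback_add([(0, 1)], [(0, 2), (3, 5)], 4))
# -> [(0, 3), (3, 5)]
```

The extra densify/sparsify round trips are where the "performance penalty" in the quoted bullet comes from.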
You can instantiate an executor from a sparse symbol by using the `simple_bind` method,
which allocate zeros to all free variables according to their storage types.
The executor provides `forward` method for evaluation and an attribute
Is evaluation the right word to use here? As I understand calling this function executes the commands ( to allocate zeros to those variables in this case), isn't it?
which sentence are you referring to? bind will allocate the memory, forward will perform evaluation
docs/tutorials/sparse/train.md (outdated)

```python
# Create module
mod = mx.mod.Module(symbol=lro, data_names=['data'], label_names=['label'])
# Allocate memory by given the input data and label shapes
```
typo: given -> giving
Nicely written tutorials.
docs/tutorials/sparse/csr.md (outdated)

and each row is sparse (i.e. with only a few nonzeros).**

## Advantages of Compressed Sparse Row NDArray (CSRNDArray)
For matrices of high sparsity (e.g. ~1% non-zeros), there are two primary advantages of `CSRNDArray` over the existing `NDArray`:
You may want to introduce both terms, sparsity and density. Say, For matrices of high sparsity, also known as low density (e.g. ~1% non-zeros = ~1% density), there are ...
docs/tutorials/sparse/csr.md (outdated)

- certain operations are much faster (e.g. matrix-vector multiplication)

You may be familiar with the CSR storage format in [SciPy](https://www.scipy.org/) and will note the similarities in MXNet's implementation. However there are some additional competitive features in `CSRNDArray` inherited from `NDArray`, such as lazy evaluation and automatic parallelization that are not available in SciPy's flavor of CSR.
It is possible that some newbie user may read the Sparse tutorial without having background on the terms. Hmm, do we want to define the terms here, even though one could argue that these are easy-to-understand terms?
Lazy evaluation means that any operations on the CSRNDArray are not performed immediately, but are delayed until their evaluation is specifically requested. Automatic parallelization means that all operations are automatically executed in parallel with each other without requiring explicit mention.
Good point. These are mentioned in the NDArray tutorial which I listed as a prerequisite. I'll add the description nonetheless and add a link for further reading
BTW I find "async non-blocking" more appropriate/accurate than "lazy evaluation", since the evaluation is actually not delayed. https://mxnet.incubator.apache.org/tutorials/basic/ndarray.html#lazy-evaluation-and-automatic-parallelization
I'd suggest we avoid using the term lazy eval in the original NDArray tutorial, too. It's misleading.
docs/tutorials/sparse/csr.md (outdated)

The `indptr` array is what will help identify the rows where the data appears. It stores the index into `data` of the first non-zero element number of each row of the matrix. This array always starts with 0 (reasons can be explored later), so indptr[0] is 0. Each subsequent value in the array is the aggregate number of non-zero elements up to that row. Looking at the first row of the matrix you can see two non-zero values, so indptr[1] is 2. The next row contains all zeros, so the aggregate is still 2, so indptr[2] is 2. Finally, you see the last row contains one non-zero element bring the aggregate to 3, so indptr[3] is 3. Your result:

indptr = [0, 2, 2, 3]
Here is a pictorial summary of the above explanation:

```
          C0  C1  C2  C3
R0 -> [[  7,  0,  8,  0]
R1 ->  [  0,  0,  0,  0]
R2 ->  [  0,  9,  0,  0]]

data    = [7, 8, 9]                        (Sentinel)
indices = [C0, C2, C1]     = [0, 2, 1]
indptr  = [D0, D2, D2, D3] = [0, 2, 2, 3]  (D0..D3 are positions into data; row-end => indptr)
```
We haven't talked about Sentinel. Can you describe that further?
Yeah I am also thinking about putting something visual like this:
https://image.slidesharecdn.com/effectivenumericalcomputationinnumpyandscipy-140913005204-phpapp01/95/effective-numerical-computation-in-numpy-and-scipy-18-638.jpg?cb=1410824598
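In lieu of a picture, the indptr walk in the quoted paragraph can also be verified mechanically; a minimal pure-Python reconstruction (no MXNet needed, helper name is illustrative):

```python
def csr_to_dense(data, indices, indptr, shape):
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        # indptr[r]:indptr[r+1] is the slice of data/indices for row r.
        for k in range(indptr[r], indptr[r + 1]):
            dense[r][indices[k]] = data[k]
    return dense

print(csr_to_dense([7, 8, 9], [0, 2, 1], [0, 2, 2, 3], (3, 4)))
# -> [[7, 0, 8, 0], [0, 0, 0, 0], [0, 9, 0, 0]]
```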
docs/tutorials/sparse/csr.md (outdated)

```python
data_list = [7, 8, 9]
indptr_list = [0, 2, 2, 3]
indices_list = [0, 2, 1]
a = mx.nd.sparse.csr_matrix(data_list, indptr_list, indices_list, shape)
```
It may help to have this example too, saying, that the user does not have to do the calculations to create a CSRNDArray.
```python
darr = np.array([[7., 0., 8., 0.], [0., 0., 0., 0.], [0., 9., 0., 0.]])
dmx = mx.nd.array(darr)
csr = dmx.tostype('csr')
print({'csr': csr, 'csr.data': csr.data,
       'csr.indices': csr.indices, 'csr.indptr': csr.indptr})
```
I like this approach. In fact, we should address this towards the beginning by showing how it describes the data and even extend it to show how these values can be used to regenerate the full dataset.
Also, I think it would help to build up the scenarios a bit more. Using only 0's and 1's to show that duplicated data is fine. (There was some content before that mentioned you couldn't duplicate data, but it was changed to refer to the indices.) Without an explicit example the reader may assume incorrectly that data should be unique. Then, as we move to this example, one or two duplicated numbers would be good as well as a larger dataset. We should have an example that demonstrates the compression factor since the small ones we currently have don't show any savings of space.
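A quick sketch of the kind of example suggested here — repeated values in `data` are fine, and a larger matrix makes the storage saving visible (plain Python; the sizes and helper name are arbitrary, for illustration only):

```python
def dense_to_csr(dense):
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                data.append(v)
                indices.append(j)
        indptr.append(len(data))
    return data, indices, indptr

# 100x100 matrix with the value 1 repeated in 100 cells (~1% density).
dense = [[0] * 100 for _ in range(100)]
for i in range(100):
    dense[i][i] = 1  # duplicated data values are allowed

data, indices, indptr = dense_to_csr(dense)
stored = len(data) + len(indices) + len(indptr)
print(stored, 100 * 100)
# -> 301 10000  (301 stored numbers vs 10000 dense cells)
```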
docs/tutorials/sparse/csr.md (outdated)

```python
shape = (3, 4)
data_list = [7, 8, 9]
indptr_list = [0, 2, 2, 3]
indices_list = [0, 2, 1]
```
Picky: I would suggest keeping the order here the same as the order when these terms were explained, that is, indices_list followed by indptr_list.
docs/tutorials/sparse/train.md (outdated)

## Variables

Variables are placeholder for arrays. We can use them to hold sparse arrays, too.
Remove the comma between arrays and too.
docs/tutorials/sparse/train.md (outdated)

```python
var_exec.arg_dict['b'][:] = mx.nd.ones(shape).tostype('csr')
```
FYI: NameError: name 'var_exec' is not defined
docs/tutorials/sparse/train.md (outdated)

### Storage Type Inference

What will be the output storage types of sparse symbols? In MXNet, for any sparse symbol, the result storage types are inferred based on storage types of inputs.
You can read the [Sparse Symbol API](mxnet.io/api/python/symbol/sparse.html) documentation to find what output storage types are. In the example below we will try out the storage types introduced in the Row Sparse and Compressed Sparse Row tutorials: `default` (dense), `csr`, and `row_sparse`.
Add http:// to the URL.
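The inference rule the quoted section describes (output stypes follow from the operator and its input stypes) can be pictured with a toy lookup table; the entries below are illustrative only, not MXNet's actual dispatch rules, which the Sparse Symbol API docs spell out:

```python
# Illustrative inference table: (op, input stypes) -> output stype.
STYPE_RULES = {
    ('elemwise_add', ('csr', 'csr')): 'csr',
    ('elemwise_add', ('row_sparse', 'row_sparse')): 'row_sparse',
    ('dot', ('csr', 'default')): 'default',
}

def infer_stype(op, in_stypes):
    # Fall back to dense ('default') when no sparse rule is registered.
    return STYPE_RULES.get((op, tuple(in_stypes)), 'default')

print(infer_stype('elemwise_add', ['csr', 'csr']))      # -> csr
print(infer_stype('elemwise_add', ['csr', 'default']))  # -> default
```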
and [mx.io.NDArrayIter](https://mxnet.incubator.apache.org/versions/master/api/python/io.html#mxnet.io.NDArrayIter)
support loading sparse data in CSR format. In this example, we'll use the `NDArrayIter`.

You may see some warnings from SciPy. You don't need to worry about those for this example.
Hmm, I don't find use of SciPy below. What is the reason to see possible warnings from SciPy?
mx.test_utils.rand_ndarray may call scipy.random if scipy is available. The random function in scipy converts coo to csr. For some versions of scipy, its random function uses a deprecated argument when performing the conversion. This is not something we can control.
This is the output you get from running the block below.

```
/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/sparse/coo.py:200: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`.
  if np.rank(self.data) != 1 or np.rank(self.row) != 1 or np.rank(self.col) != 1:
/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/sparse/compressed.py:130: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`.
  if np.rank(self.data) != 1 or np.rank(self.indices) != 1 or np.rank(self.indptr) != 1:
```
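Setting the warnings aside, the batching an iterator performs over CSR-format data amounts to slicing row ranges of `indptr` and re-basing them; a pure-Python sketch (the function name is made up, not MXNet API):

```python
def iter_csr_batches(data, indices, indptr, batch_rows):
    """Yield (data, indices, indptr) for consecutive row batches,
    the way an iterator over CSR-format data slices rows."""
    n_rows = len(indptr) - 1
    for start in range(0, n_rows, batch_rows):
        stop = min(start + batch_rows, n_rows)
        lo, hi = indptr[start], indptr[stop]
        # Re-base indptr so each batch starts at 0.
        yield (data[lo:hi], indices[lo:hi],
               [p - lo for p in indptr[start:stop + 1]])

print(list(iter_csr_batches([7, 8, 9], [0, 2, 1], [0, 2, 2, 3], 2)))
# -> [([7, 8], [0, 2], [0, 2, 2]), ([9], [1], [0, 1])]
```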
### Training the model with multiple machines

To train a sparse model with multiple machines, please refer to the example in [mxnet/example/sparse/](https://github.com/apache/incubator-mxnet/tree/master/example/sparse)
All the 3 tutorials are well-written!
data = [7, 8, 9]

The `indices` array stores the column index for each non-zero element in `data`. As you cycle through the data array, starting with 7, you can see it is in column 0. Then looking at 8, you can see it is in column 2. Lastly 9 is in column 1. Your result:
Does it make sense to mention here that column indices for a specific row will be looked at in sorted order?
Yes that's mentioned a few lines below
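The per-row ordering discussed here can also be checked mechanically; a small pure-Python helper (illustrative only, not MXNet API):

```python
def indices_sorted_per_row(indices, indptr):
    # Within each row, column indices must appear in ascending order.
    for r in range(len(indptr) - 1):
        row = indices[indptr[r]:indptr[r + 1]]
        if row != sorted(row):
            return False
    return True

print(indices_sorted_per_row([0, 2, 1], [0, 2, 2, 3]))  # -> True
print(indices_sorted_per_row([2, 0, 1], [0, 2, 2, 3]))  # -> False
```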
docs/tutorials/sparse/csr.md (outdated)

```python
{'b':b, 'c':c}
```

Note that accessing individual elements in a CSRNDArray is currently not supported.
Should we also mention that slicing across axis 1 is not supported.
docs/tutorials/sparse/row_sparse.md (outdated)

To create a RowSparseNDArray on gpu, we need to explicitly specify the context:

**Note** If a GPU is not available, an error will be reported in the following section. In order to execute it a cpu, set gpu_device to mx.cpu().
In order to execute it a cpu -> In order to execute it on a cpu
* add three sparse tutorials
* update csr and rsp
* update train.md
* fix wrong var name
* first round of edits
* first edits and removed outputs
* first round of edits
* first round of edits
* updates based on CR comments
* update tutorial link. It wont work until the PR is merged
* update based on cr comments
Preview at http://ec2-54-187-32-207.us-west-2.compute.amazonaws.com/tutorials/