[DOCS] Fix Sphinx Warnings (RST indent, cross-ref, and image scale) #4920

Merged
Merged 2 commits on Feb 20, 2020
1 change: 1 addition & 0 deletions docs/api/python/index.rst
@@ -21,6 +21,7 @@ Python API
.. toctree::
:maxdepth: 2

tvm
runtime
ndarray
error
5 changes: 0 additions & 5 deletions docs/api/python/relay/op.rst
@@ -53,8 +53,3 @@ tvm.relay.op

.. automodule:: tvm.relay.op.nn
:members:

.. automodule:: tvm.relay.op.vision.multibox
:members:

.. autofunction:: tvm.relay.vision.nms
2 changes: 1 addition & 1 deletion docs/api/python/runtime.rst
@@ -27,7 +27,7 @@ tvm.runtime

.. autoclass:: tvm.runtime.PackedFunc
:members:
:inheritated-members:
:inherited-members:

.. autofunction:: tvm.register_func

7 changes: 3 additions & 4 deletions docs/contribute/pull_request.rst
@@ -29,12 +29,11 @@ This is a quick guide to submit a pull request, please also refer to the detaile
git rebase upstream/master

- Make sure the code style check passes by typing the following command, and that all the existing test-cases pass.
- ``docker/bash.sh tvmai/ci-lint ./tests/scripts/task_lint.sh``
(Note: You must install docker beforehand so you can run a docker image.)
- ``docker/bash.sh tvmai/ci-lint ./tests/scripts/task_lint.sh``. (Note: You must install docker beforehand so you can run a docker image.)
- Add test-cases to cover the new features or bugfix the patch introduces.
- Document the code you wrote, see more at :ref:`doc_guide`
- Send the pull request, fix the problems reported by automatic checks.
Request code reviews from other contributors and improves your patch according to feedbacks.
- Send the pull request and fix the problems reported by automatic checks.
- Request code reviews from other contributors and improve your patch according to the feedback.

- To get your code reviewed quickly, we encourage you to help review others' code so they can do the favor in return.
- Code review is a shepherding process that helps to improve contributor's code quality.
3 changes: 2 additions & 1 deletion docs/deploy/index.rst
@@ -56,7 +56,6 @@ embedded devices is through TVM's RPC API.
Here are the links to the related tutorials.

- :ref:`tutorial-cross-compilation-and-rpc`
- :ref:`tutorial-deploy-model-on-mali-gpu`
- :ref:`tutorial-deploy-model-on-rasp`

After you finished tuning and benchmarking, you might need to deploy the model on the
@@ -68,3 +67,5 @@ target device without relying on RPC. See the following resources on how to do s
cpp_deploy
android
integrate
aocl_fpga
aws_fpga
11 changes: 0 additions & 11 deletions docs/dev/inferbound.rst
@@ -118,13 +118,11 @@ In the Operation class declaration above, we can see that each operation also ha

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/stage_graph.png
:align: center
:scale: 70%

InferBound makes one pass through the graph, visiting each stage exactly once. InferBound starts from the output stages (i.e., the solid blue nodes in the graph above), and moves upwards (in the opposite direction of the edges). This is achieved by performing a reverse topological sort on the nodes of the graph. Therefore, when InferBound visits a stage, each of its consumer stages has already been visited.
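The reverse topological sort described above can be sketched with Kahn's algorithm on reversed edges: a stage becomes ready once all of its consumers have been visited. This is a minimal illustration in Python, not TVM's actual implementation; the stage graph is assumed to be given as a mapping from each stage name to its consumer stages.

```python
from collections import deque

def inferbound_order(consumers):
    """Return a visit order in which every stage is visited only after all
    of its consumer stages, mirroring InferBound's traversal.

    `consumers` maps each stage to the stages that consume its output
    (edges point producer -> consumer).
    """
    stages = list(consumers)
    # For each stage, count how many of its consumers are still unvisited.
    pending = {s: len(consumers[s]) for s in stages}
    # Output stages (no consumers) are ready immediately.
    ready = deque(s for s in stages if pending[s] == 0)
    order = []
    while ready:
        s = ready.popleft()
        order.append(s)
        # Visiting s unblocks its producers: stages that list s as a consumer.
        for p in stages:
            if s in consumers[p]:
                pending[p] -= 1
                if pending[p] == 0:
                    ready.append(p)
    return order
```

For a chain `A -> B -> C`, the output stage `C` is visited first, then `B`, then `A`, matching the upward traversal in the figure below.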

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/inferbound_traversal.png
:align: center
:scale: 70%

The InferBound pass is shown in the following pseudo-code:

@@ -162,7 +160,6 @@ Recall that all IterVars of the stage are related by IterVarRelations. The IterV

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/relations.png
:align: center
:scale: 70%


The above diagram shows the IterVar hyper-graph for one stage. The stage has one root_iter_var, ``i``. It has been split, and the resulting inner axis ``i.inner``, has been split again. The leaf_iter_vars of the stage are shown in green: ``i.outer``, ``i.inner.outer``, and ``i.inner.inner``.
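The extent arithmetic behind the split relations above can be sketched as follows. This is an illustrative helper under the usual convention that a split by `factor` gives an outer axis of extent `ceil(extent / factor)` and an inner axis of extent `factor`; it is not TVM's API.

```python
def split(extent, factor):
    """Split an axis of the given extent by `factor`.

    Returns the extents of (outer, inner): outer iterates
    ceil(extent / factor) times, inner iterates `factor` times.
    """
    return ((extent + factor - 1) // factor, factor)

# i with extent 16, split by 8, then i.inner split again by 4:
outer, inner = split(16, 8)                  # extents of i.outer, i.inner
inner_outer, inner_inner = split(inner, 4)   # extents of i.inner.outer, i.inner.inner
```

When the factor does not divide the extent evenly (e.g., extent 10, factor 3), the outer extent rounds up, which is one source of the over-approximation discussed later.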
@@ -208,7 +205,6 @@ As mentioned above, a consumer may only require a small number of elements from

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/inferbound_phases.png
:align: center
:scale: 70%

IntSets
~~~~~~~
@@ -323,14 +319,12 @@ A ComputeOp has only a single output Tensor, whose axes correspond to the axis v

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/gatherbound.png
:align: center
:scale: 70%


The union of IntSets is computed by converting each IntSet to an Interval, then taking the minimum of all of these intervals' minimums and the maximum of all of their maximums.

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/union.png
:align: center
:scale: 70%


This clearly results in some unnecessary computation, i.e., tensor elements will be computed that are never used.
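The min-of-mins/max-of-maxes union just described can be sketched in a few lines. This is an illustration of the over-approximation, assuming intervals are closed `(min, max)` pairs; it is not TVM's IntSet implementation.

```python
def union_intervals(intervals):
    """Cover a list of (min, max) intervals with a single interval:
    the minimum of all minimums and the maximum of all maximums.

    Any gap between the inputs is absorbed into the result, which is
    exactly the unnecessary computation described in the text.
    """
    lo = min(iv[0] for iv in intervals)
    hi = max(iv[1] for iv in intervals)
    return (lo, hi)
```

For example, the union of `(0, 3)` and `(5, 9)` is `(0, 9)`: element 4 is never used, but falls inside the computed bound.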
@@ -340,7 +334,6 @@ Unfortunately, even if we're lucky and the IntervalSet unions do not produce unn

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/gatherbound_problem.png
:align: center
:scale: 70%

.. _InferBoundCA:

@@ -696,7 +689,6 @@ When InferRootBound is working on stage B, it visits B's consumer stage C to fin

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/passupdomain_problem.png
:align: center
:scale: 70%



@@ -756,17 +748,14 @@ If the split factor is 4, or 8, in the above example, the region of B needed in

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/passupdomain_div.png
:align: center
:scale: 70%

However, if the split factor is changed from 4 to 3 in the example above, it is easy to see that the region of B that C needs can no longer be described by an independent Range for each of its axes.


.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/passupdomain_nodiv.png
:align: center
:scale: 70%

The best that can be done with rectangular regions is shown in the following diagram. The orange regions are the minimum rectangular regions covering the region of B that needs to be computed, at each iteration of the outer loop.

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/docs/inferbound/passupdomain_min.png
:align: center
:scale: 70%
4 changes: 2 additions & 2 deletions docs/dev/relay_bring_your_own_codegen.rst
@@ -535,7 +535,7 @@ To simplify, we define a graph representation named "ExampleJSON" in this guide.

Then the ExampleJSON of this subgraph looks like:

.. code-block:: json
.. code-block:: none

subgraph_0
input 0 10 10
@@ -544,7 +544,7 @@ Then the ExampleJSON of this subgraph looks like:
input 3 10 10
add 4 inputs: 0 1 shape: 10 10
sub 5 inputs: 4 2 shape: 10 10
add 6 inputs: 5 3 shape: 10 10
mul 6 inputs: 5 3 shape: 10 10

The ``input`` keyword declares an input tensor with its ID and shape, while the other statements describe computations in ``<op> <output ID> inputs: [input ID] shape: [shape]`` syntax.
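The statement syntax above is simple enough to parse with string splitting. The following is a hypothetical helper for illustration, not part of the TVM codebase; it assumes one statement per line and integer IDs and shape dimensions as in the example subgraph.

```python
def parse_examplejson(text):
    """Parse an ExampleJSON subgraph into (name, statements).

    Each statement is a dict: inputs are {op, id, shape}; computations
    are {op, id, inputs, shape}, following the
    `<op> <output ID> inputs: [input ID] shape: [shape]` syntax.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    name, stmts = lines[0], []
    for ln in lines[1:]:
        tokens = ln.split()
        if tokens[0] == "input":
            # input <ID> <shape...>
            stmts.append({"op": "input", "id": int(tokens[1]),
                          "shape": [int(t) for t in tokens[2:]]})
        else:
            # <op> <output ID> inputs: <input IDs...> shape: <shape...>
            sep = tokens.index("shape:")
            stmts.append({"op": tokens[0], "id": int(tokens[1]),
                          "inputs": [int(t) for t in tokens[3:sep]],
                          "shape": [int(t) for t in tokens[sep + 1:]]})
    return name, stmts
```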

3 changes: 0 additions & 3 deletions docs/dev/relay_intro.rst
@@ -39,7 +39,6 @@ compile for heterogeneous execution environments (e.g., executing parts of the g

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/relay/dataflow.png
:align: center
:scale: 70%


You can use Relay to build a computational (dataflow) graph. Specifically, the above code shows how to
@@ -130,7 +129,6 @@ The code example below shows one program with two forms side by side.

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/relay/dataflow_vs_func.png
:align: center
:scale: 70%


The nested let binding is called A-normal form, and it is commonly used as IRs in functional programming languages.
@@ -155,7 +153,6 @@ which does not use let bindings.

.. image:: https://raw.githubusercontent.com/tvmai/tvmai.github.io/master/images/relay/let_scope.png
:align: center
:scale: 70%

The problem comes when we try to decide where we should evaluate node ``%1``. In particular, while the text format seems
to suggest that we should evaluate node ``%1`` outside the if scope, the AST (as shown in the picture) does not suggest so.
1 change: 1 addition & 0 deletions docs/dev/runtime.rst
@@ -258,6 +258,7 @@ It also allows us to get members of an object easily in front-end language.
For example, in the following code, we accessed the op field of the TensorNode.

.. code:: python

import tvm

x = tvm.placeholder((3,4), name="x")
21 changes: 20 additions & 1 deletion docs/dev/virtual_machine.rst
@@ -91,6 +91,7 @@ Ret
^^^
**Arguments**:
::

RegName dst
RegName result

@@ -100,6 +101,7 @@ InvokePacked
^^^^^^^^^^^^
**Arguments**:
::

Index packed_index
Index arity
Index output_size
@@ -114,6 +116,7 @@ AllocTensor
^^^^^^^^^^^
**Arguments**:
::

RegName dst
RegName storage
uint32_t ndim
@@ -127,6 +130,7 @@ AllocTensorReg
^^^^^^^^^^^^^^
**Arguments**:
::

RegName dst
RegName storage
RegName shape_register
@@ -139,6 +143,7 @@ AllocStorage
^^^^^^^^^^^^
**Arguments**:
::

RegName dst
RegName size
RegName alignment
@@ -151,6 +156,7 @@ AllocADT
^^^^^^^^
**Arguments**:
::

RegName dst
Index tag
Index num_fields
@@ -163,6 +169,7 @@ AllocClosure
^^^^^^^^^^^^
**Arguments**:
::

RegName dst
Index clo_index
Index num_freevar
@@ -176,6 +183,7 @@ GetField
^^^^^^^^
**Arguments**:
::

RegName dst
RegName object
Index field_index
@@ -186,6 +194,7 @@ If
^^
**Arguments**:
::

RegName test
RegName target
Index true_offset
@@ -199,6 +208,7 @@ GetTag
^^^^^^
**Arguments**:
::

RegName object
RegName dst

@@ -212,6 +222,7 @@ Goto
^^^^
**Arguments**:
::

Index pc_offset

Relative unconditional jump by ``pc_offset``.
@@ -220,6 +231,7 @@ Invoke
^^^^^^
**Arguments**:
::

Index func_index

Invoke function at ``func_index``, consumes the number of arguments contained in the VMFunction's
@@ -229,6 +241,7 @@ InvokeClosure
^^^^^^^^^^^^^
**Arguments**:
::

RegName closure
Index num_closure_args
RegName* closure_args
@@ -239,6 +252,7 @@ LoadConst
^^^^^^^^^
**Arguments**:
::

RegName dst
Index const_index

@@ -248,6 +262,7 @@ LoadConsti
^^^^^^^^^^
**Arguments**:
::

Index val
RegName dst

@@ -277,7 +292,7 @@ previous call. Registers are allocated in a continuous space (virtual register f

We keep track of a set of Relay functions we have called, a pointer into its bytecode, an offset into the byte code (known as the program counter).

::
.. code-block:: c

struct VirtualMachine {
...
@@ -331,6 +346,7 @@ Optimizations marked with `TODO` are not implemented yet.

Serialization
~~~~~~~~~~~~~

Serializing and deserializing the executable generated by the Relay VM compiler is a must as
we may want to save the model to the disk and perform inference later. Previously, Relay has produced
a serialized form in a json file for the graph runtime. However, the same format is not directly
@@ -372,14 +388,17 @@ Unresolved Questions

How do we handle dynamic shapes?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TODO

How can we modify the VM to support JIT compilation of certain code paths?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the code generation space there are still many tradeoffs to be analyzed and the VM is designed
to be very flexible so we can modify it for future experiments.

How do we support heterogenous execution?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Heterogenous execution should work out of the box assuming we have annotated the appropriate device copies.
In order to do this properly we need to run the device annotation and copying passes.
1 change: 1 addition & 0 deletions docs/vta/dev/hardware.rst
@@ -215,6 +215,7 @@ This would result in a ``load-gemm-activate-store`` task pipeline which closely
Adding more stages has a cost however: it can add storage and extra logic overhead, which is why we opted for a default 3-stage pipeline.

.. _vta-uarch:

Microarchitectural Overview
---------------------------

19 changes: 0 additions & 19 deletions docs/vta/hardware.rst

This file was deleted.