
[DOC] Add Design Overview and Symbol in Pics #179

Merged · 3 commits · Sep 29, 2015
8 changes: 5 additions & 3 deletions README.md
@@ -9,13 +9,15 @@ MXNet is a deep learning framework designed for both *efficiency* and *flexibility*.
It allows you to mix the [flavours](http://mxnet.readthedocs.org/en/latest/program_model.html) of
deep learning programs together to maximize both efficiency and productivity.


What's New
----------
* [Note on Programming Models for Deep Learning](http://mxnet.readthedocs.org/en/latest/program_model.html)

Contents
--------
* [Documentation](http://mxnet.readthedocs.org/en/latest/)
* [Documentation and Tutorials](http://mxnet.readthedocs.org/en/latest/)
* [Open Source Design Notes](http://mxnet.readthedocs.org/en/latest/#open-source-design-notes)
* [Code Examples](example)
* [Build Instruction](doc/build.md)
* [Features](#features)
@@ -25,8 +27,8 @@ Features
--------
* To Mix and Maximize
- Mix all flavours of programming models to maximize flexibility and efficiency.
* Lightweight and scalable
- Minimal build dependencies; scales to multi-GPU and is ready for distributed training.
* Lightweight, scalable, and memory efficient
- Minimal build dependencies; scales to multiple GPUs with very low memory usage.
* Auto parallelization
- Write numpy-style ndarray GPU programs, which will be automatically parallelized.
* Language agnostic
54 changes: 52 additions & 2 deletions doc/developer-guide/index.md
@@ -1,10 +1,60 @@
MXNet Developer Guide
=====================
This page contains links to all the developer-related documents on mxnet.
This page contains the resources you need to understand how mxnet works and how to work on the mxnet codebase.
We believe it is important to make the system modular and understandable to a general audience.
If you are interested in the overall design, check out our [open source design notes](#open-source-design-notes)
for deep learning.

Overview of the Design
----------------------
* [Execution Engine](engine.md)
![System Overview](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/system/overview.png)

The figure above shows the major modules of mxnet and how they interact with each other. The modules are:
- Runtime Dependency Engine: Schedules and executes operations according to their read/write dependencies.
- Storage Allocator: Efficiently allocates and recycles memory blocks for GPU and CPU.
- Resource Manager: Manages global resources such as the random number generator and temporary space.
- NDArray: Dynamic, asynchronous n-dimensional arrays, which provide flexible imperative programming for MXNet (see the sketch after this list).
- Symbolic Execution: Static symbolic graph executor, which provides efficient symbolic graph execution and optimization.
- Operator: Operators that define static forward and gradient (backprop) calculations.
- Symbol Construction: Provides a way to construct the computation graph (net configuration).
- KVStore: Key-value store interface for easy parameter synchronization.
- Data Loading (IO): Efficient distributed data loading and augmentation.
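
Below is a minimal sketch (ours, not part of this PR) contrasting the imperative NDArray front-end with the declarative Symbol front-end; it assumes the Python package is importable as `mx`.

```python
import mxnet as mx

# NDArray: imperative, executed eagerly by the dependency engine.
a = mx.nd.ones((2, 3))
b = a + a                   # runs asynchronously; result is ready when read
print(b.asnumpy())          # all elements are 2.0

# Symbol: declarative, only describes the computation graph.
x = mx.symbol.Variable('x')
y = mx.symbol.Variable('y')
z = x + y                   # nothing runs until the symbol is bound to an Executor
```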

How to Read the Code
--------------------
- All module interfaces are listed in [include](../../include); these interfaces are heavily documented.
- You can also read the [Doxygen Version](https://mxnet.readthedocs.org/en/latest/doxygen) of the documentation.
- Each module depends on other modules only through the header files in [include](../../include).
- The implementations of the modules are in the [src](../../src) folder.
- Each source file only sees the files within its own folder, plus [src/common](../../src/common) and [include](../../include).

Most modules are self-contained, with an interface dependency only on the engine,
so you are free to pick the part you are interested in and read just that.

### Analogy to CXXNet
- The Symbolic Execution can be viewed as neural net execution (forward, backprop) with more optimizations.
- The Operator can be viewed as Layers, except that weights and bias need to be passed in explicitly.
  - It also provides additional (optional) interfaces to further optimize memory usage.
- The Symbol Construction module is like an advanced config file.
- The Runtime Dependency Engine is like a thread pool, but it makes your life easier by handling dependency tracking for you.
- KVStore adopts a simple parameter-server interface optimized for GPU synchronization (a hedged sketch follows).
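
As a hedged illustration of that interface (assuming the Python `mx.kv` module of this era; the key `3` is arbitrary):

```python
import mxnet as mx

kv = mx.kv.create('local')           # single-machine key-value store
shape = (2, 3)
kv.init(3, mx.nd.ones(shape))        # initialize key 3
kv.push(3, mx.nd.ones(shape) * 2)    # push a new value for key 3
out = mx.nd.zeros(shape)
kv.pull(3, out=out)                  # pull the value back out
print(out.asnumpy())                 # all 2.0 under these assumptions
```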

### Analogy to Minerva
- The Runtime Dependency Engine is the DAGEngine in Minerva, except that it is enhanced to support mutation.
- The NDArray is the same as owl.NDArray, except that it supports mutation and can interact with Symbolic Execution.

Documents of Each Module
------------------------
* [Runtime Dependency Engine](engine.md)
* [Operators](operator.md)


Open Source Design Notes
------------------------
* [Programming Models for Deep Learning](../program_model.md)
- Compares various programming models, which motivates the current design.


List of Other Resources
-----------------------
15 changes: 14 additions & 1 deletion doc/index.md
@@ -13,14 +13,27 @@ User Guide
* [Python Package Document](python/index.md)
* [Frequently Asked Questions](faq.md)


Developer Guide
---------------
* [Programming Models for Deep Learning](program_model.md)
* [Developer Documents](developer-guide/index.md)
* [Environment Variables for MXNet](env_var.md)
* [Contributor Guideline](contribute.md)
* [Doxygen Version of C++ API](https://mxnet.readthedocs.org/en/latest/doxygen)


Open Source Design Notes
------------------------
This section contains the design documents and notes we made for the mxnet system design and for deep learning
libraries in general. We believe that open sourcing the system design notes, along with their motivations and choices,
can benefit a general audience: both those who use deep learning and those who build deep learning systems.

This section will be updated with self-contained design notes on various aspects of deep learning systems,
in terms of abstraction, optimization, and trade-offs.

* [Programming Models for Deep Learning](program_model.md)


Indices and tables
------------------

2 changes: 2 additions & 0 deletions doc/python/index.md
@@ -11,6 +11,8 @@ There are three types of documents you can find about mxnet.
Tutorials
---------
* [Python Overview Tutorial](tutorial.md)
* [Symbolic Configuration and Execution in Pictures](symbol_in_pictures.md)


Python API Documents
--------------------
3 changes: 3 additions & 0 deletions doc/python/symbol.md
@@ -7,6 +7,9 @@ MXNet Python Symbolic API
* [Symbol Object Document](#mxnet.symbol.Symbol) gives the API reference for the Symbol object
* [Execution API Reference](#execution-api-reference) tells us what the executor can do.

You are also highly encouraged to read [Symbolic Configuration and Execution in Pictures](symbol_in_pictures.md)
alongside this document.

How to Compose Symbols
----------------------
The symbolic API provides a way for you to configure the computation graphs.
79 changes: 79 additions & 0 deletions doc/python/symbol_in_pictures.md
@@ -0,0 +1,79 @@
Symbolic Configuration and Execution in Pictures
================================================
This is a self-contained tutorial that explains symbolic construction and execution in pictures.
You are recommended to read it together with the [Symbolic API](symbol.md).

Compose Symbols
---------------
Symbols are a description of the computation we want to perform. The symbolic construction API generates the computation
graph that describes the computation. The following picture shows how we compose symbols to describe basic computations.

![Symbol Compose](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/compose_basic.png)

- The [mxnet.symbol.Variable](symbol.md#mxnet.symbol.Variable) function creates argument nodes that represent inputs to the computation.
- The Symbol is overloaded with basic element-wise arithmetic operations, as the sketch below shows.
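
A minimal sketch of this composition, assuming the Python package is importable as `mx`:

```python
import mxnet as mx

a = mx.symbol.Variable('a')
b = mx.symbol.Variable('b')
c = a + b                    # element-wise add via operator overloading
print(c.list_arguments())    # ['a', 'b']: the argument (input) nodes
```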

Configure Neural Nets
---------------------
Besides fine-grained operations, mxnet also provides a way to perform big operations that are analogous to layers in neural nets.
We can use these operators to describe a neural net configuration.

![Net Compose](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/compose_net.png)
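
For example, a small net could be configured as in the following sketch (layer operator names as in the mxnet Python API of this era; the exact architecture is illustrative):

```python
import mxnet as mx

data = mx.symbol.Variable('data')
fc1  = mx.symbol.FullyConnected(data=data, name='fc1', num_hidden=128)
act1 = mx.symbol.Activation(data=fc1, name='relu1', act_type='relu')
net  = mx.symbol.FullyConnected(data=act1, name='fc2', num_hidden=10)
```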


Example of Multi-Input Net
--------------------------
The following is an example of configuring a neural net with multiple inputs.

![Multi Input](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/compose_multi_in.png)
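
A hedged sketch of the idea (the input names are illustrative, not taken from the picture):

```python
import mxnet as mx

lhs = mx.symbol.Variable('data1')
rhs = mx.symbol.Variable('data2')
net = mx.symbol.FullyConnected(data=lhs + rhs, name='fc1', num_hidden=64)
print(net.list_arguments())   # includes 'data1', 'data2' and fc1's weight/bias
```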


Bind and Execute Symbol
-----------------------
When we need to execute a symbol graph, we call the bind function to bind ```NDArray```s to the argument nodes
and get an ```Executor```.

![Bind](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/bind_basic.png)

You can call ```Executor.forward``` to get the output results, given the bound NDArrays as input.

![Forward](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/executor_forward.png)
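
A minimal sketch of bind and forward, assuming a CPU context:

```python
import mxnet as mx

a = mx.symbol.Variable('a')
b = mx.symbol.Variable('b')
c = a + b

# Bind NDArrays to the argument nodes to obtain an Executor.
ex = c.bind(ctx=mx.cpu(),
            args={'a': mx.nd.ones((2, 3)), 'b': mx.nd.ones((2, 3)) * 2})
ex.forward()
print(ex.outputs[0].asnumpy())   # every element is 3.0
```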


Bind Multiple Outputs
---------------------
You can use [mx.symbol.Group](symbol.md#mxnet.symbol.Group) to group symbols together, then bind them to
get the outputs of both.

![MultiOut](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/executor_multi_out.png)
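
A sketch under the same assumptions as above:

```python
import mxnet as mx

a = mx.symbol.Variable('a')
b = mx.symbol.Variable('b')
g = mx.symbol.Group([a + b, a * b])   # a two-output graph

ex = g.bind(ctx=mx.cpu(),
            args={'a': mx.nd.ones((2,)), 'b': mx.nd.ones((2,)) * 3})
ex.forward()
print([o.asnumpy() for o in ex.outputs])  # outputs of both grouped symbols
```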

But always remember to bind only what you need, so the system can do more optimizations for you.


Calculate Gradient
------------------
You can specify gradient holder NDArrays in bind; calling ```Executor.backward``` after ```Executor.forward```
will then give you the corresponding gradients.

![Gradient](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/executor_backward.png)
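
A hedged sketch of the flow, passing the head gradient positionally to ```backward```:

```python
import mxnet as mx

a = mx.symbol.Variable('a')
b = mx.symbol.Variable('b')
c = a * b

a_grad = mx.nd.zeros((2,))   # gradient holders supplied at bind time
b_grad = mx.nd.zeros((2,))
ex = c.bind(ctx=mx.cpu(),
            args={'a': mx.nd.ones((2,)) * 2, 'b': mx.nd.ones((2,)) * 3},
            args_grad={'a': a_grad, 'b': b_grad})
ex.forward(is_train=True)
ex.backward(mx.nd.ones((2,)))   # head gradient of ones
print(a_grad.asnumpy())         # dc/da = b = 3.0
```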


Simple Bind Interface for Neural Nets
-------------------------------------
Sometimes it is tedious to pass the argument NDArrays to the bind function, especially when you are binding a big
graph such as a neural net. [Symbol.simple_bind](symbol.md#mxnet.symbol.Symbol.simple_bind) provides a way to simplify
the procedure: you only need to specify the input data shapes, and the function will allocate the arguments and bind
an Executor for you.

![SimpleBind](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/executor_simple_bind.png)
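
A sketch of the idea (shapes and sizes are illustrative):

```python
import mxnet as mx

data = mx.symbol.Variable('data')
net  = mx.symbol.FullyConnected(data=data, name='fc1', num_hidden=10)

# Only the input shape is given; argument NDArrays are allocated for us.
ex = net.simple_bind(ctx=mx.cpu(), data=(32, 100))   # batch 32, dim 100
ex.forward()
print(ex.outputs[0].shape)   # (32, 10)
```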

Auxiliary States
----------------
Auxiliary states are just like arguments, except that you cannot take gradients of them. These are states that may
not be part of the computation, but can be helpful to track. You can pass auxiliary states in the same way as arguments.

![AuxState](https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/mxnet/symbol/executor_aux_state.png)
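
A hedged sketch, assuming the ```BatchNorm``` operator of this era, which keeps auxiliary states (e.g. moving statistics) listed separately from arguments:

```python
import mxnet as mx

data = mx.symbol.Variable('data')
net  = mx.symbol.BatchNorm(data=data, name='bn')

print(net.list_arguments())          # learnable arguments: data, gamma, beta
print(net.list_auxiliary_states())   # auxiliary states tracked by the operator
```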

More Information
----------------
Please refer to [Symbolic API](symbol.md) and [Python Documentation](index.md).
5 changes: 4 additions & 1 deletion doc/python/tutorial.md
@@ -344,6 +344,9 @@ to get the gradient.
```
The [model API](../../python/mxnet/model.py) is a thin wrapper around the symbolic executors to support neural net training.
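
As a hedged sketch of that wrapper (assuming `mx.model.FeedForward` from this era; `net`, `train_data`, and `train_label` are hypothetical placeholders):

```python
import mxnet as mx

# net is a configured symbol; train_data/train_label are numpy arrays.
model = mx.model.FeedForward(symbol=net, num_epoch=10, learning_rate=0.1)
model.fit(X=train_data, y=train_label)   # trains using symbolic executors
```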

You are also highly encouraged to read [Symbolic Configuration and Execution in Pictures](symbol_in_pictures.md),
which provides a detailed explanation of concepts in pictures.

### How Efficient is Symbolic API

In short, the symbolic API is designed to be very efficient in both memory and runtime.
@@ -357,7 +360,7 @@ utilization.
The coarse-grained operators are equivalent to cxxnet layers, which are
extremely efficient. We also provide fine-grained operators for more flexible
composition. Because we are also doing more in-place memory allocation, mxnet can
be ***more memory efficient*** than cxxnet/caffe, and achieves the same runtime, with
be ***more memory efficient*** than cxxnet, and achieves the same runtime, with
greater flexibility.

## Distributed Key-value Store