alibaba · yecol · Mar 1, 2023 · Feb 27, 2023 · Feb 27, 2023 · Feb 27, 2023
diff --git a/docs/analytical_engine/programming_model_pie.md b/docs/analytical_engine/programming_model_pie.md
@@ -1,6 +1,74 @@
 # Programming Model: PIE
 
-- Workflow, PEval and IncEval: intro
-- Under the hook
+Although the [vertex-centric programming model](https://graphscope.io/docs/latest/analytical_engine/vertex_centric_models.html) can express various graph analytics algorithms, existing sequential (single-machine) graph algorithms have to be modified to comply with the “think like a vertex” principle, making parallel graph computation a privilege for experienced users only. In addition, the performance of graph algorithms with vertex-centric model is sub-optimal in many cases: each vertex only has information about its 1-hop neighbors, and thus information is propagated through the graph slowly, one hop at a time. As a result, it may take many computation iterations to propagate a piece of information from a source to a destination.
+
+## What is the PIE Model?
+
+To address the abovementioned problems, we proposed a new programming model PIE (PEval-IncEval-Assemble) in a [SIGMOD paper](https://dl.acm.org/doi/10.1145/3035918.3035942) published in 2017. Different from the vertex-centric, the PIE model can automatically parallelize existing sequential graph algorithms, with only some small changes. This makes parallel graph computations accessible to users who know conventional graph algorithms covered in college textbooks, and there is no need to recast existing graph algorithms into a new model.
+
+Specifically, in the PIE model, users only need to provide three functions,
+
+- (1) *PEval*, a sequential (single-machine) function for given a query, computes the answer on a local partition;
+- (2) *IncEval*, a sequential incremental function, computes changes to the old output by treating incoming messages as updates; and
+- (3) *Assemble*, which collects partial answers, and combines them into a complete answer.
+
+:::{figure-md}
+
+<img src="../images/pie.png"
+     alt="The PIE model"
+     width="80%">
+
+The PIE model. 
+:::
+
+## Workflow of PIE
+
+The PIE model works on a graph *G* and each worker maintains a partition of *G*. Given a query, each worker first executes *PEval* against its local partition, to compute partial answers in parallel. Then each worker may exchange partial results with other workers via synchronous message passing. Upon receiving messages, each worker incrementally computes *IncEval*. The incremental step iterates until no further messages can be generated. At this point, *Assemble* pulls partial answers and assembles the final result. In this way, the PIE model parallelizes existing sequential graph algorithms, without revising their logic and workflow.
+
+In this model, users do not need to know the details of the distributed setting while processing big graphs in a cluster, and the PIE model auto-parallelizes the graph analytics tasks across
+a cluster of workers, based on a fixpoint computation. Under a monotonic condition, it guarantees to
+converge with correct answers as long as the three sequential algorithms provided are correct.
+
+The following pseudo-code shows how SSSP is expressed in the PIE model, where the Dijkstra’s algorithm is directly used for the computation of parallel SSSP.
+
+```python
+def dijkstra(g, vals, updates):
+    heap = VertexHeap()
+    for i in updates:
+	vals[i] = updates[i]
+	heap.push(i, vals[i])
+
+    updates.clear()
+
+    while not heap.empty():
+	u = heap.top().vid
+	distu = heap.top().val
+	heap.pop()
+	for e in g.get_outgoing_edges(u):
+	    v = e.get_neighbor()
+	    distv = distu + e.data()
+	    if vals[v] > distv:
+		vals[v] = distv
+		    if g.is_inner_vertex(v):
+			heap.push(v, distv)
+			updates[v] = distv       
+	return updates
+
+def PEval(source, g, vals, updates):
+    for v in vertices:
+        updates[v] = MAX_INT
+    updates[source] = 0
+    dijkstra(g, vals, updates)
+
+def IncEval(source, g, vals, updates):
+    dijkstra(g, vals, updates)
+```
+
+
+
+
+
+
+TODO(wanglei): Under the hook
     - MessageBuffer,
     - ParallelExecutor,
diff --git a/docs/analytical_engine/vertex_centric_models.md b/docs/analytical_engine/vertex_centric_models.md
@@ -1,20 +1,71 @@
-# Vertex Centric Models
+# Vertex-Centric Model
 
-biref intro to vertex centric models, history;
+In a single machine environment, developers can easily implement graph analytics algorithms as they have a global view of the graph and can freely iterate through all vertices and edges. When the size of graph data grows beyond the memory capacity of a single machine, graph data must be partitioned to distributed memory, leading to the indivisibility of the whole graph structure. 
 
-## GAS Model
+To allow developers to succinctly express graph analytics algorithms under such environment, the *vertex-centric programming model* have been developed, with the philosophy of "think like a vertex". Specifically, a graph analytics algorithm iteratively executes a user-defined program over vertices of a graph. The user-defined vertex program
+typically takes data from other vertices as input, and the
+resultant output of a vertex is sent to other vertices. Vertex programs are
+executed iteratively for a certain number of rounds, or until a convergence condition is
+satisfied. As opposed to *global* perspective of the graph, vertex-centric models
+employ a local, vertex-oriented perspective.
 
-intro and figure
+The philosophy of vertex-centric model encourages many programming models, including the [Pregel model](https://research.google/pubs/pub37252/) proposed by Google and the [GAS model](https://www.usenix.org/conference/osdi12/technical-sessions/presentation/gonzalez). These programming models
+have widely applied in various graph processing systems, such as Giraph, GraphX and PowerGraph.
 
 ## Pregel Model
 
-intro and figure
+Pregel was first introduced in a [SIGMOD paper](https://dl.acm.org/doi/10.1145/1807167.1807184) published by Google in 2010. A graph analytics algorithm with the Pregel model consists of a sequence of iterations(called *supersteps*).
+
+> “During a superstep the framework invokes a user-defined function for each vertex, conceptually in parallel. The function specifies behavior at a single vertex *V* and a single superstep *S*. It can read messages sent to *V* in superstep *S − 1*, send messages to other vertices that will be received at superstep *S + 1*, and modify the state of *V* and its outgoing edges. Messages are typically sent along outgoing edges, but a message may be sent to any vertex whose identifier is known.”
+
+:::{figure-md}
+
+<img src="../images/pregel.png"
+     alt="The Pregel model"
+     width="50%">
+
+The Pregel model. 
+:::
+
+The iterations terminate until no messages are sent from any vertex, indicating a halt.
+
+The vertex function can be invoked at each vertex in parallel, since individual vertices communicate via message-passing.
+
+With the Pregel model, the vertex program of single source shortest paths (SSSP) is expressed as follows.
+
+```c++
+void Compute(MessageIterator* msgs) {
+    int mindist = IsSource(vertex_id()) ? 0 : INF;
+    for (; !msgs->Done(); msgs->Next())
+        mindist = min(mindist, msgs->Value());
+    if (mindist < GetValue()) {
+        *MutableValue() = mindist;
+        OutEdgeIterator iter = GetOutEdgeIterator();
+        for (; !iter.Done(); iter.Next())
+            SendMessageTo(iter.Target(), mindist + iter.GetValue());
+}
+    VoteToHalt();
+}
+```
+
+## GAS Model
+
+However, the performance of Pregel drops dramatically when facing natural graphs which follow a power-law distribution. To solve this problem, [PowerGraph](https://www.usenix.org/conference/osdi12/technical-sessions/presentation/gonzalez) proposed the GAS (Gather-Apply-Scatter) programming model for the vertex-cut graph partitioning strategy. The *Gather* function runs locally on each partition and then one accumulator is sent from each mirror to the master. The master runs the *Apply* function and then sends the updated vertex data to all mirrors. Finally, the *Scatter* phase is run in parallel on mirrors to update the data on adjacent edges.
+
+:::{figure-md}
+
+<img src="../images/gas.png"
+     alt="The GAS model"
+     width="50%">
+
+The GAS model. 
+:::
 
 ## Simulation of Pregel Model in Analytical Engine
 
-As we proved in the paper, the analytical engine is able to simulate the vertex centric models. We have implemented the support for Pregel model in the analytical engine, and you can use the Pregel API to write your own algorithms. In addition, if you already have your graph applications implemented in Giraph or GraphX, you can run them on GraphScope directly. Better still, the analytical engine can achieve better performance than Giraph and GraphX. 
+As we proved in this [paper](https://dl.acm.org/doi/pdf/10.1145/3035918.3035942), the analytical engine of GraphScope is able to simulate the vertex centric models. We have implemented the support for Pregel model in the analytical engine, and you can use the Pregel APIs to write your own algorithms. In addition, if you already have your graph applications implemented in Giraph or GraphX, you can run them on GraphScope directly. Better still, the analytical engine can achieve better performance than Giraph and GraphX. 
 
 Please refer to related tutorials.
-- t1
-- t2
-- t3
+
+- [Tutorial: Run Giraph Applications on GraphScope](https://graphscope.io/docs/latest/analytical_engine/tutorial_run_giraph_apps.html)
+- [Tutorial: Run GraphX Applications on GraphScope](https://graphscope.io/docs/latest/analytical_engine/tutorial_run_graphx_apps.html)
diff --git a/docs/images/gas.png b/docs/images/gas.png
diff --git a/docs/images/pregel.png b/docs/images/pregel.png