#106 update description, add workflow diagram
mincong-h committed Jul 17, 2016
1 parent fbf58e3 commit d3b4a0a
Showing 2 changed files with 10 additions and 3 deletions.
13 changes: 10 additions & 3 deletions README.md
@@ -21,10 +21,15 @@ This project redesigns the mass index job as a chunk-oriented, non-interactive,
 long-running, background execution process. Execution contains operational
 control (start/stop/restart), logging, checkpointing and parallelization.

+![Workflow of the job "mass-index"][1]
+
 * **Parallelization**. The core step execution _produceLuceneDoc_ runs in
-parallel. It runs as multiple instance of the same step definition across
-multiple threads, one partition per thread. The number of partitions equals
-to the size of root entities selected before the job start.
+parallel. It runs as multiple instances of the same step definition across
+multiple threads, one partition per thread. The number of partitions depends
+on two factors: the number of target entities and the partition capacity. For
+example, if the target entity `Company.class` has 5000 rows to index and the
+partition capacity is 2500 entities per partition, then these rows will be
+processed in 2 partitions.


 ## Context data
@@ -77,3 +82,5 @@ summarize all elementary count from each partition and compute the progress.
 If the given were partition-work-count, then the computed result will be a
 double-summarized progress, which is not desired.

+
+[1]: https://github.com/mincong-h/gsoc-hsearch/tree/master/img/mass-index.png
Binary file added img/mass-index.png
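
The partition arithmetic described in the updated Parallelization paragraph can be illustrated with a minimal Java sketch. The class `PartitionCount` and the method `partitionCount` are hypothetical names chosen for this example and are not taken from the project's source. The README only gives the exact-division case (5000 rows, capacity 2500, 2 partitions); rounding up when the rows do not divide evenly, and returning 0 partitions for 0 rows, are assumptions.

```java
public class PartitionCount {

    /**
     * Hypothetical helper: number of partitions needed to cover
     * rowCount rows when each partition holds at most
     * partitionCapacity rows. Rounding up for a partial last
     * partition is an assumption, not confirmed by the README.
     */
    static int partitionCount(long rowCount, int partitionCapacity) {
        if (rowCount < 0) {
            throw new IllegalArgumentException("rowCount must be >= 0");
        }
        if (partitionCapacity <= 0) {
            throw new IllegalArgumentException("partitionCapacity must be > 0");
        }
        // Integer ceiling division: ceil(rowCount / partitionCapacity).
        return (int) ((rowCount + partitionCapacity - 1) / partitionCapacity);
    }

    public static void main(String[] args) {
        // The README's example: 5000 rows, 2500 entities per partition.
        System.out.println(partitionCount(5000, 2500)); // prints 2
        // One extra row needs an additional partition (assumed behavior).
        System.out.println(partitionCount(5001, 2500)); // prints 3
    }
}
```

Since the step runs one partition per thread, the value returned here would also be the number of threads requested for _produceLuceneDoc_ under these assumptions.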
