# Reduce amount of scanning needed to find notifications #500

Closed
opened this Issue May 22, 2015 · 6 comments

4 participants
Contributor

### keith-turner commented May 26, 2015

Workers should send out requests for tablet scans asynchronously. This approach makes it likely that when a worker scans a tablet, it is doing so on behalf of many other workers. If a worker issued a scan request to another worker and waited, then it is possible that other tablets it is interested in scanning could be scanned while it is waiting.
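
A minimal in-process sketch of the idea, assuming a scan can be modeled as a function from a tablet name to the notifications found. The class `SharedTabletScans` and its methods are hypothetical, not Fluo code, and the request sent between workers is not modeled. It shows how callers that ask for the same tablet while a scan is pending can share one scan without blocking.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

// Hypothetical helper: at most one pending scan per tablet, shared by all requesters.
final class SharedTabletScans {
  private final ConcurrentHashMap<String, CompletableFuture<List<String>>> inFlight =
      new ConcurrentHashMap<>();
  private final Function<String, List<String>> scanner; // assumed: performs the actual scan
  private final Executor executor;

  SharedTabletScans(Function<String, List<String>> scanner, Executor executor) {
    this.scanner = scanner;
    this.executor = executor;
  }

  // Returns immediately. Callers asking for the same tablet while a scan is
  // pending receive the same future, so a single scan serves all of them.
  CompletableFuture<List<String>> requestScan(String tablet) {
    CompletableFuture<List<String>> fresh = new CompletableFuture<>();
    CompletableFuture<List<String>> existing = inFlight.putIfAbsent(tablet, fresh);
    if (existing != null) {
      return existing;
    }
    executor.execute(() -> {
      try {
        fresh.complete(scanner.apply(tablet));
      } catch (RuntimeException e) {
        fresh.completeExceptionally(e);
      } finally {
        // Allow a new scan of this tablet once this one has finished.
        inFlight.remove(tablet, fresh);
      }
    });
    return fresh;
  }
}
```

Because `requestScan` never blocks, a caller can issue requests for every tablet it cares about and then handle the results as each future completes.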

Contributor

### keith-turner commented Apr 3, 2017

I have come up with another solution to this problem. The solution is to group workers into fixed-size groups and have each group scan a disjoint set of tablets. For example, if there are 23 workers, a minimum group size of 7, and 100 tablets, then this would create the following groups:

- Group 1 with 8 workers and 34 tablets
- Group 2 with 8 workers and 33 tablets
- Group 3 with 7 workers and 33 tablets

Each worker in group 1 would have a unique id within the group ranging from 0 to 7. A worker with id 5 in the group would scan all 34 of the group's tablets looking for notifications where `hash(notification) % 8 == 5`.

This solution allows the cost of scanning for notifications to stay fixed as the number of workers grows. The reason to have groups of workers is that it evens out notification processing in the case where notifications are not evenly distributed among tablets.

The current notification finding implementation in Fluo has a single group, so notification processing is very evenly spread among workers without having to worry about collisions. However, the cost of every worker scanning every tablet does not scale well as the number of workers grows.
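
The arithmetic in the example above can be sketched as follows. This is an illustration under assumptions, not the code from the referenced commits: the class `WorkerGroups`, its method names, and the use of `String.hashCode()` as the notification hash are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the group arithmetic from the comment.
final class WorkerGroups {
  // Largest number of groups such that each group has at least minGroupSize workers.
  static int numGroups(int workers, int minGroupSize) {
    return Math.max(1, workers / minGroupSize);
  }

  // Splits `total` items into `parts` sizes that differ by at most one,
  // giving the extra items to the first groups.
  static List<Integer> evenSizes(int total, int parts) {
    List<Integer> sizes = new ArrayList<>();
    for (int i = 0; i < parts; i++) {
      sizes.add(total / parts + (i < total % parts ? 1 : 0));
    }
    return sizes;
  }

  // True if the worker with this id (0-based within its group) should process
  // the notification. String.hashCode() is an assumed stand-in for the real hash;
  // floorMod keeps the result non-negative for negative hash codes.
  static boolean isResponsible(String notification, int groupSize, int idInGroup) {
    return Math.floorMod(notification.hashCode(), groupSize) == idInGroup;
  }
}
```

With 23 workers and a minimum group size of 7, `numGroups` gives 3, `evenSizes(23, 3)` gives 8, 8, 7, and `evenSizes(100, 3)` gives 34, 33, 33, matching the groups listed above.
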
Member

### ctubbsii commented Apr 3, 2017

 How would tablets be assigned to groups?
Contributor

### keith-turner commented Apr 3, 2017

> How would tablets be assigned to groups?

Good question, I have spent a good bit of time thinking about this. This is easy to do, IF all of the workers can agree on what the current set of tablets for a table is. However, the workers can possibly read different splits at different times. I was trying to figure out a fancy distributed way of all workers agreeing on the same set of split points for a table (for a time period), but could not think of anything.

So I think putting the splits in zookeeper is a good option. I am currently thinking of taking a subset of table split points where the total size is less than 128K or 256K and putting that in zookeeper. I am thinking of reading the splits and removing all odd splits while the total size is greater than 256K before storing in ZK. The worker with the lowest ID could manage the splits stored in ZK. All other workers can observe the splits node.

Once the workers all agree on a set of split points, it's smooth sailing. Using the info about workers in zookeeper, they can decide how many groups there are. Then they can shuffle the splits in a deterministic way and assign them to groups round robin. This should lead to all workers making the same decisions about which splits are in a group. The shuffle plus round robin will result in a very even and random assignment of tablets to groups. I was thinking it would be nice to avoid assigning contiguous tablets to a group.

I have used the terms tablets and splits. In reality, all of the workers just need to agree on some set of row ranges for the table that don't overlap and cover the table. It does not need to be tablet split points, that's just convenient.
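
A minimal sketch of the two steps described above: thinning the split list until it fits a size budget, then shuffling deterministically and assigning round robin. The class `SplitAssignment`, the fixed seed, and the use of strings for split points are assumptions for illustration. Determinism here relies on every worker sorting first, using the same seed, and running a JDK whose `Collections.shuffle` behaves identically.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper; names and seed are illustrative only.
final class SplitAssignment {
  private static final long SEED = 42L; // assumed constant shared by all workers

  static long totalBytes(List<String> splits) {
    long sum = 0;
    for (String s : splits) {
      sum += s.getBytes(StandardCharsets.UTF_8).length;
    }
    return sum;
  }

  // Repeatedly drops every other split (the odd positions) while the
  // total size exceeds maxBytes and more than one split remains.
  static List<String> thin(List<String> splits, long maxBytes) {
    List<String> current = new ArrayList<>(splits);
    while (totalBytes(current) > maxBytes && current.size() > 1) {
      List<String> kept = new ArrayList<>();
      for (int i = 0; i < current.size(); i += 2) {
        kept.add(current.get(i));
      }
      current = kept;
    }
    return current;
  }

  // Sorts, shuffles with a fixed seed, then deals the splits out round robin,
  // so every worker computes the same groups from the same set of splits.
  static List<List<String>> assign(List<String> splits, int numGroups) {
    List<String> shuffled = new ArrayList<>(splits);
    Collections.sort(shuffled);
    Collections.shuffle(shuffled, new Random(SEED));
    List<List<String>> groups = new ArrayList<>();
    for (int g = 0; g < numGroups; g++) {
      groups.add(new ArrayList<>());
    }
    for (int i = 0; i < shuffled.size(); i++) {
      groups.get(i % numGroups).add(shuffled.get(i));
    }
    return groups;
  }
}
```

Sorting before the shuffle makes the result independent of the order in which a worker happened to read the splits. The shuffle is what keeps contiguous ranges from landing in the same group.
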
Member

### mjwall commented Apr 6, 2017

 How does recovery happen if any given process is killed?
Contributor

### keith-turner commented Apr 6, 2017

> How does recovery happen if any given process is killed?

All of the information used to partition workers, tablets, and notifications comes from zookeeper. All of the workers watch this information in zookeeper. What I am currently doing in my branch is that when any of the information changes, workers stop processing notifications until the information in zookeeper has been stable for 60 seconds.
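
The "stable for 60 seconds" rule can be sketched as a small gate that a ZooKeeper watcher would notify on every change. This is an assumption about one way to structure it, not the code in the branch: the class `StabilityGate` and its injected clock are hypothetical, and the ZooKeeper wiring is left out.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical gate: processing is allowed only after a quiet period with no changes.
final class StabilityGate {
  private final long quietNanos;
  private final LongSupplier nanoClock; // e.g. System::nanoTime in production
  private long lastChangeNanos;

  StabilityGate(long quietPeriod, TimeUnit unit, LongSupplier nanoClock) {
    this.quietNanos = unit.toNanos(quietPeriod);
    this.nanoClock = nanoClock;
    // Treat construction as a change so a new worker also waits out the quiet period.
    this.lastChangeNanos = nanoClock.getAsLong();
  }

  // Call from the watcher whenever the partitioning information changes.
  synchronized void onChange() {
    lastChangeNanos = nanoClock.getAsLong();
  }

  // True once no change has been observed for the whole quiet period.
  synchronized boolean isStable() {
    return nanoClock.getAsLong() - lastChangeNanos >= quietNanos;
  }
}
```

A worker would check `isStable()` before processing notifications and recompute its group assignment once the gate reopens.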

### keith-turner added a commit to keith-turner/fluo that referenced this issue Apr 7, 2017

``` fixes apache#500 Made scanning for notifications scalable ```
``` e28c038 ```

### keith-turner added a commit to keith-turner/fluo that referenced this issue Apr 7, 2017

``` fixes apache#500 Made scanning for notifications scalable ```
``` 48cfaae ```