Optimize intermediate file of <way, nodeids> mapping table #31

Closed
CodeBear801 opened this issue Jun 27, 2019 · 12 comments

CodeBear801 commented Jun 27, 2019

We need to generate a <wayid, nodeids> mapping table to enable Telenav traffic for OSRM. More background is in #22.
This table will be used to regenerate speed.csv every few minutes for OSRM customization. The initial version takes about 70~80 seconds to iterate over the whole table and uses about 10 GB of memory.
Our target is to optimize the time cost of processing this file.


CodeBear801 commented Jun 27, 2019

Current file format:

wayid1,nodeid1,nodeid2,nodeid3
wayid2,nodeid4,nodeid5
...

Example

24418343,84760849102,84760850102
24418344,84760846102,84760858102,847608599102

Statistics
Based on NA data: way count 60M, node count 700M

| format  | size (bytes)   | comments |
| ------- | -------------- | -------- |
| csv     | 12,521,206,024 | |
| tar     | 2,035,689,499  | |
| snappy  | 3,623,614,741  | |
| parquet | 3,768,399,074  | just for comparison; iterating all the data doesn't need parquet |

Currently we use csv for the following steps, and I think a large share of the time goes to I/O while loading this 10 GB file. The file does not need to be human readable, so the next steps would be:

  1. Design a lossless data schema. We don't need to record the original values; I believe recording deltas would help a lot.

  2. For how the final result is consumed: if we load all of it into memory at once, we should choose a proper way to compress and decompress the data. I will try snappy first, since it has good decompression performance.

  3. If we plan to break the big task into smaller ones, we could consider some kind of partition strategy, e.g. partitioning the entire data set into a fixed number of parts based on wayid.
    Say we partition the data by the last digit of the wayid, and each partition is sorted:
    (partition1)
    34343401
    34343511
    34343551


(partition2)
34343402
34373572
34383552


(partition3)

If the traffic component passes its results to us partitioned by the same strategy, we could benefit from Swift's streaming and Go concurrency; a minimal merge-join sketch follows below.
After receiving all the traffic data whose wayid ends with 1, we could start a job joining it with the map data ending with 1. Since both data sets are sorted, generating the final joined result and pushing it into the output channel would be very quick.
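As an illustration only (these helper names do not exist in the repo; it is assumed both partitions arrive sorted by way id), the per-partition merge join could look like:

```go
package main

// partitionOf returns which of the 10 partitions a way belongs to,
// based on the last digit of its id.
func partitionOf(wayID int64) int {
	return int(wayID % 10)
}

// joinPartition merge-joins one sorted map-data partition with the matching
// sorted traffic partition and streams the way ids present in both into out.
func joinPartition(mapWays, trafficWays []int64, out chan<- int64) {
	i, j := 0, 0
	for i < len(mapWays) && j < len(trafficWays) {
		switch {
		case mapWays[i] < trafficWays[j]:
			i++
		case mapWays[i] > trafficWays[j]:
			j++
		default: // same way id in both partitions -> joined result
			out <- mapWays[i]
			i++
			j++
		}
	}
}
```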

Will experiment with 1 and 2 for now; 3 is just a draft idea for future discussion.


CodeBear801 commented Jun 27, 2019

After analyzing the source data, we feel a delta strategy could be a good way to reduce the size of the original information.
Instead of recording

24418343,84760849102,84760850102
24418344,84760846102,84760858102,847608599102

we could record

wayid1, nodeid1, (nodeid2-nodeid1)
wayid2-wayid1, nodeid3-nodeid1, nodeid4-nodeid3, nodeid5-nodeid4
wayid3-wayid2, nodeid6-nodeid3,nodeid7-nodeid6
//...

wayid 1000, nodeid1000, nodeid1001-nodeid1000
//...

Sample result is

24418343,84760849102,1102
1,-3102,12102,18102

Notes:

  • For nodes, because consecutive ids are similar, diffing each one against the previous id saves a lot of space
  • For ways, diffing against the previous way id also helps
  • If we sort the data first, the result should be even better
  • We must be careful about offsets going out of range during delta generation
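For illustration, decoding one such line back into absolute ids could look like the sketch below (decodeLine is a hypothetical name; it assumes the caller tracks the previous way id and node base across lines, mirroring whatever bases the encoder used):

```go
package main

import (
	"strconv"
	"strings"
)

// decodeLine turns one delta-encoded line back into an absolute way id and
// node ids, given the bases used when the line was encoded.
func decodeLine(line string, preWayID, preNodeID int64) (int64, []int64, error) {
	fields := strings.Split(strings.TrimSpace(line), ",")

	wayDelta, err := strconv.ParseInt(fields[0], 10, 64)
	if err != nil {
		return 0, nil, err
	}
	wayID := preWayID + wayDelta

	nodeIDs := make([]int64, 0, len(fields)-1)
	prev := preNodeID
	for _, f := range fields[1:] {
		d, err := strconv.ParseInt(f, 10, 64)
		if err != nil {
			return 0, nil, err
		}
		id := prev + d // each delta is relative to the previously decoded node
		nodeIDs = append(nodeIDs, id)
		prev = id
	}
	return wayID, nodeIDs, nil
}
```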


CodeBear801 commented Jun 27, 2019

Update on the experiment

Result:

| format | size (bytes)  | comments |
| ------ | ------------- | -------- |
| csv    | 4,457,788,518 | |
| snappy | 1,280,865,368 | 30% of the previous snappy size, ~10% of the original csv file |

Code:

// generateIDMapptingString encodes one <wayid, nodeids> record as a
// delta-encoded csv line: the way id is written as a diff against the
// previous way id, and each node id as a diff against the previously
// written node id.
func generateIDMapptingString(nodeids []int64, wayid int64,
	preNodeID int64, preWayID int64) string {
	var str string
	str += strconv.FormatInt((wayid-preWayID), 10) + ","

	for i, n := range nodeids {
		str += strconv.FormatInt((n - preNodeID), 10)
		if i < (len(nodeids) - 1) {
			str += ","
		}
		preNodeID = n // later deltas are relative to the previous node in this record
	}
	str += "\n"

	return str
}

// updateDeltaBase returns the delta base for the next record.  Every blobSize
// records it resets the base to zero, so that record is written with absolute
// ids and decoding can restart from any blob boundary.
func updateDeltaBase(wc uint32, currNodeID, currWayID int64) (int64, int64) {
	if wc%blobSize == 0 {
		return 0, 0
	}
	return currNodeID, currWayID
}

Experiment with different blob sizes

| blob size | snappy file size | comments |
| --------- | ---------------- | -------- |
| 10        | 1,337,122,870    | |
| 1000      | 1,280,865,368    | will choose this threshold |
| 10000     | 1,280,210,252    | |
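For reference, the snappy file can be produced by streaming the encoded csv through a snappy writer, roughly as in this sketch (assumes the github.com/golang/snappy package; file names are illustrative):

```go
package main

import (
	"bufio"
	"os"

	"github.com/golang/snappy"
)

// compressFile streams src into a snappy-framed dst line by line, so the
// whole multi-GB file never has to sit in memory at once.
func compressFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()

	sw := snappy.NewBufferedWriter(out)

	scanner := bufio.NewScanner(in)
	scanner.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // allow long lines
	for scanner.Scan() {
		if _, err := sw.Write(scanner.Bytes()); err != nil {
			return err
		}
		if _, err := sw.Write([]byte{'\n'}); err != nil {
			return err
		}
	}
	if err := scanner.Err(); err != nil {
		return err
	}
	return sw.Close() // Close flushes the remaining compressed frames
}
```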

CodeBear801 self-assigned this Jun 27, 2019
@wangyoucao577

Seems good. The blob size is the wayid2nodeids count, right?

@wangyoucao577

How about the snappy compression and decompression time cost?

@CodeBear801

> Seems good. The blob size is the wayid2nodeids count, right?

Blob size means the group count used while compressing the elements of wayid2nodeids. I chose the delta strategy to generate the diffs, and after every "blob size" elements the base is reset.
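For clarity, a driver loop tying the two functions above together could look like this sketch (the way type and emit callback are illustrative; blobSize is the constant already referenced by updateDeltaBase, 1000 in the chosen configuration):

```go
// way and emit are illustrative; generateIDMapptingString and updateDeltaBase
// are the functions shown in the experiment above.
type way struct {
	ID      int64
	NodeIDs []int64
}

func encodeAll(ways []way, emit func(string)) {
	var preNodeID, preWayID int64
	var wayCount uint32
	for _, w := range ways {
		if len(w.NodeIDs) == 0 {
			continue // nothing to encode for this way
		}
		emit(generateIDMapptingString(w.NodeIDs, w.ID, preNodeID, preWayID))
		wayCount++
		// Every blobSize records updateDeltaBase resets the base to zero,
		// so that record is written with absolute ids and decoding can
		// restart from any blob boundary.
		last := w.NodeIDs[len(w.NodeIDs)-1]
		preNodeID, preWayID = updateDeltaBase(wayCount, last, w.ID)
	}
}
```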

@CodeBear801

Back from emergency firefighting. Will continue this task.


CodeBear801 commented Jul 13, 2019

Latest strategy (v4)

  • Record the wayid->nodeids mapping table as snappy (12 GB -> 3 GB)
  • When the rpc request to the traffic server starts, start loading the mapping table and write its records into 64 channels (configurable).
    As soon as communication with the traffic server is finished, which means the wayid->speed mapping is generated, start the transform and dump.
  • 64 goroutines work on the transform, and finally one goroutine dumps the result into the file; a minimal sketch of this fan-out/fan-in shape follows below.

Processing time for getting traffic flow takes 14.504480 seconds
Processing time for building traffic map takes 6.370307 seconds
Processing time for loadWay2NodeidsTable takes 48.059116 seconds
Processing time for dumpSpeedTable4Customize takes 32.162450 seconds

Total time is 14 + 6 + 32 seconds, compared with 14 + 6 + 98 for v1.
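The sketch below shows the general shape only (transform is a placeholder for the real wayid->speed conversion, and none of these names are the actual functions in this repo):

```go
package main

import (
	"bufio"
	"os"
	"sync"
)

const numWorkers = 64 // configurable; could also default to runtime.NumCPU()

// transform is a placeholder for the real conversion of one mapping-table
// line plus the traffic map into speed-table rows.
func transform(line string, traffic map[int64]float64) string {
	_ = traffic
	return line + "\n"
}

// runPipeline fans the mapping-table lines out to numWorkers goroutines and
// fans the transformed rows back in to a single goroutine that writes the file.
func runPipeline(lines <-chan string, traffic map[int64]float64, outPath string) error {
	out := make(chan string, 1024)

	var wg sync.WaitGroup
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for line := range lines {
				out <- transform(line, traffic)
			}
		}()
	}

	// Close the output channel once every worker has drained its input.
	go func() {
		wg.Wait()
		close(out)
	}()

	f, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer f.Close()

	w := bufio.NewWriter(f)
	for row := range out {
		if _, err := w.WriteString(row); err != nil {
			return err
		}
	}
	return w.Flush()
}
```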

The strategy of V2

  • one goroutine to generate the traffic result (13 seconds)
  • one goroutine to build a map of wayid to nodeids array (280 seconds)
  • 16~32 goroutines to iterate the traffic array, query the map, and generate the final result (15 seconds)

Problems

  • Loading all the data into memory takes too much time. In particular, storing the nodeids as slices causes lots of memory allocation and deallocation.
  • The dump step has to wait for the first two steps to finish.

The strategy of V3

  • one goroutine to generate the traffic result (13 seconds)
  • 16~64 goroutines to decode the id mapping table in delta format and put [wayid, nodeids array] into a channel (100+ seconds)
  • as soon as step 1 is finished, start dumping (100+ seconds)

Problems:

  • Decoding the string and putting [wayid, nodeids array] into the channel causes a lot of additional memory usage; in v4 I put the string directly into the channel and move the decoding close to the final dump.
  • Both V2 and V3 use my own delta compression strategy, so the snappy input file is only about 1.2 GB, but it must be decoded while reading the strings, which makes the following steps hard to run in parallel. So I decided to go back to simply snappy-compressing the original file to simplify the flow.

@CodeBear801

Uploading the cpu profile and mem profile for v2; will do more analysis and compare with v4 later.
cpu-v2.pdf
mem-v2.pdf

@wangyoucao577

It's awesome! It would be better if there were a data flow diagram describing the flow.
For the 64 threads (goroutines, I think), how about setting the count to the number of CPU cores by default?

@CodeBear801
Copy link
Author

@wangyoucao577
Thanks for the suggestion. A flow diagram is needed :)
Will do more experiments to find how many goroutines give the best result.


CodeBear801 commented Jul 16, 2019

Some ideas about next steps:

  • The original requirement of this task exists because the traffic team dropped the original pbf node ids, so we have to regenerate them.
    Need to confirm with the traffic team: adding the original node id should not be hard on their side, they could put it in the rpc protocol.

  • When we consider end-to-end time, we should evaluate the time between traffic happening in the real world and that traffic being used by our routing engine:
    Real-world accident -> Traffic Provider -> Pull & Push request from Telenav -> Traffic component -> Script in this task -> OSRM customization -> Restart routing
    Among the parts we can change, we should consider how to optimize "Script in this task -> OSRM customization -> Restart routing".

  • For the script part, if we load way->nodeids into memory the time cost would be around 20 seconds, but memory usage could be up to 10 GB.
    We could consider keeping only the most frequent items in memory and putting the other items into an in-memory cache such as redis. The in-memory part could be updated more frequently (minute level), while the items in redis could be updated every few minutes; see the lookup sketch after this list.

  • Currently the biggest time cost is the OSRM customization part; each round of "OSRM customization" seems to take about 10 minutes.
    We might need to think about a delta strategy here as well, so that we do not update everything in every round.
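A minimal sketch of that split lookup, under the assumption of some external store for the cold items (the Store interface and all names here are hypothetical):

```go
package main

// Store is an assumed interface over the external cache (e.g. redis) that
// holds the less frequently used ways.
type Store interface {
	Get(wayID int64) (nodeIDs []int64, ok bool)
}

// NodeIDLookup keeps the hottest ways in process memory and falls back to
// the external store for everything else.
type NodeIDLookup struct {
	hot      map[int64][]int64 // most frequently used ways
	fallback Store             // colder ways, fetched on demand
}

func (l *NodeIDLookup) NodeIDs(wayID int64) ([]int64, bool) {
	if ids, ok := l.hot[wayID]; ok {
		return ids, true
	}
	return l.fallback.Get(wayID)
}
```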

CodeBear801 added a commit that referenced this issue Jul 16, 2019
…tes -wayid indicate for traffic flow on reverse direction.

issues: #31
wangyoucao577 pushed a commit that referenced this issue Jul 17, 2019
* feat: Optimize output of wayid2nodeids format, use delta format to comparess data

Issues: #31

* feat: Implement logic to compress/decompress file to snappy.

issues: #31

* feat: Modify osrm speed table generator to support snappy compression format.

issues: #31

* feat: Fix bug during conversion

* feat: Adjust traffic updater's logic to improve performance.

* feat: Adjust traffic updater's logic to improve performance.

issues: #39

* feat: Refine the code for osrm_traffic_updater.

issues: #31

* fix: fix dead lock in the code.

* fix: optimize performance with new architecture.
issues: #31

* fix: revert way id generator to original solution

* fix: Use string to pass between different components
issues: #31

* fix: update unit test for latest changes
issues: #31

* fix: remove useless printf

* fix: update unit test for osrm_traffic_updater.go

* fix: fix the misunderstanding with requirement.  Traffic server generates -wayid indicate for traffic flow on reverse direction.
issues: #31