Optimize intermediate file of <way, nodeids> mapping table #31
Comments
Current file format:
Example
Statistic
Currently, we use CSV for the following steps, and I think a large share of the time is spent on the I/O of loading this 10GB file. The file does not need to be human readable, so the next steps would be
(partition2) (partition3) If the traffic component passes its results to us partitioned by the same strategy, we could benefit from swift's streaming and Go's concurrency. I will experiment with 1 and 2 for now; 3 is just a draft idea for future discussion. |
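A minimal sketch of how idea 3 could look on the Go side, assuming each partition is an independent stream of plain `wayid,nodeid,nodeid,...` lines with disjoint way IDs. The function names `parsePartition`, `loadPartitions` and `loadPartitionStrings` are hypothetical and not taken from the repository; this only illustrates parsing partitions concurrently, one goroutine per partition, and merging afterwards.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
	"sync"
)

// parsePartition reads lines of "wayid,nodeid,nodeid,..." from one partition.
// The line format is an assumption made for this sketch.
func parsePartition(r io.Reader) (map[int64][]int64, error) {
	table := make(map[int64][]int64)
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}
		fields := strings.Split(line, ",")
		wayid, err := strconv.ParseInt(fields[0], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("bad way id %q: %w", fields[0], err)
		}
		nodeids := make([]int64, 0, len(fields)-1)
		for _, f := range fields[1:] {
			n, err := strconv.ParseInt(f, 10, 64)
			if err != nil {
				return nil, fmt.Errorf("bad node id %q: %w", f, err)
			}
			nodeids = append(nodeids, n)
		}
		table[wayid] = nodeids
	}
	return table, scanner.Err()
}

// loadPartitions parses every partition in its own goroutine and merges the
// results. Each goroutine writes only to its own slice index, so no mutex is
// needed until the single-threaded merge at the end.
func loadPartitions(readers []io.Reader) (map[int64][]int64, error) {
	tables := make([]map[int64][]int64, len(readers))
	errs := make([]error, len(readers))
	var wg sync.WaitGroup
	for i, r := range readers {
		wg.Add(1)
		go func(i int, r io.Reader) {
			defer wg.Done()
			tables[i], errs[i] = parsePartition(r)
		}(i, r)
	}
	wg.Wait()

	merged := make(map[int64][]int64)
	for i, t := range tables {
		if errs[i] != nil {
			return nil, fmt.Errorf("partition %d: %w", i, errs[i])
		}
		for wayid, nodeids := range t {
			merged[wayid] = nodeids
		}
	}
	return merged, nil
}

// loadPartitionStrings is a convenience wrapper for in-memory partitions,
// handy for small experiments.
func loadPartitionStrings(parts []string) (map[int64][]int64, error) {
	readers := make([]io.Reader, len(parts))
	for i, p := range parts {
		readers[i] = strings.NewReader(p)
	}
	return loadPartitions(readers)
}
```

For example, `loadPartitionStrings([]string{"1,10,11\n2,20\n", "3,30,31,32\n"})` yields a table with three ways. Whether this actually beats a single sequential read depends on whether loading is bound by disk I/O or by parsing, which is what the profiling is meant to show.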
After analyzing the source data, we think a delta strategy could be a good way to reduce the size of the original information.
we could record
Sample result is
Notes:
|
Update with experiment results:
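Code:
import (
	"strconv"
	"strings"
)

// generateIDMapptingString encodes one way as a CSV line of deltas:
// the way ID relative to preWayID, then each node ID relative to the
// previous node ID (starting from preNodeID).
func generateIDMapptingString(nodeids []int64, wayid int64,
	preNodeID int64, preWayID int64) string {
	var sb strings.Builder
	sb.WriteString(strconv.FormatInt(wayid-preWayID, 10))
	sb.WriteString(",")
	for i, n := range nodeids {
		sb.WriteString(strconv.FormatInt(n-preNodeID, 10))
		if i < len(nodeids)-1 {
			sb.WriteString(",")
		}
		preNodeID = n
	}
	sb.WriteString("\n")
	return sb.String()
}

// updateDeltaBase returns the delta base for the next record. The base is
// reset to (0, 0) at every blob boundary. blobSize is a package-level
// constant defined elsewhere; its value is not shown here.
func updateDeltaBase(wc uint32, currNodeID, currWayID int64) (int64, int64) {
	if wc%blobSize == 0 {
		return 0, 0
	}
	return currNodeID, currWayID
}
Experiment with different blob size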
|
Seems good. The blob size is |
How about the |
Blob size is the number of elements grouped together when compressing wayid2nodeids. I chose the delta strategy to generate the diffs, and the base is reset after every "blob size" elements. |
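To make the base reset concrete, here is a rough sketch of the reading side. It assumes the writer calls `updateDeltaBase` after each way with the running way count and the last node ID of that way, which the snippet above does not show, so treat that as a guess. `decodeDeltaLine`, `decodeAll` and `wayNodes` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// wayNodes holds one decoded record.
type wayNodes struct {
	WayID   int64
	NodeIDs []int64
}

// decodeDeltaLine reverses the delta encoding of a single line of the form
// "<wayDelta>,<nodeDelta>,<nodeDelta>,...". preNodeID and preWayID are the
// delta bases in effect for this line (both 0 at the start of a blob).
func decodeDeltaLine(line string, preNodeID, preWayID int64) (int64, []int64, error) {
	fields := strings.Split(strings.TrimSpace(line), ",")
	wayDelta, err := strconv.ParseInt(fields[0], 10, 64)
	if err != nil {
		return 0, nil, fmt.Errorf("bad way delta %q: %w", fields[0], err)
	}
	nodeids := make([]int64, 0, len(fields)-1)
	for _, f := range fields[1:] {
		if f == "" {
			continue // a way with no nodes is written as "<wayDelta>,"
		}
		d, err := strconv.ParseInt(f, 10, 64)
		if err != nil {
			return 0, nil, fmt.Errorf("bad node delta %q: %w", f, err)
		}
		preNodeID += d // each node is stored relative to the previous one
		nodeids = append(nodeids, preNodeID)
	}
	return preWayID + wayDelta, nodeids, nil
}

// decodeAll decodes consecutive lines, resetting the base to (0, 0) after
// every blobSize records. This mirrors the assumed writer behaviour.
func decodeAll(lines []string, blobSize uint32) ([]wayNodes, error) {
	if blobSize == 0 {
		return nil, fmt.Errorf("blobSize must be positive")
	}
	var preNodeID, preWayID int64
	out := make([]wayNodes, 0, len(lines))
	for i, line := range lines {
		wayid, nodeids, err := decodeDeltaLine(line, preNodeID, preWayID)
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", i+1, err)
		}
		out = append(out, wayNodes{WayID: wayid, NodeIDs: nodeids})
		if uint32(i+1)%blobSize == 0 {
			preNodeID, preWayID = 0, 0
		} else {
			preWayID = wayid
			if len(nodeids) > 0 {
				preNodeID = nodeids[len(nodeids)-1]
			}
		}
	}
	return out, nil
}
```

With a blob size of 2, the lines `100,10,2,-1`, `5,0,9`, `7,3,1` decode to way 100 with nodes 10, 12, 11, way 105 with nodes 11, 20, and, after the reset, way 7 with nodes 3, 4. The reset means a reader can start decoding at any blob boundary without replaying everything before it, at the cost of one full-size ID per blob.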
Back from emergency firefighting. Will continue this task. |
Latest strategy (v4)
Total time is 14 + 6 + 32, compared with 14 + 6 + 98 for v1.
The strategy of V2
Problem
The strategy of V3
Problem:
|
Uploaded the CPU and memory profiles for v2; I will do more analysis and compare with v4 later. |
It's awesome! It would be even better with a data flow diagram describing the flow. |
@wangyoucao577 |
Some ideas about next steps:
|
…tes -wayid indicate for traffic flow on reverse direction. issues: #31
* feat: Optimize output of wayid2nodeids format, use delta format to comparess data Issues: #31
* feat: Implement logic to compress/decompress file to snappy. issues: #31
* feat: Modify osrm speed table generator to support snappy compression format. issues: #31
* feat: Fix bug during conversion
* feat: Adjust traffic updater's logic to improve performance.
* feat: Adjust traffic updater's logic to improve performance. issues: #39
* feat: Refine the code for osrm_traffic_updater. issues: #31
* fix: fix dead lock in the code.
* fix: optimize performance with new architecture. issues: #31
* fix: revert way id generator to original solution
* fix: Use string to pass between different components issues: #31
* fix: update unit test for latest changes issues: #31
* fix: remove useless printf
* fix: update unit test for osrm_traffic_updater.go
* fix: fix the misunderstanding with requirement. Traffic server generates -wayid indicate for traffic flow on reverse direction. issues: #31
We need to generate a <wayid, nodeids> mapping table to enable Telenav traffic for OSRM. For more background, see #22.
This table will be used to generate speed.csv every few minutes, which is then used for OSRM customization. The initial version takes about 70~80 seconds to iterate over the whole table and uses about 10GB of memory.
Our target is to reduce the time cost of processing this file.