This is yet another problem with GC/GOGC in the 1.4+ releases. Although this issue might look like a duplicate of #12228, #14189, and #14161, I will try to provide some additional details that should help you build a better technology.
Couchbase Server has a query engine written in Go. After using Go 1.4 for a long time, we decided to switch to Go 1.7. Unfortunately, the throughput of some queries dropped by 60% after the upgrade.
In order to explain the issue, let me introduce two workloads:
- Q1 - a very basic hash table lookup, relatively fast, small memory footprint.
- Q2 - a more complicated lookup based on the secondary indexes, slightly slower than Q1, also uses more memory.
Q1 throughput dropped by 60% after the upgrade to Go 1.7, while Q2 throughput remained almost the same.
We ran several experiments with GOGC and also tried Go 1.8 RC2 a couple of days ago. The following spreadsheet summarizes the most important results:
https://docs.google.com/spreadsheets/d/1M-0kEd7LR0_o5qEXyNDk6ZHkZEVV7Lb3T7ViIEQLl-A/edit#gid=0
Here are some observations and concerns:
- Go 1.8 delivers outstanding performance. Great job!
- Upgrading to Go 1.7 reduces the live memory size but dramatically increases the frequency of GC events.
- Although GOGC=200 helps to improve performance, Go 1.7 still seems slower than Go 1.4.
- The right table illustrates how higher memory usage can improve throughput. In this scenario we first run the Q2 workload to increase the live memory size and then apply Q1. This reduces the time spent in STW GC and significantly improves throughput, even with GOGC=100.
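The last experiment (Q1 after Q2) is very important. It shows how the relative nature of GOGC hurts use cases with a small live heap and a large amount of garbage: the next collection is triggered relative to the live heap size, so a small live heap leaves little headroom and a high allocation rate triggers collections very frequently. We even joked that we should allocate and hold on to a large array in order to get better performance.
It seems that an Xmx-style model (a fixed maximum heap size, as in the JVM) might have worked better for us.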
Go 1.8 with GOGC=200 provides a sufficient improvement and we consider this issue resolved. Let me know if you need any additional information (e.g., gctrace output). Otherwise, feel free to close the ticket;)