WebScaleSQL and Twitter MySQL

Chris Aniszczyk edited this page Mar 27, 2014 · 5 revisions

We are happy to announce that we are joining the WebScaleSQL initiative, along with teams from Facebook, Google, and LinkedIn. WebScaleSQL is a collaboration between several engineering teams who run MySQL deployments tailored for the world-wide scale of the web. In the past, we've faced many of the same challenges but worked separately on solutions, both publicly and privately. Now we'd like to combine efforts and pool our resources into a common codebase so that we can share work, move faster, collaborate more easily, and reduce the maintenance effort.

Since Twitter was founded, MySQL has been its data store, and we have open sourced our MySQL changes via this fork. MySQL continues to be the data storage technology behind most Twitter data: the social graph, timelines, users, Direct Messages, and the Tweets themselves. Over time, Twitter has contributed dozens of patches to the MySQL community, including:

  • Bug #71411 buf_flush_LRU() does not return correct number in case of compressed pages
  • Bug #70899 unnecessary buf_flush_list() during recovery
  • Bug #70811 buf_flush_event initialized too late
  • Bug #70564 future_group_master_log_pos not set properly
  • Bug #70241 Innodb_metrics::INDEX_MERGE defined but not set
  • Bug #68023 InnoDB reserves an excessive amount of disk space for write operations
  • Bug #67963 InnoDB wastes 62 out of every 16384 pages
  • Bug #67718 InnoDB drastically under-fills pages in certain conditions
  • Bug #67476 Innodb_buffer_pool_read_ahead_evicted is inaccurate
  • Bug #67156 Sporadic query cache related crash in pthread_rwlock_init()

What has Twitter Contributed to WebScaleSQL?

To start, Twitter has pushed several patches:

  • backport WL#7047 from MySQL 5.7: buffer pool list scan optimization, to reduce excessive scanning of pages when doing flush list batches. The fix introduces the concept of a "Hazard Pointer", which reduces the time complexity of the scan from O(n^2) to O(n).
  • fix for mysql bug#71411: buf_flush_LRU() does not return correct number in case of compressed pages. buf_flush_LRU() returns the number of pages processed. There are two types of processing that can happen. A page can get evicted or a page can get flushed. These two numbers are quite distinct and should not be mixed.
  • fix for mysql bug#70500 and bug#71988: page cleaner should perform LRU flushing regardless of server activity. The page_cleaner thread does spurious background flushing because of the conditional sleep between iterations. The solution is to make the sleep independent of server activity.
  • fix for mysql bug#70899: unnecessary buf_flush_list() in startup code path. innobase_start_or_create_for_mysql() could flush the entire buffer pool after creating rsegs. The intent is to force trx_sys page to the disk. We can reach this code path after doing recovery. In this case we can potentially have millions of dirty pages in the buffer pool. This can seriously increase the recovery time.
  • support for NUMA interleave policy: this patch provides two startup options, flush-caches (flush and purge buffers/caches) and numa-interleave (run mysqld with its memory interleaved on all CPUs). It also provides a config option, 'innodb_buffer_pool_populate', for pre-allocation of buffer pool memory at startup.
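
To illustrate why a hazard pointer turns the flush list scan from O(n^2) into O(n), here is a minimal Python sketch. It is a toy model, not InnoDB code: the names `FlushList`, `scan_with_restart`, and `scan_with_hazard_pointer` are hypothetical, and it assumes the pre-patch scan restarts from the tail of the list after every flush, because the list may have changed while the mutex was released. With a hazard pointer, the scanner records the node it intends to visit next, and any thread that unlinks that node moves the pointer along, so the scan can resume where it left off.

```python
class Node:
    def __init__(self, page_id):
        self.page_id = page_id
        self.io_started = False  # flush issued; page stays listed until I/O completes
        self.prev = None
        self.next = None


class FlushList:
    """Toy doubly linked list with one hazard pointer (hypothetical model)."""

    def __init__(self, page_ids):
        self.head = None
        self.tail = None
        self.hazard = None  # node the scanner intends to visit next
        for pid in page_ids:
            node = Node(pid)
            node.prev = self.tail
            if self.tail is not None:
                self.tail.next = node
            else:
                self.head = node
            self.tail = node

    def remove(self, node):
        # Whoever unlinks a node must first move the hazard pointer off it.
        if self.hazard is node:
            self.hazard = node.prev
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        node.prev = node.next = None

    def page_ids(self):
        ids, node = [], self.head
        while node is not None:
            ids.append(node.page_id)
            node = node.next
        return ids


def scan_with_restart(lst, flush_page):
    """Assumed pre-patch behaviour: restart from the tail after each flush."""
    visited = 0
    node = lst.tail
    while node is not None:
        visited += 1
        if node.io_started:
            node = node.prev      # skip pages whose flush is already in flight
            continue
        flush_page(lst, node)     # list may change here, so start over
        node = lst.tail
    return visited


def scan_with_hazard_pointer(lst, flush_page):
    """Resume from the hazard pointer instead of restarting."""
    visited = 0
    node = lst.tail
    while node is not None:
        visited += 1
        lst.hazard = node.prev    # remember where to continue
        if not node.io_started:
            flush_page(lst, node)  # list may change; remove() keeps hazard valid
        node = lst.hazard
    lst.hazard = None
    return visited


def flush_page(lst, node):
    node.io_started = True


def flush_and_drop_prev(lst, node):
    """Flush, then simulate another thread unlinking the node we would visit next."""
    node.io_started = True
    if node.prev is not None:
        lst.remove(node.prev)


def demo(n=100):
    restart = scan_with_restart(FlushList(range(n)), flush_page)
    hazard = scan_with_hazard_pointer(FlushList(range(n)), flush_page)
    return restart, hazard
```

Calling `demo(n)` returns the number of node visits for each strategy. In this model the restart scan revisits every already-flushed page on each pass, so its visit count grows quadratically with `n`, while the hazard pointer scan visits each page once. The `flush_and_drop_prev` callback shows that the hazard pointer scan stays valid even when the next node is unlinked during a flush.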

The WebScaleSQL initiative is still young and we look forward to working with the community to push it in interesting directions and grow the project. Finally, we encourage you to attend our "Scaling Twitter with MySQL" talk and "MySQL at Twitter" panel Q&A at the upcoming MySQL Conference and Expo 2014.