Skip to content
This repository has been archived by the owner on Feb 20, 2023. It is now read-only.

Logging with Two Threads #370

Merged
merged 138 commits into from
Jun 13, 2019

Conversation

thepulkitagarwal
Copy link
Contributor

@thepulkitagarwal thepulkitagarwal commented May 5, 2019

Before this PR, our logging was single threaded. Buffers were pushed into a queue by transactions, the buffers were then popped from the queue, serialized, and persisted in disk. This is slow as we can perform the serialization and persisting in parallel.

This PR performs a large reformatting on the log manager. How transactions interact with the log manager remains the same. Transactions hand off buffers to the log manager, which are added to an internal queue. This minimizes the amount of time transactions spend interacting with the log manager.

The main change is the addition of two tasks running on dedicated threads (using the DedicatedThreadRegistry):

  1. LogSerializerTask: This task pops unserialized buffers from the internal unserialized buffer queue, serializes them, and adds them to a consumer queue to be consumed by the consumer task.
  2. DiskLogConsumerTask: This task pops serialized buffers from the consumer queue and writes them to the log file. It will also persist the log file under certain conditions (periodically, we have written a significant amount of data since the last persist, or someone forces a flush). The task will also call any corresponding commit callbacks when it persists commit records.

https://www.youtube.com/watch?v=oiY_iKSpWLM

@thepulkitagarwal thepulkitagarwal added class-project This is part of the DB class (15-721) feature Adds a requested feature in-progress This PR is being actively worked on and not ready to be reviewed or merged. Mark PRs with this. labels May 5, 2019
@utkarsh39 utkarsh39 changed the title Log Serialization with two threads Logging with two threads May 5, 2019
@utkarsh39 utkarsh39 added bug Something isn't working (correctness). Mark issues with this. and removed bug Something isn't working (correctness). Mark issues with this. labels May 5, 2019
@utkarsh39 utkarsh39 closed this May 5, 2019
@utkarsh39 utkarsh39 reopened this May 5, 2019
@codecov
Copy link

codecov bot commented May 5, 2019

Codecov Report

Merging #370 into master will increase coverage by 0.11%.
The diff coverage is 92.81%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #370      +/-   ##
==========================================
+ Coverage   87.17%   87.29%   +0.11%     
==========================================
  Files         211      214       +3     
  Lines        7635     7761     +126     
==========================================
+ Hits         6656     6775     +119     
- Misses        979      986       +7
Impacted Files Coverage Δ
src/include/common/notifiable_task.h 93.75% <ø> (ø) ⬆️
src/include/storage/projected_row.h 100% <ø> (ø) ⬆️
src/include/common/dedicated_thread_registry.h 100% <100%> (ø) ⬆️
src/include/storage/write_ahead_log/log_io.h 91.17% <100%> (+2.6%) ⬆️
...c/storage/write_ahead_log/disk_log_writer_task.cpp 100% <100%> (ø)
src/include/network/connection_dispatcher_task.h 100% <100%> (ø) ⬆️
src/include/main/db_main.h 100% <100%> (ø) ⬆️
src/include/storage/data_table.h 100% <100%> (ø) ⬆️
src/include/network/terrier_server.h 100% <100%> (ø) ⬆️
src/include/settings/settings_defs.h 100% <100%> (ø) ⬆️
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d67d909...4f9060e. Read the comment docs.

@GustavoAngulo GustavoAngulo added ready-for-review This PR passes all checks and is ready to be reviewed. Mark PRs with this. and removed in-progress This PR is being actively worked on and not ready to be reviewed or merged. Mark PRs with this. labels Jun 10, 2019
@GustavoAngulo
Copy link
Member

Benchmark numbers for LoggingBenchmark and TPCCBenchmark looked better when I switched the serializer task to moving the entire queue instead if popping elements from it.

Copy link
Contributor

@tli2 tli2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for this PR.

We will have more work to do here to remove singletons and to rework the thread registry logic a bit more. Particularly, I think how this will work with dependency injection will be somewhat tricky.

@GustavoAngulo can you open issues for those after we merge this? Also make an issue to update the wiki's storage engine design doc I guess...

@tli2 tli2 added ready-to-merge This PR is ready to be merged. Mark PRs with this. and removed ready-for-review This PR passes all checks and is ready to be reviewed. Mark PRs with this. labels Jun 12, 2019
@GustavoAngulo GustavoAngulo changed the title Logging with two threads Multithreaded Logging Jun 13, 2019
@GustavoAngulo GustavoAngulo changed the title Multithreaded Logging Logging with Two Threads Jun 13, 2019
@GustavoAngulo GustavoAngulo merged commit ebb164d into cmu-db:master Jun 13, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature Adds a requested feature ready-to-merge This PR is ready to be merged. Mark PRs with this.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants