Atomic ingest #4895
@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
b242c33 to 04404c5
@riversand963 has updated the pull request.
04404c5 to 45d60e4
a0c1e66 to 427c8c9
btw, I believe this PR can be merged only after #4922 is merged. Otherwise we will see an error.
775af76 to 54f1150
54f1150 to bd67672
Upstream commit ID: fb-mysql-5.6.35/269d05b0beb761d9f9ba057f12b62b0077a3c4a2
PS-6864: Merge fb-prod201902

Summary:
Enable atomic bulk loading: bulk loading using the SST APIs is now atomic. You should not observe any in-between state if a failure occurs during ingestion (there are some caveats, as usual, that need to be investigated and documented).

This change essentially delays all SST ingestion until finish_bulk_load and ingests all the files in one batch. Care is taken to ensure that:
1. The state management of multiple `Rdb_sst_info` objects is done transactionally as well, to match the ingestion.
2. The change works well with closing connections and ALTER TABLE statement race conditions (however, under race conditions the ingestion is forcefully interrupted and is therefore no longer truly atomic; this needs to be investigated to see whether there are other approaches).
3. Files are properly cleaned up in the failure case.

NOTE: I specifically avoided refactoring in order to make code review easier. Finish_bulk_load is becoming too big, and that will be addressed later.

This change takes advantage of the latest PR (facebook/rocksdb#4895) from Yanqin Jin, which supports ingesting SST files for multiple column families.

NOTE: This change updates rocksdb to the latest commit in the PR and will move to the proper rocksdb version once the PR is merged and we've decided on a proper RocksDB version to consume.

update-submodule: rocksdb
fbshipit-source-id: 767c36232b6
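The deferred, all-or-nothing pattern the commit message describes can be sketched roughly as follows. This is a minimal illustration only, not the MyRocks or RocksDB implementation: `FakeStore`, `BulkLoadSession`, `AddFile`, `Finish`, and the validator callback are all hypothetical names, and the sketch does not call RocksDB at all. It assumes a failure can be detected by validating each file before anything is published.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Column family name -> SST file paths (hypothetical representation).
using FileBatch = std::map<std::string, std::vector<std::string>>;
using Validator = std::function<bool(const std::string&)>;

// Hypothetical stand-in for a store with column families.
class FakeStore {
 public:
  // Publishes every file in `batch`, or none of them if any file is rejected.
  bool IngestAll(const FileBatch& batch, const Validator& validate) {
    // Phase 1: check every file before touching visible state.
    for (const auto& entry : batch) {
      for (const auto& file : entry.second) {
        if (!validate(file)) return false;
      }
    }
    // Phase 2: all checks passed, so make the whole batch visible.
    for (const auto& entry : batch) {
      auto& dst = visible_[entry.first];
      dst.insert(dst.end(), entry.second.begin(), entry.second.end());
    }
    return true;
  }

  std::size_t VisibleCount(const std::string& cf) const {
    auto it = visible_.find(cf);
    return it == visible_.end() ? 0 : it->second.size();
  }

 private:
  FileBatch visible_;
};

// Hypothetical session that defers ingestion until Finish() is called.
class BulkLoadSession {
 public:
  explicit BulkLoadSession(FakeStore* store) : store_(store) {
    assert(store_ != nullptr);
  }

  // Only records the file; nothing becomes visible yet.
  void AddFile(const std::string& cf, std::string path) {
    pending_[cf].push_back(std::move(path));
  }

  // Ingests all pending files in one batch. On failure, every pending
  // file is recorded as cleaned up and the store is left unchanged.
  bool Finish(const Validator& validate) {
    const bool ok = store_->IngestAll(pending_, validate);
    if (!ok) {
      for (const auto& entry : pending_) {
        cleaned_up_.insert(cleaned_up_.end(), entry.second.begin(),
                           entry.second.end());
      }
    }
    pending_.clear();
    return ok;
  }

  const std::vector<std::string>& cleaned_up() const { return cleaned_up_; }

 private:
  FakeStore* store_;
  FileBatch pending_;
  std::vector<std::string> cleaned_up_;
};
```

In this sketch, a caller adds files for several column families and then calls `Finish` once. If the validator rejects any file, no column family gains a file and all pending files are listed in `cleaned_up()`. The race conditions mentioned in item 2 are not modeled here.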
Upstream commit ID : fb-mysql-5.6.35/269d05b0beb761d9f9ba057f12b62b0077a3c4a2 PS-6864 : Merge fb-prod201902 Summary: Enable atomic bulk loading - bulk loading using SST APIs are now atomic. You should not observe any in between state if there is any failures in ingestion (there are some caveats, as usual, that needs to be investigated/documented). This change essentially delays all SST ingestion until finish_bulk_load, and ingest them all in one batch. Care is taken to: 1. Make sure the state management of multiple `Rdb_sst_info` are done transactionally as well to match the ingestion, 2. Works well with closing connections and ALTER TABLE statement race conditions (however under race conditions the ingestion are forcefully interrupted and therefore no longer truly atomic - this needs to be investigated to see if there are other approaches). 3. Files are properly cleaned up in the failure case NOTE: I specifically avoided doing refactoring in order to make code review easier. Finish_bulk_load is becoming too big and that will get addressed later. This changes takes advantage of latest [PR][(facebook/rocksdb#4895) from Yanqin Jin which supports ingesting SST from multiple column families. NOTE: This change updates rocksdb to latest commit in the PR and will update to the proper rocksdb version once the PR is merged and we've decided on a proper RocksDB version to consume. update-submodule: rocksdb fbshipit-source-id: 767c36232b6
Upstream commit ID : fb-mysql-5.6.35/269d05b0beb761d9f9ba057f12b62b0077a3c4a2 PS-6864 : Merge fb-prod201902 Summary: Enable atomic bulk loading - bulk loading using SST APIs are now atomic. You should not observe any in between state if there is any failures in ingestion (there are some caveats, as usual, that needs to be investigated/documented). This change essentially delays all SST ingestion until finish_bulk_load, and ingest them all in one batch. Care is taken to: 1. Make sure the state management of multiple `Rdb_sst_info` are done transactionally as well to match the ingestion, 2. Works well with closing connections and ALTER TABLE statement race conditions (however under race conditions the ingestion are forcefully interrupted and therefore no longer truly atomic - this needs to be investigated to see if there are other approaches). 3. Files are properly cleaned up in the failure case NOTE: I specifically avoided doing refactoring in order to make code review easier. Finish_bulk_load is becoming too big and that will get addressed later. This changes takes advantage of latest [PR][(facebook/rocksdb#4895) from Yanqin Jin which supports ingesting SST from multiple column families. NOTE: This change updates rocksdb to latest commit in the PR and will update to the proper rocksdb version once the PR is merged and we've decided on a proper RocksDB version to consume. update-submodule: rocksdb fbshipit-source-id: 767c36232b6
Upstream commit ID : fb-mysql-5.6.35/269d05b0beb761d9f9ba057f12b62b0077a3c4a2 PS-6864 : Merge fb-prod201902 Summary: Enable atomic bulk loading - bulk loading using SST APIs are now atomic. You should not observe any in between state if there is any failures in ingestion (there are some caveats, as usual, that needs to be investigated/documented). This change essentially delays all SST ingestion until finish_bulk_load, and ingest them all in one batch. Care is taken to: 1. Make sure the state management of multiple `Rdb_sst_info` are done transactionally as well to match the ingestion, 2. Works well with closing connections and ALTER TABLE statement race conditions (however under race conditions the ingestion are forcefully interrupted and therefore no longer truly atomic - this needs to be investigated to see if there are other approaches). 3. Files are properly cleaned up in the failure case NOTE: I specifically avoided doing refactoring in order to make code review easier. Finish_bulk_load is becoming too big and that will get addressed later. This changes takes advantage of latest [PR][(facebook/rocksdb#4895) from Yanqin Jin which supports ingesting SST from multiple column families. NOTE: This change updates rocksdb to latest commit in the PR and will update to the proper rocksdb version once the PR is merged and we've decided on a proper RocksDB version to consume. update-submodule: rocksdb fbshipit-source-id: 767c36232b6
Upstream commit ID : fb-mysql-5.6.35/269d05b0beb761d9f9ba057f12b62b0077a3c4a2 PS-6864 : Merge fb-prod201902 Summary: Enable atomic bulk loading - bulk loading using SST APIs are now atomic. You should not observe any in between state if there is any failures in ingestion (there are some caveats, as usual, that needs to be investigated/documented). This change essentially delays all SST ingestion until finish_bulk_load, and ingest them all in one batch. Care is taken to: 1. Make sure the state management of multiple `Rdb_sst_info` are done transactionally as well to match the ingestion, 2. Works well with closing connections and ALTER TABLE statement race conditions (however under race conditions the ingestion are forcefully interrupted and therefore no longer truly atomic - this needs to be investigated to see if there are other approaches). 3. Files are properly cleaned up in the failure case NOTE: I specifically avoided doing refactoring in order to make code review easier. Finish_bulk_load is becoming too big and that will get addressed later. This changes takes advantage of latest [PR][(facebook/rocksdb#4895) from Yanqin Jin which supports ingesting SST from multiple column families. NOTE: This change updates rocksdb to latest commit in the PR and will update to the proper rocksdb version once the PR is merged and we've decided on a proper RocksDB version to consume. update-submodule: rocksdb fbshipit-source-id: 767c36232b6
Upstream commit ID : fb-mysql-5.6.35/269d05b0beb761d9f9ba057f12b62b0077a3c4a2 PS-6864 : Merge fb-prod201902 Summary: Enable atomic bulk loading - bulk loading using SST APIs are now atomic. You should not observe any in between state if there is any failures in ingestion (there are some caveats, as usual, that needs to be investigated/documented). This change essentially delays all SST ingestion until finish_bulk_load, and ingest them all in one batch. Care is taken to: 1. Make sure the state management of multiple `Rdb_sst_info` are done transactionally as well to match the ingestion, 2. Works well with closing connections and ALTER TABLE statement race conditions (however under race conditions the ingestion are forcefully interrupted and therefore no longer truly atomic - this needs to be investigated to see if there are other approaches). 3. Files are properly cleaned up in the failure case NOTE: I specifically avoided doing refactoring in order to make code review easier. Finish_bulk_load is becoming too big and that will get addressed later. This changes takes advantage of latest [PR][(facebook/rocksdb#4895) from Yanqin Jin which supports ingesting SST from multiple column families. NOTE: This change updates rocksdb to latest commit in the PR and will update to the proper rocksdb version once the PR is merged and we've decided on a proper RocksDB version to consume. update-submodule: rocksdb fbshipit-source-id: 767c36232b6
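The "delay all ingestion until finish_bulk_load, then ingest in one batch" idea from the commit message above can be sketched roughly as follows. This is a toy model only, not MyRocks or RocksDB code: `PendingSst`, `Store`, `BulkLoad`, and the `is_valid` callback are hypothetical names invented for illustration, and an in-memory map stands in for the database.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one queued SST file and its target column family.
struct PendingSst {
  std::string column_family;
  std::string path;
};

// Toy "database": column family name -> ingested file paths.
using Store = std::map<std::string, std::vector<std::string>>;

class BulkLoad {
 public:
  // Queue a file instead of ingesting it right away.
  void Add(const std::string& cf, const std::string& path) {
    pending_.push_back(PendingSst{cf, path});
  }

  // Stage every queued file on a copy, then publish all of them at once.
  // If any file is rejected, `store` is left exactly as it was.
  bool Finish(Store& store,
              const std::function<bool(const PendingSst&)>& is_valid) {
    Store staged = store;
    bool ok = true;
    for (const PendingSst& f : pending_) {
      if (!is_valid(f)) {
        ok = false;
        break;
      }
      staged[f.column_family].push_back(f.path);
    }
    pending_.clear();  // stands in for cleaning up temporary files
    if (ok) store.swap(staged);
    return ok;
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  std::vector<PendingSst> pending_;
};
```

In this sketch a single rejected file makes `Finish` return `false` with the store unchanged, which mirrors the "no in-between state" goal; the interrupted-ingestion caveat mentioned in the commit message is not modeled.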
Make file ingestion atomic.
Summary: as title.
Ingesting external SST files into multiple column families should be atomic. If
a crash occurs and the DB reopens, either all column families have successfully
ingested the files before the crash, or none of the ingestions has any effect
on the state of the DB.
Also add unit tests for atomic ingestion.
Note that the unit tests here do not cover the case of an incomplete atomic group
in the MANIFEST, which is already covered in VersionSetTest.
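
The all-or-nothing behavior on reopen can be pictured with a small replay model. This is a simplified sketch, not RocksDB's recovery code: it assumes each MANIFEST record in an atomic group carries a count of the records still to follow, and `Edit` and `Replay` are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for one MANIFEST record (not RocksDB's VersionEdit).
struct Edit {
  std::string column_family;
  std::string file;
  bool in_atomic_group;
  uint32_t remaining;  // records still to come in the same group
};

// Replays a log and returns the edits that take effect.
// Edits in an atomic group are buffered and applied only once the record
// with remaining == 0 is seen; a group cut short by a crash is dropped.
std::vector<Edit> Replay(const std::vector<Edit>& log) {
  std::vector<Edit> applied;
  std::vector<Edit> group;
  for (const Edit& e : log) {
    if (!e.in_atomic_group) {
      applied.push_back(e);
      continue;
    }
    group.push_back(e);
    if (e.remaining == 0) {
      applied.insert(applied.end(), group.begin(), group.end());
      group.clear();
    }
  }
  // Whatever is left in `group` belongs to an incomplete group and is ignored.
  return applied;
}
```

Replaying a log that ends after the first of two grouped records yields none of the group's edits, so no column family sees a partial ingestion in this model.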
Test Plan: