Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not fail getData for deleted buffer #9207

Closed

Conversation

arhimondr
Copy link
Contributor

@arhimondr arhimondr commented Mar 21, 2024

Sometimes requests can arrive out of order. When a getData arrives after a buffer is closed it may cause an unwanted failure.

Fixes prestodb/presto#22129

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 21, 2024
Copy link

netlify bot commented Mar 21, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 13e1ce4
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/660ac47359471600086fa305

@arhimondr
Copy link
Contributor Author

arhimondr commented Mar 21, 2024

Currently impacts Prestissimo: prestodb/presto#22129

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
message(STATUS "Using Boost - Bundled")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change? CC: @kgpai @assignUser @majetideepak

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks! This is a generated file and shouldn't be checked in. (it's in .gitignore too)

updateAfterAcknowledgeLocked(freed, promises);
data = buffer->getData(
maxBytes, sequence, notify, activeCheck, arbitraryBuffer_.get());
if (buffer) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update comments for this method in the header file to mention this behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comment for deleteResults

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Would you also update comments for OutputBufferManager::getData API as this is the public API used by Prestissimo?

@mbasmanova mbasmanova requested a review from Yuhta March 25, 2024 15:52
@Yuhta
Copy link
Contributor

Yuhta commented Mar 25, 2024

@arhimondr Also we never saw it happening in shadow workload, could it be something not set up correctly in the tests?

@mbasmanova
Copy link
Contributor

@arhimondr Also we never saw it happening in shadow workload, could it be something not set up correctly in the tests?

@Yuhta We are seeing that: 20240324_170232_00988_y57ss

@arhimondr
Copy link
Contributor Author

Sorry for the delay. Updated. @mbasmanova, @Yuhta could you please take an another look?

@amitkdutta
Copy link
Contributor

@arhimondr Also we never saw it happening in shadow workload, could it be something not set up correctly in the tests?

@Yuhta This happens ~10 times per day. Not frequent, but its there since the beginning of Prestissimo and a major source of flakiness.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arhimondr Thank you for the fix. Looks good % one ask for a comment.

Since requests can arrive out of order
@arhimondr
Copy link
Contributor Author

Test failure related to #9318

@facebook-github-bot
Copy link
Contributor

@arhimondr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@majetideepak
Copy link
Collaborator

We are seeing this as well internally. @arhimondr thanks for the fix!

@facebook-github-bot
Copy link
Contributor

@arhimondr merged this pull request in d4b3618.

Copy link

Conbench analyzed the 1 benchmark run on commit d4b3618e.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Apr 4, 2024
Summary:
Sometimes requests can arrive out of order. When a `getData` arrives after a buffer is closed it may cause an unwanted failure.

Fixes prestodb/presto#22129

Pull Request resolved: facebookincubator#9207

Reviewed By: mbasmanova

Differential Revision: D55591528

Pulled By: arhimondr

fbshipit-source-id: db493477e236b6c25ea39887a3bab5273558455b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AbstractTestNativeGeneralQueries.testSystemTables is flaky
7 participants