Skip to content

Commit

Permalink
Fix segfaults in WebP queries ran in parallel. (#5065)
Browse files Browse the repository at this point in the history
The WebpFilter calls `WriterTile::set_max_tile_chunk_size` to override
the maximum tile chunk size in an attempt to avoid further chunking data
within each tile, but because `WriterTile::max_tile_chunk_size_` is
static this doesn't work as intended when we're running multiple WebP
queries across threads. It's possible for one thread to override this
value while it's being used by another running query and this results in
incorrect chunking for WebP data, potentially causing segfaults from
libwebp reaching out of memory bounds.

The WebP library was reaching out of bounds because after another thread
overrides `max_tile_chunk_size_` we don't account for the smaller
chunk's dimensions when calling WebP APIs and still pass in the same
height, width, and row stride to [WebP API
calls](https://github.com/TileDB-Inc/TileDB/blob/dev/tiledb/sm/filter/webp_filter.cc#L144).

This PR removes the use of `WriterTile::set_max_tile_chunk_size` and
updates `FilterPipeline::use_tile_chunking` to return `false` for WebP,
which IIUC should have the behavior that was initially intended and
fixes segfaults seen on REST.


[SC-48697](https://app.shortcut.com/tiledb-inc/story/48697/rest-segfaults-running-webp-multiple-queries)

Limitations: If a WebP array exists that was impacted by this during
ingestion and it did not produce a segfault, that array will still be
read back using chunking. AFAICT there is no risk of segfault for this
case since the decoding APIs in WebP do not depend on the caller to
specify buffer dimensions, instead WebP will decode and provides to the
caller the width and height of the decoded image. On the next write to
the array there will be no error or risk of segfault and the new tiles
will not use chunking moving forward for reads or writes.

---
TYPE: BUG
DESC: Fix segfaults in WebP queries ran in parallel.
  • Loading branch information
shaunrd0 authored and KiterLuc committed Jun 14, 2024
1 parent 16c4b9d commit 1c1c961
Show file tree
Hide file tree
Showing 14 changed files with 188 additions and 32 deletions.
48 changes: 48 additions & 0 deletions test/support/src/whitebox_helpers.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
/**
* @file whitebox_helpers.h
*
* @section LICENSE
*
* The MIT License
*
* @copyright Copyright (c) 2024 TileDB, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*
* @section DESCRIPTION
*
* Helpers to provide whitebox access to TileDB internals for testing.
*/
#ifndef TILEDB_WHITEBOX_HELPERS_H
#define TILEDB_WHITEBOX_HELPERS_H

#include "tiledb/sm/tile/tile.h"

namespace tiledb::sm {

class WhiteboxWriterTile {
public:
static void set_max_tile_chunk_size(uint64_t size) {
WriterTile::max_tile_chunk_size_ = size;
}
};

} // namespace tiledb::sm

#endif // TILEDB_WHITEBOX_HELPERS_H
3 changes: 3 additions & 0 deletions tiledb/sm/filter/filter_pipeline.cc
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@
#include "tiledb/sm/misc/parallel_functions.h"
#include "tiledb/sm/stats/global_stats.h"
#include "tiledb/sm/tile/tile.h"
#include "webp_filter.h"

using namespace tiledb::common;

Expand Down Expand Up @@ -644,6 +645,8 @@ bool FilterPipeline::use_tile_chunking(
} else if (version >= 13 && has_filter(FilterType::FILTER_DICTIONARY)) {
return false;
}
} else if (has_filter(FilterType::FILTER_WEBP)) {
return false;
}

return true;
Expand Down
1 change: 1 addition & 0 deletions tiledb/sm/filter/test/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@ commence(unit_test run_filter_pipeline)
unit_encryption_pipeline.cc
unit_positive_delta_pipeline.cc
unit_run_filter_pipeline.cc
unit_webp_pipeline.cc
unit_xor_pipeline.cc
)
conclude(unit_test)
Expand Down
4 changes: 2 additions & 2 deletions tiledb/sm/filter/test/filter_test_support.cc
Original file line number Diff line number Diff line change
Expand Up @@ -42,12 +42,12 @@ SimpleVariableTestData::SimpleVariableTestData()
: target_ncells_per_chunk_{10}
, elements_per_chunk_{14, 6, 11, 7, 10, 10, 20, 10, 12}
, tile_data_generator_{{4, 10, 6, 11, 7, 9, 1, 10, 20, 2, 2, 2, 2, 2, 12}} {
WriterTile::set_max_tile_chunk_size(
WhiteboxWriterTile::set_max_tile_chunk_size(
target_ncells_per_chunk_ * sizeof(uint64_t));
}

SimpleVariableTestData::~SimpleVariableTestData() {
WriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
WhiteboxWriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
}

/**
Expand Down
3 changes: 2 additions & 1 deletion tiledb/sm/filter/test/tile_data_generator.h
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@
#include <algorithm>
#include <numeric>
#include <optional>
#include "test/support/src/whitebox_helpers.h"
#include "tiledb/sm/tile/tile.h"

using namespace tiledb::common;
Expand Down Expand Up @@ -139,7 +140,7 @@ class IncrementTileDataGenerator : public TileDataGenerator {
}

~IncrementTileDataGenerator() {
WriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
WhiteboxWriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
}

uint64_t cell_size() const override {
Expand Down
12 changes: 6 additions & 6 deletions tiledb/sm/filter/test/unit_bit_width_reduction_pipeline.cc
Original file line number Diff line number Diff line change
Expand Up @@ -269,7 +269,7 @@ TEST_CASE(
auto tile = make_increasing_tile(nelts, tracker);
auto offsets_tile = make_offsets_tile(offsets, tracker);

WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
CHECK(
pipeline.run_forward(&dummy_stats, tile.get(), offsets_tile.get(), &tp)
.ok());
Expand Down Expand Up @@ -335,7 +335,7 @@ TEST_CASE(
}

SECTION("- Window sizes") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
std::vector<uint32_t> window_sizes = {
32, 64, 128, 256, 437, 512, 1024, 2000};
for (auto window_size : window_sizes) {
Expand All @@ -362,7 +362,7 @@ TEST_CASE(
}

SECTION("- Random values") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
std::random_device rd;
auto seed = rd();
std::mt19937 gen(seed), gen_copy(seed);
Expand Down Expand Up @@ -401,7 +401,7 @@ TEST_CASE(
}

SECTION(" - Random signed values") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
std::random_device rd;
auto seed = rd();
std::mt19937 gen(seed), gen_copy(seed);
Expand Down Expand Up @@ -460,7 +460,7 @@ TEST_CASE(
}

SECTION("- Byte overflow") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
auto tile = make_shared<WriterTile>(
HERE(),
constants::format_version,
Expand Down Expand Up @@ -493,5 +493,5 @@ TEST_CASE(
}
}

WriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
WhiteboxWriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
}
6 changes: 3 additions & 3 deletions tiledb/sm/filter/test/unit_bitshuffle_pipeline.cc
Original file line number Diff line number Diff line change
Expand Up @@ -143,7 +143,7 @@ TEST_CASE("Filter: Test bitshuffle var", "[filter][bitshuffle][var]") {
pipeline.add_filter(BitshuffleFilter(Datatype::UINT64));

SECTION("- Single stage") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
CHECK(
pipeline.run_forward(&dummy_stats, tile.get(), offsets_tile.get(), &tp)
.ok());
Expand All @@ -161,7 +161,7 @@ TEST_CASE("Filter: Test bitshuffle var", "[filter][bitshuffle][var]") {
}

SECTION("- Indivisible by 8") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
const uint32_t nelts2 = 1001;
const uint64_t tile_size2 = nelts2 * sizeof(uint32_t);

Expand Down Expand Up @@ -194,5 +194,5 @@ TEST_CASE("Filter: Test bitshuffle var", "[filter][bitshuffle][var]") {
}
}

WriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
WhiteboxWriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
}
6 changes: 3 additions & 3 deletions tiledb/sm/filter/test/unit_byteshuffle_pipeline.cc
Original file line number Diff line number Diff line change
Expand Up @@ -141,7 +141,7 @@ TEST_CASE("Filter: Test byteshuffle var", "[filter][byteshuffle][var]") {
pipeline.add_filter(ByteshuffleFilter(Datatype::UINT64));

SECTION("- Single stage") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
CHECK(
pipeline.run_forward(&dummy_stats, tile.get(), offsets_tile.get(), &tp)
.ok());
Expand All @@ -159,7 +159,7 @@ TEST_CASE("Filter: Test byteshuffle var", "[filter][byteshuffle][var]") {
}

SECTION("- Uneven number of elements") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
const uint32_t nelts2 = 1001;
const uint64_t tile_size2 = nelts2 * sizeof(uint32_t);

Expand Down Expand Up @@ -192,5 +192,5 @@ TEST_CASE("Filter: Test byteshuffle var", "[filter][byteshuffle][var]") {
}
}

WriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
WhiteboxWriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
}
8 changes: 4 additions & 4 deletions tiledb/sm/filter/test/unit_positive_delta_pipeline.cc
Original file line number Diff line number Diff line change
Expand Up @@ -173,7 +173,7 @@ TEST_CASE(
auto tile = make_increasing_tile(nelts, tracker);
auto offsets_tile = make_offsets_tile(offsets, tracker);

WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
CHECK(
pipeline.run_forward(&dummy_stats, tile.get(), offsets_tile.get(), &tp)
.ok());
Expand Down Expand Up @@ -240,7 +240,7 @@ TEST_CASE(
}

SECTION("- Window sizes") {
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
std::vector<uint32_t> window_sizes = {
32, 64, 128, 256, 437, 512, 1024, 2000};
for (auto window_size : window_sizes) {
Expand Down Expand Up @@ -271,7 +271,7 @@ TEST_CASE(
auto tile = make_increasing_tile(nelts, tracker);
auto offsets_tile = make_offsets_tile(offsets, tracker);

WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
for (uint64_t i = 0; i < nelts; i++) {
auto val = nelts - i;
CHECK_NOTHROW(tile->write(&val, i * sizeof(uint64_t), sizeof(uint64_t)));
Expand All @@ -282,5 +282,5 @@ TEST_CASE(
.ok());
}

WriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
WhiteboxWriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
}
2 changes: 1 addition & 1 deletion tiledb/sm/filter/test/unit_run_filter_pipeline.cc
Original file line number Diff line number Diff line change
Expand Up @@ -293,7 +293,7 @@ TEST_CASE(

SECTION("- Multi-stage") {
// Add a few more +1 filters and re-run.
WriterTile::set_max_tile_chunk_size(80);
WhiteboxWriterTile::set_max_tile_chunk_size(80);
pipeline.add_filter(Add1InPlace(Datatype::UINT64));
pipeline.add_filter(Add1InPlace(Datatype::UINT64));

Expand Down
110 changes: 110 additions & 0 deletions tiledb/sm/filter/test/unit_webp_pipeline.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
/**
* @file unit_webp_pipeline.cc
*
* @section LICENSE
*
* The MIT License
*
* @copyright Copyright (c) 2024 TileDB, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*
* @section DESCRIPTION
*
* This set of unit tests checks running the filter pipeline with the webp
* filter.
*/

#include <test/support/tdb_catch.h>
#include "filter_test_support.h"
#include "test/support/src/mem_helpers.h"
#include "test/support/src/whitebox_helpers.h"
#include "tiledb/sm/filter/webp_filter.h"
#include "tiledb/sm/tile/tile.h"

using namespace tiledb::sm;

TEST_CASE("Filter: Round trip WebpFilter RGB data", "[filter][webp]") {
tiledb::sm::Config config;
auto tracker = tiledb::test::create_test_memory_tracker();

uint64_t height = 100;
uint64_t width = 100;
uint64_t row_stride = width * 3;
auto tile = make_shared<WriterTile>(
HERE(),
constants::format_version,
Datatype::UINT8,
sizeof(uint8_t),
height * row_stride,
tracker);

// Write full image in a single tile with chunking enabled.
std::vector<uint8_t> data{0, 125, 255};
std::vector<uint8_t> expected_result(height * row_stride, 0);
for (uint64_t i = 0; i < height * width; i++) {
// Write three values for each RGB pixel.
for (uint64_t j = 0; j < 3; j++) {
CHECK_NOTHROW(tile->write(&data[j], i * 3 + j, sizeof(uint8_t)));
expected_result[i * 3 + j] = data[j];
}
}
// For the write process 10 rows at a time using tile chunking.
// The row stride is 300 bytes, so the tile chunk size is 3000 bytes.
uint64_t extent_y = 10;
WhiteboxWriterTile::set_max_tile_chunk_size(extent_y * row_stride);

FilterPipeline pipeline;
ThreadPool tp(4);
float quality = 100.0f;
bool lossless = true;
// NOTE: The extent_y parameter must respect chunk size or risk OOB access.
// This sets WebpFilter::extents_ which is passed to WebP API during encoding.
// If we set this to `height`, webp will reach past the end of chunked data.
pipeline.add_filter(WebpFilter(
quality,
WebpInputFormat::WEBP_RGB,
lossless,
extent_y,
width * 3,
Datatype::UINT8));
bool use_chunking = true;
CHECK(
pipeline.run_forward(&dummy_stats, tile.get(), nullptr, &tp, use_chunking)
.ok());

// Check the original unfiltered data was removed.
CHECK(tile->size() == 0);
CHECK(tile->filtered_buffer().size() != 0);

// Read the full image back with chunking disabled.
// NOTE: For the read case, WebP decoding APIs don't require height and width.
// Instead, WebP takes references to these values during unfiltering and sets
// them to the correct values after decoding is finished.
WhiteboxWriterTile::set_max_tile_chunk_size(constants::max_tile_chunk_size);
auto unfiltered_tile =
create_tile_for_unfiltering(height * row_stride, tile, tracker);
run_reverse(config, tp, unfiltered_tile, pipeline);

for (uint64_t i = 0; i < height * row_stride; i++) {
uint8_t value;
CHECK_NOTHROW(unfiltered_tile.read(&value, i, sizeof(uint8_t)));
CHECK(value == expected_result[i]);
}
}
1 change: 0 additions & 1 deletion tiledb/sm/filter/webp_filter.cc
Original file line number Diff line number Diff line change
Expand Up @@ -381,7 +381,6 @@ void WebpFilter::set_extents(const std::vector<ByteVecValue>& extents) {
throw StatusException(Status_FilterError(
"Tile extents too large; Max size WebP image is 16383x16383 pixels"));
}
WriterTile::set_max_tile_chunk_size(extents_.first * extents_.second);
}

template void WebpFilter::set_extents<uint8_t>(
Expand Down
4 changes: 0 additions & 4 deletions tiledb/sm/tile/tile.cc
Original file line number Diff line number Diff line change
Expand Up @@ -98,10 +98,6 @@ uint32_t WriterTile::compute_chunk_size(
return static_cast<uint32_t>(chunk_size64);
}

void WriterTile::set_max_tile_chunk_size(uint64_t max_tile_chunk_size) {
max_tile_chunk_size_ = max_tile_chunk_size;
}

/* ****************************** */
/* CONSTRUCTORS & DESTRUCTORS */
/* ****************************** */
Expand Down
Loading

0 comments on commit 1c1c961

Please sign in to comment.