-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a simple test to execute TableScan using Parquet files #869
Conversation
4f4b6e3
to
ea61ee5
Compare
I fixed the issue by changing the order of destruction to destroy RowReader before Reader. Will work on a proper PR. |
velox/dwio/CMakeLists.txt
Outdated
@@ -14,7 +14,7 @@ | |||
|
|||
add_subdirectory(common) | |||
add_subdirectory(dwrf) | |||
if(VELOX_ENABLE_PARQUET) | |||
#if(VELOX_ENABLE_PARQUET) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why commenting the if check out?
@@ -177,7 +177,7 @@ void ParquetRowReader::resetFilterCaches() { | |||
} | |||
|
|||
size_t ParquetRowReader::estimatedRowSize() const { | |||
VELOX_FAIL("ParquetRowReader::estimatedRowSize is NYI"); | |||
// VELOX_FAIL("ParquetRowReader::estimatedRowSize is NYI"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this line commented out?
@@ -14,25 +14,19 @@ | |||
* limitations under the License. | |||
*/ | |||
|
|||
#include <gmock/gmock.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these changes relavant?
velox/exec/tests/CMakeLists.txt
Outdated
@@ -60,6 +60,7 @@ target_link_libraries( | |||
velox_functions_lib | |||
velox_functions_prestosql | |||
velox_hive_connector | |||
velox_dwio_parquet_reader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add if(VELOX_ENABLE_PARQUET)
check? If it's not turned on, the binary won't be built
velox/exec/tests/HashJoinTest.cpp
Outdated
.planNode(); | ||
|
||
auto filePath = | ||
"/Users/mbasmanova/cpp/velox-1/velox/dwio/parquet/tests/examples/sample.parquet"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs change?
Thanks! |
b892e53
to
6147768
Compare
@majetideepak @yingsu00 The PR is ready for review. |
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
protected: | ||
using OperatorTestBase::assertQuery; | ||
|
||
void SetUp() override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: setUp (please modify the base function to be also camelcase).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This call is defined in the googletest library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me.
I assume we cannot test a subset of columns due to #865?
parquet::registerParquetReaderFactory(); | ||
} | ||
|
||
std::string getExampleFilePath(const std::string& fileName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: inline?
"velox/dwio/parquet/tests", "examples/" + fileName); | ||
} | ||
|
||
std::shared_ptr<connector::hive::HiveConnectorSplit> makeSplit( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we extend makeHiveConnectorSplit
inside velox/exec/tests/utils/HiveConnectorTestBase.h
and remove this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a good way of doing that, but I updated this method to re-use makeHiveConnectorSplit:
auto split = makeHiveConnectorSplit(filePath);
split->fileFormat = dwio::common::FileFormat::PARQUET;
return split;
}); | ||
createDuckDbTable({data}); | ||
|
||
auto filePath = getExampleFilePath("sample.parquet"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inline?
auto split = makeSplit(getExampleFilePath("sample.parquet"));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
That's right. Now that we have a basic test in place, it will be easy to repro such issues and add tests when fixing them. |
…ncubator#869) Summary: - Add a small test to execute TableScan using Parquet files with a pushed down filter and an aggregation on top. - Update HiveConnector to destroy RowReader before destroying Reader. ParquetRowReader holds an instance of an Allocator which needs to be alive as long as ParquetRowReader is alive. Pull Request resolved: facebookincubator#869 Differential Revision: D33540693 Pulled By: mbasmanova fbshipit-source-id: 26fd5cb07af1b3cf255838e3f9fc46740872a20f
This pull request was exported from Phabricator. Differential Revision: D33540693 |
6147768
to
2c783d4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Masha!
@mbasmanova Was this actually merged? It seems the PR was closed without merging. |
Yes, this change was merged. See https://github.com/facebookincubator/velox/commits/main |
Fixes #846