[SPARK-34326][CORE][SQL] Fix UTs added in SPARK-31793 depending on the length of temp path #31435
val pathsInLocation = location.get.substring(
  location.get.indexOf('[') + 1, location.get.indexOf(']')).split(", ").toSeq

// If the temp path length is less than (stop appending threshold - 1), say, 100 - 1 = 99,
I intentionally excluded the closing bracket (']') from the count, but the closing bracket is added anyway, regardless of the current length of the location string. This was not accounted for either.
cc'ing @gengliangwang @cloud-fan @HyukjinKwon @maropu, who are the author/reviewers of #28610
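To make the boundary at 99 concrete, here is a minimal sketch of the behaviour described in this thread: whole paths are appended while the string is shorter than the threshold, and the closing bracket is always added. `LocationMetadataSketch`, `buildLocation` and `pathsIn` are hypothetical names for illustration only, and the loop is an assumption based on the discussion here, not a copy of the Spark implementation.

```scala
object LocationMetadataSketch {
  // Hypothetical stand-in for the metadata builder discussed in this thread;
  // not copied from Spark.
  def buildLocation(paths: Seq[String], stopAppendingThreshold: Int): String = {
    val sb = new StringBuilder("[")
    var i = 0
    // Keep appending whole paths while the string is still under the threshold.
    while (i < paths.length && sb.length < stopAppendingThreshold) {
      if (i > 0) sb.append(", ")
      sb.append(paths(i))
      i += 1
    }
    sb.append("]") // always added, whatever the current length is
    sb.toString
  }

  // Same parsing as in the test: take the text between '[' and ']' and split on ", ".
  def pathsIn(location: String): Seq[String] =
    location.substring(location.indexOf('[') + 1, location.indexOf(']')).split(", ").toSeq

  def main(args: Array[String]): Unit = {
    // '[' + 98 characters = 99 < 100, so a second path is still appended.
    val shortPaths = Seq.fill(3)("a" * 98)
    // '[' + 99 characters = 100, so appending stops after the first path.
    val longPaths = Seq.fill(3)("a" * 99)
    println(pathsIn(buildLocation(shortPaths, 100)).size)
    println(pathsIn(buildLocation(longPaths, 100)).size)
  }
}
```

Under these assumptions, a first path shorter than 99 characters always leaves room for at least a second path, which is what the new `pathsInLocation.size >= 2` assertion relies on.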
Test build #134770 has started for PR 31435 at commit
@@ -137,9 +137,24 @@ class DataSourceScanExecRedactionSuite extends DataSourceScanRedactionTest {
assert(location.isDefined)
// The location metadata should at least contain one path
assert(location.get.contains(paths.head))
// If the temp path length is larger than 100, the metadata length should not exceed
// twice of the length; otherwise, the metadata length should be controlled within 200.
assert(location.get.length < Math.max(paths.head.length, 100) * 2)
You are right, this can be flaky.
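A rough way to see why the old bound can fail, assuming appending only stops once the string has reached the threshold of 100 and the closing bracket is then always added (an assumption taken from this thread, not verified against the Spark source). `OldBoundSketch`, `worstCaseLength` and `oldBound` are hypothetical helper names.

```scala
object OldBoundSketch {
  val threshold = 100

  // Assumed worst case: the last length check passes at threshold - 1, after which
  // ", ", one more path and the closing ']' are still appended.
  def worstCaseLength(pathLen: Int): Int = (threshold - 1) + 2 + pathLen + 1

  // Upper bound used by the old assertion quoted above.
  def oldBound(headLen: Int): Int = math.max(headLen, threshold) * 2

  def main(args: Array[String]): Unit = {
    // '[' plus one 98-character path is exactly 99, so this worst case is reachable
    // when every path is 98 characters long.
    println(worstCaseLength(98) < oldBound(98))
    println(worstCaseLength(97) < oldBound(97))
  }
}
```

With 98-character paths the worst-case length equals the bound (200), so the strict `<` comparison fails, while at 97 characters it still holds. That is consistent with the test depending on the length of the temp directory.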
// path.
// (Note we apply subtraction with 1 to count start bracket '['.)
if (paths.head.length < 99) {
  assert(pathsInLocation.size >= 2)
We still need to check that the paths are truncated here.
I'm not sure I understand. Could you please elaborate?
We don't truncate the path itself, right? I think that is also something to fix (I'd rather see a path truncated with an ellipsis (...) than simply not appended), but it is likely a bigger change that deserves its own fix rather than being part of a test fix.
If you meant counting the number of paths or something similar for edge cases, handling that in the other UT would be easier; we could simply add the edge cases there in this PR. (And that UT already tests the basic functionality.)
I mean simply checking
assert(pathsInLocation.size < paths.size)
But this is a minor comment. You can ignore it since there is UT for it already.
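A small sketch of what the suggested check would look like in isolation. The parsing mirrors the `pathsInLocation` expression from the diff; the `location` string and the `paths` values are made-up sample data, and `TruncationCheckSketch` is a hypothetical name.

```scala
object TruncationCheckSketch {
  // Same parsing as in the test: take the text between '[' and ']' and split on ", ".
  def pathsIn(location: String): Seq[String] =
    location.substring(location.indexOf('[') + 1, location.indexOf(']')).split(", ").toSeq

  def main(args: Array[String]): Unit = {
    // Made-up sample data: three input paths, of which only two made it into the metadata.
    val paths = Seq("/tmp/a", "/tmp/b", "/tmp/c")
    val location = "[/tmp/a, /tmp/b]"
    val pathsInLocation = pathsIn(location)
    // The suggested assertion: fewer paths are shown than were given.
    assert(pathsInLocation.size < paths.size)
    println(pathsInLocation.mkString(", "))
  }
}
```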
Yeah, makes sense, but let's leave it as it is. I think I'll propose another form that makes clear how many elements there are, instead of showing only the paths that fit without saying there are more.
Nice catch!
retest this, please
Just retriggered Github Action as well.
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #134782 has finished for PR 31435 at commit
Merging to master/3.1.
…e length of temp path

### What changes were proposed in this pull request?

This PR proposes to fix the UTs being added in SPARK-31793, so that all things contributing to the length limit are properly accounted for.

### Why are the changes needed?

The test `DataSourceScanExecRedactionSuite.SPARK-31793: FileSourceScanExec metadata should contain limited file paths` is failing conditionally, depending on the length of the temp directory.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

The modified UTs explain the missing points and also test them.

Closes #31435 from HeartSaVioR/SPARK-34326.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
(cherry picked from commit 6386602)
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
Thanks all for the quick reviews!
Hm, the tests started to fail:
https://github.com/apache/spark/runs/1818352990
@HeartSaVioR, please let me revert this for now: it was meant to fix the test, but the test is now failing anyway.
Ah thanks for the check. Looks odd... I'll check it.
Raised a new PR #31449
What changes were proposed in this pull request?
This PR proposes to fix the UTs being added in SPARK-31793, so that all things contributing to the length limit are properly accounted for.
Why are the changes needed?
The test `DataSourceScanExecRedactionSuite.SPARK-31793: FileSourceScanExec metadata should contain limited file paths` is failing conditionally, depending on the length of the temp directory.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
The modified UTs explain the missing points and also test them.