-
Notifications
You must be signed in to change notification settings - Fork 7.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZOOKEEPER-4566: Create tool for recursive snapshot analysis #1902
Conversation
add a tool to recursively collect and display child count and data stored in sub-trees
} | ||
|
||
public void run(String snapshotFileName, String startingNode, int maxDepth) throws IOException { | ||
InputStream is = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PATH_TRAVERSAL_IN: This API (java/io/FileInputStream.(Ljava/lang/String;)V) reads a file whose location might be specified by user input
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
InputStream is = | ||
new CheckedInputStream(new BufferedInputStream(new FileInputStream(snapshotFileName)), | ||
new Adler32()); | ||
InputArchive ia = BinaryInputArchive.getArchive(is); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RESOURCE_LEAK: resource of type java.util.zip.CheckedInputStream
acquired by call to new()
at line 69 is not released after line 72.
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
DataTree dataTree = new DataTree(); | ||
Map<Long,Integer> sessions = new HashMap<Long,Integer>(); | ||
|
||
fileSnap.deserialize(dataTree, sessions, ia); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RESOURCE_LEAK: resource of type java.io.DataInputStream
acquired by call to getArchive(...)
at line 72 is not released after line 79.
Note: potential exception at line 79
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
bin/zkSnapShotSumToolkit.sh
Outdated
# use POSIX interface, symlink is followed automatically | ||
ZOOBIN="${BASH_SOURCE-$0}" | ||
ZOOBIN="$(dirname "${ZOOBIN}")" | ||
ZOOBINDIR="$(cd "${ZOOBIN}"; pwd)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SC2164: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
bin/zkSnapShotSumToolkit.sh
Outdated
if [ -e "$ZOOBIN/../libexec/zkEnv.sh" ]; then | ||
. "$ZOOBINDIR"/../libexec/zkEnv.sh | ||
else | ||
. "$ZOOBINDIR"/zkEnv.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SC1091: Not following: ./zkEnv.sh was not specified as input (see shellcheck -x).
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
bin/zkSnapShotSumToolkit.sh
Outdated
ZOOBINDIR="$(cd "${ZOOBIN}"; pwd)" | ||
|
||
if [ -e "$ZOOBIN/../libexec/zkEnv.sh" ]; then | ||
. "$ZOOBINDIR"/../libexec/zkEnv.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SC1091: Not following: ./../libexec/zkEnv.sh was not specified as input (see shellcheck -x).
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
bin/zkSnapShotSumToolkit.sh
Outdated
. "$ZOOBINDIR"/zkEnv.sh | ||
fi | ||
|
||
"$JAVA" -cp "$CLASSPATH" $JVMFLAGS \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SC2086: Double quote to prevent globbing and word splitting.
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice tool, thanks for sharing!! A few minor comments:
- we already have some documentation about these zookeeper tools: https://zookeeper.apache.org/doc/r3.8.0/zookeeperTools.html
could you add a description of the new tool here? You can find the source file for this here: https://github.com/apache/zookeeper/blob/master/zookeeper-docs/src/main/resources/markdown/zookeeperTools.md - please check the automated sonatype-lift comments. Some of them might make sense (and add "@sonatype-lift ignore" to the rest)
- NIT: I'm not sure about the name of the tool... I don't know what would be the best, but maybe something like: 'SnapsotRecursiveSubTreeSize' ? maybe a bit too long... What do you think? (I'm OK if you think it is good as it is)
- NIT: what about also printing the number of the children ZNodes? It can also be an improvement for the future...
*/ | ||
public static void main(String[] args) throws Exception { | ||
if (args.length != 3) { | ||
System.err.println("USAGE: SnapshotSumFormatter snapshot_file starting_node max_depth"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you be a bit more "helpful" :) here? like:
SnapshotSumFormatter is a tool traversing the whole snapshot and summing up the data in the subtrees.
USAGE:
SnapshotSumFormatter <snapshot_file> <starting_node> <max_depth>
snapshot_file: path to the zookeeper snapshot
starting_node: the path in the zookeeper tree where the traversal should begin
max_depth: the depth where you would like to get the output (data in paths longer than the depth will be still summed up, but won't be printed out)
Or something like this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just saw your documentation in the code above the class, that might be even better...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@symat Thank you for your feedback!
I'm trying to follow the patterns in SnapshotFormatter and will apply or ignore sonatypes based on that. If that is acceptable for you.
I have extended the Tools documentation with a section on the new tool. Renamed it to SnapshotRecursiveSummar / zkSnapshotRecursiveSummaryToolkit.sh and added an extended usage based on your suggestion.
what about also printing the number of the children ZNodes?
That is already printed. Please check the example output in the jira ticket. Or you would like it presented in some other way?
name change and some cleanup
} | ||
|
||
public void run(String snapshotFileName, String startingNode, int maxDepth) throws IOException { | ||
File snapshotFile = new File(snapshotFileName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PATH_TRAVERSAL_IN: This API (java/io/File.(Ljava/lang/String;)V) reads a file whose location might be specified by user input
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sonatype-lift ignore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've recorded this as ignored for this pull request. If you change your mind, just comment @sonatype-lift unignore
.
DataTree dataTree = new DataTree(); | ||
Map<Long,Integer> sessions = new HashMap<Long,Integer>(); | ||
|
||
fileSnap.deserialize(dataTree, sessions, ia); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RESOURCE_LEAK: resource of type java.io.DataInputStream
acquired by call to getArchive(...)
at line 69 is not released after line 76.
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this warning message: the inputStream
is defined in the try()
clause which means it will be automatically released at the end of the block. What's wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BinaryInputArchive.getArchive(is)
creates a new DataInputStream based on the inputStream. The try()
only closes the inputStream, the DataInputStream is technically never closed. I see this as a non-issue and this is how SnapshotFormatter is written too, so it should be fine like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because it's a standalone tool. You can ignore that.
# use POSIX interface, symlink is followed automatically | ||
ZOOBIN="${BASH_SOURCE-$0}" | ||
ZOOBIN="$(dirname "${ZOOBIN}")" | ||
ZOOBINDIR="$(cd "${ZOOBIN}"; pwd)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SC2164: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can ignore this.
if [ -e "$ZOOBIN/../libexec/zkEnv.sh" ]; then | ||
. "$ZOOBINDIR"/../libexec/zkEnv.sh | ||
else | ||
. "$ZOOBINDIR"/zkEnv.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SC1091: Not following: ./zkEnv.sh was not specified as input (see shellcheck -x).
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
ZOOBINDIR="$(cd "${ZOOBIN}"; pwd)" | ||
|
||
if [ -e "$ZOOBIN/../libexec/zkEnv.sh" ]; then | ||
. "$ZOOBINDIR"/../libexec/zkEnv.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SC1091: Not following: ./../libexec/zkEnv.sh was not specified as input (see shellcheck -x).
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you could try adding:
# shellcheck source=somefile
Based on: https://github.com/koalaman/shellcheck/wiki/SC1091
Otherwise feel free to ignore this one too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anmolnar I'm not sure adding shellcheck is helpful here. The libexec
folder does not exist at this point so we can not verify if it has the file. I could add # shellcheck source=/dev/null
but that basically circumvents the shellcheck and at that point I would prefer not to touch the script at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean
# shellcheck source=/bin/zkEnv.sh
Don't waste too much time on this, feel free to ignore the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, adding it.
. "$ZOOBINDIR"/zkEnv.sh | ||
fi | ||
|
||
"$JAVA" -cp "$CLASSPATH" $JVMFLAGS \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SC2086: Double quote to prevent globbing and word splitting.
Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.
When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sonatype-lift ignoreall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ignoreall command is active on this PR, all the existing Lift issues are ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not adding double-quotes to $JVMFLAGS
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you @BukrosSzabolcs for updating the PR!
Generally it looks great now.
Please fix the spotbugs error reported by the CI, and then I think we are good:
[ERROR] Boxing/unboxing to parse a primitive org.apache.zookeeper.server.SnapshotRecursiveSummary.main(String[]) [org.apache.zookeeper.server.SnapshotRecursiveSummary] At SnapshotRecursiveSummary.java:[line 57] DM_BOXED_PRIMITIVE_FOR_PARSING
(p.s. you can run checkstyle / spotbugs locally: mvn verify spotbugs:check checkstyle:check -DskipTests
)
@symat I have fixed the spotbug issues. The test failure looks unrelated. Should we re-run it? |
I don't think we need to re-run. Seems to be a flaky test (and totally unrelated). @eolivelli , @anmolnar - can you take a look? This is a handy tool Szabolcs created to identify "heavy ZK users" in snapshots. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm.
Please try to fix sonatype issues.
DataTree dataTree = new DataTree(); | ||
Map<Long,Integer> sessions = new HashMap<Long,Integer>(); | ||
|
||
fileSnap.deserialize(dataTree, sessions, ia); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this warning message: the inputStream
is defined in the try()
clause which means it will be automatically released at the end of the block. What's wrong?
# use POSIX interface, symlink is followed automatically | ||
ZOOBIN="${BASH_SOURCE-$0}" | ||
ZOOBIN="$(dirname "${ZOOBIN}")" | ||
ZOOBINDIR="$(cd "${ZOOBIN}"; pwd)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can ignore this.
ZOOBINDIR="$(cd "${ZOOBIN}"; pwd)" | ||
|
||
if [ -e "$ZOOBIN/../libexec/zkEnv.sh" ]; then | ||
. "$ZOOBINDIR"/../libexec/zkEnv.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you could try adding:
# shellcheck source=somefile
Based on: https://github.com/koalaman/shellcheck/wiki/SC1091
Otherwise feel free to ignore this one too.
if [ -e "$ZOOBIN/../libexec/zkEnv.sh" ]; then | ||
. "$ZOOBINDIR"/../libexec/zkEnv.sh | ||
else | ||
. "$ZOOBINDIR"/zkEnv.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
. "$ZOOBINDIR"/zkEnv.sh | ||
fi | ||
|
||
"$JAVA" -cp "$CLASSPATH" $JVMFLAGS \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not adding double-quotes to $JVMFLAGS
instead?
requested sonatype fixes
I'm merging it now to master. Thanks @BukrosSzabolcs for the contribution!! |
add a tool to recursively collect and display child count and data stored in sub-trees Author: Szabolcs Bukros <szabolcs@cloudera.com> Reviewers: Andor Molnar <andor@apache.org>, Mate Szalay-Beko <symat@apache.org> Closes apache#1902 from BukrosSzabolcs/ZOOKEEPER-4566
add a tool to recursively collect and display child count and data
stored in sub-trees