Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHOENIX-6388 Add sampled logging for read repairs #1146

Merged
merged 1 commit into from
Feb 23, 2021

Conversation

yanxinyi
Copy link
Contributor

No description provided.

@stoty
Copy link
Contributor

stoty commented Feb 19, 2021

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 46s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ 4.x Compile Tests _
+1 💚 mvninstall 16m 11s 4.x passed
+1 💚 compile 1m 17s 4.x passed
+1 💚 checkstyle 0m 37s 4.x passed
+1 💚 javadoc 1m 1s 4.x passed
+0 🆗 spotbugs 3m 45s phoenix-core in 4.x has 945 extant spotbugs warnings.
_ Patch Compile Tests _
+1 💚 mvninstall 8m 2s the patch passed
+1 💚 compile 1m 21s the patch passed
+1 💚 javac 1m 21s the patch passed
-1 ❌ checkstyle 0m 40s phoenix-core: The patch generated 12 new + 210 unchanged - 0 fixed = 222 total (was 210)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 0m 56s the patch passed
+1 💚 spotbugs 4m 15s phoenix-core generated 0 new + 944 unchanged - 1 fixed = 944 total (was 945)
_ Other Tests _
+1 💚 unit 212m 21s phoenix-core in the patch passed.
+1 💚 asflicense 0m 11s The patch does not generate ASF License warnings.
253m 19s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1146
Optional Tests dupname asflicense javac javadoc unit spotbugs hbaseanti checkstyle compile
uname Linux 12fdb91ca90f 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev/phoenix-personality.sh
git revision 4.x / ec386e9
Default Java Private Build-1.8.0_242-8u242-b08-0ubuntu3~16.04-b08
checkstyle https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/1/artifact/yetus-general-check/output/diff-checkstyle-phoenix-core.txt
Test Results https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/1/testReport/
Max. process+thread count 4839 (vs. ulimit of 30000)
modules C: phoenix-core U: phoenix-core
Console output https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/1/console
versions git=2.7.4 maven=3.3.9 spotbugs=4.1.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

if (shouldLog()) {
LOG.info("Index row repair on region "
+ env.getRegionInfo().getRegionNameAsString() + " took "
+ repairTime + "ms");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use {} substitution for parameters ?

@stoty
Copy link
Contributor

stoty commented Feb 20, 2021

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ 4.x Compile Tests _
+1 💚 mvninstall 16m 58s 4.x passed
+1 💚 compile 1m 13s 4.x passed
+1 💚 checkstyle 0m 32s 4.x passed
+1 💚 javadoc 0m 53s 4.x passed
+0 🆗 spotbugs 3m 21s phoenix-core in 4.x has 945 extant spotbugs warnings.
_ Patch Compile Tests _
+1 💚 mvninstall 6m 48s the patch passed
+1 💚 compile 1m 6s the patch passed
+1 💚 javac 1m 6s the patch passed
-1 ❌ checkstyle 0m 31s phoenix-core: The patch generated 14 new + 210 unchanged - 0 fixed = 224 total (was 210)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 0m 47s the patch passed
+1 💚 spotbugs 3m 27s phoenix-core generated 0 new + 944 unchanged - 1 fixed = 944 total (was 945)
_ Other Tests _
+1 💚 unit 215m 42s phoenix-core in the patch passed.
+1 💚 asflicense 0m 12s The patch does not generate ASF License warnings.
254m 12s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/2/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1146
Optional Tests dupname asflicense javac javadoc unit spotbugs hbaseanti checkstyle compile
uname Linux 1ab8ca252d6f 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev/phoenix-personality.sh
git revision 4.x / ec386e9
Default Java Private Build-1.8.0_242-8u242-b08-0ubuntu3~16.04-b08
checkstyle https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/2/artifact/yetus-general-check/output/diff-checkstyle-phoenix-core.txt
Test Results https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/2/testReport/
Max. process+thread count 4694 (vs. ulimit of 30000)
modules C: phoenix-core U: phoenix-core
Console output https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/2/console
versions git=2.7.4 maven=3.3.9 spotbugs=4.1.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor

@virajjasani virajjasani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left few comments, looks good otherwise

@@ -101,6 +102,9 @@
*/
public class GlobalIndexChecker extends BaseScannerRegionObserver {
private static final Logger LOG = LoggerFactory.getLogger(GlobalIndexChecker.class);
private static final String REPAIR_LOGGING_PERCENT_ATTRIB = "phoenix.index.repair.logging.percent";
private static final double DEFAULT_REPAIR_LOGGING_PERCENT = 0.01;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per random.nextDouble() <= (loggingPercent / 100.0d), we will compare random.nextDouble() with 0.01/100 = 0.0001 which seems way less. If we can keep default value as 1 here, even 1/100 is going to be very less as far as random distribution between 0 and 1 is concerned. Unless we never want to log with default config value, i think we should change this to 1 (at least some better probability of logging read repairs than 0.01 as default value, which is almost never going to log)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, 1% makes sense.

Comment on lines 607 to 610
if (shouldLog()) {
LOG.info(String.format("Index row repair on region %s took %d ms.",
env.getRegionInfo().getRegionNameAsString(), repairTime));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good one, however regardless of shouldLog(), we might want to consider logging at TRACE level always:

                    if (shouldLog()) {
                        LOG.info("Index row repair on region {} took {} ms.",
                                env.getRegionInfo().getRegionNameAsString(), repairTime);
                    }
                    if (LOG.isTraceEnabled()) {
                        LOG.trace("Index row repair on region {} took {} ms.",
                                env.getRegionInfo().getRegionNameAsString(), repairTime);
                    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: String.format() is not required, we can use slf4j log placeholder {}

                        LOG.info("Index row repair on region {} took {} ms.",
                                env.getRegionInfo().getRegionNameAsString(), repairTime);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on the placeholder, but I don't think addig the unconditional trace log is a good idea.
We should not use one set of conditions for INFO, and another for TRACE, and should not double log, as both conflicts with the usual/expected logging semantics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking if a user enables TRACE logging, this is a good log candidate for all read repairs (it might be overwhelming too but that's why unconditional TRACE might be more suitable) and whereas % based logs at INFO level won't make it overwhelming which is the purpose of this PR anyways. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On one hand, logging every event with trace would make sense from an operational standpoint, as it can theoretically be enabled without restarting the RS (I THINK that LOG4j will pick logging config file changes during runtime)

On the other hand I feel that having a statistic logging at INFO level, but also logging everything at TRACE level would go against the 'do not surprise/confuse the user' principle. AFAIK we have other statistic logging events, which cover similar cases, but we don't override the percentage setting like this.

IMO REPAIR_LOGGING_PERCENT_ATTRIB sets the probability for logging the event at INFO, but we're logging every event at TRACE anyway is not intuitive behaviour, nor is logging the same event twice at different logging levels.

Unfiltered trace logging in production systems in unsustainable, and if someone wants to do targeted log index rebuilds, he can just use the percentage setting (which is easier to get right than logging config, even if it does require an RS restart to take effect)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nor is logging the same event twice at different logging levels.

Yeah got your point, sure I think it's fine, we don't need TRACE level logging, anyways we have some useful metrics available too e.g unverifiedIndexRowAge, indexRepairTime etc

Comment on lines 616 to 619
if (shouldLog()) {
LOG.warn(String.format("Index row repair failure on region %s took %d ms.",
env.getRegionInfo().getRegionNameAsString(), repairTime));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

@stoty
Copy link
Contributor

stoty commented Feb 23, 2021

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 5m 36s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ 4.x Compile Tests _
+1 💚 mvninstall 13m 51s 4.x passed
+1 💚 compile 0m 59s 4.x passed
+1 💚 checkstyle 0m 32s 4.x passed
+1 💚 javadoc 0m 43s 4.x passed
+0 🆗 spotbugs 3m 0s phoenix-core in 4.x has 945 extant spotbugs warnings.
_ Patch Compile Tests _
+1 💚 mvninstall 5m 40s the patch passed
+1 💚 compile 1m 2s the patch passed
+1 💚 javac 1m 2s the patch passed
-1 ❌ checkstyle 0m 31s phoenix-core: The patch generated 13 new + 210 unchanged - 0 fixed = 223 total (was 210)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 javadoc 0m 44s the patch passed
+1 💚 spotbugs 3m 8s phoenix-core generated 0 new + 944 unchanged - 1 fixed = 944 total (was 945)
_ Other Tests _
+1 💚 unit 127m 50s phoenix-core in the patch passed.
+1 💚 asflicense 0m 11s The patch does not generate ASF License warnings.
164m 44s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/3/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1146
Optional Tests dupname asflicense javac javadoc unit spotbugs hbaseanti checkstyle compile
uname Linux 776e7f7de528 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev/phoenix-personality.sh
git revision 4.x / 358ace8
Default Java Private Build-1.8.0_242-8u242-b08-0ubuntu3~16.04-b08
checkstyle https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/3/artifact/yetus-general-check/output/diff-checkstyle-phoenix-core.txt
Test Results https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/3/testReport/
Max. process+thread count 5720 (vs. ulimit of 30000)
modules C: phoenix-core U: phoenix-core
Console output https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/3/console
versions git=2.7.4 maven=3.3.9 spotbugs=4.1.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor

@virajjasani virajjasani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@yanxinyi yanxinyi merged commit a1cf54f into apache:4.x Feb 23, 2021
@stoty
Copy link
Contributor

stoty commented Feb 23, 2021

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 6m 6s Docker mode activated.
-1 ❌ patch 0m 6s #1146 does not apply to 4.x. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/in-progress/precommit-patchnames for help.
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/4/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1146
Console output https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1146/4/console
versions git=2.7.4
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants