@uce uce commented Nov 29, 2016

I would like to include the following fixes in the 1.1.4 release. Overall, they aim to improve the debugging experience.

  1. Reduce log pollution: Some debug logs were very noisy and dominated the logs although they provided little value.
    • Log heartbeats on TRACE level
    • Don't log InputChannelDeploymentDescriptor
    • Decrease HadoopFileSystem logging
  2. Improve log messages: Some existing debug log messages were not that helpful.
    • Log GlobalConfiguration loaded properties on INFO level
    • Add TaskState toString
    • Add more detailed log messages to HA job graph store
  3. Improve existing logger configuration templates: The existing template simply configured the appenders and left everything except the root logger to the user. I changed it to be more fine-grained (root logger, Flink, common libs/connectors). The goal is that users trying to DEBUG Flink don't end up with too many unrelated log messages.
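
A fine-grained template along the lines of item 3 might look roughly like the following log4j 1.x sketch. The three library loggers match the lines quoted in the review below; the `org.apache.flink` logger, the appender name `file`, the `${log.file}` property, and the conversion pattern are assumptions for illustration, not necessarily what this PR ships.

```properties
# Root logger: default level and appender for everything not configured below.
log4j.rootLogger=INFO, file

# Flink itself (assumed logger name): set to DEBUG to debug Flink
# without also raising the level of the libraries below.
log4j.logger.org.apache.flink=INFO

# Common libraries and connectors: change the log levels here.
log4j.logger.akka=INFO
log4j.logger.org.apache.kafka=INFO
log4j.logger.org.apache.hadoop=INFO

# File appender (name, file property, and pattern are illustrative).
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.append=false
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
```

With a layout like this, switching only `log4j.logger.org.apache.flink` to DEBUG leaves Akka, Kafka, and Hadoop at INFO.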

/cc @tillrohrmann

@tillrohrmann tillrohrmann left a comment

Changes look good to me @uce. I had only some minor comments. After addressing them +1 for merging.

# change the log levels here.
log4j.logger.akka=INFO
log4j.logger.org.apache.kafka=INFO
log4j.logger.org.apache.hadoop=INFO

Contributor

Does this also work for the shaded hadoop dependencies?

Contributor Author

No, at runtime they have a different package, so we need to change this here. Good catch, I think ;) @rmetzger, can you confirm?

Contributor

@rmetzger rmetzger Nov 30, 2016

We are not relocating Hadoop, so this configuration line is correct.
As you can see from our logs (the first line usually):
org.apache.hadoop.util.NativeCodeLoader
We are relocating some Hadoop dependencies (like Guava) and rewriting the Hadoop code to call them, but we are not relocating the Hadoop code itself.

Contributor Author

Ah, now I remember ;) This is indeed shady business. But good to know that we can keep it as is.

long stateSize = 0L;
for (SubtaskState subtaskState : subtaskStates.values()) {
    stateSize += subtaskState.getStateSize();
}

Contributor

Can't we use the getStateSize method here?

Contributor Author

Yes
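
The suggestion amounts to calling the existing accessor instead of re-summing the subtask sizes in place. A minimal sketch, assuming the loop sits in the new `toString`; `TaskStateSketch` and its nested `SubtaskState` and `TaskState` are simplified, hypothetical stand-ins and not Flink's actual classes:

```java
import java.util.HashMap;
import java.util.Map;

public class TaskStateSketch {

    /** Hypothetical stand-in for a per-subtask state handle. */
    static final class SubtaskState {
        private final long stateSize;

        SubtaskState(long stateSize) {
            this.stateSize = stateSize;
        }

        long getStateSize() {
            return stateSize;
        }
    }

    /** Hypothetical stand-in for the task-level state container. */
    static final class TaskState {
        private final Map<Integer, SubtaskState> subtaskStates = new HashMap<>();

        void putState(int subtaskIndex, SubtaskState state) {
            subtaskStates.put(subtaskIndex, state);
        }

        /** Single place where the subtask sizes are summed. */
        long getStateSize() {
            long result = 0L;
            for (SubtaskState subtaskState : subtaskStates.values()) {
                result += subtaskState.getStateSize();
            }
            return result;
        }

        /** Reuses getStateSize() rather than duplicating the loop. */
        @Override
        public String toString() {
            return "TaskState(subtasks: " + subtaskStates.size()
                    + ", total size: " + getStateSize() + " bytes)";
        }
    }

    public static void main(String[] args) {
        TaskState state = new TaskState();
        state.putState(0, new SubtaskState(10L));
        state.putState(1, new SubtaskState(32L));
        System.out.println(state);
    }
}
```

Keeping the summation in one method means the log output and any other caller of `getStateSize()` cannot drift apart.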

        LOG.debug("Adding " + possibleHadoopConfPath + "/core-site.xml to hadoop configuration");
    }
} else {
    LOG.debug("File " + possibleHadoopConfPath + "/core-site.xml not found.");

Contributor

Use {} placeholders for variables.

Contributor Author

Sure

        LOG.debug("Adding " + possibleHadoopConfPath + "/hdfs-site.xml to hadoop configuration");
    }
} else {
    LOG.debug("File " + possibleHadoopConfPath + "/hdfs-site.xml not found.");

Contributor

Same here with {}.

Contributor Author

Sure
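
The point of the placeholder style is that the message is only assembled when the level is enabled, whereas string concatenation always runs. Assuming `LOG` is an SLF4J logger, the requested form would be `LOG.debug("File {}/core-site.xml not found.", possibleHadoopConfPath)`. Because SLF4J is not available in a standalone snippet, the sketch below swaps in `java.util.logging`, whose placeholder is `{0}` rather than `{}`; the class name and the `render` helper are hypothetical and exist only to show the resulting text.

```java
import java.text.MessageFormat;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ParameterizedLoggingSketch {

    private static final Logger LOG =
            Logger.getLogger(ParameterizedLoggingSketch.class.getName());

    // java.util.logging uses MessageFormat-style placeholders ({0}, {1}, ...).
    static final String NOT_FOUND_PATTERN = "File {0}/core-site.xml not found.";

    /** Logs with a placeholder; no message string is built if FINE is disabled. */
    static void logMissing(String possibleHadoopConfPath) {
        LOG.log(Level.FINE, NOT_FOUND_PATTERN, possibleHadoopConfPath);
    }

    /** Hypothetical helper: renders the pattern the way a log formatter would. */
    static String render(String possibleHadoopConfPath) {
        return MessageFormat.format(NOT_FOUND_PATTERN, possibleHadoopConfPath);
    }

    public static void main(String[] args) {
        // FINE is below the default INFO level, so this call is cheap and prints nothing.
        logMissing("/etc/hadoop/conf");
        // Show what the message would look like once formatted.
        System.out.println(render("/etc/hadoop/conf"));
    }
}
```

The same reasoning applies to the hdfs-site.xml messages.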

@tillrohrmann
Contributor

We should also log the received and handled messages in FlinkUntypedActor#onReceive on TRACE level.

@uce uce commented Nov 30, 2016

Added FlinkUntypedActor as well and addressed the comments, Till. I would like to merge this now.

asfgit pushed a commit that referenced this pull request Nov 30, 2016
@uce uce closed this Nov 30, 2016
uce added a commit to uce/flink that referenced this pull request Nov 30, 2016
asfgit pushed a commit that referenced this pull request Dec 1, 2016
liuyuzhong pushed a commit to liuyuzhong/flink that referenced this pull request Dec 2, 2016
liuyuzhong pushed a commit to liuyuzhong/flink that referenced this pull request Dec 5, 2016
static-max pushed a commit to static-max/flink that referenced this pull request Dec 13, 2016
skidder pushed a commit to muxinc/flink that referenced this pull request Dec 27, 2016
joseprupi pushed a commit to joseprupi/flink that referenced this pull request Feb 12, 2017
@uce uce deleted the logging branch February 16, 2017 09:26