Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8271186: Add UL option to replace newline char #4885

Closed
wants to merge 10 commits into from

Conversation

YaSuenag
Copy link
Member

@YaSuenag YaSuenag commented Jul 23, 2021

Most of UL entries would print log in one line, however some categories (e.g. exceptions) have multiline entries as following:

[0.157s][info][exceptions] Exception <a 'java/lang/NullPointerException'{0x000000008b918f70}: test>
 thrown in interpreter method <{method} {0x00007f8335000248} 'main' '([Ljava/lang/String;)V' in 'Test'>
 at bci 9 for thread 0x00007f8330017160 (main)

It is ease to parse with log shipper (Fluent Bit, Logstash, and more) if UL can print all logs in one line.
Famous log shippers support multiline logs of course, but its configuration tends to be complex, and also some input plugins (e.g. TCP on Fluent Bit) do not support multiline logs.

So I want to introduce new UL option foldmultilines to replace all of newline char (\n) in the log entry will be replaced to 2 chars \\n. In addition, all of backslashes (\\ in C) are also replaced to 2 chars \\\\.

After this patch, we can get following logs with foldmultilines=true:

[0.166s][info][exceptions] Exception <a 'java/lang/NullPointerException'{0x000000008b918f70}: test>\n thrown in interpreter method <{method} {0x00007fbc81000248} 'main' '([Ljava/lang/String;)V' in 'Test'>\n at bci 9 for thread 0x00007fbc9c0171a0 (main)

I've also filed CSR for this issue, please review it.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4885/head:pull/4885
$ git checkout pull/4885

Update a local copy of the PR:
$ git checkout pull/4885
$ git pull https://git.openjdk.java.net/jdk pull/4885/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4885

View PR using the GUI difftool:
$ git pr show -t 4885

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4885.diff

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Jul 23, 2021

/csr needed

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jul 23, 2021

👋 Welcome back ysuenaga! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the csr label Jul 23, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jul 23, 2021

@YaSuenag this pull request will not be integrated until the CSR request JDK-8271188 for issue JDK-8271186 has been approved.

@openjdk
Copy link

@openjdk openjdk bot commented Jul 23, 2021

@YaSuenag The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime label Jul 23, 2021
@YaSuenag YaSuenag marked this pull request as ready for review Jul 23, 2021
@openjdk openjdk bot added the rfr label Jul 23, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jul 23, 2021

@mseledts
Copy link
Member

@mseledts mseledts commented Jul 23, 2021

Could you, please, consider adding a test for this issue? Or updating an existing test?

@openjdk openjdk bot removed the csr label Aug 24, 2021
@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Aug 24, 2021

CSR has been approved, and I pushed new commit to add testcase and to update help message & manpage. Could you review?

Copy link
Member

@iklam iklam left a comment

I tested the latest version in our CI pipeline and all tests in tiers 1-2 passed.

*equals_pos = '=';
success = LogFileStreamOutput::initialize(pos, errstream);
*equals_pos = '\0';
if (!success) {
Copy link
Member

@iklam iklam Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code can be simplified by parsing the value here:

bool fold_multilines = ....;
LogFileStreamOutput::set_fold_multilines(foldmultilines);

Copy link
Member Author

@YaSuenag YaSuenag Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

foldmultilines is a parameter which should be applied in initialization. LogOutput and extended classes of it look like to have a rule to apply initialization parameter at initialize().
We can simplify the code as you said if we can apply foldmultiline for LogFileStreamOutput at LogFileOutput::initialize(), but I concern it breaks the semantics for UL implementation.

Copy link
Member

@iklam iklam Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, your concern is _fold_multilines is a property of LogFileStreamOutput, so the parsing of this option should be done inside the LogFileStreamOutput class. That way, (in the future) all subclasses of LogFileStreamOutput will be able to handle foldmultiline=true in their options by deferring to LogFileStreamOutput::initialize.

I suppose this would be useful if we have support for sockets in the future, like

-Xlog:all=socket::address=10.1.1.1,port=1234,foldmultiline=true

However, I don't think your current implementation has the right abstraction -- FoldMultilinesOptionKey is now checked both in LogFileStreamOutput::initialize() and LogFileOutput::initialize(). If we add a new LogSocketOutput in the future, we would have to duplicated the code in LogSocketOutput::initialize() as well.

I think we should defer the design of such an abstraction until we need it. Ideally, LogFileOutput and LogSocketOutput should not need to know what options are supported by their superclass, LogFileStreamOutput

For the time being, it's easier to parse all the file output options in LogFileOutput::initialize() to keep the code simple.

Copy link
Member Author

@YaSuenag YaSuenag Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think your current implementation has the right abstraction

Agree, I think we can (should) improve more. Its improvement is not for LogSocketStream but also for LogStd{out,err}Output.
As we discussed in CSR about stdout/err, we will work for them in next step. But they extend LogFileStreamOutput directly. So I want to refactor it before next step.

Anyway I moved foldmultiline handling into LogFileOutput in new commit.

" with the character sequence."
" Escape newline (\\n) and backslash (\\) characters in the UL output"
" if it is set to true."
" Note that it works on file output only.");
Copy link
Member

@iklam iklam Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this message?

out->print_cr("   foldmultilines=.. - If set to true, a log event that consists of multiple lines"
                                       " will be folded into a single line by escaping"
                                       " the newline (\\n) and backslash (\\) characters"
                                       " in the output.")

I think there's no need to say "it works on file output only" because this is under the heading of "Additional output-options for file outputs".

Copy link
Member

@dholmes-ora dholmes-ora Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not "escaping" newline characters here. We went through great pains in the CSR request to describe the actual process accurately. We are replacing newlines with the sequence \ and n. We also have to replace \ with the sequence \\ so that the conversion can be reversed by a parser. Saying all this in the help output is awkward but necessary. We also need to warn that this may not be safe with character encodings other than UTF-8. So something like:

If set to true, a log event that consists of multiple lines will be folded into a single line by replacing newline characters with the sequence '\' and 'n' in the output. Existing single backslash characters will also be replaced with a sequence of two backslashes so that the conversion can be reversed. This option is safe to use with UTF-8 character encodings, but other encodings may not work.

Copy link
Member

@iklam iklam Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this version is much better than mine :-)

.PP
\f[I]foldmultilines\f[R] enables to replace newline characters within
a multiline log event with the character sequence '\\' and 'n'.
Note that it works on file output only.
Copy link
Member

@iklam iklam Aug 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

When foldmultilines is true, a log event that consists of multiple lines will be folded into a single line by escaping the newline (\n) and backslash (\) characters in the output. This option is available only for file outputs.

Copy link
Member

@dholmes-ora dholmes-ora Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be expanded similar to the help info. But you can elaborate on the character encoding limitation as per the CSR request.

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Aug 25, 2021

Thanks @iklam and @dholmes-ora ! I was relieved to hear my PR passed your CI pipeline.

I updated help message and manpage. I will update foldmultilines handling after the discussion if it needs.

iklam
iklam approved these changes Aug 25, 2021
Copy link
Member

@iklam iklam left a comment

LGTM. Just a small nit in the man page.

.RE
.PP
When \f[I]foldmultilines\f[R] is true, a log event that consists of multiple lines will be folded into a single line by replacing newline characters with the sequence '\\' and 'n' in the output. Existing single backslash characters will also be replaced with a sequence of two backslashes so that the conversion can be reversed. This option is safe to use with UTF-8 character encodings, but other encodings may not work. For example, it may happen inadvertently conversion in multi-byte sequences in Shift JIS and BIG5.
This option is available only for file outputs.
Copy link
Member

@iklam iklam Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about: "For example, it may incorrectly convert multi-byte sequences in Shift JIS and BIG5."

Copy link
Member Author

@YaSuenag YaSuenag Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ioi! I updated manpage, and also I formatted the description - it was too long than other lines!

@openjdk
Copy link

@openjdk openjdk bot commented Aug 25, 2021

@YaSuenag This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8271186: Add UL option to replace newline char

Reviewed-by: iklam, dholmes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 25 new commits pushed to the master branch:

  • a3308af: 8272836: Limit run time for java/lang/invoke/LFCaching tests
  • c4c76e2: 8272811: Document the effects of building with _GNU_SOURCE in os_posix.hpp
  • 673ce7e: 8272873: C2: Inlining should not depend on absolute call site counts
  • 7212561: 8267188: gc/stringdedup/TestStringDeduplicationInterned.java fails with Shenandoah
  • e36cbd8: 8242847: G1 should not clear mark bitmaps with no marks
  • 2ef6871: 8272447: Remove 'native' ranked Mutex
  • 63e062f: 8236176: Parallel GC SplitInfo comment should be updated for shadow regions
  • c5a2712: 8272850: Drop zapping values in the Zap* option descriptions
  • 1e3e333: 8272884: Make VoidClosure::do_void pure virtual
  • 0f428ca: 8272570: C2: crash in PhaseCFG::global_code_motion
  • ... and 15 more: https://git.openjdk.java.net/jdk/compare/22ef4f065315c1238216849ce9ce71b8207b43f8...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Aug 25, 2021
Copy link
Member

@dholmes-ora dholmes-ora left a comment

Seems okay now.

Thanks,
David

@YaSuenag
Copy link
Member Author

@YaSuenag YaSuenag commented Aug 27, 2021

/integrate

@openjdk
Copy link

@openjdk openjdk bot commented Aug 27, 2021

Going to push as commit b16a04e.
Since your change was applied there have been 34 commits pushed to the master branch:

  • d732c30: 8272863: Replace usages of Collections.sort with List.sort call in public java modules
  • fe7d708: 8272473: Parsing epoch seconds at a DST transition with a non-UTC parser is wrong
  • 845e1ce: 8272983: G1 Add marking details to eager reclaim logging
  • c420530: 8272481: [macos] javax/swing/JFrame/NSTexturedJFrame/NSTexturedJFrame.java fails
  • e43a907: 8271315: Redo: Nimbus JTree renderer properties persist across L&F changes
  • 11c9fd8: 8272975: ParallelGC: add documentation to heap memory layout
  • b94fd32: 8272859: Javadoc external links should only have feature version number in URL
  • 9166ba3: 8272973: Incorrect compile command used by TestIllegalArrayCopyBeforeInfiniteLoop
  • 49b2789: 8262751: RenderPipelineState assertion error in J2DDemo
  • a3308af: 8272836: Limit run time for java/lang/invoke/LFCaching tests
  • ... and 24 more: https://git.openjdk.java.net/jdk/compare/22ef4f065315c1238216849ce9ce71b8207b43f8...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Aug 27, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Aug 27, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Aug 27, 2021

@YaSuenag Pushed as commit b16a04e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@YaSuenag YaSuenag deleted the JDK-8271186 branch Aug 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime integrated
4 participants