Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8276970: Default charset for PrintWriter that wraps PrintStream #6401

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
@@ -99,15 +99,17 @@ public OutputStreamWriter(OutputStream out, String charsetName)
}

/**
* Creates an OutputStreamWriter that uses the default character encoding.
* Creates an OutputStreamWriter that uses the default character encoding, or
* where {@code out} is a {@code PrintStream}, the charset used by the print
* stream.
*
* @param out An OutputStream
* @see Charset#defaultCharset()
*/
public OutputStreamWriter(OutputStream out) {
super(out);
se = StreamEncoder.forOutputStreamWriter(out, this,
Charset.defaultCharset());
out instanceof PrintStream ps ? ps.charset() : Charset.defaultCharset());
}

/**
@@ -68,6 +68,7 @@ public class PrintStream extends FilterOutputStream
private final boolean autoFlush;
private boolean trouble = false;
private Formatter formatter;
private Charset charset;
Copy link
Member

@jaikiran jaikiran Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Naoto, should this be formally marked as final?

Copy link
Member Author

@naotoj naotoj Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I will make it final.


/**
* Track both the text- and character-output streams, so that their buffers
@@ -104,11 +105,22 @@ private static Charset toCharset(String csn)
}
}

/**
* Returns the charset used in this PrintStream instance. Called from
* OutputStreamWriter and PrintWriter via the package-private access.
*
* @return the charset
*/
Charset charset() {
return charset;
}
Copy link
Contributor

@AlanBateman AlanBateman Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. You could use {@return the charset used ...} to avoid repeating the message. Also might be better to move the method to after the constructors so that it's with the other instance methods.
The update method descriptions in PS, PW, and OutputStreamWriter look good.
So overall I think we've got to a good place. Wrapping a PS with PW and not inheriting the charset is an potential accident that goes back 20+ years.

Copy link
Member Author

@naotoj naotoj Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Alan. Modified as suggested.


/* Private constructors */
private PrintStream(boolean autoFlush, OutputStream out) {
super(out);
this.autoFlush = autoFlush;
this.charOut = new OutputStreamWriter(this);
this.charset = out instanceof PrintStream ps ? ps.charset() : Charset.defaultCharset();
this.charOut = new OutputStreamWriter(this, charset);
this.textOut = new BufferedWriter(charOut);
}

@@ -124,7 +136,8 @@ private PrintStream(boolean autoFlush, Charset charset, OutputStream out) {
/**
* Creates a new print stream, without automatic line flushing, with the
* specified OutputStream. Characters written to the stream are converted
* to bytes using the default charset.
* to bytes using the charset in {@code out} if it is a {@code PrintStream},
* or using the default charset.
*
* @param out The output stream to which values and objects will be
* printed
@@ -139,6 +152,7 @@ public PrintStream(OutputStream out) {
/**
* Creates a new print stream, with the specified OutputStream and line
* flushing. Characters written to the stream are converted to bytes using
* the charset in {@code out} if it is a {@code PrintStream}, or using
* the default charset.
*
* @param out The output stream to which values and objects will be
@@ -201,6 +215,7 @@ public PrintStream(OutputStream out, boolean autoFlush, Charset charset) {
this.autoFlush = autoFlush;
this.charOut = new OutputStreamWriter(this, charset);
this.textOut = new BufferedWriter(charOut);
this.charset = charset;
}

/**
@@ -118,7 +118,8 @@ public PrintWriter(Writer out,
* Creates a new PrintWriter, without automatic line flushing, from an
* existing OutputStream. This convenience constructor creates the
* necessary intermediate OutputStreamWriter, which will convert characters
* into bytes using the default charset.
* into bytes using the charset in {@code out} if it is a {@code PrintStream},
* or using the default charset.
*
* @param out An output stream
*
@@ -132,8 +133,9 @@ public PrintWriter(OutputStream out) {
/**
* Creates a new PrintWriter from an existing OutputStream. This
* convenience constructor creates the necessary intermediate
* OutputStreamWriter, which will convert characters into bytes using the
* default charset.
* OutputStreamWriter, which will convert characters into bytes using
* the charset in {@code out} if it is a {@code PrintStream}, or using
* the default charset.
*
Copy link
Contributor

@AlanBateman AlanBateman Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer the wording in OutputStreamWriter because it puts the default encoding first and makes it just a bit clearer that the PS case is the exception.

Copy link
Member Author

@naotoj naotoj Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will use OutputStreamWriter's wording here. Also I am tempted to make PrintStream::charset() public, as some custom OutputStreamWriter implementations would also need the charset information.

Copy link
Contributor

@AlanBateman AlanBateman Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I am tempted to make PrintStream::charset() public, as some custom OutputStreamWriter implementations would also need the charset information.

I think that would be good addition.

* @param out An output stream
* @param autoFlush A boolean; if true, the {@code println},
@@ -144,7 +146,7 @@ public PrintWriter(OutputStream out) {
* @see Charset#defaultCharset()
*/
public PrintWriter(OutputStream out, boolean autoFlush) {
this(out, autoFlush, Charset.defaultCharset());
this(out, autoFlush, out instanceof PrintStream ps ? ps.charset() : Charset.defaultCharset());
}

/**
@@ -0,0 +1,102 @@
/*
* Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/

import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;
import static org.testng.Assert.assertEquals;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/**
* @test
* @bug 8276970
* @summary Test to verify the charset in PrintStream is inherited
* in the OutputStreamWriter/PrintWriter
* @run testng InheritEncodingTest
*/
@Test
public class InheritEncodingTest {

private static final String testString = "\u00e9\u3042"; // "éあ"

@DataProvider
public Object[][] encodings() {
return new Object[][]{
{StandardCharsets.ISO_8859_1},
{StandardCharsets.US_ASCII},
{StandardCharsets.UTF_8},
{StandardCharsets.UTF_16},
{StandardCharsets.UTF_16BE},
{StandardCharsets.UTF_16LE},
};
}

@Test (dataProvider = "encodings")
public void testOutputStreamWriter(Charset stdCharset) throws IOException {
var ba = new ByteArrayOutputStream();
var ps = new PrintStream(ba, true, stdCharset);
var expected = new String(testString.getBytes(stdCharset), stdCharset);

// tests OutputStreamWriter's encoding explicitly
var osw = new OutputStreamWriter(ps);
assertEquals(Charset.forName(osw.getEncoding()), stdCharset);

// tests roundtrip result
osw.write(testString);
osw.flush();
var result = ba.toString(stdCharset);
assertEquals(result, expected);
}

@Test (dataProvider = "encodings")
public void testPrintWriter(Charset stdCharset) throws IOException {
// tests roundtrip result
var ba = new ByteArrayOutputStream();
var ps = new PrintStream(ba, true, stdCharset);
var expected = new String(testString.getBytes(stdCharset), stdCharset);
var pw = new PrintWriter(ps);
pw.write(testString);
pw.flush();
var result = ba.toString(stdCharset);
assertEquals(result, expected);
}

@Test (dataProvider = "encodings")
public void testPrintStream(Charset stdCharset) throws IOException {
// tests roundtrip result
var ba = new ByteArrayOutputStream();
var ps = new PrintStream(ba, true, stdCharset);
var expected = new String(testString.getBytes(stdCharset), stdCharset);
var psWrapper = new PrintStream(ps);
psWrapper.print(testString);
psWrapper.flush();
var result = ba.toString(stdCharset);
assertEquals(result, expected);
}
}