8266013: Unexpected replacement character handling on stateful CharsetEncoder #3719

Closed
wants to merge 2 commits into from

Conversation


@takiguc takiguc commented Apr 27, 2021

When the getBytes() method encodes an invalid character, the character is converted to the encoder's replacement bytes.
With the EBCDIC Mix charsets, the shift codes (SO/SI) may not be inserted in the right place.
The EBCDIC Mix charset encoder is a stateful encoder, so a shift code must be emitted whenever the character set switches.
On x-IBM1364, "\u3000\uD800" should be converted to "\x0E\x40\x40\x0F\x6F", but the actual result is "\x0E\x40\x40\x6F\x0F".
SI is not in the right place: it comes after the replacement byte instead of before it.
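
A minimal sketch for reproducing the case above, assuming the runtime ships the x-IBM1364 charset (it lives in the jdk.charsets module and may be missing from trimmed images). The class name `ShiftCodeRepro` and the helper `toHex` are made up for illustration; the expected bytes are the ones quoted in the description, and the actual bytes depend on the JDK build.

```java
import java.nio.charset.Charset;

public class ShiftCodeRepro {
    // Format bytes the same way the description does, e.g. \x0E\x40.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "x-IBM1364";
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime");
            return;
        }
        // U+3000 is a double-byte character; U+D800 is a lone surrogate,
        // so getBytes() substitutes the encoder's replacement bytes for it.
        byte[] actual = "\u3000\uD800".getBytes(Charset.forName(name));
        byte[] expected = {0x0E, 0x40, 0x40, 0x0F, 0x6F};
        System.out.println("expected: " + toHex(expected));
        System.out.println("actual:   " + toHex(actual));
    }
}
```

If the two lines differ only in the position of \x0F, the runtime shows the behavior reported here.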

The ISO2022-related charsets also use escape sequences to switch character sets,
and they have the same kind of issue.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8266013: Unexpected replacement character handling on stateful CharsetEncoder

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3719/head:pull/3719
$ git checkout pull/3719

Update a local copy of the PR:
$ git checkout pull/3719
$ git pull https://git.openjdk.java.net/jdk pull/3719/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3719

View PR using the GUI difftool:
$ git pr show -t 3719

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3719.diff


@bridgekeeper bridgekeeper bot commented Apr 27, 2021

👋 Welcome back itakiguchi! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Apr 27, 2021

@openjdk openjdk bot commented Apr 27, 2021

@takiguc The following labels will be automatically applied to this pull request:

  • i18n
  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.


@mlbridge mlbridge bot commented Apr 27, 2021

Webrevs

Author

@takiguc takiguc commented Apr 27, 2021

/label remove nio

@openjdk openjdk bot removed the nio label Apr 27, 2021

@openjdk openjdk bot commented Apr 27, 2021

@takiguc
The nio label was successfully removed.

Author

@takiguc takiguc commented Apr 27, 2021

/label add core-libs

@openjdk openjdk bot added the core-libs label Apr 27, 2021

@openjdk openjdk bot commented Apr 27, 2021

@takiguc
The core-libs label was successfully added.

Author

@takiguc takiguc commented Apr 29, 2021

/label remove rfr


@openjdk openjdk bot commented Apr 29, 2021

@takiguc The label rfr is not a valid label. These labels are valid:

  • serviceability
  • hotspot
  • sound
  • hotspot-compiler
  • kulla
  • i18n
  • shenandoah
  • jdk
  • javadoc
  • 2d
  • security
  • swing
  • hotspot-runtime
  • jmx
  • build
  • nio
  • beans
  • core-libs
  • compiler
  • net
  • hotspot-gc
  • hotspot-jfr
  • awt

Author

@takiguc takiguc commented May 9, 2021

Gentle reminder.

Currently, stateful CharsetEncoders (such as the EBCDIC Mix and ISO2022-related ones) cannot handle replacement characters correctly.
Please give me your suggestions or advice.
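
For anyone who wants to look at this through the CharsetEncoder API directly rather than through getBytes(), here is a rough sketch. `ReplaceProbe` and `encodeWithReplace` are hypothetical names; the charset names in `main` are the ones discussed in this PR, and whether they are present depends on the runtime. No expected output is given for them because it depends on the JDK build.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

public class ReplaceProbe {
    // Encode text with malformed and unmappable input replaced,
    // which is the same policy String.getBytes() applies.
    static byte[] encodeWithReplace(Charset cs, String text) {
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            // encode(CharBuffer) runs a complete operation: reset, encode, flush.
            ByteBuffer out = enc.encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE, but the API declares it.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"x-IBM1364", "ISO-2022-JP"}) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this runtime");
                continue;
            }
            byte[] bytes = encodeWithReplace(Charset.forName(name), "\u3000\uD800");
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("\\x%02X", b & 0xFF));
            }
            System.out.println(name + ": " + hex);
        }
    }
}
```

In the output, the thing to check is where the replacement bytes sit relative to the shift code or escape sequence that returns the encoder to its initial state.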

@@ -611,9 +628,35 @@ public abstract class Charset$Coder$ {
return cr;

if (action == CodingErrorAction.REPLACE) {
#if[encoder]
if (maxBytesPerChar > 3.0) {
Member

@naotoj naotoj May 11, 2021


Does this check imply that the code is for stateful encoders? Since the fix is for incorrect SO/SI handling, shouldn't it be localized in the EBCDIC/ISO2022 encoders rather than in the generic Charset-X-Coder?
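
One way to explore the question above is to print maxBytesPerChar() for a few encoders and see which ones exceed 3.0. This is only a probe, not part of the patch; `MaxBytesProbe` is a made-up name, and the list of charsets is an arbitrary choice. Since maxBytesPerChar() describes worst-case output size in general, it seems worth checking whether any encoder without shift state also crosses the threshold.

```java
import java.nio.charset.Charset;

public class MaxBytesProbe {
    // Worst-case number of bytes one input char can produce for this charset.
    static float maxBytesPerChar(String charsetName) {
        return Charset.forName(charsetName).newEncoder().maxBytesPerChar();
    }

    public static void main(String[] args) {
        String[] names = {"UTF-8", "UTF-16", "x-IBM1364", "ISO-2022-JP"};
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this runtime");
                continue;
            }
            float max = maxBytesPerChar(name);
            // 3.0 is the threshold used by the check in the hunk above.
            System.out.println(name + ": maxBytesPerChar=" + max
                    + (max > 3.0f ? " (would take the new branch)" : ""));
        }
    }
}
```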


@bridgekeeper bridgekeeper bot commented Jun 8, 2021

@takiguc This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!


@bridgekeeper bridgekeeper bot commented Jul 6, 2021

@takiguc This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Jul 6, 2021
2 participants