Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ByteString introduced as AsciiString super class #3579

Closed
wants to merge 1 commit into from

Conversation

Scottmitch
Copy link
Member

Motivation:
The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots.

Modifications:

  • ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString.

Results:
The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.

@Scottmitch Scottmitch self-assigned this Apr 3, 2015
@Scottmitch Scottmitch added this to the 4.1.0.Beta5 milestone Apr 3, 2015
@Scottmitch
Copy link
Member Author

@trustin @nmittler @buchgr - FYI.

protected EmbeddedChannel newContentCompressor(AsciiString contentEncoding) throws Http2Exception {
if (GZIP.equalsIgnoreCase(contentEncoding) || X_GZIP.equalsIgnoreCase(contentEncoding)) {
protected EmbeddedChannel newContentCompressor(ByteString contentEncoding) throws Http2Exception {
if (GZIP.equals(contentEncoding) || X_GZIP.equals(contentEncoding)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this change. equalsIgnoreCase doesn't fit well when dealing with binary strings. You have to think about encodings and translations...blah. HTTP/2 requires that header names are lower case. So a byte wise comparison should be sufficient, unless the user switches off the "force to lower case default"...in which case they are explicitly taking this risk.

@Scottmitch
Copy link
Member Author

@netkins build

private static ByteStringProcessor VALIDATE_NAME_PROCESSOR = new ByteStringProcessor() {
@Override
public boolean process(byte value) throws Exception {
// Check to see if the character is not an ASCII character
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing . on EOL

* {@link #array()}. Care must be taken when directly accessing the memory as that may invalidate assumptions that
* this object is immutable.
*/
public class ByteString {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, did you write this from scratch or is it a copy from another project?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know it is based on the old AsciiString that was written by @trustin . He based it on code from Apache Harmony

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment, since you allow ByteString to be extended, would it make sense to mark many of the frequently called methods as final to help with optimization?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@normanmaurer +1. This is mostly the result of pulling all the encoding neutral stuff from AsciiString.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nmittler - Yip. I made a bunch of them final.

@nmittler
Copy link
Member

nmittler commented Apr 7, 2015

@Scottmitch looks great overall! Just a few questions/comments.

@Scottmitch
Copy link
Member Author

@nmittler - Thanks for review! I think I got all your comments...feel free to take a look.

@trustin
Copy link
Member

trustin commented Apr 8, 2015

Thanks @Scottmitch! Will review today.

@Scottmitch
Copy link
Member Author

@trustin - Sounds good. Ping me when you are finished.

@@ -706,83 +533,12 @@ public boolean equalsIgnoreCase(CharSequence string) {
final byte[] value = this.value;
final char[] buffer = new char[length];
for (int i = 0, j = start; i < length; i++, j++) {
buffer[i] = (char) (value[j] & 0xFF);
buffer[i] = (char) (value[j]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same problem. We should not drop & 0xFF.

@trustin
Copy link
Member

trustin commented Apr 10, 2015

Moving ByteString, AsciiString and ByteVisitor to netty-common (since it's not limited to writing a codec anymore)
@trustin - We may be creating a circular dependency here. ByteString exposes methods that take ByteBuf types...and so we would have to add netty-buffer as a dependency of netty-common but the netty-buffer package already depends upon netty-common. We may have to introduce an abstraction layer...or move some other stuff....any suggestions?

Those methods are just shortcut methods. We could move them to ByteBufUtil?

* Create a copy of the underlying storage from {@link value} into a byte array.
* The copy will start at {@link ByteBuffer#position()} and copy {@link ByteBuffer#remaining()} bytes.
*/
public static byte[] getBytes(ByteBuffer value) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason these getBytes[] methods should be public? They look unrelated to ByteString and useful only for internal purposes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess they don't have to be. The thought was incase you don't want to have a ByteString (which Ideally should be unmodifiable) object you could directly to byte[].

@trustin
Copy link
Member

trustin commented Apr 10, 2015

Review done. Please ping me for another round. Thank you for your patience!

@Scottmitch
Copy link
Member Author

Moving ByteString, AsciiString and ByteVisitor to netty-common (since it's not limited to writing a codec anymore)

@trustin - We may be creating a circular dependency here. ByteString exposes methods that take ByteBuf types...and so we would have to add netty-buffer as a dependency of netty-common but the netty-buffer package already depends upon netty-common. We may have to introduce an abstraction layer...or move some other stuff....any suggestions?

Those methods are just shortcut methods. We could move them to ByteBufUtil?

Good idea. I had to manually do some conversions which I was leaning on ByteBuf to do for me. Please review. I'm wondering if I should worry about manually releasing the ByteBuffer object I allocate to do the conversion...

@Scottmitch Scottmitch force-pushed the ascii_string_refactor branch 2 times, most recently from e9ec00b to 920189b Compare April 10, 2015 18:18
* Create a copy of the underlying storage from {@link value} into a byte array.
* The copy will start at {@link ByteBuffer#position()} and copy {@link ByteBuffer#remaining()} bytes.
*/
public static byte[] getBytes(ByteBuffer value) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trustin - Your comment was collapsed I think but the rational for having these public is if a user wanted to directly get back a modifiable byte[]. The "desired" use case for ByteString is to be "unmodifiable". I can make private though...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make them private and consider moving them somewhere else and making them public in a later discussion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Am 14.04.2015 um 16:43 schrieb Scott Mitchell notifications@github.com:

In common/src/main/java/io/netty/util/ByteString.java:

  • public ByteString(CharSequence value, Charset charset) {
  •    this.value = getBytes(value, charset);
    
  • }
  • /**
  • \* @see {@link #getBytes(CharSequence, Charset, int, int)}
    
  • */
    
  • public ByteString(CharSequence value, Charset charset, int start, int length) {
  •    this.value = getBytes(value, charset, start, length);
    
  • }
  • /**
  • \* Create a copy of the underlying storage from {@link value} into a byte array.
    
  • \* The copy will start at {@link ByteBuffer#position()} and copy {@link ByteBuffer#remaining()} bytes.
    
  • */
    
  • public static byte[] getBytes(ByteBuffer value) {
    SGTM


Reply to this email directly or view it on GitHub.

@Scottmitch
Copy link
Member Author

@trustin - Ready for another round.

@trustin
Copy link
Member

trustin commented Apr 14, 2015

@Scottmitch Looks great! Please cherry-pick once you make the getBytes() methods private. If you are inclined to making them public, we could move it somewhere more appropriate.

Motivation:
The usage and code within AsciiString has exceeded the original design scope for this class. Its usage as a binary string is confusing and on the verge of violating interface assumptions in some spots.

Modifications:
- ByteString will be created as a base class to AsciiString. All of the generic byte handling processing will live in ByteString and all the special character encoding will live in AsciiString.

Results:
The AsciiString interface will be clarified. Users of AsciiString can now be clear of the limitations the class imposes while users of the ByteString class don't have to live with those limitations.
@Scottmitch
Copy link
Member Author

@trustin - Methods made private for now. As you said we can always increase visibility later. I'll pull this in after #3626.

@Scottmitch
Copy link
Member Author

@buchgr - FYI. I'm going to pull this in after your #3626 PR. This may cause a bit of heart-ache for your headers optimization work, but hopefully not much if you are focused mostly on DefaultHeaders.

@buchgr
Copy link
Contributor

buchgr commented Apr 14, 2015

@Scottmitch thanks! Will update once I get to work!

@Scottmitch
Copy link
Member Author

PR #3631 will serve to review the forward-port into master.

Scottmitch added a commit that referenced this pull request Apr 15, 2015
Motivation:
While forward porting #3579 there were a few areas that had not been previously back ported.

Modifications:
Backport the missed areas to ensure consistency.

Result:
More consistent 4.1 and master branches.
@Scottmitch
Copy link
Member Author

Cherry-picked into 4.1 (9a7a85d and e36c143)

@Scottmitch Scottmitch closed this Apr 15, 2015
@Scottmitch Scottmitch deleted the ascii_string_refactor branch April 15, 2015 00:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants