Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Non empty BINARY fields are considered empty #834

Merged
merged 2 commits into from
Aug 29, 2016

Conversation

pfcoperez
Copy link
Contributor

@pfcoperez pfcoperez commented Aug 26, 2016

Thank you for submitting a pull request!

Please make sure you have signed our Contributor License Agreement (CLA).
We are not asking you to assign copyright to us, but to give us the right to distribute your code without restriction. We ask this of all contributors in order to assure our users of the origin and continuing existence of the code.
You only need to sign the CLA once.

Prior to this PR, all values for BINARY mapped fields were considered as empty no matter what value they actually had. The problem is that the condition deciding how to manage empty values is incomplete:

def binaryValue(value: Array[Byte]) = {
    if (value != null) {
      if (emptyAsNull) { // This should be:  if (emptyAsNull && value.isEmpty)
        nullValue()
      }
      else {
        parseBinary(value)
      }
    }
    else {
      nullValue()
    }
  }

On the other hand, this is not critical however, two paths of conditions returning the same value in an if-statement can become a source of bugs if there are changes in the code (what if someone changes the return value at the outer else but not when the emptyAsNull && value.isEmpty is satisfied?). For that, I've rephrased the checks in a more idiomatic Scala style which leverages the nullity check that Option's factory provide:

 Option(value) collect {
      case value: Array[Byte] if !emptyAsNull || !value.isEmpty =>
        parseBinary(value)
    } getOrElse nullValue()

…ed as empty no matter what value they actually had.

Besides, it has been simplified in a more idiomatic way.
@jbaiera
Copy link
Member

jbaiera commented Aug 26, 2016

@pfcoperez the change looks good to me, but could you add a unit test case that covers the change?

@pfcoperez
Copy link
Contributor Author

pfcoperez commented Aug 26, 2016

@jbaiera Sure. I had to introduce additional changes in order to perform the tests with the provided jackson version.

Jackson 1.8.8 Utf8SreamReader lacks the else branch of the following if-sentence at getBinaryValue:

if (_tokenIncomplete) {
            try {
                _binaryValue = _decodeBase64(b64variant);
            } catch (IllegalArgumentException iae) {
                throw _constructError("Failed to decode VALUE_STRING as base64 ("+b64variant+"): "+iae.getMessage());
            }
            /* let's clear incomplete only now; allows for accessing other
             * textual content in error cases
             */
            _tokenIncomplete = false;
        } else { // may actually require conversion...
            if (_binaryValue == null) {
                @SuppressWarnings("resource")
                ByteArrayBuilder builder = _getByteArrayBuilder();
                _decodeBase64(getText(), builder, b64variant);
                _binaryValue = builder.toByteArray();
            }
        }
        return _binaryValue;

In any case, they're related to the unit tests but are is orthogonal to the initial change fixing the bug.

@jbaiera
Copy link
Member

jbaiera commented Aug 29, 2016

Awesome, thanks. I noticed a discrepancy between the ScalaValueReader and the JdkValueReader with how they handle emptyAsNull for BINARY types. Seems ScalaValueReader does, while the JdkValueReader does not. This may be due to JdkValueReader's sub classes and how they convert things to writables (Pig, Hive) which is harder to represent null values in. In any case, that is definitely outside of the scope for this PR.

LGTM. Thanks again!

@jbaiera jbaiera merged commit b60f606 into elastic:master Aug 29, 2016
@pfcoperez pfcoperez deleted the fix_scalareader_binary_type branch August 30, 2016 06:24
@pfcoperez
Copy link
Contributor Author

@jbaiera Thanks to you! I wonder whether this PR could be also merged into 2.3.x branches.

@jbaiera jbaiera added the v2.4.0 label Aug 30, 2016
@jbaiera
Copy link
Member

jbaiera commented Aug 30, 2016

I have a few changes that I was planning on back porting to 2.x, and this is one of them. Should be done by end of day tomorrow. Cheers!

@pfcoperez
Copy link
Contributor Author

That's nice! Would it be possible to port
#826 too?

Thanks for your great work and being open approach to PRs and suggestions.

El mar., 30 ago. 2016 a las 18:12, James Baiera (notifications@github.com)
escribió:

I have a few changes that I was planning on back porting to 2.x, and this
is one of them. Should be done by end of day tomorrow. Cheers!


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#834 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAQr4-17xRfsIczP_CFqP4bAlEh1LCUhks5qlFZUgaJpZM4Jt52V
.

@jbaiera
Copy link
Member

jbaiera commented Aug 30, 2016

Yep that one is on the list as well. Thanks for the contributions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants