Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-21240: JSON SerDe Deserialize Re-Write #530

Open
wants to merge 2 commits into
base: master
from

Conversation

@belugabehr
Copy link
Contributor

belugabehr commented Feb 11, 2019

No description provided.

@belugabehr belugabehr force-pushed the belugabehr:HIVE-21240 branch 7 times, most recently from 12389d7 to 9fb041d Feb 13, 2019
@belugabehr belugabehr force-pushed the belugabehr:HIVE-21240 branch 3 times, most recently from f567d1d to 8174475 Feb 22, 2019
Object o[] = (Object[]) res;
assertEquals(null, o[0]);
// Invalid - should throw Exception
udf.evaluate(evalArgs("{\"b\":null}"));

This comment has been minimized.

Copy link
@b-slim

b-slim Apr 19, 2019

Contributor

am Not sure why this has changed ? seems like this is different from old behavior can you please explain more?

This comment has been minimized.

Copy link
@belugabehr

belugabehr Apr 22, 2019

Author Contributor

Good question.

If you checkout the test method, it is annotated with @Test(expected = HiveException.class). This means that the test will fail if it does not throw a HiveException. What causes it to throw the Exception is the call to udf.evaluate(evalArgs("{\"b\":null}")) therefore, everything that comes after it is dead code, so I simply removed it.

@@ -140,6 +140,7 @@ CREATE EXTERNAL TABLE kafka_table_2
`country` string,`continent` string, `namespace` string, `newPage` boolean, `unpatrolled` boolean,
`anonymous` boolean, `robot` boolean, added int, deleted int, delta bigint)
STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
WITH SERDEPROPERTIES ("timestamp.formats"="yyyy-MM-dd\'T\'HH:mm:ss\'Z\'")

This comment has been minimized.

Copy link
@b-slim

b-slim Apr 19, 2019

Contributor

thanks for fixing this.


if (t.getLength() == 0) {
if (!this.nullEmptyLines) {
throw new SerDeException("Encountered an empty row in the text file");

This comment has been minimized.

Copy link
@b-slim

b-slim Apr 19, 2019

Contributor

how about merging those 2 ifs ?

This comment has been minimized.

Copy link
@belugabehr

belugabehr Apr 22, 2019

Author Contributor

So, these two if statements cannot be merged. Check out the code that follows. If the nullEmptyLines feature is disabled, then it has another behavior of just returning 'null' for all fields.

} catch (Exception e) {
LOG.warn("Error [{}] parsing json text [{}].", e, t);
LOG.warn("Error parsing JSON text [{}].", t, e);

This comment has been minimized.

Copy link
@b-slim

b-slim Apr 19, 2019

Contributor

not sure why this is a warning seems like it is an error ?

This comment has been minimized.

Copy link
@belugabehr

belugabehr Apr 22, 2019

Author Contributor

Thanks for pointing this out. I am going to lower this logging level to debug because there is also an Exception thrown here. It should either log, or throw, not both.

https://community.oracle.com/docs/DOC-983543#logAndThrow

throws SerDeException {
StringBuilder sb = new StringBuilder();
try {
public Writable serialize(final Object obj,

This comment has been minimized.

Copy link
@b-slim

b-slim Apr 19, 2019

Contributor

is this split on 2 lines for reason or it is github rendering ?

This comment has been minimized.

Copy link
@belugabehr

belugabehr Apr 22, 2019

Author Contributor

I just checked this... this is the correct format when using the Hadoop check-style format.

* Enumeration that defines all on/off features for this reader.
*/
public enum Feature {
COL_INDEX_PARSING, PRIMITIVE_TO_WRITABLE, IGNORE_UKNOWN_FIELDS

This comment has been minimized.

Copy link
@b-slim

b-slim Apr 19, 2019

Contributor

can you please document those features?

*
* @param rootNode The root node to process
* @param oi The ObjectInspector to use
* @return The value in this node (may be a complex type if nested)

This comment has been minimized.

Copy link
@b-slim

b-slim Apr 19, 2019

Contributor

seems like this can return null, can you add annotation plus explanation what null means ?

This comment has been minimized.

Copy link
@belugabehr

belugabehr Apr 22, 2019

Author Contributor

I can try to clarify,... null has not special meaning here, it's just the value of the JsonNode.

Example: {"name":null}

@belugabehr belugabehr force-pushed the belugabehr:HIVE-21240 branch from 8174475 to 6ff2969 Apr 22, 2019
@belugabehr belugabehr force-pushed the belugabehr:HIVE-21240 branch from 6ff2969 to 26efbd3 Apr 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.