KYLIN-3942 Support multi-level json event in backend #605

Merged
allenma merged 2 commits into apache:master from hit-lacus:KYLIN-3942
Apr 18, 2019
@hit-lacus (Member) commented Apr 14, 2019

Problem

Currently, real-time OLAP does not support multi-level JSON events.
For example, a multi-level JSON event from Kafka like the following cannot be parsed, and the streaming receiver logs the error shown below:

Sample Event

{
    "country":"JAPAN",
    "amount":13.075058425023922,
    "qty":8,
    "currency":"USD",
    "order_time":1554801950882,
    "category":"ELECTRONIC",
    "device":"Andriod",
    "user":{
        "gender":"Female",
        "id":"7a0cfa5e-bbaa-79ef-1a38-e06f02c85fcb",
        "first_name":"unknown",
        "age":16
    }
}

streaming_receiver.log

2019-04-09 09:46:09,878 ERROR [StreamingV2Cube_channel] kafka.TimedJsonStreamParser:107 : error
com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot deserialize instance of `java.lang.String` out of START_OBJECT token
at [Source: (String)"{"country":"US","amount":14.498498222823619,"qty":1,"currency":"USD","order_time":1554803169876,"category":"Other","device":"Other","user":{"gender":"Female","id":"0736b41a-9ae7-9b4a-a124-f74436d3eb41","first_name":"unknown","age":26}}"; line: 1, column: 140] (through reference chain: java.util.HashMap["user"])
at com.fasterxml.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:63)
at com.fasterxml.jackson.databind.DeserializationContext.reportInputMismatch(DeserializationContext.java:1342)
at com.fasterxml.jackson.databind.DeserializationContext.handleUnexpectedToken(DeserializationContext.java:1138)
at com.fasterxml.jackson.databind.DeserializationContext.handleUnexpectedToken(DeserializationContext.java:1092)
at com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:63)
at com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:10)
at com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:527)
at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:364)
at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:29)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4001)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3030)
at org.apache.kylin.stream.source.kafka.TimedJsonStreamParser.parse(TimedJsonStreamParser.java:79)
at org.apache.kylin.stream.source.kafka.TimedJsonStreamParser.parse(TimedJsonStreamParser.java:54)
at org.apache.kylin.stream.source.kafka.consumer.KafkaConnector.nextEvent(KafkaConnector.java:110)
at org.apache.kylin.stream.core.consumer.StreamingConsumerChannel.run(StreamingConsumerChannel.java:93)
at java.lang.Thread.run(Thread.java:748)

After the fix

{
    "act_type":"play",
    "uid":1716,
    "video_id":1786,
    "play_duration":10.2966,
    "detail":{
        "Info":"dk",
        "Type":"abc",
        "location":{
            "province":"taiwan"
        }
    },
    "video_type":"3aa17",
    "ts":1555261235000,
    "pageview_id":"6ed3aa34d75f0",
    "play_times":31
}


@asfgit commented Apr 14, 2019

Can one of the admins verify this patch?

@coveralls commented Apr 14, 2019

Pull Request Test Coverage Report for Build 4403

  • 39 of 42 (92.86%) changed or added relevant lines in 1 file are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.06%) to 27.823%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| stream-source-kafka/src/main/java/org/apache/kylin/stream/source/kafka/TimedJsonStreamParser.java | 39 | 42 | 92.86% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| core-dictionary/src/main/java/org/apache/kylin/dict/lookup/cache/RocksDBLookupTable.java | 1 | 81.08% |
| stream-core/src/main/java/org/apache/kylin/stream/core/storage/columnar/ColumnarStoreCache.java | 2 | 55.81% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 4393 | 0.06% |
| Covered Lines | 22746 |
| Relevant Lines | 81753 |

💛 - Coveralls

@codecov-io commented Apr 14, 2019

Codecov Report

Merging #605 into master will increase coverage by 0.04%.
The diff coverage is 83.33%.

@@             Coverage Diff              @@
##             master     #605      +/-   ##
============================================
+ Coverage      25.3%   25.34%   +0.04%     
- Complexity     5827     5838      +11     
============================================
  Files          1379     1379              
  Lines         81720    81753      +33     
  Branches      11444    11452       +8     
============================================
+ Hits          20676    20720      +44     
+ Misses        59032    59011      -21     
- Partials       2012     2022      +10

| Impacted Files | Coverage Δ | Complexity Δ | |
| --- | --- | --- | --- |
| ...lin/stream/source/kafka/TimedJsonStreamParser.java | 76.71% <83.33%> (+76.71%) | 12 <8> (+12) | ⬆️ |
| ...he/kylin/dict/lookup/cache/RocksDBLookupTable.java | 72.97% <0%> (-5.41%) | 6% <0%> (-1%) | |
| ...ream/core/storage/columnar/ColumnarStoreCache.java | 48.83% <0%> (-3.49%) | 7% <0%> (ø) | |
| .../apache/kylin/cube/cuboid/TreeCuboidScheduler.java | 63.84% <0%> (-2.31%) | 0% <0%> (ø) | |
| ...rg/apache/kylin/cube/inmemcubing/MemDiskStore.java | 69.6% <0%> (-1.22%) | 7% <0%> (ø) | |

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 595c4e2...546bfaa.

@hit-lacus (Member, Author)

After discussion, I now use ParseInfo's columnToSourceFieldMapping to parse multi-level JSON events.
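
To illustrate the idea of a column-to-source-field mapping, here is a minimal sketch of how a column name could be resolved against a nested event. It is not the code merged in this PR: the class `NestedFieldResolver`, the method `resolve`, the column names, and the use of `.` as the path separator are all assumptions made for the example, and the event is assumed to be already parsed into nested `Map<String, Object>` values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not Kylin's TimedJsonStreamParser.
class NestedFieldResolver {

    // Assumption: nesting levels in a source field path are separated by '.'.
    private static final String PATH_SEPARATOR_REGEX = "\\.";

    /**
     * Resolves a column value from an already-parsed JSON event.
     * The column is translated to a source field path through the mapping;
     * a column without a mapping is read from the top level under its own name.
     * Returns null if any step of the path is missing or is not a JSON object.
     */
    static String resolve(String column,
                          Map<String, String> columnToSourceFieldMapping,
                          Map<String, Object> event) {
        String path = columnToSourceFieldMapping.getOrDefault(column, column);
        Object current = event;
        for (String key : path.split(PATH_SEPARATOR_REGEX)) {
            if (!(current instanceof Map)) {
                return null; // path goes deeper than the event does
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current == null ? null : String.valueOf(current);
    }

    public static void main(String[] args) {
        // Part of the multi-level event from the PR description.
        Map<String, Object> location = new HashMap<>();
        location.put("province", "taiwan");
        Map<String, Object> detail = new HashMap<>();
        detail.put("Info", "dk");
        detail.put("location", location);
        Map<String, Object> event = new HashMap<>();
        event.put("uid", 1716);
        event.put("detail", detail);

        // Hypothetical column names mapped to nested source fields.
        Map<String, String> mapping = new HashMap<>();
        mapping.put("PROVINCE", "detail.location.province");
        mapping.put("INFO", "detail.Info");

        System.out.println(resolve("PROVINCE", mapping, event));
        System.out.println(resolve("INFO", mapping, event));
        System.out.println(resolve("uid", mapping, event));
    }
}
```

Keeping the path in configuration rather than in the column name means existing flat events keep working unchanged, since an unmapped column falls back to a top-level lookup.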



public interface IStreamingMessageParser<T> {
    StreamingMessage parse(T sourceMessage);
    StreamingMessage parse(T sourceMessage, int partition, long offset);
It is better not to change the interface.

@hit-lacus (Member, Author)

OK

@allenma allenma merged commit e7b76b7 into apache:master Apr 18, 2019
@hit-lacus hit-lacus deleted the KYLIN-3942 branch March 18, 2020 01:48