Double quotes when specifying column names gives NULL data #2649

Closed
rmoff opened this issue Apr 8, 2019 · 5 comments

@rmoff
Contributor

rmoff commented Apr 8, 2019

KSQL 5.2.1.

Source data

ksql> PRINT 'vflow.netflow9.flat' FROM BEGINNING;
Format:JSON
{"ROWTIME":1554712041216,"ROWKEY":"null","exportTime":"Apr 5, 2019 9:31:45 PM","UNIXSecs":1554499905,"SeqNum":567608,"SrcID":1,"AgentID":"71.41.183.98","SysUpTime":2147483647,"Version":9.0,"Count":20.0,"in_bytes":66,"in_pkts":1,"flows":-1,"protocol":6,"src_tos":0,"tcp_flags":18,"l4_src_addr":443,"ipv4_src_addr":"13.107.18.254","src_mask":-1,"input_snmp":6,"l4_dst_port":49243,"ipv4_dst_addr":"71.41.183.98","dst_mask":-1,"output_snmp":8,"ipv4_next_hop":"NA","src_as":-1,"dst_as":-1,"bgp_ipv4_next_hop":"NA","mul_dst_pkts":-1,"mul_dst_bytes":-1,"last_switched":2350854112,"first_switched":2350854112,"out_bytes":-1,"out_pkts":-1,"min_pkt_lngth":-1,"max_pkt_lngth":-1,"ipv6_src_addr":"NA","ipv6_dst_addr":"NA","ipv6_src_mask":-1,"ipv6_dst_mask":-1,"ipv6_flow_label":-1,"icmp_type":0,"mul_igmp_type":-1,"sampling_interval":-1,"sampling_algorithm":-1,"flow_active_timeout":-1,"flow_inactive_timeout":-1,"engine_type":-1,"engine_id":-1,"total_bytes_exp":-1,"total_pkts_exp":-1,"total_flows_exp":-1,"ipv4_src_prefix":"NA","ipv4_dst_prefix":"NA","mpls_top_label_type":-1,"mpls_top_label_ip_addr":"NA","flow_sampler_id":-1,"flow_sampler_mode":-1,"flow_sampler_random_interval":-1,"min_ttl":-1,"max_ttl":-1,"ipv4_ident":-1,"dst_tos":-1,"in_src_mac":"NA","out_dst_mac":"NA","src_vlan":-1,"dst_vlan":-1,"ip_protocol_version":-1,"direction":0,"ipv6_next_hop":"NA","bpg_ipv6_next_hop":"NA","ipv6_option_headers":-1,"mpls_label_1":"NA","mpls_label_2":"NA","mpls_label_3":"NA","mpls_label_4":"NA","mpls_label_5":"NA","mpls_label_6":"NA","mpls_label_7":"NA","mpls_label_8":"NA","mpls_label_9":"NA","mpls_label_10":"NA","in_dst_mac":"NA","out_src_mac":"NA","if_name":"NA","if_desc":"NA","sampler_name":"NA","in_permanent_bytes":-1,"in_permanent_pkts":-1,"fragment_offset":-1,"forwarding_status":-1,"mpls_pal_rd":"NA","mpls_prefix_len":-1,"src_traffic_index":-1,"dst_traffic_index":-1,"application_description":"NA","application_tag":"NA","application_name":"NA","postipdiffservcodepoint":-1,"replication_factor":-1,
"layer2packetsectionoffset":-1,"layer2packetsectionsize":-1,"layer2packetsectiondata":"NA"}
^CTopic printing ceased

Create stream with double quotes around column names:

ksql> CREATE STREAM test2 ("UNIXSecs" INTEGER,"SeqNum" INTEGER,"SrcID" INTEGER,"AgentID" STRING) WITH (kafka_topic='vflow.netflow9.flat', value_format='JSON');

 Message
----------------
 Stream created
----------------

🔴 Query - data not returned as expected - only shows NULLs

ksql> select * from test2 limit 1;
1554712041216 | null | null | null | null | null
Limit Reached
Query terminated
ksql>

Create stream without double quotes around column names:

ksql> CREATE STREAM test1 (UNIXSecs INTEGER,SeqNum INTEGER,SrcID INTEGER,AgentID STRING) WITH (kafka_topic='vflow.netflow9.flat', value_format='JSON');

 Message
----------------
 Stream created
----------------

✅ Query - data returned as expected:

ksql> select * from test1 limit 1;
1554712041216 | null | 1554499905 | 567608 | 1 | 71.41.183.98
Limit Reached
Query terminated

Related: #2589
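
To rule out a simple spelling mismatch, the declared column names can be compared with the keys actually present in a record. The Python sketch below is only a diagnostic aid and does not reproduce KSQL's deserialization logic; the helper name `match_columns` and the trimmed sample record are made up for illustration.

```python
import json

def match_columns(record_json, columns):
    """Classify each declared column as an exact, case-insensitive, or missing match."""
    keys = json.loads(record_json).keys()
    lowered = {k.lower(): k for k in keys}
    result = {}
    for col in columns:
        if col in keys:
            result[col] = "exact"
        elif col.lower() in lowered:
            result[col] = "case-insensitive (" + lowered[col.lower()] + ")"
        else:
            result[col] = "missing"
    return result

# Trimmed copy of the record printed from the topic above (assumption: only
# the four declared fields matter for this check).
sample = '{"UNIXSecs":1554499905,"SeqNum":567608,"SrcID":1,"AgentID":"71.41.183.98"}'

# Quoted identifiers keep their case; unquoted ones are upper-cased by KSQL.
quoted = match_columns(sample, ["UNIXSecs", "SeqNum", "SrcID", "AgentID"])
unquoted = match_columns(sample, ["UNIXSECS", "SEQNUM", "SRCID", "AGENTID"])
print(quoted)
print(unquoted)
```

The quoted names match the JSON keys exactly, so the NULLs returned by test2 are not explained by a typo in the CREATE STREAM statement.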

@vcrfxia
Contributor

vcrfxia commented Apr 16, 2019

I believe this issue is JSON-specific, and is a specific case of #2551.

@alexott

alexott commented Apr 22, 2019

I had a similar issue when I used double quotes in SQL instead of single quotes:

ksql> CREATE STREAM tweets (text varchar, lang varchar, id bigint) with (KAFKA_TOPIC="tweets", VALUE_FORMAT='JSON', KEY = 'id');
Failed to prepare statement: name is null

From the error message, it's hard to guess what is wrong...

@allenansari174

I created a stream, but the ROWKEY format looks like this:
{"schema":{"type":"struct","fields":[{"type":"int64","optional":true,"field":"Id"}],"optional":false,"name":"com.github.jcustenborder.kafka.connect.twitter.StatusKey","doc":"Key for a twitter status."},"payload":{"Id":1140239850848882688}}

and the rest of the fields are null.

Any idea how to solve this issue?

ksql> SELECT * FROM twitter_raw LIMIT 1;
1560689344973 | {"schema":{"type":"struct","fields":[{"type":"int64","optional":true,"field":"Id"}],"optional":false,"name":"com.github.jcustenborder.kafka.connect.twitter.StatusKey","doc":"Key for a twitter status."},"payload":{"Id":1140239850848882688}} | null | null | null

@vcrfxia
Contributor

vcrfxia commented Jun 16, 2019

Hi @allenansari174, KSQL currently does not support message keys that are JSON objects, but there is ongoing work to add support for this: #824 (though the issue hasn't been updated in a while, there has been substantial work towards it since earlier in the year).
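
For illustration, the key shown above is a JSON object with a `schema` part and a `payload` part, which looks like the envelope Kafka Connect's JsonConverter writes when schemas are enabled. The ID itself sits under `payload.Id`. The Python sketch below reads it outside of KSQL; the function name `extract_key_id` is hypothetical, and this is not a KSQL workaround, only a way to see what the key contains.

```python
import json

# Key copied from the ROWKEY shown in the comment above.
raw_key = (
    '{"schema":{"type":"struct","fields":[{"type":"int64","optional":true,'
    '"field":"Id"}],"optional":false,'
    '"name":"com.github.jcustenborder.kafka.connect.twitter.StatusKey",'
    '"doc":"Key for a twitter status."},'
    '"payload":{"Id":1140239850848882688}}'
)

def extract_key_id(key_json):
    """Return payload.Id from a schema+payload key envelope, or None if absent."""
    envelope = json.loads(key_json)
    return envelope.get("payload", {}).get("Id")

print(extract_key_id(raw_key))
```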

@big-andy-coates big-andy-coates added this to Needs triage in Bugs Oct 25, 2019
@agavra agavra self-assigned this Oct 25, 2019
@agavra
Contributor

agavra commented Oct 28, 2019

This has been fixed in master and will be available in the next release!

ksql> PRINT test FROM BEGINNING;
Format:JSON
{"ROWTIME":1572281922845,"ROWKEY":"null","UNIXSecs":1,"SeqNum":2}
ksql> CREATE STREAM test ("UNIXSecs" INTEGER, "SeqNum" INTEGER) WITH (kafka_topic='test', value_format='JSON', partitions=1);

 Message
----------------
 Stream created
----------------
ksql> SELECT * FROM TEST EMIT CHANGES LIMIT 1;
+----------------------+----------------------+----------------------+----------------------+
|ROWTIME               |ROWKEY                |UNIXSecs              |SeqNum                |
+----------------------+----------------------+----------------------+----------------------+
|1572281922845         |null                  |1                     |2                     |
Limit Reached
Query terminated

Feel free to re-open if the problem persists!

@agavra agavra closed this as completed Oct 28, 2019
Bugs automation moved this from Needs triage to Closed Oct 28, 2019