Hive Storage Handler for Kafka

HiveKa is a storage handler for Apache Hive that lets Hive query data held in Apache Kafka. It gives Kafka users a way to inspect data ingested by Kafka without writing complex Kafka consumers. Hive already makes it possible to run complex analytical queries across various data sources, such as HDFS, Solr, and HBase. HiveKa extends this support to Kafka.

Visit our website.

To create a Kafka-backed table in Hive, run:

CREATE EXTERNAL TABLE test_kafka (a INT, b STRING)
STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
TBLPROPERTIES (
  'kafka.service.uri' = 'hivekafka-1.ent.cloudera.com:9092',
  'kafka.whitelist.topics' = 'test4',
  'kafka.avro.schema.file' = '/tmp/test.avsc'
);
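The table definition points at an Avro schema file, /tmp/test.avsc, whose contents are not shown here. As a minimal sketch, assuming the handler expects a standard Avro record schema whose fields mirror the table columns (a int, b string), a file like that could be generated as follows. The record name and the helper functions `build_schema` and `write_schema` are hypothetical and chosen only for illustration.

```python
import json
from pathlib import Path


def build_schema(record_name="test_kafka"):
    """Return an Avro record schema mirroring the columns (a int, b string)."""
    return {
        "type": "record",
        "name": record_name,  # assumed name; the README does not specify one
        "fields": [
            {"name": "a", "type": "int"},
            {"name": "b", "type": "string"},
        ],
    }


def write_schema(path, schema):
    """Serialize the schema as JSON, the format used by .avsc files."""
    Path(path).write_text(json.dumps(schema, indent=2))


if __name__ == "__main__":
    # Path taken from the 'kafka.avro.schema.file' table property above.
    write_schema("/tmp/test.avsc", build_schema())
```

If the real schema uses different field names or types, the column list in the table definition would presumably need to change to match.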

To write Avro-encoded messages to a topic, run our DemoProducer and pass the topic, the number of messages, and a Kafka broker as parameters, in that order. For example:

java -classpath "/opt/cloudera/parcels/CDH/lib/avro/*:hive-kafka-1.0-SNAPSHOT.jar:/usr/lib/hive/*" org.apache.hadoop.hive.kafka.demoproducer.DemoProducer test4 10 hivekafka-1:9092
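The positional arguments are easy to mix up, so here is a small sketch that assembles the same command from named values. The helper `demo_producer_command` is hypothetical and not part of HiveKa. It only builds the argument list and does not run anything.

```python
import shlex

# Main class name taken from the command above.
PRODUCER_CLASS = "org.apache.hadoop.hive.kafka.demoproducer.DemoProducer"


def demo_producer_command(topic, num_messages, broker, classpath_entries):
    """Build the java argument list: topic, message count, then broker."""
    classpath = ":".join(classpath_entries)
    return [
        "java", "-classpath", classpath,
        PRODUCER_CLASS,
        topic, str(num_messages), broker,
    ]


if __name__ == "__main__":
    cmd = demo_producer_command(
        topic="test4",
        num_messages=10,
        broker="hivekafka-1:9092",
        classpath_entries=[
            "/opt/cloudera/parcels/CDH/lib/avro/*",
            "hive-kafka-1.0-SNAPSHOT.jar",
            "/usr/lib/hive/*",
        ],
    )
    # shlex.join quotes the wildcard entries so the shell leaves them alone.
    print(shlex.join(cmd))
```

The classpath wildcards are passed to java unexpanded, which is what the quoted classpath in the shell command does as well.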
