mihir-tailor/etl-hdfs-hive

etl-hdfs-hive

Design a batch ETL job using HDFS and Hive

The objectives of this project are:

Design a full batch data pipeline

Use Hive to prepare raw data for transformation

Use partitioning in Hive (splitting a table's data into separate directories by partition column value)
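To make the partitioning objective concrete, here is a minimal Python sketch of the directory layout Hive uses for a partitioned table: one subdirectory per partition value, named `column=value`, with the partition column left out of the data files. It is an illustration only, not code from this repository. The sample rows, the choice of `route_id` as the partition column, the file name `part-00000.csv`, and the helper names `partition_rows` and `write_partitions` are all assumptions, and the sketch writes to a local directory rather than to HDFS.

```python
import csv
import io
from collections import defaultdict
from pathlib import Path

# Hypothetical sample in the shape of a GTFS trips.txt file (values are made up).
SAMPLE_TRIPS = "route_id,service_id,trip_id\n10,WKD,t1\n10,WKD,t2\n24,SAT,t3\n"


def partition_rows(csv_text, partition_col):
    """Group CSV rows by partition_col and drop that column from each row."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row.pop(partition_col)  # the value lives in the directory name instead
        groups[key].append(row)
    return dict(groups)


def write_partitions(groups, partition_col, table_dir):
    """Write one directory per partition value, named column=value."""
    paths = []
    for value, rows in sorted(groups.items()):
        part_dir = Path(table_dir) / f"{partition_col}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        out = part_dir / "part-00000.csv"  # assumed file name for this sketch
        with out.open("w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
            writer.writerows(rows)  # data rows only, no header line
        paths.append(out)
    return paths
```

Calling `write_partitions(partition_rows(SAMPLE_TRIPS, "route_id"), "route_id", "trips")` would produce `trips/route_id=10/` and `trips/route_id=24/`. A query that filters on the partition column only needs to read the matching directory, which is the main reason to partition a large table.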

Data set used: STM GTFS (the General Transit Feed Specification feed published by the Société de transport de Montréal)
