You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
An Automated ETL Data pipeline which extract complex json data from web API service (GBFS-bixi Data) and convert to CSV for loading into Data-warehouse HDFS. After-that, Hive will process the further by external and managed table. Same procedure is also applied with AWS S3 and Athena.
Distributed computational problem-solving project, which aims to perform large-scale graph matching using cloud computing technologies. The project allows users to import two directed graphs and analyze the differences between them.
Batch ETL data pipeline built on HDP 3.0 to process daily sales and business data to procedure power Bi reports. Automated the pipelines using Airflow.
In this project, we are going to build a Bicycle sharing demand prediction service using Apache Spark and Scala. I have created a two spark application one for model generation and another for model demand prediction.