Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Updated May 16, 2024 · Scala
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
High performance data store solution
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Extremely distributed machine learning
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Hadoop splittable InputFormat for ROS. Process rosbag files with Hadoop, Spark, and other HDFS-compatible systems.
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
A re-implementation of Hadoop DistCP in Apache Spark