Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
A Scala kernel for Jupyter
Apache Spark Course Material
Qubole Sparklens tool for performance tuning Apache Spark
Spark Structured Streaming / Kafka / Cassandra / Elastic
Apache Spark 3 - Structured Streaming Course Material
Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
Spark Connector to read and write with Pulsar
Spark SQL implementations of ItemCF, UserCF, and Swing: recommender systems, recommendation algorithms, collaborative filtering
IntelliJ template to develop Apache Spark 2.x applications
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning, or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.
Movie recommendation system and movie recommendation engine, built with Spark
A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!