# spark-sql

Here are 193 public repositories matching this topic...

apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

kubernetes sql spark hive hadoop jdbc thrift data-lake hacktoberfest spark-sql

Updated Aug 7, 2024
Scala

databricks / LearningSparkV2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

spark apache-spark mllib structured-streaming spark-sql spark-mllib mlflow delta-lake

Updated May 8, 2024
Scala

apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.

arrow clickhouse simd vectorization spark-sql velox

Updated Aug 7, 2024
Scala

almond-sh / almond

A Scala kernel for Jupyter

scala spark jupyter repl jupyter-notebook jupyter-kernels spark-sql

Updated Jul 29, 2024
Scala

LearningJournal / SparkProgrammingInScala

Apache Spark Course Material

scala big-data spark apache-spark bigdata data-lake datalake spark-sql spark-scala

Updated Apr 21, 2023
Scala

qubole / sparklens

Qubole Sparklens tool for performance tuning Apache Spark

performance scala spark simulation cluster scheduler scheduling performance-metrics performance-tuning performance-visualization performance-analysis sparkjava spark-job spark-applications spark-sql spark-mllib spark-ml

Updated Jun 26, 2024
Scala

polomarcus / Spark-Structured-Streaming-Examples

Spark Structured Streaming / Kafka / Cassandra / Elastic

kafka spark cassandra structured-streaming spark-sql

Updated Feb 7, 2023
Scala

mc2-project / opaque-sql

An encrypted data analytics platform

security machine-learning privacy spark analytics enclave spark-sql

Updated Mar 29, 2023
Scala

LearningJournal / Spark-Streaming-In-Scala

Apache Spark 3 - Structured Streaming Course Material

scala big-data spark apache-spark bigdata spark-streaming datalake spark-sql

Updated Sep 8, 2020
Scala

sjyttkl / spark_learning

Atguigu (尚硅谷) Big Data Spark, latest 2019 edition: Spark learning

spark spark-sql spark-core

Updated Jun 21, 2022
Scala

Thomas-George-T / Movies-Analytics-in-Spark-and-Scala

Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.

scala movies big-data spark hadoop analytics movielens-data-analysis shell-script dataframes movielens-dataset rdd case-study spark-sql spark-programs spark-dataframes big-data-analytics spark-scala big-data-projects spark-rdd

Updated May 19, 2021
Scala

streamnative / pulsar-spark

Spark Connector to read and write with Pulsar

data-science spark apache-spark stream-processing flink data-processing batch-processing structured-streaming spark-sql apache-pulsar

Updated Apr 12, 2024
Scala

xiaogp / recsys_spark

Spark SQL implementations of ItemCF, UserCF, and Swing: recommender systems, recommendation algorithms, collaborative filtering

collaborative-filtering recommender-system spark-sql

Updated Dec 19, 2019
Scala

maziyarpanahi / spark2-template

IntelliJ template to develop Apache Spark 2.x applications

spark-streaming spark-sql spark2 spark-ml

Updated Jan 18, 2022
Scala

spider-123-eng / Spark

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning, or SQL workloads requiring fast iterative access to datasets. This project contains sample Spark programs written in Scala.

streaming consumer parquet kafka-producer spark-sql spark-kafka-integration spark-streaming-data spark-transformations spark-to-cassandra-connection spark-dataframes spark-joins spark-hive-context spark-jdbc-connection spark-with-mangodb spark-aggregations-using-dataframe spark-use-cases cassandra-installation spark-datadog spark-mangodb spark-catalog-api

Updated Nov 16, 2022
Scala

wangj1106 / recommendMoteur

Movie recommendation system and movie recommendation engine, built with Spark

movies kafka spark spark-streaming recommendation-engine recommender-system flume als recommendation spark-sql

Updated Jun 25, 2018
Scala

anish749 / spark2-etl-examples

A project with examples of using a few commonly used data manipulation/processing/transformation APIs in Apache Spark 2.0.0

spark spark-sql sparkscala spark2 spark-batch

Updated Aug 5, 2021
Scala

Qbeast-io / qbeast-spark

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!

scala big-data spark sampling datasource spark-sql data-lakehouse

Updated Aug 5, 2024
Scala

minio / spark-select

A library for Spark DataFrame using MinIO Select API

select spark sbt bigdata pyspark minio parquet-files spark-sql amazon-s3

Updated Sep 27, 2019
Scala

mayur2810 / sope

Apache Spark ETL Utilities

yaml framework scala spark etl dsl transformer spark-sql

Updated Aug 21, 2023
Scala
