The application converts any CSV file into Parquet or Avro format, and it reads from and writes to any file system, including HDFS and S3.
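A minimal sketch of such a conversion with the Spark DataFrame API, assuming Spark is on the classpath (and the spark-avro package for the Avro case). The object name, paths, and the `fmt` argument are illustrative, not taken from the project:

```scala
import org.apache.spark.sql.SparkSession

object CsvConverter {

  // Pure helper: derive the output path from the input path and format.
  def outputPath(input: String, fmt: String): String =
    input.stripSuffix(".csv") + "." + fmt

  def convert(spark: SparkSession, input: String, fmt: String): Unit = {
    val df = spark.read
      .option("header", "true")      // first line holds column names
      .option("inferSchema", "true") // sample the data to type the columns
      .csv(input)                    // accepts file://, hdfs://, and s3a:// URIs alike

    fmt match {
      case "parquet" => df.write.mode("overwrite").parquet(outputPath(input, fmt))
      case "avro"    => df.write.mode("overwrite").format("avro").save(outputPath(input, fmt))
      case other     => sys.error(s"unsupported format: $other")
    }
  }
}
```

Because Spark resolves the scheme of the URI, the same code path covers local files, HDFS, and S3 without changes.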
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
A project that manipulates unstructured CSV data with Hadoop's HDFS and Hive, and additionally performs queries using Spark SQL.
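A hedged sketch of the Spark SQL querying step, assuming Spark is available; the view name, file path, and column are illustrative placeholders, not the project's own:

```scala
import org.apache.spark.sql.SparkSession

object SparkSqlQuery {

  // Pure helper: build a top-N aggregation query over a registered view.
  def topNQuery(view: String, groupCol: String, n: Int): String =
    s"SELECT $groupCol, COUNT(*) AS cnt FROM $view GROUP BY $groupCol ORDER BY cnt DESC LIMIT $n"

  def run(spark: SparkSession, path: String): Unit = {
    val df = spark.read.option("header", "true").csv(path)
    df.createOrReplaceTempView("records")              // expose the DataFrame to SQL
    spark.sql(topNQuery("records", "category", 10)).show()
  }
}
```

Registering a temporary view is what lets plain SQL run against CSV data loaded from HDFS; the same view could equally be backed by a Hive table.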
Apache Spark programs to perform data analysis on MovieLens data.
An SVM classifier to determine whether two questions on Quora are duplicates or not.
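One classic feature for this kind of question-pair classifier is the Jaccard similarity of the two word sets. The sketch below is a self-contained, plain-Scala version of that feature-engineering step; the tokenizer is an assumption, and in the project the resulting scores would be one of the features fed to the SVM:

```scala
object DuplicateFeatures {

  // Lowercase and split on non-word characters to get the word set.
  def tokens(q: String): Set[String] =
    q.toLowerCase.split("\\W+").filter(_.nonEmpty).toSet

  // Jaccard similarity |A ∩ B| / |A ∪ B|, in [0, 1];
  // 1.0 means the two questions use identical word sets.
  def jaccard(q1: String, q2: String): Double = {
    val (a, b) = (tokens(q1), tokens(q2))
    if (a.isEmpty && b.isEmpty) 1.0
    else (a intersect b).size.toDouble / (a union b).size
  }
}
```

A high Jaccard score alone does not prove duplication (word order and negation matter), which is why it is combined with other features before training the classifier.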
Geo-spatial hotspot analysis of large-scale datasets using Apache Spark and Scala.
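The core of a grid-based hotspot analysis is bucketing points into fixed-size latitude/longitude cells and counting points per cell. The plain-Scala sketch below shows that step; the cell size and threshold are assumptions, and a real hotspot statistic (e.g. Getis-Ord Gi*) is more involved. On Spark the per-cell count would be a `map` to cell keys followed by `reduceByKey`:

```scala
object Hotspots {

  type Cell = (Int, Int)

  // Map a coordinate to its grid cell, e.g. 0.01-degree squares.
  def cellOf(lat: Double, lon: Double, step: Double = 0.01): Cell =
    ((lat / step).floor.toInt, (lon / step).floor.toInt)

  // Count points per cell (locally; on Spark this is map + reduceByKey).
  def counts(points: Seq[(Double, Double)], step: Double = 0.01): Map[Cell, Int] =
    points.groupBy { case (la, lo) => cellOf(la, lo, step) }
          .map { case (c, ps) => (c, ps.size) }

  // Cells whose point count reaches the threshold are flagged as hotspots.
  def hotspots(points: Seq[(Double, Double)], minCount: Int): Set[Cell] =
    counts(points).collect { case (c, n) if n >= minCount => c }.toSet
}
```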
A sentiment analyser for Twitter's tweet stream using Apache Spark: Spark Streaming, Spark SQL, and Stanford NLP (Natural Language Processing).
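The project scores tweets with Stanford NLP inside a Spark Streaming job; as a self-contained stand-in, here is a simple lexicon-based scorer of the kind often used as a baseline. The word lists are illustrative assumptions, and in the streaming job a function like `label` would be mapped over each micro-batch of tweets:

```scala
object Sentiment {

  private val positive = Set("good", "great", "love", "awesome", "happy")
  private val negative = Set("bad", "terrible", "hate", "awful", "sad")

  // Score = (#positive words) - (#negative words).
  def score(tweet: String): Int = {
    val words = tweet.toLowerCase.split("\\W+")
    words.count(positive) - words.count(negative)
  }

  def label(tweet: String): String =
    score(tweet) match {
      case s if s > 0 => "positive"
      case s if s < 0 => "negative"
      case _          => "neutral"
    }
}
```

A proper NLP pipeline handles negation and context ("not great") that a bag-of-words lexicon misses, which is why the project reaches for Stanford NLP instead.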
Live Twitter streaming from the twitter4j API and spam detection using Apache Spark for stream data processing, with Elasticsearch to index the data and Kibana for live interactive dashboards.
Apache Livy - Apache NiFi - Example Scala Spark Job
Created by Matei Zaharia
Released May 26, 2014