Spark assignment using SparkSQL and Spark Streaming with Kafka. Calculates spaceship consumption.
Updated Jan 1, 2019 - Scala
Spark with Scala, including RDDs, transformations, actions, HDFS, SparkSQL, DataFrames, and MLlib
Designed a machine learning model that takes a newsgroup dataset and performs binary classification to predict whether a given document has atheistic or Christian sentiment. Used the LIME library and PySpark. Performed feature selection to improve the classifier's performance.
Yelp dataset analysis using Apache Spark and Pig, with insights presented in the Zeppelin GUI
In this repository, Google Colab is paired with SparkSQL to determine key metrics about home sales data. Spark is also used to create temporary views, partition data, and cache/uncache a temporary table in the process.
Dockerfile for spark-ubunt-scala-python3
Analyzing key metrics on home sales data
PySpark and SparkSQL data transformation
Weather Data Analysis using Python, Pandas, SparkSQL, AutoRegression Model
Developing Spark applications using Scala
A fun place for me to blog about distributed databases, aerial arts, and life in general