mmm0469/spark

spark

Apache Spark Tutorial

Spark was originally designed and developed at the Berkeley AMPLab. To benefit from the open community at Apache and to bring Spark to everyone interested in data analytics, the developers donated the codebase to the Apache Software Foundation, and Apache Spark was born. Apache Spark is therefore an open source project of the Apache Software Foundation.

Get started with Apache Spark core concepts and setup:

- Install Spark on Mac OS
- Install Spark on Ubuntu
- How to Setup an Apache Spark Cluster
- Cluster managers supported in Apache Spark
- Load data from JSON file and execute SQL query in Spark SQL
- Setup Apache Spark to run in Standalone cluster mode
- Get started with Apache Spark with the help of Word Count Example
- Configure Apache Spark Ecosystem
- Configure Spark Application
- Configuring Spark Environment
- DAG and Physical Execution Plan
- Text Search Example
- Example to read table in MySQL Database
- Pi Estimation example to demonstrate compute intensive tasks
- Apache Spark MLlib
- Apache Spark SQL Library
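As a rough, Spark-free illustration of what the Word Count example computes, here is a minimal Python sketch. The function name `word_count` and the sample input are hypothetical and are not taken from the linked tutorial; in Spark the same three steps are commonly expressed with the RDD operations `flatMap`, `map`, and `reduceByKey`, and the work is distributed across a cluster rather than run in a single process.

```python
def word_count(lines):
    """Count word occurrences in an iterable of text lines.

    Hypothetical helper for illustration only; it mirrors the usual
    map/reduce shape of a word count job without using Spark itself.
    """
    # Step 1 (like flatMap): split every line into individual words.
    words = (word for line in lines for word in line.split())

    # Step 2 (like map): pair each word with an initial count of 1.
    pairs = ((word, 1) for word in words)

    # Step 3 (like reduceByKey): sum the counts for each distinct word.
    counts = {}
    for word, n in pairs:
        counts[word] = counts.get(word, 0) + n
    return counts


if __name__ == "__main__":
    # Made-up sample input, used only to show the call.
    sample = ["spark makes big data simple", "big data needs spark"]
    for word, n in sorted(word_count(sample).items()):
        print(word, n)
```

The sketch keeps each stage separate so that it lines up with the transformation it stands in for; a real Spark job would read its input from a file or other data source instead of an in-memory list.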

A detailed explanation, with an example, is provided below for each of the available machine learning algorithms:

- Classification using Logistic Regression
- Classification using Naive Bayes
- Generalised Regression
- Survival Regression
- Classification using Decision Trees in Apache Spark MLlib with Java
- RandomForest Classification Example using Spark MLlib
- Gradient Boosted Trees
- Recommendation using Alternating Least Squares (ALS)
- Clustering using KMeans
- Clustering using Gaussian Mixtures
- Topic Modelling using Latent Dirichlet Allocation
- Frequent Itemsets
- Association Rules
- Sequential Pattern Mining
