Spark Projects for the Berkeley Data Science Course
1_Word_count
2_Apache_web_log
3_Entity_Resolution
4_Recommendations
5_MathReview
6_Linear_Regression
7_Logistic_Regression
8_PCA
.DS_Store
README.md

Spark_Projects

Spark Projects for the Berkeley Data Science Course

  1. Wordcount in Spark - A word counting program to count the words in all of Shakespeare's plays

  2. Apache Log File analysis in Spark - Using Spark to explore a NASA Apache web server log.

  3. Entity Resolution - Entity resolution using TF-IDF approaches in Spark.

  4. Movie Recommendation using ALS - Predicting Movie ratings using Spark.

  5. Linear Regression - Predicting Song Year using Linear regression in Spark.

  6. Logistic Regression - Predicting Click Through Rates using Spark. One Hot Encoding, Hashing Explained.

  7. PCA - Running PCA on neuroscience data.
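
As a rough illustration of the one-hot encoding and hashing ideas mentioned under Logistic Regression, here is a minimal sketch in plain Python (not Spark, and not taken from the project notebooks). The sample rows, the function names, and the choice of MD5 as the hash are all assumptions made for illustration only.

```python
import hashlib

def one_hot_dictionary(rows):
    """Map each distinct (feature_id, value) pair to its own column index."""
    distinct = sorted({pair for row in rows for pair in row})
    return {pair: idx for idx, pair in enumerate(distinct)}

def one_hot_encode(row, ohe_dict):
    """Return the sorted column indices that are set to 1 for this row."""
    return sorted(ohe_dict[pair] for pair in row if pair in ohe_dict)

def hash_encode(row, num_buckets):
    """Feature hashing: map each pair to a bucket and count hits per bucket.

    No dictionary is built, so the width is fixed up front, at the cost of
    possible collisions between unrelated features.
    """
    counts = {}
    for feature_id, value in row:
        key = f"{feature_id}:{value}".encode("utf-8")
        bucket = int(hashlib.md5(key).hexdigest(), 16) % num_buckets
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts

# Hypothetical categorical data: each row is a list of (feature_id, value) pairs.
rows = [
    [(0, "mouse"), (1, "black")],
    [(0, "cat"), (1, "tabby"), (2, "mouse")],
    [(0, "bear"), (1, "black"), (2, "salmon")],
]

ohe_dict = one_hot_dictionary(rows)
encoded = [one_hot_encode(row, ohe_dict) for row in rows]
hashed = [hash_encode(row, 4) for row in rows]

print(len(ohe_dict))  # number of one-hot columns
print(encoded)        # active column indices per row
print(hashed)         # bucket -> count per row
```

One-hot encoding needs one column per distinct pair seen in the data, so its width grows with the number of categories. Hashing keeps the width at `num_buckets` regardless, which is the usual reason to prefer it for very high-cardinality features such as those in click-through data.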