Code to accompany Advanced Analytics with Spark
This repo contains a translation of the source code used in Advanced Analytics with Spark from Scala to Python (PySpark). It is maintained as a collection of analysis templates for rapid customization and deployment.
/
├── ch02-intro : Introduction to PySpark
├── ch03-recommender : Recommender system
├── ch04-rdf : Predictor system
├── ch05-kmeans : Anomaly detector system
├── ch06-lsa : Semantic analysis
├── ch07-graph : Network analysis
├── ch08-geotime : Geospatial and temporal analysis
├── ch09-risk : Financial risk modeling and simulation
├── ch10-genomics : Genomics data analysis
└── ch11-neuro : Imaging data analysis