Code and Data for PyData-Hyderabad-Chapter meetup
Updated Aug 31, 2018 - HTML
Labs for the course "Big Data: architectures and data analytics" @ Politecnico di Torino a.y. 2021/22
Data preparation, visualization, feature engineering, and survival classification using PySpark libraries
Developed a model and Spark ML pipeline to identify potential customers who may purchase top-up services in the future.
This repository contains all the code developed during the Big Data processing and Analytics laboratories. Data are processed and analyzed using Hadoop and Spark.
Examples for data science learning
This repo contains code for a restaurant recommendation system for users, based on business rating values.
The goal is to train a linear regression model to predict Deerfoot commute times given weather and accident conditions using Spark RDD and MLlib
This repository contains Apache Spark, Apache Hive, Apache Pig work
Implemented an auto-clustering tool that finds the seed and the number of clusters. Optimization methods: Silhouette, Elbow. Clustering algorithms: k-Means, Bisecting k-Means, Gaussian Mixture. The module includes micro-macro pivoting and dashboards displaying the radius, centroids, and inertia of clusters. Used: Python, PySpark, Matplotlib, Spark MLlib.
Yelp Toronto User Pattern Analysis and Recommender System
A UDF to evaluate a Spark MLlib classification model using PySpark
Apache Spark MLlib example for the seminar 'AI with scala'
Streaming component of the project, which is written with Spark Streaming.