pyspark
Here are 103 public repositories matching this topic...
From image to text - a handwriting recognition tool prototype using Image Classification - Deep Learning in DataBricks.
Updated Feb 1, 2024 - HTML
Achieved real-time analysis of over 10,000 Ethereum transactions, scraped with Selenium and analyzed with PySpark, while reducing database storage cost by storing only the analytics instead of all the data. The service is packaged as a Docker container that can be pulled onto any machine as a local image and run.
Updated May 21, 2023 - HTML
Repository containing self-taught data engineering concepts and a hands-on project.
Updated Jun 13, 2022 - HTML
Final-year research project that recommends movies from user behavioral data, using the Big 5 personality model and user rating data. The model uses K-Means clustering on Big 5 scores and 3 ALS models to recommend movies.
Updated Mar 11, 2021 - HTML
UCR CS179G Database Senior Design
Updated Mar 14, 2017 - HTML
Distributed ML: Predicting Churn from Click Data with Apache Spark
Updated Oct 25, 2019 - HTML
Using PySpark for handling big data and machine learning
Updated Aug 4, 2021 - HTML
Final-year major project on big data analysis of the Instacart dataset, ending with product bundle recommendation using PySpark (for clustering) and bigrams (for recommendation).
Updated Mar 4, 2021 - HTML
Use machine learning (an NLP Transformers model) to identify negative-sentiment tweets about the JWST space mission
Updated Nov 22, 2022 - HTML
A project on classifying GitHub README sections using machine learning
Updated May 11, 2022 - HTML
An assignment on text preprocessing, including tokenization and stop word removal
Updated May 1, 2022 - HTML
Portfolio of data science projects I completed for academic, self-learning, and hobby purposes.
Updated Feb 24, 2019 - HTML
Final project for the Data Scientist Nanodegree, where the goal is to predict churn for a fictional streaming service called Sparkify.
Updated Jul 6, 2023 - HTML
Binary classification project in PySpark on an AWS EMR cluster to predict customer churn.
Updated Jun 6, 2022 - HTML