Analyze fraudulent credit card transactions using Kafka, Spark Streaming, and Random Forest Classifier algorithms in PySpark
Updated Aug 12, 2024 - Python
Explore essential MapReduce design patterns for big data processing! This repository includes practical implementations of patterns from the "MapReduce Design Patterns" book, complete with examples across summarization, filtering, organization, joins, and more.
This project analyzes a retail dataset using Apache Spark.
Predicting the Fare on a Billion Taxi Trips with BigQuery. How long does it take, and how much does it cost, to analyze a billion taxi trips and train a model on them in the cloud?
This analysis was performed as a final project for the Rutgers MSBA course "Big Data Analytics". It consists of a data analysis and machine learning models built on open-source research data collected by researcher Milanz Dravkovic from a single pharmacy's point-of-sale system.
Big Data Analysis on a Covid-19 Dataset
Docker setup for the Hortonworks Data Platform Sandbox, the Hortonworks Data Flow Sandbox, and the Sandbox Proxy
Detection of credit card fraud and big data analysis of transactions. (The CSV files were too large to upload to GitHub.)
🎓 Implementation of all the milestones of the Big Data Analytics course @ UniPi Department of Computer Science
Your go-to repository for interview preparation, use-case, research and practice where you’ll find curated notes, sample QA and code strategies in notebooks, text and pdf to help you navigate next level technical stacks. | BigData | GenAI | AI-ML | Java | Docker | DB | Cloud | Data Science | SQL | CPP
Brain tumor detection and classification, a hospital management website that takes real-time data from users, and Tableau visualizations of reports
An analysis of Amazon reviews to determine whether there is bias in the Vine review program.
Midterm exam (UTS) for the Big Data Practicum course
Project related to big datasets with machine learning
Spark Home Sales Analysis utilizes Apache Spark to explore and analyze home sales data, providing insights into average prices based on various criteria. The project employs Spark SQL queries for efficient data processing and is designed for easy setup and usage.
Implementation of the Page Rank Algorithm in Python
The aim of the project is to gain insights into customers' booking behaviors, preferences, and decision-making processes when reserving hotel accommodations. The data analysis process involves cleaning and organizing the hotel reservation data and generating visualizations and reports to present the findings.
Explore Python for data analysis with this repository. Build a linear regression model to predict dengue cases and use machine learning to classify data from the Iris dataset.