Visual, interactive queries against big databases
Updated Jul 16, 2024 - Java
Big data analysis of NYC fire incident data to examine the causal relationship between fires, government inspections, socioeconomic factors, and the environment. Uses Hadoop MapReduce for data pre-processing, Trino for complex queries, and Tableau for visualizations and interactive dashboards.
💾 Welcome to the Big Data Analytics Repository! 📚✨ Immerse yourself in a carefully curated reservoir of knowledge on Big Data Analytics. 🌐💡 Explore the intricacies of deriving insights from vast datasets and navigating powerful analytics tools. 🚀🔍
A suite of benchmark applications for distributed data stream processing systems
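"DNA Meta-Base Catalysis and Peptide Computing": in evolutionary computation, optimizing software function files through PDE metabolism with DNA semantic meta-base index encoding is an effective way of evolving.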
The current repository contains all the code developed during the Big Data processing and Analytics laboratories. Data are processed and analyzed using Hadoop and Spark
Implementing parallel processing techniques for efficient handling of Big Data through practical activities.
Easy Machine Learning is a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real world tasks.
This repo explains an implementation of the MapReduce algorithm on Airbnb data to understand consumer satisfaction by region and by country. It is an effective use of parallel distributed computing to solve big data problems.
Eskimo is a state-of-the-art Big Data Infrastructure and Management Web Console to build, manage, and operate Big Data 2.0 Analytics clusters on Kubernetes. This is the git repository of Eskimo Community Edition.
Reservoir sampling for group-by queries on the Flink platform, for effectively answering single-aggregate queries.
Big Data Pipeline | Querying Data from Hive Table Phase
This repository contains a project showcasing the use of Big Data technologies in processing and visualizing real-time data from an eCommerce electronics store using tools such as Apache Kafka, Spark Streaming, Spark SQL, HBase, and Plotly
Real-time click event project with Elasticsearch
Adaptive Decision Forest (ADF) is an incremental machine learning framework that produces a decision forest to classify new records. ADF can classify new records even if they are associated with previously unseen classes. ADF is also capable of identifying and handling concept drift; it does not, however, forget previously gained kn…
Java Hadoop MapReduce code for my Big Data Analytics Project using the Titanic dataset
Fetch data from Twitter and push it through Kafka to Spark then HDFS
SUTD 2021 50.043 Database and Big Data Systems Code Dump
Workflow management system for the automated and distributed analysis of large-scale experimental data.
The objective of the project is to gather, process, and analyze publicly available COVID-related data acquired from reliable sources such as the CDC and JHU. The application finds strong correlations between features such as the number of cases across different cohorts and how they affect the COVID case count in that particular geography, over time…