The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Updated Feb 27, 2024 - Scala
GameTuner Scala Stream Collector is a project for collecting raw events from trackers
GameTuner BigQuery Loader is an application that loads enriched events into BigQuery
The U.S. Department of Transportation's (DOT) Bureau of Transportation Statistics tracks the on-time performance of domestic flights operated by large air carriers. Summary information on the number of on-time, delayed, canceled, and diverted flights is published in DOT's monthly Air Travel Consumer Report and in this dataset of 2015 flight dela…
Media Recommendations Using Big Data Analytics.
Space-filling curve library for Spark
Spark SQL analysis of grocery and medication prices from Wal-mart and competitors to determine empirically which grocery store is the most cost-effective
I used big data tools (Hive, Spark RDDs, and Spark SQL) to solve challenging big data processing tasks efficiently. I processed four types of real data: standard multi-attribute data (video game sales), time series data (a Twitter feed), bag-of-words data, and a news aggregation corpus.
Titian: Data Provenance Support in Spark (VLDB 2016) / Adding Data Provenance Support to Apache Spark (VLDB Journal)
Geo-spatial data analysis on data collected from a taxi-cab company.
A set of tasks solved in a Big Data Algorithms course
Implementation of a simple Bloom filter
(Semester 4) Big Data Analytics - End Semester Project
Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
Performance of Aircraft in the US from 1987 to 2008.
This repository was created by Dharshan Kumar K S and Siva Prakash as part of our semester project for the 'Big Data Analysis' course