Spatial Queries To MapReduce Translator ( Hadoop GIS functionality for spatial Big Data )
Updated Mar 14, 2015 - Python
Clean up and modularize the pymix code
Highly versatile Python SQL benchmarking tool that generates realistic queries to test the performance of any SQL database schema on any given hardware platform. I was a contributing member of this project, originally developed at Deep Information Sciences.
Coursework from Big Data (CS3390) -- Machine Learning tasks performed using Hadoop, MapReduce, and Spark
Efficient and scalable parallelism using the message passing interface (MPI) to handle big data and highly computational problems.
Implementations of homework and projects for HMC Math189 (Mathematics of Big Data)
A project to learn and understand Bayesian learning, with the goal of classifying tweets as positive or negative
Toying around with data about patients with diabetes and their readmission rates to the hospital.
A movie recommendation engine based on collaborative filtering and content based similarity.
Flagged potential spammers in a preprocessed Amazon Food Reviews dataset using heuristics over reviewers' reviews and ratings in Pig, then applied the Naïve Bayes algorithm in PySpark to identify the actual spammers
Recommends subreddits using a Word2Vec neural network
Galvanize Capstone Project Repo: demystifying the evolution of topics over time.
Bare-bones examples of machine learning in TensorFlow
This repository contains MapReduce extractors to preprocess and extract websites from the Common Crawl corpus.
Big Data project for the ATS subject: a basic parallel implementation of the map-reduce paradigm, with a test that counts words in text files
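For readers unfamiliar with the paradigm named above, here is a minimal sketch of a parallel map-reduce word count. It is not taken from the listed repository: the function names (`map_count`, `reduce_counts`, `word_count`) are hypothetical, and it assumes in-memory text chunks and a thread pool in place of files and a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_count(text):
    """Map step: count the words in one chunk of text."""
    return Counter(text.lower().split())


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts."""
    return left + right


def word_count(chunks, workers=4):
    """Run the map step in parallel, then fold the partial counts together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_count, chunks))
    return reduce(reduce_counts, partials, Counter())


# Hypothetical input: each string stands in for the contents of one text file.
chunks = ["big data big", "data map reduce", "map big"]
totals = word_count(chunks)
```

Because each chunk is counted independently and the merge is associative, the map step could equally be distributed across processes or machines without changing the result.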