Polldentify algorithms

Polldentify is a web app that tracks and traces the sources of pollution within a US city based on its geographical location and the wind direction at a given moment in history. It compiles 15 years of data, runs the data through an algorithm inspired by the Gaussian dispersion model of pollution, and showcases the results. This GitHub repository open-sources all the programs that we use to collect, compile, and analyze the data.

##Data_collection

Crawls the concentrations of pollutants (ozone, sulphur dioxide, nitrogen dioxide, carbon monoxide, PM10) from the EPA and saves them as a dataframe (which can be converted to CSV through pandas `DataFrame.to_csv`). In case your computer doesn't have enough memory to hold all the downloaded pollutant files at once, this also allows you to download each pollutant one at a time.
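The one-pollutant-at-a-time idea can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the repository's crawler: `fetch_rows` is a hypothetical callable standing in for whatever actually downloads the EPA data, and the pollutant keys and file names are assumptions.

```python
import csv
from pathlib import Path

# Assumed keys for the five pollutants named above (hypothetical names).
POLLUTANTS = ["ozone", "sulphur_dioxide", "nitrogen_dioxide", "carbon_monoxide", "pm10"]


def save_one_at_a_time(fetch_rows, out_dir, pollutants=POLLUTANTS):
    """Fetch and write each pollutant separately.

    Only one pollutant's rows are held in memory at any time.
    fetch_rows is a hypothetical callable: pollutant name -> list of dict rows.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name in pollutants:
        rows = fetch_rows(name)
        if not rows:
            continue  # nothing returned for this pollutant
        path = out_dir / f"{name}.csv"
        with path.open("w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(path)
    return written
```

With pandas, the same loop would build one dataframe per pollutant and call `DataFrame.to_csv` on each before moving on to the next.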

- openfile.m: MATLAB program that downloads NetCDF (.nc) wind data files from NOAA and saves the data as a CSV file.
- Extracts the wind data from the file generated by openfile.m.
- Downloads altitude information for each sampled longitude and latitude point through the Google API and saves it as a CSV file.
- Combines the wind data, altitude, longitude, latitude, and air pollutant data into one final dataframe, indexed by date, and saves it as a CSV file.
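As a rough illustration of the final combining step, the sketch below joins several date-indexed tables and renders the result as CSV text. It uses only the standard library, and the table layout (`{date: {column: value}}`), the inner-join behaviour, and all column names are assumptions made for the example, not the repository's actual schema.

```python
import csv
import io


def combine_by_date(*tables):
    """Merge {date: {column: value}} tables into one table keyed by date.

    Only dates present in every table are kept (an inner join).
    ISO-formatted date strings sort chronologically, so sorted() is enough.
    """
    common = set(tables[0])
    for table in tables[1:]:
        common &= set(table)
    combined = {}
    for date in sorted(common):
        row = {}
        for table in tables:
            row.update(table[date])
        combined[date] = row
    return combined


def to_csv_text(combined):
    """Render the combined table as CSV text with a leading date column."""
    columns = []
    for row in combined.values():
        for col in row:
            if col not in columns:
                columns.append(col)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["date"] + columns, lineterminator="\n")
    writer.writeheader()
    for date, row in combined.items():
        writer.writerow({"date": date, **row})
    return buf.getvalue()


# Hypothetical sample values, for illustration only.
wind = {"2010-01-01": {"wind_speed": 3.2}, "2010-01-02": {"wind_speed": 4.0}}
ozone = {"2010-01-02": {"ozone": 0.03}, "2010-01-03": {"ozone": 0.04}}
combined = combine_by_date(wind, ozone)
```

An inner join drops dates that are missing from any source. If those dates should be kept and filled in later, an outer join would be the alternative.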

##Data_finalize

- In case of missing data, we either interpolate the data or use a randomizing function to fill in the best possible value.
- Finalizes the dataframe so that it can be passed directly into an algorithm.
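A minimal sketch of the interpolation option is shown below, assuming evenly spaced samples with missing readings marked as `None`. It does not cover the randomized fill, and the choice to copy the nearest known value into leading and trailing gaps is an assumption of this example rather than something the repository specifies.

```python
def interpolate_gaps(values):
    """Fill None entries by linear interpolation between known neighbours.

    Leading and trailing gaps, which have only one known neighbour,
    are filled with the nearest known value (an assumption of this sketch).
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    # Gaps before the first and after the last known sample.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between the surrounding known samples.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled
```

For a pandas dataframe, `DataFrame.interpolate` provides comparable linear filling.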

##Algorithms

This file contains the general algorithm for manipulating the final dataframe. For a location (x, y, z), C is the pollutant concentration at that location and Q is the emission rate of the source.

A linear regression algorithm predicts future pollutant concentrations. It is open for improvement, since linear regression is a very basic machine learning algorithm.
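For orientation, the textbook Gaussian plume relation between C and Q can be sketched as below. This is the standard form with ground reflection, not necessarily the variant this repository implements. The wind speed `u`, effective source height `H`, and dispersion parameters `sigma_y` and `sigma_z` are symbols introduced here; in practice the sigmas depend on the downwind distance x and atmospheric stability, so this sketch simply takes them as inputs.

```python
import math


def plume_concentration(Q, u, y, z, H, sigma_y, sigma_z):
    """Textbook Gaussian plume concentration with ground reflection.

    C = Q / (2*pi*u*sigma_y*sigma_z)
        * exp(-y^2 / (2*sigma_y^2))
        * [exp(-(z-H)^2 / (2*sigma_z^2)) + exp(-(z+H)^2 / (2*sigma_z^2))]

    y is the crosswind offset and z the height of the receptor.
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    crosswind = math.exp(-y**2 / (2 * sigma_y**2))
    vertical = (math.exp(-(z - H) ** 2 / (2 * sigma_z**2))
                + math.exp(-(z + H) ** 2 / (2 * sigma_z**2)))
    return Q / (2 * math.pi * u * sigma_y * sigma_z) * crosswind * vertical


def emission_rate(C, u, y, z, H, sigma_y, sigma_z):
    """Invert the plume formula for Q, using the fact that C is linear in Q."""
    per_unit = plume_concentration(1.0, u, y, z, H, sigma_y, sigma_z)
    if per_unit == 0.0:
        raise ValueError("receptor is too far from the plume to estimate Q")
    return C / per_unit
```

Because the concentration is linear in Q, a measured concentration at a receptor can be turned back into an estimated source strength, which is the sense in which such a model can help trace a source.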

