[WORK IN PROGRESS (still adding projects as of Jan 15 2017)]
A collection of interesting, memorable, and, well... mundane projects developed for and during my bachelor's and master's degrees at PUPR (San Juan, PR) and JHU (Baltimore, MD), respectively.
The Euclidean distance, Mahalanobis distance, and Naive Bayes classifiers are compared on a data set with l=7 features and N=160 exemplars. Maximum likelihood (ML) estimators are used to compute the mean vectors and covariance matrices of the classes under a Gaussian distribution assumption.
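As a rough illustration of how two of these classifiers can disagree, here is a minimal sketch of minimum-distance classification with ML (divide-by-N) estimates. It is not the project's code: it uses 2 features instead of l=7 so the covariance inverse can be written in closed form, it leaves out Naive Bayes, and all function names are made up for this example.

```python
def ml_mean_cov(points):
    """ML estimates (divide by N, not N-1) of a 2-D mean and covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def euclidean_sq(x, mean):
    """Squared Euclidean distance from x to a class mean."""
    return (x[0] - mean[0]) ** 2 + (x[1] - mean[1]) ** 2


def mahalanobis_sq(x, mean, cov):
    """(x - m)^T S^-1 (x - m), using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def classify(x, params, use_mahalanobis):
    """Assign x to the class whose mean is closest under the chosen distance."""
    def dist(label):
        mean, cov = params[label]
        if use_mahalanobis:
            return mahalanobis_sq(x, mean, cov)
        return euclidean_sq(x, mean)
    return min(params, key=dist)


if __name__ == "__main__":
    # Hypothetical toy classes: A is spread out along x, B is tight around (5, 0).
    class_a = [(-4, 0), (4, 0), (0, -1), (0, 1)]
    class_b = [(4.5, 0), (5.5, 0), (5, -0.5), (5, 0.5)]
    params = {"A": ml_mean_cov(class_a), "B": ml_mean_cov(class_b)}
    x = (3, 0)
    print("Euclidean:", classify(x, params, use_mahalanobis=False))
    print("Mahalanobis:", classify(x, params, use_mahalanobis=True))
```

In this toy setup the point (3, 0) is closer to the mean of B in plain Euclidean terms, but the Mahalanobis distance accounts for A's larger spread along x and assigns it to A.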
PowerBall Analysis (a personal mini-project)
Historical data is used in an attempt to determine whether the ball numbers drawn in the lottery follow a uniform distribution. Any reasonable person would know what to do if they DO NOT follow a uniform distro (i.e., contact the lottery people, right?). Complicating the analysis is the fact that the range of values has changed over the years, so relatively few data points are available for any given range.
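One common way to check for uniformity is Pearson's chi-square goodness-of-fit statistic. The sketch below is only an assumption about how such a check could look, not necessarily the test used in this project. It estimates the p-value by simulating fair drawings rather than using a chi-square table, and the ball range (1 to 69), the 5 balls per drawing, and the function names are all illustrative.

```python
import random
from collections import Counter


def chi_square_stat(counts, n_balls):
    """Pearson chi-square statistic against equal expected counts per ball."""
    total = sum(counts.values())
    expected = total / n_balls
    return sum((counts.get(b, 0) - expected) ** 2 / expected
               for b in range(1, n_balls + 1))


def monte_carlo_p_value(draws, n_balls, sims=1000, seed=0):
    """Estimate a p-value by simulating fair drawings of the same size."""
    rng = random.Random(seed)
    per_draw = len(draws[0])
    observed = chi_square_stat(Counter(b for d in draws for b in d), n_balls)
    hits = 0
    for _ in range(sims):
        fake = Counter()
        for _ in draws:
            # Balls within one drawing are sampled without replacement.
            fake.update(rng.sample(range(1, n_balls + 1), per_draw))
        if chi_square_stat(fake, n_balls) >= observed:
            hits += 1
    return observed, hits / sims


if __name__ == "__main__":
    # Hypothetical data: 300 simulated fair drawings of 5 balls from 1..69.
    rng = random.Random(1)
    draws = [rng.sample(range(1, 70), 5) for _ in range(300)]
    stat, p = monte_carlo_p_value(draws, n_balls=69)
    print(f"chi-square = {stat:.1f}, estimated p = {p:.3f}")
```

A small estimated p-value would suggest the observed counts are unlikely under a fair drawing. Because the ball range has changed over time, each range would need its own test, which is where the shortage of data points starts to hurt.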
Two estimation methods (maximum likelihood and method of moments) are compared at estimating a parameter of a uniformly distributed random variable.
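The description does not say which parameter was estimated, so the sketch below assumes the upper bound theta of a Uniform(0, theta) variable, a standard textbook case. For that case the ML estimate is the sample maximum and the method-of-moments estimate is twice the sample mean. Function names and the simulation settings are made up for illustration.

```python
import random
import statistics


def mle_theta(samples):
    """ML estimate of theta for Uniform(0, theta): the sample maximum."""
    return max(samples)


def mom_theta(samples):
    """Method-of-moments estimate: E[X] = theta / 2, so theta = 2 * mean."""
    return 2 * statistics.fmean(samples)


def compare(theta=10.0, n=50, trials=2000, seed=0):
    """Return the mean squared error of each estimator over repeated trials."""
    rng = random.Random(seed)
    mle_err = mom_err = 0.0
    for _ in range(trials):
        xs = [rng.uniform(0, theta) for _ in range(n)]
        mle_err += (mle_theta(xs) - theta) ** 2
        mom_err += (mom_theta(xs) - theta) ** 2
    return mle_err / trials, mom_err / trials


if __name__ == "__main__":
    mle_mse, mom_mse = compare()
    print(f"MLE MSE: {mle_mse:.4f}, MoM MSE: {mom_mse:.4f}")
```

Under these assumptions the ML estimate is biased low (it can never exceed theta) but typically has a much smaller mean squared error than the unbiased method-of-moments estimate.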
The effect of weather events in the D.C. metropolitan area on the number of bike rentals is explored, and several models based on linear regression are built.
Using an unsupervised machine learning algorithm, the number of classes is estimated. Then, the resulting cluster assignments are compared against the known labels to measure the estimation error. The k-Means clustering algorithm and UCI's wheat seeds data set (N=210, l=7 features) were used.
Samples (N=186), each with 19 features, are generated using stock data from Matlab (which uses Yahoo Finance). An artificial neural network is then trained to determine if the price of a stock will go up or down relative to the support level.
Here I perform automated analysis of microscopic imagery to detect the presence of Karnal bunt spores. Using image processing and an SVM classifier, spores are differentiated from non-spores.
Using clustering algorithms, the number of unique categories of gamma-ray bursts (GRBs) is estimated.