Discrimination of Wine Varietals Using Indicator Displacement Assay (Anslyn Freshman Research Initiative): Reprocessing of Data Using Python
Wine varietals were used as test mixtures to optimize chemical array sensors created in the Anslyn labs (FRI).
The sensor arrays work by indicator displacement: components of the test mixture displace the indicator, which changes the array's UV-vis absorbance. Absorbance readings were taken at several wavelengths, so the data are multivariate and amenable to machine learning techniques such as principal component analysis (PCA) and linear discriminant analysis (LDA).
The following plots were obtained from analysis in Python and were mostly consistent with the results obtained using the statistical software XLStat (the publication for the original study can be found here and a pdf can be found here).
Explained variance and explained variance ratios matched those obtained previously. Here is the 3D plot of the transformed data (linear discriminant analysis):
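As a rough illustration of how multivariate absorbance readings feed into PCA, here is a minimal sketch using scikit-learn. The data are synthetic stand-ins, not the study's measurements: the array sizes, the noise level, and the choice to standardize each channel first are all assumptions made for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical layout: one row per wine sample, one column per
# sensor/wavelength absorbance reading. All values are synthetic.
rng = np.random.default_rng(0)
n_varietals, n_replicates, n_channels = 4, 6, 12
centers = rng.normal(size=(n_varietals, n_channels))
X = np.repeat(centers, n_replicates, axis=0) + rng.normal(
    scale=0.1, size=(n_varietals * n_replicates, n_channels)
)
y = np.repeat(np.arange(n_varietals), n_replicates)  # varietal label per row

# Standardize each channel (an assumption of this sketch), then project
# the samples onto the first three principal components.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=3)
pca_scores = pca.fit_transform(X_scaled)

print(pca.explained_variance_)        # variance captured by each component
print(pca.explained_variance_ratio_)  # the same, as a fraction of the total
```

The two printed attributes are the quantities one would compare against the output of another statistics package.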
The Jupyter Notebook can be found here.
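For readers without the notebook at hand, the LDA step and a 3D plot of the transformed data might look roughly like the sketch below. It uses synthetic data with four hypothetical varietals (at least four classes are needed to get three discriminant axes); the variable names, sizes, and output file name are illustrative and not taken from the original analysis.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: the figure is only saved to disk
import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for the absorbance matrix (rows: samples, columns: readings).
rng = np.random.default_rng(1)
n_classes, n_reps, n_readings = 4, 6, 12
class_means = rng.normal(size=(n_classes, n_readings))
X_lda = np.repeat(class_means, n_reps, axis=0) + rng.normal(
    scale=0.1, size=(n_classes * n_reps, n_readings)
)
labels = np.repeat(np.arange(n_classes), n_reps)

# LDA yields at most (n_classes - 1) discriminant axes, here 3.
lda = LinearDiscriminantAnalysis(n_components=3)
lda_scores = lda.fit_transform(X_lda, labels)
print(lda.explained_variance_ratio_)

# 3D scatter of the discriminant scores, one marker group per class.
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for k in range(n_classes):
    pts = lda_scores[labels == k]
    ax.scatter(pts[:, 0], pts[:, 1], pts[:, 2], label=f"varietal {k}")
ax.set_xlabel("LD1")
ax.set_ylabel("LD2")
ax.set_zlabel("LD3")
ax.legend()
fig.savefig("lda_scores_3d.png")
```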
The same dataset was analyzed using spectral co-clustering.
First, the correlations between the wines were calculated using the pandas correlation function. The correlations were then analyzed for clusters using [spectral co-clustering](http://scikit-learn.org/stable/auto_examples/bicluster/plot_spectral_coclustering.html) provided by the scikit-learn package. Plotting the correlations of the wines by cluster results in the following (the plot on the right was rearranged to show the clusters).
Re-plotting shows that the wines within each cluster were mostly of the same varietal.
The Jupyter notebook for this is here.
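The correlation and co-clustering steps described above can be sketched as follows. The data are synthetic and the sample names are hypothetical. One choice here is an addition of this sketch rather than something stated in the write-up: the correlations are rescaled from [-1, 1] to [0, 1] before fitting, because the algorithm's normalization divides by the square roots of row and column sums, which must therefore be positive.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import SpectralCoclustering

# Synthetic readings: 3 hypothetical varietals x 5 replicates, 40 readings each.
rng = np.random.default_rng(0)
n_varietals, n_reps, n_readings = 3, 5, 40
base = rng.normal(size=(n_varietals, n_readings))
spectra = np.repeat(base, n_reps, axis=0) + rng.normal(
    scale=0.2, size=(n_varietals * n_reps, n_readings)
)
names = [f"varietal{v}_rep{r}" for v in range(n_varietals) for r in range(n_reps)]

# DataFrame.corr() correlates columns, so put one wine per column.
df = pd.DataFrame(spectra.T, columns=names)
corr = df.corr()  # wine-by-wine Pearson correlation matrix

# Rescale to [0, 1] (sketch-specific assumption, see the note above).
affinity = (corr.to_numpy() + 1.0) / 2.0
model = SpectralCoclustering(n_clusters=n_varietals, random_state=0)
model.fit(affinity)

# Reorder rows and columns so wines in the same cluster sit next to each other.
order = np.argsort(model.row_labels_, kind="stable")
corr_sorted = corr.iloc[order, order]
print(pd.Series(model.row_labels_, index=names))
```

Plotting `corr` and `corr_sorted` side by side (for example with `matplotlib.pyplot.pcolor`) gives an unsorted view next to a cluster-sorted view of the same matrix.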
PCA example using the iris dataset: http://scikit-learn.org/stable/auto_examples/decomposition/plot_pca_iris.html#sphx-glr-auto-examples-decomposition-plot-pca-iris-py
Using Python for Research (HarvardX: PH526x) https://courses.edx.org/courses/course-v1:HarvardX+PH526x+3T2016/course/