Skip to content

siddsach/Clusterviz

Repository files navigation

OPINION BUBBLE ENGINE

All of the math and machine learning is in graph_data_construction.py

alt text

Motivated by research on casual information visualization, the Flipside team and I built an interactive visualization of the different sides of an issue, based on people's votes on a particular claim. We combined a d3.js and vue.js front-end with regular HTTP requests to the backend to dynamically update this visualization as people voted.

Here, I've put up all of machine learning and math code for this that I developed. All of the math and machine learning is in graph_data_construction.py The methodology for the clustering went as follows:

  • Code to process votes of form (user_id, sentence_id, vote), cluster user into groups and calculate statistics about those groups.
  • Construct Binary Agree/Disagree Votes Matrix
  • Perform Dimension Reduction (PCA, metric and non-metric MDS)
  • Perform Clustering on new representations (KMeans, MeanShift)
  • Automatic tune Hyperparameters based on Sillhouette Score
  • Construct Clustering Rationale using Statistical Feature Selection
  • Processes 2000 votes in <0.1 seconds using heavy vector parallelization

About

Automic clustering library for Flipside

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published