- Online URL: https://cmu-ids-2020.github.io/fp-classification_clarification/
- Team members:
  - Contact person: Juliette Wong (jnwong@andrew.cmu.edu)
  - Christian Deverall (cdeveral@andrew.cmu.edu)
  - Nathan Jen (njen@andrew.cmu.edu)
  - Laura Howard (lmhoward@andrew.cmu.edu)
- Track: Narrative
- Link to Writeup
- Link to Video
The goal of our project is to explain classification algorithms to readers without prior knowledge of machine learning. Our solution is an interactive “scrollytelling” narrative centered on a case study in which readers see whether education and age have an impact on income level. The narrative first introduces the dataset, allowing users to understand the basis of our models through exploratory data analysis. Next, we cover some of the challenges common to all machine learning tasks, such as the role of training data and overfitting. Finally, we introduce three classification algorithms: K-Nearest Neighbors, Decision Trees, and Logistic Regression. In these sections, we include interactive visualizations that let the user see how changing the hyperparameters affects the results of the model. We hope that by reading and interacting with the article, readers will understand at a high level how these classification algorithms work, and that there are multiple ways to achieve similar outcomes with machine learning.
I primarily worked on the visualizations for the report. I created the visualizations for the EDA and mystery man sections using Vega-Lite, and I helped use Idyll to make the decision tree visualization interactive. Additionally, I created the content (text and graphs) for the logistic regression section, and ran the algorithms to compute accuracies for the model selection section.
I was primarily responsible for the styling of our web application, which meant writing custom CSS and bolding text. Additionally, since I was the only team member with JavaScript experience, I built the custom React components for our application and implemented much of the app's functionality. I also helped build the decision tree in D3, and brainstormed and helped out with the EDA and mystery person sections. Finally, I was responsible for figuring out how to deploy our Idyll application to GitHub Pages.
I mainly handled the data side of things, including creating the dataset and the data structures for the decision trees. I created the interactive visualizations for the decision tree boundaries and KNN in Vega-Lite and helped to visualize the trees in D3. I also wrote some of the text that explains the inner workings of the classification algorithms.
For this project, I was primarily in a product management role. I managed the team's shared folders, organized meetings, and helped to define the narrative. I also helped with the layout design, sourced images and GIFs, wrote narrative text, and inserted them into the Idyll application.
Our project progressed relatively slowly in the initial stage because we had to gain familiarity with the Idyll markup language. The major challenge here was integrating Idyll with D3 and Vega-Lite components, for which documentation was limited. For the dataset, we initially used the Iris flower dataset, but we later settled on the 1994 Census dataset to perform income prediction. We created a plan for the three algorithms we wanted to cover and the technologies we would use to visualize them. This allowed us to work on the text and the visualizations simultaneously. To keep the transitions from text to visualizations seamless, we revised the text iteratively as we created updated versions of our visualizations. In the final stage of the project, we focused on making stylistic adjustments in Idyll based on group feedback and adding decoration to our visualizations.
- The URL at the top of this readme needs to point to your application online. It should also list the names of the team members.
- A completed proposal. The contact should submit it as a PDF on Canvas.
- Develop a prototype of your project.
- Create a 5 minute video to demonstrate your project and list any questions you have for the course staff. The contact should submit the video on Canvas.
- All code for the project should be in the repo.
- A 5 minute video demonstration.
- Update Readme according to Canvas instructions.
- A detailed project report. The contact should submit the video and report as a PDF on Canvas.
- Install npm (follow this link)
- Install idyll
  - `npm install -g idyll`
- Install dependencies for idyll
  - `npm i`
- Use es5 version for vega-lite and vega-embed (must be manually done)
  - For vega-lite: Go to `node_modules/vega-lite`, copy `vega-lite.js` from `build-es5` directory to `build` directory
  - For vega-embed: Go to `node_modules/vega-embed`, copy `vega-embed.js` from `build-es5` directory to `build` directory
- Run the project
  - `idyll`
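
The manual ES5 copy step above could be scripted so it does not have to be repeated by hand after every `npm i`. The following is a minimal sketch, not part of the project itself: the function names `copy_es5_build` and `use_es5_builds` are hypothetical, and it assumes the installed packages have the `build-es5` and `build` directories described in the steps above.

```shell
#!/usr/bin/env bash
# Sketch only: automates the manual ES5 copy described in the setup steps.
# Function names are hypothetical; the directory layout is assumed from this README.

# Copy <pkg>.js from build-es5/ into build/ for a single package.
#   $1 = package name (e.g. vega-lite)
#   $2 = node_modules directory (defaults to ./node_modules)
copy_es5_build() {
  local pkg="$1"
  local modules_dir="${2:-node_modules}"
  local src="$modules_dir/$pkg/build-es5/$pkg.js"
  local dest_dir="$modules_dir/$pkg/build"

  # Fail loudly if the ES5 build is not where this README expects it.
  if [ ! -f "$src" ]; then
    echo "missing ES5 build: $src" >&2
    return 1
  fi

  mkdir -p "$dest_dir"
  cp "$src" "$dest_dir/$pkg.js"
}

# Apply the copy to both packages named in the setup steps.
#   $1 = node_modules directory (defaults to ./node_modules)
use_es5_builds() {
  local modules_dir="${1:-node_modules}"
  copy_es5_build vega-lite "$modules_dir" &&
    copy_es5_build vega-embed "$modules_dir"
}
```

If these functions are saved to a file and sourced from the project root, calling `use_es5_builds` after `npm i` would perform both copies, after which `idyll` runs the project as before. The copy overwrites the files in `build`, so it has to be redone whenever the dependencies are reinstalled.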