Binary file added images/springbreak/fullspread.PNG
Binary file added images/springbreak/onebeach.PNG
Binary file added images/week1/LondonLibraries.PNG
Binary file added images/week1/SeoulLibraries.PNG
Binary file added images/week1/baseball.png
Binary file added images/week1/faces_of_cattle.PNG
Binary file added images/week1/violent_crime_in_the_US.gif
Binary file added images/week10/abstract.PNG
Binary file added images/week10/big.PNG
Binary file added images/week10/new.PNG
Binary file added images/week10/objectiveresults.PNG
Binary file added images/week10/subjectiveresults.PNG
Binary file added images/week10/traditional.PNG
Binary file added images/week11/abstract.PNG
Binary file added images/week11/faces.PNG
Binary file added images/week11/points.PNG
Binary file added images/week11/process.PNG
Binary file added images/week12/abstract.PNG
Binary file added images/week12/encodings.PNG
Binary file added images/week12/logrrp.PNG
Binary file added images/week12/scenario.PNG
Binary file added images/week13/abstract.PNG
Binary file added images/week13/experiments.PNG
Binary file added images/week14/abstract.PNG
Binary file added images/week14/after.PNG
Binary file added images/week14/before.PNG
Binary file added images/week14/figure6.PNG
Binary file added images/week14/figure7.PNG
Binary file added images/week14/figure8.PNG
Binary file added images/week14/proofofconcept.PNG
Binary file added images/week2/dailydoses.PNG
Binary file added images/week2/percentvacinnatedbystate.PNG
Binary file added images/week2/whenwillwebevaccinated.PNG
Binary file added images/week3/DisneyProducts.jpg
Binary file added images/week4/birds.PNG
Binary file added images/week4/firerisk.PNG
Binary file added images/week4/humanimpact.PNG
Binary file added images/week4/unsurveyed.PNG
Binary file added images/week4/wildlifedensity.PNG
Binary file added images/week4/year1.PNG
Binary file added images/week4/year5.PNG
Binary file added images/week5/parallel_axes.gif
Binary file added images/week6/timesearcher2.PNG
Binary file added images/week6/video.PNG
Binary file added images/week8/bitmaps.PNG
Binary file added images/week8/grids.PNG
Binary file added images/week8/timeseries.PNG
Binary file added images/week9/itunes_visualizer.gif
Binary file added images/week9/mam_partmotion.gif
Binary file added images/week9/mam_pianoroll.gif
Binary file added images/week9/mam_tonalcompass.gif
Binary file added images/week9/mpm.gif
25 changes: 25 additions & 0 deletions springbreak.md
@@ -0,0 +1,25 @@
Spring Break - Spring Break Phone Data to Show the Potential Spread of COVID-19
===
By Andrew Nolan (3-22-21)

![The Possible Spread of COVID from Fort Lauderdale Beach](./images/springbreak/fullspread.PNG)

I was not sure if we needed to submit a reflection this week since it is spring break, but I thought it would be fun to share a cool spring break visualization since it's currently WPI's spring break and starting next week all reflections will be on academic papers.

This article was actually published last year, and it was a bit ahead of its time. Now we are aware of how dangerous the spread of COVID-19 can be, but last year everything was new. A group of data visualization researchers at Tectonix Geo created a model showing how just the people gathered at one beach in Florida during spring break could spread COVID-19 across the country. They used cell phone data aggregated by the location data company X-Mode Social. This data showed 5000 phones at a beach in Fort Lauderdale, Florida during peak spring break time in March 2020. The visualization then zooms out the timeline to show where the phones were the week before and after spring break. This reveals the potential spread that COVID could have from just one beach of spring breakers.
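To make the idea concrete, here is a minimal Python sketch of the kind of filtering described above: pick out the devices seen at the beach during a time window, then gather every ping from those devices to trace where they went. The ping format, the bounding box, the dates, and the function names are all my own assumptions for illustration; I do not know how Tectonix or X-Mode actually processed their data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical ping format: (device_id, timestamp, latitude, longitude).
# The bounding box and dates are illustrative placeholders, not values
# taken from the Tectonix or X-Mode data.
BEACH_BOX = (26.10, 26.14, -80.11, -80.09)  # (min_lat, max_lat, min_lon, max_lon)


def in_box(lat, lon, box):
    """Return True if the coordinate falls inside the bounding box."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def devices_at_beach(pings, box, start, end):
    """Return the set of device ids seen inside the box during [start, end]."""
    return {
        device
        for device, ts, lat, lon in pings
        if start <= ts <= end and in_box(lat, lon, box)
    }


def paths_for(pings, devices):
    """Group every ping of the selected devices into time-ordered paths."""
    paths = defaultdict(list)
    for device, ts, lat, lon in pings:
        if device in devices:
            paths[device].append((ts, lat, lon))
    for path in paths.values():
        path.sort()  # tuples sort by timestamp first
    return dict(paths)


# Made-up toy pings: device "a" visits the beach and then travels north,
# device "b" is never at the beach.
pings = [
    ("a", datetime(2020, 3, 14), 26.12, -80.10),
    ("a", datetime(2020, 3, 21), 40.71, -74.01),
    ("b", datetime(2020, 3, 14), 41.88, -87.63),
]
beach_devices = devices_at_beach(
    pings, BEACH_BOX, datetime(2020, 3, 10), datetime(2020, 3, 17)
)
paths = paths_for(pings, beach_devices)
```

Drawing each path in `paths` as a line on a map would give something like the zoomed-out view of the visualization, although the real thing works with thousands of devices and far denser location data.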

Evidently (and unfortunately), the warnings from Tectonix Geo were not heeded, because COVID spread widely and people are still travelling for spring break this year. But as a data visualization it is very effective at showing how spread can occur. It works as a network showing connections of where the phones travel across the U.S. It ties in nicely with our recent reading of chapter 8 in the textbook, since this is a very clear example of arranging spatial data.

Here you can see the collection of 5000 phones at the beach, without social distancing:

![The Possible Spread of COVID from Fort Lauderdale Beach](./images/springbreak/onebeach.PNG)

Now you can see the paths these phones have travelled in the dates surrounding the week of spring break:

![The Possible Spread of COVID from Fort Lauderdale Beach](./images/springbreak/fullspread.PNG)


Sources
---
1. Thousands of spring breakers traveled from one Florida beach to cities across the US. Mapping their phone data shows the importance of social distancing amid the coronavirus outbreak: https://www.businessinsider.com/coronavirus-florida-spring-break-location-data-spread-social-distancing-2020-3
2. Tectonix's Tweet: https://twitter.com/TectonixGEO/status/1242628347034767361?s=20
86 changes: 86 additions & 0 deletions week1.md
@@ -0,0 +1,86 @@
Week 1 - A Discussion of Chernoff Faces
===
By Andrew Nolan (2-8-21)

#### Intro
Last summer I took the course CS 548 Knowledge Discovery and Data Mining. In one of the lectures we talked about data visualizations, and in one slide we briefly discussed Chernoff Faces. For some reason I think about this data visualization way more than I should. For my assignment 1 project I made some Chernoff Face visualizations of Iris species data in d3. Since I've been thinking about them all week, I felt like an introduction to and discussion of the applications of the visualization would be a good choice for this week's reflection.

#### Background

For those unfamiliar with the data vis, Chernoff Faces were invented in 1973 by the American mathematician Herman Chernoff. Chernoff Faces provide a way to visualize multivariate data by mapping data features to visual features on the face [1][11]. The benefits of this visualization come from the claim that humans can easily recognize faces and identify small changes between them. This is useful if the goal of the visualization is pattern recognition, outlier detection, or clustering similar data objects. However, the scale and value of the data are often obscured by this vis. Without intimate knowledge of which values are mapped to which facial feature, it is hard to decipher more than which objects are similar or outliers.
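The mapping step described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the column names, the facial-feature names, the two data rows, and the choice of min-max scaling are all hypothetical, and this is not how any particular Chernoff Face library does it.

```python
# Hypothetical mapping from data columns to facial features.
FEATURE_MAP = {
    "sepal_length": "face_height",
    "sepal_width": "face_width",
    "petal_length": "eye_size",
    "petal_width": "mouth_curve",
}


def min_max_scale(rows, columns):
    """Rescale each column to [0, 1] so every facial feature uses the same range."""
    scaled = [dict() for _ in rows]
    for col in columns:
        values = [row[col] for row in rows]
        low, high = min(values), max(values)
        span = high - low
        for out, row in zip(scaled, rows):
            # A constant column carries no information; place it mid-range.
            out[col] = 0.5 if span == 0 else (row[col] - low) / span
    return scaled


def to_faces(rows, feature_map):
    """Return one dict of facial-feature parameters per data row."""
    scaled = min_max_scale(rows, list(feature_map))
    return [
        {feature_map[col]: value for col, value in row.items()}
        for row in scaled
    ]


# Two made-up flower measurements, used only to exercise the functions.
rows = [
    {"sepal_length": 5.0, "sepal_width": 3.0, "petal_length": 1.0, "petal_width": 0.2},
    {"sepal_length": 7.0, "sepal_width": 3.0, "petal_length": 5.0, "petal_width": 2.2},
]
faces = to_faces(rows, FEATURE_MAP)
```

Each dict in `faces` would then drive the drawing code, with 0 meaning the smallest version of a feature and 1 the largest. Note that after scaling the original units are gone, which is exactly why the scale and value of the data are hard to read back from a face.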

From what I can tell, in most languages you would have to fork someone's GitHub repo or build your own system for visualizing data with Chernoff Faces. But if you are a fan of R, there is a package called aplpack that includes easy-to-use Chernoff Face code. This library is commonly used for real world Chernoff Faces. You can actually see this library in use in all of the examples below. You can learn how to use it in the article "How to Visualize Data with Cartoonish Faces ala Chernoff" [6].


#### Real World Examples

Considering that most people have probably not heard of this visualization, it's unsurprising that it is rarely used for real world applications. However, I wanted to share a few practical examples that I was able to find.

In 2006 a Social Science Statistics blogger shared a baseball blogger's Chernoff Face representation of 2005 National League baseball statistics [4][5]. The faces below map the following statistics to facial features:
- Win percent -> face height, smile curve, and hair styling
- Hits -> face width, eye height, and nose height
- Home runs -> face shape, eye width, and nose width
- Walks -> mouth height, hair height, and ear width
- Stolen bases -> mouth width, hair width, and ear height.

![2005 National League Baseball Statistics](./images/week1/baseball.png)

While these statistics may be meaningful, it's hard to determine anything from these faces. In 2005 the winner of the National League was the Houston Astros (who then went on to lose to the White Sox in the World Series). In these faces, Houston doesn't particularly stand out. I'm not knowledgeable enough about baseball to tell you if that means anything, but it appears to me that standout stats don't necessarily determine who wins in baseball.

Most discussion of Chernoff Faces comes from blog posts like the one above, mostly as a for-fun look at an obscure data visualization. But there are cases in which Chernoff Faces are used in published research. For example, in 2017, the Journal of Documentation published an article titled "Big data analysis of public library operations and services by using the Chernoff face method" [12]. By using the following feature mapping, the researchers were able to use Chernoff Faces to compare libraries in London and Seoul over time.

- Height of hair -> issues
- Width of hair -> visits
- Eye size -> collections
- Ear size -> number of libraries
- Nose size -> budgets
- Mouth size -> number of staff
- Mouth curvature -> number of professional staff
- Face size -> library floor space


![London Library Data visualized with Chernoff Faces](./images/week1/LondonLibraries.PNG)

![Seoul Library Data visualized with Chernoff Faces](./images/week1/SeoulLibraries.PNG)

The results of this study were used to compare the libraries. For example, they found London libraries typically had larger budgets than Seoul libraries. And over the period they measured (2004-2014), they discovered libraries were shrinking. Specifically related to the data vis, the study determined Chernoff Faces are useful for identifying patterns between libraries. And if a baseline face is set, it can be useful for measuring performance changes over time. This is actually a very interesting point that is not often seen in Chernoff Faces. The researchers also point out 3 key limitations of Chernoff Faces. They propose that using more than 10 variables in a face will result in changes that are hard to notice. If there are too many faces, it can also become hard to remember the differences. Finally, it is difficult to compare datasets that are not normalized. In their case, comparing London and Seoul was hard because the two cities reported different data metrics.

Another recently published research example of Chernoff Faces I found comes from cattle research. In 2020, researchers from Akdeniz University in Turkey published a paper titled "Chernoff faces application in livestock" [10]. The paper analyzes the theoretical and practical applications of using Chernoff Faces to visualize livestock data of cattle and goats. According to the abstract, "Easily understood presentations facilitated by figures were obtained." Unfortunately, everything in the paper besides the abstract and image captions is in Turkish, so I don't really know what these faces are saying, but we can take their word for it.

![Faces of Cattle](./images/week1/faces_of_cattle.PNG)

To summarize, there are many practical applications of Chernoff Faces. Although they are rare, there are even more than the ones I've shown here. They are used in all sorts of industries and even used for big data. The primary benefit of this data vis is detecting patterns and outliers in the data.

#### Challenges and Limitations

So far we've mostly looked at successful implementations of Chernoff Faces. But as mentioned in the library paper, Chernoff Faces do have drawbacks. They do not work well with more than 10 variables and if there are too many faces it can become overwhelming. Furthermore, while they are useful for identifying patterns and outliers, they are not as useful for direct comparisons or measuring of values.

In 2007, Robert Kosara, a researcher for Tableau, wrote a blog post describing the drawbacks of Chernoff Faces [2]. His primary criticism serves as a direct counter to the alleged benefit of Chernoff Faces. Chernoff claims faces are a good visualization tool because humans have evolved to recognize faces. Kosara cites papers arguing this is true for faces as a whole and not for individual components. He argues, "Face perception works in a holistic and hierarchical way. We do not see a nose, ears, eyes, eyebrows, etc., and then piece them together (at least not consciously). Rather, we recognize a person". Arguably, this still allows Chernoff Faces to be a valuable tool for pattern and outlier detection, but it further supports the idea that they are not as good for identifying and comparing specific values.

Expanding on one of the drawbacks discussed in the library paper, Kosara (and the Wikipedia article) mentions the importance and limitations of the choice of facial features. The library paper mentions how it is difficult to represent data with more than 10 attributes with a Chernoff Face. Kosara adds to this, mentioning that since certain facial features change more and are easier to recognize, data mapped to these features is perceived as more important. Thus, great care must be taken by data scientists using this visualization to map features appropriately and effectively.

One last story I thought was interesting and worth discussing in this reflection is about the social implications of Chernoff Faces. Isabella Chua, a data storyteller for the Kontinentalist, wrote a blog post discussing how some Chernoff Faces can be accidentally racist [9]. There is a tutorial for how to use Chernoff Faces in R, the same tutorial I mentioned earlier [6]. This tutorial uses United States crime statistics as its example dataset. The resulting Chernoff Faces can be seen in the image below.

![Face of Crime in the US](./images/week1/violent_crime_in_the_US.gif)

At first glance these seem like standard Chernoff Faces. But as Chua discovered from reading the comments on the tutorial, some unintentional effects occurred. Just due to how the R algorithm for Chernoff Faces worked, it mapped high violent crime rates to certain facial features stereotypical of Black people. For obvious reasons, this is not a good thing. Chua went on to review her own Chernoff Faces to see if anything similar occurred. She discovered that in her own work with Chernoff Faces representing data from multiple countries, the way the features were chosen had led to the Chernoff Faces of Asian countries having slanted eyes, another offensive stereotype. She admits in her blog post that she may be reading too much into it. But I believe she raises a good point: with a data vis like Chernoff Faces, or really any data visualization, the way you choose to represent the data tells an important story. We want to convey what the data is telling us and avoid any bias or unintentional offensiveness that the representation may create.

#### Concluding Sentence

Sorry for the long reflection this week. I just got really interested in Chernoff Faces and wanted to do a deep dive. I hope this was interesting to anyone who took the time to read it :)

Sources/Further Reading
---
1. Wikipedia https://en.wikipedia.org/wiki/Chernoff_face
2. A Critique of Chernoff Faces https://eagereyes.org/criticism/chernoff-faces
3. Mapping Quality of Life with Chernoff Faces https://web.archive.org/web/20041217153643/http://gis.esri.com/library/userconf/educ04/papers/pap5000.pdf
4. Chernoff Faces (Baseball) https://web.archive.org/web/20130916002111/http://blogs.iq.harvard.edu/sss/archives/2006/11/chernoff_faces_1.shtml
5. What's the Matter With Chernoff Faces? https://web.archive.org/web/20130128144805/http://alexreisner.com/baseball/stats/chernoff
6. How to visualize data with cartoonish faces ala Chernoff https://flowingdata.com/2010/08/31/how-to-visualize-data-with-cartoonish-faces/
7. Chernoff Faces (Crime) https://ldld.samizdat.cc/2016/chernoff-faces/
8. Deep Chernoff Faces https://www.ihatethefuture.com/2020/06/deep-chernoff-faces.html
9. How can a data visualization be racist? https://medium.com/kontinentalist/how-can-a-data-visualisation-be-racist-a652910d8184
10. Chernoff Faces Application in Livestock https://www.cabdirect.org/cabdirect/abstract/20203479862
11. The Use of Faces to Represent Points in k-Dimensional Space Graphically https://www.jstor.org/stable/2284077
12. Big Data Analysis of Public Library Operations and Services by using the Chernoff Face Method https://www.emerald.com/insight/content/doi/10.1108/JD-08-2016-0098/full/html
33 changes: 33 additions & 0 deletions week10.md
@@ -0,0 +1,33 @@
Week 10 - Chess Evolution Visualization
===
By Andrew Nolan (4-12-21)

![The abstract of the paper](./images/week10/abstract.PNG)

I like chess. For a while I was President of WPI's chess club. I was having trouble finding a data visualization paper that interested me this week, so I thought I would look at IEEE VIS and see if there were any papers on non-technical topics that I found fun. I searched for chess and this showed up! So here we are.

A standard chess position analysis/visualization tool will look something like the figure shown below. In this tool, pieces you can capture appear highlighted in green, your pieces under attack are highlighted in red, and pieces partially under attack are highlighted in yellow. Arrows on the board show the recommended moves. In addition, most modern chess visualizations will also highlight the previous move. However, chess is not a one-move game; the situation changes over time. Traditionally, the columns on the left represent the recommended series of moves and expected opponent responses for a given position. The visualization below appears to show 5 sequences of moves and highlights two of them on the board. I've been using tools like these for over a decade now, so I am very comfortable and familiar with how to use them. However, these tools do not visualize the temporally evolving nature of a chess game, and algebraic chess notation is hard to read for an untrained or novice user.

![The abstract of the paper](./images/week10/traditional.PNG)

This paper proposes a new chess visualization to effectively convey the changes in a game over many moves. It is a multi-part visualization tool containing a score chart, evolution graph, and chess boards that can provide both local move-based analysis and global overall-game analysis of a chess match. An example of this tool can be seen in the following two figures. They use a modified version of a tree/directed graph to depict the game. The network is a tree of nodes and edges representing the moves and possible outcomes. It relies on the Stockfish engine with a search depth of 20 moves to calculate possible positions, which is typical of most chess visualizers. (Stockfish is the most powerful chess engine that is not driven solely by AI; it did lose to AlphaZero.) It stores all moves and shows what it considers key points. To represent the evolution of a chess game, the researchers want to show potential positions after multiple moves and depict key events "such as draws, effective checks, and checkmates". The visualization, like the traditional chess tools, only shows moves that improve the position; it does not visualize potential outcomes that are obviously detrimental to the player.

The circles in the visualization represent actual moves, and the squares represent moves calculated by Stockfish. To simplify the graph, moves with only one or two direct responses are not shown; instead, the edge in the graph shows a "Several moves" arrow. Additionally, positions that repeat are merged into one node. The network is read from left to right, starting at move 1 for white. The thickness of the arrows on the edges represents the relative advantage gained from each move. In the second image you can see the score chart at the bottom, which represents the relative advantage of each player at a given move.

![The abstract of the paper](./images/week10/new.PNG)

![The abstract of the paper](./images/week10/big.PNG)
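To get a feel for how such a graph might be assembled, here is a rough Python sketch of two of the ideas described above: merging repeated positions into one node, and collapsing runs of forced moves into a single summary edge. The input format, the function names, the centipawn scores, and the simplification of collapsing only single-response positions are all my own assumptions. It is not the paper's algorithm, and it ignores the difference between actual moves and engine-calculated moves.

```python
from collections import defaultdict


def build_evolution_graph(lines):
    """Merge move sequences into one graph keyed by position.

    `lines` is a list of move sequences. Each step is a hypothetical
    (position_key, score) pair: position_key is any hashable id of the
    resulting position, score is an engine evaluation in centipawns.
    """
    edges = defaultdict(dict)  # parent position -> {child position: score gain}
    for line in lines:
        prev_key, prev_score = "start", 0
        for key, score in line:
            # Repeated positions share a key, so they collapse into one node.
            edges[prev_key][key] = score - prev_score
            prev_key, prev_score = key, score
    return {parent: dict(children) for parent, children in edges.items()}


def collapse_chains(edges, root="start"):
    """Replace runs of single-response positions with one summary edge.

    Each collapsed edge stores (total score gain, number of moves skipped).
    """
    collapsed = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node in collapsed:
            continue
        collapsed[node] = {}
        for child, gain in edges.get(node, {}).items():
            skipped = 0
            seen = {child}  # guards against cycles from merged positions
            # Walk forward while the position has exactly one continuation.
            while len(edges.get(child, {})) == 1:
                nxt, step_gain = next(iter(edges[child].items()))
                if nxt in seen:
                    break
                seen.add(nxt)
                child, gain, skipped = nxt, gain + step_gain, skipped + 1
            collapsed[node][child] = (gain, skipped)
            stack.append(child)
    return collapsed


# Two made-up lines that share their first position "p1".
lines = [
    [("p1", 20), ("p2", 10), ("p3", 40), ("p4", 30)],
    [("p1", 20), ("q2", -50)],
]
graph = build_evolution_graph(lines)
collapsed = collapse_chains(graph)
```

In a drawing of `collapsed`, the score gain on each edge could stand in for the arrow thickness, and any edge with a nonzero skipped count would be labeled as a "Several moves" arrow.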

The researchers performed analysis using their tool on several famous games and reached similar conclusions to professional chess commentators and players. They claim their tool is effective for analyzing the evolution of a chess game over time, especially since it makes it easier for users to jump between different possible branches to see the outcome. In traditional chess visualizers/analyzers you cannot easily jump between branches, as things play out sequentially and analysis is only shown for the current position.

They also did a comparative user study between their system and a traditional chess system called Arena. The results, shown in the figures below, indicate that it took less time to answer questions about the chess games using their system and that users also had a higher correct answer rate. The user study had 21 participants, two-thirds of whom identified as novice players. Participants not only were more effective using this system, but they also responded more positively to its design.

![The abstract of the paper](./images/week10/objectiveresults.PNG)

![The abstract of the paper](./images/week10/subjectiveresults.PNG)

In my personal opinion, as a chess player, I am not sure I would use this tool. Maybe, as they show, it is effective for seeing overall patterns in a game. But from my experience I would argue the main use of these tools is to understand the specific positions where something goes wrong and could have been better. The overall game is often decided by these few decisions. Looking at this large network could identify these mistakes, but it also would have lots of superfluous info. Additionally, and the researchers do mention this in their limitations and future work, this system does not actually show the chess positions. Users would need another tool or a real life chess board to see where the pieces are. This tool just shows the relative advantage and key moments. With a board included this could become a more helpful tool. But a board takes up a lot of screen real estate, and I believe the traditional design may be more effective. I think sequentially seeing the board and the evolution of the game is a more effective learning tool than seeing an overall evolution as this visualization shows. But I'm not a published IEEE VIS author (yet), so these researchers may be on to something after all.

Sources
---
1. Chess Evolution Visualization - https://ieeexplore.ieee.org/abstract/document/6710145