# Homework 5

## Before reflection
Think about how people of all abilities may interact with visualizations and the internet. What accessibility challenges are you already aware of? What specific limitations or disabilities do you think would impact someone's interpretation of visualization or navigation of the internet, and how? Are there any design considerations you know of that may broaden accessibility? How often do you use them currently?



> The core issue of accessibility is that others might not be able to interact with or perceive things in exactly the same way as you. I have a mouse, but other users might be on touch screens and cannot hover over anything. Even without any interaction, just seeing the content can be challenging: color, font size, and anything else that causes high cognitive load can make it difficult to understand.
>
> That said, accessibility needs consideration, but it is not a burden. The key is to provide alternatives: can't see the picture? Provide a text description underneath! I understand the issue well because I have a minor color vision weakness (i.e., I cannot distinguish between certain colors that look too similar to me). While I still use color, I often add a secondary channel such as text or shape to guarantee understanding.

## After reflection
What did you learn? What was surprising? How might this information change your designs in the future? What are a few concrete changes you can make to make your visualizations more accessible?

> I'm surprised to learn that accessibility requires far more considerations than I expected, in both breadth and depth. In terms of perception, it is no longer just about not being able to see colors but also about complete vision loss (the screen reader demo really stood out). In terms of interaction, a visualization might be impossible to navigate without precise control of a mouse.
>
> I will put more thought into providing alternative formats for my visualizations. For example: use double encoding (never rely on color alone) and add direct labels where possible; provide meaningful text descriptions of charts for screen readers; keep text content simple and clear; and ensure elements are large enough to be seen and clicked.

## Improvement to graph
Then, go into your archive of past homework submissions and choose a visualization to redesign in light of your new knowledge about accessibility.  Explain why, with concrete justification, it is inaccessible. Preserve the original plot, and directly next to it, create a new version that solves these accessibility issues. Be sure to comment on all the aspects of the design or presentation that violate the accessibility guidelines, even if you can't fix them due to our technical limitations.

```python
import altair as alt
import pandas as pd

# Load the US News college dataset; '*' marks missing values.
dataurl = 'http://lib.stat.cmu.edu/datasets/colleges/usnews.data'
college = pd.read_csv(dataurl,
                       na_values='*',
                       names = ['FICE', 'College','State','Private','MathSAT','VerbalSAT','CombinedSAT','ACT',
                                'Q1Math','Q3Math','Q1Verbal','Q3Verbal','Q1ACT','Q3ACT',
                               'Applied','Accepted','Enrolled','Top10HS','Top25HS','FullTimeUG','PartTimeUG',
                               'InStateTuition','OutOfStateTuition','RoomBoardCost','RoomCost','BoardCost',
                               'ExtraFees','BookFees','PersonalFees',
                               'PhDFaculty','TerminalFaculty','StuFacRatio','AlumniDonate','SpendPerStudent','GradRate'])

# Original plot: scatter of graduation rate vs. student-to-faculty ratio,
# with school type encoded by color only.
# Private == 1 is mapped to "public"; any other value is mapped to "private".
og = alt.Chart(college).mark_point().transform_calculate(
    RenamePrivate = 'if(datum.Private==1, "public", "private")'
).encode(
    x=alt.X("StuFacRatio:Q", scale=alt.Scale(domain=[2, 30])),
    y=alt.Y("GradRate:Q"),
    color=alt.Color("RenamePrivate:N", scale=alt.Scale(domain=["public", "private"]), legend=alt.Legend(title="Private/Public"))
).transform_filter(
    "datum.GradRate <= 100 && datum.StuFacRatio <= 30"
).properties(
    title="Graduation Rate vs Student to Faculty Ratio",
    width=500,
    height=500,
)

# Improved plot 1: box plot of graduation rate by school type,
# with the median labeled directly next to each box.
base_gradrate = alt.Chart(college).transform_calculate(
    RenamePrivate='if(datum.Private == 1, "Public", "Private")'
).transform_filter(
    "datum.GradRate <= 100 && datum.StuFacRatio <= 30"
)
improved_gradrate_box = base_gradrate.mark_boxplot(extent="min-max", size=80).encode(
    x=alt.X("RenamePrivate:N", title="Private/Public", axis=alt.Axis(labelAngle=0)),
    y=alt.Y("GradRate:Q", title="Graduation Rate", scale=alt.Scale(zero=False)),
    color=alt.Color("RenamePrivate:N", legend=None, scale=alt.Scale(domain=["Private", "Public"]))
)
improved_gradrate_text = base_gradrate.mark_text(
    align="left",
    dx=45,
    fontSize=12,
    fontWeight="bold"
).encode(
    x=alt.X("RenamePrivate:N"),
    y=alt.Y("median(GradRate):Q"),
    text=alt.Text("median(GradRate):Q", format=".1f"),
    color=alt.value("black")
)
improved_gradrate = (improved_gradrate_box + improved_gradrate_text).properties(
    title={
        "text": "Graduation Rate by School Type",
        "subtitle": "Private schools have a significantly higher median graduation rate of 67.0, compared to 50.0 for public schools"
    },
    width=250,
    height=500
)

# Improved plot 2: box plot of student-to-faculty ratio by school type,
# again with direct median labels.
base_studfac = alt.Chart(college).transform_calculate(
    RenamePrivate='if(datum.Private == 1, "Public", "Private")'
).transform_filter(
    "datum.GradRate <= 100 && datum.StuFacRatio <= 30"
)
improved_studfac_box = base_studfac.mark_boxplot(extent="min-max", size=80).encode(
    x=alt.X("RenamePrivate:N", title="Private/Public", axis=alt.Axis(labelAngle=0)),
    y=alt.Y("StuFacRatio:Q", title="Student to Faculty Ratio", scale=alt.Scale(zero=False)),
    color=alt.Color("RenamePrivate:N", legend=None, scale=alt.Scale(domain=["Private", "Public"]))
)
improved_studfac_text = base_studfac.mark_text(
    align="left",
    dx=45,
    fontSize=12,
    fontWeight="bold"
).encode(
    x=alt.X("RenamePrivate:N"),
    y=alt.Y("median(StuFacRatio):Q"),
    text=alt.Text("median(StuFacRatio):Q", format=".1f"),
    color=alt.value("black")
)

improved_studfac = (improved_studfac_box + improved_studfac_text).properties(
    title={
        "text": "Student to Faculty Ratio by School Type",
        "subtitle": "Private schools have a lower median ratio of 12.8, while public schools have a higher median of 17.5"
    },
    width=250,
    height=500
)

# Show the original plot next to the two redesigned plots.
og | improved_gradrate | improved_studfac
```

### Justifications
While the original plot did use a color scheme that is somewhat friendly to color vision deficiency, it still suffers from overplotting, which causes two problems: 1) the visual clutter makes it difficult to add a redundant encoding (e.g. shape); 2) users with cognitive impairments may struggle to understand the plot because there is too much going on; it is neither simple nor clean.

Additionally, the generated plot is an image, which is not accessible to screen readers. Unfortunately, as far as I could tell, Altair doesn't offer a straightforward way to attach alt text to the generated charts (i.e. I can add subtitles, but they become part of the image), so fully fixing this issue would require manually patching the exported HTML file.
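As a rough illustration of what such a manual patch could look like, the sketch below adds `role="img"` and an `aria-label` to the chart's container element in an exported HTML string. The helper name `add_alt_text` is hypothetical, and the container being a `<div id="vis">` is an assumption about the exported file's layout that should be checked against the actual export.

```python
import re
from html import escape


def add_alt_text(html: str, alt_text: str, div_id: str = "vis") -> str:
    """Return a copy of `html` whose chart container carries alt text.

    Hypothetical helper: assumes the chart is rendered inside
    <div id="vis">, which may differ in the actual exported file.
    """
    # Escape quotes and angle brackets so the text stays inside the attribute.
    label = escape(alt_text, quote=True)
    pattern = re.compile(r'<div\s+id="%s"' % re.escape(div_id))
    # A function replacement avoids backslash handling in the label text.
    patched, count = pattern.subn(
        lambda m: f'{m.group(0)} role="img" aria-label="{label}"',
        html,
        count=1,
    )
    if count == 0:
        raise ValueError(f'no <div id="{div_id}"> found in the HTML')
    return patched


example_html = '<body><div id="vis"></div></body>'
print(add_alt_text(example_html, "Box plots of graduation rate by school type"))
```

A screen reader would then announce the container as a single image with the given label, which is closer to how alt text behaves on an `<img>` element.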

Due to this technical limitation, I decided to make simpler plots with less visual clutter: box plots effectively summarize the center and spread of the data. I added direct labels so the plots are easier to understand without relying on the axis labels or tooltips/mouse hover, and subtitles that explain the key insights.
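Because the subtitles state the medians as literal numbers, they could drift out of sync with the data. One possible way to avoid that, sketched here with the standard library only, is to compute the group medians and build the subtitle text from them. The helper names and the toy rows are illustrative assumptions, not values from the college dataset.

```python
from collections import defaultdict
from statistics import median


def median_by_group(rows, group_key, value_key):
    """Median of `value_key` per `group_key`, skipping missing values."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(value_key)
        if value is not None:
            groups[row[group_key]].append(value)
    return {group: median(values) for group, values in groups.items()}


def gradrate_subtitle(medians):
    """Build a subtitle from the computed medians (hypothetical helper)."""
    return (
        f"Private schools have a median graduation rate of "
        f"{medians['Private']:.1f}, compared to "
        f"{medians['Public']:.1f} for public schools"
    )


# Toy rows for illustration only; these are NOT values from the dataset.
toy_rows = [
    {"SchoolType": "Private", "GradRate": 60},
    {"SchoolType": "Private", "GradRate": 70},
    {"SchoolType": "Private", "GradRate": 80},
    {"SchoolType": "Public", "GradRate": 40},
    {"SchoolType": "Public", "GradRate": 50},
    {"SchoolType": "Public", "GradRate": None},
]

medians = median_by_group(toy_rows, "SchoolType", "GradRate")
print(gradrate_subtitle(medians))
```

The resulting string could be passed as the `"subtitle"` value of the chart title, so the text description and the plotted medians come from the same computation.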