### Homework 10
#### Ryan Dubnicek

The purpose of this assignment is to get some more practice developing on the web, specifically exporting plots made with Python, Altair, and Vega-Lite to your GitHub repository. (Additionally, this is good practice for your final project!)

**This week you will be using Python, Altair and vega-lite to create two visualizations of the same dataset. These visualizations do NOT need to be linked but AT LEAST ONE needs to be interactive in some way beyond the "built in" pan/zoom capabilities.**

Please note that these will be publicly posted submissions, and this is needed for grading, so plan accordingly.

Group submissions are required for this assignment; however, you can be in a "group" of one if you do not want to submit with others. Please see the notes about this process at the end of this post. You must list all group members' names on your assignment; we will not accept submissions without all group members listed.

Include two paragraphs in your write up, one for each of your visualizations. This write up should include:
* for EACH PLOT:
    * a description of what you are visualizing
    * what design choices you made: specifically what encoding types you are using, and if you are using a color scheme -- what variables you are coloring by and why you chose a specific color scheme
    * discussion of any data transformations that you did on the analysis side of things in your Python notebook
    * if you are using similar plots to your Homework #9 -- make sure you include quotations around any parts of your Homework #9 that you are using in your write up AND what you changed between Homework #9 to this Homework in order to get things to work with Altair
    * for your interactivity: specify what interactivity you chose for one (or more) plots and how this helps your visualization be more clear or interesting

Expect about a paragraph for each plot and another few sentences for discussion of interactivity.

You must include both a link to your data AND a link to your Python notebook that you used to generate these plots. You can link them with "The Data" and "The Analysis" buttons like in the in-class examples. Make sure this link is to the GitHub page of the notebook! Otherwise, it will download by default and you won't get the nice rendering that GitHub will do for your projects.  

Your submission will be:
* a URL to your GitHub repo -- your submission here will be a link to your page hosted on GitHub Pages like: https://jnaiman.github.io
* your analysis .ipynb notebook

Note that you will need to read in your data from a URL in your notebook.  This means if you are hosting the data yourself, you will need to be aware of GitHub's upload limits and plan accordingly.
Some hints:
* if you get a "number of rows larger than maximum" error, you might want to check out the documentation around `alt.data_transformers.disable_max_rows()`.
* if you are using temporal data it will likely be easier to transform Pandas columns into timestamps with `.to_timestamp` instead of trying to do it in vega-lite/Altair
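
As a rough illustration of the temporal-data hint, the sketch below converts a column of date strings into timestamps on the pandas side. It uses `pd.to_datetime` rather than the `.to_timestamp` method named in the hint, since `pd.to_datetime` is the usual route when the column holds plain strings. The column name `sighted`, the date format, and both rows are made up for the example.

```python
import pandas as pd

# Two made-up rows in month/day/year format; not taken from any real dataset.
df = pd.DataFrame({
    "sighted": ["10/10/1949 20:30", "06/01/1956 13:00"],
    "duration_seconds": [2700, 20],
})

# Parse the strings into datetime64 values; anything unparseable becomes NaT.
df["sighted"] = pd.to_datetime(df["sighted"], format="%m/%d/%Y %H:%M", errors="coerce")

print(df.dtypes)
print(df["sighted"].dt.year.tolist())
```

Once the column is a real datetime type, it can be encoded with `"type": "temporal"` in the chart spec.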


In [1]:
import altair as alt
import pandas as pd


### Viz 1
The idea for this visualization is to show the duration of each UFO sighting by state. Users can hover over a single sighting (mark) to see which state it belonged to and how many seconds it lasted, while the chart as a whole shows whether certain states tend to have longer or shorter sightings.

In [62]:
chart1 = alt.Chart.from_dict({
  "data":{"url":"https://raw.githubusercontent.com/rdubnic2/rdubnic2.github.io/main/ufo-scrubbed-geocoded-time-standardized-00-wheader-limitduration.csv"},
  "mark":{"type":"point", "tooltip":True},
  "encoding":{
    "x":{"field":"state", "type":"ordinal", "title":"UFO Sightings by State"},
    "y":{"field":"duration_seconds", "type":"quantitative","axis":{"title":"Length of Sighting (Seconds)"}}
  }
}
)

In [63]:
chart1

My design choices here, as usual, are pretty bare bones; I tend to not have a great eye for design! I am using state abbreviations on the X axis and the number of seconds for each sighting on the Y axis. States are ordinal and seconds are quantitative. For this viz, I decided to remove sightings with a duration over 5,000,000 seconds, as there were just a few (~25 or so) that were skewing the axis very high, and I, quite frankly, couldn't spend more time trying to tweak the code to get the scale or the figure size to change.
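
One possible way to handle the skewed axis inside the spec itself, instead of trimming the CSV, is sketched below. It builds the same kind of dictionary passed to `alt.Chart.from_dict` above, adding a Vega-Lite `filter` transform, a log scale on the Y axis, and an explicit figure size. The helper name `build_duration_spec`, the default width and height, and the choice of a log scale are assumptions for illustration, not what was actually used for Viz 1.

```python
import json

# URL copied from the Viz 1 cell above.
UFO_URL = (
    "https://raw.githubusercontent.com/rdubnic2/rdubnic2.github.io/main/"
    "ufo-scrubbed-geocoded-time-standardized-00-wheader-limitduration.csv"
)

def build_duration_spec(url, max_seconds=5_000_000, width=600, height=300):
    """Return a Vega-Lite spec dict for the duration-by-state scatter (hypothetical helper)."""
    return {
        "data": {"url": url},
        "width": width,    # figure size in pixels
        "height": height,
        # Drop non-positive durations (a log scale cannot show them) and extreme outliers.
        "transform": [
            {"filter": f"datum.duration_seconds > 0 && datum.duration_seconds <= {max_seconds}"}
        ],
        "mark": {"type": "point", "tooltip": True},
        "encoding": {
            "x": {"field": "state", "type": "ordinal", "title": "UFO Sightings by State"},
            "y": {
                "field": "duration_seconds",
                "type": "quantitative",
                "scale": {"type": "log"},  # compresses the long tail of durations
                "axis": {"title": "Length of Sighting (Seconds)"},
            },
        },
    }

spec = build_duration_spec(UFO_URL)
print(json.dumps(spec["transform"], indent=2))
# In the notebook, the dict can be turned into a chart with:
#   chart1_log = alt.Chart.from_dict(spec)
```

With a log scale, the few very long sightings no longer flatten the rest of the points, so the cutoff could be raised or removed by changing `max_seconds`.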

### Viz 2

The idea for this visualization, which is interactive, is to allow someone to view Bigfoot sightings by state (in a bar chart), and to choose to see the mean humidity vs. mid temperature for the sightings from a given state by brushing (clicking and dragging) over the bar for that state.

In [10]:
# Brush along x only, so the selection is defined purely by state
brush = alt.selection_interval(encodings=['x'])

chart22 = alt.Chart.from_dict({
    # Bar chart of Bigfoot sighting counts by state; the brush is attached to this chart
  "data":{"url":"https://raw.githubusercontent.com/UIUC-iSchool-DataViz/is445_bcubcg_fall2022/main/data/bfro_reports_fall2022.csv"},
  "mark":{"type":"bar"},
    #"mark":{"type":"point", "tooltip":True},
  "encoding":{
    "x":{"field":"state", "type": "ordinal","title":"Bigfoot Sightings by State"},
    "y":{"aggregate":"count","field": "state", "type":"quantitative","axis":{"title":"Sighting Count"}}
  } #// end encoding
}).add_selection(
        brush
    )

In [12]:
# chart11 is a replica of my viz from Homework 9!

chart11 = alt.Chart.from_dict({
    # My code from Homework 9 for plotting mean humidity versus temperature for bigfoot sightings
  "data":{"url":"https://raw.githubusercontent.com/UIUC-iSchool-DataViz/is445_bcubcg_fall2022/main/data/bfro_reports_fall2022.csv"},
  "mark":{"type":"point"},
    #"mark":{"type":"point", "tooltip":True},
  "encoding":{
    "x":{"field":"temperature_mid", "type": "quantitative","title":"Average Mid Temperature"},
    "y":{"aggregate":"mean","field": "humidity", "type":"quantitative","axis":{"title":"Mean Humidity"}}
  } #// end encoding
}).transform_filter(
    brush
)

In [14]:
chart = chart22 | chart11
chart

For the bar chart, I am simply visualizing a count of all sightings by state, using the `count` aggregate, and for the scatter plot, I am plotting mean humidity against `temperature_mid`, which is the recorded midpoint of temperature for the day of a sighting. In `chart11` only humidity is aggregated (as a mean); `temperature_mid` is plotted as recorded, so each point is the mean humidity of the sightings that share a `temperature_mid` value. I didn't need to transform the data, as I felt this general approach would be more widely interesting. Since I used the base viz from Homework 9, I did have to tweak things a bit for Altair: adding titles and starting first with the bar chart as the base viz and the scatter as the referenced chart. The interactivity here is hopefully clear: users can brush over one or more state bars (representing Bigfoot sightings), and the accompanying chart updates to show mean humidity against mid temperature for sightings from only those states.
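
To make the two aggregates concrete, here is a small pandas sketch of roughly what the linked charts compute: a row count per state for the bars, and, for the brushed states only, mean humidity grouped by `temperature_mid` for the scatter. The four rows are invented toy values, and the helper names are hypothetical; the charts themselves do this work in Vega-Lite, not pandas.

```python
import pandas as pd

# Toy rows invented for illustration; the real BFRO file has many more rows and columns.
sightings = pd.DataFrame({
    "state": ["Washington", "Washington", "Washington", "Ohio"],
    "temperature_mid": [50.0, 50.0, 60.0, 50.0],
    "humidity": [0.5, 0.75, 0.25, 1.0],
})

def sighting_counts(df):
    """Rows per state: what the bar chart's count aggregate shows."""
    return df.groupby("state").size()

def mean_humidity_by_temperature(df, selected_states):
    """Mean humidity per temperature_mid value, restricted to the selected states."""
    subset = df[df["state"].isin(selected_states)]
    return subset.groupby("temperature_mid")["humidity"].mean()

counts = sighting_counts(sightings)
means = mean_humidity_by_temperature(sightings, ["Washington"])
print(counts)
print(means)
```

Here the two Washington rows at a `temperature_mid` of 50.0 collapse into a single point, which is why the scatter shows fewer points than there are sightings.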

### Saving Viz 1 and Viz 2 as JSON objects for uploading to my Jekyll site:

In [65]:
myJekyllDir = '/Users/rdubnic2/Desktop/is445-dataviz/rdubnic2.github.io/assets/json/'

# Viz 1
chart1.properties(width='container').save(myJekyllDir+'chart1.json')

# Viz 2
chart.save(myJekyllDir+'dualChart.json')