## Visualization
In this notebook, we will cover five visualizations, and the hope is that each of them offers a tip you can use in your next plot. Julia can generate a wide range of visually pleasing plots. I strongly recommend deciding what you want to plot first and then figuring out how to accomplish it (you will most likely find everything you need in at least one Julia plotting package). Here, we will use the `Plots` package with the `gr()` backend.

A quick note about plotting with long, rotated x-axis labels. There is currently an issue with x-tick labels that are both long and rotated, as in the plot shown next. According to https://github.com/JuliaPlots/Plots.jl/issues/2107, this hasn't been fixed yet, so here I create a quick function that acts as a "hack" to avoid the problem.

### Let us first get some data that we will use throughout this notebook

We will use these three states as examples throughout this notebook, so we will create a `DataFrame` for each of them to have them ready when we need them.

### 🔴Plot 1: Symmetric violin plots and annotations
We will get started by picking the most recent data we have for these states and drawing a violin plot for each one to see the distribution of house prices.

One concept I learned from reading one of Edward Tufte's books is the idea of avoiding symmetry. Here, as you can see, each violin plot is symmetric, so half of each shape repeats what the other half already shows. We can fit more information in by using each side of the violin for a different dataset. We will now compare housing prices in these states from February 2020 with housing prices from 10 years earlier (February 2010).
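As a rough sketch of the two-sided idea, the snippet below draws one half-violin per date at the same x position using the `side` keyword of `violin` from `StatsPlots`. The data is synthetic and the label "State A" is a placeholder, not the housing dataset used in this notebook.

```julia
using StatsPlots   # provides `violin` on top of Plots
using Random

Random.seed!(1)
# Hypothetical stand-in data: synthetic "prices" for two dates.
prices_2010 = 100 .+ 20 .* randn(200)
prices_2020 = 130 .+ 20 .* randn(200)

# Left half shows the older distribution, right half the newer one.
p = violin(["State A"], prices_2010; side = :left, label = "2010-02", alpha = 0.8)
violin!(["State A"], prices_2020; side = :right, label = "2020-02", alpha = 0.8)
```

Repeating the pair of calls with a different category label should add further states along the x-axis.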

This is really interesting... it seems the price distributions kept a very similar shape and simply shifted upwards after 10 years. Now let's make the plot more informative.

Violin plots are cool in that they show a distribution of values. Nevertheless, one really interesting value in each violin plot is the median. We can very easily annotate this value on top of these violin plots.
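A minimal sketch of such an annotation, assuming made-up prices and a placeholder state name. It also assumes that the first category on a categorical x-axis sits at x = 0.5, which is how `Plots` appears to place discrete values; adjust the x coordinate if the label lands in the wrong spot.

```julia
using StatsPlots, Statistics

# Hypothetical values, not taken from the notebook's data.
prices = [120.0, 135.0, 150.0, 155.0, 160.0, 180.0, 210.0, 260.0]
m = median(prices)

p = violin(["State A"], prices; legend = false, alpha = 0.8)
# Each annotation is an (x, y, text) tuple; `text` sets the font size and alignment.
annotate!([(0.5, m, text(string(m), 8, :left))])
```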

Now let's put all of this together and add annotations on both sides.

### 🔴Plot 2: Bar charts, histograms, and insets
Now let's compare states based on the number of location entries they have in the data.

There are a few problems with this histogram. First, unsorted histograms are often harder to read, so the first thing we will do is sort the bars by height. Next, we will add annotations so that each bar can be mapped to a state quickly.
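The sorting step can be sketched with base Julia alone. The state list below is a made-up example, and the variable names are illustrative rather than the ones used in this notebook.

```julia
# Hypothetical input: one state abbreviation per location entry.
states = ["NY", "CA", "CA", "FL", "CA", "NY"]

# Count how many entries each state has.
counts = Dict{String,Int}()
for s in states
    counts[s] = get(counts, s, 0) + 1
end

# Sort the states by their count, largest first.
labels = collect(keys(counts))
values_per_state = [counts[l] for l in labels]
order = sortperm(values_per_state; rev = true)
sorted_labels = labels[order]
sorted_counts = values_per_state[order]
```

The sorted vectors can then be passed to a plotting call such as `bar(sorted_labels, sorted_counts)`.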

Next, we will arrange this plot horizontally (via the `orientation = :horizontal` argument) and add annotations.

Next, we get rid of the bar outlines (by setting the line width to zero) and map each state to its two-letter identifier.

Since we are using only one color, we switch to a more neutral one (gray) and follow Edward Tufte's strategy from his book "The Visual Display of Quantitative Information" of drawing the grid lines on top of the bars.

Finally, we will fix sizes and add an inset figure to zoom in on the lower left corner.
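Here is a small sketch of an inset in `Plots`, using the `inset` and `subplot` keywords together with `bbox`. The bar heights are invented, and the inset position and size are arbitrary choices that would need tuning for the real figure.

```julia
using Plots
gr()

# Hypothetical bar heights, already sorted from largest to smallest.
vals = [900, 700, 400, 250, 120, 60, 30, 15, 8, 4]

p = bar(vals; legend = false, color = :gray, linewidth = 0, size = (700, 400))

# Draw the five smallest bars again in a second subplot placed inside the first.
# bbox(x, y, width, height, anchors...) is given in fractions of the parent area.
bar!(p, vals[6:10];
     inset = (1, bbox(0.05, 0.1, 0.45, 0.45, :top, :right)),
     subplot = 2, legend = false, color = :gray, linewidth = 0)
```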

### 🔴Plot 3: Plots with error bars
Next, we will compare state prices over the years and see how they have changed. We will use error bars too.

We will first get started with plotting the price changes over time for each region.

A plot like this doesn't tell us how prices are trending overall for New York. What we will do next is, for each time point, find the median value as well as the 80th and 20th percentiles, and plot these values. Let's write the percentile function first.
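One possible shape for such a helper is sketched below using `quantile` from the `Statistics` standard library. The function name and the decision to skip `missing` values are assumptions made for this sketch, not necessarily what this notebook's own function does.

```julia
using Statistics

# Hypothetical helper: returns the `pct`-th percentile (0 to 100) of `values`,
# ignoring any missing entries.
function find_percentile(values, pct)
    clean = collect(skipmissing(values))
    return quantile(clean, pct / 100)
end

prices = [10, 20, 30, 40, missing, 50]   # made-up data
p20 = find_percentile(prices, 20)
p50 = find_percentile(prices, 50)        # same as the median of the non-missing values
p80 = find_percentile(prices, 80)
```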

Next, we will put everything together in one function. Note the `!` symbol at the end of the name: by Julia convention it signals that the function modifies one of its arguments, and here we pass a plot canvas `plotid` that the function will draw on.
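A sketch of what a mutating plotting function of this kind might look like. The name `plot_percentiles!`, its arguments, and the toy data are all hypothetical. It draws the median per time point and uses the `yerror` keyword with a `(lower, upper)` tuple for asymmetric error bars spanning the 20th to 80th percentiles.

```julia
using Plots, Statistics

# Hypothetical helper: adds a median line with percentile error bars to `plotid`.
# `columns` holds one vector of prices per time point in `xs`.
function plot_percentiles!(plotid, xs, columns; label = "")
    med   = [median(c) for c in columns]
    lower = [quantile(c, 0.2) for c in columns]
    upper = [quantile(c, 0.8) for c in columns]
    # Error bar lengths are measured from the median in each direction.
    plot!(plotid, xs, med; yerror = (med .- lower, upper .- med), label = label)
    return plotid
end

# Made-up data: three time points with five prices each.
cols = [[1.0, 2, 3, 4, 5], [2.0, 3, 4, 5, 6], [3.0, 4, 5, 6, 7]]
p = plot()
plot_percentiles!(p, 1:3, cols; label = "State A")
```

Because the function receives the canvas as an argument, it can be called once per state to overlay several series on the same plot.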

### 🔴Plot 4: Plots with double axes

Interesting! It seems that, in general, lower-rank regions have higher prices and vice versa.
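For reference, a double-axis plot can be sketched in `Plots` with `twinx`, which adds a second y-axis that shares the x-axis. The numbers below are invented purely to illustrate the call pattern.

```julia
using Plots

# Hypothetical data: price and rank for five regions.
xs    = 1:5
price = [500, 400, 300, 200, 100]
rank  = [1, 2, 3, 4, 5]

# Left axis: price.
p = plot(xs, price; label = "price", ylabel = "price", legend = :topleft)
# Right axis: rank, drawn on a twin subplot.
plot!(twinx(p), xs, rank; label = "rank", ylabel = "rank",
      color = :red, legend = :topright)
```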

### 🔴Plot 5: High-dimensional data in a 2D plot
We've seen a 3D plot in the Clustering notebook previously. I personally prefer 2D plots because they are often easier to read and come out more clearly in print. Here, we will explore how we can use color as a third dimension. Note that you can also use marker size as a third dimension.

We will use the California data, and plot the prices from 2010-02 on the x-axis and 2020-02 on the y-axis. We will then color code each data point by its current rank.

Let's generate a quick scatter plot first.

Then, to work with colors we will make use of the package `ColorSchemes`.
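A minimal sketch of mapping values to colors with `ColorSchemes`: the values are rescaled to the interval [0, 1] and then looked up with `get`. The choice of the `viridis` scheme and the example ranks are assumptions for illustration.

```julia
using ColorSchemes

# Hypothetical ranks to be encoded as colors.
ranks = [1, 50, 100]

# Rescale the ranks linearly so the smallest maps to 0 and the largest to 1.
lo, hi = extrema(ranks)
scaled = (ranks .- lo) ./ (hi - lo)

# Look up one color per rank along the chosen scheme.
colors = [get(ColorSchemes.viridis, s) for s in scaled]
```

The resulting vector can be passed as a per-point marker color to a scatter plot.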

We have the colors, but nothing in the plot tells the reader what each color means. Next, we will create a second plot whose only job is to act as a key for the ranks of these dots, and then we will pad the two plots together.

And now pad together...
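One way to combine two plots side by side is a custom layout built with the `@layout` macro. The sketch below assumes that "padding together" means placing a narrow key plot next to the main scatter plot. The data, colors, and width fraction are placeholders.

```julia
using Plots

# Hypothetical main plot: five points, each with its own color.
point_colors = [:red, :orange, :gold, :green, :blue]
p_main = scatter(1:5, [2, 4, 3, 5, 1]; markercolor = point_colors, legend = false)

# Hypothetical key plot: the same colors stacked vertically, one per rank.
p_key = scatter(zeros(5), 1:5; markercolor = point_colors, legend = false,
                xticks = nothing, ylabel = "rank")

# Give the main plot 90% of the width and the key the remainder.
l = @layout [a{0.9w} b]
p = plot(p_main, p_key; layout = l)
```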

This agrees with what we saw earlier with the data from New York: lower-rank regions seem to have higher prices.

# Finally...
After finishing this notebook, you should be able to:
- [ ] create violin plots in Julia
- [ ] create bar charts
- [ ] add annotations to your plots
- [ ] create an inset figure for your plot
- [ ] create plots with error bars
- [ ] create plots with double axes
- [ ] create a new color mapping to a given set of values
- [ ] create two dimensional plots and use color to indicate a third dimension
- [ ] pad multiple plots together

# 🥳 One cool finding

There are many interesting things here! The most interesting one I found is that Idaho is following California's trend in housing prices, and Idaho's prices are growing faster than those in places like Indiana and Ohio.

<img src="data/1201.png" width="500">
<img src="data/1202.png" width="500">