# Data Viz w/ Plotly, Part 1

In [2]:
import pandas as pd
import numpy as np

In [3]:
airbnb = pd.read_csv('airbnb.csv') # For Seattle only

As with any new dataset, let's start by getting familiar with it:

<br>Go ahead and check it out for yourself-
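A few pandas calls are usually enough for a first look. The sketch below uses a tiny made-up stand-in for `airbnb` so it runs on its own; the column names (`name`, `room_type`, `price`, `availability_365`) are assumptions about what the real file contains.

```python
import pandas as pd

# Made-up stand-in for the Airbnb data; column names are assumptions
listings = pd.DataFrame({
    "name": ["Cozy studio", "Lake view home", "Shared loft"],
    "room_type": ["Private room", "Entire home/apt", "Shared room"],
    "price": [85, 240, 40],
    "availability_365": [120, 300, 15],
})

print(listings.head())          # first few rows
listings.info()                 # column dtypes and non-null counts
summary = listings.describe()   # summary statistics for the numeric columns
print(summary)
```

Swap `listings` for `airbnb` to run the same checks on the real data.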

## Scatter, Bars, and Histograms: The Basics

Our imports- Note that we'll alias `plotly.express` as `px`.

`px` is a fantastic "wrapper" for the base `plotly` package. What that means is we can use incredibly easy and readable functions, and plotly express will do the hard work of converting that input into formats that the software can understand.

Quick aside: If you're a web developer who loves JS, or an academic who uses R, Plotly has libraries for both of those languages too.

Let's start off with a simple scatter plot, which we can whip up with `px.scatter()`

What does the association between price and availability look like?

It works, but doesn't really tell us too much. Let's modify the plot by adding parameters to `px.scatter()`

<br>In Jupyter, we can pull up quick documentation for *any* Python function or object by adding `?` after its name.
Try it out: What parameters does `px.scatter` accept?

So we're still not seeing much of a clear trend here, bummer. 

There are, however, quite a few outliers in the price. Let's see if we can adjust our graph so the rest of the data isn't squished down.

Peep the histogram on the right: it shows a pretty neat trend across the room types. We can check that out in more depth later.

### Quick aesthetics

Those outliers were causing us a bit of trouble, but weren't too hard to deal with. 
<br>But that does make me a bit curious: What was so special about those listings?

The power of plotly is that we can use the interactivity to literally just hover over the data points to see what's going on. 
<br>All we have to do is suggest what features to display: 

See if you can find out which parameters can be used to show text on hover: 

Plotly is interactive! Play around with the legends and plot area

Double-click a legend entry on the right, and plotly will automatically update the figure to show only those points. If we want 

We can change our colors fairly easily using color scales.

<br>If the feature we pass to `color=` is **discrete or categorical**, we'll add the `color_discrete_sequence` param
* Documentation for what's accepted: https://plotly.com/python/discrete-color/

<br>If the feature is instead **continuous**, we'll use the `color_continuous_scale` param instead
* The corresponding docs for continuous: https://plotly.com/python/colorscales/
* The link to the color scale options is a bit hidden:  https://plotly.com/python/builtin-colorscales/

<br> Open the docs, and try out your favorite below:

Under the hood, we can see that each of these sequences is just a list of colors, so we could subset them to use different values.

To finish off, we can add **titles**, **labels** and such pretty easily.

See if you can use the function documentation or Google to figure out how to do that:

### Bar Plots & Histograms

We can check out other basic features of the dataset by constructing quick bar plots and histograms.

<br>In breakout groups, see if you can:
1. Build a **bar plot** to show average prices in each neighbourhood group, and sort them in a meaningful way
2. Create a **histogram** that shows the distribution of prices by room type

Make these complete! Label axes, hover text, other columns of data if you can. 
<br>**Hint**: You may need to use GroupBy functions from Week 2 to aggregate data more comfortably

In [17]:
# 2 


## Base API & Layers

Aside from scatter and bar plots, there's quite a lot else we can make.
<br> Check it out here: https://plotly.com/python/

As you'll see, some of the documentation uses `graph_objects` as `go`, or `figure_factory` as `ff`. What does that mean? 

`Plotly` was originally written using a base API, in which each "layer" was constructed individually. 
<br>Before, you'd have to write out what type of object was being drawn, the exact data to be passed as a list, etc.

`Plotly express` "abstracted" all this away. It created quick functions that generate the necessary layers in the backend, saving us quite a bit of time.

<br> We can still use some of these features to add more customization to our graph. For example, with our previous barplot:

In [20]:
# Add Figure Annotations w/ additional layers