# Scatter Plots
A scatter plot (also called a scatterplot, scatter graph, scatter chart, scattergram, or scatter diagram) is a type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data. If the points are coded (color/shape/size), one additional variable can be displayed. The data are displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis. [Scatter Plot on Wikipedia](https://en.wikipedia.org/wiki/Scatter_plot)

* Scatter plots (also called X-Y plots) are used to examine the relationship between two variables.
* The x-axis is used to measure one event (or variable) and the y-axis is used to measure the other.
* If both variables increase at the same time, they have a positive relationship. If one variable decreases while the other increases, they have a negative relationship. Sometimes the variables don't follow any pattern and have no relationship. [Source](https://nces.ed.gov/nceskids/help/user_guide/graph/scatter.asp)
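The three cases above (positive, negative, no relationship) can be checked numerically with the Pearson correlation coefficient. The following is a minimal sketch on synthetic data; the slopes, noise level, and seed are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = np.linspace(0, 10, 200)
noise = rng.normal(scale=1.0, size=x.size)

samples = {
    'positive': 2 * x + noise,           # y rises as x rises
    'negative': -2 * x + noise,          # y falls as x rises
    'none': rng.normal(size=x.size),     # y is generated independently of x
}

# Pearson r: the sign gives the direction, the magnitude gives the strength
r_values = {label: np.corrcoef(x, y)[0, 1] for label, y in samples.items()}

for label, r in r_values.items():
    print(f"{label:>8}: r = {r:+.2f}")
```

Values of r near +1 or -1 indicate a strong linear relationship, while values near 0 indicate little or no linear relationship.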


* Scatter plots are used to show the relationship between pairs of quantitative measurements made for the same object or individual.
* For example, a scatter plot could be used to present information about the examination and coursework marks for each of the students in a class. 
* In a scatter plot a dot represents each individual or object (a student in this case) and is located with reference to the x-axis and y-axis, each of which represents one of the two measurements.
* By analysing the pattern of dots that make up a scatter plot it is possible to identify whether there is any systematic relationship between the two measurements. A pattern on its own shows an association and does not establish that one measurement causes the other.
* Regression lines can also be added to the graph and used to decide whether the relationship between the two sets of measurements can be explained or if it is due to chance. [Source](https://www.le.ac.uk/oerresources/ssds/numeracyskills/page_35.htm)

## Scatter plot with Plotly Express

In [None]:
# x and y given as array_like objects
import plotly.express as px
fig = px.scatter(x=[0, 1, 2, 3, 4], y=[0, 1, 4, 9, 16])
fig.show()

In [None]:
# x and y given as DataFrame columns
import plotly.express as px
df = px.data.iris() # iris is a pandas DataFrame
fig = px.scatter(df, x="sepal_width", y="sepal_length")
fig.show()
#df.head()

In [None]:
df['species'].unique()

## Set size and color with column names
Note that *color* and *size* data are added to hover information. You can add other columns to hover data with the *hover_data* argument of *px.scatter*.

In [None]:
import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
                 size='petal_length', hover_data=['petal_width'])
fig.show()

## Scatter plot with go.Scatter (Graph Objects)
We have seen how to generate Scatter and Line plots together in the Line Plots notebook!

We can customize the Scatter object by setting the *mode* option. Possible values include '*lines*', '*markers*' and '*lines+markers*'.

In [None]:
import plotly.graph_objects as go
import numpy as np

N = 100  # number of sample points
t = np.linspace(0, 10, N)
y = np.sin(t)

fig = go.Figure(data=go.Scatter(x=t, y=y, mode='markers'))

fig.show()

## Bubble Scatter Plots

In bubble charts, a third dimension of the data is shown through the size of markers (more in the Bubble Charts notebook).

In [None]:
import plotly.graph_objects as go

fig = go.Figure(data=go.Scatter(
    x=[1, 2, 3, 4],
    y=[10, 11, 12, 13],
    mode='markers',
    marker=dict(size=[40, 60, 80, 100],
                color=[0, 1, 2, 3])
))

fig.show()

## Style Scatter Plots

You will see that customizing Plotly plots is fun and easy!

In [None]:
import plotly.graph_objects as go
import numpy as np


t = np.linspace(0, 10, 100)

fig = go.Figure()

fig.add_trace(go.Scatter(
    x=t, y=np.sin(t),
    name='sin',
    mode='markers',
    marker_color='rgba(152, 0, 0, .8)'
))

fig.add_trace(go.Scatter(
    x=t, y=np.cos(t),
    name='cos',
    marker_color='rgba(255, 182, 193, .9)'
))

# Set options common to all traces with fig.update_traces
fig.update_traces(mode='markers', marker_line_width=2, marker_size=10)
fig.update_layout(title='Styled Scatter',
                  yaxis_zeroline=False, xaxis_zeroline=False)


fig.show()

## Data Labels on Hover
Use the *text* argument of *go.Scatter* to set the label shown when hovering over each marker.

In [None]:
import plotly.graph_objects as go
import pandas as pd

data = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/2014_usa_states.csv")

fig = go.Figure(data=go.Scatter(x=data['Postal'],
                                y=data['Population'],
                                mode='markers',
                                marker_color=data['Population'],
                                text=data['State'])) # hover text goes here

fig.update_layout(title='Population of USA States')
fig.show()

## Scatter with a Color Dimension

In [None]:
import plotly.graph_objects as go
import numpy as np

fig = go.Figure(data=go.Scatter(
    y = np.random.randn(500),
    mode='markers',
    marker=dict(
        size=16,
        color=np.random.randn(500), #set color equal to a variable
        colorscale='Peach', # one of plotly colorscales
        showscale=True
    )
))

fig.show()

## Large Data Sets

In [None]:
%%time
import plotly.graph_objects as go
import numpy as np

N = 100000
# notice Scattergl not Scatter
fig = go.Figure(data=go.Scattergl(
    x = np.random.randn(N),
    y = np.random.randn(N),
    mode='markers',
    marker=dict(
        color=np.random.randn(N),
        colorscale='Viridis',
        line_width=1
    )
))

fig.show()

## 3D Scatter Plot

In [None]:
import plotly.graph_objects as go
import numpy as np
# Points along a helix: x and y trace a circle while z rises
z = np.linspace(0, 10, 50)
x = np.cos(z)
y = np.sin(z)

trace = go.Scatter3d(
    x=x, y=y, z=z,
    mode='markers',
    marker=dict(
        size=12,
        color=z,  # set color to an array/list of desired values
        colorscale='Viridis'
    )
)
fig = go.Figure(trace)
fig.show()
