# Cufflinks

Let's not linger on the intro: `cufflinks` adds an `.iplot()` method to `pandas` DataFrames, which works much like the familiar `.plot()` but produces interactive Plotly charts.

In [None]:
!pip install cufflinks --no-dependencies

## Small thing

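The current version of `cufflinks` is broken out of the box: it still imports `plotly.plotly`, a module that has moved to `chart_studio.plotly`. I still want to show it to you, because it's a great tool and should be fixed in the near future. For now we can patch the installed files like this: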

In [17]:
import os, shutil

# Ask pipenv where the virtual environment lives (IPython shell syntax)
venv = !pipenv --venv

# Candidate locations of the cufflinks package inside the virtual environment.
# This might not always be where it is; if neither folder exists,
# find your cufflinks installation and put the folder path in here.
paths = ["lib/python3.7/site-packages/cufflinks/", "lib/site-packages/cufflinks/"]
base_dir = [os.path.join(venv[0], path) for path in paths if os.path.isdir(os.path.join(venv[0], path))][0]

# Rewrite each affected file, swapping the old import path for the new one
for tainted_file in ['plotlytools.py', 'tools.py', "__init__.py"]:
    file_in = os.path.join(base_dir, tainted_file)
    file_out = os.path.join(base_dir, ".tmp.py")
    with open(file_in, "rt") as fin:
        with open(file_out, "wt") as fout:
            for line in fin:
                fout.write(line.replace('plotly.plotly', 'chart_studio.plotly'))
    # Replace the original file with the patched copy
    shutil.move(file_out, file_in)
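
If neither of the hard-coded paths exists on your machine, one way to find the folder is to ask Python's import machinery. This is only a sketch and not part of the original fix: `package_dir` is a hypothetical helper name, and it relies on `importlib.util.find_spec`, which locates a top-level package without executing it (handy here, since importing the unpatched `cufflinks` would likely fail). It assumes the notebook kernel runs inside the same virtual environment.

```python
import importlib.util
import os


def package_dir(name):
    """Return the folder of an installed top-level package, or None if not found.

    Hypothetical helper for illustration; the package is located, not imported.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__.py; we want its folder
    return os.path.dirname(spec.origin)


base_dir = package_dir("cufflinks")
print(base_dir)  # None means cufflinks is not installed in this environment
```

If this prints a folder, it can be used as `base_dir` in the patch loop above in place of the hard-coded guesses.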

## Back to Cufflinks

It's quite useful to keep the `cufflinks` guide nearby, as the API leaves a lot of room for failure: https://plot.ly/ipython-notebooks/cufflinks

The usual approach is to pick the `kind` of plot first, then fill in the fields that kind requires. Cufflinks will throw errors at you if required fields are missing, but often it simply returns something unexpected, which you can tweak by specifying further arguments.

In [2]:
import cufflinks as cf
cf.go_offline()

In [37]:
import pandas as pd
temperatures = pd.read_csv("../data/global_temperatures/GlobalLandTemperaturesByCountry.csv", parse_dates=['dt'])
continents = pd.read_csv("../data/continents.csv")
countries = pd.read_csv("../data/countries.csv")
countries = countries.drop(columns=['code'])  # drop() returns a new DataFrame, so keep the result
gdp = pd.read_csv("../data/2014_world_gdp_with_codes.csv")
temperatures = temperatures.merge(continents).merge(countries, left_on="Country", right_on="country").merge(gdp, left_on="Country", right_on="COUNTRY")
# 'dt' is the parsed date column; the second .dt is the pandas datetime accessor
temperatures['year'] = temperatures['dt'].dt.year
temperatures['month'] = temperatures['dt'].dt.month_name()
temperatures['m'] = temperatures['dt'].dt.month
yearly_change = temperatures[(temperatures.year==1963) | (temperatures.year==2013)].groupby(["Country","year"], as_index=False).AverageTemperature.mean()
yearly_change['AverageTemperatureChange'] = yearly_change.groupby(["Country"], as_index=False).AverageTemperature.transform("diff")
yearly_change.dropna(inplace=True)
temperature_slice=yearly_change.merge(temperatures[["Country","Code","lon","lat"]].drop_duplicates())
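
The `AverageTemperatureChange` column comes from a per-country `diff` between the two selected years. Here is a small sketch of that step on made-up numbers (the `toy` frame and its values are invented purely for illustration, not taken from the dataset):

```python
import pandas as pd

# Invented example data: two countries, two years each
toy = pd.DataFrame({
    "Country": ["A", "A", "B", "B"],
    "year": [1963, 2013, 1963, 2013],
    "AverageTemperature": [10.0, 11.5, 20.0, 20.5],
})

# Within each country, subtract the previous row's value (2013 minus 1963).
# The first row of every group has nothing to subtract, so it becomes NaN.
toy["AverageTemperatureChange"] = (
    toy.groupby("Country").AverageTemperature.transform("diff")
)

# Dropping the NaN rows leaves one row per country, holding the change
toy = toy.dropna()
print(toy)
```

This relies on the rows being ordered by year within each country, which the `groupby(["Country","year"])` in the cell above takes care of.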

# One more map

Cufflinks binds `.iplot()` to your `pandas` DataFrames. It's a trade-off between the well-documented API of Plotly and the convenience of `cufflinks`. Notice that we don't need to worry about the scaffolding to prop up the `go.Figure`; this is handled automagically by `cufflinks`. If you want to get in and customize the figure, you can specify `asFigure=True` and `cufflinks` will return the figure object instead of displaying it.

In [38]:
temperature_slice.iplot(
    kind="choropleth",
    locations="Code",
    z="AverageTemperatureChange",
    colorscale="-RdBu"
)
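
To make the `asFigure=True` route concrete, here is a minimal sketch. The helper name `choropleth_figure` and the title text are made up for illustration; the `iplot` arguments are the ones from the cell above, and it assumes the returned object is a regular Plotly figure whose `layout` can be edited.

```python
def choropleth_figure(df, title):
    """Build the choropleth with cufflinks, then tweak the returned figure.

    Hypothetical helper for illustration only.
    """
    # Imported here so the helper can be defined even before cufflinks is
    # patched; importing cufflinks is what adds .iplot() to DataFrames.
    import cufflinks as cf  # noqa: F401

    fig = df.iplot(
        kind="choropleth",
        locations="Code",
        z="AverageTemperatureChange",
        colorscale="-RdBu",
        asFigure=True,  # return the figure instead of displaying it
    )
    # From here on it is plain Plotly: edit the layout as you like
    fig.layout.title = title
    return fig


# Usage in the notebook (needs a working cufflinks and the data loaded above):
# fig = choropleth_figure(temperature_slice, "Change in average temperature, 1963 to 2013")
# fig
```

The design idea is to let `cufflinks` do the scaffolding and only drop down to the Plotly API for the details it does not expose.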

# Rapidfire Cufflinks Exercises

`df.iplot()` can replace `df.plot()` in your workflow. For the same effort you can get nicer plots. Ideal for first looks and speedy analysis.

If you are stuck on where to start, try just calling `.iplot(kind=...)` with the right kind specified. Depending on the type of plot, you may need to restrict the columns passed in: `df[['col1','col2']].iplot()` 

# EXERCISE

- Make a bar graph of `temperature_slice` showing the Average Temperature Change for each Country
- Give it a good title and colour

In [39]:
# Solution
temperature_slice.iplot(kind="bar", x="Country", y="AverageTemperatureChange", title="Temperature Changes")

In [None]:
# Your Solution

temperature_slice.iplot()

# EXERCISE

- Stay with `temperature_slice`, plot two histograms in one plot:
    - One showing the distribution of Average Temperature
    - One showing the distribution of the change in Average Temperature
    - Try turning the *traces* on and off to view the (very) different histograms

In [40]:
# Solution
temperature_slice[['AverageTemperature','AverageTemperatureChange']].iplot(kind="histogram", bins=25, title="Temperature Changes")

In [None]:
# Your Solution



# EXERCISE

Let's go back to the *big* dataset, `temperatures`.

- Replicate the Heatmap we made earlier in `seaborn` showing temperature by month and year

In [52]:
# Solution

temperatures.pivot_table(
    columns="m",
    index="year",
    values="AverageTemperature"
).iplot(
    kind="heatmap",
    colorscale="-RdBu",
    text="m",
    title = "Quick Heatmap"
)
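
If the heatmap looks odd, it helps to inspect the pivoted table before plotting it. A small sketch on invented numbers (the `toy` frame is made up for illustration) shows what `pivot_table` hands over to `.iplot()`: one row per year, one column per month, and the mean of all matching readings in each cell.

```python
import pandas as pd

# Invented example data: several readings per (year, month) pair
toy = pd.DataFrame({
    "year": [2012, 2012, 2013, 2013, 2013],
    "m": [1, 2, 1, 2, 2],
    "AverageTemperature": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# pivot_table aggregates duplicates with the mean by default
grid = toy.pivot_table(columns="m", index="year", values="AverageTemperature")
print(grid)
```

In the real data every country contributes a reading per month, so each cell of the heatmap is an average across countries.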

In [51]:
# Your Solution

temperatures.pivot_table(
    columns="m",
    index="year",
    values="AverageTemperature"
).iplot()

# EXERCISE



In [50]:
temperatures.T
