# Interactive Plotting

To get started, please install `ipywidgets`. You can do so using:

```
conda install ipywidgets
```


## Getting Started

Let's import the usual suspects: `pandas`, `matplotlib`, `numpy`, and `ipywidgets`.


In [None]:
import ipywidgets as widgets
from ipywidgets import interact, interact_manual

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


## Example 1

Let's first interact with a dataframe. We'll use the Twitter dataset for this. 

In [None]:
docs = pd.read_csv("twitter.csv")
docs.head()


Let's make a function that randomly samples a tweet given the Topic and Sentiment.

In [None]:
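def get_tweet(df, topic, sentiment):
    ## Keep only the tweets that match our conditions
    matches = df.loc[(df["Sentiment"]==sentiment) & (df["Topic"]==topic),"TweetText"]
    ## Guard against combinations with no tweets (sampling from nothing raises an error)
    if matches.empty:
        print("No tweets found for this topic and sentiment.")
        return
    ## Sample one row and get the tweet text out of it
    tweet = matches.sample(n=1).iloc[0]
    ## Print the tweet
    print(tweet)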
    
get_tweet(docs, "apple","positive")
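
To see the filtering step of `get_tweet()` in isolation, here is a minimal sketch on a tiny, made-up dataframe. The rows in `toy` are invented for illustration and are not from `twitter.csv`; only the column names (`Topic`, `Sentiment`, `TweetText`) are taken from the code above.

```python
import pandas as pd

# Made-up rows for illustration only (not from twitter.csv)
toy = pd.DataFrame({
    "Topic": ["apple", "apple", "google", "apple"],
    "Sentiment": ["positive", "negative", "positive", "positive"],
    "TweetText": ["love my new phone", "battery died again",
                  "search is fast", "great update"],
})

# Each comparison yields a True/False Series; & keeps rows where both are True
mask = (toy["Sentiment"] == "positive") & (toy["Topic"] == "apple")
matches = toy.loc[mask, "TweetText"]

# sample(n=1) returns a one-element Series; random_state makes the draw repeatable
picked = matches.sample(n=1, random_state=0).iloc[0]
print(picked)
```

The parentheses around each comparison are required because `&` binds more tightly than `==` in Python.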

Now, let's make this function interactive! First, we need to know the possible options for sentiment and topic. We can get these from the dataframe itself.

In [None]:
sentiment_options = docs["Sentiment"].unique()
topic_options = docs["Topic"].unique()

Now, let's make a new function that is interactive and calls our `get_tweet()` function. The user should choose a sentiment and a topic, each from a list of options. The `@interact` decorator makes this very easy: placed on the line above a function definition, it turns that function into an interactive one. Decorator syntax is what we call "syntactic sugar" because it makes our code look sweet. A helpful guide can be found here: https://towardsdatascience.com/interactive-controls-for-jupyter-notebooks-f5c94829aee6

Check out the example below. When you run this block of code, it should give you a set of interactive widgets below.

In [None]:
@interact
def sample_tweets(chosen_sentiment = sentiment_options,
                  chosen_topic = topic_options):
    get_tweet(docs, chosen_topic, chosen_sentiment)
    

Let's break down what is happening. The `@interact` decorator tells ipywidgets to treat the function below it as interactive. When an argument in your function definition is given a list of values as its default, `@interact` treats that list as a set of options for the user (for example: `chosen_sentiment = sentiment_options`, where `sentiment_options` holds the unique sentiment values). Every time the user changes an option, the function runs again with the newly selected values.

Once the user selects values for `chosen_sentiment` and `chosen_topic`, our function simply calls the `get_tweet()` function that we defined above. It passes the `docs` dataframe with our tweets in it and the chosen values for sentiment and topic. Then, `get_tweet()` handles the rest. 
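
The mapping from argument defaults to controls can be inspected directly. The sketch below uses `interactive`, which builds the same controls as `@interact` but returns them instead of displaying them. The function `describe` and its arguments are hypothetical and exist only for this demonstration.

```python
import ipywidgets as widgets
from ipywidgets import interactive

def describe(choice, count, flag):
    # Hypothetical function used only to demonstrate the argument-to-widget mapping
    print(choice, count, flag)

# interactive() creates the controls and returns the container holding them
ui = interactive(describe,
                 choice=["a", "b", "c"],   # list of values -> dropdown
                 count=(0, 10, 1),         # (min, max, step) integers -> integer slider
                 flag=False)               # boolean -> checkbox

# List the type of each control that was generated
for child in ui.children:
    print(type(child).__name__)
```

In recent ipywidgets versions this should list a dropdown, an integer slider, a checkbox, and an output area, in that order.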

## Example 2

Sweet. Let's try it with plotting now. We can adjust all of the parameters of our plots this way. Let's try this with `iris.csv` since it makes life so easy.

In [None]:
iris = pd.read_csv("iris.csv")

In [None]:
def plot_iris(df, x, y, versicolor_color, virginica_color, setosa_color, xlim=None, ylim=None):
    ## Start a new figure
    fig, ax = plt.subplots()
    ## Define a dictionary that maps species to the chosen color
    colors = {'versicolor':versicolor_color, 'virginica':virginica_color, 'setosa':setosa_color}
    ## Plot the scatterplot of x variable, y variable, and species by color
    ax.scatter(df[x], df[y], c=df['Species'].apply(lambda species: colors[species]))
    ## Set the X and Y axis limits
    if xlim is not None:
        ax.set_xlim(xlim[0], xlim[1])
    if ylim is not None:
        ax.set_ylim(ylim[0], ylim[1])
    ## Show the plot
    plt.show()
    
plot_iris(iris,"SepalLength","SepalWidth","red","green","blue",(0,10),(0,10))

The function above, `plot_iris`, allows the user to specify which variables from the iris dataset will be plotted, the colors for each species, and the x and y limits for the axes. X and Y limits should be specified as tuples (xmin,xmax). Colors must be valid Matplotlib colors. Matplotlib includes many named colors, such as "red" and "purple".
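
If you are unsure whether a color name will be accepted, Matplotlib can check it for you. This is a small sketch using `is_color_like` and the `CSS4_COLORS` dictionary from `matplotlib.colors`; the candidate strings are arbitrary examples.

```python
from matplotlib.colors import CSS4_COLORS, is_color_like

# is_color_like() reports whether Matplotlib can interpret a value as a color
for candidate in ["red", "#1f77b4", "not-a-color"]:
    print(candidate, is_color_like(candidate))

# CSS4_COLORS maps each named color to its hex code
print(len(CSS4_COLORS), "named colors, for example:", sorted(CSS4_COLORS)[:5])
```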

The code below should launch an interactive version of the function above.

In [None]:
xlim_slider = widgets.FloatRangeSlider(min=0,max=10,step=0.5)
ylim_slider = widgets.FloatRangeSlider(min=0,max=10,step=0.5)

@interact
def interactive_iris(x=["SepalLength","SepalWidth","PetalLength","PetalWidth"],
                     y=["SepalLength","SepalWidth","PetalLength","PetalWidth"],
                     versicolor_color=["red","green","blue","yellow","orange","purple","black"],
                     virginica_color=["red","green","blue","yellow","orange","purple","black"],
                     setosa_color=["red","green","blue","yellow","orange","purple","black"],
                     xlim=xlim_slider,
                     ylim=ylim_slider):
    plot_iris(iris,x,y,versicolor_color,virginica_color,setosa_color,xlim,ylim)
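
The range sliders hand `plot_iris` a `(low, high)` tuple, which matches the tuple format the function expects for `xlim` and `ylim`. Here is a minimal sketch, assuming you would like a slider to start at the full range instead of the widget's default starting position. The name `xlim_full` and the `description` label are illustrative choices, not part of the code above.

```python
import ipywidgets as widgets

# value=(low, high) sets the initial handle positions explicitly
xlim_full = widgets.FloatRangeSlider(value=(0, 10), min=0, max=10, step=0.5,
                                     description="x limits")

# The slider's value is a (low, high) pair, ready to unpack
low, high = xlim_full.value
print(low, high)
```

Passing `xlim=xlim_full` in the `@interact` function definition would then use this slider in place of `xlim_slider`.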


## Example 3

It turns out, we can write interactive functions to control all sorts of things in Python, not just plots! So, let's write a function that uses Random Forests to classify irises. The user can select some model parameters and then see how the performance is affected.

First, let's plot the true classifications for reference.

In [None]:
iris_test = pd.read_csv("iris_test.csv")

plot_iris(iris_test,"PetalWidth","PetalLength","red","blue","orange",(0,3),(0,8))

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

features_selection = widgets.SelectMultiple(
    options=['PetalLength',"PetalWidth","SepalLength","SepalWidth"],
    value=['PetalLength',"PetalWidth"],
    description='Features',
    disabled=False
)

@interact
def estimate_random_forest(features=features_selection,
                           n_estimators=(1,10,1), 
                           max_depth=(1,10,1),  
                           max_samples=(1,50,5), 
                           max_leaf_nodes=(2,10,1),
                           xaxis=['PetalLength',"PetalWidth","SepalLength","SepalWidth"],
                           yaxis=['PetalWidth',"PetalLength","SepalLength","SepalWidth"],
                           show_true=False):
    ## Initialize a model with our chosen hyperparameters
    rf_model = RandomForestClassifier(n_estimators=n_estimators, 
                                      max_depth=max_depth, 
                                      max_samples=max_samples,
                                      max_leaf_nodes=max_leaf_nodes,
                                      random_state=1234, class_weight="balanced")
    ## Fit the model to the training data
    rf_model.fit(iris[list(features)], iris["Species"])
    ## Make predictions for the test data
    predictions = rf_model.predict(iris_test[list(features)])
    ## Make a new test set copy
    iris_predicted = iris_test.copy()
    ## Should we replace the true values with our predictions?
    if not show_true:
        iris_predicted["Species"] = predictions
    ## Plot the results
    plot_iris(iris_predicted,xaxis,yaxis,"red","blue","orange",None,None)
    


Every time you adjust a control, an entirely new random forest model is estimated. Because we pass `None` for `xlim` and `ylim`, `plot_iris` skips setting the axis limits, so Matplotlib scales the axes automatically to fit the variables you plot. You can select one or more features on which to train the random forest.
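
The plot shows the predicted classes, but it does not put a number on performance. One way to do that is to compare predictions against the true labels with scikit-learn's `accuracy_score`. This is a sketch under assumptions: it uses scikit-learn's bundled iris data and a fresh `train_test_split` so that it runs on its own, instead of `iris.csv` and `iris_test.csv`, and the hyperparameter values are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Bundled iris data stands in for iris.csv / iris_test.csv (an assumption)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1234, stratify=y)

# Arbitrary hyperparameters chosen for illustration
rf_model = RandomForestClassifier(n_estimators=5, max_depth=3, random_state=1234)
rf_model.fit(X_train, y_train)

# Fraction of test flowers whose predicted species matches the true species
accuracy = accuracy_score(y_test, rf_model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Inside `estimate_random_forest`, the same function could be called with `iris_test["Species"]` and `predictions`, and the result printed each time a control changes.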

## Problem 1

Alright. Your turn! Make a cool interactive plot or function. Exactly what this should look like is up to you. If you need some ideas, check out the following resources:

* Basic tutorial for @interact: https://towardsdatascience.com/interactive-controls-for-jupyter-notebooks-f5c94829aee6
* All available widget controls: https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20List.html#Selection-widgets
* Simple intro to ipywidgets: https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20Basics.html
* Examples of non-interactive plots (matplotlib) for inspiration: https://matplotlib.org/3.1.1/gallery/index.html