# Selecting scattered data using plotly

##### Jose Caparica - caparicajr@gmail.com - Mar/2021

##### Roughly based on examples taken from:
- https://plotly.com/python/v3/selection-events/

- https://stackoverflow.com/questions/57006458/how-to-assign-color-value-using-plotly-based-on-column-value



##### Useful Ref:
- https://plotly.com/python/reference/scatter/

- https://plotly.com/python-api-reference/generated/plotly.graph_objects.Scatter.html

### Using the Iris dataset as an example


In [1]:
import plotly.express as px
df = px.data.iris()
#features = ["sepal_width", "sepal_length", "petal_width", "petal_length"]



### Importing the tools we need

In [2]:
import plotly.graph_objs as go
import pandas as pd
import numpy as np
from ipywidgets import interactive, HBox, VBox

# Silences pandas' recurring SettingWithCopyWarning (caused by chained assignment).
# This should probably be fixed at the source instead of being suppressed.
pd.set_option('mode.chained_assignment',None)

### Here's the actual code:

In [3]:
# Flag column: 0 = not selected, 1 = selected
df = df.assign(Selection=np.zeros(len(df), dtype='int8'))

f = go.FigureWidget([
    go.Scatter(
        x = df['petal_length'],
        y = df['petal_width'],
        mode = 'markers',
        marker = dict(color = df['Selection'].map({0:'red', 1:'green'})),
    )
])
scatter = f.data[0]
        

def update_axes(xaxis, yaxis, group):
    scatter = f.data[0]
    # Apply data, axis titles and colors together so the widget gets a single update
    with f.batch_update():
        scatter.x = df[xaxis]
        scatter.y = df[yaxis]
        f.layout.xaxis.title = xaxis
        f.layout.yaxis.title = yaxis
        scatter.marker.color = df[group].map({0:'red', 1:'green'})


        
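def selection_fn(trace, points, selector):
    # Rebuild the flag column from scratch: 1 for the selected points, 0 otherwise.
    # Writing into a NumPy array first avoids chained assignment (df['Selection'][i] = 1),
    # which triggers SettingWithCopyWarning and may not modify df at all.
    selected = np.zeros(len(df), dtype='int8')
    selected[points.point_inds] = 1
    df['Selection'] = selected
    f.update_traces(marker = dict(color = df['Selection'].map({0:'red', 1:'green'})))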



scatter.on_selection(selection_fn)

dropdowns = interactive(
    update_axes,
    yaxis = df.select_dtypes('float64').columns,
    xaxis = df.select_dtypes('float64').columns,
    group = df.select_dtypes('int8').columns,
)

VBox((HBox(dropdowns.children),f))



VBox(children=(HBox(children=(Dropdown(description='xaxis', options=('sepal_length', 'sepal_width', 'petal_len…
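
The core of the selection callback can be tried without a live widget. The sketch below is only an illustration: `flags_from_indices` is a hypothetical helper (not part of plotly or of this notebook), and the hard-coded `point_inds` list stands in for the `points.point_inds` value that plotly passes to the callback.

```python
import numpy as np
import pandas as pd

def flags_from_indices(n_rows, point_inds):
    """Return an int8 array with 1 at the selected positions and 0 elsewhere."""
    flags = np.zeros(n_rows, dtype="int8")
    flags[np.asarray(point_inds, dtype=int)] = 1
    return flags

# Hypothetical selection: the second and fourth points of a 5-row dataset
point_inds = [1, 3]
flags = flags_from_indices(5, point_inds)

# Same 0/1 -> color mapping used for the markers above
colors = pd.Series(flags).map({0: "red", 1: "green"})
print(colors.tolist())
```

An empty selection simply yields an all-zero array, so every marker goes back to red.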

#### If we want to take a snapshot of the selection and "save" it as a new column, we could use something like this:

In [6]:

df = df.assign( Class_1=df['Selection'] )


#### The result can be seen by just printing the dataframe

In [4]:
df

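| | sepal_length | sepal_width | petal_length | petal_width | species | species_id | Selection |
|---|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa | 1 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa | 1 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa | 1 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa | 1 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | virginica | 3 | 0 |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | virginica | 3 | 1 |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | virginica | 3 | 0 |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | virginica | 3 | 1 |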
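
Because `Selection` is a plain 0/1 flag, the selected rows can also be pulled out with a boolean mask instead of scanning the printed frame. A minimal sketch on a made-up toy frame (the values below are illustrative, not the notebook's actual `df`):

```python
import pandas as pd

# Toy stand-in for the notebook's df; values are made up for illustration
toy = pd.DataFrame({
    "petal_length": [1.4, 5.0, 5.2, 5.4],
    "species": ["setosa", "virginica", "virginica", "virginica"],
    "Selection": [0, 1, 0, 1],
})

# Keep only the rows flagged as selected
selected = toy[toy["Selection"] == 1]

# Count how many selected points fall in each species
counts = selected["species"].value_counts()
print(selected)
print(counts)
```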


#### Instead of the cell above, I would like to have a function like this:

In [4]:
def SaveClass(data=df, name=None):
    
    if name:
        name.replace(" ", "_")
    else:
        n = data.shape[1] - n_cols
        name = str("Class_") + str(n)
    data = data.assign( name=data['Selection'] )


#### But it's not working. I noticed that the `assign` method would require a different approach, since it takes the new column's name as a literal keyword... these dicts are making me crazy
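
One possible way around this, offered as a sketch rather than a definitive fix: `DataFrame.assign` accepts keyword arguments, so a column name held in a variable can be passed by unpacking a one-item dict. The version below also reassigns the result of `str.replace` (strings are immutable) and returns the new frame, since `assign` does not modify its input. The name `save_class`, the `source` and `prefix` parameters, and the rule for numbering unnamed snapshots (counting existing `Class_*` columns, in place of the undefined `n_cols`) are assumptions made for this example.

```python
import pandas as pd

def save_class(data, name=None, source="Selection", prefix="Class_"):
    """Return a copy of `data` with the `source` column stored under a new name."""
    if name:
        # str.replace returns a new string, so the result has to be reassigned
        name = name.replace(" ", "_")
    else:
        # Assumption: number the snapshot after the existing "Class_*" columns
        n = sum(str(col).startswith(prefix) for col in data.columns) + 1
        name = f"{prefix}{n}"
    # Unpacking a dict lets the column name come from a variable;
    # data.assign(name=...) would create a column literally called "name"
    return data.assign(**{name: data[source]})

# Toy frame standing in for the notebook's df
toy = pd.DataFrame({"petal_length": [1.4, 5.0, 5.4], "Selection": [0, 1, 1]})
toy = save_class(toy)                   # adds "Class_1"
toy = save_class(toy, name="my group")  # adds "my_group"
toy = save_class(toy)                   # adds "Class_2"
print(toy.columns.tolist())
```

In the notebook this would be called as `df = save_class(df)`, keeping the returned frame. Passing `data` explicitly also avoids the `data=df` default, which is bound once when the function is defined and would keep pointing at the old frame after `df` is reassigned.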
