<h2>Eukaryotic Cellulases</h2>

<p>To see the plot, <strong>run this notebook</strong> by clicking the <span title="Restart the kernel and run all cells"><u>double arrow symbol above</u></span> (it takes a few seconds to run).</p>


In [20]:
import panel as pn
import plotly.express as px
import pandas as pd
import ast

pn.extension('plotly')

# Load data
red_df = pd.read_csv('tSNE_cellulase.csv')
red_df['cellulase_list'] = red_df['cellulase_list'].apply(ast.literal_eval)

# Generate a color map that maps each unique clade to a specific color
clades = red_df['clade'].unique()
colors = px.colors.qualitative.D3  # You can choose any other palette
color_map = {clade: colors[i % len(colors)] for i, clade in enumerate(clades)}

# Define the function to update the plot
def update_plot(search_value_cellulases, search_value_species):
    filtered_df = red_df

    if search_value_cellulases:
        # Split the comma-separated query and drop empty terms (e.g. from a trailing comma)
        search_terms = [term.strip() for term in search_value_cellulases.split(',') if term.strip()]
        # Keep rows whose cellulase list contains at least one search term (exact match)
        filtered_df = filtered_df[filtered_df['cellulase_list'].apply(lambda x: any(term in x for term in search_terms))]

    
    if search_value_species:
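        # Case-insensitive plain substring match; regex=False keeps characters like '.' or '(' literal
        filtered_df = filtered_df[filtered_df['label'].str.contains(search_value_species, case=False, na=False, regex=False)]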

    fig = px.scatter(
        filtered_df, x='PC1', y='PC2',
        labels={'PC1': 'tSNE 1', 'PC2': 'tSNE 2'}, 
        hover_data={'clade': True, 'num_cellulases': True, 'cellulases': True, 'label': True, 'PC1': False, 'PC2': False},
        width=1000, height=1000,
        size='num_cellulases',
        color='clade',
        color_discrete_map=color_map  # Use the predefined color map
    )
    return fig

# Create interactive widgets
search_bar_cellulases = pn.widgets.TextInput(name='Search Cellulases', placeholder='Enter cellulases...')
search_bar_species = pn.widgets.TextInput(name='Search Species', placeholder='Enter species...')

# Bind the function and widgets
@pn.depends(search_bar_cellulases.param.value, search_bar_species.param.value)
def get_plot(search_value_cellulases, search_value_species):
    return update_plot(search_value_cellulases, search_value_species)

# Layout
layout = pn.Column(pn.Row(search_bar_cellulases, search_bar_species), get_plot)

print('done')

done


## tSNE Plot

* You can search for multiple cellulases by separating them with commas, e.g. `GH51, GH3`. This <u>filters out</u> species that have neither `GH51` nor `GH3`, so only species with at least one of them remain.
* You can search for species using a substring of the label, e.g. `Aspergillus`, or for a full species name using an underscore delimiter, e.g. `Nelumbo_nucifera`. The species search is case-insensitive.
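
The two search rules above can be tried out on a small made-up table. This is only a minimal sketch: the three rows and the helper name `filter_species` are hypothetical and stand in for the real `tSNE_cellulase.csv` data and the `update_plot` function. It assumes the same column names (`label`, `cellulase_list`) as the notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for tSNE_cellulase.csv
toy_df = pd.DataFrame({
    'label': ['Aspergillus_niger', 'Nelumbo_nucifera', 'Aspergillus_oryzae'],
    'cellulase_list': [['GH3', 'GH5'], ['GH51'], ['GH7']],
})

def filter_species(df, cellulase_query='', species_query=''):
    """Keep rows that have any listed cellulase and match the species substring."""
    result = df
    if cellulase_query:
        # Comma-separated terms; a row is kept if it has at least one of them
        terms = [t.strip() for t in cellulase_query.split(',') if t.strip()]
        result = result[result['cellulase_list'].apply(
            lambda families: any(term in families for term in terms))]
    if species_query:
        # Case-insensitive plain substring match on the label
        result = result[result['label'].str.contains(
            species_query, case=False, na=False, regex=False)]
    return result

cellulase_hits = filter_species(toy_df, cellulase_query='GH51, GH3')['label'].tolist()
species_hits = filter_species(toy_df, species_query='aspergillus')['label'].tolist()
both_hits = filter_species(toy_df, 'GH51, GH3', 'Aspergillus')['label'].tolist()
```

With this toy table, `cellulase_hits` contains `Aspergillus_niger` and `Nelumbo_nucifera`, `species_hits` contains the two *Aspergillus* rows, and `both_hits` contains only `Aspergillus_niger`, because the two search boxes are combined with AND.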

In [21]:
# Serve the Panel app
layout.servable()