## IDR comparison tool

### Part 1: Running a query
* To get started, execute the cell below. This will initialize a graphical user interface for the comparison tool.
* After it runs, you should see a play button labeled `Initialize IDR Analysis Tool`; click it to continue.

### Building your query
* You will see fields to:
  * (1) enter a COG ID to query
    * `COG0513`: Superfamily II DNA and RNA helicase (contains DeaD and RhlB)
    * `COG0532`: (contains IF-2)
    * `COG3827`: (contains PopZ)
    * `COG0206`: (contains FtsZ)
    * `COG0606`: (contains ComM)
  * (2) select the datasets you want to compare
    * `omrgcv2`: Data from 180 Tara Oceans samples (OM-RGC.v2)
    * `3300003177`: diffuse chimney talus (processed by data depositor)
    * `3300003178`: black smoker spire (processed by data depositor)
    * `3300003873`: diffuse chimney talus (processed by JGI standard pipeline)
    * `3300003872`: black smoker spire (processed by JGI standard pipeline)
  * (3) select the minimum IDR length for your analysis (optional)
  * (4) provide a list of the architecture types you want to consider (optional)
  * (5) provide temperature thresholds to define cold- and hot-sample only IDRs (Tara Oceans data only, optional)
    * If you do not provide a minimum and maximum temperature, all Tara Oceans data will be analyzed as one bin
  * (6) select the IDR properties you want to compare
    *  `Asphericity`: measure of spherical character (0 = spherical cloud, 1 = rod/elongated structure)
    *  `Radius_of_gyration`: radius of gyration
    *  `Radius_of_gyration_scaled`: length-normalized radius of gyration
    *  `End_to_end_distance`: end to end distance
    *  `End_to_end_distance_scaled`: length-normalized end-to-end distance
    *  `Scaling_exponent`: polymer scaling exponent
    *  `Prefactor`: size coefficient in scaling law
    *  `Kappa`: charge patterning metric
    *  `Length`: Number of amino acids in the IDR
    *  `FCR`: Fraction of charged residues
    *  `NCPR`: Net charge per residue
    *  `SHD`: Sequence hydropathy decoration
    *  `SCD`: Sequence charge decoration
    *  `Molecular_weight`: Molecular weight
    *  `F_Neg`: Fraction of negatively charged residues
    *  `F_Pos`: Fraction of positively charged residues
    *  `Hydrophobicity`: Mean of Kyte-Doolittle hydropathy scores per amino acid
    *  `Fraction_aromatic`: Fraction of aromatic residues
    *  `Fraction_aliphatic`: Fraction of aliphatic residues
    *  `Fraction_polar`: Fraction of polar residues
    *  `Complexity`: sequence complexity (high indicates a mixed sequence, low indicates repetitive, low-complexity regions)
* Once you have made your selections, click the second play button labeled `Run query and plot`
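
Several of the composition properties listed above follow directly from residue counts. The block below is a minimal sketch of how `F_Pos`, `F_Neg`, `FCR`, `NCPR`, and `Hydrophobicity` could be computed for one sequence. It is not the tool's own code: the function name `composition_summary` is hypothetical, and it assumes that only K and R count as positive and only D and E as negative (histidine excluded), which may differ from the convention used by the tool.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: histidine is treated as uncharged.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def composition_summary(seq):
    """Return simple composition metrics for one amino-acid sequence.

    Hypothetical helper for illustration; non-standard residue codes
    raise a KeyError when the hydropathy table is consulted.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    n_pos = sum(aa in POSITIVE for aa in seq)
    n_neg = sum(aa in NEGATIVE for aa in seq)
    return {
        "Length": n,
        "F_Pos": n_pos / n,                # fraction of positive residues
        "F_Neg": n_neg / n,                # fraction of negative residues
        "FCR": (n_pos + n_neg) / n,        # fraction of charged residues
        "NCPR": (n_pos - n_neg) / n,       # net charge per residue
        "Hydrophobicity": sum(KYTE_DOOLITTLE[aa] for aa in seq) / n,
    }


example = composition_summary("DEKRAG")
for name, value in example.items():
    print(f"{name}: {value}")
```

For the toy sequence `DEKRAG`, the two positive and two negative residues cancel, so `NCPR` is zero while `FCR` is 4/6.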

In [None]:
import os, sys, ipywidgets as widgets
from IPython.display import display, clear_output, HTML
import importlib
SCRIPT_DIR = os.path.abspath("python-scripts")
if SCRIPT_DIR not in sys.path:
    sys.path.append(SCRIPT_DIR)
import idr_analysis_setup
importlib.reload(idr_analysis_setup)
init_button = widgets.Button(
    description="▶ Initialize IDR Analysis Tool",
    button_style="primary",
    layout=widgets.Layout(width="280px")
)
init_output = widgets.Output()
def on_init_clicked(b):
    with init_output:
        clear_output()
        ui = idr_analysis_setup.create_idr_ui()
        display(ui)
init_button.on_click(on_init_clicked)
display(HTML("<h3>IDR Analysis Tool</h3>"))
display(init_button, init_output)

### Part 2: Saving the results
* After you have run at least one query, execute the code cell below to create a user interface for saving the results.
* Once you have chosen a file name, click `Save results as CSV`.
* Note: this step fails if you have not yet run a query or have not loaded the underlying code by running the code cell in Part 1.
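
Once a results file has been saved, it can be read back for further analysis outside the widget. The block below is a minimal sketch using only the Python standard library. Because the actual columns of the results table are not described here, it first writes a small stand-in file; the column names (`dataset`, `Length`, `FCR`), the values, and the helper `load_results` are made up for illustration.

```python
import csv
import os
import tempfile

# Stand-in for a file written by the save widget. The columns and
# values are hypothetical and only serve to make the sketch runnable.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "idr_query_results.csv")
with open(csv_path, "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["dataset", "Length", "FCR"])
    writer.writerow(["example_a", "42", "0.25"])
    writer.writerow(["example_b", "87", "0.31"])


def load_results(path):
    """Read a saved results CSV into a list of dicts (one per row)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


rows = load_results(csv_path)
print(f"{len(rows)} rows, columns: {list(rows[0].keys())}")

# csv.DictReader returns every field as a string, so numeric columns
# must be converted before doing arithmetic on them.
mean_length = sum(float(row["Length"]) for row in rows) / len(rows)
print(f"mean Length: {mean_length}")
```

To inspect a real export, point `load_results` at the file name chosen in the save widget instead of the stand-in path.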

In [None]:
import ipywidgets as widgets
from IPython.display import display, clear_output, FileLink

save_label = widgets.HTML(
    "<b>Save results:</b> After running an analysis above, "
    "save the latest results table as a CSV file."
)

filename_input = widgets.Text(
    description="Filename:",
    value="idr_query_results.csv",
    layout=widgets.Layout(width="320px")
)

save_button = widgets.Button(
    description="💾 Save results as CSV",
    button_style="",
    layout=widgets.Layout(width="220px")
)

save_output = widgets.Output()

def on_save_clicked(b):
    with save_output:
        clear_output()
        df = idr_analysis_setup.LAST_RESULTS

        if df is None:
            print("No results available. Please run an analysis first.")
            return
        if df.empty:
            print("The last results table is empty; nothing to save.")
            return

        # Fall back to the default name if the field is empty or only whitespace
        fname = (filename_input.value or "").strip() or "idr_query_results.csv"

        # keep it simple / safe
        if "/" in fname or "\\" in fname:
            print("Please use a simple filename without path separators.")
            return

        df.to_csv(fname, index=False)
        print(f"Saved results to '{fname}' in the current directory.")
        try:
            display(FileLink(fname))
        except Exception:
            pass  # if FileLink not available, they can grab it from the file browser

save_button.on_click(on_save_clicked)

display(widgets.VBox([
    save_label,
    widgets.HBox([filename_input, save_button]),
    save_output
]))