### Project checklist

- [ ] Title
- [ ] Abstract (max 300 words)
- [ ] env.yml (include both full and cross-platform)
  - _If time..._
    - [ ] Set up container environment to run Notebook (Binder?)
- [ ] Package motivations
- [ ] Include rich text (equations/tables/links/images/vids)
- **I/O**
  - [ ] Use `pandas` to read large data _or_ `numpy` to load from files
  - [ ] Save processed/generated data to disk with `pandas`
- **DATA MANIPULATION**
  - [ ] Needs to include numerical operations (`numpy`, `scipy`, `pandas`) or data transformation (`pandas`)
- **VISUALIZATION**
  - [ ] Min. one composite plot (multi-panel or inset)
  - "[Publication ready figures](https://pubs.acs.org/doi/10.1021/jz500997e)"
    - [ ] The figs are 89 mm wide (single column) or 183 mm wide (double column)
    - [ ] The axes are labeled
    - [ ] The font sizes are sufficiently large
    - [ ] The figures are saved as ~~rasterized images (300 dpi) or~~ **vector art**
- [ ] [Repo Zenodo DOI](https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content)
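
The column widths in the checklist are given in millimetres, whereas `matplotlib` takes figure sizes in inches. A minimal sketch of the conversion follows; the helper names `mm_to_inches` and `figsize` and the 3:2 aspect ratio are assumptions made for illustration, not journal requirements.

```python
MM_PER_INCH = 25.4  # exact by definition

# Widths taken from the checklist above.
SINGLE_COLUMN_MM = 89
DOUBLE_COLUMN_MM = 183


def mm_to_inches(mm: float) -> float:
    """Convert a length in millimetres to inches."""
    return mm / MM_PER_INCH


def figsize(width_mm: float, aspect: float = 1.5) -> tuple[float, float]:
    """Return (width, height) in inches; `aspect` is width / height (assumed 3:2)."""
    width = mm_to_inches(width_mm)
    return width, width / aspect


single = figsize(SINGLE_COLUMN_MM)
double = figsize(DOUBLE_COLUMN_MM)

print(f'single column: {single[0]:.2f} x {single[1]:.2f} in')
print(f'double column: {double[0]:.2f} x {double[1]:.2f} in')
```

The resulting tuple can be passed as the `figsize` argument of `matplotlib.pyplot.subplots`, and saving through `savefig` with a `.pdf` or `.svg` file name produces vector output.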

# Project title

## Abstract

Abstract text...

## Index

- [Notebook instructions](#notebook-instructions)
- [Packages](#packages)
  - [Package](#package)
  - [Imports](#imports)
- [Functions](#functions)
- [Analysis](#analysis)

## Notebook instructions

_Information on how to use/run the notebook_.

## Packages

### Package

_Reason for inclusion_.

### Imports

In [199]:
import sqlite3 as sql
from pathlib import Path

import ipywidgets as widgets
import pandas as pd
from pandas import DataFrame

## Functions

In [200]:
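def sql_to_df(db_file: Path, query: str = 'SELECT * FROM MS2data') -> DataFrame:
    """Run `query` against the SQLite database at `db_file` and return the result."""
    con = sql.connect(db_file)
    try:
        return pd.read_sql(query, con)
    finally:
        # Close the connection even if the query fails.
        con.close()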
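
To try `sql_to_df` without the real data, a throwaway SQLite file containing an `MS2data` table can be built with the standard library alone. This is only a sketch: the table name and the `XL` column come from this notebook, while the `score` column and the row values are invented placeholders.

```python
import sqlite3
import tempfile
from pathlib import Path

# Placeholder rows; `score` is a hypothetical column used only for this example.
rows = [('XL_a', 0.91), ('XL_b', 0.47)]

tmp_dir = Path(tempfile.mkdtemp())
db_file = tmp_dir / 'ms2_results.sql'

# Create the table and insert the placeholder rows.
con = sqlite3.connect(db_file)
try:
    con.execute('CREATE TABLE MS2data (XL TEXT, score REAL)')
    con.executemany('INSERT INTO MS2data VALUES (?, ?)', rows)
    con.commit()
finally:
    con.close()

# Read the table back with the same default query that `sql_to_df` uses.
con = sqlite3.connect(db_file)
try:
    fetched = con.execute('SELECT * FROM MS2data').fetchall()
finally:
    con.close()

print(fetched)
```

Calling `sql_to_df(db_file)` on this file should then return a two-row `DataFrame` with the columns `XL` and `score`.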

## Analysis

Bla bla bla, antigen of interest, epitope, receiver, etc. **Select the desired antigen domain to analyze in the dropdown menu below**.

In [209]:
# Display dropdown to enable data subset selection.
data_subset = widgets.Dropdown(options=['m1-b_rep', 'm1-c_rep'],
                               value='m1-b_rep',
                               description='Receiver:',
                               disabled=False)
display(data_subset)

Dropdown(description='Receiver:', options=('m1-b_rep', 'm1-c_rep'), value='m1-b_rep')

In [206]:
data_subset.value

'm1-b_rep'

In [64]:
# Map each directory in data/ to its sub-directories:
# {name: {sub_name: {'dir': path}}}.
base_dir = Path('data').resolve()

sample_dicts = {k.name: {p.name: {'dir': p}
                         for p in k.resolve().iterdir()}
                for k in base_dir.iterdir()}
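
The shape of `sample_dicts` is easiest to see on a small directory tree. The sketch below builds one in a temporary location and applies the same comprehension; the directory names `rec_a`, `rec_b`, `lig_1` and `lig_2` are made up, and `sorted` is added only to make the order reproducible.

```python
import tempfile
from pathlib import Path

# Build a throwaway two-level tree (all names are placeholders).
base_dir = Path(tempfile.mkdtemp()).resolve()
layout = {'rec_a': ['lig_1', 'lig_2'], 'rec_b': ['lig_1']}
for top, subs in layout.items():
    for sub in subs:
        (base_dir / top / sub).mkdir(parents=True)

# Same comprehension shape as in the notebook cell, with a sorted iteration order.
sample_dicts = {k.name: {p.name: {'dir': p} for p in sorted(k.iterdir())}
                for k in sorted(base_dir.iterdir())}

for top, subs in sample_dicts.items():
    print(top, list(subs))
```

Each leaf is a dictionary with a single `'dir'` key, which leaves room for further per-sample entries later.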

In [117]:
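# Load every sample database and stack the tables into one frame;
# the dict keys become the two outer index levels.
df = pd.concat({k: pd.concat({a: sql_to_df(db_file=b['dir'] / 'ms2_results.sql')
                              for a, b in v.items()},
                             axis=0)
                for k, v in sample_dicts.items()}, axis=0)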

In [155]:
for ligand in df.index.levels[1]:

    print(ligand)

    print(df.loc[('m1-c_rep',), 'XL'][ligand].size)

LUIGSeq_m11_ctrl-igs01_r0
31
LUIGSeq_m11_ctrl-pls01_r0
25
LUIGSeq_m11_ctrl-tail01_r0
39
LUIGSeq_m11_ctrl-tail02_r0
32
LUIGSeq_m11_ctrl-tail03_r0
31
LUIGSeq_m11_top01_r0
20
LUIGSeq_m11_top02_r0
20
LUIGSeq_m11_top03_r0
12
LUIGSeq_m11_top04_r0
14
LUIGSeq_m11_top05_r0
26


In [58]:
# Take the fourth sample directory listed for 'm1-c_rep'.
d = list(sample_dicts['m1-c_rep'].values())[3]['dir'] / 'top_xls.txt'

top_xls = !cat {d}

df.query(f'XL in {top_xls}')
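
The `!cat` shell escape runs only inside IPython and on systems that provide `cat`. A portable sketch using `pathlib` follows. It assumes that `top_xls.txt` holds one `XL` value per line, which this notebook does not show, and the file contents below are placeholders.

```python
import tempfile
from pathlib import Path

# Stand-in for a real `top_xls.txt`; the contents are placeholders.
d = Path(tempfile.mkdtemp()) / 'top_xls.txt'
d.write_text('XL_a\nXL_b\n\nXL_c\n')

# Read the file, strip surrounding whitespace and skip empty lines.
top_xls = [line.strip() for line in d.read_text().splitlines() if line.strip()]

print(top_xls)
```

The list can then be referenced from the query as `df.query('XL in @top_xls')`, which avoids formatting it into the query string.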

In [93]:
xl_dict = {}

# Print every sample directory, grouped by its parent directory name.
for k, v in sample_dicts.items():

    for entry in v.values():

        print(k, entry['dir'])

m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_ctrl-tail03_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_top04_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_ctrl-tail02_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_ctrl-tail01_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_top05_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_ctrl-igs01_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_top01_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_ctrl-pls01_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_top02_r0
m1-c_rep /home/jstrobaek/Projects/2023-compute_jupyter_course/data/m1-c_rep/LUIGSeq_m11_top03_r0
m1