# Todo

## Figure 1 *make fig*

Properties of the statistical measures.

- [X] TOE LRT statistic is not $\chi^2$-distributed
- [X] $\nabla$ is sensitive to alignment length; $\delta_{\nabla}$ corrects for that
- [X] $\delta_{\nabla}$ relates to non-stationarity; it is proportional to $JSD$
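
For reference when checking the last point, here is a minimal sketch of how a Jensen-Shannon divergence could be computed. It assumes $JSD$ here means the Jensen-Shannon divergence between two nucleotide frequency distributions. The function names `shannon_entropy` and `jsd` are hypothetical and not part of `mdeq_analysis`, and this is not necessarily how the analysis code computes it.

```python
from math import log2


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


def jsd(p, q):
    """Jensen-Shannon divergence, in bits, between two discrete distributions.

    Hypothetical helper: computed as the entropy of the mean distribution
    minus the mean of the two entropies.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of states")
    mean = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mean) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# illustrative (made-up) nucleotide frequencies in the order T, C, A, G
uniform = [0.25, 0.25, 0.25, 0.25]
at_rich = [0.40, 0.10, 0.40, 0.10]
divergence = jsd(uniform, at_rich)
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (distributions with no shared states).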

In [1]:
from mdeq_analysis.plot import tabulate, mixed, util as plot_util
from project_paths import FIG_DIR, TABLE_DIR, RESULT_DIR, DATA_DIR

write_pdf = plot_util.pdf_writer()

In [2]:
pval_paths = list((RESULT_DIR / "micro/toe/fg-GSN-toe/").glob("*hi_hi*.tsv"))
nabla_path = (
    RESULT_DIR / "micro/convergence/toe-filtered-selected-convergence.sqlitedb"
)
align_path = DATA_DIR / "micro/filtered-selected.sqlitedb"
conv_paths = list((RESULT_DIR / "micro/convergence/fg-GSN-toe/").glob("*hi_hi*.sqlitedb"))

fig = mixed.make_mixed_properties(
    pval_paths=pval_paths,
    align_path=align_path,
    nabla_path=nabla_path,
    conv_paths=conv_paths,
)
# fig.show()
write_pdf(fig, FIG_DIR / "properties.pdf")

## Figure 2 *make fig*

Evidence for systematically elevated mutation disequilibrium in *Drosophila melanogaster* compared to *Drosophila simulans*

- [X] Smile plots (Dmel, Dsim)
- [ ] $\delta_{\nabla}$ genomic plots, or histograms (autosome, X)

In [3]:
drosophila = mixed.mixed_smiled_hist(ape=False)
# drosophila.show()
write_pdf(drosophila, FIG_DIR / "drosophila-smile-hist-plots.pdf")

## Figure 3 *make fig*

Majority of sampled genomic segments show mutation disequilibrium

- [X] Smile plots (intron, cds)
- [ ] $\delta_{\nabla}$ histogram between intron/cds

In [4]:
ape = mixed.mixed_smiled_hist(ape=True)
# ape.show()
write_pdf(ape, FIG_DIR / "ape-smile-hist-plots.pdf")

## Table 1

The pseudoautosomal region (PAR) shows greater mutation disequilibrium

- [X] produce stats

In [None]:
table = tabulate.fxy_table(DATA_DIR / "fxy", RESULT_DIR / "fxy")
# table

In [None]:
table.title = r"The magnitude of mutation disequilibrium is higher in the region of the \emph{Fxy} gene within the PAR."
table.legend = r"Intron rank 2 remains X-linked, while ranks 4--6 are within the PAR. $\hat{p}$-value and $\hat\delta_{\nabla}$ are from a TOE in which \emph{M. musculus} was treated as the foreground edge; $\hat\sigma_\nabla$ is the estimated standard deviation of $\nabla$ from the null distribution; length is that of the sampled \emph{M. musculus} intron sequence."

latex = table.to_latex(label="tab:fxy")
outfile = TABLE_DIR / "fxy.tex"
outfile.write_text(latex)