Plotting multiple samples
=========================

Contents:
- [Notebook set-up](#notebook-set-up)
- [Initializing MaP sample](#initializing-map-sample)
- [ShapeMapper QC](#shapemapper-qc)
- [Skyline Plots](#skyline-plots)
- [Linear Regression](#linear-regression)
- [Arc Plots](#arc-plots)
- [Secondary Structure](#secondary-structure)
- [3D Structures](#3d-structures)

Notebook set-up
---------------

In [None]:
# This sets plots to display in-line by default
%matplotlib inline

# Import module. For high-level functions, no additional modules are needed
import starmapper as MaP

# Creates an HTML button that hides/shows code cells
# Useful for lab notebook reports and research updates
# Note: This works in HTML and Jupyter notebooks,
#   but not in GitHub markdown (where you are likely viewing this).
MaP.create_code_button()

Initializing MaP sample
-----------------------
If your files are consistently named, you can write a short function that builds each sample from its name, then use it to create a list of samples quickly.

In [None]:
path = 'data/'
def init_sample(sample):
    return MaP.Sample(sample       = sample,                              # Sample name; appears in labels and legends.
                      profile      = path+sample+"_rnasep_profile.txt",   # ShapeMapper2 profile.txt
                      ct           = path+"RNaseP.ct",                    # Base-pairing information in ct format; may be redundant if a secondary structure file is provided.
                      ss           = path+"RC_CRYSTAL_STRUCTURE.xrna",    # Secondary structure drawing in xrna, varna, cte, or nsd format
                      rings        = path+sample+"-rnasep.corrs",         # RingMapper output (extension specified by user)
                      pairs        = path+sample+"-rnasep-pairmap.txt",   # PairMapper output (pairmap.txt)
                      # allcorrs = PairMapper allcorrs.txt file, not included in this example
                      log          = path+sample+"_shapemapper_log.txt",  # ShapeMapper2 log file; contains fragment length and mutations-per-molecule distributions,
                                                                          # but only if the --per-read-histograms flag is used
                      dance_prefix = path+sample+"_rnasep",               # Prefix for DanceMapper files; rings, pairs, profiles, and predicted structures are detected if present.
                      deletions    = path+"example-rnasep-deletions.txt", # ShapeJumper deletions.txt file
                      fasta        = path+"RNaseP-noSC.fasta",            # FASTA file used for ShapeJumper (required if deletions are provided)
                      pdb          = path+"3dhs_Correct.pdb",             # 3D molecular structure in PDB format; support for cif and pdbx files is forthcoming.
                      pdb_kwargs   = {"chain":"A"})                       # Additional info for PDB parsing. Chain is required.
                                                                          # offset and fasta may be required if the PDB header is incomplete.

example1 = init_sample("example1")
example2 = init_sample("example2")
example3 = init_sample("example3")
example4 = init_sample("example4")

samples = [example1, example2, example3, example4]
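
Because the per-sample files differ only by the sample name, the same list could also be driven from a list of names instead of one variable per sample. The following is a minimal sketch of that idea, not part of starmapper: `sample_kwargs` is a hypothetical helper, and it covers only the arguments whose file names are shown in the cell above. It builds the keyword arguments without importing the package, so it runs on its own.

```python
# Hypothetical helper (not part of starmapper): builds per-sample keyword
# arguments using the file-naming pattern from the cell above.
def sample_kwargs(sample, path="data/"):
    return {
        "sample": sample,
        # Files that are specific to each sample
        "profile": path + sample + "_rnasep_profile.txt",
        "rings": path + sample + "-rnasep.corrs",
        "pairs": path + sample + "-rnasep-pairmap.txt",
        "log": path + sample + "_shapemapper_log.txt",
        "dance_prefix": path + sample + "_rnasep",
        # Files that are shared by every sample
        "ct": path + "RNaseP.ct",
        "ss": path + "RC_CRYSTAL_STRUCTURE.xrna",
        "deletions": path + "example-rnasep-deletions.txt",
        "fasta": path + "RNaseP-noSC.fasta",
        "pdb": path + "3dhs_Correct.pdb",
        "pdb_kwargs": {"chain": "A"},
    }

names = ["example1", "example2", "example3", "example4"]
all_kwargs = [sample_kwargs(name) for name in names]

# In the notebook, each dict could then be unpacked into the constructor,
# assuming MaP.Sample accepts these keywords as in init_sample above:
# samples = [MaP.Sample(**kwargs) for kwargs in all_kwargs]
```

With the `init_sample` function already defined above, the shorter equivalent would be `samples = [init_sample(name) for name in names]`.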

ShapeMapper QC
--------------

In [None]:
MaP.array_qc(samples);

Skyline Plots
-------------

In [None]:
MaP.array_skyline(samples);

Linear Regression
-----------------

In [None]:
MaP.array_linreg(samples, colorby="sequence");

Arc Plots
---------

In [None]:
MaP.array_ap(samples, ij="rings");

Secondary Structure
-------------------

In [None]:
MaP.array_ss(samples, ij="rings", colors="profile");

3D Structures
-------------

In [None]:
MaP.array_mol(samples, ij="pairs", nt_color="profile");