# Nanocompore SampComp API demo 

## Import SampComp module

In [2]:
from nanocompore.SampComp import SampComp

## Full API documentation

In [4]:
help (SampComp.__init__)

Help on function __init__ in module nanocompore.SampComp:

__init__(self, eventalign_fn_dict, output_db_fn, fasta_fn, bed_fn=None, whitelist=None, comparison_method=['GMM', 'KS'], force_logit=True, sequence_context=0, sequence_context_weights='uniform', min_coverage=50, downsample_high_coverage=None, max_invalid_kmers_freq=0.1, select_ref_id=[], exclude_ref_id=[], nthreads=4, log_level='info')
    * eventalign_fn_dict: Multilevel dictionary indicating the condition_label, sample_label and file name of the eventalign_collapse output
        example d = {"S1": {"R1":"path1.tsv", "R2":"path2.tsv"}, "S2": {"R1":"path3.tsv", "R2":"path4.tsv"}}
        eventalign_fn_dict can also be a path to a YAML file
        2 conditions are expected, and at least 2 sample replicates are highly recommended per condition
    * output_db_fn: Path where to write the result database
    * fasta_fn: Path to a fasta file corresponding to the reference used for read alignment
    * bed_fn: Path to a BED file co

## Initialise and call SampComp

#### Using a Python dictionary to specify the location of the eventalign files

In [3]:
# Init the object
s = SampComp (
    eventalign_fn_dict = {
        'Modified': {'rep1':'./sample_files/modified_rep_1.tsv', 'rep2':'./sample_files/modified_rep_2.tsv'},
        'Unmodified': {'rep1':'./sample_files/unmodified_rep_1.tsv', 'rep2':'./sample_files/unmodified_rep_2.tsv'}},
    output_db_fn = "./results/out.db",
    fasta_fn = "./reference/ref.fa")

# Run the analysis
db = s ()

Initialise SampComp and checks options
Initialise Whitelist and checks options
Read eventalign index files
	References found in index: 5
Filter out references with low coverage
	References remaining after reference coverage filtering: 5
Start data processing
100%|██████████| 5/5 [00:01<00:00,  3.07 Processed References/s]
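
The help text above says that 2 conditions are expected and that at least 2 replicates per condition are highly recommended. A minimal sketch of a pre-flight check for the dictionary follows. `check_eventalign_dict` and its `require_files` flag are hypothetical names introduced here for illustration; they are not part of nanocompore.

```python
from pathlib import Path


def check_eventalign_dict(d, require_files=False):
    """Hypothetical helper (not part of nanocompore).

    Returns a list of human-readable problems found in a
    {condition: {replicate: path}} dictionary.
    """
    problems = []
    # The SampComp help text states that 2 conditions are expected
    if len(d) != 2:
        problems.append(f"expected 2 conditions, found {len(d)}")
    for condition, replicates in d.items():
        # At least 2 replicates per condition are recommended
        if len(replicates) < 2:
            problems.append(f"condition {condition!r} has fewer than 2 replicates")
        # Optionally confirm that every listed file exists on disk
        if require_files:
            for replicate, path in replicates.items():
                if not Path(path).is_file():
                    problems.append(f"{condition}/{replicate}: file not found: {path}")
    return problems


samples = {
    "Modified": {
        "rep1": "./sample_files/modified_rep_1.tsv",
        "rep2": "./sample_files/modified_rep_2.tsv",
    },
    "Unmodified": {
        "rep1": "./sample_files/unmodified_rep_1.tsv",
        "rep2": "./sample_files/unmodified_rep_2.tsv",
    },
}

problems = check_eventalign_dict(samples)
print(problems or "layout looks fine")
```

Passing `require_files=True` also reports missing paths, which may be cheaper to discover before a long run starts.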


#### Using a YAML file to specify the location of the eventalign files

In [35]:
# Init the object
s = SampComp (
    eventalign_fn_dict = "./samples.yaml",
    output_db_fn = "./results/out.db",
    fasta_fn = "./reference/ref.fa")

# Run the analysis
db = s ()

Initialise SampComp and checks options
Initialise Whitelist and checks options
Read eventalign index files
	References found in index: 5
Filter out references with low coverage
	References remaining after reference coverage filtering: 5
Start data processing
100%|██████████| 5/5 [00:10<00:00,  2.28s/ Processed References]
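
The content of `samples.yaml` is not shown in this demo. The sketch below assumes that the YAML file mirrors the nested dictionary used earlier (condition label, then replicate label, then file path) and prints what such a file might contain. `to_yaml_text` is a hypothetical helper written with the standard library only; it is not part of nanocompore and only handles this simple two-level layout.

```python
def to_yaml_text(sample_dict):
    """Hypothetical helper: render {condition: {replicate: path}} as YAML text.

    Assumes labels and paths are plain strings that need no YAML quoting.
    """
    lines = []
    for condition, replicates in sample_dict.items():
        lines.append(f"{condition}:")
        for replicate, path in replicates.items():
            # Nested mapping entries are indented under their condition label
            lines.append(f"    {replicate}: {path}")
    return "\n".join(lines) + "\n"


samples = {
    "Modified": {
        "rep1": "./sample_files/modified_rep_1.tsv",
        "rep2": "./sample_files/modified_rep_2.tsv",
    },
    "Unmodified": {
        "rep1": "./sample_files/unmodified_rep_1.tsv",
        "rep2": "./sample_files/unmodified_rep_2.tsv",
    },
}

yaml_text = to_yaml_text(samples)
print(yaml_text)
```

Saving the printed text as `./samples.yaml` would give a file with the assumed layout; the exact schema accepted by `SampComp` should be checked against the nanocompore documentation.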


#### Tweaking statistical options

In [None]:
# Init the object
s = SampComp (
    eventalign_fn_dict = "./samples.yaml",
    output_db_fn = "./results/out.db",
    fasta_fn = "./reference/ref.fa",
    comparison_method=["GMM", "MW", "KS"],
    sequence_context=2,
    sequence_context_weights='harmonic')

# Run the analysis
db = s ()
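
The call above switches `sequence_context` from 0 to 2 and `sequence_context_weights` from `'uniform'` to `'harmonic'`, but the demo does not say what the weights look like. The sketch below shows one plausible reading, stated as an assumption: the context spans `sequence_context` positions on each side of the tested position, `'uniform'` gives every position the same weight, and `'harmonic'` gives a position at distance `d` the weight `1 / (d + 1)`. `context_weights` is a hypothetical function for illustration and is not nanocompore's own code.

```python
def context_weights(sequence_context, scheme="uniform"):
    """Hypothetical illustration of per-position weights around a tested site.

    Assumes the window covers `sequence_context` positions on each side,
    so it contains 2 * sequence_context + 1 positions in total.
    """
    offsets = range(-sequence_context, sequence_context + 1)
    if scheme == "uniform":
        # Every position in the window counts equally
        return [1.0 for _ in offsets]
    if scheme == "harmonic":
        # Weight decays with distance from the central position
        return [1 / (abs(offset) + 1) for offset in offsets]
    raise ValueError(f"unknown weighting scheme: {scheme!r}")


uniform = context_weights(2, "uniform")
harmonic = context_weights(2, "harmonic")
print("uniform: ", uniform)
print("harmonic:", harmonic)
```

Under this assumption, the harmonic scheme lets the central position dominate while neighbouring positions still contribute, whereas the uniform scheme treats all five positions alike.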
