## Load JSON pileup data

Rather than download plaintext pileup data or a BAM file from Bloom, I converted the BAM file to four JSON files:

- `seq2pos2totalcov.json`: Maps sequence name to a mapping from position (1-indexed) to the total number of aligned reads covering that position. Includes all reads in the alignment (with the exception of partially-mapped reads, which I filtered explicitly).


- `seq2pos2matchct.json`: Maps sequence name to a mapping from position (1-indexed) to the number of aligned reads covering that position __that actually match the reference at this position__.


- `seq2pos2mismatchct.json`: Maps sequence name to a mapping from position (1-indexed) to the number of aligned reads covering that position __that do not match the reference at this position__. This does not include deletions -- it should only include actual mismatches in the alignment.


- `seq2pos2mismatches.json`: Maps sequence name to a mapping from position (1-indexed) to another mapping, where the keys can be any of `A`, `C`, `G`, `T` and the values are the number of times this non-matching base was seen in the reads aligned to this position. Nucleotides not seen at a position are omitted from the innermost mapping, so if the `mismatchct` of this position is 0 then this will be `{}`.
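To make the nesting concrete, here is a minimal sketch of what the four mappings might look like for one sequence, plus a sanity check relating them. The sequence name `seqA`, all of the counts, the `toy_` variable names, and the helper `check_position` are hypothetical. The two invariants it checks are assumptions inferred from the descriptions above, not something the files are guaranteed to satisfy: that the per-nucleotide counts sum to `mismatchct`, and that matches plus mismatches never exceed total coverage.

```python
# Toy data, invented for illustration only. Positions are string keys,
# as they would be after loading from JSON.
toy_totalcov = {"seqA": {"1": 10, "2": 12}}
toy_matchct = {"seqA": {"1": 10, "2": 9}}
toy_mismatchct = {"seqA": {"1": 0, "2": 2}}
toy_mismatches = {"seqA": {"1": {}, "2": {"T": 2}}}


def check_position(seq, pos, totalcov, matchct, mismatchct, mismatches):
    """Return True if the four mappings agree with each other at (seq, pos)."""
    total = totalcov[seq][pos]
    n_match = matchct[seq][pos]
    n_mismatch = mismatchct[seq][pos]
    per_base = mismatches[seq][pos]
    # Assumed invariant: per-nucleotide mismatch counts sum to mismatchct.
    sums_ok = sum(per_base.values()) == n_mismatch
    # Assumed invariant: matches + mismatches never exceed total coverage
    # (any remainder would be reads that neither match nor mismatch here,
    # such as deletions).
    cov_ok = n_match + n_mismatch <= total
    return sums_ok and cov_ok


for pos in toy_totalcov["seqA"]:
    ok = check_position(
        "seqA", pos, toy_totalcov, toy_matchct, toy_mismatchct, toy_mismatches
    )
    print(pos, ok)
```
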

Even combined, these JSON files are much smaller than the BAM file.
(That said, by this point they have grown large enough that working with all of them in memory is not really practical on an average laptop anyway...)

**NOTE: positions are stored as strings, because JSON object keys are always strings and this data was round-tripped through JSON. TODO: storing / loading positions as ints should fix the pyplot issues with plotting spectra and decrease file size.**
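As a rough illustration of the note above, the following sketch shows how integer position keys turn into strings after a JSON round trip, and one possible way to convert them back after loading. The helper name `int_positions` and the toy data are hypothetical; this is not part of the workflow that produced the files.

```python
import json


def int_positions(seq2pos2value):
    """Return a copy of a {seq: {pos: value}} mapping with int position keys.

    Only the position level is converted; the values (including the
    per-nucleotide dicts in seq2pos2mismatches) are left untouched.
    """
    return {
        seq: {int(pos): value for pos, value in pos2value.items()}
        for seq, pos2value in seq2pos2value.items()
    }


# json.dumps coerces int keys to strings, so they come back as strings.
roundtripped = json.loads(json.dumps({"seqA": {1: 10, 2: 12}}))
print(roundtripped)  # position keys are now "1" and "2"

fixed = int_positions(roundtripped)
print(fixed)  # position keys are ints again
```

Note that this builds a second copy of each mapping, so for files as large as the ones described here it would temporarily increase memory use.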

In [None]:
import json
import os

JSONPREFIX = "../main-workflow/output"

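# Each file loads as a dict of the form {sequence name: {position: value}}.
# Positions are string keys here, since they come straight from JSON.

# Total number of aligned reads covering each position
with open(os.path.join(JSONPREFIX, "seq2pos2totalcov.json"), "r") as jf:
    seq2pos2totalcov = json.load(jf)

# Number of reads matching the reference at each position
with open(os.path.join(JSONPREFIX, "seq2pos2matchct.json"), "r") as jf:
    seq2pos2matchct = json.load(jf)

# Number of reads mismatching the reference at each position (no deletions)
with open(os.path.join(JSONPREFIX, "seq2pos2mismatchct.json"), "r") as jf:
    seq2pos2mismatchct = json.load(jf)

# Per-nucleotide mismatch counts at each position
with open(os.path.join(JSONPREFIX, "seq2pos2mismatches.json"), "r") as jf:
    seq2pos2mismatches = json.load(jf)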