Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: New subcommand for plotting bam files #156

Merged
merged 26 commits into from
Jul 7, 2021
Merged
Show file tree
Hide file tree
Changes from 19 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
24915b6
Add new subcommand plot-bam
fxwiegand Jun 8, 2021
0af950c
Merge remote-tracking branch 'upstream/master' into plot-bam
fxwiegand Jun 8, 2021
0d7eeb3
Merge remote-tracking branch 'upstream/master' into plot-bam
fxwiegand Jun 8, 2021
5634b49
Minor fixes
fxwiegand Jun 9, 2021
0aff4f9
Visual updates
fxwiegand Jun 9, 2021
e08a5e3
Remove debug
fxwiegand Jun 9, 2021
ba4cc89
Format
fxwiegand Jun 9, 2021
032d417
Support for multiple bam files
fxwiegand Jun 15, 2021
582b106
Format
fxwiegand Jun 15, 2021
bd4fec5
Plot BAM files below each other
fxwiegand Jun 16, 2021
57d5c06
Update README.md
fxwiegand Jun 17, 2021
806d3ca
Minor clippy fixes
fxwiegand Jun 17, 2021
405e9c0
Write output to stdout
fxwiegand Jun 22, 2021
0e8a038
Format
fxwiegand Jun 22, 2021
ff1634a
Add navigation for samples
fxwiegand Jun 22, 2021
86e6d27
Bump xlsxwriter version
fxwiegand Jun 22, 2021
61296a8
Update Cargo.toml
fxwiegand Jun 24, 2021
4576af6
Update Cargo.toml
fxwiegand Jun 24, 2021
f00a0d4
Update Cargo.toml
fxwiegand Jun 24, 2021
cc35df1
Fix cli help text and wrong default value
fxwiegand Jun 24, 2021
ec68c4b
pack chrom, start, stop into Region struct instead; replace Path arg …
tedil Jun 24, 2021
146b887
merge
tedil Jun 24, 2021
2094682
Use context for anyhow error message
fxwiegand Jun 24, 2021
93cc71d
Merge branch 'master' into plot-bam
johanneskoester Jun 28, 2021
99e595f
Merge remote-tracking branch 'upstream/master' into plot-bam
fxwiegand Jul 2, 2021
14e67e9
Update README.md
fxwiegand Jul 2, 2021
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ regex = "1.3"
tera = "1"
jsonm = "0.1.4"
chrono = "0.4"
xlsxwriter = "0.3.3"
xlsxwriter = "=0.3.3"
lazy_static = "1.4"
anyhow = "1"
thiserror = "1"
Expand Down
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ Rust-Bio-Tools provides a command `rbt`, which currently supports the following
* a tool to generate interactive HTML based reports that offer multiple plots visualizing the provided genomics data in VCF and BAM format (`rbt vcf-report`)
* a tool to generate an interactive HTML based report from a csv file including visualizations (`rbt csv-report`)
* a tool for splitting VCF/BCF files into N equal chunks, including BND support (`rbt vcf-split`)
* a tool to generate visualizations for a specific region of one or multiple BAM files with a given reference contained in a single HTML file (`rbt plot-bam`)

Further functionality is added as it is needed by the authors. Check out the [Contributing](#Contributing) section if you want contribute anything yourself.
For a list of changes, take a look at the [CHANGELOG](CHANGELOG.md).
Expand Down
1 change: 1 addition & 0 deletions src/bam/mod.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
//! Tools that work on BAM files
pub mod collapse_reads_to_fragments;
pub mod depth;
pub mod plot;
86 changes: 86 additions & 0 deletions src/bam/plot/bam_plot.html.tera
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
<!doctype html>
<html lang="en">
<head>
<title>rbt bam report</title>
<meta charset="UTF-8">
<script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
<script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
<script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1" crossorigin="anonymous"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/lz-string/1.4.4/lz-string.min.js" integrity="sha512-qoCTmFwBtCPvFhA+WAqatSOrghwpDhFHxwAGh+cppWonXbHA09nG1z5zi4/NGnp8dUhXiVrzA6EnKgJA+fyrpw==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script src="https://unpkg.com/jsonm@1.0.10/build/jsonm.js"></script>
<!-- CSS only -->
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">
</head>

<body style="text-align: center;">
<nav class="navbar navbar-expand-lg navbar-dark bg-dark">
<a class="navbar-brand" href="#">rbt report</a>
<div class="collapse navbar-collapse" id="navbarText">
<ul class="navbar-nav mr-auto">
<li class="nav-item">
<a class="nav-link" href="https://github.com/rust-bio/rust-bio-tools/blob/master/CHANGELOG.md">{{ version }}</a>
</li>
<li class="nav-item">
<a class="nav-link" href="https://github.com/rust-bio/rust-bio-tools">github</a>
</li>
</ul>
<span class="navbar-text">
created {{ time }}
</span>
</div>
</nav>
<nav aria-label="breadcrumb">
<ul class="breadcrumb">
<li class="breadcrumb-item active" style="padding-top: 8px;">chrom: {{ chrom|safe }}</li>
<li class="breadcrumb-item active" style="padding-top: 8px;" aria-current="page">region: {{ start }} - {{ end }}</li>
<button class="btn btn-secondary dropdown-toggle ml-auto" type="button" id="dropdownMenuButton" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
Samples
</button>
<div class="dropdown-menu" aria-labelledby="dropdownMenuButton" style="overflow-y: auto; max-height: 200px; z-index: 1001;">
{% for bam in bams %}<div class="dropdown-item">
<input class="form-check-input" type="checkbox" checked="checked" data-toggle="collapse" data-target="#collapse_{{loop.index}}" value="" id="defaultCheck{{ loop.index }}">
<label class="form-check-label" for="defaultCheck{{ loop.index }}">
{{ bam }}
</label>
</div>{% endfor %}
</div>
</ul>
</nav>

{% for plot in plots %}<div id="collapse_{{loop.index}}" class="row justify-content-center collapse show" style="margin-top: 25px;">
<div id="plot_{{loop.index}}" style="overflow-y: auto; height: calc(50vh - 50px);"></div>
</div>{% endfor %}
<script>
let plots = [{% for plot in plots %}{{ plot | safe }}{% if not loop.last %},{% endif %}{% endfor %}];
let bams = [{% for bam in bams %}"{{ bam }}"{% if not loop.last %},{% endif %}{% endfor %}];
let i = 1;

let decompressed_plots = [];

for (plot of plots) {
let decompressed = LZString.decompressFromUTF16(plot);
let unpacker = new jsonm.Unpacker();
unpacker.setMaxDictSize(100000);
let unpacked_specs = unpacker.unpack(JSON.parse(decompressed));
unpacked_specs["width"] = $(window).width() - 100;
unpacked_specs["title"] = bams[i - 1];
decompressed_plots.push(unpacked_specs);
let plot_id = `#plot_${i}`;
vegaEmbed(plot_id, unpacked_specs);
i++;
}

$(window).resize(function() {
let j = 1;
for (plot of decompressed_plots) {
let plot_id = `#plot_${j}`;
plot["width"] = $(window).width() - 100;
vegaEmbed(plot_id, plot);
}
})
</script>
</body>
</html>
1 change: 1 addition & 0 deletions src/bam/plot/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
pub mod plot_bam;
62 changes: 62 additions & 0 deletions src/bam/plot/plot_bam.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
use crate::bcf::report::table_report::create_report_table::create_report_data;
use crate::bcf::report::table_report::create_report_table::manipulate_json;
use anyhow::Result;
use chrono::{DateTime, Local};
use itertools::Itertools;
use std::io;
use std::io::Write;
use std::path::Path;
use std::str::FromStr;
use tera::{Context, Tera};

pub(crate) fn plot_bam(
bam_paths: &[String],
fasta_path: &str,
region: &str,
max_read_depth: u32,
) -> Result<()> {
let splitted_region = region.split(':').collect_vec();
let chrom = splitted_region[0];
let span = splitted_region[1].split('-').collect_vec();
assert_eq!(2, span.len());
let start = u64::from_str(span[0])?;
let end = u64::from_str(span[1])?;
let mut plots = Vec::new();

for bam_path in bam_paths {
let (content, max_rows) = create_report_data(
Path::new(fasta_path),
None,
Path::new(bam_path),
chrom.to_owned(),
start,
end,
max_read_depth,
)?;
let visualization = manipulate_json(content, start, end, max_rows)?;

plots.push(visualization);
}

let bams = bam_paths
.iter()
.map(|b| Path::new(b).iter().last().unwrap().to_str().unwrap())
.collect_vec();

let mut templates = Tera::default();
templates.add_raw_template("bam_plot.html.tera", include_str!("bam_plot.html.tera"))?;
let mut context = Context::new();
let local: DateTime<Local> = Local::now();
context.insert("time", &local.format("%a %b %e %T %Y").to_string());
context.insert("version", &env!("CARGO_PKG_VERSION"));
context.insert("plots", &plots);
context.insert("bams", &bams);
context.insert("chrom", &chrom);
context.insert("start", &start);
context.insert("end", &end);

let html = templates.render("bam_plot.html.tera", &context)?;
io::stdout().write_all(html.as_bytes())?;

Ok(())
}
20 changes: 11 additions & 9 deletions src/bcf/report/table_report/create_report_table.rs
Original file line number Diff line number Diff line change
Expand Up @@ -325,7 +325,7 @@ pub(crate) fn make_table_report(
if pos < 75 {
let (content, max_rows) = create_report_data(
fasta_path,
var.clone(),
Some(var.clone()),
bam_path,
chrom.clone(),
0,
Expand All @@ -337,7 +337,7 @@ pub(crate) fn make_table_report(
} else if pos + 75 >= fasta_length as i64 {
let (content, max_rows) = create_report_data(
fasta_path,
var.clone(),
Some(var.clone()),
bam_path,
chrom.clone(),
pos as u64 - 75,
Expand All @@ -349,7 +349,7 @@ pub(crate) fn make_table_report(
} else {
let (content, max_rows) = create_report_data(
fasta_path,
var.clone(),
Some(var.clone()),
bam_path,
chrom.clone(),
pos as u64 - 75,
Expand Down Expand Up @@ -490,9 +490,9 @@ pub(crate) fn read_tag_entries(
Ok(())
}

fn create_report_data(
pub(crate) fn create_report_data(
fasta_path: &Path,
variant: Variant,
variant: Option<Variant>,
bam_path: &Path,
chrom: String,
from: u64,
Expand All @@ -513,7 +513,7 @@ fn create_report_data(
from,
to,
max_read_depth,
&variant,
variant.as_ref(),
)?;

for b in bases {
Expand All @@ -526,14 +526,16 @@ fn create_report_data(
data.push(mat);
}

data.push(json!(variant));
if variant.is_some() {
data.push(json!(variant));
}

Ok((Json::from_str(&json!(data).to_string()).unwrap(), max_rows))
}

/// Inserts the json containing the genome data into the vega specs.
/// It also changes keys and values of the json data for the vega plot to look better and compresses the json with jsonm.
fn manipulate_json(data: Json, from: u64, to: u64, max_rows: usize) -> Result<String> {
pub(crate) fn manipulate_json(data: Json, from: u64, to: u64, max_rows: usize) -> Result<String> {
let json_string = include_str!("vegaSpecs.json");

let mut vega_specs: Value = serde_json::from_str(&json_string)?;
Expand All @@ -545,7 +547,7 @@ fn manipulate_json(data: Json, from: u64, to: u64, max_rows: usize) -> Result<St
for (i, _) in v.iter().enumerate() {
let k = v[i]["marker_type"].clone().as_str().unwrap().to_owned();

if k == "A" || k == "T" || k == "G" || k == "C" || k == "U" {
if k == "A" || k == "T" || k == "G" || k == "C" || k == "U" || k == "N" {
values["values"][i]["base"] = values["values"][i]["marker_type"].clone();
} else if k == "Deletion"
|| k == "Match"
Expand Down
26 changes: 19 additions & 7 deletions src/bcf/report/table_report/static_reader.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ fn calc_rows(
reads: Vec<AlignmentNucleobase>,
matches: Vec<AlignmentMatch>,
max_read_depth: u32,
variant: &Variant,
variant: Option<&Variant>,
) -> (
Vec<StaticAlignmentNucleobase>,
Vec<StaticAlignmentMatch>,
Expand All @@ -55,8 +55,14 @@ fn calc_rows(
let mut max_row = 0;

for r in matches {
if r.read_start < variant.start_position as u32 && r.read_end > variant.end_position as u32
{
let overlaps = if variant.is_some() {
r.read_start < variant.unwrap().start_position as u32
&& r.read_end > variant.unwrap().end_position as u32
} else {
true
};

if overlaps {
let mut row: u16 = 0;

if read_names.contains_key(&r.name) {
Expand Down Expand Up @@ -85,8 +91,14 @@ fn calc_rows(
}

for r in reads {
if r.read_start < variant.start_position as u32 && r.read_end > variant.end_position as u32
{
let overlaps = if variant.is_some() {
r.read_start < variant.unwrap().start_position as u32
&& r.read_end > variant.unwrap().end_position as u32
} else {
true
};

if overlaps {
let mut row: u16 = 0;

if read_names.contains_key(&r.name) {
Expand Down Expand Up @@ -141,13 +153,13 @@ pub fn get_static_reads(
from: u64,
to: u64,
max_read_depth: u32,
variant: &Variant,
variant: Option<&Variant>,
) -> Result<(
Vec<StaticAlignmentNucleobase>,
Vec<StaticAlignmentMatch>,
usize,
)> {
let alignments = read_indexed_bam(path, chrom.clone(), from, to)?;
let (msm, m) = make_nucleobases(fasta_path, chrom, alignments, from, to)?;
Ok(calc_rows(msm, m, max_read_depth, &variant))
Ok(calc_rows(msm, m, max_read_depth, variant))
}
24 changes: 23 additions & 1 deletion src/cli.rs
Original file line number Diff line number Diff line change
Expand Up @@ -215,6 +215,28 @@ pub(crate) enum Command {
output_path: String,
},

/// Creates a html file with a vega visualization of the given bam region
/// Example:
/// rbt plot-bam -b input.bam -g 2:132424-132924 -r input.fa > plot.html
#[structopt(author = "Felix Wiegand <felix.wiegand@tu-dortmund.de>")]
PlotBam {
/// BAM file to be visualized.
#[structopt(long, short = "b")]
bam_path: Vec<String>,

/// Path to the reference fasta file.
#[structopt(long, short = "r", default_value = "100")]
fxwiegand marked this conversation as resolved.
Show resolved Hide resolved
reference: String,

/// Chromosome and region for the visualization. Example: 2:132424-132924
#[structopt(long, short = "g")]
region: String,

/// Set the maximum lines of reads that will be shown in the alignment plots. Default value is 500.
fxwiegand marked this conversation as resolved.
Show resolved Hide resolved
#[structopt(long, short = "d", default_value = "500")]
max_read_depth: u32,
},

/// Creates report from a given VCF file including a visual plot
/// for every variant with the given BAM and FASTA file.
/// The VCF file has to be annotated with VEP, using the options --hgvs and --hgvsg.
Expand Down Expand Up @@ -246,7 +268,7 @@ pub(crate) enum Command {
#[structopt(long, short = "c", default_value = "1000")]
cells: u32,

/// Set the maximum number of cells in the oncoprint per page. Lowering max-cells should improve the performance of the plots in the browser. Default value is 1000.
/// Set the maximum lines of reads that will be shown in the alignment plots. Default value is 500.
#[structopt(long, short = "d", default_value = "500")]
max_read_depth: u32,

Expand Down
6 changes: 6 additions & 0 deletions src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,12 @@ fn main() -> Result<()> {
pin_until.as_deref(),
)?
}
PlotBam {
bam_path,
reference,
region,
max_read_depth,
} => bam::plot::plot_bam::plot_bam(&bam_path, &reference, &region, max_read_depth)?,
VcfReport {
fasta,
vcfs,
Expand Down