
Update benchmarking script with new CLI #56

Draft · wants to merge 19 commits into master
Conversation

victorlin (Collaborator) commented Apr 25, 2020

Opening this PR to track progress for #27. Keeping as draft until ready.

TODO:

  • split into 2 scripts
    • cov_benchmark.py
      • design high-level CLI
      • implement simulation parameters
      • implement alignment parameters
  • implement custom subprocess parameters (msbar, art_illumina, bowtie2)
      • convert counts to proportions
    • pangenome_cov_benchmark.py
      • per genome in pangenome: generate single-record FASTA, run cov_benchmark.py with predefined parameters
      • aggregate results in .csv file
  • update README
  • write containerized unit tests (?)
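The high-level CLI in the TODO list above could be sketched roughly as follows. This is illustrative only: the flag names, defaults, and structure are assumptions, not the actual interface of cov_benchmark.py.

```python
import argparse

def build_parser():
    # Hypothetical high-level CLI for cov_benchmark.py; every flag name
    # and default here is illustrative, not the script's real interface.
    p = argparse.ArgumentParser(
        prog="cov_benchmark.py",
        description="Simulate diverged reads and benchmark mapping coverage")
    p.add_argument("--genome", required=True,
                   help="single-record input FASTA")
    p.add_argument("--divergence", type=float, default=0.05,
                   help="simulated point-mutation rate passed to msbar")
    p.add_argument("--read-length", type=int, default=100,
                   help="read length passed to art_illumina")
    p.add_argument("--bowtie2-args", default="--very-sensitive-local",
                   help="extra arguments forwarded to bowtie2")
    return p

# Demo invocation with an example FASTA path.
args = build_parser().parse_args(["--genome", "example.fa"])
```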

@mathemage mathemage added this to Task In Progress in TODO List via automation Apr 26, 2020
mathemage (Collaborator) commented Apr 26, 2020

PR closes #27

@mathemage mathemage linked an issue Apr 26, 2020 that may be closed by this pull request
victorlin (Collaborator, Author) commented

I was able to do most of the finalizing work for this today. A few unit tests remain that I'm still figuring out how to implement, since msbar doesn't seem to support a random seed for generating reproducible output.

The updated README is mostly technical, without much insight into the biological context; @ababaian, if you have any comments or suggestions, feel free to modify directly or let me know!
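One common workaround for testing a tool with non-seedable random output, such as msbar, is to assert statistical properties of the result rather than exact bytes. The sketch below is a hypothetical illustration of that idea, not the PR's actual test code; the `divergence` helper and the fabricated mutated sequence are both assumptions.

```python
def divergence(ref: str, mutated: str) -> float:
    # Fraction of positions that differ between two equal-length
    # sequences: a crude proxy for msbar's point-mutation rate.
    assert len(ref) == len(mutated)
    diffs = sum(1 for a, b in zip(ref, mutated) if a != b)
    return diffs / len(ref)

# Property-based check: tolerate randomness, test the rate instead of
# the exact output. Here 'mutated' is fabricated deterministically for
# illustration (every 20th base changed, i.e. a 5% mutation rate).
ref = "ACGT" * 250
mutated = list(ref)
for i in range(0, len(mutated), 20):
    mutated[i] = "T" if mutated[i] != "T" else "A"
mutated = "".join(mutated)

# Assert the observed rate is close to the requested rate.
assert abs(divergence(ref, mutated) - 0.05) < 0.02
```

A real test would run msbar with a fixed mutation-rate setting and assert the observed divergence falls within a tolerance band, rather than comparing against a golden output file.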

ababaian (Owner) commented May 3, 2020

Hey @victorlin, can you remove the .fq files from the repo and host them on the S3 bucket instead? There is a test_data folder they would fit in.

--

I'm a bit tied up at the moment; I'll do a proper review tomorrow.

victorlin (Collaborator, Author) commented

@ababaian done - moved all the test input files to s3://serratus-public/test-data/benchmarker/. Also added a setup.sh script to copy the files into test/benchmarker/local/. Using local for the folder name since it's targeted by our repo's .gitignore.
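A setup.sh along these lines would pull the test inputs from the bucket into the git-ignored directory. This is a sketch, not the PR's actual script; the real download requires the AWS CLI and credentials, so the copy step here is guarded.

```shell
#!/usr/bin/env sh
# Sketch of setup.sh: fetch benchmarker test inputs from S3 into a
# git-ignored local/ directory (illustrative; the real script may differ).
DEST="test/benchmarker/local"
mkdir -p "$DEST"
if command -v aws >/dev/null 2>&1; then
    # Copy all test inputs from the public bucket into the local folder.
    aws s3 cp --recursive "s3://serratus-public/test-data/benchmarker/" "$DEST/"
else
    echo "aws CLI not found; skipping download" >&2
fi
```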

victorlin (Collaborator, Author) commented

A lot has happened since this was started. Is the benchmarker still needed?

The last task is unit testing msbar, but it seems non-trivial. We could merge this PR and track that as a separate task, or just close it out to make room for other exciting things.

rcedgar (Collaborator) commented May 24, 2020

IMO we don't need a mapping benchmark for the main Serratus search; we have settled on how to run bowtie2 and are unlikely to revisit benchmarking of mappers.

ababaian (Owner) commented
Ironically enough, I do think we /should/ revisit this, as there will be ways to improve on --very-sensitive-local. I just think that on the totem pole of priorities this has gotten bumped down quite a bit, as more pressing things always seem to come up. I'd say let's leave this open and swing back to it when time is a luxury we have.
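For context on what "improving on --very-sensitive-local" would mean: per the bowtie2 manual, that preset is shorthand for explicit seed and extension parameters, so tuning means varying those directly. The sketch below only builds the command strings for illustration; the index and file names are hypothetical, the tuned values are an arbitrary example sweep point, and bowtie2 is not actually executed.

```shell
# --very-sensitive-local expands (per the bowtie2 manual) to:
#   -D 20 -R 3 -N 0 -L 20 -i S,1,0.50
PRESET="--very-sensitive-local"
# Hypothetical sweep point varying the seed parameters directly.
TUNED="-D 25 -R 4 -N 1 -L 18 -i S,1,0.40"

# Hypothetical index/input/output names, for illustration only.
CMD="bowtie2 $PRESET -x cov_index -U reads.fq -S out.sam"
CMD_TUNED="bowtie2 $TUNED -x cov_index -U reads.fq -S out.sam"
echo "$CMD"
echo "$CMD_TUNED"
```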

@ababaian ababaian removed this from Task In Progress in TODO List Dec 9, 2020
Linked issue (may be closed by this pull request): Benchmarking statistics for divergence simulation tests