Depad #404
This is not ideal, but is much better than not handling the read at all (which would happen if the assert didn't abort the command).
Updated the bam header API to use htslib. Fixed the test data for r051 so that it has the correctly depadded cigar string. Fixed the code such that it can now generate the correct output for r051 too. Changed the tests from expected fail to expected pass.
Also had to fix the test data so that it no longer has unmapped reads with a CIGAR string, as this isn't supported in CRAM. The change that stops this blowing up in CRAM also means that these strings no longer get edited in SAM or BAM either. I think that is valid, though, as they are currently nonsensical.
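To find records of the kind described above in test data, a minimal sketch is shown here. It assumes plain-text SAM input and relies only on the SAM column layout (FLAG in column 2, CIGAR in column 6, unmapped bit 0x4). The function name `unmapped_with_cigar` is hypothetical and is not part of samtools.

```python
def unmapped_with_cigar(sam_lines):
    """Yield (QNAME, CIGAR) for unmapped SAM records whose CIGAR is not '*'.

    Hypothetical helper for checking test data; not samtools code.
    """
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        cigar = fields[5]
        # 0x4 is the "segment unmapped" FLAG bit.
        if flag & 0x4 and cigar != "*":
            yield fields[0], cigar
```

Running it over a SAM file opened in text mode lists the reads whose CIGAR would need to be reset to `*` before writing CRAM.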
Merge complicated by the ancestry and that this all predates the grand white-space unification. Joy! Conflicts: padding.c
CRAM output is now forced to use non-reference mode, as clearly we don't have a copy of the depadded reference to compare against. Also implemented the "TODO" comment. We now have proper @SQ parsing and fixing.
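As an illustration of what fixing an @SQ line involves, the sketch below rewrites the LN field to the unpadded reference length. It is not the code in padding.c: the function names are hypothetical, the padded reference is assumed to be available as a string per sequence name, and `*` and `-` are assumed to be the pad characters. Other @SQ tags are left untouched.

```python
def unpadded_length(padded_seq, pad_chars="*-"):
    """Count the bases in a padded sequence that are not pad characters."""
    return sum(1 for base in padded_seq if base not in pad_chars)


def fix_sq_line(sq_line, padded_refs):
    """Return an @SQ header line with LN set to the unpadded length.

    padded_refs maps sequence name -> padded sequence (an assumption of
    this sketch). Lines that are not @SQ, or whose SN is unknown, are
    returned unchanged apart from the trailing newline being dropped.
    """
    fields = sq_line.rstrip("\n").split("\t")
    if fields[0] != "@SQ":
        return "\t".join(fields)
    name = next((f[3:] for f in fields[1:] if f.startswith("SN:")), None)
    if name not in padded_refs:
        return "\t".join(fields)
    new_len = unpadded_length(padded_refs[name])
    fixed = [f"LN:{new_len}" if f.startswith("LN:") else f for f in fields]
    return "\t".join(fixed)
```

For example, a padded reference `ACG**TAC-G` has seven real bases, so an `LN:10` entry for it would be rewritten as `LN:7`.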
jmarshall added a commit that referenced this pull request Jun 18, 2015
charles-plessy added a commit to Debian/samtools that referenced this pull request Apr 26, 2016
Samtools release 1.3: many improvements, fixes, new commands

* The obsolete "samtools sort in.bam out.prefix" usage has been removed. If you are still using -f, -o, or out.prefix, convert to use -T PREFIX and/or -o FILE instead. (samtools#295, samtools#349, samtools#356, samtools#418, PR samtools#441; see also discussions in samtools#171, samtools#213.)
* The "bamshuf" command has been renamed to "collate" (hence the term bamshuf no longer appears in the documentation, though it still works on the command line for compatibility with existing scripts).
* The mpileup command now outputs the unseen allele in VCF/BCF as <*> rather than X or <X> as previously, and now has AD, ADF, ADR, INFO/AD, INFO/ADF, INFO/ADR --output-tags annotations that largely supersede the existing DV, DP4, DPR annotations.
* The mpileup command now applies BAQ calculations at all base positions, regardless of which -l or -r options are used (previously with -l it was not applied to the first few tens of bases of each chromosome, leading to different mpileup results with -l vs. -r; samtools#79, samtools#125, samtools#286, samtools#407).
* Samtools now has a configure script which checks your build environment and facilitates choosing which HTSlib to build against. See INSTALL for details.
* Samtools's Makefile now fully supports the standard convention of allowing CC/CPPFLAGS/CFLAGS/LDFLAGS/LIBS to be overridden as needed. Previously it listened to $(LDLIBS) instead; if you were overriding that, you should now override LIBS rather than LDLIBS.
* A new addreplacerg command that adds or alters @RG headers and RG:Z record tags has been added.
* The rmdup command no longer immediately aborts (previously it always aborted with "bam_get_library() not yet implemented"), but remains not recommended for most use (samtools#159, samtools#252, samtools#291, samtools#393).
* Merging files with millions of headers now completes in a reasonable amount of time (samtools#337, samtools#373, samtools#419, samtools#453; thanks to Nathan Weeks, Chris Smowton, Martin Pollard, Rob Davies).
* Samtools index's optional index output path argument works again (samtools#199).
* Fixed calmd, targetcut, and potential mpileup segfaults when given broken alignments with POS far beyond the end of their reference sequences.
* If you have source code using bam_md.c's bam_fillmd1_core(), bam_cap_mapQ(), or bam_prob_realn_core() functions, note that these now take an additional ref_len parameter. (The versions named without "_core" are unchanged.)
* The tview command's colour scheme has been altered to be more suitable for users with colour blindness (samtools#457).
* Samtools depad command now handles CIGAR N operators and accepts CRAM files (samtools#201, samtools#404).
* Samtools stats now outputs separate "N" and "other" columns in the ACGT content per cycle section (samtools#376).
* Added -a option to samtools depth to show all locations, including zero depth sites (samtools#374).
* New samtools dict command, which creates a sequence dictionary (as used by Picard) from a FASTA reference file.
* Samtools stats --target-regions option works again.
* Added legacy API sam.h functions sam_index_load() and samfetch() providing bam_fetch()-style iteration over either BAM or CRAM files. (In general we recommend recoding against the htslib API directly, but this addition may help existing libbam-using programs to be CRAM-enabled easily.)
* Fixed legacy API's samopen() to write headers only with "wh" when writing SAM files. Plain "w" suppresses headers for SAM file output, but this was broken in 1.2.
* "samtools fixmate - -" works in pipelines again; with 1.0 to 1.2, this failed with "[bam_mating] cannot determine output format".
* Restored previous "samtools calmd -u" behaviour of writing compression level 0 BAM files. Samtools 1.0 to 1.2 incorrectly wrote raw non-BGZF BAM files, which cannot be read by most other tools. (Samtools commands other than calmd were unaffected by this bug.)
* Restored bam_nt16_nt4_table[] to legacy API header bam.h.
* Fixed bugs samtools#269, samtools#305, samtools#320, samtools#328, samtools#346, samtools#353, samtools#365, samtools#392, samtools#410, samtools#445, samtools#462, samtools#475, and samtools#495.
Mix of ancient fixes from Peter Cock along with newer updates to fix bugs and add CRAM support.
Possible commenting improvements coming with a later patch.