Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle unset ref when creating new records #526

Merged
merged 1 commit into from Aug 8, 2017

Conversation

msto
Copy link
Contributor

@msto msto commented Aug 8, 2017

Hi,

I found a small bug in the updated VariantHeader.new_record method. When setting the record's initial attributes, the setters' calls to bcf_sync_end raise an exception because record.ref is None.

I added a quick check in bcf_sync_end to handle this case. I think setting ref_len to 0 prior to the INFO/END deletion shouldn't have any side effects, since a new record shouldn't have any alleles or a set END anyways.

This bug could also be solved by setting ref before setting the remaining attributes in new_record, but I wasn't sure what the desired default would be for ref (or if it would be a good idea to force a default).

@bioinformed
Copy link
Member

Thanks for the bug report and PR! Much appreciated!

@bioinformed bioinformed merged commit da3e8b4 into pysam-developers:master Aug 8, 2017
@msto
Copy link
Contributor Author

msto commented Aug 8, 2017

No problem, thanks for merging it in so quickly!

I was actually going to follow up with a related bug. In the ref setter method, if the record doesn't already have an associated list of alleles, its list of alleles is set to contain just the ref allele.

pysam/pysam/libcbcf.pyx

Lines 3136 to 3141 in 27d8452

if r.d.allele and r.n_allele:
alleles = [r.d.allele[i] for i in range(r.n_allele)]
alleles[0] = value
else:
alleles = [value]
self.alleles = alleles

However, the alleles setter method raises an exception if passed a list of fewer than 2 alleles.

pysam/pysam/libcbcf.pyx

Lines 3169 to 3170 in 27d8452

if len(values) < 2:
raise ValueError('must set at least 2 alleles')

How do you think this one should be solved? Is the allele count restriction necessary? I ran a quick test without it and nothing seemed to break, so I could submit another PR with a quick fix removing that test. If it is necessary, I think setting a new record's ref allele would then require the addition of a dummy or default alternate allele.

@bioinformed
Copy link
Member

VCF doesn't allow null alleles and requires at least one alt allele. The easier way to proceed is to require ref and at least one alt to be specified during construction (even if they default to 'N' and '<NON_REF>') and then enforce that requirement globally.

@msto msto deleted the MS_new_record_bugfix branch November 16, 2017 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants