Check on screed's attributes #1484

ctb · 2016-10-06T13:29:20Z

Does 'accuracy' still exist on FASTA records?

Update: the last remaining difference between khmer Read and screed Record objects seems to be description vs. annotations.


luizirber · 2016-10-06T20:57:01Z

No, neither does quality.

khmer.ReadParser records also use annotations, while screed.screedRecord uses description.
khmer.Read puts everything from the ID line in the name field and leaves annotations empty.
screed.screedRecord splits the ID line on spaces, with the first field going into name and the remainder into description.

camillescott · 2016-11-17T17:42:42Z

Note that the quality attribute always seems to exist in the ReadParser's Read extension class, so the hasattr(read, "quality") paradigm doesn't work for them. Perhaps both implementations should switch to adding an extra format attribute.

standage · 2016-11-17T17:50:16Z

If .quality is always present but never set for Fasta records, then it should be possible to switch the check from hasattr(record, "quality") to record.quality != "", no?
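A minimal sketch of a check that tolerates both behaviours discussed here (attribute missing entirely, or present but empty). The helper name `has_quality` and the `SimpleNamespace` stand-in records are hypothetical; they are not part of khmer or screed.

```python
from types import SimpleNamespace


def has_quality(record):
    """Return True only if the record carries a non-empty quality string."""
    # getattr with a default covers records that lack the attribute entirely;
    # the truthiness test covers records where it is present but "" or None.
    return bool(getattr(record, "quality", None))


# Stand-in records (assumptions for illustration, not real parser output).
fasta_absent = SimpleNamespace(name="r1", sequence="ACGT")
fasta_empty = SimpleNamespace(name="r2", sequence="ACGT", quality="")
fastq_like = SimpleNamespace(name="r3", sequence="ACGT", quality="IIII")

for rec in (fasta_absent, fasta_empty, fastq_like):
    print(rec.name, has_quality(rec))
```

Such a helper would let calling code stay the same whichever convention the record class ends up using.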

camillescott · 2016-11-17T17:52:39Z

Yup, although screed just leaves it off entirely if it doesn't exist. I'd prefer using a @property and returning None for the null case (an "is None" check has less room for edge cases than an empty-string check).
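The @property idea could look roughly like the sketch below. `SketchRead` is a hypothetical stand-in class written for illustration; it is not khmer's Read extension class or screed's Record, and normalising an empty string to None is an assumption.

```python
class SketchRead:
    """Hypothetical read class; not the khmer or screed implementation."""

    def __init__(self, name, sequence, quality=None):
        self.name = name
        self.sequence = sequence
        # Assumption: treat an empty quality string the same as "no quality".
        self._quality = quality if quality else None

    @property
    def quality(self):
        # Always defined; None marks the "no quality" (FASTA-like) case.
        return self._quality


fasta_read = SketchRead("r1", "ACGT")
fastq_read = SketchRead("r2", "ACGT", quality="IIII")

print(fasta_read.quality is None)  # FASTA-like record
print(fastq_read.quality is None)  # FASTQ-like record
```

With this shape, hasattr(read, "quality") is always True, so callers would test `read.quality is None` instead.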

ctb · 2016-11-18T01:04:30Z

This was discussed in #1468, and we resolved to eliminate the quality attribute on FASTQ records, which was the opinion of both me and @luizirber. Following from that, we decided 'quality' should be eliminated from ReadParser results on FASTA sequences.

At this point I'm happy to reconsider (I kinda like the 'format' notion) but I would like a compleat (if brief) proposal that satisfies all of the discussion points raised in the various discussions, i.e. would require some work by someone :).

ctb · 2016-11-18T01:06:26Z

One extra consideration is whether Nanopore or PacBio add sequence attributes that we want to plan ahead for. I don't have a clear idea of this.

betatim · 2016-11-18T11:10:44Z

For me, read = khmer.Read("blah", "ACGT"); hasattr(read, 'quality') works as expected (returns False). (However, IPython tab-completes read.q to read.quality, so there is a bug somewhere.)

ctb · 2016-12-19T20:09:20Z

I believe #1484 (comment) is incorrect - 'description' is only set if parse_description=True.

ctb · 2016-12-19T20:09:46Z

But it seems like the bigger annoyance is that khmer now has 'annotations' and screed has 'description' (when parse_description is set)
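To make the naming difference concrete, here is a rough sketch of the two ways of handling an ID line described in this thread. `split_id_line` is a hypothetical helper, not screed's or khmer's code, and returning None when parsing is off is an assumption made for the sketch (per the comments above, screed simply does not set description in that case).

```python
def split_id_line(id_line, parse_description=True):
    """Hypothetical helper: split an ID line into (name, description)."""
    if not parse_description:
        # Whole ID line stays in the name; no description is produced.
        return id_line, None
    # First space-separated field is the name; the rest is the description.
    name, _, description = id_line.partition(" ")
    return name, description


print(split_id_line("read1 sample=A lane=3"))
print(split_id_line("read1 sample=A lane=3", parse_description=False))
```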

ctb · 2016-12-24T22:29:29Z

Note that we must allow 'screed.Record()', as it is used in khmer 2.0 which depends on screed > 0.9. In future we should remember to bound requirements by next major version :)

betatim · 2017-01-02T09:08:51Z

See dib-lab/screed#64 for allowing Read() to work again.

standage · 2017-01-21T15:55:21Z

screed.Record() compatibility between khmer and screed was restored with dib-lab/screed#64. As far as bringing reads/records into alignment between khmer and screed, it looks like there are two remaining discrepancies.

annotations attribute in khmer vs. description attribute in screed (#1484 (comment))
quality attribute present in khmer.Read but not screed.Record for FASTA records (#1468)

ctb · 2017-01-23T15:40:44Z

quality does not appear to be present in khmer.Read(...) if quality=None in constructor.

See test_read_parsers.py::test_read_type_basic and test_read_parsers.py::test_read_quality_none (added in #1559).

One remaining question - is it present when we read FASTA sequences?

betatim · 2017-01-23T15:44:35Z

In [2]: rp = khmer.ReadParser('tests/test-data/random-20-a.fa')

In [3]: for read in rp:
   ...:     print(read)
   ...:     print(hasattr(read, 'quality'))
   ...:     
<khmer.Read object at 0x7f31b9e4acc0>
False

Shall we immortalise by making it a test?

ctb · 2017-01-23T15:46:32Z

See #1583.

The only remaining issue here seems to be 'annotations' which we should be able to change in khmer (since AFAIK nothing uses it...)

betatim · 2017-01-23T15:55:18Z

I'm renaming annotations to description.

ctb added the screed label Oct 6, 2016

ctb added this to the screed 1.0 milestone Oct 6, 2016

This was referenced Oct 6, 2016

screed.Record(..., quality=None) should not set 'quality' #1468

Closed

Screed 1.0 Release #1478

Closed

camillescott mentioned this issue Nov 17, 2016

Have sandbox/sweep-reads2 output FASTQ records when quality is present #1515

Merged

ctb mentioned this issue Jan 23, 2017

Explicitly check that ReadParser does not add 'quality' attribute when reading FASTA #1583

Merged

8 tasks

betatim mentioned this issue Jan 23, 2017

Rename annotations to description #1584

Merged

8 tasks

betatim closed this as completed Jan 31, 2017

standage moved this from Slated to Closed in Screed 1.0 release Feb 17, 2017

luizirber moved this from TODO to Complete in Parameters for screed and khmer record objects Mar 27, 2017
