Issue 797 #803

ctSkennerton · 2016-04-06T20:53:48Z

Same changes as #801 minus the change in doctest that I couldn't figure out how to fix

Implements a solution to #797 by checking for DBSeqRecord and explicitly returning SeqRecord

codecov-io · 2016-04-06T21:26:45Z

Current coverage is `78.71%`

Merging #803 into master will not affect coverage as of 9e54d13

@@            master    #803   diff @@
======================================
  Files          315     315       
  Stmts        46619   46622     +3
  Branches         0       0       
  Methods          0       0       
======================================
+ Hit          36694   36697     +3
  Partial          0       0       
  Missed        9925    9925

Review entire Coverage Diff as of 9e54d13

Powered by Codecov. Updated on successful CI builds.

peterjc · 2016-06-24T15:10:36Z

Alternative solution which does not change the SeqRecord but duplicates code in DBSeqRecord:
https://github.com/peterjc/biopython/commits/biosql_slicing

What do you think @ctSkennerton? Yours seems a better long term choice as avoids code duplication.

ctSkennerton · 2016-06-24T17:39:56Z

I'm against duplicated code as the number of people that a looking at BioSQL is going to be less than for SeqRecord so there is a greater chance of things breaking.

Having said that, I think adding __getitem__ to BioSQL.BioSeq would be worth it if there are additional features that make use of the lazy loading from the database. For example instead of iterating over all of the features and then appending those that are inside the slice, a specific SQL query could be used to only return those features from the database, which would boost performance. (This is something I've thought about doing recently)

I see your comment in Bio.SeqRecord about keeping the api in sync - can this be enforced automatically using tests?

peterjc · 2016-06-24T19:07:24Z

A more elegant fix would be to extend the DBSeqRecord.__init__ to take start and end parameters (defaulting to everything) and using these for the sequence and feature lookup. But that is more work.

ctSkennerton · 2016-06-24T20:22:14Z

Yeah I think that is a nice approach but would require rewriting a bunch of the SQL statements. I think for now it would be best to keep it simple and not duplicate the code and then down the track think about adding it in.

Squashed commit of pull request biopython#803, closes issue biopython#797. Long term the DBSeqRecord could implement __getitem__, perhaps by extending __init__ to include start/end options (default to the full record) and acting as a windowed view of the full record (e.g. loading only selected features on demand).

ctSkennerton added 2 commits April 6, 2016 13:01

add test for DBSeqRecord slicing

5555870

revert commits that were causing doctest to fail

09325df

peterjc merged commit 48cd6bc into biopython:master Jun 27, 2016

peterjc mentioned this pull request Jun 27, 2016

Unable to slice a DBSeqRecord #797

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue 797 #803

Issue 797 #803

ctSkennerton commented Apr 6, 2016

codecov-io commented Apr 6, 2016

peterjc commented Jun 24, 2016

ctSkennerton commented Jun 24, 2016

peterjc commented Jun 24, 2016

ctSkennerton commented Jun 24, 2016

Issue 797 #803

Issue 797 #803

Conversation

ctSkennerton commented Apr 6, 2016

codecov-io commented Apr 6, 2016

Current coverage is 78.71%

peterjc commented Jun 24, 2016

ctSkennerton commented Jun 24, 2016

peterjc commented Jun 24, 2016

ctSkennerton commented Jun 24, 2016

Current coverage is `78.71%`