Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use introspection rather than calling Interval() inside methods that return new intervals from inside instances #58

Closed
xguse opened this issue Apr 23, 2021 · 6 comments · Fixed by #59
Labels
enhancement New feature or request

Comments

@xguse
Copy link

xguse commented Apr 23, 2021

Here is what I mean:

Rather than this:

def __or__(self, other):
    if isinstance(other, Interval):
        return Interval(self, other)
    else:
        return NotImplemented

I think that something like this is preferable:

def __or__(self, other):
    if isinstance(other, self.__class__):
        return self.__class__(self, other)
    else:
        return NotImplemented

The reason for my feeling here is that it allows one to subclass your extremely helpful Interval class and allows that subclass to return instances of itself when making an or comparison, for example.

Lets say that I have two objects that are subclasses of Interval (SpecialInterval): a and b.

In your code, type(a | b) returns Interval.

I think most people would expect it to return a SpecialInterval object since we were comparing two SpecialInterval instances.

Does this make sense?

@AlexandreDecan AlexandreDecan added the enhancement New feature or request label Apr 23, 2021
@AlexandreDecan
Copy link
Owner

Hello,

Yes, it makes sense ;-) Actually, that's something I already implemented in a (quite old now) local branch. I've never made it public because I haven't found a proper solution to enable IntervalDict, from_string, from_data, etc. to automatically use the "appropriate" Interval subclass. However, that's not mandatory and already making Interval easier to be subclassed is a good first step ;-)

I'll look at this next week. I'll update this issue as soon as something get done ;)

@AlexandreDecan
Copy link
Owner

I have two questions:

(1) Would you agree to review the changes related to this issue? If so, I'll create a PR, if not, I'll commit directly to the master branch;
(2) Could you explain your use-case for a subclass of Interval? Perhaps I can do something to add the "missing" functionalities or to ease them.

Thank you!

@xguse
Copy link
Author

xguse commented Apr 26, 2021

I have two questions:

(1) Would you agree to review the changes related to this issue? If so, I'll create a PR, if not, I'll commit directly to the master branch;

Sure I would be happy to take a look.

(2) Could you explain your use-case for a subclass of Interval? Perhaps I can do something to add the "missing" functionalities or to ease them.

So I am a computational biologist, and I deal with DNA sequences quite a bit which you can think of as a number line for these purposes. Genes and other sequence features are defined ontop of this number line by their start and stop coordinates. Here is an illustration.

image

Now sometimes I was to "slice" out portions of a larger sequence and have the annotated features re-adjust themselves onto the new seq which will have a different "zero" position.

So I will need to check my registry of features for which ones happen to overlap the region I am slicing, then clip the ends that fall outside the sliced region and finally slide the whole set to the left to account for the missing "upstream" region.

As an illustration take a look at this:

These are the feature intervals on the original sequence.
image

And these are after I have used my Interval subclass.
image

from_skbio_interval is a method that has no general use. Its just a method to make an Interval out of a domain specific data source. But the rest may be of general interest.

our_chunk is the interval that I want to slice out of the full seq.

.ceiling(our_chunk.upper) checks the target ival for whether the its upper extends beyond the provided location and clips it at the provided location if it does. Otherwise, it does nothing. .floor is similar but deals with the lower bound.

.slide(-our_chunk.lower) transforms the interval up or down the spectrum by the provided amount.

I hope that is clear. I can elaborate if you would like.

@AlexandreDecan
Copy link
Owner

Hello,

For (1) Ok, thank you ;) I'll submit a PR soon, and I'll ping you so that you can review these changes ;)

For (2) Thanks for these explanations. I've the impression that ceiling and floor can be easily implemented using replace (e.g. floor(x) could be replace(lower=lambda v: max(x, v), and that slide(x) could be implemented using apply (or even replace depending on the right semantics ;-)). Obviously, it's far more convenient to have all those functions as instance methods, so I understand why you asked for easing subclassing Interval (and we will ;-)). Perhaps at some point I should think about a "protocol" that allows one to add methods to interval instances on-the-fly (e.g., Interval.register(method_name, func)) so that there won't be a need to subclass Interval in those cases.

@AlexandreDecan AlexandreDecan linked a pull request Apr 27, 2021 that will close this issue
@AlexandreDecan
Copy link
Owner

I don't know if you got "ping-ed" by the pull request, but here it is: #59
Sorry if you already received a notification ;-)

@AlexandreDecan
Copy link
Owner

Hi! I hope you're doing well. Can you have a look at #59 please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants