API for autocompletable attributes on pipeline #8481

Closed
amueller opened this Issue Mar 1, 2017 · 6 comments

@amueller
Member

amueller commented Mar 1, 2017

Many people use Jupyter for data analysis. I'd like us to be as responsive and interactive in this environment as possible. I'd also like the API to be discoverable by pressing Tab.

Unfortunately it's not possible to access attributes on pipelines without writing out full names.
AFAIK there are two ways to access attributes:

coef = pipeline.steps[1][1].coef_
coef = pipeline.named_steps['logisticregression'].coef_

I usually prefer the second one because I like names better than counting and it's more explicit for anyone reading the code.
But that requires me to know two things that I can't get via tab-completion: the names of the steps and the name of the attribute.

I think it would be awesome if I could get to coef_ via tab-completion, which would make the whole thing much more discoverable.

I'm not sure that's very easy to do, though.
One option would be to overload Pipeline's __getattr__ and __dir__ so that you could do

pipeline.logisticregression.coef_

That requires some magic, and it would not work for step names with a numeric suffix such as -1 or -2, since a hyphen is not valid in a Python attribute name. I would be fine if we don't support those names for now.
But more magic in the pipeline is not great.
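
As a rough sketch of what that first option might look like: the class below is a toy stand-in, not scikit-learn's Pipeline, and ToyPipeline, the SimpleNamespace "estimators", and the step name pca-1 are all made up for illustration. It shows __getattr__ resolving step names and __dir__ advertising only the names that can actually be typed after a dot.

```python
from types import SimpleNamespace


class ToyPipeline:
    """Hypothetical stand-in for a pipeline; NOT scikit-learn's Pipeline."""

    def __init__(self, steps):
        # steps is a list of (name, estimator) pairs, like pipeline.steps
        self.steps = list(steps)

    @property
    def named_steps(self):
        return dict(self.steps)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        # Read from __dict__ so a missing `steps` cannot cause recursion.
        for step_name, estimator in self.__dict__.get("steps", []):
            if step_name == name:
                return estimator
        raise AttributeError(name)

    def __dir__(self):
        # Advertise only step names that are valid Python identifiers.
        names = [n for n, _ in self.steps if n.isidentifier()]
        return sorted(set(super().__dir__()) | set(names))


# Fake fitted estimators: plain namespaces carrying a coef_ attribute.
pipe = ToyPipeline([
    ("standardscaler", SimpleNamespace()),
    ("logisticregression", SimpleNamespace(coef_=[0.5, -1.2])),
    ("pca-1", SimpleNamespace()),  # hypothetical name with a hyphen
])

coef = pipe.logisticregression.coef_
```

Here "logisticregression" shows up in dir(pipe), so a completer that relies on dir() can offer it, while "pca-1" is left out and stays reachable only through pipe.named_steps['pca-1'].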

We could also overload __getattr__ and __dir__ on pipeline.named_steps. That would limit the magic to named_steps, which is just for convenience and not used internally anywhere afaik. And we already have an implementation of this: the Bunch object.

So we could replace named_steps with a Bunch object and get an API that is way more discoverable (and easier on my fingertips ;)
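
For readers who haven't met the Bunch idea, here is a minimal sketch of a dict with attribute access. It is written from scratch for illustration and is an assumption about the general shape, not a copy of the library's own Bunch; the step name and the SimpleNamespace "estimator" are made up.

```python
from types import SimpleNamespace


class Bunch(dict):
    """Minimal dict subclass with attribute access (illustrative sketch)."""

    def __getattr__(self, key):
        # Fall back to dict lookup when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        self[key] = value

    def __dir__(self):
        # Expose the keys so dir()-based completers can list them.
        return list(self.keys())


# Hypothetical named_steps built from (name, estimator) pairs.
steps = [("logisticregression", SimpleNamespace(coef_=[0.5, -1.2]))]
named_steps = Bunch(**dict(steps))

by_attr = named_steps.logisticregression.coef_
by_key = named_steps["logisticregression"].coef_
```

Both spellings return the same object, so existing code that indexes named_steps by key would keep working under this sketch.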

@herilalaina

Contributor

herilalaina commented Mar 1, 2017

I would like to work on this issue

@jnothman

Member

jnothman commented Mar 1, 2017

@amueller

Member

amueller commented Mar 3, 2017

Can you link to your pipeline.pop proposal? I don't think the last estimator is necessarily any more important than the other ones.

@jnothman

Member

jnothman commented Mar 4, 2017

@amueller

Member

amueller commented Mar 6, 2017

Oh also: the keys are auto-completed but nothing that comes after is.
So I can auto-complete 'logisticregression' in the example above, but not coef_.

@jnothman

Member

jnothman commented Mar 6, 2017

@herilalaina herilalaina referenced this issue Mar 14, 2017

Merged

[MRG+1] Change named_steps to Bunch object #8586

3 of 3 tasks complete

@TomDLT TomDLT closed this in #8586 Mar 30, 2017
