Skip to content
This repository has been archived by the owner on Jun 17, 2023. It is now read-only.

Speedup with_init #9

Merged
merged 4 commits into from
Oct 30, 2014
Merged

Speedup with_init #9

merged 4 commits into from
Oct 30, 2014

Conversation

hynek
Copy link
Owner

@hynek hynek commented Jun 28, 2014

The current version may be Pythonic but it‘s quite ineffective.

Since I like having a lot of classes with a lot of instances, I want it to be as fast as possible.

Let’s introduct some dark arts and create the initializer dynamically!

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 1386c83 on speedup into 92902f8 on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 1386c83 on speedup into 92902f8 on master.

@hynek
Copy link
Owner Author

hynek commented Jun 28, 2014

@hynek
Copy link
Owner Author

hynek commented Jun 28, 2014

<exarkun> hynek: I would rather have a 30% slower __init__ than *ever* have to see what happens when pdb encounters the code in that PR.

> <string>(14)init()->None
(Pdb++) l
[EOF]
(Pdb++)

@hynek hynek closed this Jun 28, 2014
@glyph
Copy link
Collaborator

glyph commented Jun 29, 2014

Personally, without this optimization (or one like it), the point of characteristic is somewhat obviated for me. It's annoying to type __init__(self, foo, bar, ...): self.foo = foo, self.bar = bar, self.... = ... but if the upshot of doing so is that I see a 30% speedup, then it will likely be worth it. The reason I want to use characteristic is to make it easier to have a proliferation of tiny objects, the sort of objects which often show up in performance-critical code and would otherwise be tuples.

As far as characteristic is supposed to be promoting this approach and making it easier to declare objects, I strongly suspect that the average Python programmer will immediately give up on it when they learn that it's going to cause a 30% slowdown for object construction throughout their codebase.

That's not to say that @exarkun's concern is unwarranted; it would be good to make pdb do the right thing here as well, but I don't think that it's impossible to do that. It could be integrated with the linecache module, for example.

@hynek hynek reopened this Jun 29, 2014
@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 1386c83 on speedup into 92902f8 on master.

@hynek
Copy link
Owner Author

hynek commented Jun 29, 2014

Thank you for your support glyph.

I see it exactly your way; if I want to promote many small objects, it should be as fast as possible and pre-writing optimal code seems like exactly the way to go.

Would you consider the linecache thing a blocker for this PR? I’ve never used it, so help would be appreciated.

@glyph
Copy link
Collaborator

glyph commented Jun 29, 2014

Personally I wouldn't consider it a blocker but then again I don't spend a lot of time staring at inscrutable state in PDB, and I recognize that this is a privilege bought with the time of people who do ;-).

I'll have a crack at getting code-inspection tools like debuggers or profilers to do something reasonable with this in the coming week. Please feel free to pester me; specifically, I think I can make some time on Wednesday for this.

@hynek
Copy link
Owner Author

hynek commented Jun 29, 2014

Good, I’m merging it now with the loose prospect of making it nicer later.

Stepping through looks like this btw:

-> InitWithKWArgs(a=1, b=42)
(Pdb++) s
--Call--
> <string>(1)init()
(Pdb++) s
> <string>(5)init()
(Pdb++) s
> <string>(6)init()
(Pdb++) s
> <string>(12)init()
(Pdb++) s
--Call--
> /Users/hynek/Projects/characteristic/test_characteristic.py(197)__init__()
-> def __init__(self, **kw):
(Pdb++) s
> /Users/hynek/Projects/characteristic/test_characteristic.py(198)__init__()
-> assert "a" not in kw
(Pdb++) s
> /Users/hynek/Projects/characteristic/test_characteristic.py(199)__init__()
-> assert "b" in kw
(Pdb++) c

It’s not pretty but there’s worse things in the world if we don’t get it done.

Would changing <string> into something like <autogenerated-init-wrapper-by-characteristic> be a helpful compromise @exarkun?

@exarkun
Copy link

exarkun commented Jun 29, 2014

Would changing into something like be a helpful compromise @exarkun?

This has little or nothing to do with my concern. Losing the ability to coherently step through the source code is my concern.

Not really. The first time I have to debug this kind of code I'll probably switch away from characteristic. Having to type the __init__ myself once

@hynek
Copy link
Owner Author

hynek commented Jun 30, 2014

That would be a pity because the init-autogen is in there more or less for you.

Let’s see if glyph gets that pdb-able.

@glyph
Copy link
Collaborator

glyph commented Jun 30, 2014

Not really. The first time I have to debug this kind of code I'll probably switch away from characteristic. Having to type the init myself once

Well, the __init__. And the __eq__. And maybe the __ne__. And the __repr__. And the __str__, maybe. And possibly the __lt__ and the __gt__, although I guess you can ignore those on py3?

If you can coherently step through the generated code in PDB with a source listing, would that address your concern? Or is any level of automated code generation unacceptable?

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 511ccab on speedup into be12cca on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 511ccab on speedup into be12cca on master.

@hynek
Copy link
Owner Author

hynek commented Aug 6, 2014

I’ve rebased it for the new factory code; making it jp-compatible would be very much appreciated now (since I have no idea how).

FWIW, this is the last blocker for 14.0.

@hynek hynek added this to the 14.0 milestone Aug 6, 2014
@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 3f9c685 on speedup into be12cca on master.

@glyph
Copy link
Collaborator

glyph commented Aug 6, 2014

Let's be specific: "jp-compatibility" is defined as having comprehensible behavior when you step through the source code. A little asciicast of some sensible single-stepping through PDB or PUDB where you can tell when each attribute is set should demonstrate the proof of concept.

In the absence of further feedback from @exarkun I am going to assume that the major missing piece here is the absence of any source listing in pdb, because that's where you can't tell what's going on when single-stepping through the code (in other words, that's the feature that would be good enough for me).

Replacing <string> with something more descriptive does seem like another requirement.

@hynek
Copy link
Owner Author

hynek commented Aug 6, 2014

Yup, that’s what I understood too. FWIW, the script is already accessible within __init__._script for your dark sorcery.

@hynek
Copy link
Owner Author

hynek commented Aug 17, 2014

FWIW this needs another heavy rebase but I’m not going to do it myself.

@hynek hynek force-pushed the master branch 3 times, most recently from 1ccc7cd to a2be970 Compare August 22, 2014 13:47
@hynek hynek force-pushed the speedup branch 2 times, most recently from db0179d to 6c18e80 Compare October 25, 2014 14:02
@hynek
Copy link
Owner Author

hynek commented Oct 25, 2014

So the good news is, I got this working!

The bad news is: it has to be redone on top of the new master that added some fine features.

Anyhow, I’m glad to have 6c18e80 thanks to @fijal.

@hynek
Copy link
Owner Author

hynek commented Oct 26, 2014

Aaand here we go. I have rebased once more and am curious what y’all say.

The speedups are pretty significant: https://gist.github.com/hynek/76b2536c99289ba3e3b2

It’s still noticeably slower than hand-crafted simple version but it should open the doors for further optimizations in the future.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling db30428 on speedup into 53df083 on master.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 7632a18 on speedup into 53df083 on master.

Side-stepping doesn't work anymore since we don't use setattr anymore.
@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 19f0285 on speedup into 53df083 on master.

@lvh
Copy link

lvh commented Oct 28, 2014

This does not break any of the test suites of my projects using characteristic.

@glyph
Copy link
Collaborator

glyph commented Oct 28, 2014

@hynek - sorry I never got around to this. I'm extremely grateful that you made the time. Thank you for enhancing this important optimization with this arguably even more important debuggability and readability feature :).

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling c7ed524 on speedup into 53df083 on master.

hynek added a commit that referenced this pull request Oct 30, 2014
Speedup with_init

Code generation is now used to create the actual initializer.

This is is much faster already and opens up opportunities for even more optimizations in the future.
@hynek hynek merged commit ee5ee15 into master Oct 30, 2014
@hynek
Copy link
Owner Author

hynek commented Oct 30, 2014

Since I’ve got multiple feedback that nothing breaks, I’m merging it now. Thanks to everyone for their cooperation!

@hynek hynek deleted the speedup branch October 30, 2014 09:18
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants