Skip to content

allow read/put use at module level#22

Merged
JonnyJD merged 5 commits into
masterfrom
easy_free
Apr 21, 2013
Merged

allow read/put use at module level#22
JonnyJD merged 5 commits into
masterfrom
easy_free

Conversation

@JonnyJD
Copy link
Copy Markdown
Collaborator

@JonnyJD JonnyJD commented Apr 14, 2013

Not exactly sure, why I decided using "del" wouldn't be good
enough, but having the option of an easier interface is no bad thing.

It is an open question if we should allow read/put on module and on
class level or hide the class methods.

Re-using the same class for multiple put()s is in theory faster, but not sure if that is relevant with the overhead the binding introduces.
Even when we decide to hide read/put on class level, we might just keep them for now, mark them as deprecated and remove them from documentation.

Not exactly sure, why I decided using "del" wouldn't be good
enough, but having the option of an easier interface is no bad thing.

It is an open question if we should allow read/put on module and on
class level or hide the class methods.
@JonnyJD
Copy link
Copy Markdown
Collaborator Author

JonnyJD commented Apr 14, 2013

That part of the interface is similar to how it works on https://github.com/sebastinas/python-libdiscid. There is no reason this wouldn't work with ctypes.

@phw
Copy link
Copy Markdown
Member

phw commented Apr 15, 2013

Funny to see this today, see what I did yesterday to the Ruby version: phw/ruby-discid@4b8e623 (wanted to do that refactoring for a long time, guess I'll release that as a 1.0).

I like that simplification for the python bindings, too. It feels more like a Python module and less like a Java one that way.

@JonnyJD
Copy link
Copy Markdown
Collaborator Author

JonnyJD commented Apr 15, 2013

Yes, I've seen your changes and that reminded me I wanted to push a branch with this change. Still a lot of things to test and change (all the tests, documentation decisions etc.), but it seems people want this simplification.

@JonnyJD
Copy link
Copy Markdown
Collaborator Author

JonnyJD commented Apr 19, 2013

Test program:

import discid

first = 1
last = 15
offsets = [258725, 150, 17510, 33275, 45910,
           57805, 78310, 94650,109580, 132010,
           149160, 165115, 177710, 203325, 215555, 235590]

disc_object = discid.DiscId()
def put_object():
    disc_object.put(first, last, offsets)
    #disc_object.id

def put_module():
    disc = discid.put(first, last, offsets)
    #disc.id

for i in range(0, 1000000):
    #put_module()
    put_object()

1 million put()s. Tried with module and class level put, with and without generating a disc id.
put_module with id: 37s, without: 16s
put_object with id: 31s, without: 10s
The calculation of the disc ID takes longer than the put (of course). There is some overhead when creating a new discid object (and allocating space).
So it is a significant difference in comparision of the methods, but when we can calculate 1 million disc IDs in 30 seconds, we don't have a performance problem in general.
There are only half a million disc IDs on musicbrainz.org!

@JonnyJD
Copy link
Copy Markdown
Collaborator Author

JonnyJD commented Apr 19, 2013

I removed the self.free() from __del__.
The module variant now took > 3 GB of RAM :-D (but not more time, since no swapping occured).
So I would say deallocation works fine with __del__.

If you try to calculate 1 M disc IDs, you normally would have some file system access anyways, which is probably slower than the whole discid operation.

FYI:
I tried the same with C. It doesn't seem to make a difference if I create a new object every time. (possibly the allocation itself is not a problem in python, but the additional calls to C). Both times it takes 17 secons with disc ID calculation and without the calculation the compiler seems to optimize most stuff to NOP, since it only takes 1/10 of a second for both then.

conclusion:
Performance is no reason to make read/put available on object/class level. We should deprecate usage of put/read on object level and only use it on module level. Being simple is better in this case than "premature optimization".
I will probably leave the read/put in the code for a while and print a deprecation warning to stderr.

JonnyJD added 4 commits April 20, 2013 00:56
These should now be used to create DiscID objects,
rather than creating them directly.
This also means the DiscId() object is not mentioned in code directly
anymore.
@ghost ghost assigned JonnyJD Apr 19, 2013
@JonnyJD
Copy link
Copy Markdown
Collaborator Author

JonnyJD commented Apr 19, 2013

Since I technically still use DiscId.read() from outside the class, I didn't rename it to DiscId._read(). It's not used by the user (at least not supposed to), but it isn't private either.
I just removed it from documentation. Documentation and simple usage are the concerns anyways.

JonnyJD added a commit that referenced this pull request Apr 21, 2013
See pull-request #22
@JonnyJD JonnyJD merged commit 4488e35 into master Apr 21, 2013
@JonnyJD JonnyJD deleted the easy_free branch April 21, 2013 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants