Implement a generator for optionally overlapping blocks #35
I already implemented this in the branch https://github.com/bastibe/PySoundFile/tree/blocks. |
I would suggest to use I think both are valid ways to implement the feature, but the former is much easier to implement because we don't actually have to know the block size (so we don't have to examine This is a matter of personal taste and previous experience, but I also think |
I actually think that a fractional overlap is nicer to work with. But this is not particularly important, I'm fine with either solution. I do want to point out though that both |
I don't really like prefixing We would have to use it in other places, too: I also suspect that it's un-Pythonic and it's actually a mistake of the |
|
That reeks of Hungarian Notation, which nobody has used for decades [citation needed].
Yes, but it's also hideous and un-Pythonic. If you prefer, we can make it a fraction. Initially, I wanted to call it Actually, now that we dropped Would you prefer that? |
Let's go with both fraction and integer for Regarding |
OK, implementation-wise this wouldn't be a problem. But don't you fear people could get confused by this?
What about We could also implement both (but not to be used together), but I think this would also be confusing ...
Sure, sounds reasonable. |
I think
Frankly, no. I don't see any failure state that could be triggered by misinterpreting Regarding Thus, I am still in favor of Any thoughts? |
"Hop size" is quite common, especially when doing windowed signal analysis. A web search for "stft hop size" shows plenty of examples. I do like the sound and look of
I've seen all 3 out there. I'd prefer the first interpretation. I read Joel's blog entry on Hungarian Notation years ago; it didn't convince me then, and it doesn't convince me today (although it's interesting to know the history!). Do you know a case of something named I only know My opinion in short form: I strongly oppose |
Frankly, I have never seen hop size before. But since you have, it's a fine name as well. However, Numpy is using As for Would you be fine with that? |
Sure, I'm fine with an integer overlap called I actually grepped the NumPy sources and didn't find any occurrence of "overlap" as (part of) a function argument. I also checked in SciPy, and there I found only one: http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.signal.welch.html This uses |
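To make the naming discussion concrete, here is a minimal sketch of how the three candidates (integer overlap, fractional overlap, hop size) relate to each other. The helper name `hop_from_overlap` is made up for illustration and is not part of PySoundFile.

```python
def hop_from_overlap(blocksize, overlap=0):
    """Return the hop size, i.e. how many frames to advance per block.

    Hypothetical helper: `overlap` is either an integer number of frames
    or a float fraction of `blocksize` in [0, 1).
    """
    if isinstance(overlap, float):
        if not 0.0 <= overlap < 1.0:
            raise ValueError("fractional overlap must be in [0, 1)")
        # Convert the fraction into a whole number of frames.
        overlap = int(round(overlap * blocksize))
    if not 0 <= overlap < blocksize:
        raise ValueError("overlap must be in [0, blocksize)")
    return blocksize - overlap


print(hop_from_overlap(1024, 512))  # integer overlap
print(hop_from_overlap(1024, 0.5))  # fractional overlap, same hop
```

Whichever name is chosen, the other two quantities follow directly from it and the block size, so the choice is mostly about which one reads best at the call site.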
Yes, I was thinking of Ok, are we decided on the function signature then?
Do you agree with the default values? |
I don't like the name I think I don't like setting a default block size. There is no reason why it should be exactly 1024 or any other fixed number. Even if there were a meaningful default value (which there isn't), it would depend on the sampling rate (which is unknown at the time of the function definition). I vote for
There is one further argument And then there are a lot of arguments missing from I'm not quite sure if we should also add In both cases we could think about adding a |
Yes, I missed quite a lot there, didn't I? This looks much better now:

SoundFile.blocks(file, blocksize=None, overlap=0, dtype='float64', always_2d=True, fill_value=None, out=None)
blocks(file, blocksize=None, overlap=0, dtype='float64', always_2d=True, fill_value=None, out=None, mode='r', sample_rate=None, channels=None, subtype=None, endian=None, format=None, closefd=None)

I like the idea of I have not thought about writing to blocks yet though. Writing to blocks would have to either use

out = np.empty(1024)
for data in sf.blocks('file.wav', 1024, mode='rw', out=out):
    out[:] = data[:]*0.5

for data in sf.blocks('file.wav', 1024, mode='rw') as writer:
    writer.send(data*0.5)

We could also return an array to write to:

for data, out in sf.blocks('file.wav', 1024, mode='rw'):
    out[:] = data[:]*0.5

The first example is compatible with the second and third example. The second and third however differ in their implementation, so we can only implement one of them. I guess that the second example is somewhat more pythonic, since it does not rely on mutating arguments. On the other hand, A fourth version would be to re-use the yielded value as output, thus

for data in sf.blocks('file.wav', 1024, mode='rw'):
    data[:] = data[:]*0.5

This might even be compatible with Any thoughts on this? |
You're right, we don't need I guess your first example would work, but the Your second example isn't valid Python code. Your third example would work, but it would need a special case for Your fourth example is the one I used in my original implementation: e306f58 I think it's the best because all three modes work exactly the same way. |
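For readers wondering how the fourth variant can work at all, here is a rough sketch of the idea on a plain Python list instead of a sound file. The name `rw_blocks` and the list-based "storage" are assumptions for illustration only; the actual implementation in the linked commit works on file read/write positions.

```python
def rw_blocks(storage, blocksize):
    """Yield successive blocks of `storage` and write back whatever the
    caller left in each block (hypothetical stand-in for mode='rw')."""
    for start in range(0, len(storage), blocksize):
        block = storage[start:start + blocksize]  # a copy, like a read buffer
        yield block
        # Execution resumes here when the caller requests the next block,
        # so in-place changes made to `block` are written back now.
        storage[start:start + len(block)] = block


samples = [1.0, 2.0, 3.0, 4.0, 5.0]
for data in rw_blocks(samples, 2):
    data[:] = [x * 0.5 for x in data]
print(samples)
```

One consequence of this design is that the write-back only happens when the generator is resumed, so a caller that breaks out of the loop early leaves the last yielded block unwritten.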
Sounds good So in case of |
It would yield whatever |
I turned this issue into a pull request (with the very nice tool http://issue2pr.herokuapp.com/). The logic is quite involved, there are many code paths, so it's very likely that there are some bugs. The code would be much simpler if we only supported |
Looks pretty good.
|
Good question, I'm not quite sure. I thought The reason why I initially used But if you don't like it, I can change it back.
Sorry I missed that. I added a new commit 1735daa.
Sure, will do. |
My bad, I mistook the Looks great! |
import numpy as np
import pysoundfile as sf
rms = [np.sqrt(np.mean(b**2)) for b in sf.blocks('myfile.wav', blocksize=1024, overlap=512)]
Shall we put something like this into the docs? Is there an even simpler (but still somehow realistic) example? |
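As a dependency-free way to check what the one-liner computes, the same block-wise RMS can be sketched on an in-memory list. The function `block_rms` is hypothetical, and unlike a file-based generator this sketch simply skips a trailing partial block.

```python
import math


def block_rms(samples, blocksize, overlap=0):
    """RMS of each full block; consecutive blocks share `overlap` samples."""
    hop = blocksize - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than blocksize")
    result = []
    # Only full blocks are considered in this sketch.
    for start in range(0, len(samples) - blocksize + 1, hop):
        block = samples[start:start + blocksize]
        result.append(math.sqrt(sum(x * x for x in block) / blocksize))
    return result


print(block_rms([3.0, 3.0, 4.0, 4.0], blocksize=2))
print(block_rms([3.0, 3.0, 4.0, 4.0], blocksize=2, overlap=1))
```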
This is awesome! What a wonderful feature!
Absolutely! (We should probably rewrite the readme anyway, though, so this can probably wait.) |
I added a note to https://github.com/bastibe/PySoundFile/issues/26#issuecomment-47779426. |
I rebased this PR onto the current master and added tests for the Both the One situation that is not tested is the case when write and read position are different before calling the Any comments? |
The whole read/write pointer logic is quite involved, and probably hard to understand. It makes anything involving It is kind of sad that write mode does not allow for overlap. Overlap-Add is a very valuable technique. But maybe we should delegate that to a separate issue. |
Maybe we should remove the Also, this would create a clear distinction between the "high-level" functions and the "low-level" What do you think? |
I agree.
I think that's a hard decision to make. I don't know a real use case either, but still I'm reluctant to deliberately reduce the functionality libsndfile provides. I already removed this feature in the If we would change to a single read+write position, where should it be initially when opening an If we remove the separate read/write position, the next logical step would be to remove
It indeed is. I think a more realistic use case is to read block-wise from one file and write block-wise to another file.
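To illustrate that use case without relying on any PySoundFile API, here is a small sketch using only the standard-library `wave` module. The function name `copy_blocks` is made up, and the per-block processing step is left as a comment.

```python
import wave


def copy_blocks(src_path, dst_path, blocksize=1024):
    """Copy a WAV file block by block; return the number of frames copied."""
    copied = 0
    with wave.open(src_path, "rb") as src, wave.open(dst_path, "wb") as dst:
        dst.setparams(src.getparams())
        bytes_per_frame = src.getsampwidth() * src.getnchannels()
        while True:
            frames = src.readframes(blocksize)  # raw bytes, at most one block
            if not frames:
                break
            # Any per-block processing would go here.
            dst.writeframes(frames)
            copied += len(frames) // bytes_per_frame
    return copied
```

With two separate files there is only one read position and one write position to keep track of, which sidesteps the read/write pointer questions discussed above.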
Yes please!
I don't see one either, but that doesn't mean there won't be one.
Probably. I think it's important for new users to see that there are a few very simple functions (esp. |
I added a commit to disable The implementation indeed got simpler, but not as much as I hoped. But probably I missed a much better solution? One solution would be to get rid of the |
I think we should require the blocksize (or out). It doesn't make sense to request blocks of unknown size, and defaulting to all the remaining frames just makes it a copy of read. This would also further simplify the code. Looks good though. |
This wasn't my intention. If neither
How? |
Can I merge this and we do potential improvements later? |
Yes, please merge! |
Implement a generator for optionally overlapping blocks
The function signature should look something like this:
overlap is a float of fractional block_len. overlap=0.5 would advance the read pointer by 512 in each iteration. fill_value specifies the behavior in the last iteration: either a full block_len is returned (padded with fill_value), or a shorter array is returned if fill_value=None.
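The behavior described above can be sketched on an in-memory list as follows. This is only an illustration under assumptions: the function works on a list rather than a file, the fractional overlap is rounded down to whole samples, and iteration stops once a block would contain no new samples. With block_len=1024 and overlap=0.5 the hop works out to the 512 mentioned above.

```python
def blocks(samples, block_len, overlap=0.0, fill_value=None):
    """Yield lists of up to `block_len` samples with fractional `overlap`."""
    n_overlap = int(block_len * overlap)
    hop = block_len - n_overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than 1")
    start = 0
    while True:
        # After the first block, the first n_overlap samples are repeats.
        first_new = start if start == 0 else start + n_overlap
        if first_new >= len(samples):
            break
        block = list(samples[start:start + block_len])
        if len(block) < block_len and fill_value is not None:
            # Last iteration: pad to a full block.
            block += [fill_value] * (block_len - len(block))
        yield block
        start += hop


print(list(blocks(range(5), 4)))                # last block is shorter
print(list(blocks(range(5), 4, fill_value=0)))  # last block is padded
```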