Implement a generator for optionally overlapping blocks #35

mgeier · 2014-06-18T08:26:06Z

for block sf.blocks('filename'):
    # do something

The function signature should look something like this:

def blocks('filename', block_len=1024, overlap=0.0, fill_value=None, out=None):

overlap is a float of fractional block_len. overlap=0.5 would advance the read pointer by 512 in each iteration.
fill_value specifies the behavior in the last iteration: Either a full block_len is returned (padded with fill_value), or a shorter array is returned if fill_value=None.

See also #16

mgeier · 2014-04-25T06:30:49Z

I already implemented this in the branch https://github.com/bastibe/PySoundFile/tree/blocks.
I wanted to wait with a pull request until #34 is resolved and merged, because the block-functionality depends on it.

mgeier · 2014-04-25T08:33:03Z

I would suggest to use overlap in frames, not as a fraction of the block size.

I think both are valid ways to implement the feature, but the former is much easier to implement because we don't actually have to know the block size (so we don't have to examine out).

This is a matter of personal taste and previous experience, but I also think block_len=1024, overlap=512 is easier to grasp at first sight than block_len=1024, overlap=0.5.

bastibe · 2014-04-25T14:55:10Z

I actually think that a fractional overlap is nicer to work with. But this is not particularly important, I'm fine with either solution.

I do want to point out though that both pyplot.specgram and pyplot.psd use noverlap instead of overlap. That is probably a good idea as it avoids the confusion around these terms.

mgeier · 2014-04-29T09:57:12Z

I don't really like prefixing n, but that's of course a matter of taste.

We would have to use it in other places, too: nchannels, nframes, nsections, which I also think would be very ugly.

I also suspect that it's un-Pythonic and it's actually a mistake of the pyplot-people to use it. They probably use it because it's called like that in Matlab(TM), but that should be in no way a reason for us to use it.

bastibe · 2014-04-30T07:24:22Z

n means that it is an integer. In this case, there is reason to think that it could be a fraction. n makes that clear.

mgeier · 2014-05-17T15:10:28Z

n means that it is an integer.

That reeks like Hungarian Notation, which nobody uses anymore since decades [citation needed].

In this case, there is reason to think that it could be a fraction. n makes that clear.

Yes, but it's also hideous and un-Pythonic.

If you prefer, we can make it a fraction.
We could also allow both and decide which one it is based on the type.
But all that would need more code.
And there is always the rounding issue which may lead to sneaky off-by-one errors.
I think it's better to leave the conversion from float to int to the user.

Initially, I wanted to call it hopsize (or stepsize), which has no such ambiguity, but sadly, this cannot be used without knowing the actual block size.

Actually, now that we dropped channels_first, it's not so much implementation effort anymore (we can just use len(out)) and we could re-consider using hopsize.

Would you prefer that?

bastibe · 2014-05-19T08:01:04Z

Let's go with both fraction and integer for overlap, then. That is the most versatile approach, and it won't add too much code. I think overlap is the canonical name, so we should stick with that.

Regarding len(out), we should probably do the same thing as in read, and ignore block_len if out is given.

mgeier · 2014-05-22T19:33:37Z

Let's go with both fraction and integer for overlap, then. That is the most versatile approach, and it won't add too much code.

OK, implementation-wise this wouldn't be a problem.

But don't you fear people could get confused by this?

I think overlap is the canonical name, so we should stick with that.

What about hopsize?

We could also implement both (but not to be used together), but I think this would also be confusing ...

Regarding len(out), we should probably do the same thing as in read, and ignore block_len if out is given.

Sure, sounds reasonable.

bastibe · 2014-05-25T16:38:20Z

I think overlap is the canonical name. It is the name I see most often. I've never seen hopsize anywhere. Also, "hopping" is more or less synonymous to "jumping", which is not a fitting metaphor, as we are not "jumping over" anything really. I think overlap is a good name.

But don't you fear people could get confused by this?

Frankly, no. I don't see any failure state that could be triggered by misinterpreting overlap. The only reason I can think of against this is that "There should be one-- and preferably only one --obvious way to do it.". But then again, "[...] practicality beats purity.".

Regarding noverlap and your invocation of Hungarian Notation, I want to point you to Joel on Software, who shows how Apps Hungarian Notation can be used for good. I think noverlap would be such a good use, as it makes a semantic distinction between overlap-as-fraction and overlap-as-number-of-frames. It is not a simple type annotation, as Systems Hungarian would have it.

Thus, I am still in favor of noverlap as a number of frames of overlap, as this is the only un-ambiguous solution I can come up with. If you still oppose this, we can compromize on overlap as both a fraction and a number of samples. This will at least always work, even though it is less clean.

Any thoughts?

mgeier · 2014-06-04T15:27:23Z

"Hop size" is quite common, especially when doing windowed signal analysis. A web search for "stft hop size" shows plenty of examples.
I can't say which one is used more often, I think both are valid and reasonable names.
And I think "jumping" is quite a good metaphor, as far as metaphors go. We jump from one block to the next.
The point is I'm not inventing this stuff, the term is already widely in use.

I do like the sound and look of overlap, it's just very ambiguous:

overlap in frames
overlapping frames divided by block size
overlap as fraction, but specifying the denominator, e.g. overlap=4 means a quarter of the block size is overlapping

I've seen all 3 out there. I'd prefer the first interpretation.
In addition to these 3 options, one could also read it as total overlap on both sides of one block. One could say that with a block size of 1024 and a hop size of 512 the frames fully overlap (meaning overlap == 1.0) because there is no part of the signal where there is no overlap.
To be honest, i've never seen that, but it would be a possible interpretation.

I've read Joel's blog entry on Hungarian Notation years ago, it didn't convince me then, it doesn't convince me today (although it's interesting to know the history!).
Everybody agrees that Systems Hungarian is utter crap. Apps Hungarian may have some limited use cases in other languages but I'd say it should never be used in a Python program.
Also, it only really works when using lowerCamelCase, which is a no-go in Python.

Do you know a case of something named n* in the Python standard lib or any other widely used Python library?

I only know ndims in NumPy, but this is not Hungarian, this is just an abbreviation for "Number of DIMensionS".
I don't think this is a particularly beautiful choice, but I can accept it.
noverlap doesn't fall into this category, because it is not an abbreviation for "Number of OVERLAPs".
Another bad example which is regularly used is nfft for the FFT size, in which case I'd use the much more Pythonic fftsize.

My opinion in short form: I strongly oppose noverlap, I think overlap is OK, but I'm not sure about the fraction stuff and I think hopsize is also OK.

bastibe · 2014-06-06T18:22:34Z

Frankly, I have never seen hop size before. But since you have, it's a fine name as well.

However, Numpy is using overlap, so I think we should go with that name instead of hopsize, if only for familiarity to Numpy users. Also, I simply like it better.

As for noverlap, you are probably right. overlapsize would then be the right name, if a bit wordy. Actually, I would be fine with that. Or we just use overlap and say in the docs that it is in frames. Users should be able to figure that one out. We can even add an error check for it. Since there should be only one way to do things, let's stick with that and simply disallow fractions.

Would you be fine with that?

mgeier · 2014-06-10T14:38:57Z

Sure, I'm fine with an integer overlap called overlap, this is what I used in my initial implementation: e306f58, this also leads to a very simple implementation.

I actually grep'd the NumPy sources and didn't find any occurence of "overlap" as (part of) a function argument.
What function are you referring to?

I also checked in SciPy, and there I found only one: http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.signal.welch.html

This uses noverlap, but I think this function is specialized enough not to influence our decision.

bastibe · 2014-06-10T18:38:33Z

Yes, I was thinking of scipy.signal.welch.

Ok, are we decided on the function signature then?

blocks(file, block_len=1024, overlap=512, fill_value=None, out=None)

Do you agree with the default values?

mgeier · 2014-06-11T15:36:26Z

I don't like the name block_len.
I think we should try to avoid abbreviations wherever it's reasonable. Also, I think that underscores should only be used if necessary to improve readability (as I understand PEP 8).
Therefore, blocklength would seem better to me.
But even better, and my favorite, is blocksize.

I think fill_value should keep the underscore because it's easier to read then fillvalue (although I'm not sure about that). Apart from that, fill_value is heavily used in numpy.ma.MaskedArray.

I don't like setting a default block size. There is no reason why it should be exactly 1024 or any other fixed number. Even if there were a meaningful default value (which there isn't), it would depend on the sampling rate (which is unknown at the time of the function definition).
Also, a default block size would conflict with the out argument.
Therefore, the default value should be blocksize=None.
blocksize shouldn't be optional, the only reason it's defaulted to None is to allow out.

I vote for overlap=0. If I don't specify an overlap, I don't expect any overlap.

fill_value=None and out=None is great.

There is one further argument max_blocks=-1 which is useful for write-only files (and it may of course also be used in 'r' and 'rw' mode).

And then there are a lot of arguments missing from read() and open(), namely mode='r', sample_rate=None, channels=None, subtype=None, endian=None, format=None, closefd=True, dtype='float64' and always_2d=True.

I'm not quite sure if we should also add start=None and stop=None like in the read() function.
Currently I don't see a reason why not to allow them.
I don't see a reason why blocks() should be less powerful than read().
I still have to check if start and stop work with write-only files. start probably doesn't but I guess stop could be useful.

For SoundFile.blocks() some of the arguments are not necessary, there we only need blocksize=None, overlap=0, max_blocks=-1, dtype='float64' and always_2d=True, fill_value=None and out=None.
start doesn't make sense there and I think stop would be too unclear and confusing.

I both cases we could think about adding a frames=-1 argument for completeness.
Actually, this would be quite useful, I don't know why I didn't think about this before ...

bastibe · 2014-06-15T11:28:35Z

Yes, I missed quite a lot there, didn't I? This looks much better now:

SoundFile.blocks(file, blocksize=None, overlap=0, dtype='float64', always_2d=True, fill_value=None, out=None)
blocks(file, blocksize=None, overlap=0, dtype='float64', always_2d=True, fill_value=None, out=None, mode='r', sample_rate=None, channels=None, subtype=None, endian=None, format=None, closefd=None)

I like the idea of max_blocks a lot! I think that the same cause would be better served by using frames, though. With overlapping blocks, the number of blocks can become quite confusing. The main purpose of blocks is to abstract that confusion away. Thus, I think it is much easier to provide the file length in frames, than in number of blocks. Also, frames provides a nice symmetry with read.

I have not thought about writing to blocks yet though. Writing to blocks would have to either use out or generator.send (example):

out = np.empty(1024)
for data in sf.blocks('file.wav', 1024, mode='rw', out=out):
    out[:] = data[:]*0.5

for data in sf.blocks('file.wav', 1024, mode='rw') as writer:
    writer.send(data*0.5)

We could also return an array to write to:

for data, out in sf.blocks('file.wav', 1024, mode='rw'):
    out[:] = data[:]*0.5

The first example is compatible with the second and third example. The second and third however differ in their implementation, so we can only implement one of them. I guess that the second example is somewhat more pythonic, since it does not rely on mutating arguments. On the other hand, generator.send is probably one of the lesser-known features of Python and might be unfamiliar to many. Also, the third example would have a different number of arguments depending on whether blocks is called in read mode, read/write mode or write mode.

A fourth version would be to re-use the yielded value as output, thus

for data in sf.blocks('file.wav', 1024, mode='rw'):
    data[:] = data[:]*0.5

This might even be compatible with generator.send.

Any thoughts on this?

mgeier · 2014-06-15T15:39:02Z

You're right, we don't need max_blocks.
I became aware of frames quite late, while I wrote the last paragraph of my previous comment. I didn't realize at the time that this would make max_blocks obsolete.

I guess your first example would work, but the out argument wasn't intended to be used like that. It's meant for reading, not for writing.
Therefore, I wouldn't advertise this usage.

Your second example isn't valid Python code.
Iteration doesn't mix well with the use of send(). You have to call send() to get the next item as return value.
In your example, you'd have to call send() in an infinite loop and use the (modified) return value in the next invocation of send().
As much as I like the send() method of generators, I don't think we should use it here.
I think the normal use case for this is to send something to the generator and receive something in response.
In our case it would be reversed, we would want to receive a block and then send a modified block back.
This would cause an awkward asymmetry because the very first time we would have to send(None) and the last return value would have to be ignored.
I think this would be a mis-use of the send() mechanism.

Your third example would work, but it would need a special case for 'rw' mode where both data and out are needed.
Also, it would need more memory than necessary in the case that the user can re-use input and output buffers.

Your fourth example is the one I used in my original implementation: e306f58

I think it's the best because all three modes work exactly the same way.

bastibe · 2014-06-17T07:01:11Z

Sounds good

So in case of out in 'rw', the function would just yield out, right?

mgeier · 2014-06-17T12:42:43Z

It would yield whatever SoundFile.read() returns.
This would normally be out, except when the file has less frames or less frames were requested, then it would return a smaller view of out.

mgeier · 2014-06-17T14:55:55Z

I added a possible implementation based on our discussion: b70f6d7

This depends on #46. Once this is resolved, I'll make a proper pull request.

mgeier · 2014-06-18T08:33:53Z

I turned this issue into a pull request (with the very nice tool http://issue2pr.herokuapp.com/).

The logic is quite involved, there are many code paths, so it's very likely that there are some bugs.

The code would be much simpler if we would only support 'r' mode, but I think 'w' mode is equally useful, and 'rw' should be there for completeness (although I'm not quite sure if it will ever be used).

bastibe · 2014-06-19T16:47:59Z

Looks pretty good.

Why did you change the default start inread?
Please add documentation for the underscore-methods.
Please add tests (possibly after Implement Unit Tests using pytest. #48 is resolved)

mgeier · 2014-06-23T17:00:13Z

Why did you change the default start in read?

Good question, I'm not quite sure.

I thought start=0 is a more natural default than start=None since the first frame is always 0.

The reason why I initially used None, is because this is also used in the built-in slice() function.
But I reckon that this connection is not very obvious (especially since step is not available), therefore it doesn't help anyone if we use a similar default value here.

But if you don't like it, I can change it back.
I think none of them is wrong.

Please add documentation for the underscore-methods.

Sorry I missed that. I added a new commit 1735daa.

Please add tests (possibly after #48 is resolved)

Sure, will do.

bastibe · 2014-06-24T12:58:38Z

I thought start=0 is a more natural default than start=None since the first frame is always 0.

My bad, I mistook the read function for the read method. In the function, this change makes sense.

Looks great!

mgeier · 2014-06-25T13:21:41Z

blocks() can be used in list comprehensions, e.g.:

import pysoundfile as sf

rms = [np.sqrt(np.mean(b**2)) for b in sf.blocks('myfile.wav', blocksize=1024, overlap=512)]

Shall we put something like this into the docs?

Is there an even simpler (but still somehow realistic) example?

bastibe · 2014-06-26T13:16:35Z

This is awesome! What a wonderful feature!

Shall we put something like this into the docs?

Absolutely! (We should probably rewrite the readme anyway, though, so this can probably wait.)

mgeier · 2014-07-02T14:03:48Z

I added a note to https://github.com/bastibe/PySoundFile/issues/26#issuecomment-47779426.

mgeier · 2014-07-15T14:26:38Z

I rebased this PR onto the current master and added tests for the blocks() function.

Both the blocks() function and the blocks() method now have 100% test coverage, although the tests call only the function.

One situation that is not tested is the case when write and read position are different before calling the blocks() method (in 'rw' mode, of course).
This is a quite exotic situation, I don't know if we even want to test that ...

Any comments?
Errors?
Possible improvements?

bastibe · 2014-07-17T08:47:12Z

The whole read/write pointer logic is quite involved, and probably hard to understand. It makes anything involving rw mode pretty messy (not just blocks). It might be worth considering to simplify the pointer logic to one unified read/write-pointer. I can't really think of any valuable use for a separate pointer for reading and writing.

It is kind of sad that write mode does not allow for overlap. Overlap-Add is a very valuable technique. But maybe we should delegate that to a separate issue.

bastibe · 2014-07-17T08:53:03Z

Maybe we should remove the blocks method, and have blocks only as a function. I don't see a use case where blocks would be used on anything other than the whole file.

Also, this would create a clear distinction between the "high-level" functions and the "low-level" SoundFile.

What do you think?

mgeier · 2014-07-17T15:33:52Z

The whole read/write pointer logic is quite involved, and probably hard to understand. It makes anything involving rw mode pretty messy (not just blocks).

I agree.
But it also kind of makes sense: I normally want to start reading in the beginning and start writing in the end, therefore I need separate read/write positions.

It might be worth considering to simplify the pointer logic to one unified read/write-pointer. I can't really think of any valuable use for a separate pointer for reading and writing.

I think that's a hard decision to make. I don't know a real use case either, but still I'm reluctant to deliberately reduce the functionality libsndfile provides.
Even if neither of us needs this, there may be other people who do.
And they will be really pissed if we remove this feature.

I already removed this feature in the blocks() function, knowing that whoever needs this, can use the lower-level blocks() method.

If we would change to a single read+write position, where should it be initially when opening an 'rw' file?

If we remove the separate read/write position, the next logical step would be to remove 'rw' mode altogether.
For my personal use cases this wouldn't even be a problem, but on the other hand, I would like to support as much of libsndfile as is reasonably doable.

It is kind of sad that write mode does not allow for overlap. Overlap-Add is a very valuable technique.

It indeed is.
But it cannot logically work in write mode. You have to read the values before you are able to accumulate the new values onto them.
Therefore it can only work in 'rw' mode, which I think is complicated enough already.

I think a more realistic use case is to read block-wise from one file and write block-wise to another file.
To allow overlap when writing, I'd write partial blocks, which would be easiest when using 50% overlap. If a block size of 1024 with 512 samples overlap is desired, I'd write in blocks of 512 (without overlap) and do the windowing and accumulation in user code.

But maybe we should delegate that to a separate issue.

Yes please!
This has no effect on the API and would be perfectly backwards compatible, so there's no hurry.

Maybe we should remove the blocks method, and have blocks only as a function. I don't see a use case where blocks would be used on anything other than the whole file.

I don't see one either, but that doesn't mean there won't be one.
Also, even if it's only used on the whole file it might be desirable to have a blocks() method, because you probably want to open the file first to get some information (sample rate, length, ...) and then use blocks() with arguments based on this information.

Also, this would create a clear distinction between the "high-level" functions and the "low-level" SoundFile.

Probably.
But for me this is not a strong motivation for anything.

I think it's important for new users to see that there are a few very simple functions (esp. sf.read() and sf.write()) which are probably enough for most people.
Everything else, including blocks(), is not for casual use but for people who have a reason for looking into those specialized things.
What I want to say with that is that I wouldn't draw the line between functions and methods, but already within the functions.

mgeier · 2014-07-27T14:17:39Z

I added a commit to disable blocks() in 'w' mode (as discussed before): d29c034

The implementation indeed got simpler, but not as much as I hoped.
Especially the frames argument is surprisingly hard to handle correctly.
I initially missed a corner case, which I tried to fix with a0c5997.
Using out seems like a hack, but it was the only way I could come up with, that didn't require much code duplication between blocks() and read().

But probably I missed a much better solution?

One solution would be to get rid of the frames argument (and consequently also the start and stop arguments).
But I think there may actually be use cases for this, so I'm reluctant to remove this feature.

bastibe · 2014-07-27T14:33:17Z

I think we should require the blocksize (or out). It doesn't make sense to request blocks of unknown size, and defaulting to all the remaining frames just makes it a copy of read. This would also further simplify the code.

Looks good though.

mgeier · 2014-07-27T19:00:36Z

It doesn't make sense to request blocks of unknown size, and defaulting to all the remaining frames just makes it a copy of read.

This wasn't my intention. If neither blocksize nor out was given, some obscure error would have been raised.
I added a more specific error message in 64e5d55.

This would also further simplify the code.

How?

mgeier · 2014-08-03T09:24:13Z

Can I merge this and we do potential improvements later?

bastibe · 2014-08-03T10:07:53Z

Yes, please merge!

Implement a generator for optionally overlapping blocks

bastibe referenced this pull request Apr 24, 2014

Add arguments channels_first, always_2d, out and fill_value

202fe2b

See also #16

mgeier mentioned this pull request Jun 5, 2014

Add a high-level interface à la playrec bastibe/PySoundCard#19

Open

mgeier mentioned this pull request Jun 17, 2014

read(): don't ignore frames if out is given? #46

Merged

mgeier mentioned this pull request Jun 18, 2014

add tell function #44

Merged

mgeier mentioned this pull request Jul 2, 2014

Update README #26

Merged

mgeier mentioned this pull request Jul 5, 2014

Change order of arguments in read() function #54

Merged

mgeier added 3 commits July 15, 2014 15:22

Change default value of start from None to 0

29a2696

Change a RuntimeError to TypeError

03ac57c

Add blocks() method and function

4a389fd

mgeier added 2 commits July 25, 2014 15:42

Disallow blocks() in 'w' mode

d29c034

blocks(): fix special case with frames and fill_value

a0c5997

blocks(): check if one of {blocksize, out} is specified

64e5d55

mgeier mentioned this pull request Aug 2, 2014

Change open modes to be more in line with Python's open() function #60

Merged

mgeier added a commit that referenced this pull request Aug 3, 2014

Merge pull request #35 from bastibe/blocks

b5a4abc

Implement a generator for optionally overlapping blocks

mgeier merged commit b5a4abc into master Aug 3, 2014

mgeier deleted the blocks branch August 3, 2014 16:11

mgeier mentioned this pull request Oct 9, 2017

Update SoundFile.blocks. #209

Merged

Implement a generator for optionally overlapping blocks #35

Implement a generator for optionally overlapping blocks #35

Conversation

mgeier commented Jun 18, 2014

mgeier commented Apr 25, 2014

mgeier commented Apr 25, 2014

bastibe commented Apr 25, 2014

mgeier commented Apr 29, 2014

bastibe commented Apr 30, 2014

mgeier commented May 17, 2014

bastibe commented May 19, 2014

mgeier commented May 22, 2014

bastibe commented May 25, 2014

mgeier commented Jun 4, 2014

bastibe commented Jun 6, 2014

mgeier commented Jun 10, 2014

bastibe commented Jun 10, 2014

mgeier commented Jun 11, 2014

bastibe commented Jun 15, 2014

mgeier commented Jun 15, 2014

bastibe commented Jun 17, 2014

mgeier commented Jun 17, 2014

mgeier commented Jun 17, 2014

mgeier commented Jun 18, 2014

bastibe commented Jun 19, 2014

mgeier commented Jun 23, 2014

bastibe commented Jun 24, 2014

mgeier commented Jun 25, 2014

bastibe commented Jun 26, 2014

mgeier commented Jul 2, 2014

mgeier commented Jul 15, 2014

bastibe commented Jul 17, 2014

bastibe commented Jul 17, 2014

mgeier commented Jul 17, 2014

mgeier commented Jul 27, 2014

bastibe commented Jul 27, 2014

mgeier commented Jul 27, 2014

mgeier commented Aug 3, 2014

bastibe commented Aug 3, 2014