Add open(), read() and write() functions #34

mgeier · 2014-04-21T21:20:18Z

This is the combination of a few commits from #18.
See also #14.

mgeier · 2014-05-04T18:20:46Z

I re-read the discussion about channels_first and I didn't find any really convincing arguments, neither for nor against it (except that it would definitely make the code more complicated).
I still have the feeling that it would be a good feature, but as long as there is no really convincing argument and as long as there is no better name I guess it's better to drop it.

I removed it in 12294be, but I will change the whole pull request to get fewer and more consistent commits.

mgeier · 2014-05-04T18:53:08Z

I combined the previous mess of commits into three fresh ones.
This also sent most of the discussion to some level of Github nirvana.

I'm not sure if we agree on all points now ...

@bastibe: we can take the changes to the tests from your branch open-read-write-alternative, they should still pass.

bastibe · 2014-05-05T07:07:54Z

I think we agree on the API for SoundFile.read and SoundFile.write now.

Implementation wise, have a look at a700fcf for simplifying SoundFile.read. I don't think we need the two helper functions. It would be nice to bring that complexity back down to a more manageable level.

Besides that, I am not satisfied with read write and open yet. But maybe we can postpone that and merge SoundFile.read and SoundFile.write first.

I think we agreed that read and write should contain some basic documentation. In particular, we should mention that file may be any IO class, a file name or a file object and that negative frames defaults to the whole file.
We should move the import in read to the top of the file.
Maybe it would be cleaner for read and write to have fixed arguments (no **kwargs). I feel that these two functions really meant to be simple above all else, so we don't lose much by limiting them to a subset SoundFile's functionality. If you want full control, using SoundFile itself is the right thing to do anyway.
I think that open really is redundant. I don't see how it is any simpler than SoundFile.

What do you think about these points?

mgeier · 2014-05-05T12:57:20Z

I fail to see how SoundFile.read() is simpler in a700fcf, can you please elaborate?

The helper function _create_out_array() indeed seems superfluous, but I will use it later also in the blocks() method to create a block in write mode.

The other helper function _check_frames_and_channels() is used in three places (including the write() function), I wouldn't want to repeat the code three times.

But this raises a more fundamental question: how defensive should we be when handling user input?
The main idea behind _check_frames_and_channels() is that it includes some error messages which should help the user in case of wrong arguments.
E.g. if the user passes a scalar value as data or out, there is an error message saying that only one- and two-dimensional arrays are allowed.
Also, if the array happens to be empty, there is an error message saying so instead of raising a division-by-zero error a few lines later.
I think this helps unexperienced users if they pass wrong arguments by mistake, but it also may be a little bit too defensive?
I'm having a hard time where exactly to draw the line ...

I created a new pull request (#36) containing only the commits about the read() and write() methods. After merging that, we can continue here with the discussion about the open(), read() and write() functions.

bastibe · 2014-05-09T14:24:13Z

I think _check_frames_and_channels is not a great function yet. While it is used in three places, not all of those places actually need all the functionality it provides. Also, it provides two distinct sets of functionality:

Check data integrity and throw errors
Calculate the number of channels

At the very least, this should be factored in separate functions. In #37, I actually do not use the function at all.

mgeier · 2014-05-11T17:47:45Z

You're right that _check_frames_and_channels() isn't a great function ...

I'm still not clear on your position regarding how defensive we should be when checking user input.
Your commits in #37 suggest that you want to be less defensive, is this true?

bastibe · 2014-05-11T18:43:54Z

Sorry I was so inactive last week. I was sick.

Anyway, about defensiveness:
Personally, I am fine with getting unclear error messages in case I blatantly mis-use something. I don't think we have to care about a user passing a scalar to out. After all, the documentation clearly states that out has to be a numpy array. We don't test the types for most inputs. This is just the reality of working with dynamic typing.
As for a zero-length out, I think we should simply not read anything. It is not technically an error to read zero frames. If the user asks for zero frames, he should get zero frames. I just added a test for this in d331c89.

This is the combination of a few commits from #18, plus a few more things. See also #14.

mgeier · 2014-05-29T16:51:40Z

Now that we've merged #36, I rebased this PR onto the current master (0ca0eb3).

I didn't add tests yet because I want to wait for #41.

Also, I'd like to wait until we find an agreement on the discussions in #33.

I quickly checked, though, by replacing the methods with the respective functions in the tests (at least in those that don't involve seek() or other SoundFile-specific stuff) and all tests pass.

I think we agreed that read and write should contain some basic documentation. In particular, we should mention that file may be any IO class, a file name or a file object and that negative frames defaults to the whole file.

Yes, we agreed.
However, I think it's soon enough if we improve the docstrings in the course of #27.
I don't want to tune the docstrings now just to re-write everything in #27.

We should move the import in read to the top of the file.

I put it there to avoid the slightly ugly from inspect import getargspec as _getargspec.
But if you prefer that, I'll change it, no problem.

Another, probably not very convincing argument would be that if only the low-level stuff is used, the import doesn't have to be done at all.

Maybe it would be cleaner for read and write to have fixed arguments (no **kwargs). I feel that these two functions really meant to be simple above all else,

This wouldn't be cleaner nor simpler.
It would just be a maintenance burden.

It would be nice to manipulate the signature automatically based on the method's signatures, but AFAIK this in only possible with severe hackery involving exec() or eval().
It could probably be done like in the decorator module (https://pypi.python.org/pypi/decorator) or like there:
http://python.dzone.com/articles/copying-python-function%E2%80%99s
http://stackoverflow.com/a/18626691/500098

But as soon as we have proper docstrings (#27), I think the need for that will go away.

so we don't lose much by limiting them to a subset SoundFile's functionality.

Oh no, please let's not!

It would be crazy to unnecessarily limit the features of the functions.

All parameters as they are implemented in this PR make perfect sense.
Removing any of them would just deliberately limit the usefulness of PySoundFile as a whole, and I wouldn't want that.

If you want full control, using SoundFile itself is the right thing to do anyway.

Sure, there still remain use cases where the methods have to be used.

But I guess a huge percentage of users will never have to call any of the methods directly.

And that's a good thing!

It also makes for a much cleaner documentation. All the SoundFile stuff can be moved to "advanced topics".

I think that open really is redundant. I don't see how it is any simpler than SoundFile.

Obviously it's redundant in the sense that it's literally return SoundFile(*args, **kwargs) and nothing more.

But it's just so much more Pythonic!

If you want to open a text file in Python, you use f = open('myfile.txt').
You don't use f = io.TextIOWrapper(...).

Therefore, I think it perfectly makes sense to prefer f = sf.open('myfile.wav') instead of f = SoundFile('myfile.wav').

I tried to make this point already (https://github.com/bastibe/PySoundFile/pull/14#issuecomment-36464595), sorry for the repetition.

mgeier · 2014-05-31T15:21:10Z

I thought a bit about the **kwargs stuff and I've probably changed my mind ...

I still think spelling out all arguments would mean an annoying amount of duplication and it would make the whole thing definitely harder to maintain.
But against these disadvantages from the view of the implementation there is a real benefit for users if they see the full function signature.
It's hard to decide, but probably the benefits outweigh the shortcomings ...?

Another benefit would be that we could avoid the whole getargspec stuff.

... instead of using *args and **kwargs.

bastibe · 2014-06-06T18:01:44Z

Yes, I think this definitely makes the functions much easier to understand.

However, I disagree with 49f0e23. I think the documentation should be placed where the functionality is actually implemented, not in some wrapper function. Ideally, the docstring information really should be available to both the functions and methods. Is there any way we can programmatically append the docstring from the method to the function?

Re-use constructor docstring for open()

mgeier · 2014-06-10T13:52:33Z

It's no problem to re-use the init-docstring for open(), I did this in 5ccfdb5.

I'm not sure if this is also a good idea for read(), because the two docstrings differ.
We could of course put the second (common) part of the docstring in a separate string and concatenate it with the two original docstrings, but then the common string is somewhere in the module namespace and may be confusing people.
It could also be a class attribute, but I guess this wouldn't really improve anything.

Also, I don't know if we can still separate the two parts nicely once we switch to NumPy-style docstrings (#27).
In the upcoming Sphinx documentation I'd like to move the read() function to the very top (because I think it will be the most commonly used feature). If we do that, I think it would also make sense to put the full documentation there and a much shorter docstring in SoundFile.read().

bastibe · 2014-06-10T18:32:19Z

I am uncomfortable with not documenting a function/method.

If I look up the documentation for a function/method (most likely through some functionality of my text editor), I want to see the documentation. A referral to some other place is nice, but my text editor can't parse that.

I'd rather have duplicate documentation than no documentation. Right now I get no documentation at all on SoundFile.read. I think it is fine to refer to another function/method for the full documentation, but the docstring should at least contain an abbreviated version of the documentation.

Frankly, I'd prefer to have a full copy of the docstring in both read and SoundFile.read. Documentation in general is chock full of repetition anyway, so this should not be a problem. Anything else is just making the user's life harder for no benefit at all.

mgeier · 2014-06-11T14:40:45Z

If you don't see any documentation for SoundFile.read(), there's something wrong.
Currently I get:

Read a number of frames from the file.

Reads the given number of frames in the given data format from
the current read position. This also advances the read
position by the same number of frames.
Use frames=-1 to read until the end of the file.

For further keyword arguments see the module-level function
read().

If you insist I can duplicate the text which is valid for both method and function, I don't like it but I don't want to get too religious about that, either. There are other questions where if have a much stronger opinion.

But I would like to point out that, as far as I've seen, it is common practice in NumPy to refer somewhere else for documentation.

E.g. numpy.ndarray.reshape():

Returns an array containing the same data with a new shape.

Refer to `numpy.reshape` for full documentation.

See Also
--------
numpy.reshape : equivalent function

See also http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.reshape.html#numpy.ndarray.reshape vs
http://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html

Of course in our case it's not really an equivalent function, but it's still a similar situation.

bastibe · 2014-06-15T10:49:25Z

The main issue I see here is that we will probably have a lot of users who will use either the functions or the methods, but not both. It might be confusing for them to have full documentation available for one but not the other.

I feel that if we were to skip the documentation on one of SoundFile.read and read, it should be read that gets the abbreviated documentation, simply because the actual implementation is in SoundFile.read. However, this might be a disservice to users, as they are probably more likely to use read than SoundFile.read.

In the end, I would vote for having full documentation everywhere. I guess there is some middle ground though. In particular, I would probably be fine to simplify the function documentation a bit and defer some rarely-used details to the methods. This would make simple usage simpler and power-users more powerful.

We are in for a documentation rewrite anyway though, so we can easily defer this to a later time.

mgeier · 2014-06-15T14:34:41Z

OK, that sounds reasonable.
Let's see how the documentation looks once we change it to NumPy style (#27).

I duplicated the documentation in read() for the time being.

Can we merge now?

Add open(), read() and write() functions

mgeier mentioned this pull request Apr 25, 2014

Implement a generator for optionally overlapping blocks #35

Merged

mgeier mentioned this pull request May 5, 2014

Several changes to the read() and write() methods #36

Merged

bastibe mentioned this pull request May 9, 2014

Several changes to the read() and write() methods (alternative) #37

Closed

Add open(), read() and write() functions

0ca0eb3

This is the combination of a few commits from #18, plus a few more things. See also #14.

Repeat function arguments in module-level functions

1aec4c5

... instead of using *args and **kwargs.

Move documentation from read() function to read() method

5ccfdb5

Re-use constructor docstring for open()

Duplicate overlapping documentation in read()

c1701ab

bastibe added a commit that referenced this pull request Jun 15, 2014

Merge pull request #34 from bastibe/open-read-write

c79c314

Add open(), read() and write() functions

bastibe merged commit c79c314 into master Jun 15, 2014

mgeier deleted the open-read-write branch June 15, 2014 15:39

mgeier mentioned this pull request Apr 13, 2020

sd.rec: dimensions of returned ndarray spatialaudio/python-sounddevice#232

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add open(), read() and write() functions #34

Add open(), read() and write() functions #34

mgeier commented Apr 21, 2014

mgeier commented May 4, 2014

mgeier commented May 4, 2014

bastibe commented May 5, 2014

mgeier commented May 5, 2014

bastibe commented May 9, 2014

mgeier commented May 11, 2014

bastibe commented May 11, 2014

mgeier commented May 29, 2014

mgeier commented May 31, 2014

bastibe commented Jun 6, 2014

mgeier commented Jun 10, 2014

bastibe commented Jun 10, 2014

mgeier commented Jun 11, 2014

bastibe commented Jun 15, 2014

mgeier commented Jun 15, 2014

Add open(), read() and write() functions #34

Add open(), read() and write() functions #34

Conversation

mgeier commented Apr 21, 2014

mgeier commented May 4, 2014

mgeier commented May 4, 2014

bastibe commented May 5, 2014

mgeier commented May 5, 2014

bastibe commented May 9, 2014

mgeier commented May 11, 2014

bastibe commented May 11, 2014

mgeier commented May 29, 2014

mgeier commented May 31, 2014

bastibe commented Jun 6, 2014

mgeier commented Jun 10, 2014

bastibe commented Jun 10, 2014

mgeier commented Jun 11, 2014

bastibe commented Jun 15, 2014

mgeier commented Jun 15, 2014