Add tests #33

Closed
bastibe opened this issue Apr 21, 2014 · 17 comments

bastibe commented Apr 21, 2014

Now that we are changing so many things, we should make sure that we don't break stuff.

bastibe commented Apr 21, 2014

Sorry for simply pushing to master. I meant to publish the tests as a branch first. Anyway, 3185d2f adds some tests.

bastibe commented Apr 21, 2014

I have only implemented tests for SoundFile so far. Tests for the global functions are still to come.

mgeier commented Apr 27, 2014

Are the tests supposed to be executed with the following?

python -m unittest test_pysoundfile

I have no experience with testing in Python, but I keep reading about nose and py.test ... what about using one of those?

I also read about tox, which seems to be useful for testing on different versions of the interpreter ...

bastibe commented Apr 28, 2014

I use nosetests for running my tests. Simply run nosetests on your command line in the project directory. nosetests will auto-discover all test classes and run them.
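
For example, from the project directory:

nosetests
nosetests test_pysoundfile.py

The second form runs just that one file.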

mgeier commented Apr 28, 2014

Cool, thanks!

I also tried it with pytest, and that works as well, BTW.

mgeier commented Apr 29, 2014

I think we should structure the tests differently:
Start with the high-level functions (read(), write(), blocks()) and test all their features.
After that, we only have to test the remaining features that are only available through the low-level interface (e.g. seek(), __getattr__()) and some remaining details of the methods read(), write(), ...

We could use coverage to discover parts of the low-level functionality that aren't tested already, but we wouldn't have to repeat all the tests for both the high- and the low-level interface.

This would be a shift from unit tests to integration tests, which I think is a better fit for PySoundFile.
I think strict unit tests are not really necessary, because they would mainly test libsndfile, which isn't our job here.

We can of course make some individual unit tests where necessary, but I don't think we need it for all functions/methods.
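
To give a rough idea of what I mean (the function name, the tuple return value and the test file below are all assumptions at this point, since the high-level functions don't exist yet):

import pysoundfile as sf

def test_read_whole_file():
    # one call exercises opening, reading and closing;
    # the corresponding SoundFile method is executed implicitly
    data, fs = sf.read('tests/stereo.wav')  # hypothetical test file
    assert fs == 44100
    assert data.ndim == 2  # frames x channels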

bastibe commented Apr 30, 2014

I don't think it matters much whether we test SoundFile and infer read from that, or vice versa. However, we now have a somewhat comprehensive set of tests for SoundFile. It is easy to reason about these tests, since they are naturally bounded by the scope of SoundFile, and SoundFile actually implements all the features.

Feel free to write additional tests for the functions though.

mgeier commented May 17, 2014

> I don't think it matters much whether we test SoundFile and infer read from that, or vice versa.

I think it does matter a lot: we can't infer anything about the functions by testing the methods, because the functions are never even called.
The other way around, however, it works: the functions call the methods, thereby testing more code at once.
There is of course the (probably quite unlikely) possibility that a bug in a function and a bug in a method cancel each other out, which would lead to a false positive.
And of course there will be a lot of functionality that has to be tested on the methods anyway, because they have many more features than the functions.

> Feel free to write additional tests for the functions though.

I don't think that would be the right thing to do, because it would repeat code, leading to an unnecessary maintenance burden.
I think as soon as we add the functions, we should refactor the tests to test the functions instead of the methods wherever reasonable.

bastibe commented May 19, 2014

Nope. Unit tests should cover the basic units. The smallest testable units we have are the methods.

The bulk of our tests should test the methods. Once we know that the methods are correct, the functions will be correct implicitly. Sorry to be so blunt about this, but the point of unit tests is to test the basic building blocks.

mgeier commented May 22, 2014

> Nope. Unit tests should cover the basic units. The smallest testable units we have are the methods.

Yes.
But I still think classical unit tests are not the right thing to do in our case.

I think it makes hugely more sense to completely test the functions and then only test the remaining features on the methods.

In the end we should have tested everything, of course.
But, as always, it would be good to reduce duplication.

> The bulk of our tests should test the methods. Once we know that the methods are correct, the functions will be correct implicitly.

I don't get it. Why? How?
The functions can have all kinds of bugs in them which we will not find if we don't call them.

> Sorry to be so blunt about this, but the point of unit tests is to test the basic building blocks.

I have no problem with the bluntness.
But it doesn't change my mind, either.

bastibe commented May 25, 2014

> But I still think classical unit tests are not the right thing to do in our case.

Why? You haven't given a reason yet.

The point of unit tests is to test the basic building blocks, and then build on that. If you know that your basic building blocks work as intended, it is easy to reason about errors in higher-level abstractions. This doesn't work the other way around.

A quick round of googling turns up plenty of articles on this. They all say that you should unit test first. After that, add all the integration tests you like. Also, they all harp on how useful unit tests are during the design process. I wholeheartedly agree, from my own experience in a few projects. Integration tests don't have that property.

One article also says that unit tests should not involve the file system. I'm willing to talk about this; maybe we can find a way to refactor the existing unit tests so they don't involve the file system.
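
One possible direction, purely as a sketch (whether SoundFile accepts file-like objects like this, and the exact constructor arguments, are assumptions):

import io
import numpy as np
from pysoundfile import SoundFile

def test_write_and_read_in_memory():
    buf = io.BytesIO()
    data = np.zeros((64, 2))
    # there is no file name, so the format is given explicitly (assumed keyword)
    f = SoundFile(buf, 'w', 44100, 2, format='WAV')
    f.write(data)
    f.close()
    buf.seek(0)
    f = SoundFile(buf)
    assert np.allclose(f.read(), data)
    f.close()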

But as for testing the functions and not the methods, you are wrong. Sorry.

bastibe mentioned this issue May 27, 2014

bastibe commented May 27, 2014

Matching pull request in #40.

bastibe closed this as completed May 27, 2014

mgeier commented May 27, 2014

> > But I still think classical unit tests are not the right thing to do in our case.
>
> Why? You haven't given a reason yet.

Sorry, I thought I did: repetition.

If we write a number of tests for the methods and then write the same tests for the functions, that's twice the work for no gain.
And it's a maintenance burden.

I don't want to keep discussing whether our tests are or should be unit tests or rather integration tests, because those are just names and nobody agrees on what they mean anyway. Everyone contradicts each other (and sometimes themselves) in their definitions, which are mostly very vague.

I'd rather give a concrete example:

I think one of the many things we should test is reading a given number of frames from a given file or file-like object and checking if the resulting NumPy array is exactly as we'd expect.
This wouldn't be a single test case but a set of them, where we could vary the file we're using, the way we're opening it (file name, file descriptor, file-like object), the mode ('r'/'w'/'rw'), the number of frames (-2, -1, 0, 1, some, exactly as many as in the file, too many), and probably some other details.
Most of these variations should be somehow automated (see the sketch at the end of this comment).

Case 1: How would we do this using SoundFile.read()?

We cannot read a file without opening it, so we'll have to write a test fixture which opens the file for us. Then we write the tests which use this fixture.

When a test fails, it will either fail in the fixture or in our test using SoundFile.read(); we can't know beforehand which.

Once this test passes, we know that the fixture and SoundFile.read() work in this very case.
We know nothing about sf.read().

Case 2: How would we do this using sf.read() (the module-scope function)?

We wouldn't need a fixture for opening the file, because that's done internally, so we'll end up with less test code.

When a test fails, it will either fail in sf.read() or in SoundFile.read(); we can't know beforehand which.

Once this test passes, we know that sf.read() and SoundFile.read() work in this very case.

Summary:
Case 1: we have to write additional fixture code and test less code in the end.
Case 2: we write less test code and test more code.

I must admit there is the theoretical possibility of a bug in SoundFile.read() that exactly cancels another bug in sf.read(), but I think that's highly unlikely to actually happen.

Long story short, that's why I'm suggesting that we should test the overlapping functionality of the read() method and function on the function and not on the method.
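
In code, the automated variation could look roughly like this (the fixture file, the frames argument and the end-of-file behaviour are all assumptions):

import pytest
import pysoundfile as sf

FILE = 'tests/100frames.wav'  # hypothetical fixture file with exactly 100 frames
TOTAL = 100

@pytest.mark.parametrize("frames, expected", [
    (-1, TOTAL),          # -1: read the whole file
    (0, 0),
    (1, 1),
    (50, 50),
    (TOTAL, TOTAL),
    (TOTAL + 50, TOTAL),  # past the end: assume we simply get fewer frames
])
def test_read_n_frames(frames, expected):
    data, fs = sf.read(FILE, frames=frames)
    assert len(data) == expected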

bastibe commented May 28, 2014

> Sorry, I thought I did: repetition.

Maybe there is a misunderstanding then. I would not propose to test everything for both read and sf.read. Instead, we should test sf.read thoroughly, and only do high-level "does it work in general" kind of tests for read. This requires no unnecessary duplication (at least no more than doing it the other way around).

We indeed cannot read a file without opening it. Thus, we will have to instantiate a SoundFile object regardless of whether we test sf.read or read. The difference is, with sf.read the test will have to instantiate that object explicitly, whereas read will instantiate it implicitly.

Instantiating the object explicitly has the main advantage of being able to test SoundFile instantiation independently of method invocations on SoundFile. This is crucial for the tests! The purpose of tests is not only to find errors, but most importantly, to show the source of errors. We test the low-level methods, and indeed also their "hidden" underscore-method implementation functions, to find errors as close to the source as possible.

Here's an example:

Say we only test the functions. This will mean that we have a very small number of test classes, with a very big number of test methods. Every error will be attributed to the functions.

Say we only test the methods instead. This will mean that we have a bigger number of test classes, with a smaller number of test methods each. Errors will be attributed to the methods that actually caused the errors, not higher-level wrappers.

You really are genuinely wrong about this. I have done a few projects with tests. High-level tests are nice. They are a form of documentation. But the important thing is to test the smallest possible parts. This forces you to think about the minute details of your program's inner workings. Oftentimes, a program works fine in broad strokes, but there are edge cases and unintended side effects that you don't notice in day-to-day work. This is what testing is all about: you try to discover the edge cases of your innermost secrets, so they won't blow up in your face when some other part of the program uses them later.
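
In test code, the separation I have in mind looks roughly like this (the test file and the attribute names are assumptions):

from pysoundfile import SoundFile

def test_instantiation():
    # instantiation is verified on its own first ...
    sf = SoundFile('tests/stereo.wav')  # hypothetical test file
    assert sf.channels == 2
    sf.close()

def test_read_method():
    # ... so a failure here points at read(), not at opening the file
    sf = SoundFile('tests/stereo.wav')
    data = sf.read()
    assert data.shape == (sf.frames, sf.channels)
    sf.close()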

Of course, none of this has anything to do with #44, which is entirely valid at any rate.

mgeier commented May 29, 2014

> Maybe there is a misunderstanding then.

Probably. I hope we can clear that up.

Just to be absolutely clear: I'm talking only about the overlapping functionality between methods and functions.

> I would not propose to test everything for both read and sf.read.

Maybe that's the difference in our views. I'd propose to test both.
And it's really easy, too.

> Instead, we should test sf.read thoroughly, and only do high-level "does it work in general" kind of tests for read.

That sounds a bit sloppy, but I guess it depends on how this is actually implemented.

> This requires no unnecessary duplication (at least no more than doing it the other way around).

Well, here's another difference.
It indeed requires no duplication, but it also tests only half the cases!

> We indeed cannot read a file without opening it. Thus, we will have to instantiate a SoundFile object regardless of whether we test sf.read or read. The difference is, with sf.read the test will have to instantiate that object explicitly, whereas read will instantiate it implicitly.

True. But the test framework doesn't know that.

> Instantiating the object explicitly has the main advantage of being able to test SoundFile instantiation independently of method invocations on SoundFile.

Here we also disagree.
This is only true if we make additional tests for instantiation.
We can make them in any case.
These would have to be run first, therefore violating the guideline that the order of test invocations shouldn't matter.

> This is crucial for the tests! The purpose of tests is not only to find errors, but most importantly, to show the source of errors. We test the low-level methods, and indeed also their "hidden" underscore-method implementation functions, to find errors as close to the source as possible.

But it's just not possible to test them in real isolation.
We'd have to do stuff in the test fixtures (exactly the same stuff as the functions do, BTW), and if a test fails, we still have to check whether the error is in the fixtures or in the method.

In both cases the exact same steps have to be taken; in my suggested scenario we just have to write much less fixture code (with its maintenance cost and potential bugs).

We can try to test the underscore methods, but I suspect this could be quite impractical because even more additional setup code would be necessary.

I suspect that it might be more appropriate to design the test cases for the public methods/functions in a way that we can be sure that all code paths that we care about are actually executed.

> Here's an example:
>
> Say we only test the functions. This will mean that we have a very small number of test classes, with a very big number of test methods.

Is this part of the argument?
Why should I care about number of classes vs number of methods?

> Every error will be attributed to the functions.

Sure, but we can read the backtrace.

> Say we only test the methods instead. This will mean that we have a bigger number of test classes, with a smaller number of test methods each. Errors will be attributed to the methods that actually caused the errors, not higher-level wrappers.

True. But the error might actually originate from somewhere else, e.g. the test fixture.

So we still have to read exactly the same backtrace.

> You really are genuinely wrong about this.

Probably. I've been wrong before ...
But probably you are wrong.

> I have done a few projects with tests. High-level tests are nice. They are a form of documentation. But the important thing is to test the smallest possible parts.

I agree.

But as I'm trying to say, in our case the smallest possible part is, e.g., "open + read".
We cannot separate this (reasonably, i.e. without heaps of additional code for mocking etc.).

We can test "open" first, but that doesn't change anything in our discussion.

> This forces you to think about the minute details of your program's inner workings. Oftentimes, a program works fine in broad strokes, but there are edge cases and unintended side effects that you don't notice in day-to-day work. This is what testing is all about: you try to discover the edge cases of your innermost secrets, so they won't blow up in your face when some other part of the program uses them later.

I totally agree with your general statements.

But the reality in our case is that

f = SoundFile('test.wav')
data = f.read()

is on the same level of abstraction as

data = sf.read('test.wav')

There is nothing broad-strokier in one or the other.
They have exactly the same input and exactly the same expected output.
They have exactly the same edge cases.
They are, as far as the test output is concerned, identical.
The test source, however, is shorter, simpler and easier in the second case.

There may be unintended side effects in the method, which we would be made aware of in both cases.
However, if there are unintended side effects in the function, we can only find out by doing additional tests (which I would like to avoid).

But probably my mistake is somewhere here ...
Can you tell me a hypothetical edge case which would be less easy to locate/solve/... in the second case?

bastibe commented Jun 6, 2014

> Maybe that's the difference in our views. I'd propose to test both [methods and functions].

This is probably our main point of contention here. Everything else follows from this.

If we try to test only the basic building blocks and trust that rest is correct implicitly, then testing at a method level makes sense, as they are the building blocks everything is built from.

If we test everything anyway (including duplicated functionality), then it makes no real difference where we start, and we might as well start at the function level if that requires less code.

Let's back up one step though. We should try to test (more or less) all code paths.

I would argue that it is rather easy to see all the code paths in, say, _check_array. These can be easily tested. Now _check_array is called by both SoundFile.read and SoundFile.write, and provides an additional return path for those two methods. But since we have checked the validity of _check_array already, we don't need to concern ourselves with these two particular return paths when testing SoundFile.read and SoundFile.write.

By the same argument, I would not test the particulars of _read_or_write as part of SoundFile.read, or the particulars of SoundFile.read as part of read. Incidentally, this is the same reasoning by which we do not try to test any of the numpy functions or the inner workings of libsndfile: we trust that these libraries are tested already, and thus we don't need to concern ourselves with them while testing code that uses them.

This is a way of testing every possible code path with as few tests as possible.

I don't see any way of achieving the same thing from testing only the high-level functions. Also, it is much harder to reliably find all the code paths in read than in _read_or_write, and thus more likely for us to overlook a code path if we test in terms of read.
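
As a sketch of what I mean (I'm assuming here that _check_array is reachable at module scope and raises ValueError for arrays it can't handle; its real location and signature may differ):

import numpy as np
import pytest
import pysoundfile

def test_check_array_rejects_bad_input():
    # once all return paths of the helper are covered directly,
    # the tests for SoundFile.read and SoundFile.write don't
    # need to repeat them
    with pytest.raises(ValueError):
        pysoundfile._check_array(np.zeros(4, dtype=np.int8))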

mgeier commented Jun 15, 2014

I agree that our goal should be to check as many code paths as possible (ideally all).

I think we should decide on a case-by-case basis whether it's worth making a separate test for an underscore-prefixed helper function.
I'm quite sure there are some helper functions where we can easily control all code paths by varying the arguments of the calling function(s), but there may be others where this is not so easy and a separate test would make sense.
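
Finding out which code paths are actually executed is straightforward with the coverage package, e.g.:

coverage run -m pytest test_pysoundfile.py
coverage report -m

The report lists the lines that were never hit.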
