Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coverage on unit tests fails when tracking child processes #6887

Closed
alalazo opened this issue Jan 10, 2018 · 15 comments
Closed

Coverage on unit tests fails when tracking child processes #6887

alalazo opened this issue Jan 10, 2018 · 15 comments
Assignees
Labels
bug Something isn't working tests General test capability(ies)

Comments

@alalazo
Copy link
Member

alalazo commented Jan 10, 2018

As of 596d463 using concurrency=multiprocess to run:

$ coverage run bin/spack test

results in a few .coverage* files being corrupted. This may compromise the execution of:

$ coverage combine
$ codecov --required

and in the end the coverage report from codecov (see data from January 2018, where it dropped from ~75% to ~50%).

#6872 has been merged as a work-around to get more stable measurements, but the underlying issue with pytest + multiprocessing + coverage needs to be investigated further.

@adamjstewart @tgamblin @scheibelp @becker33

@alalazo alalazo added bug Something isn't working tests General test capability(ies) labels Jan 10, 2018
@alalazo
Copy link
Member Author

alalazo commented Jan 10, 2018

For the records, this seems to be the first commit behaving in a strange way on codecov, but probably this is Travis fault (it was an automatic trigger of a commit that already went good before).

The next commit that's passed on Travis is c3b003e, the build is here and the codecov report here. Both of the linux unit tests were not sent to codecov (so 8 reports instead of 10). It dates 2018-01-02T19:59:34Z.

Commits before those two seem to end in a good way. Next step: what changed on 2018-01-02 ?

@alalazo
Copy link
Member Author

alalazo commented Jan 10, 2018

Commit 088c193 scores 74.45% on codecov (last stable number). It has all 10 reports on codecov, but still the log of Travis shows:

Coverage.py warning: Couldn't read data from '/home/travis/build/spack/spack/.coverage.travis-job-spack-spack-323456025.travisci.net.9618.332776': CoverageException: Doesn't seem to be a coverage.py data file
Coverage.py warning: Couldn't read data from '/home/travis/build/spack/spack/.coverage.travis-job-spack-spack-323456025.travisci.net.9686.644486': CoverageException: Doesn't seem to be a coverage.py data file
Coverage.py warning: Couldn't read data from '/home/travis/build/spack/spack/.coverage.travis-job-spack-spack-323456025.travisci.net.9719.277474': CoverageException: Doesn't seem to be a coverage.py data file

codecov version is 2.0.10. On January 2 version 2.0.11 went out.

At this point I start to think that we had this bug with pytest + coverage since quite some time, but never noticed it because codecov was able to deal nicely with the merged coverage file (despite the failures above) until version 2.0.11.

@alalazo
Copy link
Member Author

alalazo commented Jan 10, 2018

I think I finally got the the bottom of this twisted issue. Between 2.0.10 and 2.0.11 codecov/codecov-python#112 was merged. This means that we started calling:

$ coverage combine

twice.

Now, doing this:

$ coverage run --concurrency=multiprocessing bin/spack test
$ coverage combine
Coverage.py warning: Couldn't read data from '/home/mculpo/PycharmProjects/spack/.coverage.nuvolari.1043.077127': CoverageException: Doesn't seem to be a coverage.py data file

will result very likely in some error. If you check your .coverage now you'll find it has data in it. The corrupted files will instead be empty and are not deleted like the good ones by the command.

Running again:

$ coverage combine

(which is what effectively codecov 2.0.11 does for us) will account only for the corrupted files and leave you with an empty report! The content is:

!coverage.py: This is a private format, don't read it directly!{}

@alalazo
Copy link
Member Author

alalazo commented Jan 10, 2018

Bottomline: yes, we have a buglet in our tests somewhere related to multiprocessing, coverage has a bug (because it doesn't realize that a .coverage file is already there before deleting it) , Travis has numerous unrelated bugs that cause random failures of jobs and codecov had a bug that once fixed broke our CI 😆

@adamjstewart
Copy link
Member

I would seriously like to get to the bottom of these Travis network connection problems. I've never seen so many failing jobs before. But there's not much we can do on our end for that one. Good work!

@alalazo alalazo self-assigned this Jan 10, 2018
@alalazo
Copy link
Member Author

alalazo commented Jan 11, 2018

Reported in this issue the behavior of coverage.

@nedbat
Copy link

nedbat commented Jan 11, 2018

@alalazo Thanks for the report. I asked on the bitbucket issue for a reproducible case, but I see the issue there is anonymous. Or we can talk about it more here :)

@alalazo
Copy link
Member Author

alalazo commented Jan 11, 2018

@nedbat Sure. Concerning a simple case to reproduce what I see with coverage:

$ coverage run -p $(which flake8) 
$ ls -a
.  ..  .coverage.nuvolari.4834.430541
$ touch .coverage.nuvolari.4677.764578
$ coverage combine
Coverage.py warning: Couldn't read data from '/home/mculpo/tmp/covbug/.coverage.nuvolari.4677.764578': CoverageException: Doesn't seem to be a coverage.py data file

At this point .coverage has data in it. When I do again:

$ coverage combine
Coverage.py warning: Couldn't read data from '/home/mculpo/tmp/covbug/.coverage.nuvolari.4677.764578': CoverageException: Doesn't seem to be a coverage.py data file

the .coverage file is empty.

@nedbat
Copy link

nedbat commented Jan 14, 2018

The problem here is two things working together:

  1. Files that can't be read as data files during combine are left in place. Empty files can't be read as data files.
  2. "coverage combine" deletes the .coverage file, and then combines all the ".coverage.*" files it can find. The second "coverage combine" deletes the data file, and then tries again to read the empty files, and cannot, so there is no data left.

If you use "coverage combine -a" instead, then it won't delete the data file during the combine step.

@alalazo
Copy link
Member Author

alalazo commented Jan 15, 2018

@nedbat You're completely correct, and what you say was clear to me before - basically that's what I got tracking the issue we had. What I was wondering is if point 2. above:

"coverage combine" deletes the .coverage file, and then combines all the ".coverage.*" files it can find.

is to be considered a bug or not for your application, as it may delete without warning a valid .coverage file.

In our case it was not trivial to understand this, as some command down the chain (codecov) started calling coverage combine "for us", at some point, and that resulted in a highly fluctuating coverage. To arrive at this conclusion we had to track:

  1. when the coverage started to behave weirdly
  2. what changed on that date + get the relevant PR of codecov
  3. reproduce what coverage combine called twice does in presence of empty .coverage.* files

What I was suggesting is that maybe coverage combine should check upfront for the presence of a .coverage file, and fail with a meaningful message if it is present. For instance:

Error: cannot combine coverage files when .coverage is already present.

If you instead think coverage combine should behave like it does, feel free to close the issue I opened on bitbucket (and apologies for the noise).

alalazo pushed a commit to epfl-scitas/spack that referenced this issue Jan 16, 2018
According to what was discovered in spack#6887, one of the problems is
calling 'coverage combine' twice without the '-a' flag. This removes
the first call within our test scripts.
alalazo added a commit to epfl-scitas/spack that referenced this issue Jan 16, 2018
According to what was discovered in spack#6887, one of the problems is
calling 'coverage combine' twice without the '-a' flag. This removes
the first call within our test scripts.
alalazo added a commit to epfl-scitas/spack that referenced this issue Jan 18, 2018
According to what was discovered in spack#6887, one of the problems is
calling 'coverage combine' twice without the '-a' flag. This removes
the first call within our test scripts.
alalazo added a commit to epfl-scitas/spack that referenced this issue Jan 18, 2018
According to what was discovered in spack#6887, one of the problems is
calling 'coverage combine' twice without the '-a' flag. This removes
the first call within our test scripts.
alalazo added a commit that referenced this issue Jan 20, 2018
* Revert "Travis: use --concurrency=multiprocessing only on build tests (#6872)"

This reverts commit 596d463.

* Removing 'coverage combine' in test script

According to what was discovered in #6887, one of the problems is
calling 'coverage combine' twice without the '-a' flag. This removes
the first call within our test scripts.
@nedbat
Copy link

nedbat commented Jan 21, 2018

I think combine does this to parallel the behavior of run: it always generates a new data file, and both have a -a flag to append to the existing file. This might be too literal a design.

What about this: a key point in your scenario is that the second combine step doesn't find any usable data files. Perhaps we should avoid saving the new data file if there wasn't any actual combining that got done?

@nedbat
Copy link

nedbat commented Jan 21, 2018

Or perhaps: if after reading all the combinable files we could, there is no data at all, then don't write over the coverage data file?

@nedbat
Copy link

nedbat commented Jan 21, 2018

I just fixed this by raising an error if combine can't read any of the files it found.

@alalazo
Copy link
Member Author

alalazo commented Jan 22, 2018

Thanks @nedbat. Just read your messages above. It seems to me a sensible behavior.

ch741 pushed a commit to ch741/spack that referenced this issue Apr 20, 2018
* Revert "Travis: use --concurrency=multiprocessing only on build tests (spack#6872)"

This reverts commit 596d463.

* Removing 'coverage combine' in test script

According to what was discovered in spack#6887, one of the problems is
calling 'coverage combine' twice without the '-a' flag. This removes
the first call within our test scripts.
@alalazo
Copy link
Member Author

alalazo commented Feb 14, 2019

This issue has been solved a year ago, forgot to close

@alalazo alalazo closed this as completed Feb 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working tests General test capability(ies)
Projects
None yet
Development

No branches or pull requests

3 participants