Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLite Crash on Coverage 5.0a2 inside Hypothesis #702

Closed
TylerADavis opened this issue Sep 8, 2018 · 17 comments
Closed

SQLite Crash on Coverage 5.0a2 inside Hypothesis #702

TylerADavis opened this issue Sep 8, 2018 · 17 comments

Comments

@TylerADavis
Copy link

@TylerADavis TylerADavis commented Sep 8, 2018

I was running some unit tests using hypothesis (link) which relies on coverage, and I ran into a crash. Specifically, I received the following error message:
coverage.misc.CoverageException: Couldn't use data file '/Users/me/my_project/.coverage': UNIQUE constraint failed: file.path

I have experienced this issue both on macOS and Linux while running python 3.6, and the issue does not persist with COVERAGE_STORAGE=json.

I have included a gist below which contains files to reproduce the error. Specifically, the error occurs when the hypothesis assume statement in test.py fails more than half the time and when wrap is called.

To reproduce the error, run the build.sh script, which will initialize a virtualenv, install required modules, and run a short example which should throw the error.

Additionally, I am including a vagrantfile which will run build.shin a VM, thus providing a clean slate for testing.

https://gist.github.com/TylerADavis/393ee6382b4f645774100ae0c490c5ba

Please let me know if there is any other information I could provide that would be useful.

@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Sep 8, 2018

Thanks, this reproduces well, even without Vagrant in the mix.

@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Sep 8, 2018

It looks like Hypothesis is creating a number of CoverageData objects, each of which is trying to write to the .coverage file. @DRMacIver thoughts?

@DRMacIver

This comment has been minimized.

Copy link
Contributor

@DRMacIver DRMacIver commented Sep 8, 2018

It looks like Hypothesis is creating a number of CoverageData objects, each of which is trying to write to the .coverage file. @DRMacIver thoughts?

Hmm. I thought the way Hypothesis was using those CoverageData objects they weren't trying to write to the file at all. I don't know whether my belief there was wrong or whether this has changed as part of the new version of Coverage.

I'm currently arguing that we should pull the Coverage support in Hypothesis anyway (HypothesisWorks/hypothesis#1551). This is not due to problems with coverage-the-library, I just don't think coverage information in general has proven useful in the sort of testing we can feasibly do in normal runs.

Given that, my recommended workaround is to put use_coverage=False in your Hypothesis settings, as I don't think you're losing much by doing that. I'll try to make sure we fix this (probably by removing the feature altogether) before you ship the final version of 5.0.

@DRMacIver

This comment has been minimized.

Copy link
Contributor

@DRMacIver DRMacIver commented Sep 8, 2018

@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Sep 8, 2018

@DRMacIver Coverage.py 5.0 uses SQLite as the data storage, instead of a JSON-like file. So the file is created well before any save() call. This perfectly explains the problem. I'm wondering whether to retain a memory-only data scenario also?

@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Sep 8, 2018

Or at least, avoid creating the file until save()....

@DRMacIver

This comment has been minimized.

Copy link
Contributor

@DRMacIver DRMacIver commented Sep 8, 2018

In the immediate future, the Hypothesis side of this is going away, so do what you like. :-) In the medium to long-term, I would definitely need to be able to to have in memory only coverage data if I were to keep using coverage in the kind of ways that I want to for coverage-guided testing purposes.

I'm a little confused about how creating the file without saving it causes this error though. It seems like data must be be written to it in order to violate a unique constraint?

@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Sep 8, 2018

The current SQLite support writes to the .coverage file before save() is called, as data is added to the CoverageData object. That shouldn't be hard to change though.

@nedbat nedbat changed the title SQLite Crash on Coverage 5.0a2 SQLite Crash on Coverage 5.0a2 inside Hypothesis Oct 13, 2018
@anarcat

This comment has been minimized.

Copy link

@anarcat anarcat commented Oct 22, 2018

I am also seeing a reproducible crash when testing feed2exec in python 3.6 and 3.7 with pytest-cov 2.6, which i do not get in python 3.5:

INTERNALERROR> coverage.misc.CoverageException: Couldn't use data file '/home/anarcat/src/feed2exec/.coverage': UNIQUE constraint failed: file.path

i wonder if this could be something with changes in the sqlite3 standard library? or is this really just a bug in 5.0 alpha?

i first thought this was #716 which similarly stumbles upon the new sqlite storage...

thanks!

@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Oct 22, 2018

@anarcat That is a separate problem: you need to be sure to use [run] parallel = True if your tests run simultaneously: #716 (comment)

@anarcat

This comment has been minimized.

Copy link

@anarcat anarcat commented Oct 23, 2018

i confirm setting parallel = True fixes this magically. it seems weird i need to tweak this - i don't remember enabling parallel builds - but at least it works, thanks! :)

@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Oct 23, 2018

@anarcat It seems like this is going to happen more and more. Can you help me out? How are you running your tests? What can I tell people to look out for?

@anarcat

This comment has been minimized.

Copy link

@anarcat anarcat commented Oct 23, 2018

I have no idea. :) The source for my project is here. Tests are ran by gitlab CI or locally (it doesn't matter) with this tox.ini file, which in turn calls setuptools (or distutils or whatever it's called) with the pytest-runner plugin. The setup.py test alias is defined in setup.cfg and then pytest magically finds the tests in feed2exec/tests. There's a minor betamax config in conftest.py but it should matter here. I have more test docs if you want as well..

@blueyed

This comment has been minimized.

Copy link
Contributor

@blueyed blueyed commented Oct 23, 2018

I've bisected this to d2f77ab.

The problem appears to be that the file_map is only read when opening the DB, but another subprocess might have added the file later when the ID is needed.

#723 fixes this for me.

My test case is https://github.com/Vimjas/covimerage/ (Vimjas/covimerage@bb84cc9), using tox -e py37-coveragepy5-coverage -- -x. (The initial failure is due to an autouse fixture, but also without this it fails, but later.)

To be clear: I am seeing the problem @anarcat mentioned, and which might be different from the original issue, sorry.

blueyed added a commit to blueyed/covimerage that referenced this issue Oct 23, 2018
blueyed added a commit to Vimjas/covimerage that referenced this issue Oct 23, 2018
Works around nedbat/coveragepy#702 with
coveragepy 5.0a3.

Failing build: https://circleci.com/gh/Vimjas/covimerage/1482
@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Nov 25, 2018

For anarcat's "UNIQUE constraint failed: file.path" problem, I would like people to use parallel=True. Or should I make that happen all the time anyway? I don't think I trust SQLite's concurrency enough to have multiple processes accessing the same database.

@strichter

This comment has been minimized.

Copy link
Contributor

@strichter strichter commented Jan 25, 2019

@nedbat We did experiments yesterday and sqlite3 does properly lock files during writes.

A few things could be tried to solve the problem:

  1. Since write calls to the database are idempotent, I think it is safe to set the isolation_level to "IMMEDIATE" circumventing the transaction machinery completely. We have run it with this setting against our entire code base run (14k tests, 3 days total runtime) and it has not produced a single failure.

  2. @anarcat If you run your tests with threads, I would try applying this PR: #760

blueyed added a commit to blueyed/coveragepy that referenced this issue Mar 31, 2019
@nedbat nedbat closed this in #723 Apr 8, 2019
@nedbat

This comment has been minimized.

Copy link
Owner

@nedbat nedbat commented Apr 8, 2019

I would rather not use "isolation_level = immediate" because in the future, writes to the database may not be idempotent (for example, counting line executions, rather than just true/false per line).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.