Python segmentation faults #7873

Closed
nijel opened this issue Jun 8, 2017 · 24 comments

Comments

nijel commented Jun 8, 2017

For some time I've been getting Python segmentation faults while running tests. These tests use SQLite; tests using other databases work just fine, so my assumption is that SQLite is the guilty piece. In the past the crashes happened on Python 3.4 or 3.5; after an unrelated code change they moved to 2.7.

This is happening on both Trusty and Precise, though we can't use Trusty for now due to #7790.

As a side note: this is not the first time I've experienced segmentation faults on Travis. It might be good to have some documentation on how to debug them or, even better, to have a segfault handler installed that automatically displays a backtrace (AFAIK something like that is built into systemd). That would make it much easier to figure out what is going on (and to rule out crashes in third-party modules).
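
On the backtrace request: CPython ships a `faulthandler` module (Python 3.3 and later) that can dump the Python traceback when the process receives SIGSEGV. Below is a minimal sketch, assuming the tests run under Python 3. The helper name `enable_crash_tracebacks` is hypothetical, and the output shows Python-level frames only, not the C stack inside SQLite.

```python
import faulthandler
import sys


def enable_crash_tracebacks():
    """Install handlers that dump Python tracebacks on fatal signals."""
    if not faulthandler.is_enabled():
        # sys.__stderr__ is the interpreter's original stderr, which has a
        # real file descriptor even if a test runner replaced sys.stderr.
        faulthandler.enable(file=sys.__stderr__, all_threads=True)
    return faulthandler.is_enabled()


# Call this as early as possible, e.g. at the top of the test settings module.
enable_crash_tracebacks()
```

The same effect is available without code changes by running `python -X faulthandler` or by setting `PYTHONFAULTHANDLER=1` in the build environment.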

@BanzaiMan (Contributor)

Do you have a build log URL that shows the problem you are describing here?

nijel (Author) commented Jun 8, 2017

Sure, it's here:

https://travis-ci.org/WeblateOrg/weblate/jobs/240671891#L1115

Sorry for not including it in the original report...

@BanzaiMan (Contributor)

Please try disabling caches first.

nijel (Author) commented Jun 8, 2017

I did retry the build with cleared caches, but I can try disabling them as well.

nijel added a commit to WeblateOrg/weblate that referenced this issue Jun 8, 2017
See travis-ci/travis-ci#7873

Signed-off-by: Michal Čihař <michal@cihar.com>
nijel (Author) commented Jun 8, 2017

The build with caches disabled is here: https://travis-ci.org/WeblateOrg/weblate/builds/240766995

@BanzaiMan (Contributor)

Thanks for testing. Do you mind trying to reproduce the segmentation fault locally? https://docs.travis-ci.com/user/common-build-problems/#Running-a-Container-Based-Docker-Image-Locally

nijel (Author) commented Jun 8, 2017

I don't currently have a Docker environment...

nijel (Author) commented Jun 9, 2017

I've just tried that and it's not reproducible locally.

nijel (Author) commented Jun 9, 2017

I've tried upgrading SQLite (as described in travis-ci/apt-package-safelist#368) in WeblateOrg/weblate@fd1bbfb and it seems to help. At least there is a first successful build now: https://travis-ci.org/WeblateOrg/weblate/jobs/241211856. I'll see whether that really helps in the long term or whether it was just luck...

@BanzaiMan (Contributor)

Recent builds are passing. Did anything change to explain this?

nijel (Author) commented Jul 10, 2017

As mentioned in #7873 (comment), I've started using a newer SQLite library, and it has worked fine since then:

addons:
  apt:
    packages:
    - sqlite3
    sources:
    - travis-ci/sqlite3

nijel (Author) commented Jul 14, 2017

Apparently it no longer helps, as the SQLite-based tests have now segfaulted again:

Was there any change in the environment recently?

nijel added a commit to WeblateOrg/weblate that referenced this issue Jul 14, 2017
See travis-ci/travis-ci#7873

Signed-off-by: Michal Čihař <michal@cihar.com>
nijel (Author) commented Jul 14, 2017

Okay, after digging for a while, this seems to be a known SQLite bug: https://www.sqlite.org/src/info/7f7f8026eda38. It happens when the in-memory journal hits a certain threshold.

I've created a simple repo to trigger this bug:

https://travis-ci.org/nijel/sqlite-travis-test/builds/253613001

All it does is execute a bunch of SQL statements using transactions (it's based on the reproducer in the SQLite bug report, just adapted to work on an older version): https://github.com/nijel/sqlite-travis-test/blob/master/test.py
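
For readers who want a feel for that kind of workload, here is a rough sketch of a transaction-heavy script against an in-memory database. It is not the linked `test.py`, and it is not confirmed to trigger the bug on an affected SQLite build; the table name, payload size, and function name are all assumptions made for illustration.

```python
import sqlite3


def run_transaction_workload(batches=50, rows_per_batch=200):
    """Run many write transactions against an in-memory SQLite database."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # BEGIN/COMMIT/ROLLBACK statements below are fully under our control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
    for batch in range(batches):
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO items (payload) VALUES (?)",
            (("x" * 512,) for _ in range(rows_per_batch)),
        )
        conn.execute("UPDATE items SET payload = payload || 'y' WHERE id % 2 = 0")
        # Commit even-numbered batches and roll back odd-numbered ones so that
        # the journal is exercised in both directions.
        conn.execute("ROLLBACK" if batch % 2 else "COMMIT")
    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    print(sqlite3.sqlite_version, run_transaction_workload())
```

On an unaffected SQLite build the script simply prints the library version and the number of committed rows.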

nijel added a commit to nijel/sqlite-travis-test that referenced this issue Jul 14, 2017
travis-ci/travis-ci#7873

Signed-off-by: Michal Čihař <michal@cihar.com>
nijel added a commit to WeblateOrg/weblate that referenced this issue Jul 14, 2017
This should help with Python segfaults, see
travis-ci/travis-ci#7873

Signed-off-by: Michal Čihař <michal@cihar.com>
nijel (Author) commented Jul 14, 2017

The workaround that seems to work is to use a file-backed database for tests instead of an in-memory one.

The proper fix would be to upgrade to SQLite 3.14 or newer (maybe 3.12.2 or newer would work as well).
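
The two options can be combined into a runtime check on the SQLite library that Python is actually linked against. This is a rough sketch, assuming the 3.14 threshold mentioned in this comment is correct (it has not been verified here); the helper name `pick_test_db_name` and the file name are hypothetical.

```python
import os
import sqlite3
import tempfile

# Assumption: threshold taken from the comment above, not independently checked.
FIXED_IN = (3, 14, 0)


def pick_test_db_name(version_info=sqlite3.sqlite_version_info):
    """Return ':memory:' on new enough SQLite, else a file-backed path."""
    if tuple(version_info) >= FIXED_IN:
        return ":memory:"
    return os.path.join(tempfile.gettempdir(), "test-db.sqlite3")


if __name__ == "__main__":
    # sqlite_version is the version of the linked C library, which is what
    # matters here, not the version of the Python sqlite3 module.
    print(sqlite3.sqlite_version, "->", pick_test_db_name())
```

Printing `sqlite3.sqlite_version` in the build log is also a quick way to confirm whether an upgraded system package is the one the interpreter really loads.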

@nemesifier

This has started happening for me too on this repo:
https://travis-ci.org/openwisp/django-x509/jobs/297505773

I also tried restarting one of the previous builds that had completed successfully; after the restart it failed.
In my case the failures seem to affect Python 2.7 only; the Python 3 builds run fine.

I tried the workaround proposed here, as well as removing the Travis cache, but neither helped.

untitaker added a commit to pimutils/vdirsyncer that referenced this issue Jan 22, 2018
untitaker added a commit to pimutils/vdirsyncer that referenced this issue Jan 22, 2018
untitaker added a commit to pimutils/vdirsyncer that referenced this issue Jan 22, 2018
wichmannpas added a commit to wichmannpas/todoscheduler that referenced this issue Feb 3, 2018
wichmannpas added a commit to wichmannpas/todoscheduler that referenced this issue Feb 3, 2018
wichmannpas added a commit to wichmannpas/todoscheduler that referenced this issue Feb 3, 2018
brianmay added a commit to brianmay/spud that referenced this issue Feb 25, 2018
stale bot commented Apr 13, 2018

Thanks for contributing to this issue. As it has been 90 days since the last activity, we are automatically closing the issue. This is often because the request was already solved in some way and it just wasn't updated or it's no longer applicable. If that's not the case, please do feel free to either reopen this issue or open a new one. We'll gladly take a look again! You can read more here: https://blog.travis-ci.com/2018-03-09-closing-old-issues

@stale stale bot added the stale label Apr 13, 2018
nijel (Author) commented Apr 13, 2018

I've just rerun the test and it still fails:

https://travis-ci.org/nijel/sqlite-travis-test/builds/365965501

@stale stale bot removed the stale label Apr 13, 2018
raphaelm commented Jun 4, 2018

Running into the same issue. @nijel, what's your current workaround?

nijel (Author) commented Jun 5, 2018

My workaround is not to run tests against in-memory SQLite. File-backed SQLite makes them a bit slower, but it works...

PS: This is the related commit for Weblate (a Django-based app): WeblateOrg/weblate@37c1f64
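
For a Django project, one way to get a file-backed test database is the `TEST` / `NAME` key in `DATABASES`: when it is left unset with the SQLite backend, Django runs tests against an in-memory database. The settings fragment below is a sketch of that idea, not the content of the Weblate commit referenced above, and the file names are made up.

```python
import os
import tempfile

# Fragment for a Django settings module (paths are illustrative only).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": os.path.join(tempfile.gettempdir(), "app.sqlite3"),
        "TEST": {
            # An explicit file name makes the test runner create the test
            # database on disk instead of in memory.
            "NAME": os.path.join(tempfile.gettempdir(), "test_app.sqlite3"),
        },
    }
}
```

The cost is the slowdown mentioned above, since every test write now goes through the filesystem.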

raphaelm commented Jun 5, 2018

FYI: We found a way to make the SQLite tests faster instead of slower while working around this bug (for now) at the same time: https://behind.pretix.eu/2018/06/05/too-many-tests/

stale bot commented Sep 3, 2018

Thanks for contributing to this issue. As it has been 90 days since the last activity, we are automatically closing the issue in 24 hours. This is often because the request was already solved in some way and it just wasn't updated or it's no longer applicable. If that's not the case, please do feel free to either reopen this issue or open a new one. We'll gladly take a look again! You can read more here: https://blog.travis-ci.com/2018-03-09-closing-old-issues

@stale stale bot added the stale label Sep 3, 2018
nijel (Author) commented Sep 4, 2018

The issue is still valid; I've just reproduced it: https://travis-ci.org/nijel/sqlite-travis-test/builds/424173409

@stale stale bot removed the stale label Sep 4, 2018
stale bot commented Dec 3, 2018

Thanks for contributing to this issue. As it has been 90 days since the last activity, we are automatically closing the issue in 7 days. This is often because the request was already solved in some way and it just wasn't updated or it's no longer applicable. If that's not the case, please respond before the issue is closed, or open a new one after. We'll gladly take a look again! You can read more here: https://blog.travis-ci.com/2018-03-09-closing-old-issues

@stale stale bot added the stale label Dec 3, 2018
@stale stale bot closed this as completed Dec 10, 2018
afabiani pushed a commit to GeoNode/geonode that referenced this issue Apr 29, 2019
* initial migration files

* Update master_base.html

to extend geonode_base.html instead of base.html

* fix get_current_site import

* fix login to be compatible with allauth

* fix templates directories settings

* Update base templates

update from 2.8 to 2.10 templates

* add geosites template folder to templates dirs

* Update _type_filters.html

updated from geonode template

* update geosites settings

* Update site_base_tags.py

* count users for site

* flake8

* fix login & logout  urls

* check if geosites is already added to INSTALLED_APPS

* enable geosites

* use get_or_create in post save resources and people

* rename geosites initial_data fixture to not conflict with geonode fixtures

* fix geosites test cases

* add geosites to contrib apps smoke tests

* create SiteResources and SitePeople for default site

This is done after migration to avoid using get_or_create, as it is not thread safe and causes a "segmentation fault" error randomly

* create site related models if not created

* check if current site is master site to not do actions twice

* upgrade sqlite in travis test containers

travis-ci/travis-ci#7873

* upgrade sqlite in travis test containers

* add missing semi colons

* fix apt-add-repository command

* Update .travis.yml

* remove sqlite3 --version command

* fix global search to get resources for current site only

* override map, layer and document autocomplete, to get result for current site only

* handle groups for sites

* fix SiteGroups renaming

* geosites documentation update

* use apply filters instead of filtered queryset

* create SiteGroups(empty collection) while creating site

* flake*
nijel added a commit to WeblateOrg/weblate that referenced this issue Sep 23, 2019
At same time drop workaround for
travis-ci/travis-ci#7873, it seems no longer
necessary on Xenial builds.

Signed-off-by: Michal Čihař <michal@cihar.com>
nijel added a commit to WeblateOrg/weblate that referenced this issue Sep 24, 2019
It is still present on xenial

See travis-ci/travis-ci#7873

Signed-off-by: Michal Čihař <michal@cihar.com>
@DaffyTheDuck

This has started happening for me too on this repo:
https://travis-ci.org/openwisp/django-x509/jobs/297505773

I also tried restarting one of the previous builds that had completed successfully; after the restart it failed.
In my case the failures seem to affect Python 2.7 only; the Python 3 builds run fine.

I tried the workaround proposed here, as well as removing the Travis cache, but neither helped.

Same thing with:
https://travis-ci.org/openwisp/django-ipam/jobs/617307976?utm_medium=notification&utm_source=github_status

Any idea?

lcorbasson added a commit to lcorbasson/agate that referenced this issue Jan 8, 2020