Incomplete tests in Dgraph test suite. #451

@martinmr

I was trying to run the Dgraph tests (using our fork of Jepsen), but many of them end up incomplete. Looking at the errors, I see two types:

  1. "sleep interrupted" exceptions.
  2. Zero tries to start but fails because of an "address already in use" error.

I haven't been able to debug either of these errors since they occur in the core Jepsen code and not in the Dgraph test suite. There was another error due to alters failing, but I fixed it and the fix is included in #450.

I tried merging the recent changes from this repo into the fork, but that made it even worse. Now I get an error about the X11 display variable causing a library to crash. I tried setting the DISPLAY variable, but it doesn't work; the test just gets stuck there. I made sure I am running the tests from a terminal emulator (not screen or tmux) and I still see the same issue.

The last time I successfully ran Jepsen we were using Jepsen 0.1.15, but that's not possible anymore because the changes in the Dgraph test suite depend on changes made in later Jepsen versions (the fork is currently using Jepsen 0.1.17 and this repo is using 0.1.18-SNAPSHOT).

I was using the jepsen tool we have in the dgraph-io/dgraph repo (in the contrib/jepsen folder) and I added a wait in between tests (hoping this was just a timing issue) but that didn't fix it.
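
An "address already in use" error usually means some process still holds the listening socket, so a fixed wait between tests can be either too short or wasted. As a rough sketch (not part of the Jepsen or Dgraph tooling), the wait could instead poll until the port can be bound again. The function names `port_is_free` and `wait_until_free` are hypothetical, and the caller has to supply the port Zero is actually configured with.

```python
import socket
import time


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            # SO_REUSEADDR is deliberately not set, so this check is
            # conservative: any conflicting bind is reported as "in use".
            s.bind((host, port))
        except OSError:
            return False
        return True


def wait_until_free(port, host="0.0.0.0", timeout=30.0, interval=0.5):
    """Poll until the port can be bound, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if port_is_free(port, host):
            return True
        time.sleep(interval)
    # One last check so a port freed right at the deadline is not missed.
    return port_is_free(port, host)
```

A call such as `wait_until_free(5080)` would have to run on the node that hosts Zero before the next test starts; 5080 is only an assumed port number here. If the call returns `False`, the previous process is probably still running rather than merely slow to shut down.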

I'd appreciate any help as our new release is blocked on passing Jepsen tests.

Steps to reproduce.

  1. Clone Dgraph's Jepsen fork (to avoid getting stuck on the X11 display issue). Or use the main repo if you don't see that error, but you'll need the changes in #450 (Add recent changes to Dgraph tests) to avoid issues with Alter requests.
  2. Run the Dgraph test suite. You don't need to run each test for long. I ran each test for 15 seconds and still see the issues (they also show up when I run the tests for longer).
  3. Most tests should pass but some are marked as incomplete.
  4. Look into the logs. You will see either the "sleep interrupted" errors or the "address already in use" error in the Dgraph logs.
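
The log check in the last step can be roughly automated. The sketch below counts how many log lines contain each of the two error strings quoted in this issue. It assumes the logs are plain-text files ending in `.log` somewhere under a single directory; the function name `count_errors` and that layout are illustrative assumptions, not something the test suite provides.

```python
from pathlib import Path

# The two error strings quoted in this issue.
PATTERNS = ("sleep interrupted", "address already in use")


def count_errors(log_dir, patterns=PATTERNS):
    """Count, per pattern, the log lines under log_dir that contain it."""
    counts = {p: 0 for p in patterns}
    # Assumption: logs are *.log files at any depth below log_dir.
    for path in Path(log_dir).rglob("*.log"):
        with open(path, errors="replace") as f:
            for line in f:
                lowered = line.lower()  # match case-insensitively
                for p in patterns:
                    if p in lowered:
                        counts[p] += 1
    return counts
```

Pointing `count_errors` at the directory where the test run stores its output returns a dictionary keyed by error string, which makes it easier to see which of the two failures dominates across a run.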
