Incomplete tests in Dgraph test suite. #451
Comments
This is the weird error I saw when I tried to merge the recent Jepsen changes into our fork.
The failures we are seeing are not deterministic, so we'll try to get the incomplete tests to pass in order to unblock ourselves. But it would still be great to be able to run the full test suite without running into these issues.
This might be a bug in Jepsen's generator system, which I've been working on rewriting. Or it could be that something else happened in the test (e.g. a timeout), and it interrupted other threads. Reading through the stacktraces in jepsen.log (there might be multiple) could help.
This sounds like an automation issue? Like maybe Dgraph isn't being killed properly?
The first could be either; the second is most likely in the Dgraph test suite.
I'm guessing you're running headless? Try adding -Djava.awt.headless=true to the jvm-opts in project.clj: https://stackoverflow.com/questions/21343529/all-my-java-applications-now-throw-a-java-awt-headlessexception
Thanks for the quick response. I see the "sleep interrupted" error near the beginning of the test. I forgot to copy the stack trace because I saw it on my remote workstation, but it didn't contain any reference to code in the Dgraph test suite, so it's probably the same bug that you mentioned. The test did not even have a chance to start. The other failure is probably in the Dgraph test suite, but I haven't been able to debug it. There have been some recent changes to the test suite itself, and I think they introduced the error because we have not seen it before. It works most of the time, but the error does happen occasionally, so something about how Jepsen creates and destroys the cluster must have changed. I'll try changing the project.clj file to see if I can get the test suite running with the latest changes.
@aphyr, hi. I haven't been able to get past the headless error. I tried setting the option in the Dgraph project.clj file as well as in the jepsen project.clj file but it still doesn't work. I don't know what jepsen is trying to display but there should be a way to disable it. Thanks.
I don't think I can patch it at the Jepsen level, cuz it's happening at compile time, but the project.clj fix is a one-liner, at least on every env I've used. Really, I need to go into ztellman/rhizome and patch this so it's not a compile-time thing.
:jvm-opts ["-Djava.awt.headless=true"]
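For anyone following along, here is a rough sketch of where that one-liner would sit in a Leiningen project.clj. The project name, version, and dependency list are placeholders made up for illustration, not the actual contents of the Dgraph test suite's project.clj; only the :jvm-opts line comes from this thread.

```clojure
;; Hypothetical project.clj: the name, version, and dependencies below are
;; placeholders, not the real Dgraph test suite configuration.
(defproject example-jepsen-test "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.0"]
                 [jepsen "0.1.17"]]
  ;; Passed to the JVM that Leiningen starts for this project, so AWT
  ;; runs in headless mode and does not need an X11 display.
  :jvm-opts ["-Djava.awt.headless=true"])
```

The option only applies to the JVM that Leiningen launches for the project it is declared in, so it would presumably need to be in the project.clj of whichever project is actually being run.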
That is the exact line I added to the project.clj file, but it didn't have any effect. Adding it to the jepsen project.clj file didn't have any effect either. I think you are right; since this happens at compile time, setting the flag probably does nothing.
No, that's not it; compilation happens when you call
I was trying to run the dgraph tests (using our fork of Jepsen) but there are a lot of incomplete tests. Looking at the errors I see two types:
I haven't been able to debug either of these errors since they are in the core Jepsen code and not in the Dgraph test suite. There was another error caused by alters failing, but I fixed it and the fix is included in #450.
I tried merging the recent changes from this repo to the fork but that made it even worse. Now I get an error about the X11 display variable causing a library to crash. I tried setting the DISPLAY variable but it doesn't work. The test just gets stuck there. I made sure I am running the tests from a terminal emulator (not screen or tmux) and I still see the same issue.
The last time I successfully ran Jepsen we were using jepsen 0.1.15 but that's not possible anymore as the changes in the Dgraph test suite depend on changes made in later jepsen versions (the fork is currently using jepsen 0.1.17 and this repo is using 0.1.18-SNAPSHOT).
I was using the jepsen tool we have in the dgraph-io/dgraph repo (in the contrib/jepsen folder) and I added a wait in between tests (hoping this was just a timing issue) but that didn't fix it.
I'd appreciate any help as our new release is blocked on passing Jepsen tests.
Steps to reproduce.