New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
3.0 clean up shell scripts #6158
Conversation
|
||
*) | ||
echo "Usage: neo4j { console | start | stop | restart | status | info }" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this supposed to say neo4j-arbiter
for the script name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lof of this script looks very close to the neo4j one. Can we have this one source that one, and just add its weirdities? Or are they not close enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this supposed to say neo4j-arbiter for the script name?
Thank you. I find that that bug has existed since 2013.
Aside from what might be an opportunity to cut down on duplication in neo4j-arbiter, I'm very happy with this. 👍 |
@srbaker All right. You got me. I've been trying to avoid removing that duplication, but you are quite right that it needs doing. |
This functionality has long been problematic for three reasons. * It uses lsof which is an endless cause of compatibility bugs because its interface changes across distributions and verions. * It doesn't actually work as intended. The expectation is that the server is ready to serve requests when the script exits. In fact that is not quite the case because Jetty has this weird behaviour where it binds the port before the handler is actually ready. So using "port bound" as a proxy for "ready" doesn't work. * It takes a massively simplistic view of what it means for Neo4j to be "ready". It doesn't take into account the complexity of recovery, upgrades or clustering; each of which can make the server unavailable even though the port is bound. We are losing functionality here because errors early in the DBMS lifecycle may not be repoorted. Probably the biggest loss is in the case when there is something else listening on the same port. Previously we attempted to catch that in the shell script and give a clear message but now the user will have to look in the logs find the information they need. In fact even this latter is not as simple as it looks since, following the changes for Bolt, it's now possible for the DBMS to be running perfectly happily with nothing listening on port 7474. We hope to introduce mitigation for this loss later. Either have something that polls real endpoints (e.g. curl), although it's not clear how that would work in the Bolt-only case; or modify the DMBS so that it can signal back to the parent process that it's up and happy. As a temporary measure we are sleeping for five seconds after starting the process and checking that it's still alive. That way we can tell the user about very early errors (e.g. port binding problems). The sleep time has been chosen experimentally and we can't be sure that it will always be long enough.
* Call the detection function from the code that needs it to improve maintainability. Duplicating calls to `uname` will not slow us down. * Use the call detection function everywhere instead of ad hoc detection. * Remove detection for OSes that we don't vary our behaviour for.
This is untested, not officially documented and has several long-outstanding bugs against it.
With our current build and release processes it is very unlikely that someone would run a SNAPSHOT build unintentionally, so this is just unnecessary clutter.
Uses the Sharness shell testing library (https://github.com/mlafeldt/sharness/).
This changes the way we construct the classpath to be easier to test, letting Java do the globbing for us.
This allows us to accommodate environments where the JVM takes a long time (or very little time) to come up. It also allows us to turn the sleep right down for testing purposes, so the tests run quickly.
We would like the shell script tests to work on OSX and Linux. Unfortunately sed is pretty incompatible between the two of them. We don't really want to take a backup of the file here, but the form we've given happens to work identically in both environments.
In order to make the tests work the same on OSX and Linux, we need to specify /bin/bash since /bin/sh is different on those platforms (Bourne shell vs Dash).
These tests will only fail if the entire tarball is somehow broken.
f414ebf
to
e142688
Compare
@srbaker Back to you. |
This is mostly a refactoring to simplify the code, but it does introduce one behavioural change (to make the implementation easier): it is now necessary to have a valid Java available for all commands, not just starting Neo4j.
This special case was inherited from whatever we copied when our scripts when our scripts were originally written. As far as I can tell the weird case that it was trying to deal with no longer exists and we don't support AIX anyway.
e142688
to
eae29e8
Compare
Fixes #139, fixes #270, fixes #133, fixes #5682.