Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.0 clean up shell scripts #6158

Merged
merged 45 commits into from Jan 7, 2016
Merged

Conversation

benbc
Copy link
Contributor

@benbc benbc commented Dec 22, 2015

Fixes #139, fixes #270, fixes #133, fixes #5682.


*)
echo "Usage: neo4j { console | start | stop | restart | status | info }"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this supposed to say neo4j-arbiter for the script name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lof of this script looks very close to the neo4j one. Can we have this one source that one, and just add its weirdities? Or are they not close enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this supposed to say neo4j-arbiter for the script name?

Thank you. I find that that bug has existed since 2013.

@srbaker
Copy link
Contributor

srbaker commented Dec 22, 2015

Aside from what might be an opportunity to cut down on duplication in neo4j-arbiter, I'm very happy with this. 👍

@benbc
Copy link
Contributor Author

benbc commented Dec 23, 2015

@srbaker All right. You got me. I've been trying to avoid removing that duplication, but you are quite right that it needs doing.

This functionality has long been problematic for three reasons.

* It uses lsof which is an endless cause of compatibility bugs because
  its interface changes across distributions and verions.
* It doesn't actually work as intended. The expectation is that the
  server is ready to serve requests when the script exits. In fact that
  is not quite the case because Jetty has this weird behaviour where it
  binds the port before the handler is actually ready. So using "port
  bound" as a proxy for "ready" doesn't work.
* It takes a massively simplistic view of what it means for Neo4j to be
  "ready". It doesn't take into account the complexity of recovery,
  upgrades or clustering; each of which can make the server unavailable
  even though the port is bound.

We are losing functionality here because errors early in the DBMS
lifecycle may not be repoorted. Probably the biggest loss is in the case
when there is something else listening on the same port. Previously we
attempted to catch that in the shell script and give a clear message but
now the user will have to look in the logs find the information they
need.

In fact even this latter is not as simple as it looks since, following
the changes for Bolt, it's now possible for the DBMS to be running
perfectly happily with nothing listening on port 7474.

We hope to introduce mitigation for this loss later. Either have
something that polls real endpoints (e.g. curl), although it's not clear
how that would work in the Bolt-only case; or modify the DMBS so that it
can signal back to the parent process that it's up and happy.

As a temporary measure we are sleeping for five seconds after starting
the process and checking that it's still alive. That way we can tell the
user about very early errors (e.g. port binding problems). The sleep
time has been chosen experimentally and we can't be sure that it will
always be long enough.
* Call the detection function from the code that needs it to improve
  maintainability. Duplicating calls to `uname` will not slow us down.
* Use the call detection function everywhere instead of ad hoc
  detection.
* Remove detection for OSes that we don't vary our behaviour for.
This is untested, not officially documented and has several
long-outstanding bugs against it.
With our current build and release processes it is very unlikely that
someone would run a SNAPSHOT build unintentionally, so this is just
unnecessary clutter.
This changes the way we construct the classpath to be easier to test,
letting Java do the globbing for us.
This allows us to accommodate environments where the JVM takes a long
time (or very little time) to come up. It also allows us to turn the
sleep right down for testing purposes, so the tests run quickly.
We would like the shell script tests to work on OSX and
Linux. Unfortunately sed is pretty incompatible between the two of
them. We don't really want to take a backup of the file here, but the
form we've given happens to work identically in both environments.
In order to make the tests work the same on OSX and Linux, we need to
specify /bin/bash since /bin/sh is different on those platforms (Bourne
shell vs Dash).
These tests will only fail if the entire tarball is somehow broken.
@benbc benbc force-pushed the 3.0-clean-up-shell-scripts branch from f414ebf to e142688 Compare January 5, 2016 14:14
@benbc
Copy link
Contributor Author

benbc commented Jan 5, 2016

@srbaker Back to you.

This is mostly a refactoring to simplify the code, but it does introduce
one behavioural change (to make the implementation easier): it is now
necessary to have a valid Java available for all commands, not just
starting Neo4j.
This special case was inherited from whatever we copied when our scripts
when our scripts were originally written. As far as I can tell the weird
case that it was trying to deal with no longer exists and we don't
support AIX anyway.
@benbc benbc force-pushed the 3.0-clean-up-shell-scripts branch from e142688 to eae29e8 Compare January 5, 2016 16:49
srbaker added a commit that referenced this pull request Jan 7, 2016
@srbaker srbaker merged commit 3e8fa0b into neo4j:3.0 Jan 7, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants