FATAL ERROR: "172.17.0.2:51081" is in use (duplicate or overlapping run?) #140
Comments
You say: "restarted my machine." It'd be interesting to find out why exactly |
I see the exact same problem. I check with However, I can successfully start ais if I reformat the partition prior to starting docker. |
I also checked with I inspected the source code for the check that triggers the error message and found the checkRestarted function, which checks for markers.

func (t *target) checkRestarted() (fatalErr, writeErr error) {
	if fs.MarkerExists(fname.NodeRestartedMarker) {
		// NOTE the risk: duplicate aisnode run - which'll fail shortly with "bind:
		// address already in use" but not before triggering (`NodeRestartedPrev` => GFN)
		// sequence and stealing nlog symlinks - that's why we go extra length
		if _lsof(t.si.PubNet.TCPEndpoint()) {
			fatalErr = fmt.Errorf("%s: %q is in use (duplicate or overlapping run?)",
				t, t.si.PubNet.TCPEndpoint())
			return
		}
		t.statsT.Inc(stats.RestartCount)
		fs.PersistMarker(fname.NodeRestartedPrev)
	}
	fatalErr, writeErr = fs.PersistMarker(fname.NodeRestartedMarker)
	return
}

I tried deleting |
of course. But that's illegal - the whole point of this specific persistent marker, and the reason for its existence, is to let us know that the node restarted without shutting down properly. |
The error message is a little confusing in that case. Shouldn't the system automatically try to recover from such a condition? In any case, good to know how to manually recover. |
the keyword is "overlapping run". Maybe there's a better way to express the fact that there is another instance of ais storage target running (and listening on the same local port), and that immediate exit seems to be the best remedy. |
Yes, the overlapping run is clear as such. However, the user doesn't explicitly spin up a second target, nor is one running on the host machine prior to starting the docker container. It is clearly the So the questions why the The only difference between successful runs and the ones experiencing this behaviour is the mounting of an improperly shut down volume. |
I just don't reproduce it. Here's what I've done:

# 1. run it first time
# `/tmp/cluster-minimal` here is just an arbitrary place where the container can write
$ docker run -d -p 51080:51080 -v /tmp/cluster-minimal:/ais/disk0 aistorage/cluster-minimal:latest

# 2. use it somehow, this new cluster
$ AIS_ENDPOINT=http://localhost:51080 aisloader -bucket=ais://nnn -cleanup=false -totalputsize=50M -duration=0 -minsize=1MB -maxsize=1MB -numworkers=8 -pctput=100 -quiet
$ AIS_ENDPOINT=http://localhost:51080 ais ls --summary

# 3. shutdown
$ AIS_ENDPOINT=http://localhost:51080 ais cluster shutdown

# 4. restart
$ docker run -d -p 51080:51080 -v /tmp/cluster-minimal:/ais/disk0 aistorage/cluster-minimal:latest

# 5. Finally, see that it sees ais://nnn bucket and generally works
$ export AIS_ENDPOINT=http://localhost:51080
$ ais show cluster
$ ais ls --summary
# and so on

This is with aistore v3.19 |
The difference is that I hadn't run |
closing |
Error Message
Context
I deployed aistore using the docker image, following the docs, and it worked at first. The fatal error appeared after I restarted my machine and ran the Docker image again to start the cluster (since the container was no longer running). Here is the docker run command: