I've seen instances where Gate One (commit 62af45) reloads and clears the pid file, which causes timeouts for subsequent connections.
I've got a box that is online now (Ubuntu 14.04) that has Gate One timing out. I have yet to restart the process in case we want to debug this live. I've also seen this with CentOS 6.
The Ubuntu box uses upstart to manage Gate One (see /etc/init/gateone.conf). The hack in post-start to create the pid file is there in case the pid file wasn't created by Gate One.
# status gateone
gateone stop/waiting
# ps axuww|grep gateone
root 3666 0.0 0.9 231536 16040 ? Sl Feb20 0:00 /opt/virtualenvs/gateone/bin/python /opt/virtualenvs/gateone/bin/gateone --pid_file=/var/run/gateone.pid
root 10105 0.0 0.0 10464 928 pts/3 S+ 22:11 0:00 grep --color=auto gateone
# ls -la /var/run/gateone.pid
ls: cannot access /var/run/gateone.pid: No such file or directory
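The state shown above (process alive, pid file gone) is what a supervisor would need to tell apart from a process that has actually died. A minimal Python 3 sketch of such a check, assuming a POSIX host; `pid_file_state` is a hypothetical helper for illustration and not part of Gate One:

```python
import errno
import os


def pid_file_state(path):
    """Classify a pid file as 'missing', 'invalid', 'stale', or 'running'."""
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except FileNotFoundError:
        return "missing"
    except ValueError:
        return "invalid"
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return "invalid"
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return "stale"  # the file names a pid that no longer exists
        # Any other error (e.g. EPERM) means the process exists.
    return "running"
```

On the box above, `pid_file_state("/var/run/gateone.pid")` would report `"missing"` even though pid 3666 is still alive, so a check like this only helps when combined with a separate look at the process table.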
From /var/log/gateone/gateone.log, this is Gate One starting:
[W 150219 18:10:46 app_terminal:2714] dtach command not found. dtach support has been disabled.
[I 150219 18:10:46 server:4047] Imported applications: Terminal
[I 150219 18:10:46 server:4189] Version: 1.2.0 (20140609214034)
[I 150219 18:10:46 server:4190] Tornado version 3.2.2
[I 150219 18:10:46 server:4210] Connections to this server will be allowed from the following origins: '.*'
[I 150219 18:10:46 server:3728] No authentication method configured. All users will be ANONYMOUS
[I 150219 18:10:46 server:3855] Loaded global plugins: control_alt_w.js, help.js
[I 150219 18:10:46 server:4328] Listening on http://*:8080/
[I 150219 18:10:46 server:4348] Process running with pid 19048
Everything is fine for a while until this:
[E 150220 17:20:39 server:1873] Error/Unknown WebSocket action, terminal:get_terminals: None (/opt/virtualenvs/gateone/local/lib/python2.7/site-packages/gateone/applications/terminal/app_terminal.py line 732)
[I 150220 17:20:42 server:866] All user sessions have terminated.
[I 150220 17:20:42 server:876] The last idle session has timed out. Reloading...
[W 150220 17:20:42 app_terminal:2714] dtach command not found. dtach support has been disabled.
[I 150220 17:20:42 server:4047] Imported applications: Terminal
[I 150220 17:20:42 server:4189] Version: 1.2.0 (20140609214034)
[I 150220 17:20:42 server:4190] Tornado version 3.2.2
[I 150220 17:20:42 server:4210] Connections to this server will be allowed from the following origins: '.*'
[I 150220 17:20:42 server:3728] No authentication method configured. All users will be ANONYMOUS
[I 150220 17:20:42 server:3855] Loaded global plugins: control_alt_w.js, help.js
[I 150220 17:20:42 server:4328] Listening on http://*:8080/
[E 150220 17:20:42 server:4357] Could not listen on 0.0.0.0:8080 (address:port is already in use by another application).
[E 150220 17:20:42 server:4371] Exception was: (98, 'Address already in use')
[I 150220 17:20:42 server:4378] Clearing cache_dir: /tmp/gateone_cache
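The failure sequence in this log (an idle-timeout reload followed by a failed bind) could be detected mechanically, for example by a watchdog that decides when to restart the service. A rough Python sketch, assuming the two message strings below appear verbatim as in the excerpt above; `reload_failed` and the marker constants are hypothetical names, not Gate One APIs:

```python
RELOAD_MARKER = "The last idle session has timed out. Reloading..."
BIND_FAILURE_MARKER = "Could not listen on"


def reload_failed(lines):
    """Return True if a bind failure is logged after an idle-timeout reload."""
    reloading = False
    for line in lines:
        if RELOAD_MARKER in line:
            reloading = True
        elif reloading and BIND_FAILURE_MARKER in line:
            return True
    return False


# Lines taken from the log excerpt above.
sample = [
    "[I 150220 17:20:42 server:876] The last idle session has timed out. Reloading...",
    "[I 150220 17:20:42 server:4328] Listening on http://*:8080/",
    "[E 150220 17:20:42 server:4357] Could not listen on 0.0.0.0:8080 "
    "(address:port is already in use by another application).",
]
print(reload_failed(sample))  # True
```

Matching on substrings rather than on line numbers such as `server:4357` is deliberate, since those numbers change between Gate One versions.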
The user started using Gate One at 17:02 successfully.
I'm not sure what was done on the client end that triggered the WebSocket error.
I use nginx to reverse proxy connections into the Gate One box which only has an internal IP address. I run several Gate One boxes concurrently and would like to keep the reverse proxy architecture in place so that I don't have to provide each Gate One box an external IP address.
The nginx error logs:
2015/02/22 21:18:42 [error] 1206#0: *2207 upstream timed out (110: Connection timed out) while reading response header from upstream, client: XXX.XXX.XXX.XXX, server: ~^(node-.*)-(r-.*)\.REDACTED$, request: "GET /favicon.ico HTTP/1.1", upstream: "http://10.80.25.72:8080/favicon.ico", host: "REDACTED"
nginx terminates SSL and proxies to Gate One over plain HTTP.
The Gate One process seems to sit in a weird state after this. These are recurring logs within gateone.log up until 02/21 at 06:06:31.
# tail /var/log/gateone/gateone.log
[I 150221 06:06:31 server:4210] Connections to this server will be allowed from the following origins: '.*'
[I 150221 06:06:31 server:3728] No authentication method configured. All users will be ANONYMOUS
[I 150221 06:06:31 server:3855] Loaded global plugins: control_alt_w.js, help.js
[I 150221 06:06:31 server:4328] Listening on http://*:8080/
[E 150221 06:06:31 server:4357] Could not listen on 0.0.0.0:8080 (address:port is already in use by another application).
[E 150221 06:06:31 server:4371] Exception was: (98, 'Address already in use')
[I 150221 06:06:31 server:4378] Clearing cache_dir: /tmp/gateone_cache
[I 150221 06:06:31 server:4381] pid file removed.
[W 150221 06:06:32 utils:836] Could not open pid_file (/var/run/gateone.pid). You *may* have to kill gateone.py manually (probably not).
[W 150221 06:06:32 utils:836] Could not open pid_file (/var/run/gateone.pid). You *may* have to kill gateone.py manually (probably not).
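The repeated `(98, 'Address already in use')` error is errno 98 (`EADDRINUSE`) on Linux: the reloaded code tries to bind port 8080 while the original listener still holds it. A small Python sketch that reproduces the same check from outside the process, assuming IPv4 and a plain TCP bind without `SO_REUSEADDR`; `port_in_use` is a hypothetical helper:

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding host:port fails with EADDRINUSE."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # e.g. EACCES for privileged ports; not what we are testing
    finally:
        probe.close()
    return False
```

Calling `port_in_use(8080)` on the affected box would be expected to return `True` for as long as the old process keeps its listening socket open.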
The current TCP state shows a connection in CLOSE_WAIT.
Any idea what is causing the WebSocket error, and is there a way I can recover gracefully without requiring a human to restart the Gate One process?
My config files are below.
# cat 10server.conf
// This is Gate One's main settings file.
{
    // "gateone" server-wide settings fall under "*"
    "*": {
        "gateone": { // These settings apply to all of Gate One
            "address": "",
            "auth": "none",
            "api_timestamp_window": 30,
            "ca_certs": null,
            "cache_dir": "/tmp/gateone_cache",
            "certificate": "/etc/gateone/ssl/certificate.pem",
            "combine_css": "",
            "combine_css_container": "gateone",
            "combine_js": "",
            "cookie_secret": "REDACTED",
            "debug": true,
            "disable_ssl": true,
            "embedded": false,
            "enable_unix_socket": false,
            "gid": "0",
            "https_redirect": false,
            "js_init": "{showToolbar: false, showTitle: false}",
            "keyfile": "/etc/gateone/ssl/keyfile.pem",
            "locale": "en_US",
            "log_file_max_size": 100000000,
            "log_file_num_backups": 10,
            "log_file_prefix": "/var/log/gateone/gateone.log",
            "log_to_stderr": null,
            "logging": "info",
            "origins": [".*"],
            "pid_file": "/var/run/gateone.pid",
            "port": 8080,
            "session_dir": "/tmp/gateone",
            "session_timeout": 0,
            "syslog_facility": "daemon",
            "syslog_host": null,
            "uid": "0",
            "unix_socket_path": "/tmp/gateone.sock",
            "url_prefix": "/",
            "user_dir": "/var/lib/gateone/users",
            "user_logs_max_age": "30d"
        }
    }
}
# cat 50terminal.conf
// This is Gate One's Terminal application settings file.
{
    // "*" means "apply to all users" or "default"
    "*": {
        "terminal": {
            "commands": {"LOGIN": "/usr/bin/sudo -u admin -i"},
            "default_command": "LOGIN",
            "dtach": false,
            "enabled_filetypes": "all",
            "environment_vars": {"TERM": "xterm-256color"},
            "max_terms": 6,
            "session_logging": false,
            "syslog_session_logging": false
        }
    }
}
rene00 changed the title from "Gate One reloading removes gateone.pid causing timeouts" to "Gate One WebSocket causes reload which removes gateone.pid resulting in timeouts" on Feb 22, 2015
After some thought ("I thought I remember fixing something like this a while back...") I believe I know what's going on: there's a bug in the version of Gate One you're using that I corrected in commit 9334592:
commit 9334592411911aba35b2c387a1907beacba3deb7
Author: Dan McDougall <daniel.mcdougall@liftoffsoftware.com>
Date: Sat Aug 23 21:04:37 2014 -0400
core/server.py: Removed the code that restarts Gate One after the last user logs out. Turns out it messes up a lot of the time on a lot of platforms. It just isn't worth it.
So if you upgrade to the latest code, this problem should go away. Alternatively, you could look at what I changed between f11d0e4 and 9334592, which was basically just removing a bunch of lines from server.py.
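As a rough cross-check, the build stamp in the startup log (`Version: 1.2.0 (20140609214034)`) can be compared with the date of the fix commit (Aug 23, 2014). The Python sketch below assumes the stamp is a `YYYYMMDDHHMMSS` build time, which is a guess from its shape and not something confirmed in this thread; `build_predates_fix` is a hypothetical helper:

```python
from datetime import datetime

# Date of commit 9334592 as quoted above (time of day and zone ignored).
FIX_COMMIT_DATE = datetime(2014, 8, 23)


def build_predates_fix(version_text):
    """version_text looks like '1.2.0 (20140609214034)'."""
    stamp = version_text.split("(")[1].rstrip(")")
    # Assumption: the stamp encodes year, month, day, hour, minute, second.
    built = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return built < FIX_COMMIT_DATE


print(build_predates_fix("1.2.0 (20140609214034)"))  # True
```

If that reading of the stamp is right, the running build is from June 2014 and so would not include the change.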