SW probe broken on upgrade to Turris OS 5.0 #38

Closed
mato opened this issue Jun 1, 2020 · 9 comments

@mato

mato commented Jun 1, 2020

Hi,

I upgraded a Turris Omnia from OS 4.0.5 to OS 5.0.0 and my SW probe broke. I had to apply c6efd4a plus the following:

--- reginit.sh  2020-06-01 23:19:47.149699431 +0200
+++ reginit.sh.new      2020-06-01 23:18:48.000612596 +0200
@@ -82,6 +82,7 @@
                FIRMWARE_APPS_CS_UNCOMP) FIRMWARE_APPS_CS_UNCOMP="$value" ;;
                FIRMWARE_APPS) FIRMWARE_APPS="$value" ;;
                REREG_TIMER) REREG_TIMER="$value" ;;
+               REG_WAIT_UNTIL) REG_WAIT_UNTIL="$value" ;;
                *)
                        echo >&2 "unknown keyword '$kw' in CON_INIT_CONF (1)"
                ;;

... to get things working.
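
For context, the diagnostic comes from a keyword dispatch of roughly the following shape. This is a minimal sketch, not the actual reginit.sh: the function name `parse_con_init_conf` and the sample input are hypothetical, and only the `case` arms mirror the diff above.

```shell
#!/bin/sh
# Hypothetical sketch of a keyword/value parser modelled on the diff above.
# Reads "KEYWORD value" lines from stdin and sets the matching shell variables.
parse_con_init_conf() {
    while read -r kw value; do
        case "$kw" in
            REREG_TIMER) REREG_TIMER="$value" ;;
            REG_WAIT_UNTIL) REG_WAIT_UNTIL="$value" ;;
            "") ;;  # skip blank lines
            *)
                # Without a REG_WAIT_UNTIL arm, that keyword would land here.
                echo >&2 "unknown keyword '$kw' in CON_INIT_CONF (1)"
                ;;
        esac
    done
}

# Sample input, made up for illustration (not a real server reply).
parse_con_init_conf <<EOF
REREG_TIMER 3600
REG_WAIT_UNTIL 1591281800
EOF

echo "REREG_TIMER=$REREG_TIMER REG_WAIT_UNTIL=$REG_WAIT_UNTIL"
```

Feeding the input through a here-document rather than a pipe keeps the loop in the current shell, so the variables are still set after the function returns.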

Also, Turris needs the patches to disable buddyinfo; otherwise there are a ton of errors in the logs from that as well.

/cc @BKPepe

@PhilipHomburg
Collaborator

I applied your patch and pushed it to the devel branch. Note that I have no control over when cz.nic people pick up a new release of the software probe code. However, the previous version should work, despite the diagnostics.

@mato
Author

mato commented Jun 3, 2020

Thanks. I'm discussing this with the Turris folks, and I suggest we keep this open until it's resolved on their side. Will you be releasing a formal new version with 0d30e89?

@mato
Author

mato commented Jun 4, 2020

So, Turris OS 5.0.0 was released today with version 5020 of atlas-sw-probe. On reboot into the new version, my probe went offline and appeared to be looping on:

Jun  4 14:41:11 turris ATLAS[3099]: start reg
Jun  4 14:41:11 turris ATLAS[3099]: ATLAS registeration starting
Jun  4 14:41:11 turris ATLAS[3099]: unknown keyword 'REG_WAIT_UNTIL' in CON_INIT_CONF (1)
Jun  4 14:41:11 turris ATLAS[3099]: REGHOSTS reg03.atlas.ripe.net 193.0.19.246 2001:67c:2e8:11::c100:13f6 reg04.atlas.ripe.net 193.0.19.247 2001:67c:2e8:11::c100:13f7
Jun  4 14:41:11 turris ATLAS[3099]: ssh -i /usr/libexec/atlas-probe-scripts/etc/probe_key -p 443 atlas@reg03.atlas.ripe.net INIT
Jun  4 14:41:13 turris ATLAS[3099]: reg server asked us to wait or there was an error. REG_WAIT_UNTIL 1591281800

I manually applied 0d30e89 to /usr/libexec/atlas-probe-scripts/bin/reginit.sh on the router and restarted the probe, but it still seems to be getting REG_WAIT_UNTIL back from the Atlas servers after multiple attempts:

Jun  4 14:49:55 turris ATLAS[10457]: start reg
Jun  4 14:49:55 turris ATLAS[10457]: ATLAS registeration starting
Jun  4 14:49:55 turris ATLAS[10457]: there is WAIT, REG_WAIT_UNTIL  , now is 1591282195
Jun  4 14:49:55 turris ATLAS[10457]:  REG_WAIT_UNTIL expired go re-reg 1591282167 now 1591282195
Jun  4 14:49:55 turris ATLAS[10457]: REGHOSTS reg03.atlas.ripe.net 193.0.19.246 2001:67c:2e8:11::c100:13f6 reg04.atlas.ripe.net 193.0.19.247 2001:67c:2e8:11::c100:13f7
Jun  4 14:49:55 turris ATLAS[10457]: ssh -i /usr/libexec/atlas-probe-scripts/etc/probe_key -p 443 atlas@2001:67c:2e8:11::c100:13f6 INIT
Jun  4 14:49:57 turris ATLAS[10457]: reg server asked us to wait or there was an error. REG_WAIT_UNTIL 1591282308
Jun  4 14:53:00 turris ATLAS[10457]: start reg
Jun  4 14:53:00 turris ATLAS[10457]: ATLAS registeration starting
Jun  4 14:53:00 turris ATLAS[10457]: there is WAIT, REG_WAIT_UNTIL  , now is 1591282380
Jun  4 14:53:00 turris ATLAS[10457]:  REG_WAIT_UNTIL expired go re-reg 1591282308 now 1591282380
Jun  4 14:53:00 turris ATLAS[10457]: REGHOSTS reg03.atlas.ripe.net 193.0.19.246 2001:67c:2e8:11::c100:13f6 reg04.atlas.ripe.net 193.0.19.247 2001:67c:2e8:11::c100:13f7
Jun  4 14:53:00 turris ATLAS[10457]: ssh -i /usr/libexec/atlas-probe-scripts/etc/probe_key -p 443 atlas@193.0.19.246 INIT
Jun  4 14:53:01 turris crond[10821]: (root) CMD (/usr/bin/rainbow_button_sync.sh)
Jun  4 14:53:02 turris ATLAS[10457]: reg server asked us to wait or there was an error. REG_WAIT_UNTIL 1591282504

Will continue to monitor what happens and report back here.
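
The log lines above compare REG_WAIT_UNTIL with the current epoch time. A rough sketch of that comparison follows. `wait_remaining` is a hypothetical helper, not part of the probe scripts. The deadline values come from the 14:49:55 and 14:49:57 log lines; the second "now" value (1591282197) is an assumption, derived by adding the two seconds between those log timestamps to the logged 1591282195.

```shell
#!/bin/sh
# Hypothetical helper: prints how many seconds remain until the epoch time
# in $1, relative to the epoch time in $2; prints 0 if already expired.
wait_remaining() {
    wait_until=$1
    now=$2
    if [ "$wait_until" -gt "$now" ]; then
        echo $((wait_until - now))
    else
        echo 0
    fi
}

# The stored deadline had already expired at 14:49:55, so the probe re-registered ...
expired=$(wait_remaining 1591282167 1591282195)
# ... and the server then asked for a new wait until 1591282308.
# 1591282197 is assumed: the logged "now" (1591282195) plus 2 s.
pending=$(wait_remaining 1591282308 1591282197)
echo "expired=$expired pending=$pending"
```

Under these assumptions each requested wait is on the order of two minutes, which matches the roughly three-minute gap between the registration attempts in the log.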

@BKPepe
Contributor

BKPepe commented Jun 4, 2020

This could be related to the issue that @ja-pa and I also observed following your report, which we described on the mailing list.

@mato
Author

mato commented Jun 4, 2020

It seems to have registered successfully now, though it took some hours:

Jun  4 17:42:30 turris ATLAS[10457]: start reg
Jun  4 17:42:30 turris ATLAS[10457]: ATLAS registeration starting
Jun  4 17:42:30 turris ATLAS[10457]: there is WAIT, REG_WAIT_UNTIL  , now is 1591292550
Jun  4 17:42:30 turris ATLAS[10457]:  REG_WAIT_UNTIL expired go re-reg 1591292492 now 1591292550
Jun  4 17:42:30 turris ATLAS[10457]: REGHOSTS reg03.atlas.ripe.net 193.0.19.246 2001:67c:2e8:11::c100:13f6 reg04.atlas.ripe.net 193.0.19.247 2001:67c:2e8:11::c100:13f7
Jun  4 17:42:30 turris ATLAS[10457]: ssh -i /usr/libexec/atlas-probe-scripts/etc/probe_key -p 443 atlas@193.0.19.246 INIT
Jun  4 17:42:32 turris ATLAS[10457]: Got good controller info
Jun  4 17:42:32 turris ATLAS[10457]: check cached controller info from previous registeration
Jun  4 17:42:32 turris ATLAS[10457]: NO cached controller info. NO REMOTE port info
Jun  4 17:42:32 turris ATLAS[10457]: Do a controller INIT
Jun  4 17:42:32 turris ATLAS[10457]: Controller init -p  443 atlas@ctr-fsn01.atlas.ripe.net  INIT
Jun  4 17:42:33 turris ATLAS[10457]: initiating  KEEP connection to -R 54087 -p  443 ctr-fsn01.atlas.ripe.net

@BKPepe
Contributor

BKPepe commented Jun 4, 2020

I wonder whether it would be better to give @mato's probe ID to @PhilipHomburg, even via a private message. Maybe something is happening on the server side?

@mato
Author

mato commented Jun 4, 2020 via email

@BKPepe
Contributor

BKPepe commented Jun 8, 2020

Based on @PhilipHomburg's email to the atlas-sw-probes mailing list, this can be closed.

@mato
Author

mato commented Jun 8, 2020

My probe seems to be working fine, so yes, I'll close this issue. For reference, here's the acknowledgement of server-side issues related to software probe connectivity: https://www.ripe.net/ripe/mail/archives/atlas-sw-probes/2020-June/000065.html

@mato mato closed this as completed Jun 8, 2020