Provision loop after initial force provision and reboot #20

Open
jschilperoord opened this issue Jul 23, 2020 · 10 comments
Labels: vendor bug (Bugs that are caused by the vendor. They are here to inform people but will not be addressed here.)

Comments

@jschilperoord

This place is probably my last resort, so bear with me if you can :-)

I am running controller version atag_5.13.29_13635, with software 4.4.51.5287926 on my USG. My controller is on the internet, so I have to connect my unprovisioned USG to the Experia Box, provision it without the config.gateway.json, copy the scripts, copy the JSON file, and then force a provision. Before I reboot, I connect the USG directly to the fiber modem.

After the reboot everything comes online and works like a charm: internet, IPTV and IPv6. But after I apply a config change that requires a provision on the USG, it gets stuck in a provision loop and stops talking to the controller. I see errors in the log like:

user.err syslog: ace_reporter.reporter_fail(): Unknown[11]

Anyone seen this behavior?

@coolhva
Owner

coolhva commented Jul 24, 2020

Yes, and I have found no solution. Using a remote controller together with a config.gateway.json that specifies two WAN interfaces does not seem to work.

@Abchrisabc

I have the same issue here. Whenever I want to exit this provisioning state, I have to manually re-enter the PPPoE settings on the USG directly, which makes the USG realize that it can make a connection. Doing this will, however, temporarily drop all traffic, as you are re-initiating your PPPoE session and then getting the provisioning settings again. After this the USG shows up as online and connected, but this needs to be done for every change.

This is of course only a band-aid and not a fix; I am still searching for a solution as well.

@coolhva
Owner

coolhva commented Sep 8, 2020

Well, it is better than nothing. Thanks for updating and sharing :)

@coolhva added the "vendor bug" label on Sep 8, 2020
@jschilperoord
Author

Thanks for sharing @Abchrisabc. To solve this issue, I chose to run a controller on a PCEngines apu2 in my local network. Thanks for this great config @coolhva 👍 Do we want to keep this open for future reference, or should I close it?

@vertizio

> Do we want to keep this open for future reference, or should I close it?

@jschilperoord keep it open. I just stumbled across this and have been banging my head against the wall over why my USG was going into the provisioning state every now and then. Now I know why...

@rlaarman

rlaarman commented Feb 7, 2021

Ran into the same issue today when replacing an old USG3P with a new one. It looks like the USG3P is using 127.0.0.1 for DNS after (or already during) provisioning, which prevents it from connecting to the remote Cloud Key. I had to stop investigating due to the COVID curfew and temporarily put in an Experia Box. I will probably place a local Cloud Key to resolve this for the client. I am not sure if I will be able to do further testing, but maybe my noticing the DNS change will trigger some clever thinking.

@StreborStrebor

Is there by any chance a fix for this already?

My Cloud Key is in my remote office.
My home connection with a USG, provisioned by the Cloud Key in the office, needs to be able to pick up a config.gateway.json (for KPN IPTV).

@tariklehaine

I am also very curious whether this has been resolved, since I am running my UniFi Controller in Azure.

@sAnexeh

sAnexeh commented Dec 5, 2021

I've tried a couple of things but haven't been able to fix the core issue yet. While debugging I did create a local workaround that will probably suit nobody but myself, but I'd like to mention it anyway.

I created a site on a local webserver with a reverse proxy towards my internet controller (http://lan-ip:8080 forwards everything to http://controller-ip:8080). I then created a cron job that invokes an "mca-cli-op info" command via SSH on the USG every 5 minutes. As soon as the status Unknown[11] is detected, it sends an "mca-cli-op set-inform http://lan-ip:8080/inform" to the USG. The USG then continues provisioning and reports a healthy "Connected" in the controller.
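For anyone who wants to try the same watchdog, here is a minimal Python sketch of the check described above. It is not the script I actually run: the SSH target (`admin@192.168.1.1`) and key-based login are assumptions, `lan-ip` is the same placeholder as above, and the failure is detected by a plain substring match on the output of "mca-cli-op info".

```python
import subprocess
from typing import Callable

# Assumptions (not from the thread): SSH key login for this user and host.
USG_SSH_TARGET = "admin@192.168.1.1"
# Placeholder URL of the local reverse proxy, as in the description above.
LOCAL_INFORM_URL = "http://lan-ip:8080/inform"
FAILURE_MARKER = "Unknown[11]"


def run_on_usg(command: str) -> str:
    """Run one command on the USG over SSH and return its stdout."""
    result = subprocess.run(
        ["ssh", USG_SSH_TARGET, command],
        capture_output=True, text=True, timeout=30, check=False,
    )
    return result.stdout


def needs_reinform(info_output: str) -> bool:
    """True when the info output contains the failure status."""
    return FAILURE_MARKER in info_output


def check_once(run: Callable[[str], str] = run_on_usg) -> bool:
    """Check the status once; re-send set-inform if the USG is stuck.

    Returns True when a set-inform was sent.
    """
    if needs_reinform(run("mca-cli-op info")):
        run(f"mca-cli-op set-inform {LOCAL_INFORM_URL}")
        return True
    return False
```

A cron entry would then call `check_once()` every 5 minutes. The `run` parameter only exists so the logic can be tried without a real USG.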

@sAnexeh

sAnexeh commented Mar 27, 2023

So, after some time had passed, I tried to resolve this issue with the help of Ubiquiti Support. Unfortunately they are unwilling to help because I am using a custom config with config.gateway.json, which according to them is unsupported.

It's not a DNS issue, not a routing issue, and it doesn't seem to be an MSS clamping issue. I can see the USG doing a POST on my remote controller. I can see the headers and the x-binary data. The remote controller responds with a 200 OK, but the set-inform fails with Unknown[11].

Because running a cron job every 5 minutes to check "mca-cli-op info" seems a bit over the top, I decided to take a different approach. Since I'm not using the USG as DNS on my LAN, I decided to change the /etc/hosts entry on the USG for my remote controller, via config.gateway.json, so that its hostname points to my LAN proxy. I added the following part (where remote-controller.tld gets the LAN IP of the webserver that proxies the traffic to the remote controller):

"system": {
    "static-host-mapping": {
        "host-name": {
            "remote-controller.tld": {
                "inet": ["192.168.178.2"]
            }
        }
    },
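Since the fragment above has to be merged into whatever is already in your config.gateway.json, a small Python sketch of that merge may help. The helper name `add_static_host` and the `existing` dictionary are hypothetical and only for illustration; a real config for this setup contains far more.

```python
import json
from copy import deepcopy


def add_static_host(config: dict, hostname: str, lan_ip: str) -> dict:
    """Return a copy of config with a static-host-mapping entry added."""
    merged = deepcopy(config)
    hosts = (
        merged.setdefault("system", {})
        .setdefault("static-host-mapping", {})
        .setdefault("host-name", {})
    )
    hosts[hostname] = {"inet": [lan_ip]}
    return merged


# Hypothetical stand-in for the parsed contents of config.gateway.json.
existing = {"interfaces": {}}
updated = add_static_host(existing, "remote-controller.tld", "192.168.178.2")
print(json.dumps(updated, indent=2))
```

The helper leaves existing keys under "system" alone and works on a copy, so the original dictionary is not modified.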

I'm using the remote-controller.tld hostname that is configured in the controller (Settings -> System -> Advanced -> Inform Host -> Override: remote-controller.tld). It won't work with a direct IP address, as we can't remap that through /etc/hosts. I'm using a simple LAN webserver running on port 8080 that uses mod_rewrite to proxy incoming traffic to my remote controller. The content of the .htaccess:

RewriteEngine on
RewriteRule ^(.*)$ http://remote-controller.tld:8080/$1 [P]

The USG will do the set-inform to http://remote-controller.tld:8080/inform (which, because of the edited entry in /etc/hosts, is actually the self-hosted webserver on the LAN). Because the webserver is not using the USG as DNS, it resolves remote-controller.tld to the actual IP of the remote controller. In my case, the set-inform then succeeds. That's all.
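If the LAN device cannot run Apache with mod_rewrite, the same forwarding could probably be done with a few lines of Python. This is an untested-on-a-USG sketch of a swapped-in technique, not what I run myself: it only forwards POST requests, copies just the Content-Type and User-Agent headers, and assumes the host running it resolves remote-controller.tld through a DNS server other than the USG.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Assumption: on this host the name resolves to the real remote controller.
UPSTREAM = "http://remote-controller.tld:8080"

# Connect directly, ignoring any proxy settings from the environment.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class InformProxy(BaseHTTPRequestHandler):
    upstream = UPSTREAM

    def do_POST(self):
        # Read the request body sent by the USG.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        headers = {
            name: value for name, value in self.headers.items()
            if name.lower() in ("content-type", "user-agent")
        }
        request = urllib.request.Request(
            self.upstream + self.path, data=body, headers=headers, method="POST"
        )
        try:
            with OPENER.open(request, timeout=30) as response:
                status, payload = response.status, response.read()
                content_type = response.headers.get("Content-Type")
        except urllib.error.HTTPError as error:
            # Pass upstream error responses through unchanged.
            status, payload = error.code, error.read()
            content_type = error.headers.get("Content-Type")
        except OSError:
            # Upstream unreachable or timed out.
            status, payload, content_type = 502, b"", None
        self.send_response(status)
        if content_type:
            self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Listen on all interfaces and forward POSTs until interrupted."""
    ThreadingHTTPServer(("", port), InformProxy).serve_forever()
```

Calling `serve()` starts the proxy on port 8080, matching the port used in the inform URL above.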

I'm aware this still won't fix it for those who don't have a local webserver, but in my case this works, as the sites I'm using in my controller all have at least some sort of LAN device (a QNAP NAS, for example) that is able to run a webserver. If you are using the USG as primary DNS server there are still enough options to get it working, but it might take a bit more effort.


8 participants