Rancher: Error with pre-create check: "unexpected end of JSON input" #20

Open
jmondragon opened this issue Dec 29, 2019 · 15 comments
Labels
help wanted Extra attention is needed question Further information is requested

Comments

@jmondragon

When trying the v2 binary from within Rancher, I get the following errors:

Error with pre-create check: "unexpected end of JSON input"

and

Timeout waiting for ssh key

These errors occur while provisioning nodes for the cluster. I'm not sure what troubleshooting steps to perform next. I will try to provision via the CLI as outlined in the README.

@lnxbil
Owner

lnxbil commented Dec 29, 2019

Hi,

Can you try the latest prerelease version? It has improved error reporting, was tested extensively with RancherOS, and may indicate what the problem is.

@jmondragon
Author

I did end up trying the latest, and it gave more information. It gave me the following error:

Flag provided but not defined: -proxmoxve-storage-type;

The flag named in the error seems to be chosen at random (e.g. -proxmoxve-user, -proxmoxve-storage, etc.).

I went back to v2 and changed my HOST from the IP address to a hostname, and it seemed to work better. I wonder if it has something to do with the . in IP addresses or FQDN.

I still ended up having trouble connecting via SSH, but the VM did get created. I will continue to troubleshoot.

Thanks,
Josh

@CarlosLVar

CarlosLVar commented Jan 2, 2020

Hello there,

We too have encountered the same problem when using v2. We will try providing a hostname instead of an IP as well.

We are also experiencing the other problem with v3, though I reckon that should probably go in a separate issue. The logs indicate only that the mentioned flag is empty whereas the others aren't (and the host doesn't change if I enter an IP; it takes the default 192...). We would have to check with a hostname as well.

Thank you for your work!

@ropeguru

ropeguru commented Jan 7, 2020

I'm testing v3pre3, and every attempt to provision a node gives me a different error, such as:

Flag provided but not defined: -proxmoxve-disksize-gb; Timeout waiting for ssh key

Each run, or even each node being deployed, gives an error like this. Sometimes it is -proxmoxve-disksize-gb, sometimes -proxmoxve-guest-ssh-port or -proxmoxve-storage. It seems to just pick a random defined entry to fail on.

Really looking forward to getting this working.

Edit: I originally tried the v2 and also had issues. Running Rancher v2.3.3 and Proxmox 6.1-5

@ropeguru

ropeguru commented Jan 7, 2020

Adding some additional info.

It appears that when Rancher reads the template to gather defined values, something is going wrong.

For instance, I have "NFS-Datastore" defined for the storage location. In the log below, it is showing as "local". Also I have the disk size configured for 50GB, but the entry below is 16Gb. And the list goes on.

Debugging the Rancher container, I find the following in the log (passwords have been changed):

2020/01/07 18:35:04 [DEBUG] create cmd [create -d proxmoxve --engine-install-url https://releases.rancher.com/install-docker/19.03.sh --proxmoxve-user root --proxmoxve-driver-debug --proxmoxve-image-file NFS-Datastore:iso/rancheros.iso --proxmoxve-password password --proxmoxve-disksize-gb 50 --proxmoxve-guest-ssh-port 22 --proxmoxve-memory-gb 8 --proxmoxve-storage-type qcow2 --proxmoxve-guest-username docker --proxmoxve-host 192.168.1.171 --proxmoxve-realm pam --proxmoxve-storage NFS-Datastore --proxmoxve-guest-password 123456]
2020/01/07 18:35:04 [INFO] Provisioning node ranchnode1
2020/01/07 18:35:04 [DEBUG] stdout: Incorrect Usage.
2020/01/07 18:35:04 [INFO] [node-controller-docker-machine] Incorrect Usage.

So something in the create cmd under Rancher is not correct. The values passed in the first entry are what I have in my template.

Edited for more constructive info.

Another edit: I found the issue by manually running the create command in my Rancher container.

It seems that v3-pre3 still uses options of the form --proxmox-, but the options generated from a Rancher template are of the form --proxmoxve-. That is why it is failing.

If I manually run a machine-create in the Rancher container using the v3-pre3 driver, I still get the json error, but it does connect and pull the next valid ID.

Hope this helps.
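
The prefix mismatch described in this comment can be checked mechanically. The following is a minimal Python sketch, not part of the driver or of Rancher: it pulls the long options out of a logged create command and reports the driver-specific ones that are missing from the set of flags a given driver build accepts. `DRIVER_FLAGS` is a hypothetical stand-in written in the `--proxmox-` form; the actual set would have to be taken from the driver build in use.

```python
from __future__ import annotations

import shlex

# Hypothetical flag names in the "--proxmox-" form; the real set depends on the driver build.
DRIVER_FLAGS = {"--proxmox-host", "--proxmox-user", "--proxmox-storage"}


def extract_flags(create_cmd: str) -> list[str]:
    """Return every token of the command line that looks like a long option."""
    return [tok for tok in shlex.split(create_cmd) if tok.startswith("--")]


def undefined_flags(create_cmd: str, driver_flags: set[str],
                    prefix: str = "--proxmox") -> list[str]:
    """Return driver-specific options in the command that the driver does not define."""
    return [flag for flag in extract_flags(create_cmd)
            if flag.startswith(prefix) and flag not in driver_flags]


if __name__ == "__main__":
    # Shortened version of the create command from the Rancher debug log.
    cmd = ("create -d proxmoxve --proxmoxve-user root "
           "--proxmoxve-host 192.168.1.171 --proxmoxve-storage NFS-Datastore")
    print(undefined_flags(cmd, DRIVER_FLAGS))
```

Any flag this reports would be rejected by a driver that only knows the other prefix, which matches the "Flag provided but not defined" errors seen earlier in the thread.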

@lnxbil
Owner

lnxbil commented Jan 23, 2020

New version with a lot of fixes and merged PR. Please try again and report back.

@jmondragon
Author

jmondragon commented Jan 23, 2020

Thank you, I'll give it a try. I don't see the docker-machine-driver-proxmoxve.linux-amd64 (on the v3 release page), only zip and tar.gz files. Can I use the link to the tar.gz within Rancher?

@lnxbil
Owner

lnxbil commented Jan 23, 2020

> Thank you, I'll give it a try. I don't see the docker-machine-driver-proxmoxve.linux-amd64 (on the v3 release page), only zip and tar.gz files. Can I use the link to the tar.gz within Rancher?

Ah sorry, I clicked the "published" button too soon. The binaries weren't completely uploaded yet. I reuploaded them and now they should be present.

@jmondragon
Author

This worked great! I did have to use the latest 1.5.5 rancheros-proxmoxve-autoformat.iso as outlined in the README.

@cyrus104

cyrus104 commented Apr 25, 2020

I am using V3 and am getting this error.

I currently have RancherOS running in one VM on a 3-node Proxmox cluster. Inside RancherOS, Docker is running Rancher, which has this driver loaded. When I try to create a new cluster with a node, I get this exact same issue.

@mjkl-gh

mjkl-gh commented Jun 7, 2020

I'm running into this issue as well on V3 using both the latest (1.5.6) and 1.5.5 image. I have no idea where to begin debugging this or which information to provide.

I've tried this from a Windows 10 laptop (inside Ubuntu 20.04 on WSL1) and from regular Ubuntu 20.04, both with the same result (on the same PVE node).

edit:

Just noticed that go-resty is returning a "596 tls_process_server_certificate: certificate verify failed", after which the proxmoxve driver returns "unexpected end of JSON input".

This is weird since I set up my CA correctly for my hostname (using Let's Encrypt). It returns the same error when using the IP address instead of the hostname.
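
One plausible reading of this sequence, offered as an assumption rather than a confirmed diagnosis: when the request fails (here with status 596), the response carries no JSON body, and decoding that empty body is what produces the JSON error. Go's `encoding/json` reports empty input as "unexpected end of JSON input". The Python sketch below shows the general pattern of checking the status and body before decoding, so that the underlying failure is reported instead of a parse error. The function name, exception class and error wording are illustrative and not taken from the driver.

```python
import json


class ApiError(Exception):
    """Raised when a response cannot be used as a JSON API result."""


def decode_api_response(status: int, body: bytes):
    """Decode a JSON response, surfacing HTTP and empty-body failures explicitly."""
    if not 200 <= status < 300:
        # Report the HTTP failure itself rather than a later parse error.
        text = body.decode("utf-8", errors="replace").strip()
        raise ApiError(f"HTTP status {status}: {text or 'no response body'}")
    if not body.strip():
        raise ApiError("empty response body; nothing to decode")
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise ApiError(f"response is not valid JSON: {exc}") from exc
```

With a check like this, a certificate or connection problem would show up under its own message, which is easier to act on than a JSON error.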

@lnxbil
Owner

lnxbil commented Jul 29, 2020

Can you please test with v4? I could not reproduce the issue even with v3.

@mjkl-gh

mjkl-gh commented Jul 30, 2020

Aiaiai. I must unfortunately admit that I remember fixing this issue. Working from memory, it was something as stupid as using the wrong PVE node or misspelling a hostname. I will try to get back to you on this!

@lnxbil lnxbil added help wanted Extra attention is needed question Further information is requested labels Sep 30, 2020
@cedvan

cedvan commented Oct 22, 2021

Same problem with v4.

I tested a manual deploy with docker-machine using the same parameters, and it works!
But from the Rancher UI it fails with "Timeout waiting for ssh key" and no other information to help debug 😢

@cedvan

cedvan commented Oct 22, 2021

Hmm, OK, the problem was DNS resolution: using the IP as the host works. It's sad, but it works...

Be careful: during my tests, VM creation failed but still left residual vm-disks behind. Be sure to check for them and delete them manually if necessary. Otherwise you will get a 500 error on the next creation because the vm-disk already exists.
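
Since the failure in this comment came down to name resolution, a quick check from the environment that runs the driver (for example, inside the Rancher container) can confirm whether the configured host resolves there. This is a minimal Python sketch using only the standard library. The default port 8006 is the Proxmox VE web API port; any hostname passed in is whatever was entered in the node template.

```python
from __future__ import annotations

import socket


def resolve_host(host: str, port: int = 8006) -> list[str]:
    """Return the addresses the host resolves to, or an empty list if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_host` with the template's hostname returns an empty list when the name cannot be resolved from that environment, while an IP literal such as `192.168.1.171` is returned unchanged because no DNS lookup is needed.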


7 participants