Make use of params command optional for phone-home #78

Closed
geekgonecrazy opened this issue Sep 6, 2020 · 16 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@geekgonecrazy

geekgonecrazy commented Sep 6, 2020

I'm working on getting my physical servers running with Tinkerbell so I can start experimenting. They fail to boot, saying the “params” command is not found.

The servers are leased from a datacenter, so flashing or updating iPXE might not be possible or ideal.

Expected Behaviour

Boots

Current Behaviour

Errors out at params not found

Possible Solution

Maybe a hardware option? Or use a query string instead.

Steps to Reproduce (for bugs)

  1. Run an iPXE build without the params command compiled in
  2. Try to boot

Context

My solution was just to remove the params here:
https://github.com/tinkerbell/boots/blob/master/ipxe/script.go#L34

Then I just let it phone home even without params. It seems like using ?body=${body}&type=${body} might work. I’m also not even sure the params are needed. Looking at the phone-home code, it seems like they are mostly ignored.
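A minimal Go sketch of the query-string idea, assuming the phone-home handler would read `body` and `type` from the URL (the thread does not confirm this). The function name and the example values are hypothetical; `${tinkerbell}` is left untouched so iPXE can expand it at boot time.

```go
package main

import (
	"fmt"
	"net/url"
)

// phoneHomeQueryLines returns iPXE script lines that call phone-home with the
// values in the query string instead of an iPXE params block.
// Assumption: the server accepts "body" and "type" as query parameters.
func phoneHomeQueryLines(body, typ string) []string {
	q := url.Values{}
	q.Set("body", body)
	q.Set("type", typ)
	return []string{
		// Only the query string is encoded; ${tinkerbell} stays an iPXE variable.
		"imgfetch ${tinkerbell}/phone-home?" + q.Encode(),
		"imgfree",
	}
}

func main() {
	// Example values are placeholders, not taken from boots.
	for _, line := range phoneHomeQueryLines("Device connected", "provisioning.104.01") {
		fmt.Println(line)
	}
}
```

This avoids the params command entirely, so it would not depend on the iPXE build having PARAM_CMD enabled.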

Your Environment

  • Operating System and version (e.g. Linux, Windows, MacOS):

  • How are you running Tinkerbell? Using Vagrant & VirtualBox, Vagrant & Libvirt, on Packet using Terraform, or give details: Tinkerbell itself runs under KVM on the network

  • Link to your project or a code example to reproduce issue:

@gianarb
Contributor

gianarb commented Sep 7, 2020

I encountered the same issue when trying to use osie locally. The problem here is that osie init.sh fails if phone-home does not get resolved. This should not happen but it happens now because osie still uses a lot of things that only Packet needs.

@geekgonecrazy I know from Slack that you are doing something a bit different here because your cloud provider (OVH) does not give you full control. But technically, phone-home is a feature that Tinkerbell provides.

You should set this param in your ipxe script:

set tinkerbell http://127.0.0.1

And pass this kernel param: systemd.setenv=phone_home_url=${tinkerbell}/phone-home. Here is an example we use for testing purposes in a unit test:

https://github.com/tinkerbell/boots/blob/9d6cded511d1e7678ec601a29aac4137f25ecb5a/installers/coreos/ipxe_script_test.go#L47

@geekgonecrazy
Author

I can see some potential use cases for the phone home in general.

I tried using "params" even from the shell, and the command is definitely not present in the iPXE version compiled for this server. According to https://ipxe.org/cmd/params, iPXE has to be compiled with the PARAM_CMD option defined.

I had some issues with tftp as well I think that you might be referring to. No clue if related to this or not.

But in this case it loaded the iPXE script fine and then errored out when trying to run the params command. As soon as I took that section out and changed the code to:

imgfetch ${tinkerbell}/phone-home
imgfree

It worked fine

@gianarb
Contributor

gianarb commented Sep 7, 2020

I think you can avoid params if your provider does not support it: you can manually substitute the concrete values wherever you see ${}.

Anyway, I think in 2 months we will have a very different osie image, smaller and easier to customize.

I am going into unexplored land here, but technically you can replace osie with something like your own Alpine image, and as long as the architecture you are using is a supported one, something should work. But as I said, it is not as easy as it sounds from just reading my comment :)

@geekgonecrazy
Author

geekgonecrazy commented Sep 7, 2020

I think you may have misunderstood my problem.

All of the other ${} substitutions and variables work just like intended. The values come from boots when it loads auto.ipxe from boots like normal.

By params command I mean the literal ipxe params command used here on this line: https://github.com/tinkerbell/boots/blob/master/ipxe/script.go#L34

All other ${} substitutions and variables work fine. It’s simply that block:

https://github.com/tinkerbell/boots/blob/master/ipxe/script.go#L34-L36

If the version of iPXE that is running doesn’t have that option compiled in, you cannot define a params block and pass it auto-magically using ##params.

Changing:

https://github.com/tinkerbell/boots/blob/master/ipxe/script.go#L34-L38

To:

imgfetch ${tinkerbell}/phone-home
imgfree

Was literally the only change I made. The rest of the ipxe script executes and all variables are populated from boots
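To make the request concrete, here is a rough Go sketch of what an optional params block could look like when rendering the script. It is not the boots implementation: `phoneHomeScript` and the `useParams` switch are hypothetical, and the params block is a simplified stand-in for the lines linked above.

```go
package main

import (
	"fmt"
	"strings"
)

// phoneHomeScript renders the phone-home part of an iPXE script.
// useParams is a hypothetical switch (for example, a per-hardware option);
// it is not an existing boots setting.
func phoneHomeScript(useParams bool, typ string) string {
	var b strings.Builder
	if useParams {
		// This branch needs an iPXE build with PARAM_CMD enabled.
		b.WriteString("params\n")
		b.WriteString("param type " + typ + "\n")
		b.WriteString("imgfetch ${tinkerbell}/phone-home##params\n")
	} else {
		// Plain request without a params block, as in the workaround above.
		b.WriteString("imgfetch ${tinkerbell}/phone-home\n")
	}
	b.WriteString("imgfree\n")
	return b.String()
}

func main() {
	// The type value is a placeholder for illustration.
	fmt.Print(phoneHomeScript(false, "provisioning.104.01"))
}
```

With `useParams` set to false, the output is exactly the two-line replacement shown above.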

@geekgonecrazy geekgonecrazy changed the title Optional use of params for phone-home Make use of params command optional for phone-home Sep 8, 2020
@geekgonecrazy
Author

geekgonecrazy commented Sep 8, 2020

This better explains the full setup and patch: https://geekgonecrazy.com/2020/09/07/tinkerbell-or-ipxe-boot-on-ovh/

@mmlb
Contributor

mmlb commented Sep 8, 2020

Hey @geekgonecrazy in Packet we always boot our own version of ipxe exactly to avoid this situation (iPXE built without some things we need). I think removing the params would be fine in general, but is just a symptom of a bigger issue. Can you chain your own ipxe binary? That would be the most compatible and future-proof solution.

@mmlb
Contributor

mmlb commented Sep 8, 2020

@geekgonecrazy you can do this with something like chain tftp://${tinkerbell}/one_of_the_ipxe_files, where one_of_the_ipxe_files is one of the files listed at https://github.com/tinkerbell/boots/blob/master/job/dhcp.go#L105-L113

@geekgonecrazy
Author

@mmlb ah, interesting idea.

The downside is that I would have to somehow make it check only net1 and not net0, at least in my case, since their server always replies on net0 with my script (or theirs if I don't have one set in their API).

So I would have to either make the chained iPXE version check only net1, or have the script detect whether it is already running in my version of iPXE and chain to boots instead of to another iPXE binary.

Something like:

if version_with_params
chain http://${tinkerbell}/auto.ipxe
else
chain tftp://${tinkerbell}/ipxe-version
fi

@geekgonecrazy
Author

geekgonecrazy commented Sep 8, 2020

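#!ipxe
# Use only the second NIC; the provider's server answers on net0.
ifclose net0
dhcp net1
set iface net1

# If this iPXE build lacks the params command, fall back to chaining a full build.
params || goto load_ipxe
chain --autofree http://10.10.5.1/auto.ipxe || exit

:load_ipxe
chain tftp://10.10.5.1/ipxe.efi || shell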

Actually, I think this will work, if only I can get ipxe.efi to boot. It just hangs there indefinitely.

Just for the heck of it, I tried the others too. Those error out with "Exec format error".

@mmlb
Contributor

mmlb commented Sep 9, 2020

@geekgonecrazy glad that works. We do have a patched version number which you can use to branch off of too. It's used to tell iPXEs burned into NICs apart from our own iPXE. It's used in boots' DHCP handling code, but I'm not sure whether iPXE sends the version info when fetching the iPXE script.

@geekgonecrazy
Author

Well, it works as far as the logic goes. But actually booting from the iPXE versions shipped does not. I left it for many hours and it just hangs there.

So we might need to fork off into another issue for this specifically. But none of the iPXE binaries work from here. I'm not sure if there is any way to debug this, but it stays on:

[image: screenshot of the screen where boot hangs]

@geekgonecrazy
Author

geekgonecrazy commented Sep 9, 2020

I tried out the various builds from https://boot.ipxe.org

https://boot.ipxe.org/snponly.efi - this one finally worked. So boots would need to add that one and determine what criteria to use for offering it up to boot.

From what I'm reading (https://ipxe.org/appnote/buildtargets), snponly.efi only tries to boot from the specific NIC that was used to chain it, which would also be a big win here.

According to lshw, the NIC is:

[image: lshw output for the NIC]

I'm just not sure whether metadata is needed in the hardware JSON to add this target and have it selected, or whether there is a way to detect and select it automatically.

Maybe something like snpOnly:true on the interface in the hardware JSON?

Alternatively, https://theforeman.org/2019/06/install-esxi-through-foreman-using-ipxe-bootstrapping.html shows an example of a typical DHCP config that returns a different boot file if architecture == 00:07. It looks like boots is not looking at that at all, so it might be easier to add an option to the hardware JSON definition.
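The two selection ideas above (a per-interface flag and the DHCP client architecture) could be combined roughly as in this Go sketch. It is an illustration under assumptions, not boots code: `bootFile` and the `snpOnly` argument are hypothetical, `undionly.kpxe` is assumed as the BIOS file name, and the architecture codes are the DHCP option 93 values from RFC 4578.

```go
package main

import "fmt"

// DHCP option 93 client architecture codes from RFC 4578.
const (
	archX86BIOS = 0 // Intel x86PC
	archEFIIA32 = 6 // EFI IA32
	archEFIBC   = 7 // EFI BC (the 00:07 value in the foreman example)
	archEFIX64  = 9 // EFI x86-64
)

// bootFile picks an iPXE binary to offer to a PXE client.
// snpOnly stands in for the proposed per-interface flag in the hardware JSON.
// An empty result means this sketch has no binary for that architecture.
func bootFile(arch uint16, snpOnly bool) string {
	switch arch {
	case archEFIBC, archEFIX64:
		if snpOnly {
			return "snponly.efi"
		}
		return "ipxe.efi"
	case archX86BIOS:
		// Assumed file name for legacy BIOS clients.
		return "undionly.kpxe"
	default:
		return ""
	}
}

func main() {
	fmt.Println(bootFile(archEFIBC, true))
	fmt.Println(bootFile(archX86BIOS, false))
}
```

Keeping the flag as an explicit input means the architecture check alone never switches a machine to snponly.efi; that decision stays in the hardware definition.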

@mmlb
Contributor

mmlb commented Sep 10, 2020

I think it's time to fork off into a different issue. snponly looks like something we'd want all the time anyway, to be honest.

@mmlb
Contributor

mmlb commented Sep 10, 2020

Also looking at the foreman config those would map to https://github.com/tinkerbell/boots/blob/master/dhcp/pxe.go#L23-L26 iirc. I don't know what the EBC architecture is though :D.

@mmlb
Contributor

mmlb commented Sep 10, 2020

DDG points me to https://github.com/jljusten/tianocore/wiki/EBC-FAQ and similar, so it must be true.

geekgonecrazy added a commit to geekgonecrazy/boots that referenced this issue Feb 1, 2021
This is needed because SNP is the only ipxe build that will run on x64 EFI.

Specifically most of the BestValue / Infra-3 etc at OVH only work with SNP.

More details can be found in tinkerbell#78 and tinkerbell#79

Signed-off-by: Aaron Ogle <aaron@geekgonecrazy.com>
@tstromberg tstromberg added kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Jul 27, 2021
@tstromberg tstromberg added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Nov 16, 2021
@jacobweinstock
Member

jacobweinstock commented Dec 24, 2022

I believe with snp.efi this issue is resolved? @geekgonecrazy, please reopen if this is not the case.


5 participants