after syscloning image, booting to image fails and boots to shell #5673

Closed
talandt opened this issue Sep 27, 2018 · 19 comments

Comments

@talandt

talandt commented Sep 27, 2018

MN = CentOS 7, node = CentOS 7
Following the sysclone / system imaging documentation on xCAT Read the Docs, what looked like a successful sysclone failed.
After running nodeset to point the node at the correct image and netbooting it, the node boots to a shell prompt:
PING ATTEMPT 1:
PING mgt.cluster (172.20.0.1) 56(84) bytes of data.
64 bytes from mgt.cluster (172.20.0.1): icmp_seq=1 ttl=64 time=0.125 ms

--- mgt.cluster ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.125/0.125/0.125/0.000 ms

We have connectivity to your SystemImager server!

get_scripts_directory
rsync -a mgt::scripts/ /scripts/
/etc/init.d/functions: line 301: rsync: command not found
Last command exited with 127
Killing off running processes.

writsh: no job control in this shell
sh-4.2# e_variables
cat: /etc/issue: No such file or directory
[ 102.185906] random: crng init done

sh-4.2#

Any suggestions on what may have transpired?
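
Exit status 127 is what a POSIX shell returns when a command cannot be found, so the log above indicates that the boot environment had no `rsync` on its `PATH`. A minimal sketch for confirming this from the `sh-4.2#` prompt; the helper name `have_cmd` is hypothetical and not part of xCAT or SystemImager:

```shell
# Hypothetical helper: succeed only if the named command can be resolved
# by the current shell (on PATH, a builtin, or a function).
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether rsync is available in this environment.
if have_cmd rsync; then
    echo "rsync found at: $(command -v rsync)"
else
    echo "rsync is missing from PATH=$PATH"
fi
```

If the second branch is taken, the problem lies in what was packaged into the boot image rather than in the network or the server side.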

@talandt
Author

talandt commented Sep 27, 2018

xcat version = 2.14.1

@hu-weihua

hu-weihua commented Sep 28, 2018

@talandt, thanks for using sysclone. sysclone was developed a few years ago. Its framework can be used for different kinds of nodes, but each scenario needs its own specific code. All supported scenarios are listed here, and all development was done on physical machines; virtual machines are NOT supported. So before using sysclone, please make sure it supports your scenario. Because the xCAT team has limited resources, we won't assign anyone to develop this feature further unless we get a business justification. xCAT is open source software, so if you would like to contribute to this feature, you are very welcome.

But your issue is strange: rsync: command not found.
xCAT ships a customized kernel and initrd to support sysclone, and rsync is packaged into them.

#pwd
.../xcat-core/xCAT-genesis-builder

# cat install | grep rsync
..... mkdosfs parted rsync shutdown sort ssh-keygen tr blockdev findfs insmod kexec lvm mdadm mke2fs pivot_root sshd swapon tune2fs mkreiserfs reiserfstune  pvcreate lvremove vgremove vgcreate  lvcreate  lvscan  lvchange vgchange pvdisplay lvdisplay vgdisplay blkid dmsetup sfdisk # for sysclone

So could you run the commands below on your MN and send me the results? Thanks.

#  rpm -aq |grep -i xcat-genesis-base

#  rpm -ql `rpm -aq |grep -i xcat-genesis-base-ppc64` | grep -i rsync

Then enter genesis (when installation of the target node fails, enter the shell on the target node) and run:

sh-4.2#export

@hu-weihua hu-weihua self-assigned this Sep 28, 2018
@talandt
Author

talandt commented Sep 28, 2018

# rpm -aq |grep -i xcat-genesis-base
xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch

# rpm -ql `rpm -aq |grep -i xcat-genesis-base-ppc64` | grep -i rsync
rpm: no arguments given for query

sh-4.2# export
export BOOTIF="b4:96:91:14:03:c6"
export DEBUG_MEM_LEVEL="0"
export DEVICE="eth0"
export DRACUT_QUIET="yes"
export HOME="/"
export NEWROOT="/sysroot"
export NICSTOBRINGUP=" eth1"
export OLDPWD
export PATH="/sbin:/bin:/usr/bin:/usr/sbin:/tmp"
export PUBKEY="MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDJUubftYoXI7PUj4JKZUdYdh+52UVKUb+JImUN/Z9fjGFEiaZHeO1gZKIN4kFK0Dp8WMk2WUXTASMKGmjpM1DSrjUQ261ApvhzEFIBFkBgP89MNDp/4/lp1SApJTMYcZVh0GT6UQpD5PlKP1d7e0aJaXDv9Va32n52mbZeTn40DQIDAQAB"
export PWD="/"
export RD_DEBUG="no"
export SCRIPTNAME="centos7.4-x86_64-install-compute-nvidia"
export SHELL="/bin/sh"
export SHLVL="3"
export STY="559.console.(none)"
export TERM="screen"
export TERMCAP="SC|screen|VT 100/ANSI X3.64 virtual terminal:\:DO=\E[%dB:LE=\E[%dD:RI=\E[%dC:UP=\E[%dA:bs:bt=\E[Z:\:cd=\E[J:ce=\E[K:cl=\E[H\E[J:cm=\E[%i%d;%dH:ct=\E[3g:\:do=^J:nd=\E[C:pt:rc=\E8:rs=\Ec:sc=\E7:st=\EH:up=\EM:\:le=^H:bl=^G:cr=^M:it#8:ho=\E[H:nw=\EE:ta=^I:is=\E)0:\:li#24:co#80:am:xn:xv:LP:sr=\EM:al=\E[L:AL=\E[%dL:\:cs=\E[%i%d;%dr:dl=\E[M:DL=\E[%dM:dc=\E[P:DC=\E[%dP:\:im=\E[4h:ei=\E[4l:mi:IC=\E[%d@:ks=\E[?1h\E=:\:ke=\E[?1l\E>:vi=\E[?25l:ve=\E[34h\E[?25h:vs=\E[34l:\:ti=\E[?1049h:te=\E[?1049l:us=\E[4m:ue=\E[24m:so=\E[3m:\:se=\E[23m:mb=\E[5m:md=\E[1m:mh=\E[2m:mr=\E[7m:\:me=\E[m:ms:\
Co#8:pa#64:AF=\E[3%dm:AB=\E[4%dm:op=\E[39;49m:AX:\:vb=\Eg:as=\E(0:ae=\E(B:\:ac=\140\140aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~..--++,,hhII00:\:Km=\E[M:k0=\E[10~:k1=\EOP:k2=\EOQ:k3=\EOR:k4=\EOS:\:k5=\E[15~:k6=\E[17~:k7=\E[18~:k8=\E[19~:k9=\E[20~:\:k;=\E[21~:F1=\E[23~:F2=\E[24~:F3=\E[25~:F4=\E[26~:\:F5=\E[28~:F6=\E[29~:F7=\E[31~:F8=\E[32~:F9=\E[33~:\:FA=\E[34~:kb:K2=\E[G:kB=\E[Z:kh=\E[1~:@1=\E[1~:\:kH=\E[4~:@7=\E[4~:kN=\E[6~:kP=\E[5~:kI=\E[2~:kD=\E[3~:\:ku=\EOA:kd=\EOB:kr=\EOC:kl=\EOD:"
export UDEVRULESD="/run/udev/rules.d"
export UDEVVERSION="219"
export WINDOW="0"
export XCAT="mgt:3001"
export XCATMASTER="mgt"
export XCATPORT="3001"
export hookdir="/lib/dracut/hooks"
export ksdevice="enp11s0f0"
export ramdisk_size="200000"
export xcatd="mgt:3001"
sh-4.2#

@talandt
Author

talandt commented Sep 28, 2018

The export output above was taken from a nodeconsole session.
The following is from an ssh session:
[xCAT Genesis running on n1 /]# export
declare -x HOME="/"
declare -x LOGNAME="root"
declare -x MAIL="/var/mail/root"
declare -x OLDPWD
declare -x PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
declare -x PWD="/"
declare -x SHELL="/bin/bash"
declare -x SHLVL="1"
declare -x SSH_CLIENT="172.20.0.1 39752 22"
declare -x SSH_CONNECTION="172.20.0.1 39752 172.20.101.1 22"
declare -x SSH_TTY="/dev/pts/2"
declare -x TERM="xterm"
declare -x USER="root"

@talandt
Author

talandt commented Sep 28, 2018

What is more interesting is that if I attempt to install the node with my original diskful image, which was created from copycds, it fails the same way. It seems the xCAT MN is messed up and the installs are trying to access the xCAT MN as if it were a SystemImager server. Is there a way to shut this off on the MN so I can get back to normal?

@hu-weihua

hu-weihua commented Sep 30, 2018

hi @talandt,

  • Your information is strange. In the first section the grep returned the x86_64 genesis, but in the second section you queried the ppc64 genesis. Could you tell me which arch you are using now? One more thing: was xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch built by yourself, or was it shipped by xCAT?
# rpm -aq |grep -i xcat-genesis-base
xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch

# rpm -ql `rpm -aq |grep -i xcat-genesis-base-ppc64` | grep -i rsync
rpm: no arguments given for query
  • sysclone and normal installation are two separate systems; they won't affect each other. Could you paste the node definition and the osimage definition you are using for the normal diskful installation? It would be best to paste the sysclone osimage definition as well.
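
The mismatch noted above comes from hardcoding `ppc64` in the query. One way to avoid it is to pass the architecture in explicitly. A sketch, assuming the genesis packages follow the `xCAT-genesis-base-<arch>` naming that appears in this thread; `genesis_base_pkgs` is a hypothetical helper that filters a package list on stdin, so it can be tried without `rpm` installed:

```shell
# Hypothetical helper: read a package list (one name per line, as printed
# by `rpm -qa`) on stdin and keep only genesis-base packages for arch $1.
genesis_base_pkgs() {
    grep -i "^xcat-genesis-base-$1-"
}

# Example with package names that appear in this thread.
printf '%s\n' \
    'xCAT-genesis-scripts-x86_64-2.14.3-snap201808210716.noarch' \
    'xCAT-genesis-base-ppc64-2.14-snap201804041553.noarch' \
    'xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch' |
    genesis_base_pkgs x86_64
```

On an x86_64 management node this could be fed from the real database with `rpm -qa | genesis_base_pkgs "$(uname -m)"`. On other architectures the `uname -m` string may not match the package suffix, so the argument would need to be given by hand.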

@hu-weihua

hu-weihua commented Sep 30, 2018

I noticed something you pasted on the mailing list. It seems nodeset failed to switch the osimage before you did the diskful installation. Could you try rinstall n1 osimage=centos7.4-x86_64-install-compute-nvidia for the normal diskful installation? After running this command, check whether the provmethod of n1 has switched to centos7.4-x86_64-install-compute-nvidia.

#  nodeset n1 osimage=centos7.4-x86_64-install-compute-nvidia

n1: sysclone centos7.4-x86_64

Shouldn’t nodeset represent osimage???

In the future, it is better to paste all information on GitHub (this page), so that everything can be tracked in one place. Thanks.

@robin2008
Member

@talandt
It seems you are not using the genesis base package shipped by xCAT, but xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch.

So you need to run rpm -ql xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch to list its contents. I think the key point is that there is no rsync in this package; you may need to get a new one from whoever delivered it to you.
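
The check suggested above can be scripted. A sketch, assuming that a packaged `rsync` binary would appear in the file list as a path whose last component is exactly `rsync`; `lists_rsync_binary` is a hypothetical helper that reads a file list on stdin in the form `rpm -ql` prints it, and the sample paths are made up for illustration:

```shell
# Hypothetical helper: succeed if a file list on stdin contains a path
# whose final component is exactly "rsync" (for example .../usr/bin/rsync).
lists_rsync_binary() {
    grep -q '/rsync$'
}

# Sample input: the first path should match, the Perl module should not.
printf '%s\n' '/some/prefix/usr/bin/rsync' '/some/prefix/URI/rsync.pm' |
    lists_rsync_binary && echo "rsync binary is packaged"
```

Against the real package this would be `rpm -ql xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch | lists_rsync_binary || echo "no rsync in package"`. Anchoring the pattern at the end of the line avoids false positives such as `rsync.pm`.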

@talandt
Author

talandt commented Oct 1, 2018 via email

@talandt
Author

talandt commented Oct 1, 2018

I noticed something you pasted on the mailing list. It seems nodeset failed to switch the osimage before you did the diskful installation. Could you try rinstall n1 osimage=centos7.4-x86_64-install-compute-nvidia for the normal diskful installation? After running this command, check whether the provmethod of n1 has switched to centos7.4-x86_64-install-compute-nvidia.

#  nodeset n1 osimage=centos7.4-x86_64-install-compute-nvidia

n1: sysclone centos7.4-x86_64

Shouldn’t nodeset represent osimage???

In the future, it is better to paste all information on GitHub (this page), so that everything can be tracked in one place. Thanks.

[root@e13502 ~]# rinstall n1 osimage=centos7.4-x86_64-install-compute-nvidia
Provision node(s): n1
[root@e13502 ~]# lsdef n1|grep provmethod
provmethod=centos7.4-x86_64-install-compute-nvidia

@talandt talandt closed this as completed Oct 1, 2018
@talandt talandt reopened this Oct 1, 2018
@talandt
Author

talandt commented Oct 1, 2018

Sorry, I was just following the directions from xCAT support and didn't realize they asked for the ppc64 grep until after I sent the requested results. I am using x86_64.

@talandt
Author

talandt commented Oct 1, 2018

I noticed something you pasted on the mailing list. It seems nodeset failed to switch the osimage before you did the diskful installation. Could you try rinstall n1 osimage=centos7.4-x86_64-install-compute-nvidia for the normal diskful installation? After running this command, check whether the provmethod of n1 has switched to centos7.4-x86_64-install-compute-nvidia.

#  nodeset n1 osimage=centos7.4-x86_64-install-compute-nvidia

n1: sysclone centos7.4-x86_64

Shouldn’t nodeset represent osimage???

In the future, it is better to paste all information on GitHub (this page), so that everything can be tracked in one place. Thanks.

Ever since I tried sysclone, the node fails the original diskful install. Here are the latest results.

# rinstall n1 osimage=centos7.4-x86_64-install-compute-nvidia

# lsdef n1 |grep provmethod
provmethod=centos7.4-x86_64-install-compute-nvidia

# rsetboot n1 net
# rpower n1 reset

Results of n1 installing:
[xCAT Genesis running on n1 /]#

[root@e13502 ~]# nodeset n1 stat
n1: sysclone centos7.4-x86_64 <---how is it getting set to sysclone???

[root@e13502 ~]# lsdef -t osimage
centos7.4-x86_64-install-compute (osimage)
centos7.4-x86_64-install-compute-nvidia (osimage)
centos7.4-x86_64-netboot-compute (osimage)
centos7.4-x86_64-statelite-compute (osimage)
rhels7.5-x86_64-install-compute (osimage)
rhels7.5-x86_64-install-service (osimage)
rhels7.5-x86_64-netboot-compute (osimage)
rhels7.5-x86_64-statelite-compute (osimage)

My mistake. I should have been reverting to centos7.4-x86_64-compute, not centos7.4-x86_64-compute-nvidia, for the original diskful install.
centos7.4-x86_64-compute-nvidia is, correctly, the sysclone image.
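
Since the confusion here came from which osimage the node was actually set to, comparing the node's provmethod with the intended image before rebooting can catch this early. A sketch, assuming `lsdef` prints the attribute as `provmethod=<osimage>` as shown above; `provmethod_of` is a hypothetical helper that reads `lsdef <node>` output on stdin, and the sample input is illustrative only:

```shell
# Hypothetical helper: extract the provmethod value from `lsdef <node>`
# output supplied on stdin (leading whitespace is tolerated).
provmethod_of() {
    sed -n 's/^[[:space:]]*provmethod=//p'
}

# Compare against the osimage we intended to deploy.
expected='centos7.4-x86_64-install-compute-nvidia'
actual=$(printf '%s\n' 'Object name: n1' \
    '    provmethod=centos7.4-x86_64-install-compute-nvidia' | provmethod_of)

if [ "$actual" = "$expected" ]; then
    echo "n1 is set to $expected"
else
    echo "n1 is set to '$actual', expected '$expected'"
fi
```

On the management node the sample input would be replaced by `lsdef n1 | provmethod_of`.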

@talandt
Author

talandt commented Oct 3, 2018

@talandt
It seems you are not using the genesis base package shipped by xCAT, but xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch.

So you need to run rpm -ql xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch to list its contents. I think the key point is that there is no rsync in this package; you may need to get a new one from whoever delivered it to you.

@talandt
Author

talandt commented Oct 3, 2018

Here is what I have
[root@e13502 ~]# rpm -qa |grep genesis
xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch
xCAT-genesis-scripts-x86_64-2.14.1.lenovo2-1.noarch

[root@e13502 ~]# xdsh -V
Version 2.14.1.lenovo2 (git commit 06d7097, built Mon Aug 20 12:55:53 UTC 2018)

[root@e13502 ~]# rpm -ql xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch|grep rsync
[root@e13502 ~]#

I have not modified anything from the xCAT install (via RPM). To me it seems this is an issue with the xCAT version. I am not able to locate a higher-level (lenovo2) version of genesis-base-x86_64.

I updated xCAT to the latest version but still have the same xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch package as before.
[root@e13502 yum.repos.d]# xdsh -V
Version 2.14.3 (git commit 7b7d9ab, built Tue Aug 21 07:16:00 EDT 2018)

[root@e13502 yum.repos.d]# rpm -qa |grep genesis
xCAT-genesis-scripts-x86_64-2.14.3-snap201808210716.noarch
xCAT-genesis-scripts-ppc64-2.14.3-snap201808210716.noarch
xCAT-genesis-base-ppc64-2.14-snap201804041553.noarch
xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch

So it seems that rsync is not supported for x86 based genesis?

@hu-weihua

Hi @talandt, every genesis shipped by xCAT supports the rsync command.

[root@host ~]# lsdef -v
lsdef - Version 2.14.3 (git commit 7b7d9ab67589afec1ba58cd5138154290c529c0e, built Tue Aug 21 07:16:00 EDT 2018)

[root@host ~]# rpm -qa |grep genesis
xCAT-genesis-scripts-ppc64-2.14.3-snap201808210716.noarch
xCAT-genesis-base-ppc64-2.14-snap201804041553.noarch
xCAT-genesis-base-x86_64-2.14-snap201803282249.noarch
xCAT-genesis-scripts-x86_64-2.14.3-snap201808210716.noarch

[root@host ~]# rpm -ql xCAT-genesis-base-x86_64-2.14-snap201803282249.noarch|grep rsync
/opt/xcat/share/xcat/netboot/genesis/x86_64/fs/usr/bin/rsync
/opt/xcat/share/xcat/netboot/genesis/x86_64/fs/usr/share/perl5/URI/rsync.pm

[root@host ~]# rpm -ql xCAT-genesis-base-ppc64-2.14-snap201804041553.noarch |grep rsync
/opt/xcat/share/xcat/netboot/genesis/ppc64/fs/usr/bin/rsync
/opt/xcat/share/xcat/netboot/genesis/ppc64/fs/usr/share/perl5/URI/rsync.pm

Another thing: every genesis compiled by a customer has a higher version than the genesis shipped by xCAT, which is why you still have xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch in your environment.
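
To illustrate the version ordering described above: the customized package carries version `2.14.1.lenovo1`, while the xCAT-shipped x86_64 genesis base carries `2.14`, so an update does not replace it. The sketch below uses GNU `sort -V` as an approximation only; it is not rpm's own comparison algorithm, although it orders these two strings the same way. `newest_version` is a hypothetical helper:

```shell
# Approximation only: GNU `sort -V` is not rpm's comparison algorithm,
# but it orders these two version strings the same way.
newest_version() {
    printf '%s\n' "$@" | sort -V | tail -n 1
}

# Version fields of the shipped and the customized genesis-base packages.
newest_version '2.14' '2.14.1.lenovo1'
```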

There are 2 solutions:

  • Ask the people who compiled xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch for you to add rsync to the build script and then rebuild the genesis package.

  • If you do not need functions that are supported only by the customized genesis, uninstall xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch and xCAT-genesis-scripts-x86_64-2.14.3-snap201808210716.noarch from your environment with the --nodeps option, then reinstall xCAT. The genesis shipped by xCAT will then work.
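
The removal step of the second solution can be previewed before anything is changed. A sketch that only prints the commands; rpm spells the flag `--nodeps`, and `print_genesis_removal` is a hypothetical helper. The reinstall step is left out because the thread does not give the exact command:

```shell
# Dry run: print the removal commands instead of executing them.
# `rpm -e --nodeps` erases a package without checking dependencies.
print_genesis_removal() {
    for pkg in "$@"; do
        echo "rpm -e --nodeps $pkg"
    done
}

print_genesis_removal \
    xCAT-genesis-base-x86_64-2.14.1.lenovo1-1.noarch \
    xCAT-genesis-scripts-x86_64-2.14.3-snap201808210716.noarch
```

Printing first makes it easy to review the package names; the printed lines can then be run by hand on the management node.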

@hu-weihua

@talandt, is there anything else I can do for this issue? If not, could you please close it? Thanks.

@talandt
Author

talandt commented Nov 6, 2018 via email

@gurevichmark
Contributor

Closing, workaround found.
