quagga cannot start on image installed using sonic2sonic upgrade #579

Closed
marian-pritsak opened this issue May 9, 2017 · 5 comments

@marian-pritsak
Collaborator

SONiC Software Version: SONiC.HEAD.240-8f34839
Distribution: Debian 8.8
Kernel: 3.16.0-4-amd64
Build commit: 8f34839
Build date: Tue May  9 07:45:48 UTC 2017
Built by: johnar@jenkins-worker-1

Docker images:
REPOSITORY                TAG                 IMAGE ID            SIZE
docker-orchagent-mlnx     latest              8f2dc4757732        256 MB
docker-syncd-mlnx         latest              1907b0ad3a57        405.7 MB
docker-dhcp-relay         latest              a84b35132b9e        251.6 MB
docker-database           latest              5ae54585019b        219.5 MB
docker-snmp-sv2           latest              a870210fac95        289.3 MB
docker-teamd              latest              5ed3986424b4        253.5 MB
docker-platform-monitor   latest              bc3f2854284f        268.6 MB
docker-lldp-sv2           latest              811ef1a27405        255.1 MB
docker-fpm-quagga         latest              aaef6cc119d2        260 MB

Daemons zebra and bgpd are not running

root@arc-switch1026:/home/admin# docker exec -it bgp ps ax
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss+    0:00 /usr/bin/python /usr/bin/supervisord
   25 ?        Sl     0:00 /usr/sbin/rsyslogd -n
   51 ?        Ss     0:00 /usr/lib/quagga/watchquagga --daemon zebra bgpd
   53 ?        Sl     0:00 fpmsyncd
   70 ?        Rs+    0:00 ps ax
# docker logs bgp
/usr/lib/python2.7/dist-packages/supervisor/options.py:296: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  'Supervisord is running as root and it is searching '
2017-05-09 12:18:00,785 CRIT Supervisor running as root (no user in config file)
2017-05-09 12:18:00,786 WARN Included extra file "/etc/supervisor/conf.d/supervisord.conf" during parsing
2017-05-09 12:18:00,848 INFO RPC interface 'supervisor' initialized
2017-05-09 12:18:00,850 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2017-05-09 12:18:00,852 INFO supervisord started with pid 1
2017-05-09 12:18:01,855 INFO spawned: 'start.sh' with pid 8
2017-05-09 12:18:02,858 INFO success: start.sh entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-05-09 12:18:04,390 INFO spawned: 'rsyslogd' with pid 25
2017-05-09 12:18:05,393 INFO success: rsyslogd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-05-09 12:18:05,455 CRIT reaped unknown pid 42)
2017-05-09 12:18:05,504 CRIT reaped unknown pid 46)
2017-05-09 12:18:05,641 INFO spawned: 'fpmsyncd' with pid 53
2017-05-09 12:18:06,667 INFO success: fpmsyncd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-05-09 12:18:06,681 INFO exited: start.sh (exit status 0; expected)
@marian-pritsak
Collaborator Author

When the same image is installed from ONIE, everything works fine.

@jleveque jleveque self-assigned this May 10, 2017
@marian-pritsak
Collaborator Author

@jleveque did you try to reproduce this?

@jleveque
Contributor

jleveque commented May 24, 2017

I have reproduced the issue. zebra and bgpd fail to start because the /var/run/quagga/ directory has incorrect ownership on images installed using the SONiC-to-SONiC update. It should be owned by the quagga user, but it is owned by a different user instead. As a result, the processes cannot write their PID files to /var/run/quagga/ and exit with an error. I have not yet determined the reason for the incorrect ownership.
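
For anyone who wants to check the ownership quickly, here is a minimal Python sketch. It is not part of SONiC; the helper names `describe_owner` and `is_owned_by` are made up for illustration, and it assumes the check is run inside the bgp container, where `/var/run/quagga` and the `quagga` user are expected to exist.

```python
import os
import pwd


def describe_owner(path):
    """Return (uid, user name) for the owner of path; name is None if the uid has no passwd entry."""
    st = os.stat(path)
    try:
        name = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        name = None
    return st.st_uid, name


def is_owned_by(path, expected_user):
    """Return True if path is owned by expected_user, looked up by name."""
    try:
        expected_uid = pwd.getpwnam(expected_user).pw_uid
    except KeyError:
        return False  # the expected user does not exist on this system
    return os.stat(path).st_uid == expected_uid


if __name__ == "__main__":
    path = "/var/run/quagga"  # directory named in this issue
    if os.path.isdir(path):
        print(describe_owner(path), is_owned_by(path, "quagga"))
```

If `is_owned_by` returns False, the daemons would not be able to create their PID files there unless the directory is otherwise writable for them.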

@lguohan
Collaborator

lguohan commented May 24, 2017

I think there is some problem in the docker.tar.gz file.

./aufs/diff/c3948b1fd959e3b937d497769cd18a8f60f8ce1d2ddd6893dba908358ad11e2e/run/quagga/^@^@^@^@^@^@^@^@^@
^@^@^@0000755^@0000152^@0000157^@00000000000^@13111135400^@024446^@ 5^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@ustar  ^@sshd^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ssh^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@.

The uid and gid are correct, but the uname and gname are not.
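
To make the dump above easier to read, here is a rough Python sketch that decodes the relevant fields of a 512-byte tar header block. The field offsets follow the ustar layout; `parse_header` is a hypothetical helper, and the sample block is rebuilt with Python's `tarfile` from the values visible in the dump (mode 0755, uid 0152, gid 0157, uname `sshd`, gname `ssh`) rather than read from the real docker.tar.gz.

```python
import tarfile


def parse_header(block):
    """Decode selected fields from a 512-byte tar header block."""
    def text(lo, hi):
        # Fields are NUL-terminated or NUL-padded ASCII.
        return block[lo:hi].split(b"\0", 1)[0].decode("ascii")

    def octal(lo, hi):
        # Numeric fields are stored as octal ASCII digits.
        return int(text(lo, hi).strip() or "0", 8)

    return {
        "name": text(0, 100),
        "mode": octal(100, 108),
        "uid": octal(108, 116),
        "gid": octal(116, 124),
        "magic": text(257, 263),
        "uname": text(265, 297),
        "gname": text(297, 329),
    }


# Rebuild a header resembling the one in the dump (illustrative values only).
info = tarfile.TarInfo(name="run/quagga")
info.type = tarfile.DIRTYPE
info.mode = 0o755
info.uid, info.gid = 0o152, 0o157      # 106 and 111 in decimal
info.uname, info.gname = "sshd", "ssh"
header = info.tobuf(format=tarfile.GNU_FORMAT)

fields = parse_header(header)
print(fields)
```

The numeric IDs and the names are stored in separate fields, so they can disagree with each other, as they do in the dump.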

According to the standard:

https://www.gnu.org/software/tar/manual/html_node/Standard.html

The magic field indicates that this archive was output in the P1003 archive format. If this field contains TMAGIC, the uname and gname fields will contain the ASCII representation of the owner and group of the file respectively. If found, the user and group IDs are used rather than the values in the uid and gid fields.
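
The rule quoted above can be sketched in Python as follows. This is only an illustration of the lookup order, not the code tar or the SONiC installer actually runs; `resolve_uid` is a hypothetical helper. Python's own `tarfile` module follows a similar order when restoring ownership and accepts `numeric_owner=True` in `extractall()` to use the numeric IDs instead; GNU tar has a `--numeric-owner` option for the same purpose.

```python
import pwd
import tarfile


def resolve_uid(member, numeric_owner=False):
    """Pick the uid to apply on extraction.

    Prefer the uname stored in the header if it resolves on the
    extracting system; otherwise fall back to the numeric uid.
    """
    if not numeric_owner and member.uname:
        try:
            return pwd.getpwnam(member.uname).pw_uid
        except KeyError:
            pass  # name unknown on this system, use the numeric uid
    return member.uid


# Illustrative member using the values seen in the dump.
member = tarfile.TarInfo(name="run/quagga")
member.uid = 0o152       # 106
member.uname = "sshd"

# With name lookup, the result depends on which uid "sshd" has on the
# extracting system; with numeric_owner=True it is always 106.
print(resolve_uid(member), resolve_uid(member, numeric_owner=True))
```

If the user named in the header exists on the extracting system with a different uid, the directory ends up owned by that user, which would match the symptom described in this issue.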

@lguohan
Collaborator

lguohan commented May 24, 2017

Addressed in #626.

@lguohan lguohan closed this as completed May 24, 2017
lguohan pushed a commit that referenced this issue Jul 31, 2019
* src/sonic-utilities ee56d54...cb0e745 (11):
  > sonic_utilities: Support for DOM Threshold values for EEPROM dump
(#545)
  > [portstat] Fix portstat show RX_UTIL over 100% for 100G (#563)
  > sonic_installer: fix read-only filesystem support for firmware
update (#565)
  > Revert "show acl table command output should show binding column
correctly even with single port (#447)" (#589)
  > show acl table command output should show binding column correctly
even with single port (#447)
  > [config] Do no stop or restart dependent services (#582)
  > sfpshow: prevent 'show int trans eeprom --dom' from crashing (#567)
  > [warm-reboot] add docker upgrade --warm option and roll back support
(#559)
  > [ecnconfig] Validate input WRED parameters (#579)
  > [sonic-utilities] Add fstrim to reboot (#535)
  > Fixing the expected neighbor command due to change in output format
under sonic-buildimage/pull/3036 (#584)
Kalimuthu-Velappan pushed a commit to Kalimuthu-Velappan/sonic-buildimage that referenced this issue Sep 12, 2019
* Update help infomration

* Validate WRED parameters

Avoid illegal inputs, e.g., Kmin > Kmax

* Delete the useless code

* Update help information for arguments
yxieca added a commit to yxieca/sonic-buildimage that referenced this issue Oct 26, 2019
Submodule src/sonic-swss 2529d79..15652b2:
  > [mirrororch]: Add retry logic when deleting referenced mirror session (sonic-net#1104)

Submodule src/sonic-utilities 0cfa942..c049e54:
  > [neighbor_advertiser]: Add sleep in setting mirror session and ACL rules (sonic-net#714)
  > [warm/fast reboot] continue executing when killing docker failed (sonic-net#713)
  > [ecnconfig] Validate input WRED parameters (sonic-net#579)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
yxieca added a commit that referenced this issue Oct 26, 2019
Submodule src/sonic-swss 2529d79..15652b2:
  > [mirrororch]: Add retry logic when deleting referenced mirror session (#1104)

Submodule src/sonic-utilities 0cfa942..c049e54:
  > [neighbor_advertiser]: Add sleep in setting mirror session and ACL rules (#714)
  > [warm/fast reboot] continue executing when killing docker failed (#713)
  > [ecnconfig] Validate input WRED parameters (#579)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
madhanmellanox pushed a commit to madhanmellanox/sonic-buildimage that referenced this issue Mar 23, 2020
…exthop_group_… (sonic-net#579)

* Fix race condition and avoid unnecessary create/delete nexthop_group_member

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>

* [VS]: Fix test_CrmNexthopGroupMember port oper status down problem

Signed-off-by: Jipan Yang <jipan.yang@alibaba-inc.com>