Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hostcfgd crash when delete entire config table. #106

Merged
merged 9 commits into from
Mar 30, 2024

Conversation

liuh-80
Copy link
Contributor

@liuh-80 liuh-80 commented Mar 13, 2024

What I did
Fix hostcfgd crash when delete entire config table.

Why I did it
hostcfgd subscribe table change in register_callbacks() method. When a config table been deleted, the 'data' parameter of callback method will be 'None', however most callback doesn't handle the 'None' case, they only handle empty dictionary case. when this happen hostcfgd will crash.

How I verified it
Pass all UT.
Add new UT for code coverage.

Work item tracking
  • Microsoft ADO: 27166522

Details if related

@liuh-80 liuh-80 marked this pull request as ready for review March 13, 2024 11:43
@liuh-80 liuh-80 requested a review from qiluo-msft March 13, 2024 11:43
@qiluo-msft
Copy link
Contributor

qiluo-msft commented Mar 21, 2024

Could you paste the crash stack? Could you check the coverage? #Closed

scripts/hostcfgd Outdated
main()
try:
main()
except Exception as e:
Copy link
Contributor

@qiluo-msft qiluo-msft Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except

This bugfix is not related to your main topic, we can move it to another PR. And this is not the ideal way to fix a generic issue relating systemd handle simple service return code.
Not sure if this help https://unix.stackexchange.com/a/734184 #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, change reverted

@qiluo-msft
Copy link
Contributor

qiluo-msft commented Mar 21, 2024

@alexrallen @dgsudharsan Please help review this PR? wondering why this is not identified by original testcases? #Closed

@liuh-80
Copy link
Contributor Author

liuh-80 commented Mar 21, 2024

Could you paste the crash stack? Could you check the coverage?

Here is the crash stack in syslog:

Mar 13 06:40:24.354944 vlab-01 INFO hostcfgd[17776]: Traceback (most recent call last):
Mar 13 06:40:24.355016 vlab-01 INFO hostcfgd[17776]: File "/usr/local/bin/hostcfgd", line 1799, in
Mar 13 06:40:24.355353 vlab-01 INFO hostcfgd[17776]: main()
Mar 13 06:40:24.355403 vlab-01 INFO hostcfgd[17776]: File "/usr/local/bin/hostcfgd", line 1796, in main
Mar 13 06:40:24.355596 vlab-01 INFO hostcfgd[17776]: daemon.start()
Mar 13 06:40:24.355643 vlab-01 INFO hostcfgd[17776]: File "/usr/local/bin/hostcfgd", line 1788, in start
Mar 13 06:40:24.355975 vlab-01 INFO hostcfgd[17776]: self.config_db.listen(init_data_handler=self.load)
Mar 13 06:40:24.356032 vlab-01 INFO hostcfgd[17776]: File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2539, in listen
Mar 13 06:40:24.356884 vlab-01 INFO hostcfgd[17776]: self.__fire(table, row, data)
Mar 13 06:40:24.356955 vlab-01 INFO hostcfgd[17776]: File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 2606, in __fire
Mar 13 06:40:24.357448 vlab-01 INFO hostcfgd[17776]: handler(table, key, data)
Mar 13 06:40:24.357505 vlab-01 INFO hostcfgd[17776]: File "/usr/local/bin/hostcfgd", line 1733, in callback
Mar 13 06:40:24.357844 vlab-01 INFO hostcfgd[17776]: return func(key, op, data)
Mar 13 06:40:24.357899 vlab-01 INFO hostcfgd[17776]: ^^^^^^^^^^^^^^^^^^^
Mar 13 06:40:24.357940 vlab-01 INFO hostcfgd[17776]: File "/usr/local/bin/hostcfgd", line 1599, in aaa_handler
Mar 13 06:40:24.358215 vlab-01 INFO hostcfgd[17776]: self.aaacfg.aaa_update(key, data)
Mar 13 06:40:24.358263 vlab-01 INFO hostcfgd[17776]: File "/usr/local/bin/hostcfgd", line 304, in aaa_update
Mar 13 06:40:24.358430 vlab-01 INFO hostcfgd[17776]: self.modify_conf_file()
Mar 13 06:40:24.358472 vlab-01 INFO hostcfgd[17776]: File "/usr/local/bin/hostcfgd", line 487, in modify_conf_file
Mar 13 06:40:24.358523 vlab-01 INFO hostcfgd[17776]: authorization.update(self.authorization)
Mar 13 06:40:24.358581 vlab-01 INFO hostcfgd[17776]: TypeError: 'NoneType' object is not iterable
Mar 13 06:40:24.540918 vlab-01 NOTICE systemd[1]: hostcfgd.service: Main process exited, code=exited, status=1/FAILURE

@qiluo-msft qiluo-msft merged commit ba78bdb into sonic-net:master Mar 30, 2024
5 checks passed
@ycoheNvidia ycoheNvidia mentioned this pull request May 9, 2024
liat-grozovik pushed a commit that referenced this pull request May 12, 2024
Fixed #80 rebase failing newly introduced test from #106
mssonicbld pushed a commit to mssonicbld/sonic-host-services that referenced this pull request May 21, 2024
**What I did**
Fix hostcfgd crash when delete entire config table.

**Why I did it**
hostcfgd subscribe table change in register_callbacks() method. When a config table been deleted, the 'data' parameter of callback method will be 'None', however most callback doesn't handle the 'None' case, they only handle empty dictionary case. when this happen hostcfgd will crash.

**How I verified it**
Pass all UT.
Add new UT for code coverage.
@mssonicbld
Copy link

Cherry-pick PR to 202311: #129

mssonicbld pushed a commit that referenced this pull request May 21, 2024
**What I did**
Fix hostcfgd crash when delete entire config table.

**Why I did it**
hostcfgd subscribe table change in register_callbacks() method. When a config table been deleted, the 'data' parameter of callback method will be 'None', however most callback doesn't handle the 'None' case, they only handle empty dictionary case. when this happen hostcfgd will crash.

**How I verified it**
Pass all UT.
Add new UT for code coverage.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants