Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify IPCClient and prevent corrupt messages #52445

Merged
merged 7 commits into from Apr 10, 2019

Conversation

@dwoz
Copy link
Contributor

commented Apr 8, 2019

What does this PR do?

  • Re-raise queued exceptions with their original traceback
  • Removes singleton from IPCClient and prevents corrupt message passing

What issues does this PR fix or reference?

  • Fixes many failing and flaky tests

Previous Behavior

We tried to share a single IPCClient across the entire codebase resulting in send calls stomping on each other and creating corrupted messages.

New Behavior

Allow IPCClient to be regular instances

Tests written?

No - Fixing existing tests

Full py2 test suite
Full py3 test suite

Commits signed with GPG?

Yes

dwoz added 2 commits Apr 8, 2019
@dwoz dwoz requested review from DmitryKuzmenko and s0undt3ch Apr 8, 2019
@dwoz dwoz requested review from cachedout and thatch45 Apr 8, 2019
@thatch45

This comment has been minimized.

Copy link
Member

commented Apr 8, 2019

You had me at "Removes singleton"

@cachedout

This comment has been minimized.

Copy link
Collaborator

commented Apr 9, 2019

Hi @dwoz !

One question that I have about this is that it seems like it means that from this point forward, callers would have to ensure that they are passing in the current loop or we'll end up with the potential for multiple loops per-thread, which seems bad. The old approach tried to prevent this via the singleton implementation but I don't see those same safeguards in place here. If we're going down this route, should we consider disallowing a loop to even be passed in to an IPC client?

@DmitryKuzmenko

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

@cachedout Tornado 5+ disallows multiple io loops per thread now. Moreover they removed the ioloop argument from all tornado API.

@cachedout

This comment has been minimized.

Copy link
Collaborator

commented Apr 9, 2019

@DmitryKuzmenko Ah, yes. Tornado 5 was going to be my next question. :) That should provide the safety we need in these cases. Will this change ever have to run under < 5 though?

@DmitryKuzmenko

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

@cachedout should run good, but let's see. =)

@dwoz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2019

@cachedout One thing to note is the use of tornado.ioloop.IOLoop.current here. This should ensure we are re-using an existing loop if one is not passed in explicitly.

@dwoz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 10, 2019

Here is a centos 7 Python 2 full test run with these change, there were two failures which look to be un-related: https://jenkinsci.saltstack.com/job/salt-centos-7-py2-dwoz/3/

@codecov

This comment has been minimized.

Copy link

commented Apr 10, 2019

Codecov Report

Merging #52445 into 2018.3 will decrease coverage by 16.25%.
The diff coverage is 25%.

Impacted file tree graph

@@             Coverage Diff             @@
##           2018.3   #52445       +/-   ##
===========================================
- Coverage   48.78%   32.52%   -16.26%     
===========================================
  Files        1511     1511               
  Lines      256875   256865       -10     
  Branches    55626    53866     -1760     
===========================================
- Hits       125306    83554    -41752     
- Misses     131285   163403    +32118     
- Partials      284     9908     +9624
Flag Coverage Δ
#arch ?
#centos7 32.52% <25%> (+0.03%) ⬆️
#debian8 ?
#opensuse42 ?
#py2 32.52% <25%> (-0.1%) ⬇️
#py3 ?
#pycryptodomex ?
#transport ?
#ubuntu1404 ?
Impacted Files Coverage Δ
salt/transport/ipc.py 53.27% <25%> (-6.57%) ⬇️
salt/transport/frame.py 13.63% <0%> (-46.97%) ⬇️
salt/ext/ssl_match_hostname.py 0% <0%> (-36.18%) ⬇️
salt/client/ssh/wrapper/config.py 0% <0%> (-35.56%) ⬇️
salt/output/virt_query.py 12.9% <0%> (-35.49%) ⬇️
salt/states/pagerduty_schedule.py 10.44% <0%> (-34.33%) ⬇️
salt/states/rbac_solaris.py 8.42% <0%> (-33.69%) ⬇️
salt/states/smartos.py 6.34% <0%> (-33.51%) ⬇️
salt/states/zabbix_mediatype.py 3.43% <0%> (-33.34%) ⬇️
salt/runners/bgp.py 21.48% <0%> (-33.06%) ⬇️
... and 1385 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6028b9...1bdaf29. Read the comment docs.

@codecov

This comment has been minimized.

Copy link

commented Apr 10, 2019

Codecov Report

Merging #52445 into 2018.3 will decrease coverage by 3.29%.
The diff coverage is 46.66%.

Impacted file tree graph

@@            Coverage Diff            @@
##           2018.3   #52445     +/-   ##
=========================================
- Coverage   48.78%   45.48%   -3.3%     
=========================================
  Files        1511     1511             
  Lines      256875   256870      -5     
  Branches    55626    55625      -1     
=========================================
- Hits       125306   116849   -8457     
+ Misses     131285   130529    -756     
- Partials      284     9492   +9208
Flag Coverage Δ
#arch ?
#centos7 45.48% <46.66%> (+12.99%) ⬆️
#debian8 ?
#opensuse42 ?
#py2 32.53% <43.33%> (-0.1%) ⬇️
#py3 32.33% <43.33%> (+31.75%) ⬆️
#pycryptodomex ?
#transport ?
#ubuntu1404 ?
Impacted Files Coverage Δ
salt/transport/ipc.py 55.25% <46.66%> (-4.59%) ⬇️
salt/modules/s3.py 54.34% <0%> (-28.27%) ⬇️
salt/modules/napalm_snmp.py 63.15% <0%> (-21.06%) ⬇️
salt/modules/svn.py 61.62% <0%> (-20.94%) ⬇️
salt/tops/ext_nodes.py 58.62% <0%> (-20.69%) ⬇️
salt/modules/zfs.py 63.5% <0%> (-18.25%) ⬇️
salt/states/user.py 39.84% <0%> (-18.23%) ⬇️
salt/renderers/jinja.py 71.42% <0%> (-17.86%) ⬇️
salt/modules/telegram.py 60.78% <0%> (-17.65%) ⬇️
salt/grains/mdadm.py 73.91% <0%> (-17.4%) ⬇️
... and 979 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6028b9...1bdaf29. Read the comment docs.

@cachedout

This comment has been minimized.

Copy link
Collaborator

commented Apr 10, 2019

@dwoz My concern is more about the behavior if one is passed in explicitly. I'm not sure if that's happening anywhere but if it is, could it be an issue?

@s0undt3ch

This comment has been minimized.

Copy link
Member

commented Apr 10, 2019

@dwoz

This comment has been minimized.

Copy link
Contributor Author

commented Apr 10, 2019

@cachedout Looking at this again I see that, prior to this change, if a caller passed in an event loop that is not the 'current loop' it would have made a new instance and allowed two loops in the same thread.

https://github.com/saltstack/salt/pull/52445/files#diff-d5f1a701c958ed082895806182d4d7fdL246

So I don't think we're loosing anything here.

@dwoz dwoz merged commit 89bd258 into saltstack:2018.3 Apr 10, 2019
10 of 12 checks passed
10 of 12 checks passed
codecov/patch 46.66% of diff hit (target 48.78%)
Details
codecov/project 45.48% (-3.3%) compared to b6028b9
Details
WIP Ready for review
Details
continuous-integration/jenkins/pr-merge This commit looks good
Details
jenkins/pr/docs The docs job has passed
Details
jenkins/pr/lint Python lint test has passed
Details
jenkins/pr/py2-centos-7 The py2-centos-7 job has passed
Details
jenkins/pr/py2-ubuntu-1604 The py2-ubuntu-1604 job has passed
Details
jenkins/pr/py2-windows-2016 The py2-windows-2016 job has passed
Details
jenkins/pr/py3-centos-7 The py3-centos-7 job has passed
Details
jenkins/pr/py3-ubuntu-1604 The py3-ubuntu-1604 job has passed
Details
jenkins/pr/py3-windows-2016 The py3-windows-2016 job has passed
Details
This was referenced Apr 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.