
SHM: respect user namespace in SHM reachable #4225

Conversation

hoopoepg
Contributor

  • If the EP and the ifaces belong to different namespace IDs, do not allow such clients to connect (see the namespace-ID sketch below).

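For reference, a minimal sketch (not the actual UCX code; the helper name and the choice of the pid namespace are assumptions) of how a process can obtain an identifier for its namespace: the inode number of its /proc/self/ns/* link. Two processes share a namespace exactly when these inode numbers match.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* hypothetical helper: returns the inode of this process' pid-namespace link,
 * which uniquely identifies the namespace on a running system */
uint64_t get_ns_id(void)
{
    struct stat st;

    if (stat("/proc/self/ns/pid", &st) != 0) {
        return 0; /* procfs unavailable: treat the namespace as unknown */
    }
    return (uint64_t)st.st_ino;
}

int main(void)
{
    printf("namespace id: %llu\n", (unsigned long long)get_ns_id());
    return 0;
}
```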
@hoopoepg hoopoepg added the WIP-DNM Work in progress / Do not review label Sep 25, 2019
@swx-jenkins1

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ucx-pr/9236/ for details.

@yosefe
Contributor

yosefe commented Sep 25, 2019

Can we still connect different namespaces, just by not using the procfs link?
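For context, an illustrative sketch (not UCX code; path names are hypothetical) of the two attach mechanisms the question refers to: the /proc/&lt;pid&gt;/fd link, which only resolves when the peer's PID is visible in the caller's PID namespace, versus a named shm_open() segment, which only requires that both processes see the same /dev/shm mount.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* attach to a peer's open fd through procfs: open() fails with ENOENT if
 * peer_pid is not visible in the caller's PID namespace */
int attach_via_procfs(pid_t peer_pid, int peer_fd)
{
    char path[64];

    snprintf(path, sizeof(path), "/proc/%d/fd/%d", (int)peer_pid, peer_fd);
    return open(path, O_RDWR);
}

/* attach by segment name: works across PID namespaces as long as both
 * processes share the /dev/shm mount */
int attach_via_name(const char *seg_name)
{
    return shm_open(seg_name, O_RDWR, 0600);
}
```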

@hoopoepg
Contributor Author

not sure, will investigate

@swx-jenkins1

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ucx-pr/9237/ for details.

@mellanox-github
Contributor

Test PASSed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/12427/ for details (Mellanox internal link).

@mellanox-github
Contributor

Mellanox CI: PASSED on 29 workers

Note: the logs will be deleted after 02-Oct-2019

Agent/Stage Status
_main ✔️ SUCCESS
hpc-arm-cavium-jenkins_W0 ✔️ SUCCESS
hpc-arm-cavium-jenkins_W1 ✔️ SUCCESS
hpc-arm-cavium-jenkins_W2 ✔️ SUCCESS
hpc-arm-cavium-jenkins_W3 ✔️ SUCCESS
hpc-arm-hwi-jenkins_W0 ✔️ SUCCESS
hpc-arm-hwi-jenkins_W1 ✔️ SUCCESS
hpc-arm-hwi-jenkins_W2 ✔️ SUCCESS
hpc-arm-hwi-jenkins_W3 ✔️ SUCCESS
hpc-test-althca_W0 ✔️ SUCCESS
hpc-test-althca_W1 ✔️ SUCCESS
hpc-test-althca_W2 ✔️ SUCCESS
hpc-test-althca_W3 ✔️ SUCCESS
hpc-test-node-gpu_W0 ✔️ SUCCESS
hpc-test-node-gpu_W1 ✔️ SUCCESS
hpc-test-node-gpu_W2 ✔️ SUCCESS
hpc-test-node-gpu_W3 ✔️ SUCCESS
hpc-test-node-legacy_W0 ✔️ SUCCESS
hpc-test-node-legacy_W1 ✔️ SUCCESS
hpc-test-node-legacy_W2 ✔️ SUCCESS
hpc-test-node-legacy_W3 ✔️ SUCCESS
hpc-test-node-new_W0 ✔️ SUCCESS
hpc-test-node-new_W1 ✔️ SUCCESS
hpc-test-node-new_W2 ✔️ SUCCESS
hpc-test-node-new_W3 ✔️ SUCCESS
r-vmb-ppc-jenkins_W0 ✔️ SUCCESS
r-vmb-ppc-jenkins_W1 ✔️ SUCCESS
r-vmb-ppc-jenkins_W2 ✔️ SUCCESS
r-vmb-ppc-jenkins_W3 ✔️ SUCCESS

@yosefe
Contributor

yosefe commented Sep 25, 2019

@hoopoepg Isn't this blocking all shared-memory transports?
I guess UCX falls back to TCP?

@hoopoepg
Contributor Author

I tested master using UCX_TLS=shm,cma and it works,
but I didn't test the 1.6 branch.
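A small self-contained sketch of the test setup described above, restricting UCX to the shared-memory and CMA transports through the public UCP API instead of the environment variable (error handling is minimal and UCP_FEATURE_TAG is an arbitrary feature choice):

```c
#include <ucp/api/ucp.h>
#include <stdio.h>

int main(void)
{
    ucp_config_t *config;
    ucp_params_t params = {0};
    ucp_context_h context;
    ucs_status_t status;

    status = ucp_config_read(NULL, NULL, &config);
    if (status != UCS_OK) {
        return 1;
    }

    /* equivalent to running with UCX_TLS=shm,cma in the environment */
    ucp_config_modify(config, "TLS", "shm,cma");

    params.field_mask = UCP_PARAM_FIELD_FEATURES;
    params.features   = UCP_FEATURE_TAG;

    status = ucp_init(&params, config, &context);
    ucp_config_release(config);
    if (status != UCS_OK) {
        fprintf(stderr, "ucp_init failed: %s\n", ucs_status_string(status));
        return 1;
    }

    ucp_cleanup(context);
    return 0;
}
```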

@yosefe
Contributor

yosefe commented Sep 25, 2019

How does it work if this PR makes shared memory unreachable? What am I missing?

@hoopoepg
Contributor Author

The fix adds the namespace ID to the address GUID. Processes in the same namespace get the same ID, so they are not affected; processes in different namespaces are blocked from using CMA (but this does not fail at startup).
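A minimal sketch of the described approach, using hypothetical names (not the actual UCX symbols): mix the namespace ID into the system GUID published in the worker address, so that the existing same-GUID reachability check rejects peers in a different namespace and another transport is selected instead of failing at runtime.

```c
#include <stdint.h>

typedef struct {
    uint64_t system_guid;  /* already part of the advertised iface address */
} shm_iface_addr_t;

/* any stable mixing works; XOR keeps the GUID identical for peers that
 * obtained the same namespace ID */
uint64_t mix_ns_into_guid(uint64_t base_guid, uint64_t ns_id)
{
    return base_guid ^ ns_id;
}

/* peers in a different namespace now carry a different GUID and are simply
 * reported as unreachable, so the transport is skipped rather than failing */
int is_reachable(const shm_iface_addr_t *local, const shm_iface_addr_t *remote)
{
    return local->system_guid == remote->system_guid;
}
```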

@mellanox-github
Contributor

Test PASSed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/12428/ for details (Mellanox internal link).

@mellanox-github
Contributor

Mellanox CI: PASSED on 29 workers

Note: the logs will be deleted after 02-Oct-2019

Agent/Stage Status
_main ✔️ SUCCESS
hpc-arm-cavium-jenkins_W0 ✔️ SUCCESS
hpc-arm-cavium-jenkins_W1 ✔️ SUCCESS
hpc-arm-cavium-jenkins_W2 ✔️ SUCCESS
hpc-arm-cavium-jenkins_W3 ✔️ SUCCESS
hpc-arm-hwi-jenkins_W0 ✔️ SUCCESS
hpc-arm-hwi-jenkins_W1 ✔️ SUCCESS
hpc-arm-hwi-jenkins_W2 ✔️ SUCCESS
hpc-arm-hwi-jenkins_W3 ✔️ SUCCESS
hpc-test-althca_W0 ✔️ SUCCESS
hpc-test-althca_W1 ✔️ SUCCESS
hpc-test-althca_W2 ✔️ SUCCESS
hpc-test-althca_W3 ✔️ SUCCESS
hpc-test-node-gpu_W0 ✔️ SUCCESS
hpc-test-node-gpu_W1 ✔️ SUCCESS
hpc-test-node-gpu_W2 ✔️ SUCCESS
hpc-test-node-gpu_W3 ✔️ SUCCESS
hpc-test-node-legacy_W0 ✔️ SUCCESS
hpc-test-node-legacy_W1 ✔️ SUCCESS
hpc-test-node-legacy_W2 ✔️ SUCCESS
hpc-test-node-legacy_W3 ✔️ SUCCESS
hpc-test-node-new_W0 ✔️ SUCCESS
hpc-test-node-new_W1 ✔️ SUCCESS
hpc-test-node-new_W2 ✔️ SUCCESS
hpc-test-node-new_W3 ✔️ SUCCESS
r-vmb-ppc-jenkins_W0 ✔️ SUCCESS
r-vmb-ppc-jenkins_W1 ✔️ SUCCESS
r-vmb-ppc-jenkins_W2 ✔️ SUCCESS
r-vmb-ppc-jenkins_W3 ✔️ SUCCESS

@mellanox-github
Contributor

Mellanox CI: ABORTED on 1 worker

Note: the logs will be deleted after 15-Nov-2019

Agent/Stage Status
_main ❓ ABORTED

@hoopoepg hoopoepg closed this Nov 28, 2019
@jakirkham
Contributor

Was this superseded by some other PR?

@hoopoepg
Contributor Author

yes
#4511

@jakirkham
Contributor

Thanks Sergey! 😄
