This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

[Deployment] Disable IB driver installation by default #2595

Merged
abuccts merged 3 commits into master from xiongyf/disable-ib
Apr 22, 2019

Member

@abuccts abuccts commented Apr 18, 2019

Disable IB driver installation by default.

Azure VM kernels have the IB kernel modules built into the vmlinux image,
so IB driver installation fails in this case.

If IB installation is needed during deployment,
set the enable-ib-installation field to true in the config.
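
For illustration, opting back in to IB driver installation might look like the fragment below. This is a minimal sketch, not the repository's actual file: the neighboring keys are copied from the diff context on this page, the parent section and indentation of the config file are not shown here and are omitted, and the comment wording is an assumption.

```yaml
# Neighboring keys as they appear in this PR's diff context;
# the enclosing section of the config file is omitted.
version: "384.111"
pre-installed-nvidia-path: /usr/local/nvidia

# Disabled (false) by default after this PR.
# Set to true only if the deployment should install IB drivers;
# installation can fail when the kernel already has the IB modules built in.
enable-ib-installation: true
```

Leaving the field at its default of false skips the IB driver installation step entirely.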

Disable IB driver installation by default.

Azure VM kernels have the IB kernel modules built into the vmlinux image,
so IB driver installation fails in this case.

If IB installation is needed during deployment,
set the `skip-ib-installation` field to `false` in the config.
@abuccts abuccts requested review from ydye and squirrelsc April 18, 2019 03:11
@coveralls

Coverage Status

Coverage remained the same at 53.314% when pulling a584048 on xiongyf/disable-ib into 0412422 on master.

@coveralls

coveralls commented Apr 18, 2019

Coverage Status

Coverage remained the same at 53.314% when pulling 620109d on xiongyf/disable-ib into 0412422 on master.

@@ -28,4 +28,4 @@ version: "384.111"
pre-installed-nvidia-path: /usr/local/nvidia


-skip-ib-installation: false
+skip-ib-installation: true
Member

  1. Can the configuration name be positive (enable-IB-driver-installation) instead of negative (skip, disable, turn-off)?
  2. Add some comments and update the documentation to explain this setting and the potential issues once it's enabled.

Member Author

updated

Rename skip-ib-installation to enable-ib-installation in config.
@abuccts abuccts requested a review from squirrelsc April 19, 2019 02:40
@@ -27,5 +27,8 @@ version: "384.111"

pre-installed-nvidia-path: /usr/local/nvidia


-skip-ib-installation: false
+# Azure VM builtin IB kernel modules into vmlinux image,
Member

It may not apply to Azure only. How about the following?

Some servers already have IB drivers installed, so this flag is disabled by default. If this flag is enabled, OpenPAI will try its best to install the correct IB driver, but the installation may fail due to compatibility issues.

Member Author

@abuccts abuccts Apr 19, 2019

The installation works if IB drivers have already been installed and we need to re-install the drivers under the /var/drivers path. The flag should be enabled in that case.
The only issue is that the Linux kernel on Azure has built-in IB kernel modules, which makes the re-installation fail. Those kinds of kernels are only used on Azure VMs.

Member

We haven't tested on AWS, Aliyun, or other hardware configurations, so don't single out Azure here; this may be a common case across cloud providers.

Remove Azure in comments.
@abuccts abuccts merged commit 09c1bf1 into master Apr 22, 2019
@abuccts abuccts deleted the xiongyf/disable-ib branch April 22, 2019 02:54