Skip to content

Conversation

tylertitsworth
Copy link
Contributor

@tylertitsworth tylertitsworth commented Jun 12, 2024

Description

Mount your custom SSH config and SSH keys to use manually, but otherwise it should generate a key and start a server for usage in k8s.

Related Issue

n/a

Changes Made

  • The code follows the project's coding standards.
  • No Intel Internal IP is present within the changes.
  • The documentation has been updated to reflect any changes in functionality.

Validation

  • I have tested any changes in container groups locally with test_runner.py with all existing tests passing, and I have added new tests where applicable.

Tyler Titsworth added 2 commits June 11, 2024 21:39
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
@tylertitsworth tylertitsworth added the WIP Work in Progress label Jun 12, 2024
@tylertitsworth tylertitsworth self-assigned this Jun 12, 2024
Copy link

github-actions bot commented Jun 12, 2024

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

PackageVersionScoreDetails

Scanned Manifest Files

Tyler Titsworth added 8 commits June 11, 2024 21:45
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>

RUN python -m pip install --no-cache-dir -r multinode-requirements.txt

ENV LD_LIBRARY_PATH="/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}:/usr/local/lib/python${PYTHON_VERSION}/dist-packages/oneccl_bindings_for_pytorch/opt/mpi/libfabric/lib:/usr/local/lib/python${PYTHON_VERSION}/dist-packages/oneccl_bindings_for_pytorch/lib"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prepend /lib/x86_64-linux-gnu overrides conda's installation of openssl which conflicts with dpkg

@tylertitsworth
Copy link
Contributor Author

@dmsuehir @louie-tsai
Please test your BKMs with the multinode image.

@dmsuehir
Copy link
Contributor

Thanks, I'm testing this multinode base with a rebuilt version of our workflow container that has the SSH config removed from our dockerfile. I'll report back with the results when testing is done.

@dmsuehir
Copy link
Contributor

@tylertitsworth Based on my testing, this looks good. I ran on a single node and on multiple nodes with the updated container.

@dmsuehir
Copy link
Contributor

dmsuehir commented Jun 13, 2024

@HarshaRamayanam FYI

@tylertitsworth tylertitsworth force-pushed the tylertitsworth/ssh-ipex-multinode branch from 630f798 to 00d8147 Compare June 17, 2024 23:05
@tylertitsworth tylertitsworth added Review and removed WIP Work in Progress labels Jun 18, 2024
sharvil10
sharvil10 previously approved these changes Jun 18, 2024
Copy link
Contributor

@sharvil10 sharvil10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

jitendra42
jitendra42 previously approved these changes Jun 26, 2024
@github-advanced-security
Copy link

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

dmsuehir
dmsuehir previously approved these changes Jun 27, 2024
Copy link
Contributor

@dmsuehir dmsuehir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I tested this update with our k8s HF workflow.

@sharvil10 sharvil10 dismissed stale reviews from dmsuehir, jitendra42, and themself via 12c2f53 June 27, 2024 21:10
@tylertitsworth tylertitsworth merged commit 6006b33 into main Jun 28, 2024
@tylertitsworth tylertitsworth deleted the tylertitsworth/ssh-ipex-multinode branch June 28, 2024 16:28
ma-pineda pushed a commit that referenced this pull request Jun 28, 2024
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Co-authored-by: sharvil.shah <sharvils@mlp-prod-clx-5669.ra.intel.com>
Co-authored-by: sharvil10 <sharvil.shah@intel.com>
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
Signed-off-by: ma-pineda <miguel.pineda.juarez@intel.com>
ma-pineda pushed a commit that referenced this pull request Jun 28, 2024
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Co-authored-by: sharvil.shah <sharvils@mlp-prod-clx-5669.ra.intel.com>
Co-authored-by: sharvil10 <sharvil.shah@intel.com>
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
Signed-off-by: ma-pineda <miguel.pineda.juarez@intel.com>
ma-pineda pushed a commit that referenced this pull request Jun 28, 2024
Signed-off-by: Tyler Titsworth <tyler.titsworth@intel.com>
Co-authored-by: sharvil.shah <sharvils@mlp-prod-clx-5669.ra.intel.com>
Co-authored-by: sharvil10 <sharvil.shah@intel.com>
Co-authored-by: Jitendra Patil <jitendra.patil@intel.com>
Signed-off-by: ma-pineda <miguel.pineda.juarez@intel.com>

COPY generate_ssh_keys.sh .

RUN cat /generate_ssh_keys.sh >> ~/.bashrc && \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about client public/private key? generate_ssh_key.sh only cover server key

# hadolint global ignore=SC2002
RUN mkdir -p /var/run/sshd && \
cat /etc/ssh/ssh_config | grep -v StrictHostKeyChecking > /etc/ssh/ssh_config.new && \
echo " StrictHostKeyChecking no" >> /etc/ssh/ssh_config.new && \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might also need to config ssh_config as below to have another ssh port support for sshd in docker with different port
Host * Port 2345

cat /etc/ssh/ssh_config | grep -v StrictHostKeyChecking > /etc/ssh/ssh_config.new && \
echo " StrictHostKeyChecking no" >> /etc/ssh/ssh_config.new && \
mv /etc/ssh/ssh_config.new /etc/ssh/ssh_config

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might also need to set sshd_config to use different sshd port among all instances.
"Port 2345 >> /etc/sshd/sshd_config"

cat /etc/ssh/ssh_config | grep -v StrictHostKeyChecking > /etc/ssh/ssh_config.new && \
echo " StrictHostKeyChecking no" >> /etc/ssh/ssh_config.new && \
mv /etc/ssh/ssh_config.new /etc/ssh/ssh_config

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also set below sshd config to have have Root login without password.
not sure whether we also need it or not
RUN sed -i'' -e's/^#PermitRootLogin prohibit-password$/PermitRootLogin yes/' /etc/ssh/sshd_config \ && sed -i'' -e's/^#PasswordAuthentication yes$/PasswordAuthentication no/' /etc/ssh/sshd_config \ && sed -i'' -e's/^#PermitEmptyPasswords no$/PermitEmptyPasswords yes/' /etc/ssh/sshd_config \ && sed -i'' -e's/^UsePAM yes/UsePAM no/' /etc/ssh/sshd_conf

> User also need to append content of id_rsa.pub in `/etc/ssh/authorized_keys` in the SSH server container.
> Since the SSH key is not owned by default user account in docker, please also do "chmod 644 id_rsa.pub; chmod 644 id_rsa" to grant read access for default user account.
> Users could also use "/usr/bin/ssh-keygen -t rsa -b 4096 -N '' -f ~/mnt/ssh_key/id_rsa" to generate a new SSH Key inside the container.
> Users need to mount a config file to list all hostnames at location `/root/.ssh/config` on the SSH client container.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be handled by docker image if we config /etc/ssh/sshd_config and ssh_config well.

```

2. Add hosts to config

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this step could be further simplified if you config /etc/ssh well during docker build. we don't need to put those efforts on users, and the manual configuration is also error-prone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants