Merge pull request #110 from dellhpc/release-v0.2
Omnia Release v0.2
j0hnL committed Jun 24, 2020
2 parents c011bf7 + 4c9a6e3 commit 986eccb
Showing 45 changed files with 881 additions and 335 deletions.
116 changes: 70 additions & 46 deletions CONTRIBUTING.md
@@ -7,59 +7,83 @@ These guidelines are based on the [pravega project](https://github.com/pravega/p

This document will evolve as the project matures. Please be sure to regularly refer back in order to stay in-line with contribution guidelines.

## Issues and Pull Requests
To produce a pull request against Omnia, follow these steps:

* **Create an issue:** Create an issue and describe what you are trying to solve. It doesn't matter whether it is a new feature, a bug fix, or an improvement. All pull requests need to be associated to an issue. See more here: Creating an issue
* **Issue branch:** Create a new branch on your fork of the repository. Typically, you need to branch off master, but there could be exceptions. To branch off master, use git checkout master; git checkout -b <new-branch-name>.
* **Push the changes:** To be able to create a pull request, push the changes to origin: git push --set-upstream origin <new-branch-name>. I'm assuming that origin is your personal repo, e.g., `lwilson/omnia.git`.
* **Branch name:** Use the following pattern to create your new branch name: issue-number-description, e.g., issue-1023-reformat-testutils.
* **Create a pull request:** GitHub gives you the option of creating a pull request. Give it a title following this format: Issue ###: Description, e.g., _Issue 1023: Reformat testutils_. Follow the guidelines in the description and provide as much information as possible to help the reviewer understand what is being addressed. A thorough description makes the reviewer's job easier: it reduces review time and the probability of a misunderstanding with the pull request.
* **Merging:** Merging of pull requests will be handled by the project maintainers.
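
The branch-name and title patterns from the steps above can be sketched as two small shell helpers. The function names `issue_branch_name` and `pr_title` are hypothetical and not part of Omnia; this only illustrates the naming convention and is not project tooling.

```shell
#!/bin/sh
# Hypothetical helper: build "issue-<number>-<description>" from an issue
# number and a free-text description.
issue_branch_name() {
    number=$1
    shift
    # Lowercase the description and squeeze every run of characters that are
    # not a-z or 0-9 into a single hyphen.
    slug=$(printf '%s' "$*" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-')
    # Trim a leading or trailing hyphen left over from punctuation.
    slug=${slug#-}
    slug=${slug%-}
    printf 'issue-%s-%s\n' "$number" "$slug"
}

# Hypothetical helper: build the "Issue ###: Description" pull request title.
pr_title() {
    number=$1
    shift
    printf 'Issue %s: %s\n' "$number" "$*"
}

issue_branch_name 1023 "Reformat testutils"   # prints issue-1023-reformat-testutils
pr_title 1023 "Reformat testutils"            # prints Issue 1023: Reformat testutils
```

The generated name can then be passed to `git checkout -b` and `git push --set-upstream origin` as described in the steps.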

When preparing a pull request it is important to stay up-to-date with the master. We recommend that you rebase against the upstream repository _frequently_. To do this, use the following commands:
```
git pull --rebase upstream master #upstream is dellhpc/omnia
git push --force origin <pr-branch-name> #origin is your fork of the repository (e.g., <github_user_name>/omnia.git)
## How to Contribute to Omnia
Contributions to Omnia are made through [Pull Requests (PRs)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests). To make a pull request against Omnia, use the following steps:

1. **Create an issue:** [Create an issue](https://help.github.com/en/github/managing-your-work-on-github/creating-an-issue) and describe what you are trying to solve. It does not matter whether it is a new feature, a bug fix, or an improvement. All pull requests need to be associated with an issue. When creating an issue, be sure to use the appropriate issue template (bug fix or feature request) and complete all of the required fields. If your issue does not fit either a bug fix or a feature request, then create a blank issue and be sure to include the following information:
* **Problem description:** Describe what you believe needs to be addressed
* **Problem location:** In which file and at what line does this issue occur?
* **Suggested resolution:** How do you intend to resolve the problem?
2. **Create a personal fork:** All work on Omnia should be done in a [fork of the repository](https://help.github.com/en/github/getting-started-with-github/fork-a-repo). Only the maintainers are allowed to commit directly to the project repository.
3. **Issue branch:** [Create a new branch](https://help.github.com/en/desktop/contributing-to-projects/creating-a-branch-for-your-work) on your fork of the repository. All contributions should be branched from `devel`. Use `git checkout devel; git checkout -b <new-branch-name>` to create the new branch.
* **Branch name:** The branch name should be based on the issue you are addressing. Use the following pattern to create your new branch name: issue-number, e.g., issue-1023.
4. **Commit changes to the issue branch:** It is important to commit your changes to the issue branch. Commit messages should be descriptive of the changes being made.
* **Signing your commits:** All commits to Omnia need to be signed with the [Developer Certificate of Origin (DCO)](https://developercertificate.org/) in order to certify that the contributor has permission to contribute the code. In order to sign commits, use either the `--signoff` or `-s` option to `git commit`:
```
git commit --signoff
git commit -s
```
Make sure you have your user name and e-mail set. The `--signoff | -s` option will use the configured user name and e-mail, so it is important to configure it before the first time you commit. Check the following references:

* [Setting up your github user name](https://help.github.com/articles/setting-your-username-in-git/)
* [Setting up your e-mail address](https://help.github.com/articles/setting-your-commit-email-address-in-git/)

5. **Push the changes to your personal repo:** To be able to create a pull request, push the changes to origin: `git push origin <new-branch-name>`. Here I assume that `origin` is your personal repo, e.g., `lwilson/omnia.git`.
6. **Create a pull request:** [Create a pull request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request) with a title following this format: Issue ###: Description (e.g., _Issue 1023: Reformat testutils_). A thorough description makes the code reviewer's job easier: it reduces review time and the probability of a misunderstanding with the pull request.
* **Important:** When preparing a pull request it is important to stay up-to-date with the project repository. We recommend that you rebase against the upstream repo _frequently_. To do this, use the following commands:
```
git pull --rebase upstream master #upstream is dellhpc/omnia
git push --force origin <pr-branch-name> #origin is your fork of the repository (e.g., <github_user_name>/omnia.git)
```
* **PR Description:** Be sure to fully describe the pull request. Ideally, your PR description will contain:
1. A description of the main point (_e.g., why was this PR made?_),
2. Linking text to the related issue (_e.g., This PR closes issue #<issue_number>_),
3. How the changes solve the problem, and
4. How to verify that the changes work correctly.

## Omnia Branches and Contribution Flow
The diagram below describes the contribution flow. Omnia has two lifetime branches: `devel` and `master`. The `master` branch is reserved for releases and their associated tags. The `devel` branch is where all development work occurs. The `devel` branch is also the default branch for the project.

![Omnia Branch Flowchart](docs/images/omnia-branch-structure.png "Flowchart of Omnia branches")

## Developer Certificate of Origin
Contributions to Omnia must be signed with the [Developer Certificate of Origin (DCO)](https://developercertificate.org/):
```
## Creating an Issue
When creating an issue, there are two important parts: title and description. The title should be succinct, but give a good idea of what the issue is about. Try to add all important keywords to make it clear to the reader. For example, if the issue is about changing the log level of some messages in the segment store, then instead of saying "Log level" say "Change log level in the segment store". The suggested title includes both the goal and where in the code it applies.
Developer Certificate of Origin
Version 1.1
For the description, there are three parts:
Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129
* *Problem description:* Describe what needs to change. If it is a bug, describe the observed symptoms. If it is a new feature, describe how it is supposed to work in as much detail as possible.
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.
* *Problem location:* This part refers to where in the code we are supposed to make changes. For example, if it is a bug in the client, then in this part say at least "Client". If you know more about it, then please add it. For example, if you know that there is an issue with SegmentOutputStreamImpl, say it in this part.
* *Suggestion for an improvement:* This section is designed to let you give a suggestion for how to fix the bug described in the Problem description or how to implement the feature described in that same section. Please make an effort to separate between problem statement (Problem Description section) and solution (Suggestion for an improvement).
Developer's Certificate of Origin 1.1
We next discuss how to create a pull request.

## Creating a Pull Request
When creating a pull request, there are also two important parts: title and description. The title can be the same as the one of the issue, but it must be prefixed with the issue number, e.g.:
```
Issue 724: Change log level in the segment store
```
The description has four parts:
By making a contribution to this project, I certify that:
* __Changelog description:__ This section should be the two or three main points about this PR. A detailed description should be left for the What the code does section. The two or three points here should be used by a committer for the merge log.
* __Purpose of the change:__ Say whether this closes an issue or perhaps is a subtask of an issue. This section should link the PR to at least one issue.
* __What the code does:__ Use this section to freely describe the changes in this PR. Make sure to give as much detail as possible to help a reviewer to do a better job understanding your changes.
* __How to verify it:__ For most of the PRs, the answer here will be trivial: the build must pass, system tests must pass, visual inspection, etc. This section becomes more important when the way to reproduce the issue the PR is resolving is non-trivial, like running some specific command or workload generator.
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
## Signing Your Commits
We require that developers sign off their commits to certify that they have permission to contribute the code in a pull request. This way of certifying is commonly known as the [Developer Certificate of Origin (DCO)](https://developercertificate.org/). We encourage all contributors to read the DCO text before signing a commit and making contributions.
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
To make sure that pull requests have all commits signed off, we use the [Probot DCO plugin](https://probot.github.io/apps/dco/).
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
### Signing off a commit

#### Using the command line
To make sure that pull requests have all commits signed off, we use the Probot DCO plugin.
Use either `--signoff` or `-s` with the commit command.

Make sure you have your user name and e-mail set. The `--signoff | -s` option will use the configured user name and e-mail, so it is important to configure it before the first time you commit. Check the following references:

[Setting up your github user name](https://help.github.com/articles/setting-your-username-in-git/)

[Setting up your e-mail address](https://help.github.com/articles/setting-your-commit-email-address-in-git/)
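
One way to confirm that a commit carries the sign-off before pushing is to look for the `Signed-off-by:` trailer that `git commit -s` appends to the commit message. A minimal sketch, assuming a hypothetical helper named `has_signoff` and a placeholder name and e-mail address:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the commit message on stdin contains a
# "Signed-off-by: Name <address>" line.
has_signoff() {
    grep -q '^Signed-off-by: .\{1,\} <.\{1,\}@.\{1,\}>$'
}

# Example message in the shape produced by `git commit -s`;
# the name and address are placeholders.
message='Fix typo in INSTALL.md

Signed-off-by: Jane Doe <jane.doe@example.com>'

if printf '%s\n' "$message" | has_signoff; then
    echo "signed off"
else
    echo "missing sign-off"
fi
```

Inside a repository, the same check could be applied to the latest commit with `git log -1 --format=%B | has_signoff`.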
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
```
2 changes: 1 addition & 1 deletion LICENSE
@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]
Copyright 2020 Dell Inc. or its subsidiaries. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
12 changes: 9 additions & 3 deletions README.md
@@ -1,11 +1,17 @@
# Omnia
<img src="docs/images/omnia-logo.png" width="500px">

![GitHub](https://img.shields.io/github/license/dellhpc/omnia) ![GitHub issues](https://img.shields.io/github/issues-raw/dellhpc/omnia) ![GitHub release (latest by date including pre-releases)](https://img.shields.io/github/v/release/dellhpc/omnia?include_prereleases) ![GitHub last commit (branch)](https://img.shields.io/github/last-commit/dellhpc/omnia/devel) ![GitHub commits since tagged version](https://img.shields.io/github/commits-since/dellhpc/omnia/omnia-v0.2/devel)

#### Ansible playbook-based deployment of Slurm and Kubernetes on Dell EMC PowerEdge servers running an RPM-based Linux OS

Omnia (Latin: all or everything) is a deployment tool to turn Dell EMC PowerEdge servers with RPM-based Linux images into a functioning Slurm/Kubernetes cluster.

## Omnia Documentation
For Omnia documentation, including installation and contribution instructions, see [docs](docs/README.md).
For Omnia documentation, including installation and contribution instructions, please see the [website](https://dellhpc.github.io/omnia).

### Current maintainers:
## Current maintainers:
* Lucas A. Wilson (Dell Technologies)
* John Lockman (Dell Technologies)

## Omnia Contributors:
<img src="docs/images/delltech.jpg" height="150px" alt="Dell Technologies"> <img src="docs/images/pisa.png" height="150px" alt="Universita di Pisa">
6 changes: 6 additions & 0 deletions docs/CONTRIBUTORS.md
@@ -0,0 +1,6 @@
# Omnia Maintainers
- Luke Wilson and John Lockman (Dell Technologies)
<img src="images/delltech.jpg" height="90px" alt="Dell Technologies">

# Omnia Contributors
<img src="images/delltech.jpg" height="90px" alt="Dell Technologies"> <img src="images/pisa.png" height="100px" alt="Universita di Pisa">
92 changes: 66 additions & 26 deletions docs/INSTALL.md
@@ -1,7 +1,5 @@
# Installing Omnia

## TL;DR

## TL;DR Installation

### Kubernetes
Install Kubernetes and all dependencies
```
@@ -12,54 +10,96 @@ Initialize K8s cluster
```
ansible-playbook -i host_inventory_file kubernetes/kubernetes.yml --tags "init"
```

### Install Kubeflow
```
ansible-playbook -i host_inventory_file kubernetes/kubeflow.yaml
```

### Slurm
```
ansible-playbook -i host_inventory_file slurm/slurm.yml
```
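
The commands above take an Ansible inventory through `-i host_inventory_file`. A minimal sketch of creating such a file in Ansible's INI inventory format follows. The group names `master` and `compute` and the hostnames are assumptions for illustration only, so check the `hosts:` entries in the playbooks for the groups they actually target.

```shell
#!/bin/sh
# Write a hypothetical inventory file. The group names and hostnames below
# are placeholders, not values taken from the Omnia playbooks.
cat > host_inventory_file <<'EOF'
[master]
head01.example.com

[compute]
node01.example.com
node02.example.com
EOF

# List the group headers that were written, as a quick sanity check.
grep '^\[' host_inventory_file
```

With the file in place, the playbook invocations shown above can be run unchanged from the same directory.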

## Build/Install
# Omnia
Omnia is a collection of [Ansible](https://www.ansible.com/) playbooks which perform:
* Installation of [Slurm](https://slurm.schedmd.com/) and/or [Kubernetes](https://kubernetes.io/) on servers already provisioned with a standard [CentOS](https://www.centos.org/) image.
* Installation of auxiliary scripts for administrator functions such as moving nodes between Slurm and Kubernetes personalities.

### Kubernetes

* Add additional repositories:
Omnia playbooks perform several tasks:
`common` playbook handles installation of software
* Add yum repositories:
- Kubernetes (Google)
- El Repo (nvidia drivers)
- Nvidia (nvidia-docker)
- El Repo (for Nvidia drivers)
- EPEL (Extra Packages for Enterprise Linux)
* Install common packages
* Install Packages from repos:
- bash-completion
- docker
- gcc
- python-pip
- docker
- kubelet
- kubeadm
- kubectl
- nfs-utils
- nvidia-detect
- yum-plugin-versionlock
* Restart and enable system level services
- Docker
- Kubelet

`computeGPU` playbook installs Nvidia drivers and nvidia-container-runtime-hook
* Add yum repositories:
- Nvidia (container runtime)
* Install Packages from repos:
- kmod-nvidia
- nvidia-x11-drv
- nvidia-container-runtime
- ksonnet (CLI framework for K8S configs)
* Enable GPU Device Plugins (nvidia-container-runtime-hook)
* Modify kubeadm config to allow GPUs as schedulable resource
* Start and enable services
- nvidia-container-runtime-hook
* Restart and enable system level services
- Docker
- Kubelet
* Initialize Cluster
* Configuration:
- Enable GPU Device Plugins (nvidia-container-runtime-hook)
- Modify kubeadm config to allow GPUs as schedulable resource
* Restart and enable system level services
- Docker
- Kubelet

`master` playbook
* Install Helm v3
* (optional) add firewall rules for Slurm and Kubernetes

Everything from this point on can be called by using the `init` tag
```
ansible-playbook -i host_inventory_file kubernetes/kubernetes.yml --tags "init"
```

`startmaster` playbook
* turn off swap
* Initialize Kubernetes
* Head/master
- Start K8S and pass the startup token to compute/slaves
- Initialize networking (Currently using WeaveNet)
- Setup K8S Dashboard
- Create dynamic/persistent volumes
* Compute/slaves
- Join k8s cluster
- Initialize software defined networking (Calico)

`startworkers` playbook
* turn off swap
* Join k8s cluster

`startservices` playbook
* Setup K8S Dashboard
* Add `stable` repo to helm
* Add `jupyterhub` repo to helm
* Update helm repos
* Deploy NFS client Provisioner
* Deploy Jupyterhub
* Deploy Prometheus
* Install MPI Operator


### Slurm
* Download and build Slurm source
* Install necessary dependencies
* Downloads and builds Slurm from source
* Install package dependencies
- Python3
- munge
- MariaDB
- MariaDB development libraries
* Build Slurm configuration files

2 changes: 1 addition & 1 deletion docs/PREINSTALL.md
@@ -5,7 +5,7 @@ Omnia assumes that prior to installation:
* Systems have a base operating system (currently CentOS 7 or 8)
* Network(s) has been cabled and nodes can reach the internet
* SSH Keys for `root` have been installed on all nodes to allow for password-less SSH
* Ansible is installed on the master node
* Ansible is installed on either the master node or a separate deployment node
```
yum install ansible
```
