This repository has been archived by the owner on Apr 27, 2022. It is now read-only.

Ansible Playbook for BMI Installation #153

Merged (15 commits) on Apr 27, 2018


djfinn14
Contributor

No description provided.

Contributor

@naved001 naved001 left a comment

mostly looks good. @chemistry-sourabh will do a comprehensive review after he learns ansible.

$ sudo apt-get install software-properties-common
$ sudo apt-add-repository ppa:ansible/ansible
$ sudo apt-get update
$ sudo apt-get install ansible
Contributor

If you put these commands in a code block and remove the dollar sign, we could just copy-paste all the commands into a terminal.

command 1
command 2

protocol: tcp
match: tcp
destination_port: 3260
jump: ACCEPT
Contributor

This rule wouldn't persist through a reboot. We should document this somewhere. For centos 7 and up, firewalld is what's used to manage iptables rules; there's no iptables service that can save the rules (it can be installed separately, but it didn't save the configuration for me :/)

Is there a way to set up these rules using firewalld here?

https://www.rootusers.com/how-to-open-a-port-in-centos-7-with-firewalld/

Contributor Author

I removed iptables and added firewalld
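
For reference, a firewalld-based replacement for the iptables rule quoted above might look roughly like the task below. This is only a sketch: the task name is made up for illustration, and it assumes Ansible's `firewalld` module with the same port (3260/tcp) as the original rule. The task actually added in the PR may differ.

```yaml
# Hypothetical task; the name and layout are illustrative, not taken from the PR.
- name: Allow iSCSI traffic through firewalld
  firewalld:
    port: 3260/tcp      # same port as the original iptables rule
    permanent: true     # store the rule in the permanent configuration so it survives a reboot
    immediate: true     # also apply it to the running firewall right away
    state: enabled
  become: true
```

Setting `permanent: true` is what addresses the reboot concern raised in the review; `immediate: true` avoids needing a separate firewalld reload.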

become: true
when: ansible_distribution == 'CentOS' or ansible_distribution == 'Red Hat Enterprise Linux'

- name: Change SELinux to permissive for CentOS
Contributor

This wouldn't be permanent. Do we only want to set this during the installation? To make it permanent we need to edit some file and save it.

Contributor Author

From my testing, when using the selinux task in ansible it actually does change the file and it persists across reboots.
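
A minimal sketch of such a task, assuming the stock Ansible `selinux` module and the `policy: targeted` value visible in the diff. The `state` value and the `when` condition are assumptions based on the discussion, not a copy of the PR's code.

```yaml
# Sketch only: state and when are assumed from the review discussion.
- name: Change SELinux to permissive for CentOS
  selinux:
    policy: targeted
    state: permissive   # the module writes this to the SELinux config file as well as changing the running mode
  become: true
  when: ansible_distribution == 'CentOS'
```

Because the module edits the configuration file (by default /etc/selinux/config) rather than only calling `setenforce`, the setting is expected to persist across reboots, which matches the testing described above.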

@apoorvemohan
Collaborator

@pgrosu Could you please review this one?

@radonm

radonm commented Nov 2, 2017

The README needs an update. It says "The Bare Metal Imaging (BMI) is a core component of the Massachusetts Open Cloud", but it is the Mass Open Cloud now...

@pgrosu

pgrosu commented Nov 2, 2017

Hi Dan,

This is a great start, and if you prefer you can send my requests via email in private. Since this is not part of Travis, could you please provide me with the two Ubuntu and CentOS preconfiguration environments for both - these can be VMs on a specific deployment - and the step-by-step details on how the settings for all the necessary configurations and minimum version restrictions. Will the Yaml files (main.yml, site.yml, etc) run as is without any changes? Where was this tested on? For which environment were the DHCP ranges created? Either some step-by-step documentation or information would be needed for me to add to the appropriate configuration entries of the UAT framework for each of these test scenarios. As Rado indicated, if we look at the README file, statements like the following don't give me confidence in what I should do:

  1. Modify bmi_config.cfg to match whatever your current HIL and Ceph setup is.

  2. Modify dnsmasq.conf within roles/dhcp/tasks/main.yml to match your requirements.

  3. Comment out any of the roles you don't want run in site.yml

( The above was taken from: https://github.com/djfinn14/ims/blob/0eb117f424bc94e86cefbd16dc1dd9aa69aa41f9/scripts/install/production/README.md )

I am happy to test, but I would like some documentation similar to what I provided with my manuals to perform validations for deployment. I'm not trying to be a pain, but I'm swamped and would not like to start guessing. We need to maintain the nice predictability we initiated this summer, where we were only document-driven. In fact, if there is no clear documentation we should not accept PRs.

I attached the two BMI manuals as a guide and reference:

Thanks,
Paul

@radonm radonm left a comment

See comments about ceph packages install

become: true

- name: Install cephlibs
pip:

Why are you using pip for this? Packages should come from yum repos. On CentOS you can get the repo with
yum -y install http://download.ceph.com/rpm-luminous/el7/noarch/ceph-release-1-1.el7.noarch.rpm
and then yum install whatever you need from Ceph.

On RHEL there is some code in the dev scripts using RHCS 1.2; if you change that to 2.x you will get the up-to-date packages, or you can use upstream, the same as above on CentOS.

Does pip install the latest dev code for Ceph? Latest dev != production
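
Expressed as Ansible tasks, the yum-based approach suggested here could look something like the sketch below. The repository URL is the one from the comment; the package names `python-rados` and `python-rbd` are assumptions about what the Ceph repository provides and should be checked against the repo before use.

```yaml
# Sketch only: package names are assumed, not confirmed by the PR.
- name: Add the upstream Ceph release repository
  yum:
    name: http://download.ceph.com/rpm-luminous/el7/noarch/ceph-release-1-1.el7.noarch.rpm
    state: present
  become: true
  when: ansible_distribution == 'CentOS'

- name: Install the Ceph Python bindings from the repository
  yum:
    name: "{{ item }}"
    state: present
  become: true
  with_items:
    - python-rados   # assumed package name
    - python-rbd     # assumed package name
  when: ansible_distribution == 'CentOS'
```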

Contributor Author

As far as I can tell, that package actually isn't a ceph package; it is a package created by a 3rd party that provides rados and rbd python bindings to connect BMI to an existing ceph cluster. I also believe the only way to install it is through pip.

Member

https://pypi.python.org/pypi/python-cephlibs/ is what you guys use? Hmmm... two-year-old code marked as deprecated? You should probably use the official rados/rbd bindings (whose source code is here: https://github.com/ceph/ceph/tree/master/src/pybind, and which are listed on PyPI simply as "rados" and "rbd"). I'll make an issue about it. (But for now, what you're doing is fine.)

@djfinn14
Contributor Author

djfinn14 commented Nov 7, 2017

@pgrosu I sent you an email with this but also want to have it here:

> Could you please provide me with the two Ubuntu and CentOS preconfiguration environments for both - these can be VMs on a specific deployment - and the step-by-step details on how the settings for all the necessary configurations and minimum version restrictions

I am not entirely sure what you want from me here. You need a clean VM (CentOS, RHEL or Ubuntu) that is set up to communicate with a Ceph cluster and HIL. I personally tested it in PRB by getting a clone of the bmi-dev VM, doing my best to wipe all of the packages and BMI setup within it, and then running the playbook.

> Will the Yaml files (main.yml, site.yml, etc) run as is without any changes?

Yes, the YAML files can run without any changes, but, as stated, you will want to make changes to the files I recommended so that it installs correctly according to your environment.

> Where was this tested on? For which environment were the DHCP ranges created?

I first tested this on my kumo VMs. I had a CentOS and an Ubuntu VM that I could rebuild. Those tests were just to make sure things like the tgt and dnsmasq services were getting started. You can technically run the dev install scripts for Ceph and HIL and then copy bmi_config.cfg.test into bmi_config.cfg and run the Ansible playbook if you want to have a self-contained "toy" setup to see how it runs. My real test came on that cloned BMI-dev VM I mentioned earlier. I saved the bmiconfig file and dnsmasq config file and did my best to wipe everything else, then ran the playbook and tested to make sure I could run the normal BMI commands such as adding an image to the database, listing the database, and provisioning/deprovisioning a node.

> Modify bmi_config.cfg to match whatever your current HIL and Ceph setup is.

If you actually look at the bmi_config.cfg file, you can see it has instructions on what to put for each field, and there is a bmi_config.cfg.test file that has example settings.

> Modify dnsmasq.conf within roles/dhcp/tasks/main.yml to match your requirements.

If you look at roles/dhcp/tasks/main.yml you can see there are pre-filled settings for the dnsmasq.conf. You can keep the defaults, or you may want to change things like the interface you are using.

> Comment out any of the roles you don't want run in site.yml

If you open the site.yml file you can see there are 3 roles listed. I tried to make each role self-contained, so if you already had tgt set up, for example, you would comment out "- tgt" and then run the playbook, and it would only install the dhcp and bmi roles.
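
As an illustration, a site.yml with the tgt role commented out might look like the sketch below. The three role names come from the description above; `hosts: all` and the order of the roles are assumptions, since the actual file is not shown in this conversation.

```yaml
# site.yml (sketch): hosts value and role order are assumed.
- hosts: all
  roles:
    # - tgt    # commented out because tgt is already set up on this host
    - dhcp
    - bmi
```

Running the playbook with this file would then apply only the dhcp and bmi roles.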

Let me know if this answers your questions.

@pgrosu

pgrosu commented Nov 8, 2017

Hi Dan (@djfinn14),

I am in the middle of a couple of hard research problems that I am working through and that are taking most of my time, so I'll give a quick overview of what I am asking. You have done a lot of great work here, but now there is one more bridge that needs crossing. In my experience across different software projects, the easiest way I have found for them to grow their user-base is by having a clearly guided transition to implementation from a minimal starting point. This means that you have to think like a new user, and thus educate and guide your prospective users from start to finish. That undoubtedly takes time and work beyond a set of configurations and a Readme file. Imagine you are a new user who sees our MOC/IMS Github location, and wants to better understand why such an Ansible playbook implementation is important, how to test it from a minimal starting point and how it will help them. I'm not saying explain everything, but if you pick a person who is new to our project or the MOC, and provide him/her with your set of instructions, would they be able to reproduce them without Googling or consulting other resources? Do they understand the connection to the rest of the project? Would all users get the same result? This is a foundation of system validation. Since this is not yet part of a smoke-test on a continuous-integration platform, this is even more pertinent.

Hope it makes sense and is helpful,
Paul

@naved001
Contributor

naved001 commented Nov 8, 2017

@pgrosu You could still review the ansible script nonetheless. Everything doesn't have to be blocked on just one thing.

@pgrosu

pgrosu commented Nov 10, 2017

@naved001 I understand what you are saying, but we want to spend a bit more time at the beginning to save us from simple, overlooked gaps as the project grows - otherwise this becomes more internal knowledge, which has a high probability of shrinking the user-base over time. It is okay to have a human check as a secondary check, driven by a set of SOPs (Standard Operating Procedures) as a primary set of operational semantics when performing functional testing in order to guarantee repeatability. That is why we initiated that process through a first set of manuals/guides we created over the summer. Over time we want those to become automated as a large set of tests/scenarios for continuous integration that is more thorough than Travis, which would encompass things such as system validation.

@djfinn14 If you have time after today's meeting we can sit together for some of this.

@naved001
Contributor

naved001 commented Nov 15, 2017

@pgrosu
Could you make a list of things that you want @djfinn14 to do to get your approval on this PR? Please be as specific as you can and keep it simple. Once you pin down an exact set of requirements, we can work on them one at a time. Meanwhile, you could review the Ansible script itself (the main meat of this PR).

Just keep in mind that this script is aimed at people with sufficient/reasonable know-how of the linux world.

Collaborator

@apoorvemohan apoorvemohan left a comment

iPXE and PXE are not being set up by these Ansible scripts.

sudo yum install ansible
```

2. Add your hosts to the ansible hosts file (/etc/ansible/hosts)
Collaborator

How about having an example here of how to append the hostname to /etc/ansible/hosts? Not sure if it makes sense to have one here.

e.g.
#ungrouped localhost for BMI installation
192.168.122.76

environment:
HIL_USERNAME: hil
HIL_PASSWORD: secret
with_items:
Collaborator

For Python 2.7.5 I had to execute "pip install requests urllib3 pyOpenSSL --force --upgrade" on "CentOS Linux release 7.4.1708 (Core)" to install BMI.
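
If this workaround were folded into the playbook, it could be expressed roughly as the task below. This is a hedged sketch using Ansible's `pip` module; the task name is invented, and mapping the `--force` flag from the command above to `--force-reinstall` is an assumption.

```yaml
# Sketch only: not part of the PR; flag mapping is assumed.
- name: Upgrade Python packages needed by BMI on CentOS 7.4
  pip:
    name: "{{ item }}"
    state: latest                   # corresponds to pip's --upgrade
    extra_args: --force-reinstall   # assumed equivalent of the --force flag in the manual command
  become: true
  with_items:
    - requests
    - urllib3
    - pyOpenSSL
```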

3. Modify bmi_config.cfg to match whatever your current HIL and Ceph setup is.

4. Modify dnsmasq.conf within roles/dhcp/tasks/main.yml to match your requirements.

Collaborator

add instruction to modify HIL credentials in scripts/install/production/roles/bmi/tasks/main.yml

- name: Bootstrap the database
command: "{{ item }}"
environment:
HIL_USERNAME: hil
Collaborator

The HIL environment variables were not set for me after the installation completed successfully.
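
This is expected behaviour for a task-level `environment:` block, which only applies to the process run by that task and does not change the user's shell. One way to make the variables available after installation is to append exports to the user's shell startup file, sketched below with Ansible's `lineinfile` module. The task name, the `~/.bashrc` path and the item layout are assumptions for illustration; the values are the placeholders from the diff above.

```yaml
# Sketch only: path and task name are assumed. With become enabled, ~ would resolve to root's home.
- name: Export HIL credentials in the user's .bashrc
  lineinfile:
    path: ~/.bashrc
    line: "export {{ item.name }}={{ item.value }}"
    create: true        # create the file if it does not exist yet
  with_items:
    - { name: HIL_USERNAME, value: hil }
    - { name: HIL_PASSWORD, value: secret }
```

The exports take effect in new login shells, or after running `source ~/.bashrc` in the current one.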


- name: Change SELinux to permissive for CentOS
selinux:
policy: targeted
Collaborator

selinux needs to be "disabled" for BMI to work

Contributor

Permissive is basically disabled with warnings on. No need to disable it.

Collaborator

Again, BMI pro not working with permissive. Tested in Kumo.

Contributor

from this page here

  • in permissive mode SELinux does not enforce its policy, but only logs what it would have blocked (or granted)
  • applications that are SELinux-aware might still behave differently with permissive mode than when SELinux is completely disabled

Based on the second point, I'll defer to you on this one. But do you know what selinux aware app we have that needs it disabled? TGT?

cc: @chemistry-sourabh

Collaborator

I don't completely understand the meaning of "selinux aware". I'll have to read on it.

- gcc
- cpan
- make
- firewalld
Collaborator

firewalld needs to be "disabled" for BMI to work

Contributor

No. Firewalld just manages iptables rules for you in CentOS. Why would you outright disable it? And Dan tested this setup, so it definitely works.

Collaborator

It is dropping DHCP requests during BMI provisioning.

Collaborator

I tested in kumo

Contributor

But that has nothing to do with firewalld itself. It only manages iptables rules.

If the firewalld service is disabled on the machine (by default, it's enabled on CentOS), then we have to make changes to iptables directly. But the problem with that is those changes aren't saved, since there's no iptables.service on CentOS anymore (you have to run the iptables command every time on boot, or add it to rc.local).

See if firewalld is running, and then see if iptables has rules for dhcp port(s)?

Collaborator

I suppose allowing ports 67 and 68 should work when firewalld is running. Needs to be tested, though.
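
A sketch of what opening the DHCP ports through firewalld could look like, assuming Ansible's `firewalld` module and that DHCP uses UDP on ports 67 and 68. The task name is hypothetical, and as noted above this has not been tested against BMI provisioning.

```yaml
# Sketch only: untested against BMI; task name is illustrative.
- name: Allow DHCP traffic through firewalld
  firewalld:
    port: "{{ item }}"
    permanent: true     # keep the rule across reboots
    immediate: true     # apply to the running firewall as well
    state: enabled
  become: true
  with_items:
    - 67/udp
    - 68/udp
```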

Updated the README to include instructions on modifying the hosts file, the HIL
credentials and bashrc. Also modified firewalld and selinux.
@apoorvemohan apoorvemohan removed the request for review from pgrosu February 13, 2018 22:15
@naved001 naved001 merged commit 02679ad into CCI-MOC:master Apr 27, 2018

6 participants