Ceph support #380
@nrb - the Travis build appears to have failed because it is testing the third-party roles that are added as submodules by this pull request. What needs to be done to exclude them from the testing?
@@ -49,7 +72,8 @@ fi
which openstack-ansible || ./scripts/bootstrap-ansible.sh

# ensure all needed passwords and tokens are generated
./scripts/pw-token-gen.py --file /etc/openstack_deploy/user_extras_secrets.yml
I thought the bootstrap does user_secrets, so running user_extras_secrets afterwards was intentional? So changing this back might be a patch/merge snafu?
@claco - line 76, './scripts/pw-token-gen.py --file $RPCD_SECRETS', does the same thing as the line that was removed. pw-token-gen.py is only run by the bootstrap-aio.sh and run-upgrade.sh scripts in openstack-ansible.
037f3e1 to 571fab5
Thanks @git-harry. We will need to bump osad's SHA; however, I guess we'll first need to wait for https://review.openstack.org/#/c/209537/ to merge and get backported?
I'm loath to bump master's SHA until we get 11.0 actually released, but that's likely to be something for further discussion. In terms of excluding the external roles, we'll likely have to add a grep or find excluding those to the ansible-lint line. Probably the syntax checking line, too.
@nrb - in the end we needed to delete the roles before ansible-lint runs (https://github.com/git-harry/rpc-openstack/blob/ceph-squash/scripts/linting.sh#L51), because the alternative was to exclude 5 playbooks.
OK, that works too.
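For comparison, the filtering approach suggested earlier in this thread (excluding the external roles from the lint targets rather than deleting them) could look roughly like the sketch below. It is not what the pull request ended up doing, and the function name and the `roles/...` paths are hypothetical, chosen only for illustration.

```python
from pathlib import PurePosixPath


def lint_targets(paths, excluded_dirs):
    """Return the paths that do not live under any excluded directory."""
    excluded = [PurePosixPath(d) for d in excluded_dirs]
    kept = []
    for raw in paths:
        path = PurePosixPath(raw)
        # Skip a file if it is an excluded directory or sits anywhere below one.
        if any(ex == path or ex in path.parents for ex in excluded):
            continue
        kept.append(raw)
    return kept


# Hypothetical layout: one third-party role directory and one in-tree role.
candidates = [
    "roles/ceph-common/tasks/main.yml",
    "roles/rpc_maas/tasks/main.yml",
]
print(lint_targets(candidates, ["roles/ceph-common"]))
```

Matching on path components rather than string prefixes avoids accidentally excluding a sibling directory whose name merely starts with the same text.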
571fab5 to 5af02e0
Looks like the ansible-ceph-common SHA will need to be bumped to include https://github.com/ceph/ansible-ceph-common/commit/92f9f72bf94d79a4c988fea7c2d7a2da19b4edc7.
5af02e0 to 050fe86
Hey guys, I'm getting the following error during the
failed: [d34d-test1_glance_container-6879f982] => (item=({'client': [u'glance'], 'component': 'glance_api', 'service': ['glance-api']}, 'python-ceph')) => {"attempts": 5, "failed": true, "item": [{"client": ["glance"], "component": "glance_api", "service": ["glance-api"]}, "python-ceph"]}
stderr: E: There are problems and -y was used without --force-yes
stdout: Reading package lists...
Building dependency tree...
Reading state information...
The following extra packages will be installed:
libboost-system1.54.0 libboost-thread1.54.0 libcephfs1 liblttng-ust-ctl2
liblttng-ust0 libnspr4 libnss3 libnss3-nssdb librados2 librbd1 liburcu1
python-cephfs python-rados python-rbd
The following NEW packages will be installed:
libboost-system1.54.0 libboost-thread1.54.0 libcephfs1 liblttng-ust-ctl2
liblttng-ust0 libnspr4 libnss3 libnss3-nssdb librados2 librbd1 liburcu1
python-ceph python-cephfs python-rados python-rbd
0 upgraded, 15 newly installed, 0 to remove and 0 not upgraded.
Need to get 11.9 MB of archives.
After this operation, 30.0 MB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
libcephfs1 librados2 librbd1 python-rados python-rbd python-cephfs
python-ceph
msg: Task failed as maximum retries was encountered
Probably needs something like this:
- apt-key: url=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc state=present
but not sure which play needs that.
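When triaging a failure like the one above, it can help to pull out exactly which packages apt refused to authenticate. The following is a minimal sketch, assuming the task's stdout is available as a string; the function name is hypothetical, and the parser relies only on the warning line shown in the log and on Debian package names being lowercase.

```python
import re

# Debian package names use lowercase letters, digits, '+', '-' and '.'.
PACKAGE_NAME = re.compile(r"^[a-z0-9][a-z0-9+.-]+$")
MARKER = "WARNING: The following packages cannot be authenticated!"


def unauthenticated_packages(apt_output):
    """Collect the package names listed after apt's authentication warning."""
    packages = []
    collecting = False
    for line in apt_output.splitlines():
        if line.strip() == MARKER:
            collecting = True
            continue
        if not collecting:
            continue
        tokens = line.split()
        # The list ends at the first line that is not purely package names.
        if not tokens or not all(PACKAGE_NAME.match(t) for t in tokens):
            break
        packages.extend(tokens)
    return packages


sample = """After this operation, 30.0 MB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
  libcephfs1 librados2 librbd1 python-rados python-rbd python-cephfs
  python-ceph
msg: Task failed as maximum retries was encountered"""
print(unauthenticated_packages(sample))
```

Every package in the result comes from the Ceph repository, which is consistent with the guess that the repository's signing key is missing.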
@d34dh0r53 - I think this is the upstream issue that will be addressed by https://review.openstack.org/#/c/228894/
@git-harry cherry picking that patch did the trick, but now I'm running into another issue where one of my ceph_osd containers is creating a journal file that is filling up the disk:
This may be related to being on an AIO. Are there configuration parameters that need to be adjusted for an AIO? The one setting I found in
Thanks
@d34dh0r53 - when building an AIO some of the variables do get updated/added. The journal size is being set to 5120 when on an AIO [1]. It looks like you are using a flavour with an ephemeral disk of at least 250 GB; when that is the case, the OSA scripts put the containers into logical volumes. All prior testing had been done with smaller instances or with dedicated OSD hosts.
[1] - https://github.com/rcbops/rpc-openstack/pull/380/files#diff-93518fda10b0403d3c5c20b4df4740a6R54
@d34dh0r53, what @git-harry said. I've always built this on an AIO where LVM isn't used for containers, so never hit this. I can see it being a problem though since our container_size is limited to 5G. Should we just decrease the journal size to 1G so that this should work irrespective of AIO instance? Also @git-harry, can you please update scripts/deploy.sh to overwrite
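The arithmetic behind the problem can be sketched as follows, assuming the journal size is expressed in megabytes (as Ceph's `osd journal size` option is) and that the 5G container limit means 5 GiB. The function name is hypothetical.

```python
MIB_PER_GIB = 1024


def journal_headroom_mib(journal_size_mib, container_size_gib):
    """Space left in the container after the journal, in MiB (0 or less means it cannot fit)."""
    return container_size_gib * MIB_PER_GIB - journal_size_mib


container_gib = 5        # container_size limit quoted above, read as GiB (assumption)
aio_journal_mib = 5120   # journal size set for an AIO, as quoted above
proposed_journal_mib = 1024  # the 1G journal proposed above

print(journal_headroom_mib(aio_journal_mib, container_gib))       # 0
print(journal_headroom_mib(proposed_journal_mib, container_gib))  # 4096
```

With the AIO value the journal alone accounts for the whole logical volume, which matches the disk filling up; a 1G journal would leave roughly 4 GiB for everything else in the container.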
1eb1329 to ea19f2a
Does this need to be squashed?
@prometheanfire - I think the first three commits should be kept separate. The work logically broke along those lines so I thought it made things easier to view. The rest of the commits are recent changes that I didn't squash down to make it clear that new changes had been introduced. The pull request has been around for so long I didn't want anyone who'd previously reviewed it to think any changes in the SHAs were just due to rebasing. If you guys would prefer it squashed, I don't have an issue with reducing it down to three commits.
I like the idea of squashing it into 3 commits. @git-harry do you want to include #430 in this PR, since it's documentation that probably should accompany the work, or would you prefer to leave it as is in a separate commit?
👍
cool, lgtm 👍 but needs a rebase
👍
This commit adds support for deploying a Ceph cluster using rpc-openstack, as well as setting the appropriate configuration for integration with os-ansible-deployment's Ceph capabilities.

This commit makes use of the following roles to deploy a Ceph cluster:
https://github.com/ceph/ansible-ceph-common
https://github.com/ceph/ansible-ceph-mon
https://github.com/ceph/ansible-ceph-osd

The roles are submodules and so should get automatically cloned at the same time as os-ansible-deployment.

The default configuration is designed to deploy three mon containers, one on each controller, with separate physical hosts for OSDs. The Ceph pools and users required by OpenStack are automatically created as part of the setup.

Ceph can be deployed on an AIO; when this is done, three OSD containers are created in an attempt to provide a more realistic representation of a Ceph cluster.

The rbd pool is created automatically when a new Ceph cluster is deployed. This pool is not used by an OpenStack deployment but does consume pgs and so is removed if it exists and is empty.

This commit sets the following in user_extras_variables.yml:
pool_default_size: 3
pool_default_min_size: 2
mon_osd_full_ratio: .90
mon_osd_nearfull_ratio: .80
raw_multi_journal: true
journal_size: 80000
secure_cluster: true
secure_cluster_flags:
  - nodelete

The role defaults are:
pool_default_size: 2
pool_default_min_size: 1
mon_osd_full_ratio: .95
mon_osd_nearfull_ratio: .85
raw_multi_journal: false
journal_size: 0
secure_cluster: false
secure_cluster_flags:
  - nopgchange
  - nodelete
  - nosizechange

The Ceph playbooks have been excluded from ansible-lint because their inclusion causes the third-party Ceph roles to be tested and they fail the ansible-lint checks.

Co-Authored-By: git-harry <git-harry@live.co.uk>
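To make the difference between the overrides and the role defaults in this commit message easier to see, here is a small sketch that compares the two sets of values. The values are copied from the commit message; the dictionary and function names are illustrative only and do not exist in the roles.

```python
# Values copied from the commit message; names are illustrative only.
ROLE_DEFAULTS = {
    "pool_default_size": 2,
    "pool_default_min_size": 1,
    "mon_osd_full_ratio": 0.95,
    "mon_osd_nearfull_ratio": 0.85,
    "raw_multi_journal": False,
    "journal_size": 0,
    "secure_cluster": False,
    "secure_cluster_flags": ["nopgchange", "nodelete", "nosizechange"],
}
RPC_OVERRIDES = {
    "pool_default_size": 3,
    "pool_default_min_size": 2,
    "mon_osd_full_ratio": 0.90,
    "mon_osd_nearfull_ratio": 0.80,
    "raw_multi_journal": True,
    "journal_size": 80000,
    "secure_cluster": True,
    "secure_cluster_flags": ["nodelete"],
}


def changed_settings(defaults, overrides):
    """Map each overridden key to a (default, override) pair when the two differ."""
    return {
        key: (defaults.get(key), value)
        for key, value in overrides.items()
        if defaults.get(key) != value
    }


def replicas_that_can_fail(size, min_size):
    """Replicas a pool can lose while still meeting min_size."""
    return size - min_size


for key, (default, override) in changed_settings(ROLE_DEFAULTS, RPC_OVERRIDES).items():
    print(f"{key}: {default!r} -> {override!r}")
```

Every listed setting differs from its default. Both configurations tolerate the loss of one replica before a pool drops below its minimum size, but the overrides keep two copies available at that point rather than one.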
This commit adds the following functionality:
- A new plugin for use in monitoring a ceph cluster
- An Ansible library for gathering OSD-host facts
- Updates to the setup-maas.yml playbook and rpc_maas role to deploy and configure the ceph_monitoring.py plugin.

Co-Authored-By: Matt Thompson <mattt@defunct.ca>
Co-Authored-By: Hugh Saunders <hugh@wherenow.org>
Add logstash template file for ceph. Add a beaver ceph.conf file for ceph containers and hosts.

Due to the permissions on /var/log/ceph.log and /var/log/ceph-audit.log it isn't possible to capture these at this point, and Ceph offers no options to adjust this, so we will need an alternative solution. This applies to the mons only.
Good work peeps. I'm going to merge, so we can backport to kilo and pop some tags.
Ceph support (cherry picked from commit 79a472b)
This adds support for Ceph, including: