set osd max memory based on host memory #3113
Conversation
roles/ceph-config/tasks/main.yml
Outdated
```yaml
set_fact:
  num_osds: "{{ devices | length }}"
when:
  - devices|length > 0 and (osd_scenario == 'collocated' or osd_scenario == 'non-collocated')
```
`(osd_scenario == 'collocated' or osd_scenario == 'non-collocated')` should be on a new line:

```yaml
when:
  - devices|length > 0
  - (osd_scenario == 'collocated' or osd_scenario == 'non-collocated')
```
I used `and` based on neha-ojha/ceph-ansible@9e203db#r30481853. I'll put them on two lines.
roles/ceph-config/tasks/main.yml
Outdated
```yaml
set_fact:
  num_osds: "{{ lvm_volumes | length }}"
when:
  - lvm_volumes|length > 0 and osd_scenario == 'lvm'
```
Same here, two lines are needed in the when condition.
roles/ceph-config/tasks/main.yml
Outdated
```yaml
set_fact:
  num_osds: "{{ (ceph-volume lvm batch --report devices | from_json()).osds | length }}"
when:
  - devices|length > 0 and osd_scenario == 'lvm-batch'
```
Same here, two lines are needed. Also, there is no lvm-batch scenario, this should still check for the lvm scenario.
roles/ceph-config/tasks/main.yml
Outdated
```yaml
- name: count number of osds for lvm-batch scenario
  set_fact:
    num_osds: "{{ (ceph-volume lvm batch --report devices | from_json()).osds | length }}"
```
The `ceph-volume lvm batch --report` command will need to be run in a separate task using the `command` module, with the output from the command registered into a variable. That variable can then be used here to set the `num_osds` fact.
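A minimal sketch of that two-step pattern (task names and the `lvm_report` variable are illustrative, not from the PR; note that a registered command's output lives under `.stdout`):

```yaml
# Sketch only: run the report in one task, parse it in the next.
- name: run ceph-volume batch report
  command: "ceph-volume lvm batch --report --format=json {{ devices | join(' ') }}"
  register: lvm_report
  changed_when: false

- name: set num_osds from the registered output
  set_fact:
    # the JSON report is in .stdout and must be parsed with from_json
    num_osds: "{{ (lvm_report.stdout | from_json).osds | length }}"
```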
roles/ceph-config/tasks/main.yml
Outdated
```yaml
- name: count number of osds for lvm-batch scenario
  set_fact:
    num_osds: "{{ (ceph-volume lvm batch --report devices | from_json()).osds | length }}"
  when:
    - devices|length > 0
    - osd_scenario == 'lvm-batch'
```
This needs to be - osd_scenario == 'lvm'
yeah, updating it in my next commit with the lvm-batch changes.
roles/ceph-config/tasks/main.yml
Outdated
```yaml
- name: count number of osds for lvm-batch scenario
  set_fact:
    num_osds: "{{ (ceph-volume lvm batch --report devices | from_json()).osds | length }}"
```
Adding this here would mean it'd be run on all hosts; I think it must be run on an OSD node.
By the way, you have to do it in two steps:

```yaml
- name: get number of osds for lvm-batch scenario
  command: "{{ docker_exec_cmd }} ceph-volume lvm batch --report devices"
  register: lvm_batch_devices
  when:
    - devices|length > 0 and osd_scenario == 'lvm-batch'

- name: set_fact num_osds
  set_fact:
    num_osds: "{{ (lvm_batch_devices.stdout | from_json).osds | length }}"
  when:
    - devices|length > 0 and osd_scenario == 'lvm-batch'
```
@guits would we just use `delegate_to` and let the first OSD host take care of it? We'd probably only need to run this once, I'd imagine, not on every OSD host.
@andrewschoen that's correct.
I think we must use something other than `docker_exec_cmd` in that case.
(`roles/ceph-defaults/tasks/facts.yml:25`: `docker_exec_cmd: "docker exec ceph-mon-{{ hostvars[groups[mon_group_name][0]]['ansible_hostname'] }}"`)
```jinja
{% if osd_objectstore == 'bluestore' %}
[osd]
osd memory target = {{ osd_memory_target }}
{% if is_hci is defined %} # is_hci should be defined in group_vars/all.yml
```
you have to add this variable in ceph-defaults/defaults/main.yml
roles/ceph-config/tasks/main.yml
Outdated
```yaml
- name: count number of osds for lvm-batch scenario
  set_fact:
    num_osds: "{{ (ceph-volume lvm batch --report devices | from_json()).osds | length }}"
```
When #3111 merges this command will also need the --osds-per-device osds_per_device flag added.
You're also missing --format=json here.
Force-pushed from d38e1e3 to f8bb932
@guits @andrewschoen I've updated the PR, could you take another look please.
roles/ceph-config/tasks/main.yml
Outdated
```yaml
command: "ceph-volume lvm batch --report --format=json {{ devices| join(' ') }}"
register: lvm_batch_devices
run_once: true
delegate_to: "{{ groups[osd_group_name][0]"
```
Missing the closing }} here.
fixing it. I see that #3111 got merged, I'll add --osds-per-device osds_per_device as well.
Force-pushed from aa89972 to a235e8f
@neha-ojha please, could you squash your commits? Thanks!
roles/ceph-config/tasks/main.yml
Outdated
```yaml
  - osd_scenario == 'lvm'

- name: get number of osds for lvm-batch scenario
  command: "ceph-volume lvm batch --report --format=json --osds-per-device osds_per_device {{ devices| join(' ') }}"
```
@andrewschoen is this supposed to run only in non-containerized deployments?
(see: #3113 (comment))
I don't know, but I suspect it should work on containerized deployments as well. docker_exec_cmd is expecting a mon at this point, right?
This is exactly why I'm raising this.
It means we would need to set a new fact like `docker_exec_cmd_osd`.
@neha-ojha you have to add this task before L26
```yaml
- name: set_fact docker_exec_cmd_osd
  set_fact:
    docker_exec_cmd_osd: "docker exec ceph-osd-{{ hostvars[groups[osd_group_name][0]]['ansible_hostname'] }}"
  when:
    - containerized_deployment
    - groups.get(osd_group_name, []) | length > 0
```

then

```yaml
command: "{{ docker_exec_cmd_osd | default('') }} ceph-volume lvm batch --report --format=json --osds-per-device osds_per_device {{ devices| join(' ') }}"
```
@guits will this `command:` be added as part of a new task for `containerized_deployment`, or will it replace https://github.com/ceph/ceph-ansible/pull/3113/files#diff-408c9554f34e961d163f68b645ee6da3R27?
@guits updated it, let me know if it looks alright. Managed to get 5d9f269#diff-2857b72f37476e1b5d16e7e81783ecbaL241 as a by-product of that change.
Force-pushed from dc2c0bf to 3478343
@guits squashed.
jenkins test pipeline
Force-pushed from 5fbc159 to 5d9f269
```jinja
{% if is_hci is defined %} # is_hci should be defined in group_vars/all.yml
{% if num_osds > 0 and ansible_memtotal_mb * hci_safety_factor / num_osds > osd_memory_target %} # hci_safety_factor is the safety factor for HCI deployments
osd memory target = {{ ansible_memtotal_mb * hci_safety_factor / num_osds }}
{% endif %}
```
@neha-ojha something is wrong here, I think you want a {% elif %}?
don't pay attention to my last comment, I think you are just missing a {% endif %}
yeah, I noticed it when I wrote a pythonic version of these ifs :) fixing it
@neha-ojha perhaps I'm missing something, but as it is:
if `osd_objectstore == 'bluestore'` and `is_hci` is defined, you are going to add `osd memory target` twice in the config, at L156 plus either L159 or L163.
I need the default value of `osd memory target` to be set to 4000000000. If those conditions are met, we will override this value; otherwise not.
@neha-ojha Isn't the following what you want to achieve ?
```jinja
{% if osd_objectstore == 'bluestore' %}
[osd]
{% if is_hci and num_osds > 0 %}
{% if ansible_memtotal_mb * hci_safety_factor / num_osds > osd_memory_target %}
# hci_safety_factor is the safety factor for HCI deployments
{% set _osd_memory_target = (ansible_memtotal_mb * hci_safety_factor / num_osds) %}
{% elif ansible_memtotal_mb * non_hci_safety_factor / num_osds > osd_memory_target %}
# non_hci_safety_factor is the safety factor for dedicated nodes
{% set _osd_memory_target = (ansible_memtotal_mb * non_hci_safety_factor / num_osds) %}
{% endif %}
{% endif %}
osd memory target = {{ _osd_memory_target | default(osd_memory_target) }}
{% endif %}
```
Force-pushed from 5d9f269 to 5a0f5bd
@markhpc do you think the default of using 0.5 of the node memory is too conservative? In your earlier graphs it looked like RSS could be ~10% over the target.
One thing I noticed is that the behavior is very different on my Ubuntu dev box and on incerta running CentOS 7. On my dev box, I saw variations in the 10-20% range. On CentOS the mapped usage tracked RSS very tightly, maybe more like 1%. I'm not sure why that is, other than perhaps the CentOS kernel was set up to reclaim memory much more aggressively. Is it possible to get memory usage from QE runs to see how it's been doing in master under different OSes?
To more directly answer your question: I don't think you'll have any problem with 0.7 unless there's a bunch of other stuff running on it (hyper-converged, or co-located rgw/mon/mgr/etc). The one case where that might not hold up is if the node has a very small amount of memory (say 4GB), in which case you might want a constant+ratio for the OS (maybe 1GB + 20% of remaining, or something).
```jinja
{% if osd_objectstore == 'bluestore' %}
[osd]
osd memory target = {{ osd_memory_target }}
{% if is_hci is defined %} # is_hci should be defined in group_vars/all.yml
```
`{% if is_hci is defined %}` will always return true;
`{% if is_hci %}` is enough here, I guess.
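To illustrate the difference (a sketch with illustrative values; once `is_hci` gets a default in `ceph-defaults`, it is always defined):

```jinja
{# Suppose group_vars sets: is_hci: false #}
{% if is_hci is defined %}taken even when is_hci is false{% endif %}
{% if is_hci %}taken only when is_hci is actually true{% endif %}
```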
Force-pushed from 578c7d2 to bb31e3e
Force-pushed from 07fa768 to 62b15fb
Force-pushed from 8e8fba6 to 1090009
ok, this is passing tests now.
Force-pushed from f3059ee to 59305fb
jenkins test pipeline
BlueStore's cache is sized conservatively by default, so that it does not overwhelm under-provisioned servers. The default is 1G for HDD, and 3G for SSD.

To replace the page cache, as much memory as possible should be given to BlueStore. This is required for good performance. Since ceph-ansible knows how much memory a host has, it can set

    bluestore cache size = max(total host memory / num OSDs on this host * safety factor, 1G)

Due to fragmentation and other memory use not included in bluestore's cache, a safety factor of 0.5 for dedicated nodes and 0.2 for hyperconverged nodes is recommended.

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1595003
Signed-off-by: Neha Ojha <nojha@redhat.com>
Co-Authored-by: Guillaume Abrioux <gabrioux@redhat.com>
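The sizing rule in that commit message could be sketched in Jinja as follows (a sketch under the commit message's assumptions; `safety_factor` stands for the 0.5 or 0.2 value, and the result is converted from MB to bytes):

```jinja
{# max(total host memory / num OSDs on this host * safety factor, 1G) #}
{% set per_osd_mb = ansible_memtotal_mb / (num_osds | int) * safety_factor %}
bluestore cache size = {{ (([per_osd_mb, 1024] | max) * 1048576) | int }}
```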
This PR is now breaking ceph-volume tests. Ideally, we would like to see some tests pass with devel (ceph-volume, ceph master, etc.).
This PR addresses https://bugzilla.redhat.com/show_bug.cgi?id=1595003 by adding an `osd_memory_target` option to ceph-ansible.

Signed-off-by: Neha Ojha <nojha@redhat.com>