
Improve System VM startup and memory usage [4.11] #3127

Closed
wants to merge 3 commits

Conversation

PaulAngus
Member

Description

To reduce the memory footprint and improve boot speed and predictability, the following changes have been made:

  • Add vm.min_free_kbytes to sysctl
  • periodically clear disk cache (depending on memory size)
  • only start guest services specific to hypervisor
  • use systemvm code to determine hypervisor type (not systemd)
  • start cloud service at end of post init rather than through systemd
  • reduce initial threads started for httpd
  • fix vmtools config file
  • disable all required services (do not start on boot)
  • start only required services during post init.
  • allow one-shot ACS configuration services to terminate after running.

Fixes: #3039
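A minimal sketch of how the sysctl and cache-clearing items above fit together. The `vm.min_free_kbytes=8192` value and the ~512 MB threshold are taken from the diff hunks in this PR; the wrapper functions themselves are illustrative, not the PR's actual code:

```shell
#!/bin/sh
# Illustrative wrapper around the two memory measures in this PR.

# Hosts at or below ~512 MB get periodic cache clearing; larger
# hosts are left to the kernel's own reclaim behaviour.
needs_cache_clear() {
    # $1: physical memory in kB
    [ "$1" -lt 513000 ]
}

apply_memory_tuning() {
    # Reserve a minimum pool of free pages so the kernel reclaims
    # page cache before it resorts to swapping.
    sysctl -w vm.min_free_kbytes=8192

    phymem=$(free | awk '/^Mem:/{print $2}')
    if needs_cache_clear "$phymem"; then
        # Flush dirty pages, then drop the clean page cache.
        sync && echo 1 > /proc/sys/vm/drop_caches
    fi
}
```

In the actual change the clearing runs periodically (presumably from cron or a timer), and only hosts under the threshold take the `drop_caches` path.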

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Screenshots (if appropriate):

How Has This Been Tested?

Changes have been subjected to regression testing and a burn-in test in which port-forwarding rules were updated continuously over a period of 3 days. No swapping occurred in a 256 MB RAM VR.

@PaulAngus
Member Author

@blueorangutan package

@blueorangutan

@PaulAngus a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2530

@PaulAngus PaulAngus changed the title Add cache backpressure to sysctl Improve System VM startup and memory usage [4.11] Jan 11, 2019
@PaulAngus
Member Author

@blueorangutan matrix

@PaulAngus
Member Author

@blueorangutan test matrix

@blueorangutan

@PaulAngus a Trillian-Jenkins matrix job (centos6 mgmt + xs71, centos7 mgmt + vmware65, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

# log = true

# Disables core dumps on fatal errors; they're enabled by default.
enableCoreDump = false
Member

Does this work with open-vm-tools, @PaulAngus? We don't install VMware Tools, unless you've changed that.

Member Author

Yes, this is a bug in the Debian distribution of open-vm-tools.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=889884

HYPERVISOR=$(</etc/cloudstack-agent_detected_hypervisor)
case $HYPERVISOR in
xen-pv|xen-domU)
systemctl stop ntpd
Member

We can have all three xen cases in this block.

@@ -583,14 +583,14 @@ setup_ntp() {
}

routing_svcs() {
echo "haproxy apache2" > /var/cache/cloud/enabled_svcs
echo "cloud nfs-common portmap" > /var/cache/cloud/disabled_svcs
echo "cloud haproxy apache2" > /var/cache/cloud/enabled_svcs
Member

On VRs we don't need to run any agent that is run by the cloud service.

Member Author

Yes, we could. I kept the separation in case it was useful in future (this is why I left the virtualbox case in, despite it not being required at this time).
I can combine the xen cases if that is preferred.
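Combining the xen cases as suggested might look like the sketch below. The detected-hypervisor file and the `xen-pv`/`xen-domU` labels come from the diff; `xen-hvm` is an assumed name for the third xen variant the reviewer refers to:

```shell
#!/bin/sh
# Sketch of a single branch covering all xen guest types,
# per the review suggestion. Labels other than xen-pv and
# xen-domU are assumptions, not confirmed by this PR.

is_xen() {
    case "$1" in
        xen-pv|xen-domU|xen-hvm) return 0 ;;
        *) return 1 ;;
    esac
}

HYPERVISOR=$(cat /etc/cloudstack-agent_detected_hypervisor 2>/dev/null)
if is_xen "$HYPERVISOR"; then
    # One branch handles every xen guest type (here: stop ntpd,
    # as in the diff hunk above).
    systemctl stop ntpd
fi
```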

@@ -108,6 +108,19 @@ function configure_services() {
systemctl disable strongswan
systemctl disable x11-common
systemctl disable xl2tpd
systemctl disable vgauth
systemctl disable sshd
Member

This may cause errors in some cases; ideally we want sshd enabled in all cases.

Member Author

My recollection is that sshd is actually only started after the firewall rules are established, otherwise there will be a period of vulnerability during startup.

@@ -68,9 +68,10 @@ function install_packages() {
python-flask \
haproxy \
radvd \
sharutils genisoimage aria2 \
sharutils genisoimage \
Member

This may break one of the register template features that uses aria2, @nvazquez can you comment?

Member Author

I believe the use of aria2 was scrapped.

Contributor

aria2 is used by KVM hosts with "bypass Sec.Stor." during template download.

Just tested Add template (from remote HTTP): the agent/Java downloads the template.
"Upload from local" uses POST, so it should not be relevant.

strongswan libcharon-extra-plugins libstrongswan-extra-plugins \
virt-what open-vm-tools qemu-guest-agent hyperv-daemons
virt-what open-vm-tools qemu-guest-agent hyperv-daemons \
haveged

Member

+1

@blueorangutan

Trillian test result (tid-3309)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 162513 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3127-t3309-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_public_ip_range.py
Intermittent failure detected: /marvin/tests/smoke/test_ssvm.py
Intermittent failure detected: /marvin/tests/smoke/test_templates.py
Intermittent failure detected: /marvin/tests/smoke/test_usage.py
Intermittent failure detected: /marvin/tests/smoke/test_volumes.py
Smoke tests completed. 64 look OK, 4 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_05_stop_ssvm Failure 913.86 test_ssvm.py
test_06_stop_cpvm Failure 914.11 test_ssvm.py
test_07_reboot_ssvm Error 913.33 test_ssvm.py
test_08_reboot_cpvm Error 913.30 test_ssvm.py
test_04_extract_template Failure 128.42 test_templates.py
ContextSuite context=TestISOUsage>:setup Error 0.00 test_usage.py
test_06_download_detached_volume Failure 138.70 test_volumes.py

@blueorangutan

Trillian test result (tid-3313)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 160913 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3127-t3313-xenserver-71.zip
Intermittent failure detected: /marvin/tests/smoke/test_public_ip_range.py
Intermittent failure detected: /marvin/tests/smoke/test_scale_vm.py
Intermittent failure detected: /marvin/tests/smoke/test_ssvm.py
Intermittent failure detected: /marvin/tests/smoke/test_templates.py
Intermittent failure detected: /marvin/tests/smoke/test_usage.py
Intermittent failure detected: /marvin/tests/smoke/test_volumes.py
Smoke tests completed. 63 look OK, 5 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_01_scale_vm Error 22.66 test_scale_vm.py
test_05_stop_ssvm Failure 921.40 test_ssvm.py
test_06_stop_cpvm Failure 919.59 test_ssvm.py
test_07_reboot_ssvm Error 931.04 test_ssvm.py
test_08_reboot_cpvm Error 931.10 test_ssvm.py
test_09_destroy_ssvm Error 1573.53 test_ssvm.py
test_04_extract_template Failure 128.27 test_templates.py
ContextSuite context=TestISOUsage>:setup Error 0.00 test_usage.py
test_06_download_detached_volume Failure 144.92 test_volumes.py

@blueorangutan

Trillian test result (tid-3310)
Environment: vmware-65 (x2), Advanced Networking with Mgmt server 7
Total time taken: 179447 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3127-t3310-vmware-65.zip
Intermittent failure detected: /marvin/tests/smoke/test_network.py
Intermittent failure detected: /marvin/tests/smoke/test_projects.py
Intermittent failure detected: /marvin/tests/smoke/test_public_ip_range.py
Intermittent failure detected: /marvin/tests/smoke/test_ssvm.py
Intermittent failure detected: /marvin/tests/smoke/test_templates.py
Intermittent failure detected: /marvin/tests/smoke/test_usage.py
Intermittent failure detected: /marvin/tests/smoke/test_volumes.py
Smoke tests completed. 63 look OK, 5 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_reboot_router Error 1224.49 test_network.py
test_05_stop_ssvm Failure 915.15 test_ssvm.py
test_06_stop_cpvm Failure 916.49 test_ssvm.py
test_07_reboot_ssvm Error 910.28 test_ssvm.py
test_08_reboot_cpvm Error 910.24 test_ssvm.py
test_04_extract_template Failure 182.59 test_templates.py
ContextSuite context=TestISOUsage>:setup Error 0.00 test_usage.py
test_06_download_detached_volume Failure 227.86 test_volumes.py

@rohityadavcloud
Member

Note: Debian 9.7 has been released; the build will fail due to an incorrect ISO URL.
https://www.debian.org/News/2019/20190123

@wido (Contributor) left a comment

I just have a few comments about the memory


# clear memory cache to ultimately reduce swapping

sync && echo 1 > /proc/sys/vm/drop_caches
Contributor

I understand what you are doing here, but it is the Linux kernel which will remove files from its cache if it needs this memory for something else.

I wouldn't mess with this.

@PaulAngus (Member Author), Mar 1, 2019

As the recommended amount of memory for a Debian host is 4 GB and we are working so far below that, I would say that we are operating well outside 'normal' practice. Personally, I have long questioned the use of Debian and would have thought FreeBSD a better choice. With respect to memory, the minimum for Debian is 96 MB.

At 256 MB RAM we can't really afford almost ANY caching of disk to memory. If we started at 4 GB then I wouldn't dream of touching this.

The specific, highly replicable issue that this is designed to fix is that the cached disk data is not given back; instead the host starts swapping memory to disk, quickly making the VR unusable due to the CPU usage that this causes (see #3039).

I hope you can see why this sub-optimal fix is proposed to remedy the massively sub-optimal host spec that the VR is trying to operate in.

I should add that this was 'discovered' and tested through manually monitoring the VR memory usage over time and determining that the disk cache was the 'culprit' for the growing memory footprint.
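The manual monitoring described above can be reproduced with a small sampler over `/proc/meminfo`. The field names are standard kernel ones; the sampling interval and output format are illustrative:

```shell
#!/bin/sh
# Sample the meminfo fields relevant to the VR swap problem:
# free memory, page cache size, and remaining swap.
sample_memory() {
    awk '/^(MemFree|Cached|SwapFree):/ { printf "%s %s kB  ", $1, $2 }
         END { print "" }' /proc/meminfo
}

# For long-running observation of cache growth, something like:
#   while :; do date; sample_memory; sleep 60; done
```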

Contributor

This is safe in general, and we should not be concerned with disk latency, since we are doing network I/O 99% of the time.


phymem=$(free|awk '/^Mem:/{print $2}')
if [ $phymem -lt 513000 ]; then
sync && echo 1 > /proc/sys/vm/drop_caches
Contributor

The same applies here: the kernel is the one that clears the page cache if it needs the memory for something else.

Member Author

As the recommended amount of memory for a Debian host is 4 GB and we are working so far below that, I would say that we are operating well outside 'normal' practice. Personally, I have long questioned the use of Debian and would have thought FreeBSD a better choice. With respect to memory, the minimum for Debian is 96 MB.

At 256 MB RAM we can't really afford almost ANY caching of disk to memory. If we started at 4 GB then I wouldn't dream of touching this.

The specific, highly replicable issue that this is designed to fix is that the cached disk data is not given back; instead the host starts swapping memory to disk, quickly making the VR unusable due to the CPU usage that this causes (see #3039).

I hope you can see why this sub-optimal fix is proposed to remedy the massively sub-optimal host spec that the VR is trying to operate in.

This ridiculous level of outside intervention is only being applied to VRs with 512 MB or less available to them, which is itself a ridiculously low amount of RAM to give a Debian host.

I should add that this was 'discovered' and tested through manually monitoring the VR memory usage over time and determining that the disk cache was the 'culprit' for the growing memory footprint.

echo "conntrackd keepalived haproxy dnsmasq" > /var/cache/cloud/disabled_svcs
mkdir -p /var/log/cloud
}

setup_secstorage() {
log_it "Setting up secondary storage system vm"
sysctl vm.min_free_kbytes=8192
#sysctl vm.min_free_kbytes=8192
Contributor

Why comment it and not remove it?
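For reference, if the setting were kept rather than commented out, the usual persistent form would be a sysctl drop-in instead of a runtime call. The file path below is illustrative, not something this PR adds:

```shell
# Contents of an illustrative /etc/sysctl.d/99-cloudstack-vr.conf:
#
#   vm.min_free_kbytes = 8192
#
# Applied at boot by systemd-sysctl, or immediately with:
#   sysctl -p /etc/sysctl.d/99-cloudstack-vr.conf
```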

@wido
Contributor

wido commented Mar 5, 2019

I understand @PaulAngus

I'd really prefer that the default VR then gets more memory. The changes you want to make are fine with me, but memory is cheap nowadays, so why don't we bump the default to 1 GB?

@PaulAngus
Member Author

Thanks @wido. I could (if I were allowed) name at least 3 organisations with 3000+ VRs. That adds up to a lot of memory and hosts to put them in, plus cooling, yadda, yadda, yadda...
Maybe we should have a health warning somewhere about running with extremely low memory.

@rohityadavcloud
Member

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

echo "dnsmasq" >> /var/cache/cloud/disabled_svcs
else
echo "dnsmasq" >> /var/cache/cloud/enabled_svcs
echo "cloud dnsmasq" >> /var/cache/cloud/enabled_svcs
Member

The cloud service should only run on the CPVM and SSVM.


configure_apache2
configure_strongswan
configure_issue
configure_cacerts

# patch known systemd/sshd memory leak - https://github.com/systemd/systemd/issues/8015#issuecomment-476160981
echo '@include null' >> /etc/pam.d/systemd-user
Member

👍

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2748

@rohityadavcloud rohityadavcloud self-assigned this May 23, 2019
@rohityadavcloud
Member

Per Paul's advice, closing this one in favour of the PR on the master branch that brings in enhancements and memory-optimisation changes: #3126

@rohityadavcloud rohityadavcloud removed this from the 4.11.3.0 milestone May 27, 2019