CPU overload while using chashed load balancing policy #6932

Closed
pavel-odintsov opened this Issue Sep 6, 2018 · 0 comments

pavel-odintsov commented Sep 6, 2018

Hello!

  • Program: dnsdist
  • Issue type: Bug report

Short description

I run dnsdist with the following configuration:

setWHashedPertubation(1234)
setServerPolicy(chashed)

newServer({address="10.0.0.1:53", weight=1000000, name="first", retries=1, pool="" })
newServer({address="10.0.0.2:53", weight=1000000, name="second", retries=1, pool="" })
newServer({address="10.0.0.3:53", weight=1000000, name="third", retries=1, pool=""})

newServer({address="10.0.0.4:53", weight=1, name="backup", retries=1,  pool="" })

Environment

  • Operating system: Debian
  • Software version: Jessie
  • Software source: compiled from master

Steps to reproduce

  1. Create a pool of four servers with the chashed load-balancing policy; set the weight to 1000000 for three of the servers and to 1 for the fourth.
  2. Start dnsdist.
  3. Send a few queries.
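
Step 3 can be scripted. The following is a minimal sketch using only the Python standard library; the helper names `build_query` and `send_queries` are made up for this example, and the listen address is an assumption that has to be adjusted to wherever dnsdist accepts queries.

```python
import socket
import struct

def build_query(qname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)

def send_queries(server, names, timeout=1.0):
    """Send one UDP query per name; return the number of replies received."""
    replies = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for i, name in enumerate(names):
            try:
                sock.sendto(build_query(name, query_id=i & 0xFFFF), server)
                sock.recvfrom(4096)
                replies += 1
            except OSError:
                pass  # timeout or network error; move on to the next name
    return replies
```

For example, `send_queries(("127.0.0.1", 53), ["example.com", "example.org"])` sends two queries to a dnsdist instance assumed to listen on localhost port 53.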

Expected behaviour

Normal CPU usage:

ps aux|grep dnsdist
dnsdist  20669  0.3  0.0 854180 21840 ?        Ssl  11:53   0:01 /usr/bin/dnsdist --supervised --disable-syslog -u dnsdist -g dnsdist -C /etc/dnsdist.conf

perf top output with the weighted LB algorithm:

Samples: 2K of event 'cycles:ppp', Event count (approx.): 84345390
Overhead  Shared Object           Symbol
  22.81%  dnsdist                 [.] healthChecksThread
   2.59%  [kernel]                [k] update_cfs_rq_h_load
   2.33%  [kernel]                [k] _raw_spin_lock_irqsave
   2.28%  [kernel]                [k] ep_poll_callback
   2.13%  [ip_tables]             [k] ipt_do_table
   1.77%  [kernel]                [k] __wake_up_common
   1.55%  libc-2.24.so            [.] malloc
   1.41%  [kernel]                [k] native_sched_clock
   1.33%  [kernel]                [k] idle_cpu
   1.28%  [kernel]                [k] syscall_return_via_sysret
   1.16%  [kernel]                [k] fib_table_lookup
   1.12%  [kernel]                [k] update_curr
   1.08%  [kernel]                [k] cpuacct_charge
   1.04%  [kernel]                [k] update_load_avg
   1.01%  [kernel]                [k] copy_user_enhanced_fast_string
   1.01%  libpthread-2.24.so      [.] __libc_recvmsg
   0.87%  dnsdist                 [.] processQuery
   0.85%  [kernel]                [k] kprobe_ftrace_handler
   0.84%  [kernel]                [k] check_preempt_curr
   0.83%  [kernel]                [k] copy_user_generic_unrolled
   0.83%  [kernel]                [k] _raw_spin_unlock_irqrestore
   0.81%  [kernel]                [k] try_to_wake_up
   0.81%  dnsdist                 [.] _init
   0.81%  [kernel]                [k] _raw_spin_lock
   0.78%  dnsdist                 [.] responderThread
   0.74%  [kernel]                [k] update_cfs_shares
   0.73%  dnsdist                 [.] DNSDistPacketCache::cachedValueMatches
   0.72%  [kernel]                [k] udp_set_dev_scratch

Actual behaviour

One CPU core is fully loaded:

40853 dnsdist   20   0 1130868 165168   8784 S 100.3  0.1   1:36.02 dnsdist

perf top during overload:

Samples: 24K of event 'cycles:ppp', Event count (approx.): 12920942699
Overhead  Shared Object          Symbol
  85.33%  libstdc++.so.6.0.22    [.] std::_Rb_tree_increment
  11.56%  dnsdist                [.] chashed
   0.67%  dnsdist                [.] _init
   0.21%  [kernel]               [k] nmi
   0.20%  [kernel]               [k] native_irq_return_iret
   0.17%  [kernel]               [k] swapgs_restore_regs_and_return_to_usermode
   0.15%  [kernel]               [k] native_sched_clock
   0.12%  [kernel]               [k] _raw_spin_lock
   0.12%  [kernel]               [k] perf_event_task_tick
   0.08%  [kernel]               [k] __intel_pmu_enable_all.constprop.19
   0.08%  dnsdist                [.] healthChecksThread
   0.07%  [kernel]               [k] trigger_load_balance
   0.06%  [kernel]               [k] update_fast_timekeeper
   0.05%  [kernel]               [k] task_tick_fair
   0.05%  [kernel]               [k] cpuacct_account_field
   0.04%  [kernel]               [k] intel_pmu_handle_irq

Usecase

I use such a large weight to implement a "backup" server: it should receive requests only when all other servers have failed.
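
As a rough check of this approach, assuming a server's share of queries is proportional to its weight (an assumption about the policy, not something verified here), the backup server would still be eligible for a tiny fraction of traffic while the other servers are up. The helper name `expected_shares` is made up for this sketch.

```python
from fractions import Fraction

def expected_shares(weights: dict) -> dict:
    """Return each server's share of the total weight as an exact fraction."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}

# Weights taken from the configuration in this report.
weights = {"first": 1_000_000, "second": 1_000_000, "third": 1_000_000, "backup": 1}
shares = expected_shares(weights)
print(shares["backup"])  # 1/3000001
```

So the weight ratio makes traffic to the backup very unlikely rather than impossible, under the proportionality assumption stated above.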
