Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize hosts read #454

Merged
merged 28 commits into from Oct 14, 2016
Merged

Optimize hosts read #454

merged 28 commits into from Oct 14, 2016

Conversation

jreidinger
Copy link
Member

@jreidinger jreidinger commented Oct 11, 2016

WIP, will be blog entry

rationale for this pr is bug report https://bugzilla.suse.com/show_bug.cgi?id=877047
even with this change I get to less then 20% of original time and found more possible optimization in cfa that will even speed it up.

content for blog:
We get bug report, that yast2 network is slow when there is a lot of entries in /etc/hosts file. We take it as opportunity to tost new profiler support in yast. Profiler shows that problem is in SCR calls, which took a lot of time, so another oppurtinity to test new framework cfa with augeas parser. Augeas have already lense for /etc/hosts file, so it is quite easy to add it. In short here are results for network CLI with /etc/hosts containing ~ 10 000 entries:

before:

time yast2 lan.rb list
id  name,           bootproto
0   Intel Ethernet controller, DHCP


real    1m15.735s
user    1m15.076s
sys 0m0.164s

after:

time yast2 lan list
id  name,           bootproto
0   Intel Ethernet controller, DHCP


real    0m19.079s
user    0m18.348s
sys 0m0.244s

We also improved cfa itself
config-files-api/config_files_api#10 which we speed up by 50%.

)
next { host => names } if names != []
end
if Ops.greater_than(SCR.Read(path(".target.size"), CFA::Hosts::PATH), 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when doing cleanup this should be replaced too ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, cfa currently do not have method to recognize if file is not empty or exists. It raise exception. Of course I can use some cli tool to get it

else
Host.Update(
@hostname_initial,
Ops.get_string(@settings, "HOSTNAME", ""),
[Ops.get_string(@settings, "IPADDR", "")]
Ops.get_string(@settings, "IPADDR", "")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't check that in details yet so it is more less note for me. Here you change semantics (list of ips -> single ip)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I change API because it is used only as single parameter. I think it do not make much sense to change hostname for more ips, it means you have one hostname for more ips which looks bogus for me.

super(PARSER, PATH, file_handler: file_handler)
end

def hosts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this and some other methods miss documentation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -27,6 +27,8 @@
# Authors: Michal Svec <msvec@suse.cz>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be removed ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

def remove_hostname(hostname)
entries = data.select(hostname_matcher(hostname))
entries.each do |entry|
entry = entry[:value]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overwriting variable used for iterating in loop is usually not considered to be a good idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not? in ruby it is just variable assigned to some value, but I can create new one if you are more comfortable with it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

private

def ip_matcher(ip)
Matcher.new { |k, v| v["ipaddr"] == ip }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k is not used so it should be _

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

end

def hostname_matcher(hostname)
Matcher.new do |k, v|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

true
end

# Dump the Hosts settings to a map, for autoinstallation use.
# @return autoinstallation settings
def Export
return {} if @hosts.empty?
exported_hosts = @hosts.hosts
return {} if exported_hosts.empty?

# Filter out IPs with empty hostname (so that valid autoyast
# profile is created)(#335120)
# FIXME: this also removes records with empty names from @hosts. Such
# side effect is unexpected and should be removed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I get it correctly this FIXME is no longer true

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right

require "cfa/augeas_parser"

module CFA
class Hosts < BaseModel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the people that did not design CFA::BaseModel, a comment with the full URL of its API docs will be useful: http://www.rubydoc.info/github/config-files-api/config_files_api/CFA/BaseModel

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

ff02::2 ipv6-allrouters
ff02::3 ipv6-allhosts
10.100.128.72 pepa.labs.suse.cz pepa pepa2
10.254.128.01 pepa1.labs.suse.cz pepa1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "01" is incidental complexity, make that "1".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is just string for parsing, I do not care about it much

end

# returns old format of hosts used in Host module.
# @return [Hash<Array<String>>] format is hash where key is ip and value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rewrite this piece

@coveralls
Copy link

Coverage Status

Coverage decreased (-22.0%) to 24.651% when pulling 8d9d6b8 on optimize_hosts_read into 9888c9e on master.

def set_host(ip, canonical, aliases = [])
entries = data.select(ip_matcher(ip))
if entries.empty?
log.info "adding new entry for ip #{ip}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahem, add_entry(ip, canonical, aliases) instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, that cause another search thrue tree searching for ip

Copy link
Member

@mvidner mvidner Oct 12, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What? The body of this if block is an exact copy of the body of that method, plus return.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, right, add_entry do not search for existing ips, so I will adapt

@jreidinger
Copy link
Member Author

regarding API, I agree with naming and will change it.

But why are you missing those two methods? Does it needed on any place? When I design API I try to avoid adding method that noone need as it is hard to properly design it. E.g. what will be return value? array? or first matching entry? And how it look like? string like now? or hash with canonical and aliases? or only canonical? and for find_by_hostname ( name is confusing for me ) does it return ip or aliases? or hash with all three ( canonical, ip and aliases) ? So if such method someone find useful, then I am not against adding it, when use case is clear.

@jreidinger
Copy link
Member Author

@mvidner also thinking about remove_hostname and delete_by_hostname. I also found it confusing. as delete_by_ip removes entry where is given ip. That make sense and is intuitive. But delete_by_hostname do not delete entry, only in specific case. It just remove given hostname from entries where it is available, but keep ip if it have other hostnames.

so for

1.0.0.0 siteA siteB siteC

if I call delete_by_hostname("siteC") it can indicate that it remove whole this line, but it instead do just

1.0.0.0 siteA siteB

@mvidner
Copy link
Member

mvidner commented Oct 13, 2016

BTW regarding the /etc/hosts editor (Network Services →Hostnames, yast2 host) - it uses a "bulk" API: read everything via Host.name_map, and then Host.clear and Host.add_name/Host.remove_ip in a loop.

@jreidinger
Copy link
Member Author

@mvidner I know, it is a bit misoptimal as it kills comment placement in file. Better way would be direct operation on hosts model and adding, removing and replacing entries there.

# If it is canonical name, then the first alias becomes the canonical hostname
# @param [String] hostname
# @return [void]
def remove_hostname(hostname)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename this to delete_name (my original suggestion of delete_by_name was wrong)


# Replaces or adds a new host entry.
# If more than one entry with the given ip exists
# then it replaces the last instance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a suspicion that with the refactoring of Host.Update we have broken something. I will do some manual tests and report back.

@mvidner
Copy link
Member

mvidner commented Oct 14, 2016

I have now run the code and it says Internal Error, undefined method set_host.

This happened when I changed both the IP and the hostname at once.

@mvidner
Copy link
Member

mvidner commented Oct 14, 2016

delete_hostname has no tests, that's why the set_host has escaped. set_entry also has no tests BTW.

@mvidner
Copy link
Member

mvidner commented Oct 14, 2016

Regarding the augeas failure:

ruby-augeas-0.5.0 needs augeas-1.0.0: hercules-team/ruby-augeas@9025d88
and it uses this setup in its Travis file: (hercules-team/ruby-augeas@7a69aa1)

  - sudo add-apt-repository -y ppa:raphink/augeas
  - sudo apt-get update
  - sudo apt-get install libaugeas-dev libxml2-dev

@coveralls
Copy link

Coverage Status

Coverage increased (+1.5%) to 48.07% when pulling 9bdd9af on optimize_hosts_read into 9888c9e on master.

@coveralls
Copy link

Coverage Status

Coverage increased (+1.5%) to 48.07% when pulling e6b65e0 on optimize_hosts_read into 9888c9e on master.

Copy link
Member

@mvidner mvidner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, almost there!
Please add a changelog entry.

@coveralls
Copy link

Coverage Status

Coverage increased (+1.5%) to 48.07% when pulling 63ed52c on optimize_hosts_read into 430be2c on master.

@jreidinger
Copy link
Member Author

thanks

@jreidinger jreidinger merged commit 5f6cb6d into master Oct 14, 2016
@jreidinger jreidinger deleted the optimize_hosts_read branch October 14, 2016 15:21
@mchf
Copy link
Member

mchf commented Oct 14, 2016

I've obviously was not fast enough. However, I did some tests. And I get an Internal Error sometimes:
cfa_host

@mchf
Copy link
Member

mchf commented Oct 14, 2016

The issue was raised in installer even with already merged version of the patch.

@jreidinger
Copy link
Member Author

checking

@jreidinger
Copy link
Member Author

@mchf fix is in #456

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants