
inet_db, inet_hosts: improve inet_db behaviour when .hosts file changes #2516

Merged 1 commit into erlang:master on Feb 20, 2020

Conversation

@max-au (Contributor) commented Jan 22, 2020

Avoid re-parsing changed file over and over, also fix race condition
when ETS table with IP <-> name mappings is first deleted, then
re-created.

@IngelaAndin IngelaAndin added the team:PS Assigned to OTP team PS label Jan 22, 2020
@max-au max-au changed the base branch from maint-22 to maint January 23, 2020 03:35
@RaimoNiskanen (Contributor) commented:

Could you describe more what changes you have made and why?

@vans163 (Contributor) commented Feb 11, 2020

Seems it's using lookup in the hot paths instead of match, which is a mandatory speedup if you're querying 100k+ names per second.

Uses a set instead of a bag.

Also it's not clearing the ETS hosts file representation every time it changes, but adding the difference.

It looks like a big performance win if you're doing a lot of gethostbyname.
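To illustrate the lookup-vs-match point above (a toy table, not inet_db's actual schema): on a set table, ets:lookup/2 is a keyed hash access, while ets:match/2 can fall back to traversing entries when the key part of the pattern is not fully bound.

```erlang
%% Toy example, not inet_db's real table layout.
T = ets:new(hosts, [set]),
true = ets:insert(T, {"example.com", {127,0,0,1}}),
%% Keyed lookup: direct hash access on a set table.
[{_, Addr1}] = ets:lookup(T, "example.com"),
%% Pattern match: may traverse the table when the key is not fully bound.
[[Addr2]] = ets:match(T, {"example.com", '$1'}),
Addr1 = Addr2.
```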

@max-au (Contributor, author) commented Feb 11, 2020

@vans163 exactly that, thank you.

The most important fix is the smarter way to handle .hosts file timestamp changes. The existing code reads the file and removes all entries from the ETS table. Then it adds entries from the file one by one. So even if the .hosts file contents are unchanged, there may be quite a long window for race conditions (especially if your installation has 15,000 entries in the .hosts file).

The suggested code first adds missing entries, then removes deleted entries. This is still not atomic (I wish there were an atomic ets:swap(Tab1, Tab2)), but at least it does not fail name resolution while the ETS table is being re-populated.
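A minimal sketch of the add-then-remove update described above (not the actual patch; the function and variable names are made up): insert the current entries first, then delete only the keys that disappeared, so concurrent lookups never observe an empty table mid-update.

```erlang
%% Sketch only: NewEntries :: [{Key, Value}] parsed from the hosts file.
update_hosts(Tab, NewEntries) ->
    OldKeys = sets:from_list([K || {K, _} <- ets:tab2list(Tab)]),
    NewKeys = sets:from_list([K || {K, _} <- NewEntries]),
    true = ets:insert(Tab, NewEntries),   % add or overwrite current data
    Stale = sets:to_list(sets:subtract(OldKeys, NewKeys)),
    lists:foreach(fun(K) -> ets:delete(Tab, K) end, Stale).
```

At no point is the table empty, so a concurrent gethostbyname sees either the old or the new entry for a key, never nxdomain for a name present in both versions of the file.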

@max-au (Contributor, author) commented Feb 12, 2020

Also, in the existing implementation, if multiple processes try to resolve a name concurrently and the .hosts file has changed, every process will schedule a file reload, and the file will be reloaded multiple times, aggravating the race condition. This diff avoids multiple reloads.
It has been extensively tested in production for at least 1.5 years.

@RaimoNiskanen (Contributor) commented:

This PR certainly points out a bunch of weak spots in how inet_db handles /etc/hosts, and has got many good solutions.

However, I have read the code and if I managed to understand it right, I have some concerns...

  • tolower(Name) should be used on the Byname keys since the lookups should be case insensitive, but the returned data should preserve case
  • When looking up a name, the alias list that is returned might be from an entry of the wrong address family. This could be fixed by using {Family, tolower(Name)} as key in the Byname table
  • I think that using a temporary private ETS table when populating the Byaddr and Byname tables is probably expensive, since each update (since you construct "bag" semantics on top of a "set" table) will copy the whole entry to and from the ETS table and the heap. I think using a heap data structure would be more efficient.
  • It seems the code deletes now unused entries before overwriting the updated ones, and I think you said the other way around... Not important anyway, since it is only newly unused ones that are deleted, not the whole table.
  • I see nothing that changes how multiple processes scheduling a file reload are handled. Since the update time must be shorter with this code, the time window should be smaller, but I can not find any code to avoid colliding reloads. Is there a commit missing? I will meditate over how this problem could be fixed.
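The keying scheme from the first two bullets could look something like this (hypothetical helper names, not inet_db's real API): lowercase the name only in the lookup key, keep the original case in the stored value, and include the address family in the key so inet and inet6 entries cannot be confused.

```erlang
%% Hypothetical helpers sketching the suggested Byname key scheme.
byname_key(Family, Name) ->
    {Family, string:lowercase(Name)}.

store_name(Tab, Family, Name, Aliases, Addrs) ->
    %% Key is case-folded and family-qualified; value preserves case.
    ets:insert(Tab, {byname_key(Family, Name), Name, Aliases, Addrs}).

lookup_name(Tab, Family, Name) ->
    ets:lookup(Tab, byname_key(Family, Name)).
```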

I have pushed a branch to my GitHub account with a commit on top of this, that tries to fix the points above. It has not even been tested in our daily builds yet, but might be interesting to look at for the discussion. master...RaimoNiskanen:raimo/kernel/inet_db-hosts-file

I guess what I should do would be to write test cases to expose the two first bullet points above, which would be considered bugs, and also to get regression tests on the undocumented behaviour of inet_db and inet_hosts...

@RaimoNiskanen (Contributor) commented:

I am fairly confident that I have a solution to the clustering update problem for the /etc/hosts file, which I have pushed to my branch mentioned above. While at it I added a few micro-optimizations of timestamp handling and ETS lookups too.

The solution moves reading file info to the server in inet_db, since all reads go through the same file server process anyway, so there was no point in having those reads in the clients. The timestamp for the recorded file info is passed to the server, which can deduce whether the update request is valid or late; this should avoid repeated updates due to multiple simultaneous client requests.
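The collision-avoidance idea could be sketched like this (assumed record and helper names, not the real inet_db code): clients pass along the file timestamp they observed, and the server skips any request that is not newer than what it has already processed, so simultaneous clients trigger at most one reload.

```erlang
%% Sketch only; field and function names are invented for illustration.
-record(state, {hosts_file, hosts_mtime = 0}).

maybe_reload(SeenMtime, #state{hosts_mtime = Mtime} = State)
  when SeenMtime =< Mtime ->
    State;                                  % late or duplicate request: skip
maybe_reload(SeenMtime, #state{hosts_file = File} = State) ->
    ok = reload_hosts_file(File),           % hypothetical reload helper
    State#state{hosts_mtime = SeenMtime}.
```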

I will throw this into our daily tests on Monday and add the regression test I talked about. I would be delighted to hear whether my branch works as well in your use case as yours does, or if I have introduced new bugs...

@RaimoNiskanen (Contributor) commented:

I just force pushed an update - a missing lists:reverse/1.

@RaimoNiskanen (Contributor) commented:

I have force pushed another update. This time the first commit is a regression test. The rest of my branch is as before.

This PR branch with just the regression test commit fails the regression test:

Check hosts file contents

Verify failed {{hostent,"a.example.com",
                    ["B.example.com","c.example.com"],
                    inet,4,
                    [{127,17,17,1},{127,17,17,5}]},
           "a.example.com",inet}: {hostent,"a.example.com",
                                   ["B.example.com"],
                                   inet,4,
                                   [{127,17,17,1}]}

Verify failed {{hostent,"a.example.com",
                    ["B.example.com","c.example.com"],
                    inet,4,
                    [{127,17,17,1}]},
           "b.example.com",inet}: nxdomain

Verify failed {{hostent,"a.example.com",
                    ["B.example.com","c.example.com"],
                    inet,4,
                    [{127,17,17,1}]},
           "c.example.com",inet}: {hostent,"A.example.com",
                                   ["c.example.com"],
                                   inet,4,
                                   [{127,17,17,1}]}

Verify failed {{hostent,"D.example.com",[],inet,4,[{127,17,17,2}]},
           "d.example.com",inet}: nxdomain

Verify failed {{hostent,"A.example.com",
                    ["c.example.com"],
                    inet6,16,
                    [{0,0,0,0,17,17,17,3}]},
           "a.example.com",inet6}: nxdomain

With this PR's commit reverted it passes, and with my rewrites it passes.

I have added my branch to our daily tests; we'll have a result from those in a few days...

@max-au The question remains: does my branch work in your use case?

@RaimoNiskanen (Contributor) commented:

@max-au Although this is a performance fix - it is quite a big rewrite of (albeit often unused) core functionality, so it would be more appropriate to have on 'master'. Is that OK?

@max-au (Contributor, author) commented Feb 17, 2020

@RaimoNiskanen Looking at your fix, it is totally superior!
And really massive compared to my attempt. To explain some history: the original patch was written for R16, hence private ETS tables instead of maps on the heap. I'm perfectly happy with closing my PR in favour of yours.

I agree this turned into a feature worth putting on master rather than maint.

Unfortunately, I do not have a test to confirm whether it works or not. However, it's not a terribly difficult test case to write.
Put in a large (>10k entries) hosts file, then start N (>100) concurrent processes resolving names in that file. Then change the .hosts timestamp without changing the contents, and ensure the file was reloaded only once, and that there were no name resolution errors while it was reloading.
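A rough outline of that test, with the hypothetical helpers marked as such: spawn many resolvers against a big hosts file, touch the file's timestamp without changing its contents, then assert a single reload and zero resolution failures.

```erlang
%% Sketch only; load of Names and the reload counter are left abstract.
stress_test(HostsFile, Names) ->            % Names: >10k names from the file
    Pids = [spawn_link(fun() -> resolve_loop(Names) end)
            || _ <- lists:seq(1, 100)],
    %% Touch the mtime without changing the contents.
    ok = file:change_time(HostsFile, calendar:local_time()),
    timer:sleep(1000),
    [begin unlink(P), exit(P, kill) end || P <- Pids],
    assert_reloaded_once().                 % hypothetical reload-count check

resolve_loop([Name | Rest]) ->
    case inet:gethostbyname(Name) of
        {ok, _Hostent} -> resolve_loop(Rest ++ [Name]);
        {error, Reason} -> exit({resolution_failed, Name, Reason})
    end.
```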

@RaimoNiskanen (Contributor) commented:

@max-au Thank you for your feedback!

R16 - it sure has been a while... Back then I guess it would have been possible to use gb_sets as a heap data structure. And I think even before my rewrite this was a change with such implications that it should be in a major release.

The test case sounds like fun to write! I'll see if I can figure out a not too dirty way to ensure that the hosts file is reloaded only once...

Regarding this PR (there is no competing PR): if you change the target branch to 'master' (that should work since there have been no changes in this area on 'maint'), then when I merge my rewrite that is on top of this PR to 'master', this PR should be marked as merged into 'master'...

@max-au max-au changed the base branch from maint to master February 19, 2020 02:48
@max-au
Copy link
Contributor Author

max-au commented Feb 19, 2020

IIRC we tried using heap-based structures, and fprof reported a significant regression. At that time we did not realise fprof was not the right tool to measure performance. Now we have a better-suited tool, but we haven't re-measured.
I changed base to master, thanks for this suggestion!

@vans163 (Contributor) commented Feb 19, 2020

> IIRC we tried using heap-based structures, and fprof reported a significant regression. At that time we did not realise fprof was not the right tool to measure performance. Now we have tool that suites better, but we haven't re-measured.
> I changed base to master, thanks for this suggestion!

Curious, if it's no secret: what tool did you use instead? fprof does not report GC time, so in the odd case that producing tons of garbage is much faster in execution but slower once GC cleans up, it won't be noticed by fprof, AFAIK.

EDIT:

This made me curious and I did a small benchmark. It seems that on a table lookup 'miss', lookup is 2x faster than a try/catch-wrapped lookup_element. Most use cases will see a miss here, I think in 99.9% of cases. I did it in Elixir, so maybe their try/catch implementation is much worse than Erlang's?

:inets.start
:ssl.start
# fetch the benchwarmer dep sources
{:ok, {_, _, req1}} = :httpc.request('https://raw.githubusercontent.com/mroth/benchwarmer/master/lib/benchwarmer/results.ex')
{:ok, {_, _, req2}} = :httpc.request('https://raw.githubusercontent.com/mroth/benchwarmer/master/lib/benchwarmer.ex')
Code.eval_string(req1)
Code.eval_string(req2)

tab = :ets.new(Tab, [:public, :ordered_set])
:ets.insert(tab, {1, 1})

# hit: try/catch-wrapped lookup_element
Benchwarmer.benchmark fn ->
  try do :ets.lookup_element(tab, 1, 2) catch _, _ -> nil end
end

# miss: try/catch-wrapped lookup_element
Benchwarmer.benchmark fn ->
  try do :ets.lookup_element(tab, 2, 2) catch _, _ -> nil end
end

# hit: plain lookup
Benchwarmer.benchmark fn ->
  case :ets.lookup(tab, 1) do
    [] -> nil
    [{_, a}] -> a
  end
end

# miss: plain lookup
Benchwarmer.benchmark fn ->
  case :ets.lookup(tab, 2) do
    [] -> nil
    [{_, a}] -> a
  end
end

@max-au (Contributor, author) commented Feb 20, 2020

@vans163 for quick perf-related experiments, we use erlperf (https://github.com/max-au/erlperf). It's very simple yet powerful enough, and allows us to quickly see concurrency-related issues.

For fprof we use a patched version that does record GC and suspend time. I cannot remember or even find the original patch (I suspect @RaimoNiskanen is the author), so I just shared what we use (max-au@17f39c8) for your convenience.

@RaimoNiskanen (Contributor) commented:

Even if fprof records GC and suspend time, it is heavily biased towards over-estimating Erlang function calls vs. execution time in C or I/O time, just because its own logging execution time burdens all traced functions. So calling e.g. gb_sets appears much slower than calling ETS or a map BIF, and file system calls seem faster than they should.

It evens out in the long run between function calls, so comparing between different Erlang functions works.

@RaimoNiskanen RaimoNiskanen merged commit fb39f10 into erlang:master Feb 20, 2020
@max-au max-au deleted the inet_db_improvements branch August 28, 2021 18:53