rez-env Performance and socket.getfqdn() #617

Closed
mottosso opened this issue Apr 26, 2019 · 18 comments

Comments

@mottosso
Contributor

Under some circumstances, roughly 15% of the time spent in a rez env call goes to socket.getfqdn().

Problem

socket.getfqdn() is called every time rez-env runs; it supplies the toy.fritz.box part of the welcome banner.

$ rez env
resolved by manima@toy.fritz.box, on Fri Apr 26 09:56:24 2019, using Rez v2.29.1
...

However, I found the call can get quite costly, depending on your setup.

import time
import socket
t0 = time.time()
socket.getfqdn()
t1 = time.time()
print("%.2f seconds" % (t1 - t0))
# 0.27 seconds

That's on Windows 10 with a wireless connection to the router.

0.02 seconds

This, on the other hand, is from another Windows 10 machine connected by wire. I'm not entirely confident the difference is down solely to Wi-Fi versus wired, but the former is my working environment, which gets me a 0.27-second delay whenever I enter an environment.

A Solution

Off the top of my head, I would try to "bake" the value, for example into an environment variable.

fqdn = os.getenv("REZ_CACHED_FQDN") or socket.getfqdn()

At least then there's the option of avoiding the penalty.
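
The one-liner above could be wrapped roughly like this. It is only a sketch: REZ_CACHED_FQDN is the variable name proposed here and not an existing Rez setting, and the get_fqdn helper and its parameters are made up for illustration.

```python
import os
import socket


def get_fqdn(environ=os.environ, resolver=socket.getfqdn):
    """Return the baked FQDN if REZ_CACHED_FQDN is set, else resolve it.

    REZ_CACHED_FQDN is the hypothetical variable proposed in this issue,
    not something Rez currently reads.
    """
    cached = environ.get("REZ_CACHED_FQDN")
    if cached:
        # Fast path: no network lookup at all
        return cached
    # Slow path: same behaviour as today
    return resolver()
```

With the variable set once per session (for example in a login script), every subsequent call takes the fast path; unset or empty, the behaviour is unchanged.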

I'm also curious what your results are and, perhaps most importantly, whether we need it at all. If we do, does it need to be written to the terminal, or could it be included in logs etc.? In that case, maybe it could be called asynchronously.
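
As a rough idea of what "asynchronously" might look like: start the lookup on a background thread and only collect the result if and when it is needed. This is a sketch under assumptions, not how Rez works; start_fqdn_lookup and its getter are hypothetical names.

```python
import socket
import threading


def start_fqdn_lookup(resolver=socket.getfqdn):
    """Start resolving the FQDN on a daemon thread and return a getter."""
    result = {}
    done = threading.Event()

    def worker():
        try:
            result["fqdn"] = resolver()
        except Exception:
            pass  # leave result empty so the getter returns the fallback
        finally:
            done.set()

    thread = threading.Thread(target=worker)
    thread.daemon = True  # a slow DNS call must never block interpreter exit
    thread.start()

    def get(timeout=0.0, fallback=None):
        """Return the FQDN if it is ready within `timeout` seconds, else `fallback`."""
        done.wait(timeout)
        return result.get("fqdn", fallback)

    return get
```

The caller would kick off the lookup early, do the resolve, and then call the getter with a short timeout, falling back to something cheap such as socket.gethostname() if DNS has not answered yet.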

Background

I found the issue by running Rez through PySpy.

$ set PYTHONPATH=%REZ_INSTALL_DIR%\Lib\site-packages\rez-2.29.0-py2.7.egg
$ c:\Python27\Scripts\py-spy.exe -- c:\Python27\python.exe -m rez.cli._main env -- exit
Collecting samples from 'c:\Python27\python.exe -m rez.cli._main env -- exit' (python v2.7.14)
Total Samples 100
GIL: 0.00%, Active: 100.00%, Threads: 1

  %Own   %Total  OwnTime  TotalTime  Function (filename:line)
 77.78%  77.78%   0.700s    0.700s   meth (c:\Python27\lib\socket.py:228)
 17.78%  17.78%   0.260s    0.260s   getfqdn (c:\Python27\lib\socket.py:141)
  1.11%   1.11%   0.010s    0.010s   forward (C:\Users\manima\Dropbox\dev\anima\rez2\Lib\site-packages\rez-2.29.0-py2.7.egg\rez\vendor\yaml\reader.py:107)
  1.11%   1.11%   0.010s    0.010s   <module> (C:\Users\manima\Dropbox\dev\anima\rez2\Lib\site-packages\rez-2.29.0-py2.7.egg\rezplugins\package_repository\memory.py:7)
  1.11%   1.11%   0.010s    0.010s   exists (c:\Python27\lib\genericpath.py:26)
  1.11%   1.11%   0.010s    0.010s   find_module (c:\Python27\lib\pkgutil.py:186)
  0.00%   3.33%   0.000s    0.030s   get_plugin_class (C:\Users\manima\Dropbox\dev\anima\rez2\Lib\site-packages\rez-2.29.0-py2.7.egg\rez\plugin_managers.py:280)
@bpabel
Contributor

bpabel commented Apr 26, 2019

For comparison, running socket.getfqdn() takes 4.5 seconds on my production machine!! And I'm not on wifi. TBH, maybe ditch that portion of the welcome banner entirely.

@mottosso
Contributor Author

You can do a quick patch in rez.__init__.py to get a sense of the difference in time spent.

rez.__init__.py

...

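import socket

# Keep the original around so the patch can be undone
socket._getfqdn = socket.getfqdn

def getfqdn():
    # Dummy value; returns instantly without touching the network
    return "none.ya.beeswax"

socket.getfqdn = getfqdn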

...

@nerdvegas
Contributor

nerdvegas commented Apr 26, 2019 via email

@bpabel
Contributor

bpabel commented Apr 26, 2019

socket.gethostname() is pretty much instant.

@JeanChristopheMorinPerso
Member

JeanChristopheMorinPerso commented Apr 27, 2019

@bpabel @mottosso how long does it take to run an nslookup with your IP, and the same with your hostname? If everything is right with your network, it should be instant. If it's not instant (or almost instant), something might be wrong with your network, in particular the DNS server.
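
On the command line that means running nslookup followed by the hostname, then nslookup followed by the IP. A rough Python equivalent, timing a forward and a reverse lookup with the standard socket module, might look like the sketch below; the timed and report_lookup_times helpers are made up for illustration.

```python
import socket
import time


def timed(func, *args):
    """Call func(*args); return (result or error text, elapsed seconds)."""
    start = time.time()
    try:
        result = func(*args)
    except socket.error as exc:  # also covers herror and gaierror
        result = "lookup failed: %s" % exc
    return result, time.time() - start


def report_lookup_times():
    """Print how long forward and reverse DNS lookups take on this machine."""
    hostname = socket.gethostname()
    # Forward lookup: hostname -> IP address
    ip, forward_s = timed(socket.gethostbyname, hostname)
    print("forward %s -> %s (%.3fs)" % (hostname, ip, forward_s))
    # Reverse lookup: IP address -> (name, aliases, addresses)
    names, reverse_s = timed(socket.gethostbyaddr, ip)
    print("reverse %s -> %s (%.3fs)" % (ip, names, reverse_s))
```

Calling report_lookup_times() prints one line per direction. If either takes a noticeable fraction of a second, that is likely where socket.getfqdn() is spending its time.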

@mottosso
Contributor Author

>>> timeit.timeit("socket.getfqdn()", setup="import socket", number=10)
2.576872299999998
>>> timeit.timeit("socket.gethostname()", setup="import socket", number=10)
0.001569600000010496

how long does it take to run a nslookup with your IP and same with your hostname?

How do I do that? :O

@mottosso
Contributor Author

mottosso commented Apr 27, 2019

Oh, and also, I found what caused the slowdown on the slow machine above: my VPN. I'm connected from London to a company in Tokyo.

Without it, I'm getting near instant results.

>>> timeit.timeit("socket.getfqdn()", setup="import socket", number=10)
0.009264200000018263

@bpabel
Contributor

bpabel commented Apr 27, 2019

Regardless of how long it takes and what network changes can be made to improve the startup times, it seems like an unnecessary requirement to make network requests just to start up a Rez command.

@nerdvegas
Contributor

nerdvegas commented Apr 28, 2019 via email

@bpabel
Contributor

bpabel commented Apr 28, 2019

@nerdvegas It's still not clear to me why the context needs a fully qualified domain name. Why does the domain name have any bearing on what happens if a context is shared? When you say sharing contexts, are you talking about the feature that allows you to pass an .rxt file into rez-env? I just browsed through that code and it doesn't look like it's using the host to make any decisions about how to resolve the context.

Also, I'm willing to bet that the number of people actually sharing contexts can be counted on one hand (if there actually are any), so I don't find that use case a compelling reason not to improve the performance of a simple rez resolve, which is felt by every single person that uses rez.

@nerdvegas
Contributor

nerdvegas commented Apr 28, 2019 via email

@bpabel
Contributor

bpabel commented Apr 28, 2019

I agree, switching to hostname is probably enough to solve this issue.

But just for the sake of curiosity

It's feasible that someone somewhere would on occasion copy this rxt file and share it, so it has to be fully formed. Thus, a rez-env does actually need to establish what host that context was created on.

I don't think rez-env actually does need to establish the host. What would happen, right now, if someone shared an rxt file from another host? As far as I can tell, rez doesn't use the host to make any decisions about how to load or resolve a context.

@nerdvegas
Contributor

nerdvegas commented Apr 28, 2019 via email

@bpabel bpabel mentioned this issue Apr 28, 2019
@JeanChristopheMorinPerso
Member

I fear the hostname is not enough. For studios with multiple sites, it's entirely possible they will have multiple domains, so hostnames could be duplicated and cause confusion if the fqdn is not used.

@bpabel
Contributor

bpabel commented Apr 28, 2019

@JeanChristopheMorinPerso In what scenario would you see this being confusing? In 99% of the cases, the host and fqdn are going to be the localhost.

@nerdvegas postulated a scenario where someone using a shared context could possibly write a package with a late-bound function that checked the host/fqdn where the context was created and make decisions about how the package loads based off the hostname/fqdn. While that's technically possible, I'd be surprised if anyone is actually doing that, and even if they were, there are almost certainly better ways of doing whatever they're doing based off other context metadata.

TBH, if having just the hostname in the context metadata would be confusing, I think that's more of an argument for removing the hostname entirely, not adding the fqdn. The fqdn and the hostname aren't used by rez (other than the welcome message).

I just can't envision a scenario where this would ever confuse someone.

@nerdvegas
Contributor

nerdvegas commented Apr 29, 2019 via email

@mottosso
Contributor Author

I thought of one more approach to this: caching.

The fqdn call could e.g. be made once per machine, and the result stored in a user's home directory. That way, you'll get precision and speed all rolled into one, at the expense of the value eventually getting stale. I would imagine the value isn't something that changes very often, and when it does, one could simply delete the file. One could even set it up to get deleted on logout/login such that the cost is paid only once per boot.
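
A minimal sketch of that file cache, assuming a made-up file name (~/.rez_fqdn_cache) and a hypothetical cached_fqdn helper; nothing here is existing Rez behaviour.

```python
import os
import socket


def cached_fqdn(cache_path=None, resolver=socket.getfqdn):
    """Return the FQDN from a cache file, resolving and writing it on a miss."""
    if cache_path is None:
        # Hypothetical location; Rez does not define this file
        cache_path = os.path.join(os.path.expanduser("~"), ".rez_fqdn_cache")
    try:
        with open(cache_path) as f:
            value = f.read().strip()
        if value:
            return value
    except IOError:
        pass  # no cache yet (or unreadable): fall through and resolve
    value = resolver()
    try:
        with open(cache_path, "w") as f:
            f.write(value)
    except IOError:
        pass  # caching is best-effort; still return the resolved value
    return value
```

Deleting the file forces a fresh lookup on the next call, which is the "delete on logout/login" behaviour described above.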

@nerdvegas
Contributor
