Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CP-48623: reduce XenAPI.py connection rate and drop 4 useless API calls #5533

Merged
merged 16 commits into from
Mar 28, 2024

Conversation

edwintorok
Copy link
Contributor

On a Unix domain socket (used by SM in Dom0 for example) we lacked the persistent connection optimisation, so we created a new connection for each API call, thus querying the version was done in 4 API calls and 4 separate connections
(with a single login thankfully).

Looking more deeply calling _get_api_version() is not even needed, it was only ever used by some compat code that got dropped in 2010 (but the version queries remained).
For backwards compatibility I kept the possibility to query the api version, but wrapped it in a cached property, and avoid calling it by default.
This shaves off 4 API calls after each login (i.e. every SM script call), and is in general useful (even for HTTP(S) clients).

I've only done some minimal testing on this PR, so a draft for now. Will report back when some wider testing and performance numbers are available.

ydirson and others added 10 commits March 6, 2024 16:56
This test did not exercised the SR type expected by user, and notably
failed when no default SR was defined.

Adds the option of installing a VM in a non-default SR using
VM.with_new.

Signed-off-by: Yann Dirson <yann.dirson@vates.fr>
Previously sub-second durations were printed as "ago", now they are
pretty-printed using Mtime.Span's logic, for example "666ms ago".

If the point in time is unexpectedly in the future, print "0ns ago".

Durations lasting at least 1 seconds are printed as before.

Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
Signed-off-by: Ming Lu <ming.lu@cloud.com>
…390570

CA-390570: Py3 socket.sendto needs bytes instead of a string
When creating a backup VDI on an SR we want to derive the VDI's UUID
from the SR. This way we can't be later tricked into accepting a
different VDI as a backup.

Implement a hash that derives the VDI UUID and pass this UUID to SM to
be used.

Currently the creation of this VDI in Xapi is detected based on its
name-label: "Pool Metadata Backup". The SM stack usually creates a new
random UUID but will use a given UUID passed as "vdi_uuid" for its
"vdi_create" command. The difficulty in the implementation is that
"vdi_uuid" is used in other commands as well and we have to make sure to
set it only in the intended context. It is not straight forward to pass
the UUID from Xapi_vdi.create down to Sm.

Signed-off-by: Christian Lindig <christian.lindig@cloud.com>
We introduce deterministic UUID for backup VDIs to avoid searching for
VDIs. Add this capability to the scripts that create and restore
backups. In case the VDI with the expected UUID does not exist, let the
user confirm probing for it.

Signed-off-by: Christian Lindig <christian.lindig@cloud.com>
Signed-off-by: Christian Lindig <christian.lindig@cloud.com>
We have added an interactive prompt in case the backup VDI was created
by an earlier version and does not match the expected UUID. To
facilitate scripting, add a -y (yes) flag that assumes a confirmation.

Signed-off-by: Christian Lindig <christian.lindig@cloud.com>
@edwintorok
Copy link
Contributor Author

Looks like I might have to fix some pytype warnings (but locally pytype says it doesn't support Python >3.11 yet, and I have Python 3.12). Lets see what the CI says.

@edwintorok
Copy link
Contributor Author

Error: Function ServerProxy.__init__ was called with the wrong arguments,          Expected: (self, uri, transport, encoding, verbose: bool = ..., ...),   Actually passed: (self, uri, transport, encoding, verbose: int, ...), Called from (traceback):,   line 227, in xapi_local
  Error: No attribute 'errno' on module 'socket'

They look like genuine errors. I'll open a separate to fix those first though.

@edwintorok
Copy link
Contributor Author

Also apparently there is no cached_property in 3.6, just 3.8, I'll write it by hand instead..

Copy link
Member

@robhoes robhoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a useful and relatively straightforward optimisation.

@edwintorok
Copy link
Contributor Author

On sequential API calls (e.g. 500 VDI.destroy) there is an about 5% reduction in API execution times. I haven't done bootstorm or parallel VM start measurements yet, where I'd expect to see more improvements.

@edwintorok edwintorok force-pushed the private/edvint/xenapi branch 2 times, most recently from d684a22 to 68112fb Compare March 27, 2024 16:09
Fixes this pytype report:
```
pytype_reporter:          Expected: (self, uri, transport, encoding, verbose: bool = ..., ...)
pytype_reporter:   Actually passed: (self, uri, transport, encoding, verbose: int, ...)
pytype_reporter:          Expected: (self, uri, transport, encoding, verbose, allow_none: bool = ..., ...)
pytype_reporter:   Actually passed: (self, uri, transport, encoding, verbose, allow_none: int)
```

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
pytype claims that there is no 'errno' in the socket module, but there is:
```
Python 3.6.8 (default, Nov 16 2020, 16:55:22)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.errno.ETIMEDOUT
110
```

Also `session.local_logout` is a remote method that is proxied by `__getattr__`,
I don't know why `pytype` complains about that one, and not about `logout`, but suppress it.

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Signed-off-by: Edwin Török <edwin.torok@cloud.com>
…ication

This is already the case for HTTP(S) connections, but XenAPI overrides the xmlrpclib implementation
to provide a Unix socket transport, and it lacks this optimization.

Otherwise we've noticed that SM makes at least 4 separate connections to XAPI to query the version number for example.
On a busy system each of those connections can get delayed by 100ms:

* OCaml's tick thread switches threads every 50ms, and you might need 2 switches when XAPI is busy with CPU-bound workloads:
  * one to accept the connection
  * another to handle the request on the newly spawned thread

See:
https://github.com/python/cpython/blob/v2.7.5/Lib/xmlrpclib.py#L1361-L1371
https://github.com/python/cpython/blob/v3.6.8/Lib/xmlrpc/client.py#L1237-L1245

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Old version of XenAPI.py used to look at the API version to implement a compatibility layer,
but that code got dropped and the API version is now completely unused:

df9b539 ("CA-35286: Remove forward compatability code from the python XenAPI module in favour of the compatability layer in xapi")

Remove the unused argument, this will allow us to avoid making 4 additional API calls after every login.

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
The API version is not used anymore, so remove the 4 additional queries it'd make.
For backward compatibility retain the field as a cached property: should a client of this code want to access the API version,
then it'll make the 4 queries at that time, and cache the result for the duration of the API object.

Cached properties are writable, so the fallback that writes the API version to 1.1 should keep working too

Signed-off-by: Edwin Török <edwin.torok@cloud.com>
@edwintorok
Copy link
Contributor Author

I've fixed the pytype warnings

@edwintorok edwintorok changed the base branch from master to feature/perf March 27, 2024 16:34
@edwintorok edwintorok marked this pull request as ready for review March 27, 2024 16:34
@edwintorok edwintorok merged commit 107f23c into xapi-project:feature/perf Mar 28, 2024
28 checks passed
Copy link

pytype_reporter extracted 47 problem reports from pytype output

.

You can check the results of the job here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants