DM-53001: Improve ResourcePath implementation for davs:// endpoints (#138)
Conversation
Partial reads are sent directly to the backend server after the first redirection. The URL of the resource at the backend server is recorded and reused for future partial reads. When the handle is closed, the backend server is notified so it can release the resources allocated to serve our partial reads. Also reduce the number of requests when creating a directory hierarchy, if the server allows for it.
When doing partial reads, explicitly close the connection to the backend server to let it know it can shut down. This is needed because PyArrow's pq.read_table() does not call close() on the handle it opens.
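The partial-read strategy described above can be sketched as follows. This is a minimal illustration, not the actual lsst.resources implementation: the class name and the injected `resolve_redirect`, `read_range`, and `notify_close` callables are hypothetical stand-ins for the HTTP machinery.

```python
class PartialReadHandle:
    """Cache the backend URL after the first redirection and reuse it for
    subsequent ranged reads; notify the backend on close()."""

    def __init__(self, frontend_url, resolve_redirect, read_range, notify_close):
        self._frontend_url = frontend_url
        self._resolve_redirect = resolve_redirect  # frontend URL -> backend URL
        self._read_range = read_range              # (url, start, end) -> bytes
        self._notify_close = notify_close          # url -> None
        self._backend_url = None

    def read(self, start, end):
        if self._backend_url is None:
            # First read: follow the frontend redirection once, then record
            # the backend URL so later partial reads skip the frontend.
            self._backend_url = self._resolve_redirect(self._frontend_url)
        return self._read_range(self._backend_url, start, end)

    def close(self):
        if self._backend_url is not None:
            # Let the backend release the resources allocated for our reads
            # (e.g. by explicitly closing the connection), since callers such
            # as pyarrow.parquet.read_table() may never call close().
            self._notify_close(self._backend_url)
            self._backend_url = None
```

With this shape, only the first `read()` touches the frontend; every later range request goes straight to the recorded backend URL.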
…ng to local temporary directory
The Butler often requests the size of a file it recently imported to the datastore. The intention of this cache is to reduce the number of such requests by retrieving the file size from the PUT response when possible and caching it. Subsequent requests for the file size are served from the cache if they arrive within the validity period of the cache entry.
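The cache described above amounts to a small TTL map keyed by URL. A hedged sketch, in which the class name, the TTL value, and the injectable clock are illustrative choices rather than the actual implementation:

```python
import time


class FileSizeCache:
    """Remember file sizes captured from PUT responses for a limited
    validity period, so a follow-up size request can skip the server."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # url -> (size, expiry timestamp)

    def record(self, url, size):
        """Store a size obtained from a PUT response."""
        self._entries[url] = (size, self._clock() + self._ttl)

    def get(self, url):
        """Return the cached size, or None if absent or expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        size, expiry = entry
        if self._clock() > expiry:
            # Entry outlived its validity period; drop it.
            del self._entries[url]
            return None
        return size
```

Using `time.monotonic` as the default clock avoids surprises from wall-clock adjustments; passing a fake clock makes expiry easy to test.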
DM-53468 fixes an issue in pyarrow that prevented us from sending requests directly to the backend server when reading parquet files. Now that the issue is fixed, we can implement this feature, which reduces the number of GET requests going through the frontend server when reading chunks of the same file. This feature is also beneficial when reading the contents of zip files.
Refactoring, in particular to allow specialization of the method that retrieves metadata about a file or directory according to the known capabilities of the server (e.g. XRootD or dCache). The main motivation is that the dCache frontend server leaves the network connection in a state that makes it unusable by the client for sending subsequent requests. In that situation, the client must establish a new TLS connection, which may be expensive depending on the kind of request. For details see dCache/dcache#8005.
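One way to express the specialization described above is to dispatch on the server's advertised identity and pick a metadata-retrieval strategy per implementation. The class names and the idea of keying on the `Server` response header are assumptions for illustration, not the actual code:

```python
class GenericMetadata:
    """Default strategy: a plain metadata request works fine."""
    name = "generic"


class XRootDMetadata(GenericMetadata):
    """Strategy tuned to XRootD's capabilities."""
    name = "xrootd"


class DCacheMetadata(GenericMetadata):
    """Strategy for dCache, which may leave the connection unusable after
    some requests (dCache/dcache#8005), so it must be prepared to establish
    a fresh TLS connection instead of reusing the old one."""
    name = "dcache"


def metadata_strategy(server_header: str) -> GenericMetadata:
    """Choose a metadata-retrieval strategy from the Server header."""
    header = server_header.lower()
    if "dcache" in header:
        return DCacheMetadata()
    if "xrootd" in header:
        return XRootDMetadata()
    return GenericMetadata()
```

The benefit of isolating this choice in one place is that server-specific workarounds stay out of the generic request path.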
This configuration setting allows for specifying one or more frontend base URLs the client can randomly select to send requests against. This can be used as a mechanism for balancing load among several frontend servers as well as to provide a path for a gradual migration of existing butler repos which have recorded direct URLs in their registry database.
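The random selection among configured frontend base URLs can be as simple as the sketch below; the function name is hypothetical and real code would also need to rewrite recorded direct URLs onto the chosen base.

```python
import random


def pick_frontend(base_urls, rng=random):
    """Randomly pick one of the configured frontend base URLs, spreading
    load across several frontend servers."""
    if not base_urls:
        raise ValueError("at least one frontend base URL must be configured")
    return rng.choice(base_urls)
```

Accepting an `rng` argument (anything with a `choice()` method, e.g. a seeded `random.Random`) keeps the selection deterministic in tests.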
Codecov Report

❌ Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #138      +/-   ##
==========================================
- Coverage   84.92%   83.63%    -1.30%
==========================================
  Files          36       36
  Lines        7132     7601     +469
  Branches      874      916      +42
==========================================
+ Hits         6057     6357     +300
- Misses        821      976     +155
- Partials      254      268      +14
dhirving
left a comment
I'm about halfway through this -- will finish the rest in the morning. So far only nitpicks/questions/minor concerns.
> Caching file sizes helps preventing sending requests to the server for
> retrieving the size of recently uploaded files. This is in particular
> intended to efficiently serve `Butler` requests for the size of a file it
As far as I can tell, the Butler only does that size thing in a single place (here).
I'm assuming you added this cache because you are seeing a performance hit from extra HTTP requests when data processing workers write their outputs to storage (here).
(There are a couple other places we end up in this code path, e.g. Butler.ingest(), but I don't think they happen during data processing.)
It looks like when this happens, the Butler already has the data in memory or on a local disk, so we already know its size and could just record it directly. This would be "safer" anyway, because we would be recording the initial size before any potential corruption from network transfer.
I need to check a couple details with Tim (on vacation until Monday) but it looks like only a few lines of plumbing in daf_butler to fix this -- would you like me to tweak it there instead of maintaining all of the complexity for this cache?
Yes, we observed that the Butler systematically asks for the size of a file it just uploaded, which means an additional request to the server after each PUT request. Given the number of requests issued against the datastore during processing, it is beneficial to avoid as many requests as we can.
I agree that it would be cleaner to address this at the level of the Butler. If the Butler already knows the size of the file it uploads, it would be better for it to use that value. An alternative, or even complementary, solution would be for ResourcePath.write() to return the number of bytes actually written to the data store (it currently returns None). The Butler could then store that value or compare it against the value it expects.
But yes, I would be happy to remove this file size caching mechanism when possible.
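The alternative suggested in the reply above would look roughly like this. It is only a sketch of the proposal, not the current lsst.resources API (whose `write()` returns None), and the function name is hypothetical:

```python
import io


def write_and_verify(dest, data: bytes) -> int:
    """Write `data` to a binary file-like object and return the byte count,
    raising if fewer bytes were written than expected. Returning the count
    lets the caller record the size directly instead of asking the server."""
    written = dest.write(data)
    if written != len(data):
        raise IOError(f"short write: {written} of {len(data)} bytes")
    return written
```

A caller such as the Butler could then store the returned size in its registry, or compare it against the size it expects, without issuing a follow-up request.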
dhirving
left a comment
Looks good to me, other than the handful of minor comments I sent yesterday.
Checklist
- doc/changes