Skip to content

Add atime and mtime support for files ops, start with files.put #1372

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: 3.x
Choose a base branch
from

Conversation

vram0gh2
Copy link
Contributor

@vram0gh2 vram0gh2 commented Jun 11, 2025

Add concrete support for atime and mtime checking/setting in the files.put operation and provide infrastructure to add this time metadata support to other operations as well.

Includes a fix in facts/files.py to accommodate negative timestamps from stat on the remote host.

Added support for more os.stat_result file metadata to be provided in the test json files, by updating the mock os.stat patch used for testing. This includes atime, mtime, but also the other members of the standard stat 10-tuple as well.

@Fizzadar: Looking for any feedback/implementation concerns, style changes you'd like, and how best to handle documenting the specifics and caveats. I added a note in the operation docstring, but if atime and mtime support is implemented in other operations (e.g., files.file) these will be true there as well. Is there a preferred way to break that information out into something outside of the function-specific docstring?

The expanded mock os.stat added a fair number of LOC, but it's also now equipped for future features/higher fidelity.

Also, I used the American spelling of "normalize" in one of the added support functions, happy to change it to British if preferred.

  • Pull request is based on the default branch (3.x at this time)
  • Pull request includes tests for any new/updated operations/facts
  • Pull request includes documentation for any new/updated operations/facts
  • Tests pass (see scripts/dev-test.sh)
  • Type checking & code style passes (see scripts/dev-lint.sh)

vram0gh2 added 6 commits June 10, 2025 23:55
…pport

Add a restricted analog to the ``touch`` command.  Supports mutating atime and
mtime, however only allows modification of one time field at a time.  A single
unambiguous datetime val is the time source since any ``touch`` operation with
specified atime/mtime requires the month, day, and time at a minimum.  Is BSD
aware in that datetimes are always converted to UTC rather than relying on the
richer timezone support in GNU ``touch``.

The single time field setting approach is less flexible per invocation than
native ``touch``, but also far simpler and nicely fits with file operations and
their way of working on a single attribute at a time.
Time metadate updates are not coalesced, as ``touch`` can do with a single date
and time arg, or via reference files.  However, this actually allows more
flexibility than native ''touch`` by allowing both atime and mtime values to be
specified independently.

Two support functions are added as well.
Negative timestamps are valid and permissible.  Ensure
pyinfra.facts.files.stat doesn't fail to match stat command output when such
values occur.
….files

In preparation for tests of mtime and ctime support, make the mock os.stat
actually use any stat fields provided in the test json, with non-zero defaults
for all fields.

Doesn't break any existing test, but lays foundation for any future ops or facts
which may use this info.

Doesn't currently support setting the st_{a, m}time_ns high-resolution time
fields.
Change the datetime Python value to be generated from the fromisoformat method
rather than strptime.  This yields identical results as before when no
UTC offset information is provided, but respects it if it does.  It is also
consistent with the os.stat parsing.
Add several testcases to cover multiple aspects of functionality:
- Change detection, incl. differences <1s that are visible in the seconds vals
- Separate actions on mtime and atime
- Local files as the time references
- Remote files as the time references
- Use of the server's timezone when a datetime arg has no tzinfo
@vram0gh2 vram0gh2 marked this pull request as ready for review June 11, 2025 04:59
Copy link
Member

@Fizzadar Fizzadar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is really neat @vram0gh2! Sorry it took so long for me to do a proper review. Replying to your notes specifically:

I added a note in the operation docstring, but if atime and mtime support is implemented in other operations (e.g., files.file) these will be true there as well. Is there a preferred way to break that information out into something outside of the function-specific docstring?

Currently not, and yeah that could be messy to duplicate everywhere. One option would be to put it into the top level docstring in operations/files.py such that it appears at the top of the operation doc and then reference that within each operation.

I do think this would be a great to expand to the other ops!

The expanded mock os.stat added a fair number of LOC, but it's also now equipped for future features/higher fidelity.

👍 really nice, good add.

Also, I used the American spelling of "normalize" in one of the added support functions, happy to change it to British if preferred.

All good, American makes sense as consistent w/Python, though I'm sure pyinfra contains a mix of both already 🙃

@@ -778,6 +778,56 @@ def get(
yield FileDownloadCommand(src, dest, remote_temp_filename=host.get_temp_filename(dest))


def _canonicalize_timespec(field, local_file, timespec):
assert field == "atime" or field == "mtime"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an enum for the field argument would be cleaner here + support type-checking.

Comment on lines 792 to 795
if field == "atime":
return datetime.fromtimestamp(lf_stat.st_atime, tz=timezone.utc)
else:
return datetime.fromtimestamp(lf_stat.st_mtime, tz=timezone.utc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if field == "atime":
return datetime.fromtimestamp(lf_stat.st_atime, tz=timezone.utc)
else:
return datetime.fromtimestamp(lf_stat.st_mtime, tz=timezone.utc)
tstamp = if_stat.st_atime if field == "atime" else if_stat.st_mtime
return datetime.fromtimestamp(tstamp, tz=timezone.utc)

Nitpicking, just a bit neater.

@@ -802,6 +854,9 @@ def put(
+ create_remote_dir: create the remote directory if it doesn't exist
+ force: always upload the file, even if the remote copy matches
+ assume_exists: whether to assume the local file exists
+ atime: value of atime the file should have, use ``True`` to match the local file
+ mtime: value of atime the file should have, use ``True`` to match the local file
+ timesrc: the source of the time value if atime or mtime are ``True``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover argument?

vram0gh2 added 4 commits July 12, 2025 22:17
In operations/util/files.py, define an enum to indicate whether the metadata
time field being manipulated by touch() is atime or mtime.`

This feeds into operations/files.py and the _canonicalize_timespec() support
function.
@vram0gh2
Copy link
Contributor Author

Thanks for the review, @Fizzadar !

I made changes/fixes as per the feedback . Docstrings should now be fixed, and we're using an enum to indicate atime or mtime.

There could be some stylistic room to argue in that I gave the enum a somewhat? long name of MetadataTimeField to be clear it wasn't referring to a portion of an ISO date string or components of a datetime. This results in a few spots where lines are wrapped. I wouldn't object to shortening this if the ambiguity argument is outweighed by the desire for compactness.

If this is considered merge-ready otherwise, perhaps we can address the documentation issue if/when atime and mtime support is added to other ops and leave it where it is for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants