Storage: mtime of downloaded file is incorrect by UTC offset #4

Closed
mcsimps2 opened this issue Jan 28, 2020 · 0 comments · Fixed by #42
Labels:
api: storage (Issues related to the googleapis/python-storage API.)
priority: p2 (Moderately-important priority. Fix may not be included in next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

Comments

@mcsimps2

Google Cloud Storage v1.25.0
Python 3.7.3
OS: OSX & Win7

Issue: If I upload a file to Google Cloud Storage and then immediately download it, the downloaded file's mtime is incorrect. I'm in EST, 5 hours behind UTC, and that is exactly the timedelta between the file's original mtime and the mtime recorded after the download.

Here's an example screenshot:
[Screenshot: Screen Shot 2020-01-27 at 9 35 52 PM]
The original file mtime in Google Cloud Storage is 1/23/20 9:04 PM (which matches the file I uploaded), but when I download the file, the mtime becomes 1/24/20 2:04 AM, which is 5 hours ahead of what it is supposed to be (the UTC offset for my timezone).

The issue is here in blob.download_to_filename:

updated = self.updated
if updated is not None:
    mtime = time.mktime(updated.timetuple())
    os.utime(file_obj.name, (mtime, mtime))

In my example, updated is the timezone-aware datetime corresponding to 2020-01-24 02:04:11.184000+00:00 (it has tzinfo==UTC). The updated.timetuple() is

time.struct_time(tm_year=2020, tm_mon=1, tm_mday=24, tm_hour=2, tm_min=4, tm_sec=9, tm_wday=4, tm_yday=24, tm_isdst=0)

The problem, I believe, is that the timetuple carries no timezone information: it neither records that this is a UTC time nor converts it to my local timezone. The docs of mktime note, "Its argument is the struct_time or full 9-tuple (since the dst flag is needed; use -1 as the dst flag if it is unknown) which expresses the time in local time, not UTC." So mktime reads the UTC fields as if they were local time. Perhaps we should do this instead:

if updated is not None:
    mtime = updated.timestamp()  # For Python 3; not sure of the Python 2 equivalent
    os.utime(file_obj.name, (mtime, mtime))

The timestamp() function accounts for the timezone information in the datetime object.
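To make the difference concrete, here is a minimal sketch that rebuilds the `updated` value from this report and compares the two conversions. It also tries `calendar.timegm(updated.utctimetuple())`, which I would expect to work on Python 2 as well; that is my assumption about a possible Python 2 route, not something taken from the library. The variable names are mine.

```python
import calendar
import time
from datetime import datetime, timezone

# The `updated` value from the report: timezone-aware, tzinfo is UTC.
updated = datetime(2020, 1, 24, 2, 4, 11, 184000, tzinfo=timezone.utc)

# Current behaviour: timetuple() drops the tzinfo, and mktime() then
# interprets the remaining fields as local time.
buggy_mtime = time.mktime(updated.timetuple())

# Proposed Python 3 fix: timestamp() honours the tzinfo.
py3_mtime = updated.timestamp()

# Possible Python 2/3 alternative: timegm() is the UTC counterpart of
# mktime(). It returns whole seconds, so the microseconds are lost.
portable_mtime = calendar.timegm(updated.utctimetuple())

# The skew depends on the machine's timezone: zero on a UTC machine,
# 18000 seconds (5 hours) on a machine set to EST.
skew = buggy_mtime - portable_mtime
print(py3_mtime, portable_mtime, skew)
```

On a machine whose local timezone is UTC the skew is zero, which may be why the bug is easy to miss in CI.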
I've just been doing this manually in my code after downloading a file because my application is sensitive to mtimes, and it seems to fix the issue.
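For anyone who needs the same workaround until a fix lands, it looks roughly like the sketch below. `restore_mtime` is a hypothetical helper name of my own, not part of google-cloud-storage, and it assumes `updated` is a timezone-aware datetime like `blob.updated` (a naive datetime would be treated as local time by `timestamp()`).

```python
import os
import tempfile
from datetime import datetime, timezone


def restore_mtime(path, updated):
    """Set the atime and mtime of `path` from a timezone-aware datetime.

    Hypothetical helper; does nothing when `updated` is None.
    """
    if updated is None:
        return
    mtime = updated.timestamp()  # honours tzinfo (Python 3 only)
    os.utime(path, (mtime, mtime))


# Stand-in for a downloaded file so the sketch runs without GCS access.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name

restore_mtime(demo_path, datetime(2020, 1, 24, 2, 4, 11, 184000, tzinfo=timezone.utc))
restored = os.path.getmtime(demo_path)
os.remove(demo_path)

# Intended use after a real download:
#     blob.download_to_filename(path)
#     restore_mtime(path, blob.updated)
```

The stored value can be rounded by the filesystem's timestamp granularity, so compare with a small tolerance rather than for exact equality.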

HemangChothani self-assigned this on Jan 28, 2020
crwilcox transferred this issue from googleapis/google-cloud-python on Jan 31, 2020
product-auto-label bot added the "api: storage" label on Jan 31, 2020
yoshi-automation added the "🚨" and "triage me" labels on Feb 3, 2020
jkwlui added the "priority: p2" and "type: bug" labels and removed the "🚨" and "triage me" labels on Feb 4, 2020