Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid unnecessary timezone conversion when use_numpy=True #354

Closed
edwinwzhe opened this issue Jan 30, 2023 · 2 comments · Fixed by #355
Closed

avoid unnecessary timezone conversion when use_numpy=True #354

edwinwzhe opened this issue Jan 30, 2023 · 2 comments · Fixed by #355

Comments

@edwinwzhe
Copy link

Hi guys

Thanks for the excellent work. We are prototyping with ClickHouse and it is pretty amazing!
When I was benchmarking. I noticed a potential improvement for reading when use_numpy=True.
I will create a PR if it is a sensible change in your view.

Basically, the issue is that NumpyDateTimeColumnBase.apply_timezones_after_read always cast the numpy array to pandas DateTimeIndex, convert timezone and then back to numpy array. This operation is not cheap and it is not always necessary. In our use case, we use UTC everywhere, it will be a nice speedup if we could avoid unnecessary tz conversion.

def apply_timezones_after_read(self, dt):
    # should probably ensure self.local_timezone and self.timezone are of the same type for consistency
    timezone = self.timezone if self.timezone else get_timezone(self.local_timezone)

    if self.offset_naive and getattr(timezone, 'zone') != 'UTC':
        # tz convert makes a difference to the returned datetime only when timezone isnt UTC when offset_naive is True
        ts = pd.to_datetime(dt, utc=True).tz_convert(timezone)
        ts = ts.tz_localize(None)

        return ts.to_numpy(self.datetime_dtype)
    else:
        return dt
    
@xzkostyan
Copy link
Member

Hi.

Feel free to make PR.

edwinwzhe pushed a commit to edwinwzhe/clickhouse-driver that referenced this issue Jan 31, 2023
edwinwzhe pushed a commit to edwinwzhe/clickhouse-driver that referenced this issue Jan 31, 2023
@edwinwzhe edwinwzhe changed the title avoid unnecessary timezone covertion when use_numpy=True avoid unnecessary timezone conversion when use_numpy=True Jan 31, 2023
xzkostyan added a commit that referenced this issue Feb 1, 2023
@edwinwzhe
Copy link
Author

thanks for merging the PR @xzkostyan I am looking forward to contribute to the project.
thanks again for the great work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants