Skip to content
This repository has been archived by the owner on Oct 20, 2022. It is now read-only.

Improve Performance #4

Open
sargurgn opened this issue Jun 22, 2015 · 7 comments
Open

Improve Performance #4

sargurgn opened this issue Jun 22, 2015 · 7 comments

Comments

@sargurgn
Copy link

Hi,
My table has 2 million records. Your UtcToLocal function takes around 120 seconds to fetch the converted records. Any help to optimize this would be great help. Thanks in advance.

My table structure:

id startTime endTime
1 1340227721513 1340227751699
2 1340227751699 1340227818999
3 1340227818999 1340227844130
4 1340227844130 1340227869292
5 1340227869292 1340227905563
6 1340227905563 1340227965950
7 1340227965950 1340227986058
8 1340227986058 1340228032563
9 1340228032563 1340228047664
10 1340228047664 1340228176083

I have created the following function, for the conversion

CREATE FUNCTION Tzdb.CONVERT_TZ (@utcTimeInMillis BIGINT, @localTzRef varchar(45))
RETURNS BIGINT
AS
BEGIN

  DECLARE @localTimeInMillis AS BIGINT
  DECLARE @inputDateTimeOffsetVal AS DATETIMEOFFSET
  DECLARE @outputDateTimeVal AS DATETIME

  SET @inputDateTimeOffsetVal = CONVERT(DATETIMEOFFSET, DATEADD(SECOND,(@utcTimeInMillis/1000),{d '1970-01-01'}))
  SET @outputDateTimeVal = CAST(Tzdb.UtcToLocal(@inputDateTimeOffsetVal, @localTzRef)AS DATETIME)
  SET @localTimeInMillis = CAST(DATEDIFF(SECOND,{d '1970-01-01'},@outputDateTimeVal) AS BIGINT)

  RETURN @localTimeInMillis;
END;
GO

I am accessing this way,

SELECT
Tzdb.CONVERT_TZ(startTime,'America/New_york') as startTime, 
Tzdb.CONVERT_TZ(endTime,'America/New_york') as endTime
FROM MyTable;

If i replace your Tzdb.UtcToLocal function with SWITCHOFFSET it takes not more than 4 secs for 2 million records. I see that there are 3 queries in the function which fetches the zoneId and offset. Any ways to optimize this?

@mattjohnsonpint
Copy link
Owner

Probably there are ways to optimize this, but I didn't design this library with perf as a top priority. I'll keep this issue open to remind me to investigate - but feel free to send a PR if you figure out a better approach! 😄

@mattjohnsonpint mattjohnsonpint changed the title How to work with Load Improve Performance Oct 8, 2015
@mattjohnsonpint
Copy link
Owner

I've been discussing this offline with another developer (who will hopefully chime in here), who thinks better perf can be obtained by using inline table-valued functions instead of scalar-valued functions.

I'm looking forward to his PR.

@mattjohnsonpint
Copy link
Owner

FYI, I am working on performance improvements, but also (and sorry I didn't notice this earlier @sargurgn) - timestamps are rarely kept in numeric format unless they are in UTC. The idea of a number of milliseconds since 1970-01-01 local time is not a good one. I recommend against that. If you need local time, keep it in a datetimeoffset, datetime2, or datetime type.

@maggiepint
Copy link
Collaborator

@mj1856 you want to switch to inline functions, and you want to schema bind. Both should get you some serious boost on performance, as the query optimizer can interpret them significantly better. Since I'm going to use this in my own code, I'll see if I can't deal with this this week. I have some reports that are going to call your functions on millions of rows, so if they suck I'll find out :-)

@srutzky
Copy link

srutzky commented Jul 11, 2016

@mj1856 , @maggiepint , and sargurgn : As it relates to performance improvements, I have mocked up (and tested) new versions of GetZoneId and UtcToLocal (was going to do LocalToUtc later if this worked out well). I am not sure of the best way to show them. I did not want to update the existing functions as I changed them both to be Inline TVFs, hence there is no possibility backwards compatibility if I were to replace the definitions of the same function names. And, I did not know how you wanted to handle naming convention, and I have very little time to work on this in the first place but wanted to help. So, for now, hopefully it is ok for me to post the code here that can be run in a Database containing this project (since the new functions are unique object names). I also included the simple performance tests and they show improvements on both functions. If I should do a pull request instead I can do that and perhaps someone can make changes to naming convention, etc? Either way, here is what I have:

CODE

GO
CREATE FUNCTION [tzdb].[GetZoneId2]
(
    @tz VARCHAR(50)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
 WITH zone1 AS
 (
    SELECT TOP 1 l.[CanonicalZoneId] AS [ZoneId]
    FROM [Tzdb].[Zones] z
    INNER JOIN [Tzdb].[Links] l
      ON z.[Id] = l.[LinkZoneId]
    WHERE z.[Name] = @tz
    UNION ALL
    SELECT NULL AS [ZoneId]
 )
 SELECT TOP 1 ISNULL(zone1.[ZoneId], (
    SELECT TOP 1 [Id]
    FROM [Tzdb].[Zones]
    WHERE [Name] = @tz)) AS [ZoneId]
  FROM zone1
  ORDER BY zone1.[ZoneId] DESC;
GO

GO
CREATE FUNCTION [tzdb].[UtcToLocal2]
(
    @utc datetime2,
    @tz varchar(50)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
  SELECT TOP 1
         TODATETIMEOFFSET(DATEADD(MINUTE, ntrvl.[OffsetMinutes], @utc), ntrvl.[OffsetMinutes])
         AS [LocalTime]
  FROM   [Tzdb].[Intervals] ntrvl
  WHERE  ntrvl.[ZoneId] = (SELECT zn.[ZoneId] FROM [Tzdb].[GetZoneId2](@tz) zn)
  AND    ntrvl.[UtcStart] <= @utc
  AND    ntrvl.[UtcEnd] > @utc;
GO

TESTS

SET NOCOUNT ON;

-- functional test
SELECT [Tzdb].[GetZoneId]('America/Los_Angeles');
SELECT * FROM [Tzdb].GetZoneId2('America/Los_Angeles');


-- performance test (run individually by highlighting to get most accurate results)
SET STATISTICS TIME ON;
SELECT [Tzdb].[GetZoneId]('America/Los_Angeles')
FROM sys.objects;
SET STATISTICS TIME OFF;

SET STATISTICS TIME ON;
SELECT t.*
FROM [Tzdb].GetZoneId2('America/Los_Angeles') t
CROSS JOIN sys.objects so
SET STATISTICS TIME OFF;
------------------------------------------------

-- functional test
SELECT Tzdb.UtcToLocal('2015-07-01 00:00:00', 'America/Los_Angeles')
SELECT * FROM Tzdb.[UtcToLocal2]('2015-07-01 00:00:00', 'America/Los_Angeles');


-- performance test (run individually by highlighting to get most accurate results)
SET STATISTICS TIME ON;
SELECT Tzdb.UtcToLocal(so.create_date, 'America/Los_Angeles')
FROM sys.objects so
SET STATISTICS TIME OFF;

SET STATISTICS TIME ON;
SELECT lcl.[LocalTime]
FROM sys.objects so
CROSS APPLY Tzdb.[UtcToLocal2](so.create_date, 'America/Los_Angeles') lcl;
SET STATISTICS TIME OFF;

@mattjohnsonpint
Copy link
Owner

@srutzky - Thanks! I'll try to carve out some time to look this over soon, and also at Maggie's suggestions in #14.

Hang in there. :)

@HolisticDeveloper
Copy link

@srutzky Thanks for sharing your UtcToLocal2 function -- the performance is significantly better!

Note, this is also subject to the TODATETIMEOFFSET bug described in #3.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants