Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue with time zones #7854

Closed
shavluk opened this issue Nov 19, 2023 · 15 comments · Fixed by #7859
Closed

Performance issue with time zones #7854

shavluk opened this issue Nov 19, 2023 · 15 comments · Fixed by #7859

Comments

@shavluk
Copy link

shavluk commented Nov 19, 2023

I've identified a performance issue related to the execution speed when working with a global temporary table in 5.0.0.1271-RC2 Below is a simplified example to reproduce the problem:

create global temporary table gtt_test (
    id  integer not null,
    t   timestamp default current_timestamp
) on commit preserve rows;

alter table gtt_test add constraint pk_gtt_test primary key (id);

The problem occurs during the insertion of a large number of rows, and the performance has significantly decreased compared to Firebird 3.0.10. The insertion speed has dropped by approximately 2 times.

Here is an example of inserting a large number of rows:

execute block as
declare variable id_session int;
declare variable k int = 0;
begin
  while (k < 1000000) do
  begin
    insert into gtt_test(id) values (:k);
    k = k + 1;
  end
end
@sim1984
Copy link

sim1984 commented Nov 19, 2023

Starting with Firebird 4.0, current_timestamp returns a timestamp with time zone type value. Therefore, the compare is not quite correct. Replace with default localtimestamp and compare again.

@hvlad
Copy link
Member

hvlad commented Nov 19, 2023

Could you provide runtime statistics for both servers also ?

@shavluk
Copy link
Author

shavluk commented Nov 19, 2023

I have the server running in compatibility mode with DataTypeCompatibility = 3.0
The localtimestamp and current_timestamp show the same times

When using localtimestamp the problem does go away.

10000000 record(s) was(were) inserted into GTT_TEST
firebird 3.0
------ Performance info ------
Prepare time = 0ms
Execute time = 48s 625ms
Current memory = 442 306 032
Max memory = 446 333 296
Memory buffers = 50 000
Reads from disk to cache = 0
Writes from cache to disk = 33 317
Fetches from cache = 59 352 468

firebird 5.0 (localtimestamp)
------ Performance info ------
Prepare time = 0ms
Execute time = 42s 969ms
Current memory = 432 436 544
Max memory = 436 109 328
Memory buffers = 50 000
Reads from disk to cache = 0
Writes from cache to disk = 7 487
Fetches from cache = 59 391 477

firebird 5.0 (current_timestamp)
------ Performance info ------
Prepare time = 0ms
Execute time = 1m 54s 344ms
Current memory = 432 425 632
Max memory = 436 098 416
Memory buffers = 50 000
Reads from disk to cache = 0
Writes from cache to disk = 7 487
Fetches from cache = 59 391 477

@sim1984
Copy link

sim1984 commented Nov 19, 2023

Parameter DataTypeCompatibility = 3.0 in this case, it does not play any role. It only affects the data sent to the client. In this case, execution takes place on the server. Current_timestamp returns a type with a timezone, and then it is converted to a type without a timezone. There is a performance loss on this. I would compare the different options:
Timestamp default current_timestamp
Timestamp with time zone default current_timestamp
Timestamp default localtimestamp

@hvlad
Copy link
Member

hvlad commented Nov 19, 2023

The problem is definitely not related with GTT's

@shavluk
Copy link
Author

shavluk commented Nov 19, 2023

The error is related to the speed of converting the "timestamp with time zone" format to "timestamp"

@hvlad
Copy link
Member

hvlad commented Nov 19, 2023

Here is test case

execute block
  returns (test varchar(16), duration int)
as
declare n int;
declare m int = 100000;
declare ts timestamp;
declare ts2 timestamp;
declare tz timestamp with time zone;
declare tz2 timestamp with time zone;
declare t0 timestamp;
begin
  tz = current_timestamp;

  n = 0;
  t0 = cast('now' as timestamp);
  while (n < m) do
  begin
    ts = tz;
    n = n + 1;
  end

  test = 'TZ -> TS';
  duration = datediff(millisecond, t0, cast('now' as timestamp));
  suspend;

  n = 0;
  t0 = cast('now' as timestamp);
  while (n < m) do
  begin
    tz = ts;
    n = n + 1;
  end

  test = 'TS -> TZ';
  duration = datediff(millisecond, t0, cast('now' as timestamp));
  suspend;

  n = 0;
  t0 = cast('now' as timestamp);
  while (n < m) do
  begin
    ts2 = ts;
    n = n + 1;
  end

  test = 'TS -> TS';
  duration = datediff(millisecond, t0, cast('now' as timestamp));
  suspend;

  n = 0;
  t0 = cast('now' as timestamp);
  while (n < m) do
  begin
    tz2 = tz;
    n = n + 1;
  end

  test = 'TZ -> TZ';
  duration = datediff(millisecond, t0, cast('now' as timestamp));
  suspend;
end

My results:

TEST DURATION
TZ -> TS 221
TS -> TZ 246
TS -> TS 12
TZ -> TZ 13

@sim1984
Copy link

sim1984 commented Nov 19, 2023

I think there is something to investigation here. Overhead costs due to type conversion are unavoidable, but perhaps they can be made significantly lower. I suggest renaming the ticket according to the real problem.

@hvlad
Copy link
Member

hvlad commented Nov 19, 2023

The most time consuming operation is creating ICU's UCalendar object.

@asfernandes asfernandes self-assigned this Nov 20, 2023
@asfernandes
Copy link
Member

I'm looking at it. Will cache UCalendar instances as it's done with other ICU objects.

@hvlad
Copy link
Member

hvlad commented Nov 20, 2023

I have a patch, would you like to look at it?

@asfernandes
Copy link
Member

You'd better assign items to yourself when you are actively implementing a fix.
Not the first time I'm implementing something and you said you already did.
I know you were investigating, but it was not clear you were already going to implement the fix and the ticket was unassigned.

@hvlad
Copy link
Member

hvlad commented Nov 20, 2023

You'd better assign items to yourself when you are actively implementing a fix. Not the first time I'm implementing something and you said you already did.

Ok. But note - I don't assign myself until I'm sure I have the proper solution.

I know you were investigating, but it was not clear you were already going to implement the fix and the ticket was unassigned.

Just ask.

hvlad added a commit that referenced this issue Nov 20, 2023
@hvlad
Copy link
Member

hvlad commented Nov 20, 2023

Here it is #7858.
I still not sure it is fully correct, so review is appreciated.

@asfernandes
Copy link
Member

You'd better assign items to yourself when you are actively implementing a fix. Not the first time I'm implementing something and you said you already did.

Ok. But note - I don't assign myself until I'm sure I have the proper solution.

So you are using it wrong.

I know you were investigating, but it was not clear you were already going to implement the fix and the ticket was unassigned.

Just ask.

"Assign" is a form of communication. When I look at unassigned issue, nobody is implementing it.

As there was previous case of wrong usage of it, I overly communicated my intention with "I'm looking at it. Will cache UCalendar instances as it's done with other ICU objects."

But please from now, assign tickets and if you were not going to fix anymore, unasign.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment