PGReplay query limit ? #4

mvives-broadsign · 2018-05-04T12:50:59Z

Hi,
I'm currently working with pgreplay in order to evaluate a migration for my company and I am facing this kind of message:
Execution is 3 minutes behind schedule

I'm running pgreplay and postgres on two separate servers and the postgres server does not seem to have any load issue at all (CPU/RAM/I/O are good). The pgreplay server however has one CPU at 100% for a couple of hours now.
The pgreplay file I'm replaying contains around 20M records over a ~20-hour timeframe. It holds only read-only queries (it's traffic from a pg hot_standby).
Is it possible that we are hitting an issue where the machine running pgreplay is not powerful enough? (c5.large on AWS)

P.S.: I am also using the -j option of pgreplay, but seeing that the replay is still running after 12+ hours, I don't think it changes anything in our case :)


koleo · 2018-05-04T13:06:49Z

Maybe you want to use the "-s" option to speed up the replay on the destination host?

I often use pgreplay -j -s 10000 [...] to replay a sql load as fast as possible.

mvives-broadsign · 2018-05-04T13:40:55Z

I'm not that interested in speeding up the replay. I'm more interested in knowing whether the message Execution is 3 minutes behind schedule appears because postgres is overloaded or because pgreplay is overloaded. I tend to think it's pgreplay, but I am trying to confirm that now.
We also expect to test our master server, but that one handles 10 times as many queries, so if pgreplay is overloaded, it's going to be an issue.

laurenz · 2018-05-04T14:03:19Z

It means one of the following:

pgreplay is indeed overloaded and cannot cope with running that many statements simultaneously.
Your database is slower than the original database.

It is normal for pgreplay to keep one core busy, since it is constantly polling the database connections for messages from the database server; that is not necessarily a sign that it is not keeping up.

If execution does not fall more than 3 minutes behind schedule, I'd suspect that a couple of queries just took longer than expected. You might use a tool like pg_stat_statements or pgBadger to figure out which queries took a long time.

Usually, if the target system is consistently slower than the original database, you'll see pgreplay falling behind schedule more and more. If pgreplay does not fall more than 3 minutes behind schedule on a 20 hour run, I'd say there is nothing much to worry about.
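One way to tell "falling behind more and more" apart from a one-off delay is to collect the lag messages and look at their trend. The following is a minimal sketch, not part of pgreplay: the function names are made up for illustration, the regular expression is modeled only on the single message quoted in this thread, and the sample lines are hypothetical.

```python
import re

# Assumption: lag messages look like the one quoted in this thread.
# Other wordings (different units, for instance) are not handled.
LAG_RE = re.compile(r"Execution is (\d+) minutes behind schedule")

def lag_minutes(lines):
    """Return the reported lag values in minutes, in the order they appear."""
    lags = []
    for line in lines:
        match = LAG_RE.search(line)
        if match:
            lags.append(int(match.group(1)))
    return lags

def keeps_growing(lags, window=3):
    """True if the last `window` reported lags are strictly increasing."""
    tail = lags[-window:]
    return len(tail) == window and all(a < b for a, b in zip(tail, tail[1:]))

# Hypothetical captured output, not taken from a real run.
sample = [
    "Execution is 3 minutes behind schedule",
    "Execution is 5 minutes behind schedule",
    "Execution is 8 minutes behind schedule",
]
lags = lag_minutes(sample)
print(lags, keeps_growing(lags))
```

A lag that stays flat over a long run points to a few slow queries, while a steadily growing lag suggests the target (or pgreplay itself) cannot keep up.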

How many statements per second do you have? pgreplay tends to get overloaded if that goes into the thousands.
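As a rough sanity check, the average rate can be estimated from the figures in the original post (around 20M records over about 20 hours). This sketch assumes that each record corresponds to one statement, which may not hold exactly, and an average hides bursts, so peak rates can be considerably higher.

```python
def average_rate(records, hours):
    """Average number of records per second over the whole capture."""
    return records / (hours * 3600)

# Figures from the original post: ~20M records over ~20 hours.
rate = average_rate(20_000_000, 20)
print(f"{rate:.0f} statements/s on average")  # prints "278 statements/s on average"
```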

The option -j only speeds up execution if there are times without activity — it just skips these intervals instead of doing nothing.
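To illustrate why -j helps only when the workload has gaps, here is a toy model. It is an assumption-laden sketch and not how pgreplay is implemented: each statement is a hypothetical (start, duration) pair in seconds, and "jumping" simply removes the intervals during which no statement is running.

```python
def replay_duration(statements, jump_idle=False):
    """Estimate replay wall-clock time for (start, duration) pairs in seconds.

    Without jump_idle, the replay lasts from the first start to the last end.
    With jump_idle, only the time covered by at least one statement counts.
    """
    intervals = sorted((start, start + duration) for start, duration in statements)
    if not intervals:
        return 0.0
    if not jump_idle:
        return max(end for _, end in intervals) - intervals[0][0]

    busy = 0.0
    cur_start, cur_end = intervals[0]
    for start, end in intervals[1:]:
        if start > cur_end:
            # Gap with no activity: close the current busy span and skip ahead.
            busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping statement: extend the current busy span.
            cur_end = max(cur_end, end)
    busy += cur_end - cur_start
    return busy

with_gap = [(0, 2), (1, 3), (100, 1)]   # long idle stretch before the last statement
always_busy = [(0, 5), (3, 5)]          # no idle time at all

print(replay_duration(with_gap), replay_duration(with_gap, jump_idle=True))
print(replay_duration(always_busy), replay_duration(always_busy, jump_idle=True))
```

In this model the workload with an idle stretch shrinks a lot, while the always-busy workload takes the same time either way, which matches the situation of a continuously loaded standby.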

Since pgreplay is single-threaded, it won't be faster if you run it on a machine with more cores.

mvives-broadsign · 2018-05-04T14:08:31Z

Thanks for the explanations. I'll wait for the run to end in order to draw conclusions.
For the moment I'm at ~500 stmts/sec, but when testing the next server (probably in a month), I expect to reach ~5000 stmts/sec. Will that be an issue?

laurenz · 2018-05-04T14:12:01Z

I don't know the limits of pgreplay (never used it on such busy databases), but I have heard reports that it cannot keep up with very many statements per second.

Try it and give me feedback :^)

mvives-broadsign · 2018-05-04T14:14:22Z

All right, I'll let you know at that point ;)
In the meantime I'll try to finish the test with my current setup and see how far it drifts.

Thanks for the quick answers

mvives-broadsign · 2018-05-07T12:32:03Z

The test finished more or less properly (I'll open another question about that) and did not drift more than 3 minutes after 22 hours. ;)

laurenz added the question label May 4, 2018

mvives-broadsign closed this as completed May 7, 2018
