
(Massive) Performance degradation when preparing a command #2603

Open
kcliffe opened this issue Sep 3, 2019 · 7 comments


@kcliffe

commented Sep 3, 2019

We have a fairly basic scenario in a .NET Core 2.2 API solution that produces a parameterized query.

An extract:

using (var cmd = new NpgsqlCommand(query))
{
    conn.Open();
    cmd.Connection = conn;

    cmd.Parameters.AddWithValue("farmId", farmId);
    cmd.Parameters.AddWithValue("activityTypes", activityTypes);
    cmd.Parameters.AddWithValue("products", products);
    cmd.Parameters.AddWithValue("laus", laus);
    cmd.Parameters.AddWithValue("offset", offset);
    cmd.Parameters.AddWithValue("limit", limit);
    cmd.Parameters.AddWithValue("startdate", from);
    cmd.Parameters.AddWithValue("enddate", to);
    cmd.Parameters.AddWithValue("keyword", keyword ?? "");
    await cmd.PrepareAsync();

    using (var reader = await cmd.ExecuteReaderAsync())
    {
        var sb = new StringBuilder();
        for (int i = 0; i < reader.FieldCount; i++)
        {
            sb.Append(reader.GetName(i));
            sb.Append(",");
        }

        Log.Debug($"Field names: {sb.ToString()}");

        while (reader.Read())
        {
            rows.Add(new Model.Activity
            {
                ...
            });
        }

        await reader.NextResultAsync();
        await reader.ReadAsync();

        var totalItems = reader.GetInt32(0);
        totalPages = (int)Math.Ceiling((double)totalItems / limit);
    }
}

When load testing this repository operation with the line await cmd.PrepareAsync() in place, response times at 20 requests per second exceed 30 seconds(!).

The load test executes exactly the same query for each request.

Removing the cmd.PrepareAsync() line gives us ~400 ms responses at 50 requests per second.

Reading the guidance, I understood that preparing a query that is executed repeatedly was recommended, so presumably something we are doing is not correct?

The results are similar whether the solution is run on my desktop or in Docker (Linux container) in ECS.

I switched to the synchronous Prepare() version just in case some sort of async context switch was happening, but there was no change.

Further technical details

Npgsql version: 4.0.8
PostgreSQL version: 10.7 (AWS Aurora)
Operating system: Windows 10 Enterprise (1809)

Other details about my project setup:

@roji

Member

commented Sep 4, 2019

This should definitely not be happening. Can you please post a full, minimal app? The SQL is missing, as is the database schema... Please provide all the details.

@kcliffe

Author

commented Sep 4, 2019

Here's a start:

(I should note - we have ~45 ms latency from test client to the service / database in ECS)

using Npgsql;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace NPGSql_Perf_Repro
{
    class Program
    {
        static void Main(string[] args)
        {
            ///
            /// Notes
            /// The benchmark method below takes a range of args.
            /// During the perf tests we ran, ONLY farmId was supplied.
            ///
            /// Our testing method:
            ///     The repo class is wrapped by API controller operations.
            ///     We ran the solution in release mode from the command line.
            ///     We configured artillery (artillery.io) to ramp up to 50 users over 120 seconds and then maintain that rate for 3 minutes.
            ///     With the command.Prepare() included, sharp declines in performance were seen almost immediately.
            ///     Tests exercised the repo method with THE SAME ARGS each time.

            /// The database:
            ///     - PostgreSQL 10.7, db.r5.xlarge (4 CPU, 32 GB of RAM, 10 Gbps network)
            ///     - approx 959110 rows in the view, 3618377 in the underlying table
            ///

        }
    }

    public class Repo
    {
        // fixed field list, so the ordinal indexes used when reading the rows stay stable
        private const string FieldList = @"a.objectid,""source"",farmid,""description"",""operator"",activitytype,rateunits,activitydate,a.applicationsource,rate,area,isconfirmed,canconfirm,requiresnutrients,datasource";
        private readonly string _connectionString;

        public Repo(string connectionString)
        {
            _connectionString = connectionString;
        }

        /// <summary>
        ///     Benchmark method
        /// </summary>
        /// <param name="offset"></param>
        /// <param name="limit"></param>
        /// <param name="farmId"></param>
        /// <param name="from"></param>
        /// <param name="to"></param>
        /// <param name="activityTypes"></param>
        /// <param name="laus"></param>
        /// <param name="products"></param>
        /// <param name="keyword"></param>
        /// <param name="orderColumn"></param>
        /// <param name="orderDir"></param>
        /// <returns></returns>
        public async Task<PagedResult> ListActivities(int offset,
            int limit,
            int farmId,
            DateTime from,
            DateTime to,
            List<string> activityTypes,
            List<string> laus,
            List<string> products,
            string keyword,
            string orderColumn,
            string orderDir)
        {
            int totalPages = 0;
            var rows = new List<Activity>();

            // TODO. Review Npgsql connection pooling docs
            using (var conn = new NpgsqlConnection(_connectionString))
            {
                StringBuilder baseQuery = new StringBuilder();
                baseQuery.Append($@"
                        from vwactivities a
                        where
                            farmid = @farmId");

                if (activityTypes.Any())
                {
                    baseQuery.Append(" and activitytype = ANY(ARRAY[@activityTypes]::text[])");
                }

                if (products.Any())
                {
                    baseQuery.Append(" and description = ANY(ARRAY[@products]::text[])");
                }

                baseQuery.Append(" and activitydate >= @startdate and activitydate <= @enddate");

                if (!string.IsNullOrWhiteSpace(keyword))
                {
                    keyword = $"%{keyword.Trim(' ', '%')}%";
                    baseQuery.Append(" and (description ilike @keyword or operator ilike @keyword)");
                }

                // build count query with different select statement
                var countQuery = string.Concat("select count(1) ", baseQuery.ToString());

                // we can't set the ORDER BY column/direction via parameters; they get quoted, which produces a syntax error
                baseQuery.Append($" order by a.{orderColumn} {orderDir}");
                baseQuery.Append(" offset @offset limit @limit;");

                var query = string.Concat($"select {FieldList} ", baseQuery.ToString(), countQuery);
            
                using (var cmd = new NpgsqlCommand(query))
                {
                    conn.Open();
                    cmd.Connection = conn;

                    cmd.Parameters.AddWithValue("farmId", farmId);
                    cmd.Parameters.AddWithValue("activityTypes", activityTypes);
                    cmd.Parameters.AddWithValue("products", products);
                    cmd.Parameters.AddWithValue("laus", laus);
                    cmd.Parameters.AddWithValue("offset", offset);
                    cmd.Parameters.AddWithValue("limit", limit);
                    cmd.Parameters.AddWithValue("startdate", from);
                    cmd.Parameters.AddWithValue("enddate", to);
                    cmd.Parameters.AddWithValue("keyword", keyword ?? "");

                    // KILLS performance. Issue logged with the Npgsql team
                    //await cmd.PrepareAsync();

                    using (var reader = await cmd.ExecuteReaderAsync())
                    {
                        //var sb = new StringBuilder();
                        //for (int i = 0; i < reader.FieldCount; i++)
                        //{
                        //    sb.Append(reader.GetName(i));
                        //    sb.Append(",");
                        //}

                        while (reader.Read())
                        {
                            rows.Add(new Activity
                            {
                                ObjectId = (int)reader.GetInt64(0),
                                Source = reader.IsDBNull(1) ? null : reader.GetString(1),
                                FarmId = reader.GetInt32(2),
                                Description = reader.IsDBNull(3) ? null : reader.GetString(3),
                                Operator = reader.IsDBNull(4) ? null : reader.GetString(4),
                                ActivityType = reader.IsDBNull(5) ? null : reader.GetString(5),
                                RateUnits = reader.IsDBNull(6) ? null : reader.GetString(6),
                                ActivityDate = reader.GetDateTime(7),
                                ApplicationSource = reader.IsDBNull(8) ? null : reader.GetString(8),
                                Rate = reader.IsDBNull(9) ? null : reader.GetString(9),
                                Area = reader.IsDBNull(10) ? null : reader.GetString(10),
                                IsConfirmed = reader.GetBoolean(11),
                                CanConfirm = reader.GetBoolean(12),
                                RequiresNutrients = reader.GetBoolean(13),
                                Datasource = reader.IsDBNull(14) ? null : reader.GetString(14)
                            });
                        }

                        await reader.NextResultAsync();
                        await reader.ReadAsync();

                        var totalItems = reader.GetInt32(0);
                        totalPages = (int)Math.Ceiling((double)totalItems / limit);

                        if (totalPages > offset)
                            // TODO. fix the int / long thing
                            offset = totalPages - 1;
                    }

                    return new PagedResult(rows.ToList(), totalPages, 0, 0);
                }
            }
        }
    }

    public class PagedResult
    {
        public PagedResult(IList<Activity> activities, long totalPages, long queryBuildMs, long queryExecutionMs)
        {
            Activities = activities;
            TotalPages = totalPages;
            QueryBuildMs = queryBuildMs.ToString();
            QueryExecutionMs = queryExecutionMs.ToString();
        }

        public IList<Activity> Activities { get; }
        public long TotalPages { get; }

        public string QueryBuildMs { get; set; }
        public string QueryExecutionMs { get; set; }
    }

    public class Activity
    {
        public int ObjectId { get; set; }
        public string Source { get; set; }
        public int FarmId { get; set; }
        public string Description { get; set; }
        public string Operator { get; set; }
        public string ActivityType { get; set; }
        public string Area { get; set; }
        public string Rate { get; set; }
        public string RateUnits { get; set; }
        public DateTime ActivityDate { get; set; }
        public string ApplicationSource { get; set; }
        public bool IsConfirmed { get; set; }
        public bool CanConfirm { get; set; }
        public bool RequiresNutrients { get; set; }
        public string Datasource { get; set; }
    }
}
@kcliffe

Author

commented Sep 4, 2019

Schema:

CREATE TABLE public.activity (
	objectid int4 NOT NULL,
	"source" varchar(5) NOT NULL,
	farmid int4 NOT NULL,
	description varchar(150) NULL,
	"operator" varchar(250) NULL,
	activitytype varchar(256) NULL,
	rateunits varchar(100) NULL,
	activitydate timestamp NOT NULL,
	applicationsource varchar(50) NOT NULL,
	rate text NULL,
	area varchar(150) NULL,
	requiresnutrients bool NOT NULL DEFAULT false,
	isconfirmed bool NOT NULL DEFAULT true,
	canconfirm bool NOT NULL DEFAULT false,
	datasource varchar(50) NULL,
	CONSTRAINT activity_pkey PRIMARY KEY (objectid,"source")
);

CREATE INDEX activity_activitydate_idx ON public.activity USING btree (activitydate DESC);
CREATE INDEX activity_type_idx ON public.activity USING btree (activitytype);
CREATE INDEX activity_farmid_idx ON public.activity USING btree (farmid);
CREATE INDEX activity_operator_idx ON public.activity USING btree (operator DESC);
CREATE INDEX activity_description_idx ON public.activity USING btree (description);
CREATE INDEX activity_date_idx ON public.activity USING btree (activitydate);

create view vwactivities
as
	SELECT objectid, "source", farmid, "description", "operator", activitytype, rateunits, activitydate, applicationsource, rate, "area",
		requiresnutrients, isconfirmed, canconfirm, datasource
	FROM public.activity
	WHERE activitytype not in ('GeometryDelete', 'GeometryAdd','GeometryEdit', 'AttributeModification', 'BlockAttributeModification' );
@kcliffe

Author

commented Sep 4, 2019

Sample Data:
(I can generate as many of these as you like, but posting them as text here would be an issue.)

INSERT INTO public.activity (objectid,"source",farmid,description,"operator",activitytype,rateunits,activitydate,applicationsource,rate,area,datasource,requiresnutrients,isconfirmed,canconfirm) VALUES 
(6711709,'basic',60862850,'Rolling','Self completed','Cultivation',NULL,'2019-09-03 12:00:00.000','gis_farm.sde.spatialevent',NULL,'4.92000000',NULL,false,true,false)
,(6711710,'basic',60862850,'80 Calves',NULL,'Grazing',NULL,'2019-09-03 12:00:00.000','gis_farm.sde.spatialevent',NULL,'4.92000000',NULL,false,true,false)
,(6711711,'basic',60862850,'Combine','Self completed','Harvest','t','2019-09-03 12:00:00.000','gis_farm.sde.spatialevent','1000.00000000','4.92000000',NULL,false,true,false)
,(6711712,'basic',60862850,'Pond','Self completed','Irrigation','l','2019-09-03 12:00:00.000','gis_farm.sde.spatialevent','30000.00000000','4.92000000',NULL,false,true,false)
,(6711713,'basic',60862850,NULL,NULL,'GeometryAdd',NULL,'2019-09-03 12:00:00.000','gis_farm.sde.spatialevent',NULL,NULL,NULL,false,true,false)
,(6712509,'basic',60862850,'SWEDE SEED','NORTH OTAGO GROUNDSPREAD(JV)','Seed','kg/ha','2019-09-03 12:00:00.000','gis_farm.sde.spatialevent','1.00000000','4.92000000',NULL,false,true,false)
,(6712510,'basic',60862850,'Dig some swedes',NULL,'CustomerDefined',NULL,'2019-09-03 12:00:00.000','gis_farm.sde.spatialevent',NULL,'4.92000000',NULL,false,true,false)
,(6712511,'basic',60862850,NULL,NULL,'BlockAttributeModification',NULL,'2019-09-03 12:00:00.000','gis_farm.sde.spatialevent',NULL,NULL,NULL,false,true,false)
,(6712109,'basic',60862850,'pond','Self completed','Effluent','l/ha','2019-09-03 12:00:00.000','gis_farm.sde.spatialevent','10000.00000000','4.92000000',NULL,false,true,false)
,(6712110,'basic',60862850,'1720000-15% POT SERP SUPER','NORTH OTAGO GROUNDSPREAD(JV)','Fertiliser','kg/ha','2019-09-03 12:00:00.000','gis_farm.sde.spatialevent','500.00000000','4.92000000',NULL,false,true,false)

@roji roji added the performance label Sep 8, 2019

@roji roji added this to the 4.1 milestone Sep 8, 2019

@roji roji self-assigned this Sep 11, 2019

@roji

Member

commented Sep 11, 2019

@kcliffe, looking at your code sample in depth, it seems that you're dynamically generating your SQL statements. There's nothing wrong with that, unless it generates an explosion of completely different SQL strings. If that occurs and you prepare all of them, you risk tying up a lot of resources at the server (since prepared statements have a cost) without getting any preparation benefit (assuming each individual statement variant isn't executed very often).
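One way to check whether that kind of explosion is happening is to ask PostgreSQL how many statements the current session has prepared; the pg_prepared_statements system view lists them. A rough sketch only - it has to run on the same open connection the prepared commands use (conn here stands in for that connection):

// Diagnostic sketch: pg_prepared_statements lists everything prepared in the
// current session. A count that keeps climbing under load suggests that many
// distinct SQL variants are being prepared.
using (var check = new NpgsqlCommand("SELECT count(*) FROM pg_prepared_statements", conn))
{
    var preparedCount = (long)check.ExecuteScalar();
    Console.WriteLine($"Prepared statements in this session: {preparedCount}");
}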

Your code sample isn't a complete repro - I have no idea how you're calling the ListActivities function or with what kind of parameters. The best thing to do would be to run a simple test that repeatedly calls it with the same parameter set, to make sure you're generating exactly one SQL variant. If the preparation perf problem disappears, we know it's related either to SQL statement explosion or to some specific issue in the generation logic (although nothing stands out). If the problem persists, you can then submit a much simpler repro, which would allow me to investigate further.

One unrelated note: to get the total row count, rather than reissuing the same query with COUNT(*), you can simply count the rows client-side as you consume them with the reader. Another, more complex option is to examine NpgsqlCommand.Statements[i].Rows, which gets populated automatically after you consume a statement's resultset (though this is an Npgsql-specific extension which may change in the future).
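For the first option, something along these lines (just a sketch built on your reader loop above; MapActivity is a placeholder for the object-initializer mapping in your code):

// Sketch: count the rows while materializing them, instead of issuing a second
// COUNT query. (As noted, NpgsqlCommand.Statements[0].Rows is also populated
// once the first resultset has been fully consumed.)
var rowCount = 0;
while (await reader.ReadAsync())
{
    rows.Add(MapActivity(reader)); // placeholder for the mapping shown earlier
    rowCount++;
}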

Please note that 4.1 will be released at around the same time as .NET Core 3.0 (so very soon) - it would be great if we could figure this out quickly and merge any fixes before then, if any are needed.

@kcliffe

Author

commented Sep 12, 2019

Hi Shay,

We found during load testing that our AWS ECS instance (hosting our microservice in Docker) was limited to fairly low network throughput. Essentially, above ~40 requests per second we exceed the available bandwidth. I'm guessing that Prepare() suffers because of the additional network call.

What I cannot explain is that I got the same result locally and via ECS. On my machine, network bandwidth should be a lot higher - I'm guessing we may have been throttled network-wise at the database end.

At least I thought so, until I wrote the code below - in which case, with or without the prepare, it runs at the same speed (the prepared version is no quicker, but no slower).

So right now I cannot explain why running HTTP load tests against the service, either on my machine or in ECS, is a lot slower with Prepare().

Our plan is to use Performance Insights on the DB server to try to see what's happening, but I can't say when we will get around to that.

Re the count query you mention: the query we construct is paged via the simple offset/limit technique, which is why we build another query. We need to count the POSSIBLE TOTAL results given the same filters.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;
using NUnit.Framework; // assuming NUnit, given the [Test] attribute

namespace Service.Atlas.Tests
{
    public class IssuePerfTest
    {
        [Test]
        public async Task Test()
        {
            var activityRepo = new ActivityRepo("Host=dev-materializedviews-cluster.cluster-ro-ckxycbarkbtw.ap-southeast-2.rds.amazonaws.com;Database=postgres;Port=5432;Username=ravensdown_ro;Password=#pU5V5h5fcxd");
            DateTime from = DateTime.UtcNow.AddDays(-14);
            DateTime to = DateTime.UtcNow;
            var stringList = new List<string>();

            var stopWatch = new Stopwatch();
            stopWatch.Start();

            for (int i = 0; i < 500; i++)
            {
                await activityRepo.ListActivities(0, 20, 60869943, from, to, stringList, stringList, stringList, "", false, "activitydate", "asc");
                Console.WriteLine(i);
            }

            Console.WriteLine(stopWatch.Elapsed);
        }
    }
}
@roji

Member

commented Sep 12, 2019

40 requests per sec, we exceed available bandwidth. I'm guessing that prepare() will suffer because of the additional network call.

Since prepared statements are persisted, preparation should only do an additional network call the first time a given SQL variant is executed on a given physical connection. In other words, there's an internal prepared statement cache that does not get cleared when a pooled connection is closed (i.e. returned to the pool). The next time the connection is returned by the pool, calling Prepare should do nothing at all, since the prepared statement is already cached. See these docs for more info.
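As a rough illustration of what that means in practice (just a sketch; the connection string and SQL below are placeholders, and pooling is on, which is the default):

// Sketch: closing a pooled connection returns it to the pool and keeps its
// prepared statements. Only the first Prepare() below should cost a round trip;
// the second should find the statement already prepared on that physical connection.
const string connString = "Host=localhost;Username=test;Password=test;Database=test"; // placeholder
const string sql = "SELECT count(*) FROM activity WHERE farmid = @farmId";            // placeholder

for (var i = 0; i < 2; i++)
{
    using (var conn = new NpgsqlConnection(connString))
    using (var cmd = new NpgsqlCommand(sql, conn))
    {
        conn.Open();
        cmd.Parameters.AddWithValue("farmId", 42);
        cmd.Prepare();
        cmd.ExecuteScalar();
    } // disposing/closing returns the connection to the pool; the prepared statement survives
}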

In which case with or without the prepare runs at the same speed (prepare version is no quicker, but no slower).

Preparation should definitely make this code execute faster - I'd double-check that preparation was actually enabled when testing. If it was, then, since there's no degradation, we can probably assume there's something about your dynamic query composition that's causing the issue. You may want to play around with your minimal repro to make it closer to what you're doing in production, in the hope that the perf issue reappears.

Note that variables that simply go into parameters aren't a problem: different parameter values don't create extra SQL variants and are all handled by the same prepared statement.
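For instance (again just a sketch, with placeholder SQL and an already-open connection conn): executing the same prepared command with different parameter values reuses the single server-side prepared statement; only changing the SQL text itself would create a new variant.

// Sketch: one prepared statement serves every value of @startdate.
using (var cmd = new NpgsqlCommand("SELECT count(*) FROM activity WHERE activitydate >= @startdate", conn))
{
    cmd.Parameters.AddWithValue("startdate", DateTime.UtcNow.AddDays(-14));
    cmd.Prepare();

    for (var daysBack = 1; daysBack <= 30; daysBack++)
    {
        cmd.Parameters["startdate"].Value = DateTime.UtcNow.AddDays(-daysBack); // new value, same prepared statement
        Console.WriteLine(cmd.ExecuteScalar());
    }
}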

Re the count query you mention, the query we construct is paged via the simple offset / limit technique, thats why we build another query. We need to count the POSSIBLE TOTAL results given the same filters.

Makes sense, I missed that in your code.

At this point I don't really have anything pointing to an issue in Npgsql (or at least no repro). I'll put this into the backlog for a while in the hope that you can provide more info.

@roji roji modified the milestones: 4.1, Backlog Sep 12, 2019
