Connection enters .runningQuery state and never exits it. #92

Open
drewmccormack opened this issue Jun 6, 2019 · 13 comments

@drewmccormack

I have adopted the library to run direct SQL queries. At first I used a single shared connection. It was fast, but after about 30 minutes of running in production, the connection would enter the .runningQuery state and never exit it, jamming the whole app.

Passing a timeout option did not help, though I suspect that timeout applies to the PostgreSQL database itself, not to the library code.

I have since moved to creating a new connection for every request. This is much more costly, but at least it works. However, I do see memory rise gradually over a day (from about 12 MB to 48 MB). It is usable, but not nice. I suspect it is the same problem: connections enter the .runningQuery state and never exit it, so all the associated state is kept around.

I know I could use a pool to make things faster, but the bigger concern is the jammed connection state. Has anyone else seen this? Unfortunately, it takes a while to show up, so it is rare-ish, which makes testing more difficult. Perhaps the connection itself should have some sort of timeout; it would at least error out and then continue on properly.
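
To illustrate what I mean by a timeout on the connection itself, here is a minimal sketch of a client-side timeout wrapped around an asynchronous query. The runQuery closure and QueryOutcome type are hypothetical placeholders for this sketch, not the SwiftKuery API:

import Dispatch
import Foundation

// Hypothetical result type for this sketch; not part of SwiftKuery.
enum QueryOutcome {
    case finished
    case failed(Error)
    case timedOut
}

// Runs `runQuery` and, if its completion handler has not fired after `seconds`,
// reports .timedOut so the caller can recover instead of hanging forever.
func executeWithTimeout(_ seconds: Double,
                        runQuery: (@escaping (QueryOutcome) -> Void) -> Void,
                        completion: @escaping (QueryOutcome) -> Void) {
    let lock = DispatchSemaphore(value: 1)   // guards the completed flag
    var completed = false

    func finish(_ outcome: QueryOutcome) {
        lock.wait()
        defer { lock.signal() }
        guard !completed else { return }     // first caller wins, later ones are ignored
        completed = true
        completion(outcome)
    }

    DispatchQueue.global().asyncAfter(deadline: .now() + seconds) {
        finish(.timedOut)
    }

    runQuery { outcome in
        finish(outcome)
    }
}

With the real connection, runQuery would simply issue the query and translate its result into one of these cases.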

@ianpartridge
Contributor

Thanks for reporting. Do you have a recreate we could use?

@drewmccormack
Author

drewmccormack commented Jun 6, 2019

Not familiar with the term, I'm afraid. What is a "recreate"?

@ianpartridge
Contributor

Sorry. A test case or some example code that triggers the problem, so we can try to recreate it ourselves.

@drewmccormack
Author

No, I'm afraid not. That is part of the problem. It only appears after quite a long time running in production. Otherwise I could probably be more specific about what is going on.

I guess it would be quite difficult to investigate without a test. Perhaps it would show up if your connection was put under duress for a while.

@ianpartridge ianpartridge added this to the 2019.13 milestone Jun 19, 2019
@kilnerm
Contributor

kilnerm commented Jun 24, 2019

@drewmccormack Can you give some details on the kind of database operations you are using and the table structures they are operating on?

What version of SwiftKuery and SwiftKueryPostgreSQL are you using?

In the interim I'll have a comb through the code and see if I can spot anything that looks amiss.

@kilnerm
Contributor

kilnerm commented Jun 24, 2019

From eyeballing the code I think we may be missing self.setState(.idle) at line 428 of PostgreSQLConnection.swift.
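
To be clear about the kind of fix I mean, here is an illustrative sketch (not the actual PostgreSQLConnection source) of why a missing state reset on one exit path would leave a connection stuck in .runningQuery, and how a defer would guard against it:

// Illustrative only; names and structure do not match PostgreSQLConnection.swift.
enum ConnectionState {
    case idle
    case runningQuery
}

final class SketchConnection {
    private(set) var state: ConnectionState = .idle

    private func setState(_ newState: ConnectionState) {
        state = newState
    }

    func execute(_ sql: String, onCompletion: (Result<[String], Error>) -> Void) {
        setState(.runningQuery)
        defer { setState(.idle) }           // resets the state on every return path

        do {
            let rows = try runOnServer(sql) // stand-in for the real libpq calls
            onCompletion(.success(rows))
        } catch {
            // Without the defer, forgetting setState(.idle) in this catch would
            // leave the connection permanently in .runningQuery.
            onCompletion(.failure(error))
        }
    }

    private func runOnServer(_ sql: String) throws -> [String] {
        return []                           // placeholder
    }
}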

@drewmccormack
Author

I took a look. The code I am using, release 2.1.1, looks quite different at that point from what is in master. So I guess it cannot be exactly the line you mentioned, since I can't even find that catch, but it could be a problem in code that has already been updated in master.

@drewmccormack
Author

drewmccormack commented Jun 24, 2019

I'm afraid I am completely confused. If I look in my code (version 2.1.1), the func you reference at line 428 looks completely different from what is in master. And yet, using git blame, there don't seem to have been any changes made to that code since January (version 2.1.1 was released in April).

I downloaded the zipped code directly from GitHub to confirm that my copy of 2.1.1 is the same as the tagged version. It is.

What am I missing?

@kilnerm
Contributor

kilnerm commented Jun 24, 2019

Pretty sure 2.1.1 was tagged on a branch created from an earlier release, as we didn't want to publish all the commits at that time. Some commits would have been cherry-picked from master onto the branch before it was tagged.

When we make the next release the commit history will all line up again.

I'll have another look at the 2.1.1 version.

@kilnerm
Contributor

kilnerm commented Jun 24, 2019

So, looking at the 2.1.1 version, I don't see anything obvious...

Can you let me know the sort of operations you are running against the database so I can try to recreate the problem please?

@drewmccormack
Author

I don't think there is really anything in the SQL that would explain it. It is very straightforward stuff, and the server will run for many thousands of requests before a problem arises. If a particular query were the problem, I would expect it to fail the first time that query was run, but I don't think that is the nature of the issue.

Here are a few of the SQL commands we use, to give a flavor (but again, these are issued a lot, and they work):

INSERT INTO Users(identifier) VALUES ($1);
SELECT * FROM Users WHERE identifier=$1;
SELECT DISTINCT Users.* FROM Users INNER JOIN DeviceUserMap ON Users.identifier = DeviceUserMap.useridentifier WHERE DeviceUserMap.deviceidentifier = ANY($1);
SELECT * FROM Users WHERE lower(email)=$1;
UPDATE Users SET hashedpassword=$2, salt=$3, isadministrator=$4, emailisverified=$5, lastverificationemailsent=$6, numberofdevices=$7, mailoptinstatus=$8 WHERE identifier=$1;
INSERT INTO Devices(identifier, platform) VALUES ($1, $2) ON CONFLICT (identifier) DO NOTHING;
SELECT DISTINCT Devices.*, useridentifier FROM Devices INNER JOIN DeviceUserMap ON identifier = deviceidentifier WHERE identifier = $1;
SELECT DISTINCT Devices.*, useridentifier FROM Devices INNER JOIN DeviceUserMap ON identifier = deviceidentifier WHERE useridentifier = ANY($1);
UPDATE DeviceUserMap AS m SET useridentifier = CAST(v.useridentifier AS UUID) FROM (VALUES \(inputTable)) AS v(deviceidentifier, useridentifier) WHERE m.deviceidentifier = CAST(v.deviceidentifier AS UUID);
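
For context, each request uses the database roughly like this; the exact option names and execute overload below are written from memory and may not match the SwiftKueryPostgreSQL 2.1.1 API precisely:

import SwiftKuery
import SwiftKueryPostgreSQL

// Roughly how we issue the parameterized queries above, with a fresh
// connection per request (exact API names are from memory, treat as a sketch).
func fetchUser(withIdentifier identifier: String,
               completion: @escaping (QueryResult) -> Void) {
    let connection = PostgreSQLConnection(host: "localhost",
                                          port: 5432,
                                          options: [.databaseName("appdb"),
                                                    .userName("app")])
    connection.connect { error in
        if let error = error {
            completion(.error(error))
            return
        }
        connection.execute("SELECT * FROM Users WHERE identifier=$1",
                           parameters: [identifier]) { result in
            completion(result)
            connection.closeConnection()
        }
    }
}

The result handling is omitted; the point is just that every request opens, queries, and closes its own connection.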

@kilnerm
Contributor

kilnerm commented Jun 24, 2019

Ok, thanks.

I will piece something together and put some load through it to see if I can get the problem to occur.

@pushkarnk

pushkarnk commented Jun 24, 2019

Maybe we should ask @drewmccormack to reproduce the problem with a branch that has increased logging / debug prints? 🤔
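
For example, even something as simple as logging the start and end of every query would make the one that never calls back stand out. A rough sketch (the execute closure below is a placeholder, not the library's code):

import Dispatch
import Foundation

let logQueue = DispatchQueue(label: "query-logging")   // serialises the prints

// Wraps a query so its start and completion are both timestamped in the log;
// a query stuck in .runningQuery would show a START with no matching END.
func loggedExecute(_ sql: String,
                   execute: (@escaping () -> Void) -> Void) {
    let id = UUID().uuidString.prefix(8)
    logQueue.sync { print("[\(Date())] START \(id) \(sql)") }
    execute {
        logQueue.sync { print("[\(Date())] END   \(id)") }
    }
}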

@ianpartridge ianpartridge modified the milestones: 2019.13, 2019.14 Jul 3, 2019
@kilnerm kilnerm modified the milestones: 2019.14, 2019.15 Jul 17, 2019
@djones6 djones6 modified the milestones: 2019.15, 2019.17 Aug 5, 2019
@djones6 djones6 modified the milestones: 2019.17, 2019.18 Aug 28, 2019
@djones6 djones6 modified the milestones: 2019.18, 2019.19 Sep 11, 2019
@djones6 djones6 removed this from the 2019.19 milestone Oct 4, 2019