AATAMS_ACOUSTIC_REPORTING: deadlock #311
@xhoenner No idea. But maybe the answer lies in the question? - "In the same time..." |
so what? re-run the harvester tonight on 10-nsp and see if that works? |
I don't know if those two harvesters running together can lock the same tables, so this is why I mentioned it. Yes, run on 10-nsp-mel and see what happens. We can also schedule them to run at different times. |
Alright, we don't have the deadlock issue anymore, but now have a weird connection issue. Have the data bags been changed @danfruehauf? |
@xhoenner The credentials look OK. No idea... |
@danfruehauf please make changes in chef to run this harvester on another day (Thursday?). I ran it on Monday and Tuesday this week and it didn't throw this error. |
Changed to Thursday. |
This seems to have solved this issue. |
Occurred yesterday on 10-aws @lwgordonimos
In finally block
Exception in component iPostgresqlOutput_4
org.postgresql.util.PSQLException: ERROR: deadlock detected
Detail: Process 27575 waits for AccessExclusiveLock on relation 31181122 of database 16415; blocked by process 25913.
Process 25913 waits for AccessShareLock on relation 48016904 of database 16415; blocked by process 27575.
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1592)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1327)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:192)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:451)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:336)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:282)
at aatams_acoustic_reporting.populate_layers_0_1.Populate_layers.tPostgresqlInput_3Process(Populate_layers.java:3660)
at aatams_acoustic_reporting.populate_layers_0_1.Populate_layers.tPostgresqlInput_2Process(Populate_layers.java:3180)
at aatams_acoustic_reporting.populate_layers_0_1.Populate_layers.tPostgresqlInput_1Process(Populate_layers.java:2027)
at aatams_acoustic_reporting.populate_layers_0_1.Populate_layers.tPostgresqlConnection_1Process(Populate_layers.java:841)
at aatams_acoustic_reporting.populate_layers_0_1.Populate_layers.iIncludeSdiLibraries_1Process(Populate_layers.java:718)
at aatams_acoustic_reporting.populate_layers_0_1.Populate_layers.runJobInTOS(Populate_layers.java:4435)
at aatams_acoustic_reporting.populate_layers_0_1.Populate_layers.runJob(Populate_layers.java:4302)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.tRunJob_4Process(aatams_acoustic_harvester.java:2273)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.tRunJob_1Process(aatams_acoustic_harvester.java:2153)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.iPostgresqlDbUpdate_1Process(aatams_acoustic_harvester.java:1981)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.runJobInTOS(aatams_acoustic_harvester.java:4422)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.main(aatams_acoustic_harvester.java:4214)
Exception in component tRunJob_4
java.lang.RuntimeException: Child job running failed
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.tRunJob_4Process(aatams_acoustic_harvester.java:2284)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.tRunJob_1Process(aatams_acoustic_harvester.java:2153)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.iPostgresqlDbUpdate_1Process(aatams_acoustic_harvester.java:1981)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.runJobInTOS(aatams_acoustic_harvester.java:4422)
at aatams_acoustic_reporting.aatams_acoustic_harvester_0_1.aatams_acoustic_harvester.main(aatams_acoustic_harvester.java:4214)
finish;2016-11-25 03:00:19+11:00 538 minutes |
neat! |
Unfortunately can't tell what those PIDs were exactly, but the TRUNCATE was waiting for something that locked aatams_acoustic_detections_map. |
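For next time, a lock inspection along these lines (standard pg_locks/pg_stat_activity catalogs; the relation name is assumed from the error above) should show which sessions hold or wait on the table while both harvesters are running:

-- Sketch: list sessions holding or awaiting locks on the map table.
SELECT l.pid, l.mode, l.granted, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'aatams_acoustic_detections_map'::regclass;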
Is the harvester just copying tables from the aatams database into harvest - in order to perform reporting? I wonder if it would be possible to move the aatams db in under a harvest schema. Then the harvester could be eliminated altogether. If the data is also being used to serve geoserver layers - then presumably that could be handled with some supporting view code? |
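As a rough sketch of that "supporting view" idea (schema and table names here are purely hypothetical), GeoServer layers could keep their existing names while the data lives in the relocated schema:

-- Hypothetical names: expose a relocated table under the name the layer expects.
CREATE OR REPLACE VIEW aatams_acoustic_reporting.aatams_acoustic_detections_map AS
SELECT *
FROM aatams.detections_map;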
The other thing I've done in the past (not with Postgres, but Oracle/SQL Server with good success) is to use a linked database to transparently access a remote database in a "local" namespace. Could also be an option if combining databases is unsuitable due to size or any other concern which keeps them separate... Have you used this in Postgres? |
Wasn't aware of dblink - looks interesting. We did use Postgres read replication, which was good in that it separated out the GeoServer/WMS/WFS load from the reporting/backup load. |
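For reference, a minimal dblink setup looks roughly like this (connection string and table name are placeholders, not our actual config):

-- Enable the extension locally, then query the remote database through it.
CREATE EXTENSION IF NOT EXISTS dblink;
SELECT id, code
FROM dblink('host=aatams-db dbname=aatams', 'SELECT id, code FROM some_table')
     AS remote(id int, code text);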
That's already the purpose of an entire subjob. The other subjobs generate the views for reporting and GeoServer. |
Would be interesting to see if the views could be modified to use a linked database under the hood, e.g. http://stackoverflow.com/questions/13993302/postgresql-slow-query-dblink-and-inner-join

SELECT *
FROM table1 tb1
LEFT JOIN (
    SELECT *
    FROM dblink('dbname=db2', 'SELECT id, code FROM table2')
         AS tb2(id int, code text)
) remote USING (code); |
That's very neat. |
haha that's exactly it, the much dreaded snail! @pblain, sounds like it is time for Talend harvesters to have their own icons, similar to Jenkins for instance. Any chance this can be added to the backlog? |
Morning made @xhoenner |
So, talking to @xhoenner this morning, the harvester currently:
Truncates are performed in the same transaction as the copying of data. In Postgres, a truncate takes an exclusive lock on the table, so nothing else can access the table from outside the transaction until it is committed; conversely, the truncate itself cannot proceed until that exclusive access can be granted. Many of the copies take a very long time when using the valid_detections data as their source, so exclusive locks are held for long periods, preventing any other access to the data (e.g. for data downloads, backups, etc.). Other harvesters have moved away from truncating tables in favor of updating only file-related data, so exclusive locks of this type aren't required. |
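To illustrate the pattern (table and column names below are assumed, not the harvester's actual SQL): because TRUNCATE takes an ACCESS EXCLUSIVE lock that is only released at COMMIT, the slow copy keeps every other reader out for its whole duration:

-- Sketch of the problematic transaction shape.
BEGIN;
TRUNCATE aatams_acoustic_detections_map;          -- ACCESS EXCLUSIVE lock acquired here
INSERT INTO aatams_acoustic_detections_map (detection_id, geom)
SELECT detection_id, geom
FROM valid_detections;                            -- long-running copy, lock still held
COMMIT;                                           -- lock released only now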
All the above takes place in the aatams_acoustic_reporting schema. Once a month, data from the reporting views is copied to the reporting schema for reporting (this is summary data, so it is not a long-running process). |
It actually takes place in the |
Thanks for the clarification @xhoenner. Some possible ways we could look at resolving this: |
The dblink functionality looks pretty limited - all the data that you want to work with needs to be fetched locally anyway. |
What about using dblink to link the data, and then using the projection as the source for the copy action, to co-locate it in one place so that it can be accessed efficiently? That might eliminate the need to maintain a complicated harvester as well, since the required SQL to perform the copy could just be ordinary reporting SQL. |
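Something like the following could do that copy in plain SQL (all object names here are illustrative only):

-- Materialise the remote projection into a local reporting table via dblink.
INSERT INTO aatams_acoustic_reporting.detections_local (id, code)
SELECT id, code
FROM dblink('dbname=aatams', 'SELECT id, code FROM valid_detections')
     AS remote(id int, code text);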
Good suggestion - no need to stream the data via Talend |
But not required if we are working in the aatams database. |
Seems to be fixed, as the harvester ran successfully on 10-aws. |
Harvester failed on 10-nsp with the following deadlock error returned:
Importantly, the same harvester ran successfully on 14-nsp at the same time... any idea @danfruehauf?