Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"idle in transaction" connection #3260

Closed
silenius opened this issue Sep 28, 2016 · 8 comments

Comments

Projects
None yet
6 participants
@silenius
Copy link

commented Sep 28, 2016

CKAN Version if known (or site URL)

2.5.2, full venv below:

(ckan_bbpf) www2# pip freeze -l
Babel==0.9.6
Beaker==1.7.0
bleach==1.4.2
certifi==2016.8.31
-e git+https://github.com/ckan/ckan.git@919201e886fa95c8f26b9d970cb9ef7bbb916ca9#egg=ckan
decorator==4.0.4
fanstatic==0.12
FormEncode==1.3.0
Genshi==0.6
html5lib==0.9999999
Jinja2==2.6
Mako==1.0.2
Markdown==2.4
MarkupSafe==0.23
nose==1.3.7
ofs==0.4.1
ordereddict==1.1
Pairtree===0.7.1-T
passlib==1.6.2
Paste==1.7.5.1
PasteDeploy==1.5.2
PasteScript==2.0.2
pbr==0.11.0
psycopg2==2.4.5
Pygments==2.0.2
Pylons==0.9.7
python-dateutil==1.5
pytz===2012j
pyutilib.component.core==4.5.3
repoze.lru==0.6
repoze.who==2.0
repoze.who-friendlyform==1.0.8
requests==2.7.0
Routes==1.13
simplejson==3.3.1
six==1.10.0
solrpy==0.9.5
SQLAlchemy==0.9.6
sqlalchemy-migrate==0.9.1
sqlparse==0.1.11
Tempita==0.5.2
tzlocal==1.2
unicodecsv==0.14.1
vdm==0.13
WebError==0.11
WebHelpers==1.3
WebOb==1.0.8
WebTest==1.4.3
zope.interface==4.1.1

Please describe the expected behaviour

No idle in transaction connection for more than a few seconds.

Please describe the actual behaviour

Any connection in the state idle in transaction for more than a few seconds usually indicates a problem with the application:

postgres=# select backend_start, xact_start, query_start, state_change, waiting, state, query from pg_stat_activity where datname='ckan_bbpf' and state = 'idle in transaction';
-[ RECORD 1 ]-+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
backend_start | 2016-09-21 16:06:59.026356+02
xact_start    | 2016-09-21 16:06:59.826694+02
query_start   | 2016-09-21 16:06:59.872203+02
state_change  | 2016-09-21 16:06:59.872341+02
waiting       | f
state         | idle in transaction
query         | SELECT system_info.id AS system_info_id, system_info.key AS system_info_key, system_info.value AS system_info_value, system_info.state AS system_info_state, system_info.revision_id AS system_info_revision_id 
              | FROM system_info 
              | WHERE system_info.key = 'ckan.main_css' 
              |  LIMIT 1
@wardi

This comment has been minimized.

Copy link
Contributor

commented Sep 28, 2016

Is this as simple as adding meta.Session.commit() after
https://github.com/ckan/ckan/blob/master/ckan/model/system_info.py#L56 ? If so it looks like there's a few more places to fix in that module.

@silenius

This comment has been minimized.

Copy link
Author

commented Sep 28, 2016

I haven't had time to dig yet, but I think so. If I remember correctly (but I may be wrong) in Pylons the SQLAlchemy session is "removed" only in the context of an HTTP request, so I think that an explicit .commit() or (.rollback()) should be made if the session is used outside of an HTTP request, otherwise it is idling forever (...and prevent VACUUM from removing tuples that were deleted by transactions started after that open transaction)

@silenius silenius changed the title "idle in transaction" transaction "idle in transaction" connection Sep 28, 2016

@florianm

This comment has been minimized.

Copy link
Contributor

commented Oct 4, 2016

I'm having the same problems, my CKAN loads dataset pages extremely slowly.

My setup is a datacats & CKAN 2.6.0a (docs).
{
"ckan_version": "2.6.0a",
"site_url": "xxx",
"site_description": "",
"site_title": "xxx",
"error_emails_to": "xxx",
"locale_default": "en_AU",
"extensions": [
"cesium_viewer",
"resource_proxy",
"datastore",
"datapusher",
"datawagovau_theme",
"stats",
"archiver",
"qa",
"featuredviews",
"showcase",
"disqus",
"scheming_datasets",
"recline_grid_view",
"recline_graph_view",
"recline_map_view",
"text_view",
"image_view",
"recline_view",
"pdf_view",
"webpage_view",
"dcat",
"dcat_rdf_harvester",
"dcat_json_harvester",
"dcat_json_interface",
"geo_view",
"geojson_view",
"spatial_metadata",
"spatial_query",
"geopusher",
"harvest",
"ckan_harvester",
"csw_harvester",
"hierarchy_display",
"hierarchy_form"
]
}

Some observations:

  • I'm using a few of my own forks of extensions, will test with bare bones CKAN
  • I'm using a datacats setup, running CKAN, datapusher, postgres, SOLR and Redis in separate docker containers
  • I've applied the database optimisations of the Performance tips, no measurable improvement in page load speed, although index size went from ~2GB to ~1GB
  • The test instance with a handful of datasets runs quicker, so the slowness seems to correlate with number of datasets
  • One dataset page loads in about 20 sec, all spent on TTFB
  • Home page loads normally without showing datasets, toggling the layout to display featured org / group makes the home page load in about 20 sec
  • Running on an AWS t2.large did max out the CPUs with postgres AUTOVACCUM and IDLE, changing to a more CPU-optimised c4.large brought the CPU load down
  • According to logs (loglevel DEBUG), this query may be the culprit, or the postgres IDLE running immediately before this query:
2016-10-04 04:33:28,689 DEBUG [ckan.lib.search.query] 
Package query: {'fq': [' +site_id:"default" +state:active'], 'facet.mincount': 1, 'rows': 2, 'facet.field': ['groups', 'owner_org'], 'facet': 'true', 'q': '+capacity:public', 'facet.limit': -1, 'wt': 'json', 'fl': 'groups'}
watch -n 0.25 "psql -h db -d ckan -U postgres -c ' SELECT datname,pid,query FROM pg_stat_activity;'\
 | grep -v IDLE | grep -v 'SELECT datname,pid,query FROM pg_stat_activity;' \
| grep -v ^\(  | grep -v 'datname  | pid ' | grep -v -- '-----------+---------+-'"

shows db queries involving revisions taking a long time:

FROM (SELECT revision.id AS revision_id, revision.timestamp AS revision_timestamp, revision.author AS revision_author, revision.message AS revision_message, revision.state AS revision_state, revision.approved_timestamp AS revision_approved_timestamp FROM revision                                                                                                                                                                                                                                             WHERE revision.author = 'michael_rule' ORDER BY revision.timestamp DESC) AS anon_1

Is this likely the same problem you're having, @silenius ?

@TkTech

This comment has been minimized.

Copy link
Member

commented Oct 4, 2016

@florianm your issue (terrible revision query performance) appears to be unrelated to @silenius's issue (hanging transactions). I would suggest you make a new issue, as that's definitely something we should look at.

@florianm

This comment has been minimized.

Copy link
Contributor

commented Oct 4, 2016

Thanks @TkTech, will do!

@silenius

This comment has been minimized.

Copy link
Author

commented Oct 4, 2016

@florianm: we don't have any performance issues .. but our dataset is quite small. The idle in transaction connection is something I noticed when we switched to pgbouncer here.

If the performance issue is on the PostgreSQL side a first step would be to log slow queries and paste an explain analyze of the problematic query (on https://explain.depesz.com/ for example).

@florianm

This comment has been minimized.

Copy link
Contributor

commented Oct 5, 2016

@silenius thanks, will do over at #3264 !

@smotornyuk

This comment has been minimized.

Copy link
Member

commented Nov 21, 2016

Added pull request #3324

I checked more closely, when this connection appears. And instead of adding Session.commit in few places, i've just added it in the end of load environment function, so possibility of regression is reduced for feature changes

amercader added a commit that referenced this issue Nov 29, 2016

@amercader amercader closed this Dec 6, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.