New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAL object leaks memory #427

Closed
toorsukhmeet opened this Issue Nov 3, 2016 · 12 comments

Comments

Projects
None yet
3 participants
@toorsukhmeet

toorsukhmeet commented Nov 3, 2016

I have an application that runs on CherryPy (v8.1.2) on Ubuntu (14.04.5 LTS) and python 2.7.6. I use standalone pyDAL (v16.09) to connect to my MySQL DB on AWS RDS. I'm seeing a constant memory leak on my servers which causes the process to use up 100% of RAM in a few hours and eventual die a painful death (no SWAP enabled; enabling SWAP pushes back certain death a little).

I have managed to reproduce the leak with a very small loop that repeatedly opens and closes the DAL object. I confirmed that memory continues to grow using htop and will eventually end up using 100% of RAM. I thought it was an MySQL adapter specific problem but I was able to repro it with the default sqlite adapter as well.

from pydal import DAL

def open_close_dal():
   sql = DAL()
   sql.close()

if __name__ == '__main__':
   while True:
      open_close_dal()

I read some really old threads on this issue that suggested calling sql._adapter.close() or sql._adapter.close_all_instances() or del sql but it doesn't seem to make a difference. I've tried tracking down the memory leak and managed to confirm that it is the guts of the DAL that are leaking. One iteration of the loop, with pympler.tracker enabled reports these objects created:

                                       types |   # objects |   total size
============================================ | =========== | ============
                                        dict |         672 |    423.38 KB
                                        list |        3525 |    354.35 KB
                                         str |        2851 |    162.49 KB
             <class 'collections.OrderedDict |         129 |    135.02 KB
                                        code |         585 |     73.12 KB
          <class 'pydal.dialects.MetaDialect |          32 |     28.25 KB
            <class 'pydal.parsers.MetaParser |          17 |     15.01 KB
                                        type |          16 |     14.12 KB
  <class 'pydal.representers.MetaRepresenter |          16 |     14.12 KB
                                       tuple |         168 |     12.85 KB
          <class 'pydal.dialects.sqltype_for |         178 |     11.12 KB
                                     weakref |          89 |      7.65 KB
                                         int |         258 |      6.05 KB
         <class 'pydal.representers.for_type |          49 |      3.06 KB
                             function (wrap) |          25 |      2.93 KB
@gi0baro

This comment has been minimized.

Member

gi0baro commented Nov 4, 2016

@toorsukhmeet Will inspect this during the weekend

@toorsukhmeet

This comment has been minimized.

toorsukhmeet commented Nov 9, 2016

More data requested by @mdipierro:

I just ran a few tests and I can confirm that the leak does repro on Mac OS X El Capitan (10.11.6 (15G1004)) with the latest version of pyDAL (16.9) and the HEAD and latest tags for web2py repo (https://github.com/web2py/web2py/releases). The leak did not repro from the last pypi version of web2py (2.1.1). I ran the program I listed in PyCharm on Mac OSX and watched the memory usage using htop installed on the Mac using Homebrew.

Here are the detailed results:

    1. pyDAL 16.9
from pydal import DAL

def open_close_dal():
   sql = DAL()
   sql.close()

if __name__ == '__main__':
   while True:
      open_close_dal()

Memory usage at ~4 mins: 193M (https://www.dropbox.com/s/vj5v6hqya4mot2v/Screenshot%202016-11-08%2011.43.28.png?dl=0)
Memory usage at ~11 mins: 432M (https://www.dropbox.com/s/hac3monmy134b8f/Screenshot%202016-11-08%2011.50.51.png?dl=0)

    1. web2py 2.1.1 (pip install web2py==2.1.1)
from gluon import DAL

def open_close_dal():
   sql = DAL()
   sql.close()

if __name__ == '__main__':
   while True:
      open_close_dal()

Memory usage at ~5 mins: 16.7M (https://www.dropbox.com/s/yjy8vv7fy7i79b0/Screenshot%202016-11-08%2012.14.52.png?dl=0)
Memory usage at ~11 mins: 16.7M (https://www.dropbox.com/s/49laoi3tk74qhvp/Screenshot%202016-11-08%2012.21.32.png?dl=0)

from gluon import DAL

def open_close_dal():
   sql = DAL()
   sql.close()

if __name__ == '__main__':
   while True:
      open_close_dal()

Memory usage at ~4 mins: 187M (https://www.dropbox.com/s/5hx01xve1mpemg5/Screenshot%202016-11-08%2012.32.54.png?dl=0)
Memory usage at ~11 mins: 398M (https://www.dropbox.com/s/wcnpz15cor4g698/Screenshot%202016-11-08%2012.40.32.png?dl=0)

@gi0baro

This comment has been minimized.

Member

gi0baro commented Nov 9, 2016

@toorsukhmeet Can you please share:

  • the python version
  • the result of this:
from gluon import DAL

sql = DAL()

def open_close_dal():
   sql._adapter.reconnect()
   sql.close()

if __name__ == '__main__':
   while True:
      open_close_dal()

Seems somehow related to pools, but you're not using them. It feels a bit tricky.

@toorsukhmeet

This comment has been minimized.

toorsukhmeet commented Nov 9, 2016

  • Python version: Python 2.7.10 (default, Oct 23 2015, 19:19:21)
  • The above does not leak

The question I have is if the sql instances can be shared between threads this way (after reconnecting). I have a workaround on my server now which creates a thread local instance of DAL and keeps it open forever which mitigates the unbounded leak problem but still has the issue of having a large number of connections to the database open all the time. If this is a valid way to disconnect the instance from one thread and reusing on another thread then at least I can maintain a limited pool of connections. But ideally close() would do the correct thing.

@gi0baro

This comment has been minimized.

Member

gi0baro commented Nov 9, 2016

@toorsukhmeet yes, pydal is currently thread and forking safe, so it can be a workaround. I will keep inspecting this, probably will require some days.

@gi0baro

This comment has been minimized.

Member

gi0baro commented Nov 9, 2016

@toorsukhmeet ok found the leak.

@mdipierro this is because every DAL instance create data into THREAD_LOCAL variable and never remove the contents.

I've tested this:

def run_leak(n=100):
    for _ in xrange(0, n):
        db = DAL('sqlite:memory')
        db.close()

with some profiling on the memory usage:

(Pdb) run_leak()
(Pdb) objgraph.show_growth()
dict            1722       +90
list            1775       +90
OrderedDict      312       +90
(Pdb) run_leak()
(Pdb) objgraph.show_growth()
dict            1807       +85
list            1860       +85
OrderedDict      397       +85
(Pdb) run_leak()
(Pdb) objgraph.show_growth()
dict            1892       +85
list            1945       +85
OrderedDict      482       +85

and is quite obvious the THREAD_LOCAL is increasing due to connections:

>>> from pydal.connection import THREAD_LOCAL
>>> dir(THREAD_LOCAL)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_pydal_connection_4390457232_66113', '_pydal_cursors_4390457232_66113', '_pydal_db_instances_', '_pydal_db_instances_zombie_']
>>> run_leak(100000)
>>> len(dir(THREAD_LOCAL))
68690
>>> dir(THREAD_LOCAL)[:30]
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_pydal_connection_4381352592_66113', '_pydal_connection_4381352720_66113', '_pydal_connection_4381352848_66113', '_pydal_connection_4381352912_66113', '_pydal_connection_4388182992_66113', '_pydal_connection_4388822160_66113', '_pydal_connection_4388822288_66113', '_pydal_connection_4388872336_66113', '_pydal_connection_4390457168_66113', '_pydal_connection_4390457232_66113', '_pydal_connection_4390495376_66113', '_pydal_connection_4391236048_66113', '_pydal_connection_4391239056_66113', '_pydal_connection_4391239120_66113', '_pydal_connection_4391239184_66113']

So this is happening on web2py because it creates a DAL instance per-request. On weppy the DAL instance is the same across the processes and is initialized when the server start, this is why I missed the leak.

The proposal is to edit the DAL.close method to add some cleaning of the connection data on the THREAD_LOCAL object:

def close(self):
        self._adapter.close()
        if self._db_uid in THREAD_LOCAL._pydal_db_instances_:
            db_group = THREAD_LOCAL._pydal_db_instances_[self._db_uid]
            db_group.remove(self)
            if not db_group:
                del THREAD_LOCAL._pydal_db_instances_[self._db_uid]
        # HERE: some magic to clean connections and cursors too

@mdipierro what do you think?

@mdipierro

This comment has been minimized.

Contributor

mdipierro commented Nov 9, 2016

Good catch! Looks good to me!

@toorsukhmeet

This comment has been minimized.

toorsukhmeet commented Nov 10, 2016

Great! @gi0baro when do you think you will be able to patch this? Thanks in advance.

@toorsukhmeet

This comment has been minimized.

toorsukhmeet commented Nov 10, 2016

@gi0baro I can confirm this fixes the leak. Would you be able to tag a new pydal release?

@gi0baro

This comment has been minimized.

Member

gi0baro commented Nov 10, 2016

@toorsukhmeet can you try to clone the latest pydal master (commit 199664b) on your web2py setup and check out if the leak is gone? If everything looks good I will push a new release.

@gi0baro

This comment has been minimized.

Member

gi0baro commented Nov 10, 2016

@toorsukhmeet I'm sorry I just missed your message. Will push a new version in a few minutes.

@gi0baro gi0baro closed this Nov 10, 2016

@toorsukhmeet

This comment has been minimized.

toorsukhmeet commented Nov 10, 2016

Thanks @gi0baro

valq7711 added a commit to valq7711/pydal that referenced this issue Aug 26, 2017

sync (#1)
* added codecov settings file

this should silence unnecessary messages from codecov. we can then
adjust settings going forward

* checkpoint

* possibly fixed the issue with id in import_from_csv

* fixed import_from_csv_file

* removed unwanted print

* another import fix

* more import debugging

* Enabled entity quoting as default also on DAL class (web2py#365)

* Enhancement/teradata lastrowid (web2py#363)

* Initial change to remove ESCAPE from teradata syntax

* Revert "Initial change to remove ESCAPE from teradata syntax"

This reverts commit 81489e9.

* Fix lastrowid error for Teradata adapter

* further update to teradata adapter

* Allow custom Rows classes (web2py#367)

* Added Rows.append and Rows.insert (web2py#366)

* [WIP] Indexes support (web2py#361)

* First implementation of web2py#152

* Returning True on index creation/drop

* Added where option on postgre create_index

* Added tests for indexes

* Added support for additional arguments to Table.create_index

* Better expansion in sql instructions for indexes, better errors for indexes

* Re-implemented _lastsql on DAL class (web2py#368)

* Fixed web2py#319 and web2py#333

* turn off codecov/changes report

hopefully :)

* Some fixes on StringIO usages with py3

* Better blobs handling with MongoDB and python3

* Added missing representation of 'password' fields on NoSQL adapters

* Updated mongoDB tests on py3 and blobs

* Updated changelog

* Releasing 16.06

* Added postgres3 adapters, put back multiple values type boolean parsing in base parser

* Removed postgres2 adapter from tests matrix since implicitly tested on postgres3

* Updated dev version

* Making connection and cursors fork-safe in multiprocess environments

* Fixing missing inheritance of additional fields expressions in dialects

* Releasing 16.06.09

* Updated dev version

* Added forced parsing of date and datetime fields in SQLite parser

* Added aggregation type detection test for fields

* Fix web2py#376

* Fixed web2py#377

* default quote_template for CommonDialect

* Enhance d13e82f

* Fix web2py#380

* Releasing 16.06.20

* Updated dev version

* fixes web2py#382

* Improving parsing performance (web2py#385)

* Fix web2py#383

* Releasing 16.06.28

* Updated dev version

* added rows.join('name',lambda ids: db.referenced.field.belongs(ids))

* remove test that depends on details of json serialization

* fixed removal order of new table in test

* yet simpler and smarter row.join

* fixed rows.join(...)

* one more improvement in join

* allow many2many in row.join

* ignore pool_size for sqlite

* better security

* enabled python3.5 on travis (web2py#393)

* fix py35 Field.__hash__(), close web2py#392 (web2py#394)

* added py35 to tox.ini

* Avoid to run travis tests on py35 and google

* fixes web2py#396

* Skip travis tests on python 3.3

* Releasing 16.07

* Updated dev version

* Fix web2py#399 - postgres adapter lastrowid() to use currval(seq) instead of lastval()

* reinstated self._last_insert, no need to remove working logic

* defied long in _compat

* merged pydal

* portalocker: removed pywin32 dep, added tests (web2py#401)

* Avoid upload errors on py3 if upload content is a bytes string

* Compat module cleaning

* Releasing 16.08

* Updated dev version

* MongoDB: testcase and fix for query(field==list:reference)

* Enhancement/teradata test (web2py#413)

* Initial change to remove ESCAPE from teradata syntax

* Revert "Initial change to remove ESCAPE from teradata syntax"

This reverts commit 81489e9.

* Initial teradata tests changes.  More to come

* first sql.py change, TestAddMethod now OK

* include teradata in setup

* Postgres fixes (web2py#414)

* Fixed postgres2 and postgres3 adapters inheritance, updated `PostgreMeta` to choose the right class depending on uri

* Fixed representation for lists on postgres2 and postgres3 adapters

* Fixed typo in PostgreDialectArrays

* Ensure strings in PostgreArrayRepresenter._listify_elements

* New test helper class DALtest that handles connection boilerplate (web2py#415)

* Enhancement/teradata test (web2py#418)

* Initial change to remove ESCAPE from teradata syntax

* Revert "Initial change to remove ESCAPE from teradata syntax"

This reverts commit 81489e9.

* TestBelongs for Teradata

* More Teradata test changes

* run the test.

* fix for opening files in py3 (web2py#420)

* fix for opening files in py3

fixes also web2py/web2py#1451

* fix for PY2 detection

* Releasing 16.09

* Updated dev version

* Enhancement/teradatatest (web2py#426)

* Initial change to remove ESCAPE from teradata syntax

* Revert "Initial change to remove ESCAPE from teradata syntax"

This reverts commit 81489e9.

* Skipping TestChainedJoinUnique for Teradata.

* Clean connection data on thread local on DAL.close (web2py#427)

* Releasing 16.11

* Updated dev version

* Joinable subselects (web2py#408)

* Minimal implementation of joinable subselects (work in progress)
* Generate subselects in Set.nested_select() instead of BaseAdapter._select()
* Add support for subselects to BaseAdapter._count()
* Prevent table name collisions when applying record versioning filters
* Fix common filters in select with subselects
* Add support for subselect fields to Rows.render() and fix bugs
* Refactor Google datastore adapter and BaseAdapter.get_table()
* Check for table name collisions when building a SELECT query
* When building subselect string, ignore cache if outer_scoped is not empty
* Minor optimization in SQLAdapter._select_wcols()
* Add parameter "correlated" to Select constructor defaults to True. When set to False, the subquery will be self-contained, meaning it will never reference any tables from the parent scope.
* Implement proper expansion of correlated subqueries
* Check that subquery in belongs() expression has exactly 1 column
* Implement expansion of correlated subqueries in count(), update(), delete()
* Update Mongo unit tests to new adapter API
* Move part of Set.nested_select() to adapter code
* Unit tests for the Select class
* Additional unit tests for fixed bugs
* Expand aliased table names using dialect methods
* Fix expansion of correlated subqueries used in UPDATE/DELETE statements

* Refactor table name attributes and fix related aliasing/rname bugs

* Update IMAP adapter methods to new API

* Add table alias support to UPDATE/DELETE in MSSQL dialect

* Store values for callbacks on insert/update/delete operations (web2py#431)

* Changed listify to store computations in values

* Moved attempt_upload logic to callbacks

* Fixed callbacks access to fields

* One more check that Table.with_alias() updates all field attributes

* Rename Table.sqlsafe and Table.sqlsafe_alias

* Refactor field name attributes and fix related rname bugs

* Fix type error in MSSQL TFK reference template

* Use the correct constraint name in MSSQL TFK references

* Added missing 'OpRow.get' method

* Fixed after_update callbacks argument in validate_and_update

* tablemap should contains all tables to avoid key errors in adapter _select_wcols

* Fix field migrations and some final refactoring

* Fixed wrong add operation with float types web2py#445

* Updated changelog

* Releasing 17.01

* Fix DAL(None).define_table()

* Fix perpetual migrations due to changes in name quoting

* Fix storing Rows objects in pickle-based cache

* Pass the list of field references when creating a subset of Rows object

* Disable part of pickling test on MSSQL, driver problem...

* Completely disable the pickling test on MSSQL (driver problem...)

* Updated dev version

* Correctly configure adapters that need connection for configuration

* Added locking for reconnect method switch, created a mixin for adapters that need reconnection mocking

* added tests for py36

* Always find the driver on adapters

* Cleaner solution for web2py#446

* Updated changelog

* Releasing 17.03

* added missing "oracle" in adapters.register_for() (web2py#457)

Fixes web2py/web2py#1599

* field.set_attributes(..) could return the instance (web2py#458)

Would let the code be  more straightforward -- example, when passing the "decorated" field to the function...
Now I have to   "scatter" the code in two places for each field...

* testing py36 on appveyor (web2py#456)

* Fix get in pydal.helpers.classes Reference class calling __getattr__ … (web2py#465)

* Fix get in pydal.helpers.classes Reference class calling __getattr__ with 3 variables instead of 2 as per signature

* Move fix of wrong method call in Reference class from get to __getattr__

* Make DAL support jsonb for postgres (web2py#469)

* Make DAL support jsonb for postgres

* added parser for jsonb

* experimantal code to address issue #1613

* fixed error introduced in recent refactoring by me

* mysql and postgres unix_socket support

* fixed _find_work_folder

* support for gcloud sql v2. thanks Karoly Kantor

* small typo in warnings

* Releasing 17.07

* fixed issue web2py#480

* Fixes web2py/web2py#1683 (web2py#484)

* allow DAL(..., adapter_args=dict(migrator=InDBMigrator))

* fixed custom migrator

* Updated dev version

* Cleaning up f71aade

* Fix Table objects in common fields (web2py#486)

This fixes the problem of putting and tables on common_fields and having lazy_define_table throw an exception because Table doesn't have a clone attribute.

* fixed typo changing `oracle_fix` to `cmd_fix` in adapters/oracle.py (web2py#471)

fixed error where %-formats were only applying to second part of concatenated SQL statement in dialects/oracle.py

* override _select_aux_execute method in adapters/oracle.py to use oracle-specific `fetchall` (web2py#473)

* fixed MSSQL TOP

* possible fix for tTIZLy33Yw8

* fixed del of missing key, thanks Lazaro Aulan

* Fixed st_asgeojson (web2py#489)

see https://groups.google.com/forum/#!topic/web2py/EixMPrPr9SY
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment