Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PsqlDosMigrator: refactor the connection handling #5783

Merged
merged 1 commit into from
Nov 22, 2022

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Nov 19, 2022

The PsqlDosMigrator requires a connection to the database for various operations. To this end, it created an Engine from the connection parameters provided by the Profile, through which a connection was opened in the _connection_context. However, this connection was never explicitly closed. Moreover, although the method allowed passing an optional connection that was already open, this was not consistently used in the entire class. This could lead to too many connections being opened to the client resulting in an exception when the limit was reached.

Here the class is refactored as follows:

  • The engine is created in the constructor and assigned to the private attribute _engine.
  • A connection property is added that will return an open connection. If a connection hasn't already been opened in a previous call, one is opened and assigned to the _connection attribute.
  • The close method is added which when called will close the connection if it was opened and dispose of the engine. This should be called if the caller is done using the migrator and its connections to the database should be cleaned up.
  • The _connection_context context manager is removed and all internal methods now simply call the connection property to get an open connection.

The PsqlDosBackend uses the new connection handling interface of the PsqlDosMigrator through the migrator_context context manager. Each method that requires a migrator should request it through this method. At the end of the context the close method is automatically called on the migrator ensuring its opened connections are properly relinquished.

Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @sphuber !

Besides the minor comments on the code, I am wondering:

  • is there anything we can test here? is there a way to prove that we're now using "the right number of connections"? maybe using techniques like the memory leak test; looking for all connection objects in memory?
    process_instances = get_instances(processes.Process, delay=0.2)
  • if I understand correctly, then every migrator instance now (re)uses one connection. can there be scenarios where this also leads to too many connections being opened?

self._engine = create_sqlalchemy_engine(self.profile.storage_config)
self._connection = None

def close(self) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this method is mainly intended to be used upon destruction of the migrator, you might consider making it a private method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it needs to be called from the outside. In fact, it is the PsqlDosBackend that calls it when it itself has received the close call. The PsqlDosBackend is also the only class the should be calling the PsqlDosMigrator.

self._connection = None

self._engine.dispose()
self._engine = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this currently breaks subsequent calls to the connection property.

Perhaps the cleaner way would be to create self._engine not in the constructor, but lazily upon calling self.property.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I was thinking that once you call close the PsqlDosMigrator instance should no longer be used. The same principle applies for the RmqCommunicator of kiwipy for example. But then again, I don't really see a downside to moving the creation of the engine to self.connection as well, so will move it there.

@sphuber
Copy link
Contributor Author

sphuber commented Nov 19, 2022

is there anything we can test here? is there a way to prove that we're now using "the right number of connections"? maybe using techniques like the memory leak test; looking for all connection objects in memory?

I had thought about this, but I am not sure. We could test at the end of the test suite, but at that point, redundant connections may have already been cleaned. It would still be possible that in certain modes too many connections get opened that would not be caught by the test. Still, maybe some test is better than none. I will think a bit more to see if we can measure the open connections during the test suite.

if I understand correctly, then every migrator instance now (re)uses one connection. can there be scenarios where this also leads to too many connections being opened?

I don't think so, at least not in "normal" usage. The PsqlDosMigrator is only ever called through the PsqlDosBackend, and that only constructs it when a profile gets configured for the first time, or when _clear is called (which for now is really only happening during tests). So during normal production usage, I don't think all of this is all that impactful. It was mostly causing problems during testing where we are resetting the database a lot and were opening connections all over the place that weren't being cleaned up properly.

@ltalirz ltalirz self-requested a review November 19, 2022 23:00
Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the replies to my comments.

From my side, this is good enough then.
I leave it up to you whether you can come up with a test (moving the creation of the engine probably makes sense).

@sphuber sphuber force-pushed the fix/psql-dos-migrator-connections branch from 70a8998 to d956ef8 Compare November 20, 2022 15:39
@sphuber
Copy link
Contributor Author

sphuber commented Nov 20, 2022

I have absolutely no idea why, but as soon as I move the instantiation of the engine to the connection property, the tests fail and pylint triggers the following:

aiida/storage/psql_dos/migrator.py:132:16: E1101: Instance of 'MockConnection' has no 'rollback' member (no-member)
aiida/storage/psql_dos/migrator.py:284:12: E1101: Instance of 'MockConnection' has no 'commit' member (no-member)
aiida/storage/psql_dos/migrator.py:316:12: E1101: Instance of 'MockConnection' has no 'commit' member (no-member)
aiida/storage/psql_dos/migrator.py:367:16: E1101: Instance of 'MockConnection' has no 'commit' member (no-member)

I have no clue whatsoever why this small change causes pylint to think that the Connection returned by engine.connect is of type MockConnection. The origin of the failure of the tests is also not quite clear (https://github.com/aiidateam/aiida-core/actions/runs/3508656756/jobs/5877214874) as it simply fails in the profile setup when it tries to initialise the storage through the migrator. The error message is:

MetaData object is not bound to an Engine or Connection.  Execution can not proceed without a database to execute against.

@sphuber sphuber force-pushed the fix/psql-dos-migrator-connections branch 2 times, most recently from 743846b to ff48704 Compare November 22, 2022 13:14
The `PsqlDosMigrator` requires a connection to the database for various
operations. To this end, it created an `Engine` from the connection
parameters provided by the `Profile`, through which a connection was
opened in the `_connection_context`. However, this connection was never
explicitly closed. Moreover, although the method allowed passing an
optional connection that was already open, this was not consistently
used in the entire class. This could lead to too many connections being
opened to the client resulting in an exception when the limit was
reached.

Here the class is refactored as follows:

 * The engine is created in the constructor and assigned to the private
   attribute `_engine`.
 * A `connection` property is added that will return an open connection.
   If the engine hasn't already been constructed or no connection is
   already open, the engine is constructed and a connection is opened
   which is then assigned to the `_connection` attribute.
 * The `close` method is added which when called will close the
   connection if it was opened and dispose of the engine. This should be
   called if the caller is done using the migrator and its connections
   to the database should be cleaned up.
 * The `_connection_context` context manager is removed and all internal
   methods now simply call the `connection` property to get an open
   connection.

The `PsqlDosBackend` uses the new connection handling interface of the
`PsqlDosMigrator` through the `migrator_context` context manager. Each
method that requires a migrator should request it through this method.
At the end of the context the `close` method is automatically called on
the migrator ensuring its opened connections are properly relinquished.
@sphuber sphuber force-pushed the fix/psql-dos-migrator-connections branch from 2b681de to 6bf08b1 Compare November 22, 2022 13:20
@sphuber sphuber merged commit 0c85786 into aiidateam:main Nov 22, 2022
@sphuber sphuber deleted the fix/psql-dos-migrator-connections branch November 22, 2022 13:43
@sphuber sphuber restored the fix/psql-dos-migrator-connections branch November 22, 2022 13:44
@sphuber sphuber deleted the fix/psql-dos-migrator-connections branch November 23, 2022 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants