Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inherit and Extend apsw.cursor class #381

Closed
kmedcalf opened this issue Nov 28, 2022 · 9 comments
Closed

Inherit and Extend apsw.cursor class #381

kmedcalf opened this issue Nov 28, 2022 · 9 comments

Comments

@kmedcalf
Copy link

kmedcalf commented Nov 28, 2022

I have modifiied my cursor wrapper so that it inherits from apsw.cursor and adds some extension slots (I also do the same thing with the apsw.connection class, as you know). Recently I noticed that when "closing" the connection (which calls super().close) that something was happening that was taking a huge amount of time.

Investigation found that although the cursor object(s) (in this case several hundred thousand or millions of them) all appeared to be closed and released properly, the apsw.Connection.close method was doing some sort of "loop though and close/free dependents".

I added the following to my apsw.Cursor extension which solved the problem (I previously had no close or __del__ methods (the close method is not called in the ordinary course, and having the __del__ call super().close (whether with or without force) seems to fix the issue.

+    def close(self, *args, **kwds):
+        self.__convert = None
+        self.__makerow = None
+        super().close(*args, **kwds)
+
+    # Needed to add this method to force the dependents to clean themselves up
+    def __del__(self):
+        self.__convert = None
+        self.__makerow = None
+        super().close(True)
+

FYI, the millions of cursors were all "INSERT" statements (the same statement with different bindings) that should have run to completion on the first step and were then discarded. Is this because I am extending using __slots__ rather than via a dict?

@rogerbinns
Copy link
Owner

Almost certainly you have circular references which then prevents garbage collection. The connection keeps a list of dependents, but they are all kept as weak references. The dependents (cursors, blobs etc) keep strong references to the connection as expected.

You should be able to use gc.get_referrers passing in one of your cursors before connection.close() is called to see who is keeping a reference alive.

Attached is some test code showing this and being unable to reproduce the issue.

issue381.txt

@kmedcalf
Copy link
Author

kmedcalf commented Nov 29, 2022

Ok. So if I remove my __del__ method and before calling .close() on the connection, gc.referents(connection) returns "dead weak references" and a live reference or two to the last used cursor and row object constructor. Since the references are dead, all I see are the addresses of the dead reference object. The close method is never called on the cursor instances unless I force it to be called from __del__. All that I need to do is have __del__ call super().close() -- it does not appear to need force.

@kmedcalf
Copy link
Author

kmedcalf commented Nov 29, 2022

Ok, I found the problem, but I don't know exactly how to fix it (well, I have one way, but it is messy).

My inherited Cursor mayhaps uses super().setexectrace(self._exectrace_) and super().setrowtrace (self._rowtrace_) to generate row objects and do various other things if necessary. When super().execute runs it does not return any rows nor throw an exception, so in the case where no rows are generated at all, I do not release the self references. This creates a loop and the garbage collector cannot collect.

@kmedcalf
Copy link
Author

How do I know if an apsw.Cursor has run to completion when the none of the iterators are used? If any of the iterator methods are used it is easy -- your get a StopIteration exception in the __next__ method(s).

eg:
cur = super().execute('statement that does not return any rows')

@rogerbinns
Copy link
Owner

rogerbinns commented Nov 29, 2022

If cursor.execute has been called then the cursor will be in one of two states - it will either have run to completion, or will be providing a result row. That state is not exposed directly. You can deduce completion in two awkward ways.

  • next(cursor) raises StopIteration even for queries that didn't return data like only a comment
  • cursor.description raises apsw.ExecutionCompleteError if execution is complete, otherwise returns something

I could add a bool property like cursor.is_execution_complete to provide a clear answer.

@kmedcalf
Copy link
Author

kmedcalf commented Nov 29, 2022 via email

@kmedcalf
Copy link
Author

kmedcalf commented Nov 29, 2022 via email

@rogerbinns
Copy link
Owner

Can this issue be closed?

On review the handling of very large numbers of outstanding cursors isn't ideal (in efficiency terms) but there is nothing that is obviously better.

@kmedcalf
Copy link
Author

Yes, this issue can be closed. Although the handling of the cursors is problematic, avoiding having the cursor instance having a reference to one of its one instance methods was the source of the problem. Moving the specific procedures out of the cursor instance cured the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants