Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disconnect after schema version check. #57

Closed
wants to merge 5 commits into from
Closed

Disconnect after schema version check. #57

wants to merge 5 commits into from

Conversation

jeneric
Copy link

@jeneric jeneric commented Aug 25, 2014

Explicitly disconnect after perfoming the schema version check. Without doing so, Schema::Versioned is leaving an idle connection open to the database in daemon processes. This results in twice the expected number of active connections opened to the database.

Please let me know if you need addtional information or would prefer a different approach in order to accept the pull request.

@ribasushi
Copy link
Collaborator

First of all many thanks for the time looking into this! I would however like for you to look deeper into the issue. The current behavior seems wrong in itself, therefore the result is you adding a bandaid fixing something that should not have been there in the first place.

The problem on the table is: we are connecting twice.
The correct question isn't "why aren't we disconnecting", but rather "why are we connecting twice?!"

The solution would lie in examining the ::Versioned code and determining whether any of the ->connect calls have anything special done to their connect attributes. If this is not the case (i.e. we always connect to the same place anyway) - then clearly you are looking into a shared $dbh as can be seen in the second bullet point here: https://metacpan.org/pod/DBIx::Class::Storage::DBI#connect_info. I will not be available for most of this week, please ask @frioux for help if you get stuck and/or have trouble navigating my explanation.

@dbsrgits-sync
Copy link

Hi Eric,
did you test with the latest dev release of DBIC?

On Tue, Aug 26, 2014 at 6:37 AM, Peter Rabbitson notifications@github.com
wrote:

First of all many thanks for the time looking into this! I would however
like for you to look deeper into the issue. The current behavior seems
wrong in itself, therefore the result is you adding a bandaid fixing
something that should not have been there in the first place.

The problem on the table is: we are connecting twice.
The correct question isn't "why aren't we disconnecting", but rather "why
are we connecting twice?!"

The solution would lie in examining the ::Versioned code and determining
whether any of the ->connect calls have anything special done to their
connect attributes. If this is not the case (i.e. we always connect to
the same place anyway) - then clearly you are looking into a shared
$dbh as can be seen in the second bullet point here:
https://metacpan.org/pod/DBIx::Class::Storage::DBI#connect_info. I will
not be available for most of this week, please ask @frioux
https://github.com/frioux for help if you get stuck and/or have trouble
navigating my explanation.

@peter: this sounds to me like a consequence of the connection race
condition we debugged and fixed some month ago where the wrong connection
got the session parameters set but the other one got used.


Reply to this email directly or view it on GitHub
#57 (comment).


DBIx-Class-Devel mailing list
DBIx-Class-Devel@lists.scsys.co.uk
http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class-devel

@ribasushi
Copy link
Collaborator

On 08/26/2014 02:10 PM, dbsrgits sync wrote:

First of all many thanks for the time looking into this! I would
however
like for you to look deeper into the issue. The current behavior seems
wrong in itself, therefore the result is you adding a bandaid fixing
something that should not have been there in the first place.

The problem on the table is: we are connecting twice.
The correct question isn't "why aren't we disconnecting", but rather
"why
are we connecting twice?!"

The solution would lie in examining the ::Versioned code and
determining
whether any of the ->connect calls have anything special done to their
connect attributes. If this is not the case (i.e. we always
connect to
the same place anyway) - then clearly you are looking into a shared
$dbh as can be seen in the second bullet point here:
https://metacpan.org/pod/DBIx::Class::Storage::DBI#connect_info. I will
not be available for most of this week, please ask @frioux
https://github.com/frioux for help if you get stuck and/or have
trouble
navigating my explanation.

@peter: this sounds to me like a consequence of the connection race
condition we debugged and fixed some month ago where the wrong connection
got the session parameters set but the other one got used.

Could you remind me which issue this was? .../Schema/Versioned.pm has
not been touched since 2011, hence I am confused which issue/fix are you
referring to.

@dbsrgits-sync
Copy link

On Tue, Aug 26, 2014 at 2:45 PM, Peter Rabbitson notifications@github.com
wrote:

On 08/26/2014 02:10 PM, dbsrgits sync wrote:

First of all many thanks for the time looking into this! I would
however
like for you to look deeper into the issue. The current behavior seems
wrong in itself, therefore the result is you adding a bandaid fixing
something that should not have been there in the first place.

The problem on the table is: we are connecting twice.
The correct question isn't "why aren't we disconnecting", but rather
"why
are we connecting twice?!"

The solution would lie in examining the ::Versioned code and
determining
whether any of the ->connect calls have anything special done to their
connect attributes. If this is not the case (i.e. we always
connect to
the same place anyway) - then clearly you are looking into a shared
$dbh as can be seen in the second bullet point here:

https://metacpan.org/pod/DBIx::Class::Storage::DBI#connect_info. I
will
not be available for most of this week, please ask @frioux
https://github.com/frioux for help if you get stuck and/or have
trouble
navigating my explanation.

@peter: this sounds to me like a consequence of the connection race
condition we debugged and fixed some month ago where the wrong connection
got the session parameters set but the other one got used.

Could you remind me which issue this was? .../Schema/Versioned.pm has
not been touched since 2011, hence I am confused which issue/fix are you
referring to.

1eb87dd - Fix pesky on_connect_* race condition


Reply to this email directly or view it on GitHub
#57 (comment).


DBIx-Class-Devel mailing list
DBIx-Class-Devel@lists.scsys.co.uk
http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class-devel

… schema object in _on_connect to prevent doubling database connections on versioned schemas.
@jeneric
Copy link
Author

jeneric commented Aug 30, 2014

Thanks for the feedback. I have been developing and testing against the latest developer release. I agree that the first commit was a band-aid and believe that the new commits now address the underlying issue.

Rather than creating a new schema object with its own connection to the database, register_class is used to add DBIx::Class::Version::Table to the existing schema. A schema object and second connection is only used when initially creating the dbix_class_schema_versions table. All unit tests pass and have validated that a single connection is now used for both the version check and subsequent queries.

Once again, please let me know if you need additional information, would prefer a different approach, or would like other changes before acceptance.

@jeneric
Copy link
Author

jeneric commented Sep 2, 2014

With the original commit I was simply closing a redundant connection that in most cases is only used to check the schema version in the underlying 'dbix_class_schema_versions' table but remains open throughout the life a web server or daemon process when using the Schema::Versioned component.

The redundant connections became apparent troubleshooting bug reports of users exhausting database connections and having to increase the connection limit in our Netdisco application when it should not have been necessary. It was discovered that for every daemon or web server process there was a second idle connection with the last statement issued being 'SELECT me.version FROM dbix_class_schema_versions me ORDER BY installed DESC LIMIT $1'.

Looking into Schema::Versioned a complete schema object to include another connection is created in the primary schema as 'vschema ($self->{vschema})'. This appears to have been in the code base since inception and was why I didn't originally modify it, just issuing a disconnect after the check.

However, as you indicated in the first review, we are ultimately connecting to the same database where the 'dbix_class_schema_versions' table exists. It would be preferable to only connect to the database once, run the version check, and then use the existing connection for subsequent work to conserve system resources.

The second set of commits only uses one connection, except for when the 'dbix_class_schema_versions' table is created. Within the original Schema::Versioned, deploy has been overridden to call an install method after any other deploy methods the inheritance stack. The install method is a documented method which can be called on its own to install the 'dbix_class_schema_versions' table. In the original code this is called via '$self->{vschema}->deploy;' limiting it to only installing the 'dbix_class_schema_versions' table. The sqlt_deploy_hook was added to prevent the 'dbix_class_schema_versions' table from being dropped if { add_drop_table => 1 } is passed in %sqlt_args to an inherited deploy method which is problematic because we would lose the version history and was caught in unit testing. The _deploy_version_table method was added because it is not apparent how to call deploy for a single result class, which does not exist as a ddl file. This should only be called once in the versioned database lifetime and so the redundant connection that is closed on completion should be acceptable.

All documented methods work as they did before these commits and the unit tests in '94versioning.t' and 'admin/02ddl.t' which exercise the api both pass all tests. The only potential issue might be a name collision with 'register_class('Version::Table', 'DBIx::Class::Version::Table');', but this could be changed to a fully qualified source name if deemed necessary.

I'm uncertain why the class wasn't added to the schema in the past, could it have been that supporting methods weren't available to ensure that it wasn't touched during deploy? The table exists in the underlying database.

It seems more clunky to register the class and then unregister source in multiple methods to try and achieve the same objective.

I would prefer a solution that only connects to the database once. If you have thoughts on a better approach that would only connect to the database once I'm happy to try and implement it. With this explanation, let me know which path you would like me to take.

@ribasushi
Copy link
Collaborator

@jeneric Sorry for not replying earlier - just came back home after a long slog around the world https://twitter.com/ribasushi/status/507425713520390144 :)

I will reply with a concrete code proposal in about ~5 hours from now, need to attend some $work-related things first.

@ribasushi
Copy link
Collaborator

@jeneric Right, so... I have been thinking how exactly to answer this. Make no mistake - I am not making this hard on purpose. The easiest thing to do would be to just show you the actual solution which is literally a 2 line diff (except the actual test, which is required to merge the work). However this will represent a lost teaching opportunity. After all I am not just trying to solve your problem, but hope you can gain an extra viewpoint in the process.

Instead I want to understand where your confusion lies, and what docs I could improve to make sure it doesn't happen to others in the future.

The immediate problem with your approach is that you are polluting the existing Schema. You kinda answer the question yourself in: I'm uncertain why the class wasn't added to the schema in the past, could it have been that supporting methods weren't available to ensure that it wasn't touched during deploy?. The reason it was never added is so that nothing needs to sanitize things after the fact. It is always easier to never add something, rather than removing it after the fact.

More fundamentally you seem to be going down the wrong path because you assume that $schema_instance == separate connection. This is completely untrue. Please reread the piece I linked to earlier in #57 (comment): ...clearly you are looking into a shared $dbh as can be seen in the second bullet point here: https://metacpan.org/pod/DBIx::Class::Storage::DBI#connect_info.

If after the above you are still not entirely sure what I keep suggesting and would rather just move on - let me know and I will fix the code in question.

Cheers and thank you for your patience ;)

@ribasushi
Copy link
Collaborator

Hi!

Since I am on a relatively tight schedule with the next release slipping further and further, and since this is a relatively simple and useful fix - I went ahead and implemented it in fef130ca . I hope the code better illustrates my point of "non-invasive changes", and still solves the very problem raised in this ticket.

I am closing the PR for the time being, but I urge you to please write back if something about the current code doesn't seem right to you.

@ribasushi ribasushi closed this Sep 10, 2014
@ribasushi
Copy link
Collaborator

Whoops I pushed the wrong branch by mistake - the fix (unmodified) is actually in 81023d83

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants