New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Logging for Slow Database Sessions #7494
Add Logging for Slow Database Sessions #7494
Conversation
3fb1c10
to
e433aa3
Compare
e433aa3
to
158f3c2
Compare
5d0250c
to
c1cd86a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few minor comments regarding the main part of the PR and some substantial remarks concerning the testing section.
Since we're awaiting this PR for the release of a new RC version, I've given my approval.
f36a99d
to
c323ff5
Compare
c323ff5
to
db7cb00
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just left one minor comment. The rest of the PR appears good to me.
|
||
def _format_warning(self, db_session_duration: float, info: DbSessionInfo) -> str: | ||
local_statistics = self._summarize_stat(db.local_stats for db in databases_to_track) | ||
threshold = SLOW_DB_SESSION_DURATION_THRESHOLD if self.duration_threshold is None else self.duration_threshold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider the following option:
class TriblerDbSession(core.DBSessionContextManager):
def __init__(self, *args, duration_threshold: float = SLOW_DB_SESSION_DURATION_THRESHOLD, **kwargs):
super().__init__(*args, **kwargs)
# `duration_threshold` specifies how long db_session should be to trigger the long db_session warning.
# When `duration_threshold` is None, `SLOW_DB_SESSION_DURATION_THRESHOLD` value is used instead.
self.duration_threshold = duration_threshold
this way it is more clear (and should keep the desired logic).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes it may be useful to change the value of SLOW_DB_SESSION_DURATION_THRESHOLD
dynamically (for example, in the run_tribler.py
script). The current design allows it.
The alternate approach of setting SLOW_DB_SESSION_DURATION_THRESHOLD
as a default parameter in the db_session
constructor has a significant drawback. In Python, default values are evaluated at the point of function definition in the defining scope. Therefore, when @db_session
is used as a decorator, it captures the value of SLOW_DB_SESSION_DURATION_THRESHOLD
at the time of import. This makes it impossible to modify the SLOW_DB_SESSION_DURATION_THRESHOLD
value post-import in a manner that affects the behavior of the @db_session
decorator. Consequently, the sequence in which modules are imported gains significance, which can introduce unexpected behaviors and maintenance challenges.
In contrast, the current design ensures that the db_session
retrieves the value of SLOW_DB_SESSION_DURATION_THRESHOLD
each time it is invoked. This live retrieval accommodates dynamic changes and guarantees that the session consistently uses the most up-to-date threshold, thus circumventing the abovementioned issues.
In conclusion, the present architecture not only affords greater flexibility but also prevents potential complications arising from import order dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed modification enables dynamic alteration of the duration_threshold
value. This is evidenced by the test_patched_db_session
passing after the said changes were applied to the current code:
diff --git a/src/tribler/core/utilities/pony_utils.py b/src/tribler/core/utilities/pony_utils.py
index 70f0859..6142082 100644
--- a/src/tribler/core/utilities/pony_utils.py
+++ b/src/tribler/core/utilities/pony_utils.py
@@ -15,8 +15,8 @@ from weakref import WeakSet
from pony import orm
from pony.orm import core
-from pony.orm.dbproviders import sqlite
from pony.orm.core import Database, select
+from pony.orm.dbproviders import sqlite
from pony.utils import localbase
SLOW_DB_SESSION_DURATION_THRESHOLD = 1.0
@@ -132,7 +132,7 @@ Queries statistics for the entire application:
class TriblerDbSession(core.DBSessionContextManager):
- def __init__(self, *args, duration_threshold: Optional[float] = None, **kwargs):
+ def __init__(self, *args, duration_threshold: float = SLOW_DB_SESSION_DURATION_THRESHOLD, **kwargs):
super().__init__(*args, **kwargs)
# `duration_threshold` specifies how long db_session should be to trigger the long db_session warning.
# When `duration_threshold` is None, `SLOW_DB_SESSION_DURATION_THRESHOLD` value is used instead.
@@ -193,8 +193,7 @@ class TriblerDbSession(core.DBSessionContextManager):
start_time = info.start_time
db_session_duration = time.time() - start_time
- threshold = SLOW_DB_SESSION_DURATION_THRESHOLD if self.duration_threshold is None else self.duration_threshold
- if db_session_duration > threshold:
+ if db_session_duration > self.duration_threshold:
self._log_warning(db_session_duration, info)
def _log_warning(self, db_session_duration: float, info: DbSessionInfo):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed modification enables dynamic alteration of the duration_threshold value. This is evidenced by the test_patched_db_session passing after the said changes were applied to the current code.
No, it doesn't. After the suggested change, any db_session applied as a decorator in the module scope will use the value of SLOW_DB_SESSION_DURATION_THRESHOLD
at the moment of the module import.
There are two reasons why the test is passed successfully after your changes:
- The
db_session
in the test uses an explicitly specified argumentduration_threshold=0.0
, and in the current discussion, we are talking about the case when the duration threshold is not explicitly specified for a db_session (that is, for all db_sessions used in Tribler's actual code) - Even if we remove the explicit
duration_threshold=0.0
specification for thedb_session
in the test, thedb_session
is applied to the_perform_queries
inner function at the test function execution time and not at the module import time, so its successful execution does not mean the same will work for db_sessions applied to module-level functions.
I added a new test_patched_db_session_default_duration_threshold
test that checks the db_session
defined at the module level, and the new test fails with your changes. In the test, I'm again dynamically changing the value SLOW_DB_SESSION_DURATION_THRESHOLD
(as I did originally) because this is the case I want to test here, that the dynamic change of SLOW_DB_SESSION_DURATION_THRESHOLD
is handled correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your points are valid.
However, I don't think it's necessary or beneficial to customize the default value
of the threshold. Could you provide a plausible example where this feature would be used?
The change I proposed, allows for the dynamic alteration of the actual threshold value, which you even confirmed by acknowledging that the tests passed due to the explicitly specified threshold.
In the current discussion, we are discussing the case when the duration threshold is not explicitly specified.
I perceive our current discussion differently.
We both agreed on the point: "Sometimes it may be useful to dynamically change the threshold value". Both our code versions allow this.
I don't see any advantages to allowing the default value
to be changed. However, I do see the benefits in a simplified version of the code without this feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To present a plausible use case, imagine running a pytest suite where there's a need to adjust the db_session
duration threshold across all functions under test. This is a legitimate and practical scenario. The implementation in my PR caters to this requirement perfectly, but the modification you propose would make it infeasible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point is not about functionality. It is about surprising code behavior.
Any code written by someone else can lead to unexpected behavior. This is why it should be as simple as possible and provide the most straightforward API to the user (future developer) as possible.
What I'm advocating for is not an exotic feature but a standard flexibility that could be expected by developers who are used to configuring applications dynamically.
You're developing a feature that is not needed in Tribler unless it becomes a requirement. Less code equates to less maintenance. Reduced complexity results in fewer bugs. Even a small step in this direction is worth taking.
Your analogy with sending emails about test results is not directly comparable. Email notifications are an entirely different feature, whereas the capability I am defending is an extension of configuration flexibility, which is a fundamental aspect of software customization.
It is a bit extreme. I wrote it in the hope you could get the point. So, here is the refined version:
I acknowledge your perspective, but I must emphasize that a future developer could potentially find the option of filtering SQL queries by regex beneficial.
We can imagine so many features that "could potentially be beneficial", but it doesn't mean we should implement them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any code written by someone else can lead to unexpected behavior. This is why it should be as simple as possible and provide the most straightforward API to the user (future developer) as possible.
Absolutely! And one such unexpected behavior in Python relates to default function arguments, a pitfall that's well-documented and being addressed via PEP 671.
My approach aims to use the late binding of SLOW_DB_SESSION_DURATION_THRESHOLD
, keeping the logic straightforward. However, the proposed change uses default arguments, which bind early and can lead to the aforementioned pitfall.
Let me clarify:
- The proposed change blocks an important PR that is essential for the Tribler release.
- It disrupts use cases that I believe are valid:
- Adjusting the configuration when a specific script starts
- Altering the
db_session
duration threshold for testing purposes.
- It introduces a potential complexity and error-prone behavior by relying on Python's default arguments.
- The rationale for the proposed change seems to be primarily centered on simplicity, but I contend that it is at the cost of flexibility and falling into known Python pitfalls.
Reduced complexity results in fewer bugs. Even a small step in this direction is worth taking.
I concur, and evading the default argument issue in Python is a step toward reducing complexity.
We can imagine so many features that "could potentially be beneficial", but it doesn't mean we should implement them.
I agree, but in this instance, the flexibility is a natural outcome of the implementation in the PR, not an additional feature. It is streamlined, and the scenarios are accommodated by a single line of code.
In conclusion, while I appreciate the insights and value our collaborative efforts, I firmly believe that the current implementation is in the best interest of flexibility and reliability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed change you're referring to is not as consequential as you suggest:
- As we discussed yesterday, it is not essential for the Tribler release. All the necessary features are already in the release branch.
- Any perceived blockage is only on your side as we already have one approval. This means a majority of developers are in agreement with the change.
Your argument about disrupting valid use cases is speculative. We do not have any real necessity for supporting these scenarios.
Your assertion that the proposed change introduces potential complexity and error-prone behavior contradicts the arguments I wrote above (also, I asked you to prove your earlier claim, but you were not able to prove it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any perceived blockage is only on your side as we already have one approval. This means a majority of developers are in agreement with the change.
I appreciate the clarification regarding the approval process. I was under the impression that an explicit rejection would necessitate addressing it before proceeding. Thank you for clearing that up.
As for the suggested change, I respect the opinions and the discussion we had, but I must stand by my conviction that the proposed modification could introduce design issues. With the understanding that the PR has received majority approval, I would like to proceed without incorporating this specific change.
Thank you for the engagement and valuable feedback throughout this review.
a16e0fa
to
2b59579
Compare
Sometimes we can observe a very long execution of coroutine steps that entirely freeze the asyncio loop for huge periods (like, 87.143 seconds)
The primary cause of these freezes is usually the waiting time to acquire an exclusive lock before a database transaction begins. This occurs when another database session, operating in a different thread, holds the database lock.
This PR introduces logging functionality to monitor slow database sessions across all threads in the Tribler application. Slow database sessions can sometimes cause the asyncio loop to freeze for extended periods, which impacts the performance and responsiveness of the application.
Key Features:
This enhancement will assist in identifying performance bottlenecks and aid in optimizing database interactions within the Tribler application.
Example log entry: