Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Significant performance degradation in logging.TimedRotatingFileHandler due to fix to #89564 #96159

Closed
dgrisby opened this issue Aug 21, 2022 · 5 comments
Assignees
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@dgrisby
Copy link
Contributor

dgrisby commented Aug 21, 2022

The fix to #89564 was to add two file checks to the shouldRollover() method of logging.TimedRotatingFileHandler and RotatingFileHandler (see 62a6677 )

This means that every single log call now does two stat() calls to the log file, which is quite expensive. If the log file is stored on NFS, those stat() calls become extremely expensive. In our application it caused a dramatic (and hard to diagnose) performance loss.

It would be better if shouldRollover() was reverted to its previous implementation, and the check for special files was moved into doRollover(). That would mean the cost of the additional checks was only incurred when there is a rollover to do.

@dgrisby dgrisby added the type-bug An unexpected behavior, bug, or error label Aug 21, 2022
@AlexWaygood AlexWaygood added performance Performance or resource usage stdlib Python modules in the Lib dir labels Aug 21, 2022
@vsajip
Copy link
Member

vsajip commented Aug 22, 2022

Well, the logic belongs in shouldRollover() because that's where the decision on whether to rollover (i.e. whether or not to call doRollover()) is made, and that depends on the result of the file-exists-but-is-not-really-a-file check. One could possibly mitigate the issue you've hit in several ways:

  1. Cache the result of that check in the handler.
  2. Determine the result of the check via a single os.stat() call rather than two (I assume that one is due to the os.path.exists() call and the other is due to os.path.isfile() call).
  3. Move the check in shouldRollover() to where the method would otherwise return True, so that rollover would have occurred but for the check.

As you're hitting the performance issue in your environment, I invite you to consider making a patch using one or more of the above suggestions and then testing the performance impact, and if a solution is found which is acceptable, working up a PR with that patch.

@dgrisby
Copy link
Contributor Author

dgrisby commented Aug 22, 2022

In our application we know for a fact that the logfile is not a special file, so the new check is not required for us. We already had our own subclass of TimedRotatingFileHandler, so we have fixed this for ourselves by simply giving our subclass a shouldRollover method containing a copy of the original code.

We no longer suffer from this issue because we are no longer using the problematic code. I raised this defect to try to be helpful for other people who hit the same issue.

@vsajip
Copy link
Member

vsajip commented Aug 22, 2022

OK, thanks - that was thoughtful of you. I'll close this issue then, but aim to remember if a performance issue is logged in this area. Of course, that "problematic code" was added because its absence caused problems in other scenarios 🙂

@vsajip vsajip closed this as completed Aug 22, 2022
@dgrisby
Copy link
Contributor Author

dgrisby commented Aug 22, 2022

I do think this is something that should be fixed in the standard library. I will create a PR for it.

dgrisby pushed a commit to dgrisby/cpython that referenced this issue Aug 22, 2022
dgrisby pushed a commit to dgrisby/cpython that referenced this issue Aug 22, 2022
…otatingFileHandler due to fix to python#89564

Only check for special files when the rollover time has passed, rather
than cheking on every log message.
@vsajip
Copy link
Member

vsajip commented Aug 23, 2022

Thanks, Duncan. As you're best placed to test the result of the fix, I was hoping for this.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 23, 2022
…medRotat… (pythonGH-96182)

(cherry picked from commit 1499d73)

Co-authored-by: Duncan Grisby <duncan-github@grisby.org>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Aug 23, 2022
…medRotat… (pythonGH-96182)

(cherry picked from commit 1499d73)

Co-authored-by: Duncan Grisby <duncan-github@grisby.org>
vsajip pushed a commit that referenced this issue Aug 23, 2022
…imedRotat… (GH-96182) (GH-96195)

Co-authored-by: Duncan Grisby <duncan-github@grisby.org>
vsajip pushed a commit that referenced this issue Aug 23, 2022
…imedRotat… (GH-96182) (GH-96196)

Co-authored-by: Duncan Grisby <duncan-github@grisby.org>
mdboom pushed a commit to mdboom/cpython that referenced this issue Aug 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
Development

No branches or pull requests

3 participants