New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DocTestFinder.find fails if module contains class objects with special characters #107715
Comments
Did this happen in the real world? Your PR looks good to me, but there may be a better solution of this problem. If the class was dynamically generated, the class defining statement cannot be found in the source code. It is better to not even try to search. BTW, |
I ran into this for our physical units framework, in which dimensions are represented by classes and quantities by instances thereof, and operations on the quantities result in dynamically created classes representing the resulting dimension. The special characters in the automatically generated names represent the operation: e.g. L*M/T2 for force. While we could use any other naming scheme to avoid triggering this issue, we prefer to keep the current one for readability during live inspection. It is true that the line number will not be found. This is fine, we just need to get rid of the exception. Avoiding the search altogether would be equally fine if there is a way to identify dynamically created classes. However, we though that embedding an unescaped string in a regular expression is considered bad practice in general, which would suggest that the proposed fix is an improvement regardless. I see the issue with |
This patch escapes the class name before embedding it in the regular expression for `pat` in `doctest.DocTestFinder._find_lineno`. While class names do not ordinarily contain special characters, it is possible to encounter these when a class is created dynamically. Escaping the name will correctly return `None` in this scenario, rather than potentially matching a different class or raising `re.error` depending on the symbols used.
…7716) This patch escapes the class name before embedding it in the regular expression for `pat` in `doctest.DocTestFinder._find_lineno`. While class names do not ordinarily contain special characters, it is possible to encounter these when a class is created dynamically. Escaping the name will correctly return `None` in this scenario, rather than potentially matching a different class or raising `re.error` depending on the symbols used. (cherry picked from commit 8579327) Co-authored-by: Gertjan van Zwieten <git@gjvz.nl>
…7716) This patch escapes the class name before embedding it in the regular expression for `pat` in `doctest.DocTestFinder._find_lineno`. While class names do not ordinarily contain special characters, it is possible to encounter these when a class is created dynamically. Escaping the name will correctly return `None` in this scenario, rather than potentially matching a different class or raising `re.error` depending on the symbols used. (cherry picked from commit 8579327) Co-authored-by: Gertjan van Zwieten <git@gjvz.nl>
I agree that embedding an unescaped and not validated string in a regular expression is not a good practice. It could be validated with Thank you for your contribution @gertjanvanzwieten. |
Thanks for merging! |
The danger of closing before the backports are merged is that the backports may fail for weird reasons. The CLA bot failed again on both backports. Gertjan, try reclicking as you did before. I added a note about the flakey bot to the coredev discord. |
Done! |
…GH-107727) This patch escapes the class name before embedding it in the regular expression for `pat` in `doctest.DocTestFinder._find_lineno`. While class names do not ordinarily contain special characters, it is possible to encounter these when a class is created dynamically. Escaping the name will correctly return `None` in this scenario, rather than potentially matching a different class or raising `re.error` depending on the symbols used. (cherry picked from commit 8579327) Co-authored-by: Gertjan van Zwieten <git@gjvz.nl>
…#107726) * gh-107715: Escape class name in regular expression (GH-107716) This patch escapes the class name before embedding it in the regular expression for `pat` in `doctest.DocTestFinder._find_lineno`. While class names do not ordinarily contain special characters, it is possible to encounter these when a class is created dynamically. Escaping the name will correctly return `None` in this scenario, rather than potentially matching a different class or raising `re.error` depending on the symbols used. (cherry picked from commit 8579327) Co-authored-by: Gertjan van Zwieten <git@gjvz.nl> * Update 2023-08-07-14-12-07.gh-issue-107715.238r2f.rst --------- Co-authored-by: Gertjan van Zwieten <git@gjvz.nl> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Thanks! |
Bug report
Checklist
A clear and concise description of the bug
Doctests initialization may fail with
re.error
in case a module contains (dynamically generated) class objects with special characters in their name. This is caused by the regular expression compiled indoctest.DocTestFinder._find_lineno
failing to escape the class name before embedding it. If compilation does succeed, then invalid matches may result from interpreting the class name as a search pattern.Your environment
Linked PRs
The text was updated successfully, but these errors were encountered: