Offer suggestions on AttributeError and NameError #82711
To improve the debugging experience in both interactive and non-interactive code, I propose to offer suggestions when attribute access fails. For example:

>>> class A: foo = None
...
>>> A.fou
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'A' has no attribute 'fou' Did you mean: foo?

This also applies to imports from modules and other situations:

>>> import collections
>>> collections.NamedTuple
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'collections' has no attribute 'NamedTuple' Did you mean: namedtuple? |
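To make the proposal concrete, here is a minimal pure-Python sketch of the kind of lookup being suggested. It is not the code from any of the PRs in this thread: it assumes the standard library's difflib.get_close_matches as the similarity measure, and the helper name suggest_attribute is hypothetical.

```python
import difflib


def suggest_attribute(obj, missing_name):
    """Return the attribute of obj closest to missing_name, or None.

    Hypothetical helper: uses difflib's similarity ratio over dir(obj),
    which is only one of several possible ways to rank candidates.
    """
    matches = difflib.get_close_matches(missing_name, dir(obj), n=1)
    return matches[0] if matches else None


class A:
    foo = None


print(suggest_attribute(A, "fou"))
```

A real implementation would also have to decide how close a candidate must be before it is worth showing, which is where most of the later discussion about edit distance comes from.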
PR 16850 shows an initial prototype for the idea |
It already exists as a third-party module, and it would be really cool to have this at the core level. https://github.com/dutc/didyoumean (by James Powell) |
I am not super convinced that this is a great idea because it has some performance cost (although somewhat controlled) but I want to open a discussion. |
Ruby has it integrated into the core: https://bugs.ruby-lang.org/issues/11252 . It was initially a gem that got merged into core.
|
Idea: we could do this only in interactive mode if we consider it too expensive otherwise. |
I am running pyperformance to check the performance cost of this. |
Slower (27):
The current approach is too expensive, so I'm closing PR 16850. |
AFAIK there is an existing issue for this idea. I have doubts about performance. I added _PyObject_LookupAttr in particular to avoid the overhead of raising and silencing an AttributeError. I believe most performance-sensitive code in the core now uses it and will not be affected, but there is other code which silences it (_PyObject_LookupAttr itself silences an AttributeError raised in called functions), and third-party code which uses PyObject_HasAttr or PyObject_GetAttr can be affected. It might be simpler to implement it in sys.excepthook or sys.displayhook, but at that point we do not have the attribute name or a reference to the object. There is an issue and a PEP about adding references to AttributeError. It could help to implement this feature. |
I am surprised that it was SO expensive. Pathlib would largely benefit from cached_property if it were compatible with __slots__. |
Serhiy, do you think we could attach the object and the name to some private fields of the AttributeError and check that in sys.excepthook if they are present? |
I will also repeat the pyperformance results locally just in case something was off on the speed.python.org server. |
Why private? They should be public. But the problem is that making a reference to the object we can prolong its lifetime and even create a reference loop. There was a proposition to create a weak reference, but not all types support weak reference. |
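The limitation mentioned above is easy to see from Python. The sketch below only illustrates that limitation, it is not part of any proposed patch, and the helper name try_weakref is hypothetical.

```python
import weakref


class Plain:
    """Ordinary user-defined classes support weak references by default."""


def try_weakref(obj):
    """Return a weak reference to obj, or None if its type does not allow one."""
    try:
        return weakref.ref(obj)
    except TypeError:
        return None


instance = Plain()
print(try_weakref(instance) is not None)  # user-defined instance
print(try_weakref(42) is None)            # int does not support weak references
print(try_weakref("text") is None)        # neither does str
```

So an exception field holding only a weak reference would have nothing to offer for failed attribute access on many built-in objects.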
I was suggesting private for now, until the/a PEP to modify the exception is approved. That way we could try implementing the feature on top of those fields. ------ On the other hand, do you see any way to make the current approach not that slow? Maybe activating it only in interactive mode? |
If I'm not mistaken, as long as the traceback is alive, the object is alive because the frames will contain it. The other case is if the exception is not propagated, but in that case it should just die unless explicitly captured. The cycle only happens if the object has a reference to the exception, and that should not happen in the general case. |
I have opened PR 16856 adding fields to the AttributeError and implementing the feature in PyErr_Display. |
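For readers who want to see the shape of this approach without reading the C code, here is a rough Python sketch of a display hook that reads the extra fields from the exception. It is an illustration under assumptions, not the code in PR 16856: the field names name and obj, the helper suggestion_for, and the use of difflib are all assumed here.

```python
import difflib
import sys
import traceback


def suggestion_for(exc):
    """Return a close attribute name for a failed lookup, or None.

    Assumes the exception carries `name` (the missing attribute) and
    `obj` (the object it was looked up on); both names are assumptions.
    """
    if not isinstance(exc, AttributeError):
        return None
    name = getattr(exc, "name", None)
    if name is None:
        # Raised without the extra fields, e.g. by user code.
        return None
    obj = getattr(exc, "obj", None)
    matches = difflib.get_close_matches(name, dir(obj), n=1)
    return matches[0] if matches else None


def suggesting_excepthook(exc_type, exc, tb):
    """Print the normal traceback, then a hint if one can be computed."""
    traceback.print_exception(exc_type, exc, tb)
    hint = suggestion_for(exc)
    if hint is not None:
        print(f"Did you mean: {hint}?", file=sys.stderr)


# To try it out: sys.excepthook = suggesting_excepthook
```

The point of the design is that the cost is paid only when an uncaught exception is displayed, so code that raises and silences AttributeError in a hot path is unaffected.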
With the new approach, there is no measurable difference in performance:

venv ❯ pyperf compare_to json/2019-10-19_20-01-master-24dc2f8c5669.json.gz json/2019-10-20_01-32-suggestions2-21404456383b.json.gz -G

Slower (3):
Faster (4):
Benchmark hidden because not significant (57): chameleon, chaos, crypto_pyaes, deltablue, dulwich_log, fannkuch, float, genshi_text, genshi_xml, json_dumps, json_loads, logging_format, logging_silent, logging_simple, mako, meteor_contest, nbody, pathlib, pickle, pickle_dict, pickle_list, pickle_pure_python, pidigits, python_startup, raytrace, regex_compile, regex_dna, regex_effbot, regex_v8, richards, scimark_fft, scimark_lu, scimark_monte_carlo, scimark_sor, scimark_sparse_mat_mult, spectral_norm, sqlalchemy_declarative, sqlalchemy_imperative, sympy_expand, sympy_sum, sympy_str, telco, unpack_sequence, unpickle, unpickle_list, unpickle_pure_python, xml_etree_parse, xml_etree_iterparse, xml_etree_process |
I think I am going to proceed to modify PR 16856 by adding the name and the object to the AttributeError exceptions. This should not extend the lifetime of the object more than the current exception is doing as the exception keeps alive the whole frame stack in the __traceback__ attribute. Consider this code for example:

class Target:
    def __del__(self):
        print("The object is dead!")

def f():
    g()

def g():
    h()

def h():
    theobj = Target()
    theobj.thevalue

try:

This code prints:

<__main__.Target object at 0x7f064adbfe10>

We can notice two things:
Adding another reference to the target object to the exception won't change the current lifetime, nor will it create reference cycles, as the target object will not have (unless explicitly created) a reference to the exception. We can conclude that this change should be safe. In the resolution email for PEP-473 the council stated that "Discussions about adding more attributes to built-in exceptions can continue on the issue tracker on a per-exception basis". As I think this will be very beneficial for the feature discussed in the issue, I will proceed as indicated. Notice that if we want to change all internal CPython code that raises AttributeError to use the new fields, this should be done in a different PR to keep this one minimal. For this feature, we still need to handle AttributeError raised by the user without these fields set. |
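The lifetime argument can be checked with a small self-contained script. This is a sketch written for this illustration rather than the exact snippet from the comment: the events list and the helper innermost_local are hypothetical, and the behaviour described relies on CPython's reference counting.

```python
import gc

events = []


class Target:
    def __del__(self):
        events.append("dead")


def h():
    theobj = Target()
    theobj.thevalue  # raises AttributeError


def innermost_local(exc, name):
    """Fetch a local variable from the innermost frame of exc's traceback."""
    tb = exc.__traceback__
    while tb.tb_next is not None:
        tb = tb.tb_next
    return tb.tb_frame.f_locals[name]


try:
    h()
except AttributeError as e:
    # The traceback already keeps h()'s frame, and so theobj, alive.
    found = innermost_local(e, "theobj")
    events.append(type(found).__name__)
    del found

# Leaving the except block drops `e`, and with it the traceback and frames.
gc.collect()  # not required on CPython, but harmless
print(events)
```

On CPython the object is recorded as reachable while the exception is being handled and as dead once the handler exits, which is the behaviour the argument above relies on.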
Interesting, why are locals not cleared when an exception leaves a frame? |
Helping the developer by suggesting a fix introduces a minor but non-zero overhead, so I would prefer to only enable it as an opt-in option. Maybe enable it in development mode (-X dev/PYTHONDEVMODE=1)?
This project hooks into PyObject_GetAttr() by modifying PyObject_GetAttr() machine code, which is definitely not a portable approach. Maybe one approach would be to add a way to install a hook to customize AttributeError exceptions? Can this issue be implemented using sys.excepthook? -- This issue is specific to AttributeError, but I vaguely recall that Yury Selivanov told me that he wanted to do something similar for any exception: detect the most common mistakes and propose a solution. Sadly, I don't think that he ever sent anything in public. I am adding Yury to the nosy list. -- Here is another project which also catches NameError, AttributeError, ImportError, TypeError, ValueError, SyntaxError, MemoryError, OverflowError, OSError, RuntimeError, etc. : It is implemented with sys.excepthook, but it is also compatible with the IPython "custom exception handler" (call get_ipython().set_custom_exc()). By the way, does IPython have a feature like this? In short, https://github.com/SylvainDe/DidYouMean-Python seems to already implement this issue in the proper way, no? -- Similar project for Ruby: |
Not that I know of.
I briefly checked the project. My current approach exposes the object and the name directly on the exception, so the display hook can use them directly instead of fiddling with frames and other things. Additionally, I think it would be very beneficial to have it in the core, as many people learning Python, for example, cannot install packages. |
I think doing that would make it lose almost all of its value. |
GCC also provides more and more hints. Example: int main() { int hello = 1; return helo; } GCC: error: 'helo' undeclared (first use in this function); did you mean 'hello'? |
Related issue: PEP-534 -- Improved Errors for Missing Standard Library Modules |
I like the https://docs.python.org/dev/whatsnew/3.10.html#better-error-messages section: well done, thanks ;-) |
I opened PR 25584 to fix this current behavior: >>> v
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'v' is not defined. Did you mean: 'id'?
>>> vv
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'vv' is not defined. Did you mean: 'id'?
>>> vvv
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'vvv' is not defined. Did you mean: 'abs'? |
Some research of other projects: LLVM [1][2]
GCC [3]
Rust [4]
Ruby [5]
I think there are some good ideas here. [1] https://github.com/llvm/llvm-project/blob/d480f968ad8b56d3ee4a6b6df5532d485b0ad01e/llvm/include/llvm/ADT/edit_distance.h#L42 |
Hi Dennis, this is a fantastic investigation! I think I really like the GCC approach here. We may want to invest in porting some of their ideas into our solution. |
PR 25776 is a work in progress for what it might look like to do a few things:
Re: Damerau-Levenshtein (transpositions as single edits), if that were to get implemented, I don't see a way to do that without using a buffer of at least 3x the size, storing the most recent 3 rows of the matrix. |
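To illustrate the buffer remark, here is a small sketch of the restricted Damerau-Levenshtein variant (often called optimal string alignment) that keeps only three rows of the matrix. It is not the code in PR 25776, the function name osa_distance is hypothetical, and unit costs for every edit are assumed.

```python
def osa_distance(a, b):
    """Edit distance where an adjacent transposition counts as one edit.

    Only three rows of the dynamic-programming matrix are kept:
    prev2 (row i-2), prev (row i-1) and cur (row i).
    """
    prev2 = None
    prev = list(range(len(b) + 1))  # row 0: distance from "" to b[:j]
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,         # deletion
                cur[j - 1] + 1,      # insertion
                prev[j - 1] + cost,  # match or substitution
            )
            if (
                i > 1
                and j > 1
                and a[i - 1] == b[j - 2]
                and a[i - 2] == b[j - 1]
            ):
                # Transposition needs the row from two steps back.
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


print(osa_distance("helo", "hello"))
print(osa_distance("ab", "ba"))
```

Plain Levenshtein only ever reads row i-1, which is why it fits in a single buffer; the transposition case is the only step that reaches back to row i-2.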