Valid Unicode identifier can be used as field name but not as kwargs #1405

Closed
serge-sans-paille opened this issue Feb 5, 2025 · 5 comments

@serge-sans-paille
Contributor

Simple reproducer:

>>> import attrs
>>> A = attrs.make_class('A', ['Ŀ椁楮潴桶'])
>>> A(Ŀ椁楮潴桶=1)  # this is fine
A(Ŀ椁楮潴桶=1)
>>> A(**{'Ŀ椁楮潴桶':1})  # this is not fine
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: A.__init__() got an unexpected keyword argument 'Ŀ椁楮潴桶'

This was raised by fuzzing when setting up google/oss-fuzz#13009

@hynek
Member

hynek commented Feb 6, 2025

Thanks, but this is out of scope. I've added a warning in 5820ce7, though.

@hynek hynek closed this as completed Feb 6, 2025
@serge-sans-paille
Contributor Author

Thanks for your reply! I did check that

>>> 'Ŀ椁楮潴桶'.isidentifier()
True

though, so why would that be out of scope?
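
For reference, the same mismatch can be reproduced without attrs. The following is only a sketch, assuming the generated `__init__` is compiled from source text (so its parameter names go through the parser); the function `f` and the variable names are made up for illustration.

```python
import unicodedata

name = "Ŀ椁楮潴桶"
# NFKC is the normalization form the Python parser applies to identifiers.
normalized = unicodedata.normalize("NFKC", name)

# Compile a function from source text (assumption: this mirrors how a
# generated __init__ is built). The parser normalizes the parameter name.
namespace = {}
exec(f"def f({name}):\n    return {name}", namespace)
f = namespace["f"]
param = f.__code__.co_varnames[0]

# The raw string is a valid identifier, yet it differs from the stored name.
print(name.isidentifier(), name == normalized, param == normalized)

# Keyword lookup via ** compares strings exactly, without normalization.
try:
    f(**{name: 1})
    raw_key_accepted = True
except TypeError:
    raw_key_accepted = False

# Passing the normalized spelling as the key matches the parameter.
normalized_result = f(**{normalized: 1})
print(raw_key_accepted, normalized_result)
```

So `isidentifier()` returning `True` only says the string is acceptable as an identifier; it does not say the string is the spelling the parser ends up storing.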

@serge-sans-paille
Contributor Author

@hynek: I dug a bit, because I found it quite strange that attrs fails with such a name even though it is a valid Python identifier.
It turns out that all Python identifiers are converted to the so-called "normal form NFKC" while parsing, as stated in https://docs.python.org/3.14/reference/lexical_analysis.html#identifiers.
As a consequence, supporting all valid Python identifiers in attrs is just a matter of normalizing the parameters of make_class. The diff is very simple:

diff --git a/src/attr/_make.py b/src/attr/_make.py
index e3e31ff..ef3ced0 100644
--- a/src/attr/_make.py
+++ b/src/attr/_make.py
@@ -13,6 +13,7 @@ import linecache
 import sys
 import types
 import typing
+import unicodedata
 
 from operator import itemgetter
 
@@ -2908,10 +2909,15 @@ def make_class(
     .. versionchanged:: 18.1.0 If *attrs* is ordered, the order is retained.
     .. versionchanged:: 23.2.0 *class_body*
     """
+    # All identifiers are converted into the normal form NFKC while parsing
+    name = unicodedata.normalize('NFKC', name)
+
     if isinstance(attrs, dict):
         cls_dict = attrs
+        attrs = {unicodedata.normalize('NFKC', k): v for k, v in attrs.items()}
     elif isinstance(attrs, (list, tuple)):
         cls_dict = {a: attrib() for a in attrs}
+        attrs = [unicodedata.normalize('NFKC', k) for k in attrs]
     else:
         msg = "attrs argument must be a dict or a list."
         raise TypeError(msg)
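
As a standalone illustration of the normalization step, here is a minimal sketch using only `unicodedata`. The helper `normalize_identifiers` is hypothetical and not part of attrs. Iterating a dict directly yields only its keys, so the dict branch goes through `.items()`.

```python
import unicodedata


def normalize_identifiers(names):
    """Hypothetical helper: NFKC-normalize field names given as dict or list."""
    if isinstance(names, dict):
        # .items() is needed to get (key, value) pairs out of the dict.
        return {unicodedata.normalize("NFKC", k): v for k, v in names.items()}
    if isinstance(names, (list, tuple)):
        return [unicodedata.normalize("NFKC", n) for n in names]
    msg = "names must be a dict or a list."
    raise TypeError(msg)


raw = "Ŀ椁楮潴桶"
as_list = normalize_identifiers([raw])
as_dict = normalize_identifiers({raw: 1})

# After normalization, the key matches what the parser would store for
# the same identifier written in source code.
print(as_list[0] == unicodedata.normalize("NFKC", raw))
print(raw in as_dict)
```

Names that are already in NFKC form pass through unchanged, so applying the helper to plain ASCII field names is a no-op.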

@hynek
Member

hynek commented Feb 8, 2025

Would you mind opening a PR, then?

@serge-sans-paille
Contributor Author

Done in #1406
