-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Synchronize copy methods between Python and C implementations of xml.etree.ElementTree.Element #76605
Comments
Currently, the Python implementation of the Element class in xml.etree.ElementTree defines a method called copy() which the C implementation does not define, whereas the C implementation defines a __copy__() method (and a __deepcopy__() method) which the Python implementation does not define. Given that the documentation makes no mention of a copy() method and that its definition would be masked by a standard import of xml.etree.ElementTree, I propose that it be renamed to __copy__() so that copy.copy() can make use of it. |
As the C implementation is already shadowing the Python implementation when available (last lines of Lib/xml/etree/ElementTree.py), using So renaming the copy method to __copy__ in the python implementation looks good to me. Ultimately the Python implementation would require a __deepcopy__ too? |
Ultimately, yeah, the Python version should probably define __deepcopy__ as well. But since this is just a rename of an existing method, I figure we can defer that to another time. |
Renaming the method I agree that it is not documented, but some users tend to assume public methods are documented and use them. So i think it is better to not to break their code. |
Agree with Srinivas |
I don't agree with the API breakage, the C implementation of the module does not expose copy(), so code using copy() is already broken (since 2005) in cases the C implementation is used. I also expect the C implementation to be used almost anywhere, but I may be wrong on this point: in which case the Python implementation may still be used? |
pypy for example has Element.copy() |
Two notes:
That said, I'm neutral on the subject. I'm happy to implement whichever option you decide on. |
I mean dropping |
If the main issue is that pypy users have been using this, can't pypy just keep an alias for |
For derivative Python implementation there is a standard strategy: they uses a copy of CPython standard library. Tthat's why any new module should have a pure Python implementation and that's why https://www.python.org/dev/peps/pep-0399/ exists. Jython has copy() method as well. IronPython has it. I doubt if an implementation without xml.etree.ElementTree.Element.copy exists at all. The method removal breaks backward compatibility without a reason, __copy__ alias is valuable and backward compatible change. |
@andrew Are you suggesting that a |
Yes. |
Paul Ganssle: Even if Andrew were not suggesting adding copy to the C implementation (I have no opinion on that currently), it would still be correct to maintain backward compatibility in the python version in the standard library. We do not consider the standard library to be CPython specific, even though parts of it are. There is no "fork" involved in other projects using it, from our point of view. Quite the opposite, where possible. We have given commit rights to developers from other python implementations specifically so they can contribute things that improve compatibility in the standard library and standard library test suite for those other python implementations. |
I did not mean "fork" in some judgmental sense, I'm mostly just familiar with pypy, but I was under the impression that other projects were *literally* forking the standard library as a practical matter. I get the impression that other interpreters maintain a patched interpreter often for the purposes of adding fixes like the one pointed out in msg309131 which bring the behavior of the Python implementation of the standard library into line with the C implementations, since the C version is the de facto standard (since most code is written and tested only for CPython). I'd be fine with adding a C alias for |
Yes, that's why I said "from our point of view" :) I know there is usually a fork in the practical sense, but we want to make it as easy as practical to sync that fork, which includes not breaking things in the python versions without good reason. |
Given the discussion, I've gone ahead and created a new PR that synchronizes the three copy methods between implementations and deprecates Element.copy(). |
There is an existing test for copying. No new tests are needed. Just add a case for the copy() method in test_copy. Special __deepcopy__() method for Python implementation is not needed. copy.deepcopy() already works well. |
As discussed above, starting with msg309074, __deepcopy__() is being added to the Python implementation because it already exists in the C implementation. And additional tests have in fact uncovered further discrepancies between the Python and C implementations with regard to copying behavior. PR 5046 has changes that I believe are ready for review. |
Python implementation shouldn't copy all implementation details of the C implementation. This is cumbersome and often impossible. Both implementation should have the same behavior, and they do have.
What are these discrepancies? In any case it is better to extend existing tests by adding missed checks than duplicate tests. |
As it stands, calling Element.__deepcopy__() will succeed with the C Additionally, if copy.deepcopy() already works well with the Python
As I mentioned in the PR, the C implementation does not make use of I have modified the C implementation to fix this problem by adding a The Python implementation does not have this problem because it
As I mentioned in the PR: I have added the unit tests in the way that I have because the existing |
You should not call it directly. Use copy.deepcopy().
__deepcopy__() in the C implementation is merely optimization. It is less efficient to convert from the efficient internal representation of Element object to tuples and dicts and back.
create_new_element() usually is called with a new copy of the attrib dict. Making yet one copy inside it is just a waste of time. If in some cases it is called with an externally referred dict, fix these cases by making a copy before calling create_new_element(). |
Taking a step back, I want to make sure I understand the action items for resolving this:
Any other issues discussed here are orthogonal to the goal of this ticket. Are we in agreement on this? |
Opened bpo-36685 for discussion of the attrib copy issue. |
We should not add anything to the implementation that we consider legacy elsewhere. Py3 has "always" used the C accelerator module instead of the Python implementation, which suggests that its interface is probably the righter one. So: make sure that the Obvious Ways, copy.copy() and copy.deepcopy(), work nicely for both implementations, and deprecate Element.copy() in Py3.8. |
Concur with Stefan. Adding the method deprecated from the born looks silly. We should either deprecate the copy() method in the Python implementation, or add an undeprecated method in the C implementation. The latter should be considered as a new feature. |
It seems the final open question on this is whether the mismatch between the Python and C implementations is enough to bypass PendingDeprecationWarning for copy() in favor of jumping straight to DeprecationWarning. |
Let's just deprecate it directly (in Py3.9, I guess), given that it was never documented. |
Follow-up: issue #94383: Remove xml.etree.ElementTree.Element.copy() (Python implementation): use copy.copy() instead. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: