-
Notifications
You must be signed in to change notification settings - Fork 1
/
CHANGES
219 lines (149 loc) · 7.07 KB
/
CHANGES
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
The ElementTree Library
$Id: CHANGES 3224 2007-08-27 21:23:39Z fredrik $
*** Changes from release 1.1 to 1.2 ***
(1.2.7 released)
- Added parser support for IronPython (through the ElementIron
module).
- Changed serialization of attribute values; newlines are now
preserved; apostrophes are no longer escaped. The serializer
was also made a bit faster (~20% on my test cases).
(1.2.6 released; Python 2.5 released)
- Fixed handling of entities defined in internal DTD's (reported
by Greg Wilson).
- Fixed serialization under non-standard default encodings (but
using non-standard default encodings is still a lousy idea ;-)
(1.2.5 released)
- Added 'iterparse' implementation. This is similar to 'parse', but
returns a stream of events while it builds the tree. By default,
the parser only returns "end" events (for completed elements):
for event, elem in iterparse(source):
...
To get other events, use the "events" option to pass in a tuple
containing the events you want:
for event, elem in iterparse(source, events=(...)):
...
The event tuple can contain one or more of:
"start"
generated for start tags, after the element has been created
(but before the current element has been fully populated)
"end"
generated for end tags, after all element children has been
created.
"start-ns"
generated when a new namespace scope is opened. for this event,
the elem value is a (prefix, url) tuple.
"end-ns"
generated when the current namespace scope is closed. elem
is None.
Events arrive asynchronously; the tree is usually more complete
than the events indicate, but this is nothing you can rely on.
The iterable itself contains context information. In the current
release, the only public context attribute is "root", which is set
to the root element when parsing is finished. To access the con-
text, assign the iterable to a variable before looping over it:
context = iterparse(source)
for event, elem in context:
...
root = context.root
(1.2.4 released)
- Fixed another FancyTreeBuilder bug on Python 2.3.
(1.2.3 released)
- Fixed the FancyTreeBuilder class, which was broken in 1.2.1
and 1.2.2 (broken for some Python versions, at least).
(1.2.2 released)
- Fixed some ASCII/Unicode issues in the HTML parser. You can now
use the parser on documents that mixes encoded 8-bit data with
character references outside the ASCII range. (backported from 1.3)
(1.2.1 released)
- Changed XMLTreeBuilder to take advantage of new expat features, if
present. This speeds up parsing quite a bit. (backported from 1.3)
(1.2c1 released; 1.2 final released)
- Added 'docs' directory, with PythonDoc documentation for the
ElementTree library. See docs/index.html for an overview.
(1.2b4 released)
- Fixed encoding of Unicode element names and attribute names
(reported by Ken Rimey).
(1.2b3 released)
- Added default argument to 'findtext'. Note that 'findtext' now
always returns an empty string if a matching element is found, but
has no text content. None is only returned if no element is found,
and no default value is specified.
- Make sure 'dump' adds a trailing linefeed.
(1.2b2 released)
- Added optional tree builder argument to the HTMLTreeBuilder class.
(1.2b1 released)
- Added XMLID() helper. This is similar to XML(), but returns both
the root element and a dictionary mapping ID attributes to elements.
- Added simple SgmlopXMLTreeBuilder module. This is a very fast
parser, but it doesn't yet support namespaces. To use this parser,
you need the sgmlop driver:
http://effbot.org/zone/sgmlop-index.htm
- Fixed exception in test suite; the TidyHTMLTreeBuilder class
now raises a RuntimeError exception if the _elementidy module
is not available.
(1.2a5 released)
- Fixed problem that could result in repeated use of the same
namespace prefix in the same element (!).
- Fixed import error in ElementInclude, when using the default
loader (Gustavo Niemeyer).
(1.2a4 released)
- Fixed exception when .//tag fails to find matching elements
(reported by Mike Kent) (@XMLTOOLKIT28)
- Fall back on pre-1.2 find/findtext/findall behaviour if the
ElementPath module is not installed. If you don't need path
support, you can simply copy the ElementTree module to your
own project.
(1.2a3 released)
- Added experimental support for XInclude-style preprocessing. The
ElementInclude module expands xi:include elements, using a custom
resolver. The current release ignores xi:fallback elements.
- Fixed typo in ElementTree.findtext (reported by Thomas Dartsch)
(@XMLTOOLKIT25)
- Fixed parsing of periods in element names (reported by Brian
Vicente) (@XMLTOOLKIT27)
(1.2a2 released)
- Fixed serialization of elements and attributes in the XML default
namespace (http://www.w3.org/XML/1998/namespace). Added "rdf" to
the set of "well-known" namespace prefixes.
- Added 'makeelement' factory method. Added 'target' argument to
XMLTreeBuilder class.
(1.2a1 released)
- Added support for a very limited subset of the abbreviated XPath
syntax. The following location paths are supported:
tag -- select all subelements with the given tag
. -- select this element
* -- select all subelements
// (empty path) -- select all subelements, on all levels
Examples:
p -- select all p subelements
.//a -- select all a sublements, at all sublevels
*/img -- select all img grandchildren
ul/li -- select all li elements that are children of ul elements
.//ul/li -- same, but select elements anywhere in the subtree
Absolute paths (paths starting with a slash) can only be used on
ElementTree instances. To use // on an Element instance, add a
leading period (.).
*** Changes from release 1.0 to 1.1 ***
(1.1 final released)
- Added 'fromstring' and 'tostring' helpers. The 'XML' function is
an alias for 'fromstring', and provides a convenient way to add XML
literals to source code:
from elementtree.ElementTree import XML
element = XML('<element>content</element>')
- Moved XMLTreeBuilder functionality into the ElementTree module. If
all you need is basic XML support, you can simply copy the ElementTree
module to your own project.
- Added SimpleXMLWriter module.
(1.1b2 released)
- Changed default encoding to US-ASCII. Use tree.write(file, "utf-8")
to get the old behaviour. If the tree contains text that cannot be
encoded using the given encoding, the writer uses numerical entities
for all non-ASCII characters in that text segment.
(1.1b1 released)
- Map tags and attribute names having the same value to the same
object. This saves space when reading large XML trees, and also
gives a small speedup (less than 10%).
- Added benchmark script. This script takes a filename argument, and
loads the given file into memory using the XML and SimpleXML tree
builders. For each parser, it reports the document size and the
time needed to parse the document.