xml.parsers.expat make a dictionary which keys are broken if buffer_text is False. #49286
Comments
When I make a dictionary by parsing "legacy-icon-mapping.xml"(which is a

=====================
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import with_statement
import sys
from xml.parsers.expat import ParserCreate
import codecs


class Database:
    """Make a dictionary which is accessible by Database.dict"""

    def __init__(self, buffer_text):
        self.cnt = None
        self.name = None
        self.data = None
        self.dict = {}
        p = ParserCreate()
        p.buffer_text = buffer_text
        p.StartElementHandler = self.start_element
        p.EndElementHandler = self.end_element
        p.CharacterDataHandler = self.char_data
        with open("/usr/share/icon-naming-utils/legacy-icon-mapping.xml",
                  'r') as f:
            p.ParseFile(f)

    def start_element(self, name, attrs):
        if name == 'context':
            self.cnt = attrs["dir"]
        if name == 'icon':
            self.name = attrs["name"]

    def end_element(self, name):
        if name == 'link':
            self.dict[self.data] = (self.cnt, self.name)

    def char_data(self, data):
        self.data = data.strip()


def print_set(aset):
    for e in aset:
        print '\t' + e


if __name__ == '__main__':
    sys.stdout = codecs.getwriter('utf_8')(sys.stdout)
    map_false_dict = Database(False).dict
    map_true_dict = Database(True).dict
    print "The keys which exist if buffer_text=False but don't exist if buffer_text=True are"
    print_set(set(map_false_dict.keys()) - set(map_true_dict.keys()))
    print "The keys which exist if buffer_text=True but don't exist if buffer_text=False are"
    print_set(set(map_true_dict.keys()) - set(map_false_dict.keys()))
=====================

The result of running this script is
If the XML file is small enough, could you attach it to the issue? Or (Note that Python 2.5 only gets security fixes now, so unless this
Thanks for the reply!
The sample code has a bug; expat is OK. The char_data method must append the incoming characters because the
You should reset it with self.data = '' in end_element().
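A minimal sketch of the append-and-reset pattern described here, not the reporter's script. `LinkCollector` and `parse_in_pieces` are hypothetical names introduced only for illustration, and the `link` element name is taken from the sample code. The handler collects every chunk it receives and joins them once the element ends, so the result should not depend on how many times expat calls the character data handler.

```python
from xml.parsers.expat import ParserCreate


class LinkCollector(object):
    """Hypothetical helper: collect the text of each <link> element."""

    def __init__(self, buffer_text=False):
        self.links = []
        self._chunks = None  # None means "not inside a <link> element"
        self._parser = ParserCreate()
        self._parser.buffer_text = buffer_text
        self._parser.StartElementHandler = self._start
        self._parser.EndElementHandler = self._end
        self._parser.CharacterDataHandler = self._chars

    def _start(self, name, attrs):
        if name == 'link':
            self._chunks = []  # start a fresh accumulation

    def _chars(self, data):
        # May be called several times for one run of text, so append
        # instead of overwriting.
        if self._chunks is not None:
            self._chunks.append(data)

    def _end(self, name):
        if name == 'link':
            self.links.append(''.join(self._chunks).strip())
            self._chunks = None  # reset until the next <link>

    def feed(self, text, is_final=False):
        self._parser.Parse(text, is_final)


def parse_in_pieces(pieces, buffer_text=False):
    """Hypothetical driver: feed the document piece by piece."""
    collector = LinkCollector(buffer_text)
    for piece in pieces[:-1]:
        collector.feed(piece)
    collector.feed(pieces[-1], True)
    return collector.links
```

Feeding the document in several pieces, or including an entity reference such as `&amp;` in the text, is a simple way to exercise the case where the text of one element may arrive in more than one callback.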
Hi kawai.
That's the spec of the XML SAX interface.
Please read "The ContentHandler.characters() callback is missing data!" and close this issue :)
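For comparison, the same rule can be sketched with the standard library's `xml.sax` module, whose documentation says that `characters()` may deliver contiguous text in one chunk or in several. `LinkTextHandler` and `link_texts` are hypothetical names used only for this illustration.

```python
import xml.sax


class LinkTextHandler(xml.sax.ContentHandler):
    """Hypothetical handler: gather the text of each <link> element."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.links = []
        self._chunks = None  # None means "not inside a <link> element"

    def startElement(self, name, attrs):
        if name == 'link':
            self._chunks = []

    def characters(self, content):
        # The parser may split one text run into several calls.
        if self._chunks is not None:
            self._chunks.append(content)

    def endElement(self, name):
        if name == 'link':
            self.links.append(''.join(self._chunks).strip())
            self._chunks = None


def link_texts(xml_bytes):
    """Hypothetical helper: return the text of every <link> element."""
    handler = LinkTextHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.links
```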
A correction to my former message: "briefly" should read "in detail".
From msg80438
It seems that we should reset it in start_element(), like this:

=============================
def start_element(self, name, attrs):
    ...abbr...
    if name == 'link':
        self.data = ''
=============================
Could someone close this?