Python 2.7.15: xml.sax.parse() closes file objects passed to it #77913

Closed
gibfahn mannequin opened this issue Jun 1, 2018 · 3 comments
Labels
stdlib (Python modules in the Lib dir), topic-XML

Comments

@gibfahn
Mannequin

gibfahn mannequin commented Jun 1, 2018

BPO 33732
Nosy @vstinner, @zware, @serhiy-storchaka, @gibfahn

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2020-04-27.03:43:10.526>
created_at = <Date 2018-06-01.12:59:58.263>
labels = ['expert-XML', 'library']
title = 'Python 2.7.15: xml.sax.parse() closes file objects passed to it'
updated_at = <Date 2020-04-27.03:43:10.525>
user = 'https://github.com/gibfahn'

bugs.python.org fields:

activity = <Date 2020-04-27.03:43:10.525>
actor = 'zach.ware'
assignee = 'none'
closed = True
closed_date = <Date 2020-04-27.03:43:10.526>
closer = 'zach.ware'
components = ['Library (Lib)', 'XML']
creation = <Date 2018-06-01.12:59:58.263>
creator = 'gibfahn'
dependencies = []
files = []
hgrepos = []
issue_num = 33732
keywords = []
message_count = 3.0
messages = ['318408', '318419', '367373']
nosy_count = 4.0
nosy_names = ['vstinner', 'zach.ware', 'serhiy.storchaka', 'gibfahn']
pr_nums = []
priority = 'normal'
resolution = 'out of date'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue33732'
versions = ['Python 2.7']

@gibfahn
Mannequin Author

gibfahn mannequin commented Jun 1, 2018

Sorry if this is a duplicate; I didn't find anything.

We hit some issues with this change:

It's possible I'm misunderstanding something; let me know if that's the case.

It seems that in Python 2.7.15, xml.sax.parse() closes file descriptors that are passed to it.

  1. Isn't this a breaking change? It certainly breaks code we're using in production.
  2. Why is the sax parser closing file descriptors that it didn't open? I understand if the parser is given a path and opens its own fd it makes sense to close it, but not when the fd is given directly.
  3. What do you do if you need access to the file descriptor after parsing it (because you parse it in place)?

For file descriptors that point to files on disk we can work around it by reopening the file after each parse, but for something like a StringIO buffer (see simplified example below) I'm not aware of any way to get around the problem.

-> StringIO Example:

    import xml.sax
    import StringIO
    # Some StringIO buffer.
    fd = StringIO.StringIO(b'<_/>')
    # Do some parsing.
    xml.sax.parse(fd, xml.sax.handler.ContentHandler())
    # Try to do some other parsing (fails).
    xml.sax.parse(fd, xml.sax.handler.ContentHandler())
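
A possible workaround for the in-memory case, sketched under the assumption that the data fits in memory and that parsing a copy is acceptable: hand the parser the buffer's contents via `xml.sax.parseString()` rather than the buffer object itself, so there is nothing for it to close. `io.BytesIO` stands in for `StringIO.StringIO` here so that the snippet also runs on Python 3.

```python
import io
import xml.sax
import xml.sax.handler

# Some in-memory buffer (io.BytesIO is an assumption; the report uses StringIO).
buf = io.BytesIO(b'<_/>')

# getvalue() returns the full contents regardless of the read position and
# leaves `buf` itself untouched, so the parser never sees the buffer object.
xml.sax.parseString(buf.getvalue(), xml.sax.handler.ContentHandler())

# A second parse works the same way, because `buf` was never handed over.
xml.sax.parseString(buf.getvalue(), xml.sax.handler.ContentHandler())
```

This does not help when the source is a stream that cannot be held in memory as a whole.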

-> File Example:

    import xml.sax
    fd = open('/tmp/test-junit1.xml')
    # Do some parsing.
    xml.sax.parse(fd, xml.sax.handler.ContentHandler())
    # Try to do some other parsing (fails, as in the StringIO example).
    xml.sax.parse(fd, xml.sax.handler.ContentHandler())
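
Another workaround sketch that could cover both examples above: wrap the file object in a small proxy whose `close()` does nothing, so the parser's cleanup cannot close the underlying object. `KeepOpen`, `ElementCounter`, and `parse_keeping_open` are hypothetical names introduced for illustration; none of them is part of `xml.sax`. The sketch assumes the stream is seekable so that it can be rewound between parses.

```python
import io
import xml.sax
import xml.sax.handler


class KeepOpen(object):
    """Hypothetical proxy: forwards everything to the stream except close()."""

    def __init__(self, stream):
        self._stream = stream

    def read(self, *args):
        return self._stream.read(*args)

    def close(self):
        # Deliberately a no-op: the caller owns the underlying stream.
        pass

    def __getattr__(self, name):
        # Only called for attributes not defined on the proxy itself.
        return getattr(self._stream, name)


class ElementCounter(xml.sax.handler.ContentHandler):
    """Hypothetical handler that counts start tags, to show parsing happened."""

    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def parse_keeping_open(stream, handler):
    """Parse `stream`, then rewind it to where it was before parsing."""
    start = stream.tell()
    xml.sax.parse(KeepOpen(stream), handler)
    stream.seek(start)


buf = io.BytesIO(b'<root><a/><b/></root>')
first = ElementCounter()
second = ElementCounter()
parse_keeping_open(buf, first)
parse_keeping_open(buf, second)  # the same object can be parsed again
```

With this approach the caller stays responsible for closing the stream once it is no longer needed.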

Originally posted on #1451 (comment); thanks to serhiy.storchaka for redirecting me here.

@gibfahn gibfahn mannequin added stdlib Python modules in the Lib dir topic-XML labels Jun 1, 2018
@gibfahn
Mannequin Author

gibfahn mannequin commented Jun 1, 2018

As an addendum, I note that other parsers, like:

    from lxml import etree
    parser = etree.XMLParser(compact=False)
    etree.parse(some_fd, parser).find('some_text').text

do not close the fd they are given.

@zware
Member

zware commented Apr 27, 2020

Hi Gibson,

I'm sorry this issue didn't get any attention before Python 2.7 reached EOL, but as that milestone has now passed I'm closing the issue. Thank you for the report anyway!

@zware zware closed this as completed Apr 27, 2020
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022