Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: BCF xml parsing: drop lxml, use xml.etree #1750

Merged
merged 17 commits into from Oct 10, 2017
Merged

ENH: BCF xml parsing: drop lxml, use xml.etree #1750

merged 17 commits into from Oct 10, 2017

Conversation

sem-geologist
Copy link
Contributor

@sem-geologist sem-geologist commented Oct 5, 2017

This PR drops usage of lxml and adopts built-in xml.etree.ElementTree

  • it is not a first time when travis ci and appveyor fail/skips tests due to lxml geting inculded in/out from default anaconda library list.
  • xml.etree is faster at parsing
  • reduces installation complexity (so that bcf would be installed even with minimal installation)
  • reduces dependency (as etree is built-in and distributed with python)

Progress of the PR

  • getting working replacement of bcf.py
    • adopt complicated xpath definitions to significantly narrowed down etree xpath subset (fix overview image reading)
  • remove lxml checks in other files (io/__init__.py),
  • update relevant parts in documentations
  • update installation scripts dropping lxml dependency (setup.py, .travis.yml, apveyor.yml ...)
  • check tests and do some adjustment if needed to pass all of them, (original metadata structure is not changed going to be very minory changed, adding the XML attributes with '@Attribute' notation)
  • find faulty xml example (I believe I encountered one once), and adopt regex code for its fixing prior-parsing, (containing numbers in the tags: e.g. <1>good day</1>) can't find such faulty bcf file. I have only bruker eds *.spx and project files with messy xml tags. Without working bug example this will wait until I or somebody will find such buggy bcf file.
  • ready for review.

@sem-geologist sem-geologist mentioned this pull request Oct 5, 2017
Replace 'XmlClass' into '@' string
which mark atrribute from xml node
into etreeelement.find(str)
fix: saved element checking error:
IndexError -> AttributeError
so that test comparing the python vs cython
parsing would be run as last
add: check if compiled cython bcf code
is present to that testing function
@sem-geologist
Copy link
Contributor Author

CI checks fail not due to bcf re-implementation, but due to plotting test failures.

@ericpre
Copy link
Member

ericpre commented Oct 9, 2017

This look alright. Can you please merge with hyperspy:RELEASE_next_minor, so that we can make sure that the bcf tests are also passing on appveyor? Since matplotlib 2.1 the plotting tests are failing... but this is another story.

@ericpre ericpre merged commit adaf1f3 into hyperspy:RELEASE_next_minor Oct 10, 2017
@ericpre
Copy link
Member

ericpre commented Oct 10, 2017

Thanks @sem-geologist!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants