Use ElementTree to parse JUnit XML files because it is much faster than minidom #4693
The new stat collecting code in 1.3+ #4561 uses
Change testrunner_task_mixin.py to parse XML with ElementTree instead of pants.util.xml_parser. ElementTree is much faster on large XML files.
Without this change, tests that generated XML files of up to 40MB took an additional 3 minutes each to run and effectively timed out in the build system. Switching to ElementTree brought the XML parsing down to sub-second.
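For illustration, here is a minimal sketch of reading per-test results from a JUnit XML report with the standard library's ElementTree. It is not the code in this change: `summarize_junit_xml` and the sample report are hypothetical, and the element names (`testcase`, `failure`, `error`, `skipped`) are assumed to follow the common JUnit XML layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample report, assuming the common JUnit XML layout.
SAMPLE_JUNIT_XML = """<testsuite name="org.example.FooTest" tests="3" failures="1" errors="0">
  <testcase classname="org.example.FooTest" name="testOk" time="0.10"/>
  <testcase classname="org.example.FooTest" name="testBroken" time="0.30">
    <failure message="expected 1 but was 2">stack trace here</failure>
  </testcase>
  <testcase classname="org.example.FooTest" name="testSkipped" time="0.02">
    <skipped/>
  </testcase>
</testsuite>
"""


def summarize_junit_xml(xml_text):
  """Map 'classname#name' to a result string for every testcase element.

  Hypothetical helper for illustration only; not part of Pants.
  """
  root = ET.fromstring(xml_text)
  results = {}
  # iter() also matches nested elements, so this works whether the root
  # is a single <testsuite> or a <testsuites> wrapper.
  for testcase in root.iter('testcase'):
    key = '{}#{}'.format(testcase.get('classname'), testcase.get('name'))
    if testcase.find('error') is not None:
      results[key] = 'error'
    elif testcase.find('failure') is not None:
      results[key] = 'failure'
    elif testcase.find('skipped') is not None:
      results[key] = 'skipped'
    else:
      results[key] = 'success'
  return results


results = summarize_junit_xml(SAMPLE_JUNIT_XML)
for test_id, outcome in sorted(results.items()):
  print(test_id, outcome)
```

To read a report from disk rather than from a string, `ET.parse(path).getroot()` returns the same kind of root element.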
This change is the minimal change that will get our tests working again. It would be nice to consolidate the JUnit XML parsing that is done in
It would also be good to have an option to turn off stats gathering in case it affects performance in the future.
This is great! I really like how it improves performance. I definitely agree that we should 1) add an option to turn off different segments of stats reporting (test data, target data, and any other data that we may report in the future); 2) use this approach for all XML parsing.
Filing issues against these 2 ideas would be good.