New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GbXML ReverseTranslator needs validation and/or better error handling #4441
Comments
These two are wrong per 6.01 OpenStudio/src/gbxml/MapSchedules.cpp Line 85 in 05c754a
and OpenStudio/src/gbxml/MapSchedules.cpp Line 114 in 05c754a
should be https://www.gbxml.org/schema/6-01/GreenBuildingXML_Ver6.01.xsd#element_WeekSchedule https://www.gbxml.org/schema/6-01/GreenBuildingXML_Ver6.01.xsd#element_DaySchedule |
Example "manual" error handling in 2dfa66b |
If XSD validation is added to OS, could it be done in such a way that it’s exposed in the SDK? Maybe Schematron validation too? I have wanted this for a long time for our HPXML work. The Oga library we’re using is great, but it doesn’t support XSD or Schemtron validation. And the performance suffers from being in Ruby. |
@jmarrec my understanding is that the momentum is toward bringing in Xerces to support validation. I know that Xerces is popular and capable, and I have confidence that you did a wide search for other library options. But, some folks on the team mentioned libxml2 and have used it through Ruby/Python bindings and really liked it. Can you think of a reason to choose Xerces over libxml2? |
I would love a solution that supports Schematron, which libxml2 appears to support but not xerces. |
For what it's worth, I did find this document, which described why pugixml was chosen in the past (instead of xerces or libxml2). |
I love pugixml. I don't think we will likely remove it as we already went through one conversion from QXML to pugi, and doing it again would be a lot of work and unnecessary. I suggest we continue encouraging pugi for parsing / xml manipulation needs, but we obviously need one of these other libraries for validation. |
libxml2 is C based, but if we keep the application of it focused on validation, we can put a small C++ wrapper around it to hide the pointers. |
Libxml2 is C. Aside from that no. I don't really care which one we choose. Both are available for conan. Xerces does feel like C-ish as well (start / stop methods, takes const char pointers etc) Also I do think like you that we should keep pugixml in place, just do the validation with another one, instead just using one lib to do both. So we'll provide a simple class that does only one thing: validate a path or stream, reports errors and warnings. The implementation details would be hidden and we won't expose the internal library we use. So both would work |
@kbenne There's something that makes me uneasy about libxml2, is that it doesn't seem to be truly thread safe. It's possible to use it with multiple threads though if you're careful. It's also not like we necesarilly need thread safety here, but still. https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-parser.html#xmlCleanupParser Anyways, we need to settle on a lib to use. @joseph-robertson has already started implementation with xerces-C in #4578, so if you insist on a library change it should happen now rather than when it's finished and polished. Did you guys discuss it on Friday maybe? |
Here is a current "main points" list that @jmarrec and I complied:
|
Here is an additional note on using libxmlplusplus c++ wrapper for schematron. |
@jmarrec you are correct we did talk about this at the iteration meeting on Friday. After that conversation and the discussion here, I favor libxml2. People seem to have a high opinion of libxml2 and especially the Schematron support. But I think we should keep libxml2 confined to schema validation, and I feel like we should write our own C++ wrapper rather than adopting an existing wrapper. I think we basically need about one public API |
Unfortunately it seems to be more complicated than I realized. lxml, which is a very popular python library for working with xml, uses libxml2 but not really for schematron support. From here:
So it sounds like libxml2 support for Schematron is pretty outdated and poor. No idea if it will work well for us. |
@shorowit So this may be misleading. In my search of an example of using libxml2 to do schematron validation, the only example I could find was a Cython extension... for I'm trying to make a MCVE in C right now, to see if it would work or not. |
I have a shitty MCVE implementation at https://github.com/jmarrec/TestCpp-xml-schematron libxml2 uses a too old schematron. So I do the same as what Example API usage: https://github.com/jmarrec/TestCpp-xml-schematron/blob/6dfc1f02fbe67f98e3e9279b0ee1228e5a151496/test/XMLValidator_GTest.cpp#L68-L80 |
FYI, https://github.com/vslavik/xmlwrapp does include both libxml and libxslt, so it could be used to do this will less apparent pointers and co. The issue is making a conan recipe for it... |
Enhancement Request
See #4440 (comment)
Really we need error handling in gbxml... There is currently no validation against the gbXML schema happening and there is very little checking that is done manually (like ensuring that required values/attributes are supplied, or checking that a YearSchedule references a weekScheduleIdRef you make sure it can be found).
Detailed Description
I found this because a test I added was failing without any logging when I had a typo in a weekScheduleIdRef
<Schedule type="Fraction" id="Simple_Schedule"> <YearSchedule id="Simple_Schedule_YearSchedule"> <BeginDate>2017-01-01</BeginDate> <EndDate>2017-12-31</EndDate> - <WeekScheduleId weekScheduleIdRef="SOMETHING_THAT_CANT_BE_FOUND" /> </YearSchedule> [ .... ]
Another issue might be when you pass Schedule with zero YearSchedule children. Or if YearSchedule doesn't have a BeginDate children. etc etc.
https://gbxml.org/schema/6-01/GreenBuildingXML_Ver6.01.xsd#element_YearSchedule
Possible Implementation
Schema validation against the gxbml XSD (herewould be an option.
pugixml
doesn't support that, so that would require using a different library for xml such as xeres-c (which is on conan-center-index: https://github.com/conan-io/conan-center-index/tree/master/recipes/xerces-c).Another option is to manually enforce that required values and attributes are there, and handle errors appropriately (at least logging to inform the user).
The text was updated successfully, but these errors were encountered: