Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad detector XML file crashes the IOC #459

Closed
MarkRivers opened this issue Sep 16, 2020 · 13 comments
Closed

Bad detector XML file crashes the IOC #459

MarkRivers opened this issue Sep 16, 2020 · 13 comments

Comments

@MarkRivers
Copy link
Member

A user reported that when their XML had rather than it crashed the IOC. They could not reboot because each time they did it would crash again reading that file.

The IOC needs to be robust against bad XML files.

@aps-7bm
Copy link

aps-7bm commented Sep 16, 2020

Here is the offending XML file (with .txt appended so GitHub would allow me to upload it). I confirmed that this file will cause the IOC to immediately crash. Let me know if you need the attributes file as well.
mct_IOCcrash.xml.txt

@MarkRivers
Copy link
Member Author

I was under the mistaken impression that it was a bad attributes XML file that was crashing the IOC, but I see now that it is HDF5 layout XML file that you were talking about. Thanks for clarifying.

@prjemian
Copy link
Contributor

I do not see any obvious XML syntax errors in file mct_IOCcrash.xml.txt.

@aps-7bm
Copy link

aps-7bm commented Sep 16, 2020

Check line 17. That should be a tag. Instead, it is a tag. You can run it through xmllint --valid on Linux as well.

@aps-7bm
Copy link

aps-7bm commented Sep 16, 2020

OK, let's try that again. It should be a </dataset> tag, but is really a <dataset> tag.

@aps-7bm
Copy link

aps-7bm commented Sep 16, 2020

To clarify, the bad layout XML file for the HDF5 plugin would cause the IOC to crash. I had edited the XML file while the IOC was running, so autosave restored the HDF1:XMLFileName PV to point to the path to the bad XML file. On boot, as soon as the IOC tried to read this XML file, it would crash again. Deleting the autosave file got it so the IOC would restart.

@MarkRivers
Copy link
Member Author

I have reproduced the problem. It is crashing in NDFileHDF5Layout.cpp in Group::name_exist(). I have added printf() debugging to see the calls and values leading to the crash. This is the git diff.


   Dataset* Group::new_dset(const std::string& name)
   {
     Dataset* ds = NULL;

+printf("Group::new_dset: Calling name_exist(%s)\n", name.c_str());
+printf("Group::new_dset: name_exist(%s) = %d\n", name.c_str(), this->name_exist(name));
     // First check that a dataset or a group with this name doesn't already exist
     if ( this->name_exist(name) ) return NULL;

     // Create the object
+printf("Group::new_dset: creating Dataset %s\n", name.c_str());
     ds = new Dataset(name);
     ds->parent = this;

     // Insert the string, Dataset pointer pair in the datasets map.
     std::pair<std::map<std::string, Dataset*>::iterator,bool> ret;
     ret = this->datasets.insert(std::pair<std::string, Dataset*>(name, ds));

     // Check for successful insertion.
     if (ret.second == false){
       delete ds;
@@ -608,26 +612,32 @@ namespace hdf5
   {
     Element::_copy(src);
     this->datasets = src.datasets;
     this->groups = src.groups;
     this->ndattr_default_container = src.ndattr_default_container;
   }

   bool Group::name_exist(const std::string& name)
   {
     // First check that a dataset or a group with this name doesn't already exist
+printf("Group::name_exist: calling this->datasets.count(%s)\n", name.c_str());
     if (this->datasets.count(name) > 0){
       return true;
     }
+printf("Group::name_exist: this=%p\n", this);
+printf("Group::name_exist: this->groups.size=%d\n", (int)this->groups.size());
+printf("Group::name_exist: calling this->groups.count(%s)\n", name.c_str());
     if (this->groups.count(name) > 0){
+printf("Group::name_exist: returning true\n");
       return true;
     }
+printf("Group::name_exist: returning false\n");
     return false;
   }

   Root::Root() : Group()
   {
   }

   Root::Root(const std::string& name) : Group(name)
   {
   }
diff --git a/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp b/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp
index 9bc1558..7c5f018 100644
--- a/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp
+++ b/ADApp/pluginSrc/NDFileHDF5LayoutXML.cpp
@@ -496,27 +496,32 @@ namespace hdf5
     if (this->ptr_curr_element == NULL) return -1;

     std::string str_val = "";
     xmlChar *c_val = NULL;
     std::string str_type = "";
     xmlChar *c_type = NULL;

     xmlChar *dset_name = NULL;
     dset_name = xmlTextReaderGetAttribute(this->xmlreader, (const xmlChar *)LayoutXML::ATTR_ELEMENT_NAME.c_str());
     //LOG4CXX_DEBUG(log, "  new_dataset: " << dset_name );
+printf("LayoutXML::new_dataset: dset_name=%p\n", dset_name);
     if (dset_name == NULL) return -1;
+printf("LayoutXML::new_dataset: dset_name=%s\n", dset_name);

     std::string str_dset_name((char*)dset_name);
     xmlGetGlobalState()->xmlFree(dset_name);
     Group *parent = (Group *)this->ptr_curr_element;
+printf("LayoutXML::new_dataset: parent=%p\n", parent);
     Dataset *dset = NULL;
+printf("LayoutXML::new_dataset: calling new_dset str_dset_name=%s\n", str_dset_name.c_str());
     dset = parent->new_dset(str_dset_name);
+printf("LayoutXML::new_dataset: new_dset str_dset_name returned OK\n");
     if (dset == NULL) return -1;

     // Check if this dataset has been set as the default
     bool detector_default = false;
     xmlChar *attr_def = NULL;
     std::string str_attr_def;
     attr_def = xmlTextReaderGetAttribute(this->xmlreader, (const xmlChar*)LayoutXML::ATTR_SRC_DET_DEFAULT.c_str());
     if (attr_def != NULL){
       str_attr_def = (char*)attr_def;
       xmlGetGlobalState()->xmlFree(attr_def);

This is the layout XML file:

<hdf5_layout>
  <group name="detector">
    <dataset name="data1" source="detector" det_default="true">
        <attribute name="AcquireTime" source="ndattribute" ndattribute="AcquireTime" when="OnFileClose"></attribute>
        <attribute name="NumCaptured" source="ndattribute" ndattribute="NumCaptured" when="OnFileOpen"/>
   </dataset>
    <dataset name="det_name" source="constant" value="Simulated detector" type="string" when="OnFileOpen">
        <attribute name="NumCaptured" source="ndattribute" ndattribute="NumCaptured" when="OnFileClose"></attribute>
   </dataset>
   <dataset name="NumCaptured" source="ndattribute" ndattribute="NumCaptured"/>
  </group>
</hdf5_layout>

If I change line 6 in the XML file from from </dataset> to <dataset> it crashes when reading the XML file with this output:

Group::name_exist: calling this->datasets.count(detector)
Group::name_exist: this=0x7fb91c010e30
Group::name_exist: this->groups.size=0
Group::name_exist: calling this->groups.count(detector)
Group::name_exist: returning false
LayoutXML::new_dataset: dset_name=0x7fb91c010de0
LayoutXML::new_dataset: dset_name=data1
LayoutXML::new_dataset: parent=0x7fb91c010f20
LayoutXML::new_dataset: calling new_dset str_dset_name=data1
Group::new_dset: Calling name_exist(data1)
Group::name_exist: calling this->datasets.count(data1)
Group::name_exist: this=0x7fb91c010f20
Group::name_exist: this->groups.size=0
Group::name_exist: calling this->groups.count(data1)
Group::name_exist: returning false
Group::new_dset: name_exist(data1) = 0
Group::name_exist: calling this->datasets.count(data1)
Group::name_exist: this=0x7fb91c010f20
Group::name_exist: this->groups.size=0
Group::name_exist: calling this->groups.count(data1)
Group::name_exist: returning false
Group::new_dset: creating Dataset data1
LayoutXML::new_dataset: new_dset str_dset_name returned OK
LayoutXML::new_dataset: dset_name=0x7fb91c010de0
LayoutXML::new_dataset: dset_name=det_name
LayoutXML::new_dataset: parent=0x7fb91c011080
LayoutXML::new_dataset: calling new_dset str_dset_name=det_name
Group::new_dset: Calling name_exist(det_name)
Group::name_exist: calling this->datasets.count(det_name)
Group::name_exist: this=0x7fb91c011080
Group::name_exist: this->groups.size=469831544
Group::name_exist: calling this->groups.count(det_name)
Segmentation fault (core dumped)

This appears to be a very hard error to detect and guard against in the C++ code. Note that all of the pointers being returned are non-NULL. In the final call to Group::name_exist this is a non-NULL pointer. However, this->groups appears to be garbage because this->groups.size() returns a nonsensical number. It then crashes when this->groups.count() is called.

My conclusion is that it is very difficult to detect XML errors in the C++ code. The correct solution is to run xml lint on all XML files before using them.

The error introduced in this XML file was correctly detected by xmllint.

corvette:simDetectorIOC/iocBoot/iocSimDetector>xmllint --noout --schema /home/epics/devel/areaDetector/ADCore/XML_schema/hdf5_xml_layout_schema.xsd hdf5_layout_test.xml
hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group
  </group>
          ^
hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout
</hdf5_layout>
              ^
hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2
hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1
corvette:simDetectorIOC/iocBoot/iocSimDetector>

In fact the error is detected even without using the schema file:

corvette:simDetectorIOC/iocBoot/iocSimDetector>xmllint --noout hdf5_layout_test.xml
hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group
  </group>
          ^
hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout
</hdf5_layout>
              ^
hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2
hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1

@prjemian
Copy link
Contributor

prjemian commented Sep 20, 2020 via email

@aps-7bm
Copy link

aps-7bm commented Sep 20, 2020 via email

@MarkRivers
Copy link
Member Author

Can xmllint be called from the c++ code?

Not unless there is an API for it. The HDF5 file writer runs on vxWorks, Windows, etc.

@prjemian
Copy link
Contributor

prjemian commented Sep 20, 2020 via email

@MarkRivers
Copy link
Member Author

In fact it does attempt to validate the XML file, and that is where it is crashing! I added some debugging printf calls to NDFileHDF5LayoutXML.cpp LayoutXML::verify_xml, and that is where it is crashing:

LayoutXML::verify_xml entry
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
LayoutXML::verify_xml calling process_node
Segmentation fault (core dumped)

It appears to me that before it loops over process_node it needs to do a more basic validation to make sure it is valid XML syntax.

The code in asynNDArrayDriver.cpp that reads the detector attributes XML file does do that. If I create an invalid detector attribute XML via the following edit that removes the final \:

-    <Attribute name="AcquireTime"         type="EPICS_PV" source="$(CAMERA)AcquireTime"         dbrtype="DBR_NATIVE"  description="Camera acquire time"/>
+    <Attribute name="AcquireTime"         type="EPICS_PV" source="$(CAMERA)AcquireTime"         dbrtype="DBR_NATIVE"  description="Camera acquire time">

then when that file is read I get the following error messages on the IOC:

reboot_restore: done with file 'auto_settings.sav'

noname.xml:25: parser error : expected '>'
</Attributes>
           ^
noname.xml:26: parser error : Premature end of data in tag Attributes line 3

^
2020/09/20 11:15:42.094 asynNDArrayDriver:readNDAttributesFile: error creating doc

The simDetector OPI screen clearly shows the error to the user in red:
image

I will attempt to do the same thing in the HDF5 XML file reader.

@MarkRivers
Copy link
Member Author

I have fixed the problem. The DLS folks who wrote the HDF5 XML parser were probably misled by the documentation for the XML functions they were using. They are calling xmlReaderForMemory() and xmlReaderForFile(). The documentation for those says:

Create an xmltextReader for an XML in-memory document. The parsing flags @options are a combination of xmlParserOption.
parse an XML file from the filesystem or the network. The parsing flags @options are a combination of xmlParserOption.

They are passing 0 for the options flag. It seems that these functions do not do any basic syntax checking of the XML text, at least not when options=0.

I added code before these calls to call xmlParseMemory() or xmlParseFile(). Those functions do perform basic syntax checking of the XML file, and return a NULL xmlDocPtr if the syntax is invalid.

Now when I attempt to read the invalid XML file I get the following errors at the IOC console:

hdf5_layout_test.xml:11: parser error : Opening and ending tag mismatch: dataset line 6 and group
  </group>
          ^
hdf5_layout_test.xml:12: parser error : Opening and ending tag mismatch: dataset line 3 and hdf5_layout
</hdf5_layout>
              ^
hdf5_layout_test.xml:13: parser error : Premature end of data in tag group line 2

^
hdf5_layout_test.xml:13: parser error : Premature end of data in tag hdf5_layout line 1

^
2020/09/20 11:58:48.527 NDFileHDF5::verifyLayoutXMLFile XML description file parser error.
2020/09/20 11:58:48.527 13SIM1:HDF1:XMLFileName devAsynOctet::writeIt failed NDFileHDF5:writeOctet: status=3, function=145, value=hdf5_layout_test.xml

The OPI screen also shows an error:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants