Fix for output_encoding, utf8 #12

Closed
dLuna wants to merge 1 commit

Contributor

@dLuna dLuna commented Jul 27, 2012

When using utf8 output_encoding, the old code would crash for a
CDATA which was not flush with its surrounding tags.

There are many more uses of ++ in this module, and I don't know the code well enough to tell reliably whether some of them should also be replaced with a version that works on both binaries and lists.
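
For illustration, a minimal sketch of what a concatenation that works on both binaries and lists could look like. The ++ operator only accepts lists, so it raises badarg when the character data arrives as binaries, which is the case with utf8 output. The module name concat_sketch and the function concat/2 are hypothetical and are not taken from erlsom or from this commit.

```erlang
-module(concat_sketch).
-export([concat/2]).

%% Hypothetical helper (not part of erlsom): joins two pieces of
%% character data that are either both binaries or both lists.
concat(A, B) when is_binary(A), is_binary(B) ->
    %% Binaries cannot be joined with ++; use bit syntax instead.
    <<A/binary, B/binary>>;
concat(A, B) when is_list(A), is_list(B) ->
    %% Plain strings (lists) keep the original behaviour.
    A ++ B.
```

With this sketch, concat_sketch:concat(<<"\n">>, <<"Testing">>) returns <<"\nTesting">>, and concat_sketch:concat("\n", "Testing") returns "\nTesting". Mixed arguments (one list, one binary) are deliberately left unhandled and fail with function_clause.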

Feedback and comments very welcome.

@willemdj
Owner

Hi,

I don't have much time right now, but I'll look into it.

Can you explain what you mean by "CDATA which was not flush with its surrounding tags"? Or provide an example?

Regards,
Willem

@dLuna
Contributor Author

dLuna commented Aug 1, 2012

The following XSD and XML files work if you put the <![CDATA[ immediately after <whatnot>, but not as written in the example below, where a newline separates the two.

Save these examples as example2.xsd and example2.xml and run

erlsom:scan(element(2, file:read_file("example2.xml")), element(2, erlsom:compile_xsd_file("example2.xsd")), [{output_encoding, utf8}]).

and you will get the crash shown below. Remove [{output_encoding, utf8}] and it works.

It is entirely possible that the bug is in erlsom_sax_utf8.erl instead. A comment on line 862 makes me suspect that is the case, but I don't understand the code base well enough to fix it there.

** exception throw: {'EXIT',
                     {error,
                      [{exception,
                        {badarg,
                         [{erlang,'++',[<<"\n">>,<<"Testing">>],[]},
                          {lists,append,2,[{file,"lists.erl"},{line,63}]},
                          {erlsom_parse,stateMachine,2,
                           [{file,"src/erlsom_parse.erl"},{line,652}]},
                          {erlsom_parse,xml2StructCallback,2,
                           [{file,"src/erlsom_parse.erl"},{line,299}]},
                          {erlsom_sax_utf8,wrapCallback,2,
                           [{file,"src/erlsom_sax_utf8.erl"},{line,1364}]},
                          {erlsom_sax_utf8,parseContentLT,2,
                           [{file,"src/erlsom_sax_utf8.erl"},{line,864}]},
                          {erlsom_sax_utf8,parse,2,
                           [{file,"src/erlsom_sax_utf8.erl"},{line,196}]},
                          {erlsom,scan2,3,
                           [{file,"src/erlsom.erl"},{line,211}]}]}},
                       {stack,[{'#PCDATA',char,<<"\n">>},'top-type']},
                       {received,{characters,<<"Testing">>}}]}}
     in function  erlsom:scan2/3 (src/erlsom.erl, line 215)

The XSD file:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="whatnot-type">
    <xs:restriction base="xs:string" />
  </xs:simpleType>
  <xs:complexType name="top-type">
    <xs:all>
      <xs:element name="whatnot" type="whatnot-type"></xs:element>
    </xs:all>
  </xs:complexType>
  <xs:element name="top" type="top-type" />
</xs:schema>

The XML file:

<top><whatnot>
<![CDATA[Testing]]></whatnot></top>

@willemdj
Owner

Thanks, I merged it to master.

@willemdj willemdj closed this Aug 12, 2012