Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing of XML content with input fields > 512kb fails #551

Closed
jodue opened this issue Feb 6, 2017 · 5 comments
Closed

Parsing of XML content with input fields > 512kb fails #551

jodue opened this issue Feb 6, 2017 · 5 comments

Comments

@jodue
Copy link
Contributor

@jodue jodue commented Feb 6, 2017

I have test xml data with a binary resource which is > 512kb. When HAPI-FHIR tries to parse this an exception is thrown because the utilized XML woodstox library has 512kb as a default max size for a single attribute.

This issue has already been fixed some time ago but now this happens again because there is a check if the WstxInputFactory is used that does not work correctly any more:

XMLUtil.java line 1618:
if (inputFactory instanceof com.ctc.wstx.stax.WstxInputFactory)
This does evaluate to false as inputFactory is wrapped "__redirected.__XMLInputFactory@614ab92" but the real factory object is indeed "actual: WstxInputFactory"

jamesagnew added a commit that referenced this issue Feb 6, 2017
@jamesagnew
Copy link
Owner

@jamesagnew jamesagnew commented Feb 6, 2017

Hi @jodue , would you be able to share more information about your setup?

This is working fine on my end (and I've committed a unit test just to be sure). I'm wondering if you have a different version of Woodstox on your classpath or something like that?

@jodue
Copy link
Contributor Author

@jodue jodue commented Feb 7, 2017

Hi James,

thanks for looking into this! Woodstox is added as a maven dependency by HAPI-FHIR, woodstox-core-asl-4.4.1.jar so this should be correct i guess.

We are running HAPI_FHIR in an JBoss EAP 7 which seems to somehow wrap this factory class which is why the call to XMLInputFactory.newInstance(); returns this "__redirected.__XMLInputFactory". As far as i could find out this should affect all JBoss versions and not only EAP 7.

An easy solution to this would be to remove the instanceof check as adding the two properties to other implementations would not really have any downside.

Since the HAPI-FHIR build currently seems to be broken i am unable to fix/test this but if you wish i could look into this when the build is running again!

Thank you!
Johannes

@jamesagnew
Copy link
Owner

@jamesagnew jamesagnew commented Feb 7, 2017

Hi Johannes,

Unfortunately we can't set those properties blindly.. we used to do exactly that but this caused issues on Glassfish, where a non-woodstox StAX provider is used by default (SJSXP).. on that platform having unrecognized properties caused the parser to refuse to work.

I wonder if there is any way other than the instanceof test that could be used to detect that the parser is woodstox. It might be worth asking on one of the JBoss list I suppose, I'm not really sure why they are using a repackaged woodstox in the first place.

One option might be something like:

boolean isWoodstox = inputFactory.getProperty("org.codehaus.stax2.implVersion") != null;

You'd want to wrap that in a try-catch, but that could work.

Ps, build is passing again!

@jodue
Copy link
Contributor Author

@jodue jodue commented Feb 7, 2017

Hi James,

that is unfortunate. One would think that it would just ignore unknown properties ...

I now built HAPI-FHIR locally with the proposed fix (see attached diff) and it works as expected. I can also send a pull request if you like!

Thank you!
woodstox.txt

jamesagnew added a commit that referenced this issue Mar 17, 2017
Pull request for issue #551 to fix parsing of input fields > 512kb with JBoss
@jamesagnew
Copy link
Owner

@jamesagnew jamesagnew commented Mar 17, 2017

PR merged in.

@jamesagnew jamesagnew closed this Mar 17, 2017
jamesagnew added a commit that referenced this issue Mar 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.