Fix: Allow Most chars for variable Values #27
@@ -77,5 +100,6 @@ public static void assertPropertyIdentifiers( Collection<String> properties) thr
}

private static final Pattern identifierRegex_ = Pattern.compile( "[\\w\\-]+");
private static final Pattern varValueRegex_ = Pattern.compile( "[^\\p{Cntrl}]*");
This avoids characters such as newline, escape, and bell. Those could be encoded in XML 1.1, but the SAX parser fails to recognize them.
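To make the effect of `varValueRegex_` concrete, here is a minimal sketch that applies the same expression to a few sample values. The class name `VarValueRegexDemo` and the helper `isAllowed` are hypothetical and not part of the PR. It assumes Java's default meaning of `\p{Cntrl}` (the ASCII control characters `0x00`-`0x1F` and `0x7F`), which is what applies when `UNICODE_CHARACTER_CLASS` is not set.

```java
import java.util.regex.Pattern;

public class VarValueRegexDemo {
    // Same expression as varValueRegex_ in the hunk above.
    static final Pattern VAR_VALUE = Pattern.compile("[^\\p{Cntrl}]*");

    // Hypothetical helper: true if the whole value contains no ASCII control character.
    static boolean isAllowed(String value) {
        return VAR_VALUE.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Accepted: empty string, a single space, XML metacharacters, non-ASCII letters.
        System.out.println(isAllowed(""));              // true
        System.out.println(isAllowed(" "));             // true
        System.out.println(isAllowed("a<b & \"c\""));   // true
        System.out.println(isAllowed("größe"));         // true

        // Rejected: newline, escape, bell, and also tab.
        System.out.println(isAllowed("line1\nline2"));  // false
        System.out.println(isAllowed("\u001b[0m"));     // false
        System.out.println(isAllowed("ding\u0007"));    // false
        System.out.println(isAllowed("a\tb"));          // false
    }
}
```

Note that with this expression a tab is rejected as well, since it is a control character, while an ordinary space is accepted.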
@@ -194,8 +202,7 @@ public String toIdPath( String attributeName, String attributeValue) throws SAXE
*/
public String getAttribute( Attributes attributes, String attributeName)
{
String value = attributes.getValue( attributeName);
return StringUtils.isBlank( value)? null : value;
A single whitespace character seems like an important boundary test value to me, for testing various functions.
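The difference at stake can be sketched without any library. The class and method names below are hypothetical. `isBlank` is a hand-written stand-in for the documented behaviour of `StringUtils.isBlank` (null, empty, or whitespace only), and the sketch assumes the intent of the change is to stop mapping blank values to `null`.

```java
public class BlankAttributeDemo {
    // Stand-in for StringUtils.isBlank: null, empty, or whitespace-only.
    static boolean isBlank(String s) {
        if (s == null) {
            return true;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Blank values are reported as if the attribute were missing.
    static String blankAsNull(String value) {
        return isBlank(value) ? null : value;
    }

    // Only a truly absent attribute (null) is reported as missing.
    static String asIs(String value) {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(blankAsNull(" ") == null);  // true: the single space is lost
        System.out.println(" ".equals(asIs(" ")));     // true: the single space survives
        System.out.println(asIs(null) == null);        // true: absent stays absent
    }
}
```

With the first variant, a test case value of a single space can never be read back from a test document; with the second it can.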
@@ -125,7 +128,7 @@ public void writeAttribute( String name, String value)
print( " ");
print( name);
print( "=\"");
print( value);
print( StringEscapeUtils.escapeXml10(value));
I'm not sure about all the differences between XML 1.0 and 1.1. This answer suggests going with 1.0 when in doubt: https://stackoverflow.com/questions/6260975/should-i-learn-xml-1-0-or-xml-1-1
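To show why printing the raw value is a problem once most characters are allowed, here is a hand-rolled sketch that escapes only the five predefined XML entities. It is not the Commons `StringEscapeUtils` implementation, and `AttributeEscapeDemo`, `escapeAttribute`, and `attribute` are hypothetical names used for illustration.

```java
public class AttributeEscapeDemo {
    // Replaces the five characters that have predefined entities in XML.
    static String escapeAttribute(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds the same text that writeAttribute prints: space, name, and quoted value.
    static String attribute(String name, String value) {
        return " " + name + "=\"" + escapeAttribute(value) + "\"";
    }

    public static void main(String[] args) {
        // Without escaping, the embedded quote would end the attribute early.
        System.out.println(attribute("value", "a\"b<c"));
    }
}
```

Here `attribute("value", "a\"b<c")` yields ` value="a&quot;b&lt;c"`, which a parser reads back as the original string.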
8790e78 to c8a1cb9
c8a1cb9 to 296203c
These changes seem basically OK, but I don't see any new unit tests to verify them.
Yes, new tests need to be added. My free time is almost over, but I can work on adding tests next week.
fd3aa83 to f041e0a
I have added more tests and an equivalent change to SystemInputDefReader. I also allowed whitespace and some additional control characters, since they can be represented in XML 1.1. However, some control characters cannot be parsed by the JDK parser (newline, I think), so users would need to put a more XML 1.1-compliant parser on the classpath, for example in Maven by simply adding:
This replaces the SAXParser implementation automatically. I added it to the tests manually and saw that additional control character sequences can be parsed that way, but I did not add tests for that, since tcases probably should not depend on Xerces. For other identifiers, such as System and function names, I now think it would be preferable to allow international character sets, but that's more of a nuisance than a blocker, so it's up to you. For properties, I wish I could use some more characters like
f041e0a to 2d685f8
I agree; see new issue #30. This has to be done carefully, but I will investigate soon.
For now, perhaps prefixing with something like
@@ -125,7 +128,7 @@ public void writeAttribute( String name, String value)
print( " ");
print( name);
print( "=\"");
print( value);
print( NumericEntityEscaper.below(0x20).translate(StringEscapeUtils.escapeXml11(value)));
Should add a comment to explain this translation. For example, why are both NumericEntityEscaper and StringEscapeUtils required?
I added a comment
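One plausible reason for the second translation step (an interpretation, not something confirmed in this thread): XML attribute-value normalization turns a literal tab, newline, or carriage return inside an attribute into a space, whereas the same character written as a numeric character reference survives parsing. As far as I can tell, `escapeXml11` leaves those three characters as literals, so escaping everything below `0x20` numerically would preserve them. The sketch below demonstrates only the parser behaviour, using the JDK's built-in DOM parser. `AttributeNormalizationDemo` and `parsedValue` are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AttributeNormalizationDemo {
    // Parses <Var value="..."/> and returns the attribute value as the parser reports it.
    static String parsedValue(String attributeText) {
        try {
            String xml = "<Var value=\"" + attributeText + "\"/>";
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("value");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // A literal tab or newline is normalized to a single space.
        System.out.println(parsedValue("a\tb").equals("a b"));       // true
        System.out.println(parsedValue("a\nb").equals("a b"));       // true

        // The same characters written as numeric references are preserved.
        System.out.println(parsedValue("a&#9;b").equals("a\tb"));    // true
        System.out.println(parsedValue("a&#10;b").equals("a\nb"));   // true
    }
}
```

If that reading is right, a value containing a tab would silently come back with a space after a write/read round trip unless the writer emits `&#9;`.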
033e8bc to 4aee214
Fix #25: allow any string except null as a var value. XML unescaping is already done by the SAXParser, so no code change is needed there. But allowing blank values required relaxing SystemTestDocReader. Identifiers could also be relaxed that way in the future (I see no reason beyond Western-centric aesthetics to restrict them to ASCII Java word chars), but that is not so urgent for me; programmers should use ASCII for actual identifiers.