Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removed quote conversion to entity inside content #25

Merged
merged 7 commits into from
Feb 12, 2014
Merged

Removed quote conversion to entity inside content #25

merged 7 commits into from
Feb 12, 2014

Conversation

goetas
Copy link
Member

@goetas goetas commented Feb 3, 2014

Encoding quotes inside element contents may be not necessary

Encoding quotes inside element contents may be not necessary
@technosophos
Copy link
Member

This commit broke the unit tests, so I can't merge.

I'm also deferring to @mattfarina to decide on this one.

@mattfarina
Copy link
Member

First, this looks like it would be a good change. ENT_QUOTES is for a single quote '. Escaping HTML in the html5 spec doesn't do this.

To be merged the tests need to be updated in https://github.com/Masterminds/html5-php/blob/master/test/HTML5/Serializer/OutputRulesTest.php#L191. The actual test failures can be seen at https://travis-ci.org/Masterminds/html5-php/jobs/18130032.

@goetas can you clean up the tests as well?

@goetas
Copy link
Member Author

goetas commented Feb 4, 2014

I have added some tests.
I have added also some tests for my previous PR. Can it be merged anyway?

@@ -245,7 +245,7 @@ protected function nl() {
* The encoded text.
*/
protected function enc($text) {
$flags = ENT_QUOTES;
$flags = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to make an API addition here. In attribute mode a " needs to be turned into &quot;. When not in attribute mode the < and > need to be converted. The details are at http://www.w3.org/TR/2013/CR-html5-20130806/syntax.html#escapingString.

This would mean when in attribute mode we need to use ENT_COMPAT. When and avoid encoding < and >.

If we going to fix this let's get it right. I'm the one the broke it so I'd like to see it right on the second go. We could either use a method for each mode or an argument. Because of escaping issues around < and > I'm inclined to add a new method.

What do you think? I'm happy to let you write it or I can tackle this problem if you like.

Sorry it took a second go of looking at this to realize this change. It was reading your tests that caused me to go back and re-read the spec. Testing FTW!

@mattfarina
Copy link
Member

Also, I'm OK merging in your new tests. I'd like @technosophos to take a quick read of those tests first.

@goetas
Copy link
Member Author

goetas commented Feb 4, 2014

I have added an optional parameter to enc method

@goetas
Copy link
Member Author

goetas commented Feb 4, 2014

(The indentation with two white-spaces is so strange)

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 7713a8b on goetas:patch-2 into * on Masterminds:master*.

@technosophos
Copy link
Member

I'm okay with the changes here. This brings us into closer conformance to the spec.

The history of the two-space indent can be traced, basically, back to the Drupal coding guidelines, which Masterminds adopted before PSR-1/2 came out.

@mattfarina
Copy link
Member

@goetas thanks. I see it's failing in php 5.3. can you take a look at it... https://travis-ci.org/Masterminds/html5-php/jobs/18208034

@goetas
Copy link
Member Author

goetas commented Feb 5, 2014

@mattfarina
This is caused by$ret = strtr($text, \HTML5\Serializer\HTML5Entities::$map); substitution on php5.3 environment.

A solution can be somethig like this?

$cusotmMap = \HTML5\Serializer\HTML5Entities::$map;
if (!($cusotmMap & ENT_QUOTES)){
     unset($cusotmMap["'"]);
}
$ret = strtr($text, \HTML5\Serializer\HTML5Entities::$map);

@goetas
Copy link
Member Author

goetas commented Feb 9, 2014

News? what can i do?

@@ -163,7 +163,7 @@ protected function attrs($ele) {
$len = $map->length;
for ($i = 0; $i < $len; ++$i) {
$node = $map->item($i);
$val = $this->enc($node->value);
$val = $this->enc($node->value, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update true to be TRUE? For booleans values we use a coding style where these re uppercase.

@mattfarina
Copy link
Member

@goetas thanks for working on this. I was mostly busy for a couple days.

There are a couple other changes I'd like to see.

  1. The header commend block to the enc() needs a @param description in phpdoc format for the new argument. A reference to the the encoding portion of the spec would be useful.
  2. When not in attribute mode this should not encode < and >. This is per the spec on encoding. I'm not sure of the best way to accomplish this at the moment.

@goetas
Copy link
Member Author

goetas commented Feb 10, 2014

@mattfarina \HTML5\Serializer\HTML5Entities::$map contains ' as entity, but htmlentities with ENT_HTML5 do not converts it... so tests on php 5.3 will fail.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 1a89c6b on goetas:patch-2 into * on Masterminds:master*.

if (!$this->encode) {
return htmlspecialchars($text, $flags, 'UTF-8');
if (!$this->encode && $attribute) {
return strtr($text, array('"'=>'&quot;','&'=>'&amp;'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the array can you add some space. '"' => '&quot;', '&' => '&amp;'
This will make it more readable.

$tests = array(
'&\'<>"' => '&amp;&#039;&lt;&gt;&quot;',
'This + is. a < test' => 'This + is. a &lt; test',
function getEncDataAttssribute(){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the name getEncDataAttssribute a misspelling for the method name? Is this method needed? I didn't see where it's referenced.

@mattfarina
Copy link
Member

@goetas Thanks for keeping this moving along. I "think" this last batch of comments cover everything and will lead to passing tests too.

The only thing we have not done (and doesn't work right now anyway) is the encoding of &nbsp. I think that's OK for now but reserve the right to change my mind when someone files a bug later.

@goetas
Copy link
Member Author

goetas commented Feb 11, 2014

@mattfarina sorry for codestyle mistakes and thank you for your patience!
I'm trying to handle nbsp with \xc2\xa0 charcode, as html_entity_decode("&nbsp;", ENT_QUOTES, 'UTF-8') == '\xa0'.

@technosophos
Copy link
Member

Sorry... I did NOT fix this issue. I just made a mistake in a commit message.

@goetas Feel free to also add your info to the CREDITS file as a contributor.

mattfarina added a commit that referenced this pull request Feb 12, 2014
Removed quote conversion to entity inside content
@mattfarina mattfarina merged commit e21281f into Masterminds:master Feb 12, 2014
@mattfarina
Copy link
Member

@goetas I already added you to the CREDITS file. Let me know if that info should be changed.

@goetas
Copy link
Member Author

goetas commented Feb 12, 2014

@mattfarina thank you for adding me in credits file

@goetas goetas deleted the patch-2 branch February 12, 2014 06:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants