This repository has been archived by the owner on Jan 29, 2020. It is now read-only.

Merge d1fbe03 into c7d68ca
GeeH committed Nov 5, 2015
2 parents c7d68ca + d1fbe03 commit 3ff5a72
Showing 9 changed files with 609 additions and 0 deletions.
21 changes: 21 additions & 0 deletions doc/book/zend.escaper.configuration.md
@@ -0,0 +1,21 @@
# Configuring Zend\\Escaper

`Zend\Escaper\Escaper` has only one configuration option available, and that is the encoding to be
used by the Escaper object.

The default encoding is **utf-8**. Other supported encodings are:

- iso-8859-1
- iso-8859-5
- iso-8859-15
- cp866, ibm866, 866
- cp1251, windows-1251
- cp1252, windows-1252
- koi8-r, koi8-ru
- big5, big5-hkscs, 950, gb2312, 936
- shift\_jis, sjis, sjis-win, cp932
- eucjp, eucjp-win
- macroman

If an unsupported encoding is passed to `Zend\Escaper\Escaper`, a
`Zend\Escaper\Exception\InvalidArgumentException` will be thrown.
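
The check described above can be pictured with a small standalone sketch. It does not use `Zend\Escaper` itself: `SUPPORTED_ENCODINGS` and `assertSupportedEncoding()` are hypothetical names, the built-in `InvalidArgumentException` stands in for the library's own exception class, and the case-insensitive comparison is an assumption rather than documented behaviour.

```php
<?php
// Hypothetical helper, not part of Zend\Escaper: it only mirrors the documented
// idea of rejecting encodings that are not on the supported list.
const SUPPORTED_ENCODINGS = [
    'utf-8', 'iso-8859-1', 'iso-8859-5', 'iso-8859-15',
    'cp866', 'ibm866', '866', 'cp1251', 'windows-1251',
    'cp1252', 'windows-1252', 'koi8-r', 'koi8-ru',
    'big5', 'big5-hkscs', '950', 'gb2312', '936',
    'shift_jis', 'sjis', 'sjis-win', 'cp932',
    'eucjp', 'eucjp-win', 'macroman',
];

function assertSupportedEncoding(string $encoding): string
{
    // Assumption: encoding names are compared case-insensitively.
    $normalized = strtolower(trim($encoding));

    if (!in_array($normalized, SUPPORTED_ENCODINGS, true)) {
        throw new InvalidArgumentException("Unsupported encoding: {$encoding}");
    }

    return $normalized;
}

echo assertSupportedEncoding('UTF-8'), PHP_EOL; // utf-8
```

Failing early on an unknown encoding is preferable to silently falling back to a default, because escaping with the wrong encoding can leave output exploitable.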
72 changes: 72 additions & 0 deletions doc/book/zend.escaper.escaping-css.md
@@ -0,0 +1,72 @@
# Escaping Cascading Style Sheets

CSS escaping is similar to [Javascript escaping](zend.escaper.escaping-javascript.md), and it is
needed for the same reasons. It leaves only basic alphanumeric characters untouched and converts all
other characters into valid CSS hexadecimal escapes.
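
As a rough illustration of that rule, the sketch below escapes every non-alphanumeric character as a backslash, the uppercase hexadecimal code point, and a terminating space. `sketchEscapeCss()` and `utf8CodePoint()` are hypothetical helpers written for this page; they are not the library's implementation, and they assume the input is valid UTF-8.

```php
<?php
// Illustrative sketch only; both functions are hypothetical helpers.

/** Decode a single, valid UTF-8 character into its Unicode code point. */
function utf8CodePoint(string $char): int
{
    $bytes = array_values(unpack('C*', $char));

    switch (count($bytes)) {
        case 1:
            return $bytes[0];
        case 2:
            return (($bytes[0] & 0x1F) << 6) | ($bytes[1] & 0x3F);
        case 3:
            return (($bytes[0] & 0x0F) << 12) | (($bytes[1] & 0x3F) << 6) | ($bytes[2] & 0x3F);
        default:
            return (($bytes[0] & 0x07) << 18) | (($bytes[1] & 0x3F) << 12)
                | (($bytes[2] & 0x3F) << 6) | ($bytes[3] & 0x3F);
    }
}

function sketchEscapeCss(string $css): string
{
    $escaped = preg_replace_callback(
        '/[^a-zA-Z0-9]/u',
        static function (array $match): string {
            // Hex code point followed by a space, which terminates the CSS escape.
            return sprintf('\\%X ', utf8CodePoint($match[0]));
        },
        $css
    );

    if ($escaped === null) {
        // With the /u modifier, PCRE fails on input that is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return $escaped;
}

echo sketchEscapeCss("body {\n"), PHP_EOL; // body\20 \7B \A
```

The trailing space after each escape matters: without it, a following character that happens to be a hexadecimal digit would be read as part of the escape sequence.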

## Examples of Bad CSS Escaping

In most cases, developers forget to escape CSS altogether:

```php
<?php header('Content-Type: application/xhtml+xml; charset=UTF-8'); ?>
<!DOCTYPE html>
<?php
$input = <<<INPUT
body {
background-image: url('http://example.com/foo.jpg?</style><script>alert(1)</script>');
}
INPUT;
?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Unescaped CSS</title>
<meta charset="UTF-8"/>
<style>
<?php echo $input; ?>
</style>
</head>
<body>
<p>User controlled CSS needs to be properly escaped!</p>
</body>
</html>
```

In the above example, because the user-provided CSS is not escaped, an attacker can execute an XSS
attack fairly easily.

## Examples of Good CSS Escaping

By using the `escapeCss` method in the CSS context, such attacks can be prevented:

```php
<?php header('Content-Type: application/xhtml+xml; charset=UTF-8'); ?>
<!DOCTYPE html>
<?php
$input = <<<INPUT
body {
background-image: url('http://example.com/foo.jpg?</style><script>alert(1)</script>');
}
INPUT;
$escaper = new Zend\Escaper\Escaper('utf-8');
$output = $escaper->escapeCss($input);
?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Escaped CSS</title>
<meta charset="UTF-8"/>
<style>
<?php
// output will look something like
// body\20 \7B \A \20 \20 \20 \20 background\2D image\3A \20 url\28 ...
echo $output;
?>
</style>
</head>
<body>
<p>User controlled CSS needs to be properly escaped!</p>
</body>
</html>
```

By properly escaping user-controlled CSS, we can prevent XSS attacks in our web applications.
121 changes: 121 additions & 0 deletions doc/book/zend.escaper.escaping-html-attributes.md
@@ -0,0 +1,121 @@
# Escaping HTML Attributes

Escaping data in the **HTML Attribute context** is most often done incorrectly, if not overlooked
completely, by developers. Regular [HTML escaping](zend.escaper.escaping-html.md) can be used for
escaping HTML attributes, *but* only if the attribute value can be **guaranteed as being properly
quoted**! To avoid confusion, we recommend always using the HTML Attribute escaper method in the
HTML Attribute context.

To escape data in the HTML Attribute context, use `Zend\Escaper\Escaper`'s `escapeHtmlAttr` method.
Internally it will convert the data to UTF-8, check for its validity, and escape an extended set of
characters that are not covered by `htmlspecialchars`, in order to cover the cases where an
attribute might be unquoted or quoted illegally.
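
To give a feel for what an "extended set of characters" means, here is a deliberately simplified sketch. `sketchEscapeHtmlAttr()` is a hypothetical helper, not the library's code: it assumes that everything except ASCII letters, digits, and `, . - _` becomes a hexadecimal character reference, it leaves non-ASCII bytes untouched for brevity, and it makes no attempt to handle characters that have no valid representation in HTML, such as most control characters.

```php
<?php
// Illustrative sketch only; sketchEscapeHtmlAttr() is a hypothetical helper.
function sketchEscapeHtmlAttr(string $value): string
{
    // Bytes 0x80-0xFF are excluded from matching, so multibyte UTF-8
    // sequences pass through unchanged in this simplified version.
    return (string) preg_replace_callback(
        '/[^a-zA-Z0-9,._\-\x80-\xFF]/',
        static function (array $match): string {
            return sprintf('&#x%02X;', ord($match[0]));
        },
        $value
    );
}

echo sketchEscapeHtmlAttr('faketitle onmouseover=alert(/ZF2!/);'), PHP_EOL;
// faketitle&#x20;onmouseover&#x3D;alert&#x28;&#x2F;ZF2&#x21;&#x2F;&#x29;&#x3B;
```

Because spaces, equals signs, and quotes are all turned into character references, the value cannot terminate the attribute even when the attribute is unquoted.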

## Examples of Bad HTML Attribute Escaping

An example of incorrect HTML attribute escaping:

```php
<?php header('Content-Type: text/html; charset=UTF-8'); ?>
<!DOCTYPE html>
<?php
$input = <<<INPUT
' onmouseover='alert(/ZF2!/);
INPUT;
/**
* NOTE: ENT_COMPAT was the default flag of htmlspecialchars() before PHP 8.1.
* It is passed explicitly here so the example behaves the same on newer versions.
*/
$output = htmlspecialchars($input, ENT_COMPAT);
?>
<html>
<head>
<title>Single Quoted Attribute</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div>
<?php
// the span tag will look like:
// <span title='' onmouseover='alert(/ZF2!/);'>
?>
<span title='<?php echo $output ?>'>
What framework are you using?
</span>
</div>
</body>
</html>
```

In the above example, the `ENT_COMPAT` flag (the default before PHP 8.1) is used. It does not escape
single quotes, so an alert box pops up when the `onmouseover` event fires on the `span` element.
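
The difference between the two flags can be checked in isolation with plain `htmlspecialchars()` calls. The snippet below is only a demonstration of the built-in function's behaviour; the variable names are arbitrary.

```php
<?php
$input = "' onmouseover='alert(/ZF2!/);";

// ENT_COMPAT converts double quotes only, so the single quotes survive.
$compat = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES converts both double and single quotes.
$quotes = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $compat, PHP_EOL; // ' onmouseover='alert(/ZF2!/);
echo $quotes, PHP_EOL; // &#039; onmouseover=&#039;alert(/ZF2!/);
```

Even with `ENT_QUOTES`, the result is only safe inside a quoted attribute.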

Another example of incorrect HTML attribute escaping can happen when unquoted attributes are used,
which is, by the way, perfectly valid HTML5:

```php
<?php header('Content-Type: text/html; charset=UTF-8'); ?>
<!DOCTYPE html>
<?php
$input = <<<INPUT
faketitle onmouseover=alert(/ZF2!/);
INPUT;
// Tough luck using proper flags when the title attribute is unquoted!
$output = htmlspecialchars($input, ENT_QUOTES);
?>
<html>
<head>
<title>Quoteless Attribute</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div>
<?php
// the span tag will look like:
// <span title=faketitle onmouseover=alert(/ZF2!/);>
?>
<span title=<?php echo $output ?>>
What framework are you using?
</span>
</div>
</body>
</html>
```

The above example shows how easy it is to break out of unquoted attributes in HTML5.
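
A quick way to confirm this, using only the built-in function, is to compare the input with the escaped output: the payload contains none of the characters `htmlspecialchars()` converts, so it passes through unchanged.

```php
<?php
$input = 'faketitle onmouseover=alert(/ZF2!/);';

// No <, >, &, " or ' appears in the payload, so nothing is converted.
$output = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

var_dump($output === $input); // bool(true)
```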

## Examples of Good HTML Attribute Escaping

Both of the previous attacks can be prevented by using the `escapeHtmlAttr` method:

```php
<?php header('Content-Type: text/html; charset=UTF-8'); ?>
<!DOCTYPE html>
<?php
$input = <<<INPUT
faketitle onmouseover=alert(/ZF2!/);
INPUT;
$escaper = new Zend\Escaper\Escaper('utf-8');
$output = $escaper->escapeHtmlAttr($input);
?>
<html>
<head>
<title>Quoteless Attribute</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div>
<?php
// the span tag will look like:
// <span title=faketitle&#x20;onmouseover&#x3D;alert&#x28;&#x2F;ZF2&#x21;&#x2F;&#x29;&#x3B;>
?>
<span title=<?php echo $output ?>>
What framework are you using?
</span>
</div>
</body>
</html>
```

In the above example, the malicious input from the attacker becomes completely harmless as we used
proper HTML attribute escaping!
73 changes: 73 additions & 0 deletions doc/book/zend.escaper.escaping-html.md
@@ -0,0 +1,73 @@
# Escaping HTML

Probably the most common escaping happens in the **HTML Body context**. There are very few
characters with special meaning in this context, yet it is quite common to escape data incorrectly,
namely by setting the wrong flags and character encoding.

For escaping data in the HTML Body context, use `Zend\Escaper\Escaper`'s `escapeHtml` method.
Internally it uses PHP's `htmlspecialchars`, and additionally correctly sets the flags and encoding.

```php
<?php
// outputting this without escaping would be a bad idea!
$input = '<script>alert("zf2")</script>';

$escaper = new Zend\Escaper\Escaper('utf-8');
?>
<!-- somewhere in an HTML template -->
<div class="user-provided-input">
<?php
echo $escaper->escapeHtml($input); // all safe!
?>
</div>
```

One thing a developer needs to pay special attention to is the encoding in which the document is
served to the client: it **must be the same** as the encoding used for escaping!
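
One way to see how sensitive escaping is to the declared encoding is to feed the built-in `htmlspecialchars()` bytes that do not match it. This is only a demonstration of PHP's own function, not of `Zend\Escaper`; the variable names are arbitrary.

```php
<?php
// "café" encoded as ISO-8859-1: the lone byte 0xE9 is not valid UTF-8.
$latin1 = "caf\xE9";

// Declared as UTF-8, the input is invalid and an empty string comes back
// (no ENT_SUBSTITUTE or ENT_IGNORE flag is set here).
$asUtf8 = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

// Declared with its real encoding, the same bytes are accepted unchanged.
$asLatin1 = htmlspecialchars($latin1, ENT_QUOTES, 'ISO-8859-1');

var_dump($asUtf8);               // string(0) ""
var_dump($asLatin1 === $latin1); // bool(true)
```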

## Examples of Bad HTML Escaping

An example of incorrect usage:

```php
<?php
$input = '<script>alert("zf2")</script>';
$escaper = new Zend\Escaper\Escaper('utf-8');
?>
<?php header('Content-Type: text/html; charset=ISO-8859-1'); ?>
<!DOCTYPE html>
<html>
<head>
<title>Encodings set incorrectly!</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head>
<body>
<?php
// Bad! The escaper's and the document's encodings are different!
echo $escaper->escapeHtml($input);
?>
</body>
</html>
```

## Examples of Good HTML Escaping

An example of correct usage:

```php
<?php
$input = '<script>alert("zf2")</script>';
$escaper = new Zend\Escaper\Escaper('utf-8');
?>
<?php header('Content-Type: text/html; charset=UTF-8'); ?>
<!DOCTYPE html>
<html>
<head>
<title>Encodings set correctly!</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<?php
// Good! The escaper's and the document's encodings are the same!
echo $escaper->escapeHtml($input);
?>
</body>
</html>
```
87 changes: 87 additions & 0 deletions doc/book/zend.escaper.escaping-javascript.md
@@ -0,0 +1,87 @@
# Escaping Javascript

Javascript string literals in HTML are subject to significant restrictions, particularly due to the
potential for unquoted attributes and the uncertainty as to whether Javascript will be viewed as
being CDATA or PCDATA by the browser. To eliminate any possible XSS vulnerabilities, Javascript
escaping for HTML extends the escaping rules of both ECMAScript and JSON to include any potentially
dangerous character. Much like HTML attribute value escaping, this means escaping everything
except basic alphanumeric characters and the comma, period, and underscore characters as hexadecimal
or Unicode escapes.

Javascript escaping applies to all literal strings and digits. It is not possible to safely escape
other Javascript markup.

To escape data in the **Javascript context**, use `Zend\Escaper\Escaper`'s `escapeJs` method. An
extended set of characters is escaped beyond ECMAScript's rules for Javascript literal string
escaping, in order to prevent misinterpretation of Javascript as HTML, which could lead to the
injection of special characters and entities.
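
The rule described above can be sketched in a few lines. `sketchEscapeJs()` is a hypothetical helper, not the library's implementation: it assumes valid UTF-8 input, writes single-byte characters as `\xHH`, and borrows `json_encode()` to produce `\uXXXX` escapes for multibyte characters.

```php
<?php
// Illustrative sketch only; sketchEscapeJs() is a hypothetical helper.
function sketchEscapeJs(string $value): string
{
    $escaped = preg_replace_callback(
        '/[^a-zA-Z0-9,._]/u',
        static function (array $match): string {
            $char = $match[0];

            if (strlen($char) === 1) {
                // Single-byte (ASCII) character: two-digit hexadecimal escape.
                return sprintf('\\x%02X', ord($char));
            }

            // Multibyte character: json_encode() yields "\uXXXX" (or a
            // surrogate pair); strip the surrounding double quotes.
            return trim(json_encode($char), '"');
        },
        $value
    );

    if ($escaped === null) {
        // With the /u modifier, PCRE fails on input that is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return $escaped;
}

echo sketchEscapeJs('bar&quot;; alert(1)'), PHP_EOL;
// bar\x26quot\x3B\x3B\x20alert\x281\x29
```

Since `&`, `;`, quotes, and angle brackets never appear literally in the output, neither an HTML parser nor a Javascript parser can mistake the data for markup or code.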

## Examples of Bad Javascript Escaping

An example of incorrect Javascript escaping:

```php
<?php header('Content-Type: application/xhtml+xml; charset=UTF-8'); ?>
<!DOCTYPE html>
<?php
$input = <<<INPUT
bar&quot;; alert(&quot;Meow!&quot;); var xss=&quot;true
INPUT;
$output = json_encode($input);
?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Unescaped Entities</title>
<meta charset="UTF-8"/>
<script type="text/javascript">
<?php
// this will result in
// var foo = "bar&quot;; alert(&quot;Meow!&quot;); var xss=&quot;true";
?>
var foo = <?php echo $output ?>;
</script>
</head>
<body>
<p>json_encode() is not good for escaping javascript!</p>
</body>
</html>
```

The above example will show an alert popup box as soon as the page is loaded, because the data is
not properly escaped for the Javascript context.
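
The reason can be seen by inspecting what `json_encode()` returns for this input when called without extra flags: the payload contains nothing it escapes, so every `&quot;` entity is emitted verbatim, and a page served as XHTML decodes those entities into real double quotes inside the script. The snippet uses only the built-in function; the variable names are arbitrary.

```php
<?php
$input = 'bar&quot;; alert(&quot;Meow!&quot;); var xss=&quot;true';

// Without extra flags, json_encode() leaves "&" and ";" untouched,
// so the output is just the input wrapped in double quotes.
$encoded = json_encode($input);

echo $encoded, PHP_EOL;
// "bar&quot;; alert(&quot;Meow!&quot;); var xss=&quot;true"
```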

## Examples of Good Javascript Escaping

By using the `escapeJs` method in the Javascript context, such attacks can be prevented:

```php
<?php header('Content-Type: text/html; charset=UTF-8'); ?>
<!DOCTYPE html>
<?php
$input = <<<INPUT
bar&quot;; alert(&quot;Meow!&quot;); var xss=&quot;true
INPUT;
$escaper = new Zend\Escaper\Escaper('utf-8');
$output = $escaper->escapeJs($input);
?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Escaped Entities</title>
<meta charset="UTF-8"/>
<script type="text/javascript">
<?php
// this will look like
// var foo = bar\x26quot\x3B\x3B\x20alert\x28\x26quot\x3BMeow\x21\x26quot\x3B\x29\x3B\x20var\x20xss\x3D\x26quot\x3Btrue;
?>
var foo = <?php echo $output ?>;
</script>
</head>
<body>
<p>Zend\Escaper\Escaper::escapeJs() is good for escaping javascript!</p>
</body>
</html>
```

In the above example, the Javascript parser will most likely report a `SyntaxError`, but at least
the targeted application remains safe from such attacks.