New: Add rule to check markup validity
Fix #28
Close #333
qzhou1607-zz authored and alrra committed Jul 13, 2017
1 parent b554bba commit 08f36db
Showing 7 changed files with 417 additions and 2 deletions.
1 change: 1 addition & 0 deletions .sonarrc
@@ -15,6 +15,7 @@
"disallowed-headers": "warning",
"disown-opener": "warning",
"highest-available-document-mode": "warning",
"html-checker": "warning",
"manifest-exists": "warning",
"manifest-file-extension": "warning",
"manifest-is-valid": "warning",
68 changes: 68 additions & 0 deletions docs/user-guide/rules/html-checker.md
@@ -0,0 +1,68 @@
# The Nu HTML Test (`html-checker`)

`html-checker` validates the markup of a website against the [Nu HTML checker](https://validator.github.io/validator/).

## Why is this important?

> Serving valid HTML is commonly overlooked these days.
By running the HTML documents through a checker, it's easier to catch
unintended mistakes which might have otherwise been missed.
Adhering to the W3C's standards has a lot to offer to both
developers and web users: it provides better browser compatibility,
helps to avoid potential problems with accessibility/usability, and makes future maintenance easier.
>
> The Nu Html Checker (v.Nu) serves as the backend of [checker.html5.org](https://checker.html5.org/),
[html5.validator.nu](https://html5.validator.nu), and [validator.w3.org/nu](https://validator.w3.org/nu/).
It also provides a [web service interface](https://github.com/validator/validator/wiki/Service-%C2%BB-HTTP-interface).
This rule interacts with this service via [html-validator](https://www.npmjs.com/package/html-validator),
and is able to test both remote websites and local server instances.

## What does the rule check?

According to the Nu Html Checker [documentation](https://validator.w3.org/nu/about.html), the problems it reports fall into two categories:

* Markup that can cause problems for accessibility, usability,
interoperability, security, or maintainability, that can result in poor performance,
or that might cause your scripts to fail in ways that are hard to troubleshoot.

* Markup that is defined as an error because it can cause
problems in HTML parsing and error-handling behavior, so that, say, you'd end up with some unintuitive, unexpected result in the DOM.

For the rationale behind those requirements, please check out:

* [rationale for syntax-level errors](https://www.w3.org/TR/html/introduction.html#syntax-errors)
* [rationale for restrictions on content models and on attribute values](https://www.w3.org/TR/html/introduction.html#restrictions-on-content-models-and-on-attribute-values)

## Can the rule be configured?

You can ignore certain errors/warnings by setting the `ignore` option for the `html-checker` rule.
You can pass in either a string or an array that contains all the messages to be ignored.

For example, the following configuration ignores the errors/warnings whose message is `Invalid attribute`:

```json
"html-checker": ["error", {
    "ignore": "Invalid attribute"
}]
```

Alternatively, you can pass in an array if you have more than one message to ignore:

```json
"html-checker": ["error", {
    "ignore": ["Invalid attribute", "Invalid tag"]
}]
```
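
The following is a minimal sketch of how the two accepted shapes of `ignore` could be normalized and applied. It mirrors the filtering in this commit's rule source, where a message is dropped only if its text matches an ignored string exactly. The names `CheckerMessage`, `normalizeIgnore`, and `filterMessages` are hypothetical and exist only for illustration.

```typescript
// Hypothetical, reduced shape of a checker message; only `message` matters here.
interface CheckerMessage {
    message: string;
}

// Turn the `ignore` option (string, array, or missing) into an array of strings.
const normalizeIgnore = (ignore?: string | string[]): string[] => {
    if (typeof ignore === 'undefined') {
        return [];
    }

    return Array.isArray(ignore) ? ignore : [ignore];
};

// Keep only the messages whose text is not in the ignore list (exact match).
const filterMessages = (messages: CheckerMessage[], ignore?: string | string[]): CheckerMessage[] => {
    const ignored = normalizeIgnore(ignore);

    return messages.filter((item) => {
        return ignored.indexOf(item.message) === -1;
    });
};

const sampleMessages: CheckerMessage[] = [
    { message: 'Invalid attribute' },
    { message: 'Invalid tag' },
    { message: 'Something else' }
];

// Only 'Something else' survives when both other messages are ignored.
console.log(filterMessages(sampleMessages, ['Invalid attribute', 'Invalid tag']).map((item) => {
    return item.message;
}));
```

Because the comparison is an exact string match, a partial message such as `Invalid` would not suppress `Invalid attribute` in this sketch.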

You can also override the default validator by passing in the endpoint of an alternative validator. However, you need to make sure that this alternative validator exposes the same REST interface as the default one.

```json
"html-checker": ["error", {
    "validator": "https://html5.validator.nu"
}]
```
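
As a rough illustration of how the `validator` option resolves, the sketch below falls back to the default endpoint used in this commit (`https://validator.w3.org/nu/`) and checks the value against the same `^(http|https)://` pattern that the rule's schema declares. The helper name `resolveValidator`, the `HtmlCheckerOptions` interface, and the thrown error are assumptions made for this example; in the rule itself the pattern is enforced by the schema rather than by code like this.

```typescript
// Default endpoint and URL pattern, as they appear in the rule source of this commit.
const DEFAULT_VALIDATOR = 'https://validator.w3.org/nu/';
const VALIDATOR_PATTERN = /^(http|https):\/\//;

// Hypothetical, reduced view of the rule options.
interface HtmlCheckerOptions {
    validator?: string;
}

// Pick the configured endpoint, or the default when none is given.
const resolveValidator = (options?: HtmlCheckerOptions): string => {
    const validator = (options && options.validator) || DEFAULT_VALIDATOR;

    // Assumption: reject anything that is not an http(s) URL, like the schema would.
    if (!VALIDATOR_PATTERN.test(validator)) {
        throw new Error(`Invalid validator endpoint: ${validator}`);
    }

    return validator;
};

console.log(resolveValidator({ validator: 'https://html5.validator.nu' }));
console.log(resolveValidator());
```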

## Further Reading

* [Why Validate Using the Nu Html Checker?](https://validator.w3.org/nu/about.html)
* [The Nu Html Checker Wiki](https://github.com/validator/validator/wiki)
1 change: 1 addition & 0 deletions docs/user-guide/rules/index.md
@@ -15,6 +15,7 @@

* [`content-type`](content-type.md)
* [`highest-available-document-mode`](highest-available-document-mode.md)
* [`html-checker`](html-checker.md)
* [`no-friendly-error-pages`](no-friendly-error-pages.md)

## Performance
1 change: 1 addition & 0 deletions package.json
@@ -27,6 +27,7 @@
"file-url": "^2.0.2",
"globby": "^6.1.0",
"handlebars": "^4.0.10",
"html-validator": "^2.2.1",
"iconv-lite": "^0.4.17",
"inquirer": "^3.0.6",
"is-ci": "^1.0.10",
4 changes: 2 additions & 2 deletions src/lib/rule-context.ts
@@ -88,7 +88,7 @@ export class RuleContext {
}

/** Reports a problem with the resource. */
public async report(resource: string, element: IAsyncHTMLElement, message: string, content?: string, location?: IProblemLocation, severity?: Severity): Promise<void> { //eslint-disable-line require-await
public async report(resource: string, element: IAsyncHTMLElement, message: string, content?: string, location?: IProblemLocation, severity?: Severity, codeSnippet?: string): Promise<void> { //eslint-disable-line require-await
let position: IProblemLocation = location;
let sourceCode: string = null;

@@ -107,7 +107,7 @@ export class RuleContext {
this.sonar.report(
this.id,
severity || this.severity,
sourceCode,
codeSnippet || sourceCode,
position,
message,
resource
155 changes: 155 additions & 0 deletions src/lib/rules/html-checker/html-checker.ts
@@ -0,0 +1,155 @@
/**
 * @fileoverview Validating html using `the Nu HTML checker`;
 * https://validator.w3.org/nu/
 */

// ------------------------------------------------------------------------------
// Requirements
// ------------------------------------------------------------------------------

import { debug as d } from '../../utils/debug';
import { RuleContext } from '../../rule-context'; // eslint-disable-line no-unused-vars
import { IRule, IRuleBuilder, ITargetFetchEnd, IScanEnd, IProblemLocation, Severity } from '../../types'; // eslint-disable-line no-unused-vars

const debug: debug.IDebugger = d(__filename);

// ------------------------------------------------------------------------------
// Public
// ------------------------------------------------------------------------------

const rule: IRuleBuilder = {
    create(context: RuleContext): IRule {
        /** The promise that represents the scan by HTML checker. */
        let htmlCheckerPromise: Promise<any>;
        /** Array of strings that need to be ignored from the checker result. */
        let ignoredMessages;
        /** The options to pass to the HTML checker. */
        const scanOptions = {
            data: '',
            format: 'json',
            validator: ''
        };

        type HtmlError = { // eslint-disable-line no-unused-vars
            extract: string, // code snippet
            firstColumn: number,
            lastLine: number,
            hiliteStart: number,
            message: string,
            subType: string
        };

        const loadRuleConfig = () => {
            const ignore = (context.ruleOptions && context.ruleOptions.ignore) || [];
            const validator = (context.ruleOptions && context.ruleOptions.validator) || 'https://validator.w3.org/nu/';

            scanOptions.validator = validator;

            // Up to now, the `ignore` setting in `html-validator` only works if `format` is set to `text`
            // So we implement `ignore` in our code rather than pass it to `scanOptions`
            // TODO: Pass `ignore` once this issue (https://github.com/zrrrzzt/html-validator/issues/58) is solved.
            ignoredMessages = Array.isArray(ignore) ? ignore : [ignore];
        };

        // Filter out ignored messages
        const filter = (messages) => {
            return messages.filter((message) => {
                return !ignoredMessages.includes(message.message);
            });
        };

        const locateAndReport = (resource: string) => {
            return (messageItem: HtmlError): Promise<void> => {
                const position: IProblemLocation = {
                    column: messageItem.firstColumn,
                    elementColumn: messageItem.hiliteStart + 1,
                    elementLine: 1, // We will pass in the single-line code snippet generated from the HTML checker, so the elementLine is always 1
                    line: messageItem.lastLine
                };

                return context.report(resource, null, messageItem.message, null, position, Severity[messageItem.subType], messageItem.extract);
            };
        };

        const start = (data: ITargetFetchEnd) => {
            const { response } = data;

            /* HACK: Need to do a require here in order to be capable of mocking
               when testing the rule and `import` doesn't work here. */
            const htmlChecker = require('html-validator');

            scanOptions.data = response.body.content;
            htmlCheckerPromise = htmlChecker(scanOptions);
        };

        const end = async (data: IScanEnd) => {
            const { resource } = data;
            const locateAndReportByResource = locateAndReport(resource);
            let result;

            if (!htmlCheckerPromise) {
                return;
            }

            debug(`Waiting for HTML checker results for ${resource}`);

            try {
                result = await htmlCheckerPromise;
            } catch (e) {
                debug(`Error getting HTML checker result for ${resource}.`, e);
                await context.report(resource, null, `Couldn't get results from HTML checker for ${resource}. Error: ${e}`);

                return;
            }

            debug(`Received HTML checker results for ${resource}`);

            const filteredMessages: Array<HtmlError> = filter(result.messages);
            const reportPromises: Array<Promise<void>> = filteredMessages.map((messageItem: HtmlError): Promise<void> => {
                return locateAndReportByResource(messageItem);
            });

            try {
                await Promise.all(reportPromises);
            } catch (e) {
                debug(`Error reporting the HTML checker results.`, e);

                return;
            }
        };

        loadRuleConfig();

        return {
            'scan::end': end,
            'targetfetch::end': start
        };
    },
    meta: {
        docs: {
            category: 'Interoperability',
            description: `Validate HTML using 'the Nu HTML checker'`
        },
        fixable: 'code',
        recommended: true,
        schema: [{
            properties: {
                ignore: {
                    anyOf: [
                        {
                            items: { type: 'string' },
                            type: 'array'
                        }, { type: 'string' }
                    ]
                },
                validator: {
                    pattern: '^(http|https)://',
                    type: 'string'
                }
            }
        }],
        worksWithLocalFiles: true
    }
};

module.exports = rule;
