Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved hostname validation. #1143

Merged
merged 7 commits into from Feb 17, 2020
Merged

Improved hostname validation. #1143

merged 7 commits into from Feb 17, 2020

Conversation

sambauers
Copy link
Contributor

@sambauers sambauers commented Jan 8, 2020

Allow trailing dot. Check for octet count rather than string length.

Trailing dots in hostnames are valid, and also highly desirable in some contexts (e.g. DNS zone files).

Also, currently the validator is checking for string length of hostname, rather than the octet count. This is an edge case as few hostnames reach the limit, but technically incorrect. The fix is fairly straightforward and included here.

I have included tests for the length check, but the check for a domain that is too long is failing, as it appears that the validate function that is constructed for the tests bypasses the defined function hostname() in lib/compile/formats.js and instead simply uses native regex test() function. I couldn't troubleshoot this as it is all obfuscated in the generator in lib/dotjs/validate.js. I have left the failing test in here as it appears to be an issue with AJV testing itself, rather than the implementation.

What issue does this pull request resolve?
None, sorry. I didn't realise this was a requirement of submitting a pull request.

What changes did you make?
I updated the format: hostname regex to allow for trailing dots in hostnames. I also updated the hostname() function. Both in lib/compile/formats.js.

Is there anything that requires more attention while reviewing?
The failing test that I mentioned. The validation test it generated for me was:

function(data, dataPath, parentData, parentDataProperty, rootData) {
  'use strict';
  var vErrors = null;
  var errors = 0;

  if (typeof data === "number") { }
  if (errors === 0) {  
    if (typeof data === "string") {
      if (!formats.hostname.test(data) ) {  
        validate.errors = [ { keyword: 'format' ,
                              dataPath: (dataPath || '') + "" ,
                              schemaPath: '#/format' ,
                              params: { format:  'hostname'  }  ,
                              message: 'should match format "hostname"'  } ];
        return false;  
      }    
    }  
  }  
  validate.errors = vErrors;
  return errors === 0;
}

…which as you can see only validates the regex using: if (!formats.hostname.test(data) ) { - maybe I just put my test cases in the wrong place, or maybe this is something more fundamental. I would have thought it should somehow call the hostname() function in lib/compile/formats.js instead.

spec/tests/rules/format.json Outdated Show resolved Hide resolved
@sambauers
Copy link
Contributor Author

@sambauers sambauers commented Jan 9, 2020

For anyone wanting to implement this now, it can be achieved with something like the following in your own implementation:

const AJV = require('ajv');
const ajv = new AJV()

ajv.addFormat('hostname', (str) => {
  const HOSTNAME = /^(?=.{1,253}\.?$)[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?(?:\.[a-z0-9](?:[-0-9a-z]{0,61}[0-9a-z])?)*\.?$/i;
  return HOSTNAME.test(str);
});

@epoberezkin
Copy link
Member

@epoberezkin epoberezkin commented Jan 18, 2020

@sambauers Sam - thanks for the PR. Having issue is not a requirement, just a general recommendation in most projects to avoid spending time implementing changes that may be rejected (for whatever reason :). So no problem.

I trust you that trailing dots in hostnames are valid, but it would help accepting if you maybe could point to a specific section of the relevant RFC to confirm it?

Re the implementation itself. Ajv has two modes of format validation - "fast" (default) and "full" (more comprehensive and slower) - it's managed by the option format. hostname function is only used in "full" mode, HOSTNAME RegExp is used in "fast" mode - see the lines 32 and 64 in formats.js. The tests in .json format test Ajv in both modes, so if the test created to pass only in "full" mode, will fail in "fast" mode.

Is it possible to modify HOSTNAME to pass this test as well, without complicating it too much? If not, then this particular test can be moved into code - please create a new file formats.spec.js in https://github.com/epoberezkin/ajv/tree/master/spec folder and follow the conventions in other test files.

lib/compile/formats.js Outdated Show resolved Hide resolved
@sambauers
Copy link
Contributor Author

@sambauers sambauers commented Jan 22, 2020

I'll see what I can do to get this up to scratch.

@sambauers
Copy link
Contributor Author

@sambauers sambauers commented Jan 22, 2020

This is the best explanation of the octet versus length issue I have found:
https://devblogs.microsoft.com/oldnewthing/?p=7873

@sambauers
Copy link
Contributor Author

@sambauers sambauers commented Jan 22, 2020

The relevant RFC is 1034 (section 3.1) here:
https://www.ietf.org/rfc/rfc1034.txt

When a user needs to type a domain name, the length of each label is
omitted and the labels are separated by dots ("."). Since a complete
domain name ends with the root label, this leads to a printed form which
ends in a dot. We use this property to distinguish between:

  • a character string which represents a complete domain name
    (often called "absolute"). For example, "poneria.ISI.EDU."

  • a character string that represents the starting labels of a
    domain name which is incomplete, and should be completed by
    local software using knowledge of the local domain (often
    called "relative"). For example, "poneria" used in the
    ISI.EDU domain.

@sambauers
Copy link
Contributor Author

@sambauers sambauers commented Jan 22, 2020

You can see many trailing dots on domain names in the zone file example here:
https://en.wikipedia.org/wiki/Zone_file

@sambauers
Copy link
Contributor Author

@sambauers sambauers commented Jan 22, 2020

I have used a positive lookahead in the regex pattern for hostname to do the octet (character) count. The tests I originally wrote and added to the other hostname tests now pass. They are triggered by https://github.com/epoberezkin/ajv/blob/master/spec/schema-tests.spec.js#L33 I think, so no need for a new formats.spec.js file I think?

@epoberezkin
Copy link
Member

@epoberezkin epoberezkin commented Feb 16, 2020

@sambauers If all that hostname now does is testing against the format, then the function can be just removed and replaced with the same regular expression in line 73.

@epoberezkin epoberezkin merged commit 75aa5fd into ajv-validator:master Feb 17, 2020
2 checks passed
@epoberezkin
Copy link
Member

@epoberezkin epoberezkin commented Feb 17, 2020

Thank you @sambauers

@epoberezkin
Copy link
Member

@epoberezkin epoberezkin commented Feb 17, 2020

I’ll release all pending changes next weekend

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

2 participants