'example.de' (and many others hostnames) not valid with hostname format #26

Closed
FrederikP opened this issue Sep 12, 2018 · 6 comments

@FrederikP
Contributor

Aaand another one. Hope you aren't getting annoyed. Running this example throws a JsonSchemaException:

import fastjsonschema

schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://example.com/example.schema.json",
    "title": "Example",
    "description": "An example schema",
    "type": "object",
    "properties": {
        "host": {
            "type": "string",
            "description": "Some hostname",
            "format": "hostname"
        }
    }
}

validate = fastjsonschema.compile(schema)

validate({"host": "example.de"})

For 'google.com' it works fine. I'll create a PR soon, probably.

@FrederikP
Contributor Author

FrederikP commented Sep 12, 2018

This is the current regex:

^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]{1,62}[A-Za-z0-9])$

As far as I can see, this means the last part can never be exactly 2 characters long: it is either a single character or at least 3. This is a huge problem, as there are many TLDs with 2 characters (.de, .fr, etc.).

In the PR I'll probably use a different regex. But, as I said in the other issue today, I don't think regexes are a good solution for formats that can easily be validated using the Python standard library (other than re).
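
To make the point concrete, here is a minimal sketch that runs the regex quoted above against a few hostnames using only Python's `re` module. The helper name `matches_hostname` is made up for this illustration and is not part of fastjsonschema.

```python
import re

# The hostname regex quoted above, copied verbatim and split over two lines.
HOSTNAME_RE = re.compile(
    r"^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
    r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]{1,62}[A-Za-z0-9])$"
)


def matches_hostname(value):
    """Return True if the whole string matches the quoted regex (hypothetical helper)."""
    return HOSTNAME_RE.match(value) is not None


# The final label must be 1 character or at least 3, so 2-character TLDs fail.
for host in ("google.com", "a.b", "example.de", "example.fr"):
    print(host, matches_hostname(host))
```

The last alternative requires a first character, one to 62 middle characters, and a final character, which is where the minimum of 3 comes from.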

@horejsek
Owner

I agree, some formats would be better validated by an extra library than by regexps. Also, some regexps are not entirely correct in some cases. But a library brings dependencies, which I want to avoid unless it's really necessary. :-)

But maybe, if the license allows it, I could copy some code into this library to keep an eye on performance and avoid problems with changing dependencies. I'll put it on my todo list to check the possibilities.

@FrederikP
Contributor Author

> I agree, some formats would be better validated by an extra library than by regexps. Also, some regexps are not entirely correct in some cases. But a library brings dependencies, which I want to avoid unless it's really necessary. :-)

In some cases, like ipv4 and ipv6, you could use something like the ipaddress module, which is part of the standard library and doesn't require additional dependencies.
For some other formats you might need external dependencies, and I understand that you'd like to avoid them. You could start with the formats that are covered by the Python standard library.
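
As a rough sketch of what that could look like, the standard-library `ipaddress` module can check ipv4 and ipv6 strings without any regex. The function names `is_ipv4` and `is_ipv6` are hypothetical, and this is not how fastjsonschema implements these formats; what `ipaddress` accepts may also differ in edge cases from what the JSON Schema formats require.

```python
import ipaddress


def is_ipv4(value):
    """Hypothetical helper: True if value is a string that parses as an IPv4 address."""
    if not isinstance(value, str):
        return False  # the constructors also accept ints, which we don't want here
    try:
        ipaddress.IPv4Address(value)
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False
    return True


def is_ipv6(value):
    """Hypothetical helper: True if value is a string that parses as an IPv6 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv6Address(value)
    except ValueError:
        return False
    return True


print(is_ipv4("192.168.0.1"), is_ipv4("256.1.1.1"))
print(is_ipv6("::1"), is_ipv6("example.de"))
```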

horejsek added a commit that referenced this issue Sep 14, 2018
Switched to different hostname regex. Fixes #26
@FrederikP
Contributor Author

@horejsek Thanks for merging! I would be very happy if you could do a bugfix release with this in the near future, because this issue keeps us from rolling out a component that we switched from jsonschema to your library (because of draft 7 support, performance, and because I like it better :) )

@horejsek
Owner

Done, v2.3 :-)

@horejsek
Owner

Glad it helped your component! :-)
