regex format validation fails on non-python regexes #20

Closed
sirosen opened this issue Oct 31, 2021 · 0 comments
Labels
enhancement New feature or request

sirosen commented Oct 31, 2021

Format validation fails on "regex" fields that use syntax which is valid in other regex engines but not in Python.

In the originating use-case for this issue, (?<foo>) was used as a named capture group. The trouble is that Python spells named groups as (?P<foo>...) and only accepts (?< as the start of a lookbehind assertion, (?<= or (?<!, so re.compile rejects the pattern. This meaning of (?< is not common to all regex engines. In general, (? ... ) expressions are an area of great variation between engines.
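
The difference can be reproduced with the standard library alone. The following is a minimal sketch, not code from check-jsonschema; the helper name compiles_in_python is made up for illustration, and the patterns are taken from the renovate.json reported in this issue.

```python
import re


def compiles_in_python(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


# Named group written in the ECMA 262 / JavaScript style, as in renovate.json.
js_style = r"\nminimum_pre_commit_version: (?<currentValue>.*?)\n"
# The same pattern with Python's spelling of a named group.
py_style = r"\nminimum_pre_commit_version: (?P<currentValue>.*?)\n"

for candidate in (js_style, py_style):
    print(repr(candidate), compiles_in_python(candidate))
```

Only the (?P<currentValue>...) form compiles, which is why the "regex" format check rejects the Renovate pattern.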

I'm not sure how best to handle this case. jsonschema validates the "regex" format by passing the pattern to Python's re.compile, but Python's regex syntax is not strictly compatible with ECMA 262 regex syntax (which is the point of reference for JSON Schema).

JSON Schema itself notes that strict adherence to ECMA 262 is not necessarily feasible for all implementations, and recommends that schema authors use a safe subset of regex syntaxes.

A couple of years ago, jsonschema worked on supporting JS syntax, but it had to be backed out due to issues. The resulting js-regex package appears to be abandoned.

Roughly, I see a few options:

  • try to support this syntax with customized regex validation (HARD)
  • wrap the jsonschema regex format validator to look for (? followed by any character other than ! or =, and disable the check in those cases only
  • remove "regex" from the values of format which are supported/checked by check-jsonschema
  • add a flag to disable certain formats, e.g. --disable-formats "regex,date"
  • expect users encountering this issue to disable format checks altogether
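
The second option above could look roughly like the sketch below. It is an assumption about how such a wrapper might be written, not how check-jsonschema or jsonschema implement it: the names _ENGINE_SPECIFIC and lenient_regex_check are hypothetical, and wiring the function into a format checker is left out.

```python
import re

# "(?" followed by any character other than "!" or "=", as described in the
# list above. Such constructs are treated as engine-specific.
_ENGINE_SPECIFIC = re.compile(r"\(\?[^!=]")


def lenient_regex_check(pattern: str) -> bool:
    """Hypothetical wrapper: skip the check for engine-specific groups."""
    if _ENGINE_SPECIFIC.search(pattern):
        # The check is disabled for this pattern, so it is reported as valid.
        return True
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


print(lenient_regex_check(r"\n\s*entry: (?<depName>[^:]+):(?<currentValue>\S+)"))
print(lenient_regex_check("(unclosed"))
```

The trade-off is that any pattern containing such a construct goes unchecked, including ones that are broken in every engine, and an escaped \(\? in a pattern would also trigger the skip.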

Originating Comment

I've tested the new version and it broke our pre-commit schema validation:

- repo: https://github.com/sirosen/check-jsonschema
  rev: 0.6.0
  hooks:
  - id: check-jsonschema
    name: Validate Renovate
    files: ^\.github/renovate\.json
    types:
    - json
    args:
    - --schemafile
    - https://docs.renovatebot.com/renovate-schema.json

renovate.json:

{
  "extends": [
    "config:base"
  ],
  "regexManagers": [
    {
      "fileMatch": [
        "(^|/)\\.pre-commit-config\\.yaml$"
      ],
      "matchStrings": [
        "\\nminimum_pre_commit_version: (?<currentValue>.*?)\\n"
      ],
      "depNameTemplate": "pre-commit",
      "datasourceTemplate": "pypi"
    },
    {
      "fileMatch": [
        "(^|/)\\.pre-commit-config\\.yaml$"
      ],
      "matchStrings": [
        "\\n\\s*entry: (?<depName>[^:]+):(?<currentValue>\\S+)"
      ],
      "datasourceTemplate": "docker"
    }
  ]
}

Error:

Schema validation errors were encountered.
  .github/renovate.json::regexManagers.0.matchStrings.0: '\\nminimum_pre_commit_version: (?<currentValue>.*?)\\n' is not a 'regex'

Originally posted by @dudicoco in #19 (comment)

sirosen added the enhancement (New feature or request) label on Nov 10, 2021