Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define the validation configuration options #224

Open
javagl opened this issue Oct 4, 2022 · 5 comments
Open

Define the validation configuration options #224

javagl opened this issue Oct 4, 2022 · 5 comments

Comments

@javagl
Copy link
Contributor

javagl commented Oct 4, 2022

There may be different options or settings that affect the validation process in various ways. These options are currently drafted in the ValidationOptions.ts class. For now, this only contains a flag validateContentData, which determines whether tile content data is resolved and validated. But this is really just a "placeholder", to show the intended usage. There is not even a way to programmatically set this flag...


The comment in the ValidationOptions.ts already mentions some other possible configuration options:

  • validateContentData - Whether tile content data should be validated (as mentioned above)
  • validateContentTypes - To explicitly list the content types, like ["B3DM", "GLB"]
  • validateExternalTilesets - Whether external tilesets should be validated
  • validateExternalSchema - Whether metadata schemas from a schemaUri should be validated
  • validateExternalResources - To enable validation of anything external in general
  • validateAbsoluteUris - Whether the validator should try to resolve https://example.com/data.glb

The desired granularity, structure and representation of these options is certainly not obvious. For example, validateContentTypes would be a more fine-grained version of validateContentData, as in

  • validateContentData=true corresponding to validateContentTypes=[B3DM, I3DM, CMPT, PNTS, GLB, GLTF]
  • validateContentData=false corresponding to validateContentTypes=[]

And one could generalize that arbitrarily - up to validateStuffWithThisGlobPattern="*.{glb,b3dm}" ...

From the "consuming" API side, most of this could probably boil down to a function like shouldValidate(uri) : boolean. This function could internally check a flag, or check the file extension to be contained in the list, or check a glob pattern. But the question about how these options will be configured, programmatically or via the command line, still has to be answered.


Another dimension of possible configuration options refers to the questions of

  • Which issues should be ignored?
  • Should it be possible to define certain issues to either be a WARNING or an ERROR?
  • ...

In the worst case, this might affect the whole validation process itself.

The current approach or idea is to always perform all validations and report all issues. If somebody wants to "ignore" certain issues, then this can be achieved by filtering the ValidationResult. For example, as in result.removeAll(severity=WARNING).

But the question about the degrees of freedom for the configuration is still non-trivial. For eample, the approach of filtering the ValidationResult does not allow the user to ignore issues depending on the context. For example, someone might want to ignore BOUNDING_VOLUME_INCONSISTENT errors when they refer to bounding spheres, but not when they refer to bounding regions. This fine-grained configurability would not be possible now, and not trivial to implement generically.


A "combination" of the questions raised above is: How should the validator handle resources that can not be resolved? When performing the validation on the local file system, with a (supposedly!) complete tileset, then it should probably be possible to resolve all resources, and when one of them can not be resolved, it should be an ERROR. In other cases, a non-resolvable resource might only be considered to be a WARNING, and not necessarily cause the validation to fail.

@javagl
Copy link
Contributor Author

javagl commented Dec 5, 2022

There currently is a draft for configuration options support at #247 (a very early state for now).

I think that it could make sense to split the "configuration" into two levels:

  • The configuration for the validator run
  • The options for one validation process

The idea of the configuration for the validator run is to have config.json file that may look like this:

{
  "tilesetsDirectory": "specs/data/tilesets",
  "tilesetGlobPattern": "**/*.json",
  "writeReports": false,
  "options": {
    "validateContentData": false
  }
}

Its top-level elements correspond (exactly) to the official command line arguments. This would allow to define different "standard runs" for the validator solely within the config file, which then completely replaces the (other) command line arguments, and could just be passed in as
npx ts-node src/main.ts --configFile validationConfig.json
without having to repeatedly type (or copy and paste) the arguments at the command line. Each of these config files would represent a ("reproducible") run of the validator against a certain set of input files.


The options for the validation process are then stored in the options object, which just contains everything that corresponds to the ValidationOptions.ts class. Right now, this still contains the validateContentData : boolean as its only option.

Which of the options that are mentioned in the first post are supposed to be supported at which stage is not yet clear. But a rough idea was to extend these options to selectively configure the handling of certain content types. The question about how to encode this in JSON is not entirely obvious. A rough idea would be to have a structure like this:

{
  "tilesetsDirectory": "specs/data/tilesets",
  "tilesetGlobPattern": "**/*.json",
  "writeReports": false,
  "options": {
    "validateContentData": true,
    "validateContentTypes": [
      {
        "magic": "b3dm",
        "validation": "PERFORM"
      },
      {
        "extension": ".geojson",
        "validation": "SKIP"
      },
      {
        "magic": "GEOM",
        "validation": "WARN"
      }
    ]
  }
}

which would be translated as

  • Do a standard validation when the magic header of the content is B3DM
  • Skip the validation when the extension of a content is .geojson
  • Issue a warning when the magic header of the content is GEOM

(The intention is to eventually address this comment in the ContentDataValidators setup, with some implementation details still having to be sorted out)

(An aside: It would probably make sense to then also add an --options options.json command line argument that only contains the options, and not the whole configuration)


Unrelated/TODO: The build is currently failing at the linting step, which is checking eslint "./**/*.*s" - this greatly fails due to a pntS file that I just added for the tests. (Note to myself: Never try to make your life easy...)

@javagl
Copy link
Contributor Author

javagl commented Dec 8, 2022

The latest state extends the validation options with more fine-grained control about the validated content types. This is not yet fully and deeply configurable as drafted above, because the question of how exactly to configure the validation process for each type has to be sorted out. But a reasonable first step is that it is possible to define which content types should be validated at all.

The ValidationOptions have been exposed in the API, and now contain an array of strings, called validatedContentTypes. These strings do not appear publicly in the API (so they are no public enum constants or so). The "known" types might change over time. For now, the valid content types are

  • CONTENT_TYPE_TILESET (for external tilesets)
  • CONTENT_TYPE_GLB (for binary glTF only)
  • CONTENT_TYPE_GLTF (for JSON-based glTF)
  • CONTENT_TYPE_B3DM
  • CONTENT_TYPE_I3DM
  • CONTENT_TYPE_CMPT
  • CONTENT_TYPE_PNTS
  • CONTENT_TYPE_3TZ (Experimental support)
  • CONTENT_TYPE_GEOM (Not validated yet)
  • CONTENT_TYPE_VCTR (Not validated yet)
  • CONTENT_TYPE_GEOJSON (Not validated yet)

(When validatedContentTypes is undefined, then ALL known content types will be considered. This may mean that encountering certain content types will cause a validation warning when the content type validation is not implemented)


Running the validator with

npx ts-node src/main.ts --configFile .\config.json

using the following config.json

{
  "tilesetsDirectory": "specs/data/tilesets",
  "tilesetGlobPattern": "**/validTilesetWithInvalidB3dm.json",
  "writeReports": false,
  "options": {
    "validateContentData": true,
    "validatedContentTypes": ["CONTENT_TYPE_B3DM", "CONTENT_TYPE_GLTF", "CONTENT_TYPE_GLB"]
  }
}

will report the invalid B3DM. Removing the CONTENT_TYPE_B3DM will cause it to report no issues.

The state has to be further tested and reviewed. But general feedback about the approach (config file format, content type definitions, etc) can be added here.


Note: The latest change also fixes a critical bug that severly limits the usability of the validator as a library: The registration of the known content type (and extension) validators had only be done in main.ts. So when using it as a library, the content data is not validated at all (ouch). This was changed now, to ensure that the registration functions are called (once) when one of the functions that uses them is called.

@javagl
Copy link
Contributor Author

javagl commented Dec 11, 2022

Just to keep track of why certain things are done in a way that appears to be ... awkward at the first glance:

As mentioned int he previous comment, there are things that have to be done once - like globally registering validators for the different content types. Theoretically, this should be done with static initializer blocks, like so...

export class ContentDataValidators {
  static {
    ContentDataValidators.registerDefaults();
  } 
  private static registerDefaults() { ... }
}

One has to be very careful about where this block is inserted, because the blocks and fields are processed in textual order within one class (see https://v8.dev/features/class-static-initializer-blocks ). But even then, things fall apart pretty quickly: When there are two different classes that contain such blocks, it's everything but clear which part of which class is initialized at which point in time. It might theoretically be possible to derive that from ... some specification. But the knowledge that has to go into that and the brittleness of the result (which might break "unexplicably" for someone who doesn't know the details of the initialization order) made me think that the somewhat un-elegant pragmatic approach of having a flag that keeps track of the initialization state might be the better solution here. I'm open for "better" suggestions.

@javagl
Copy link
Contributor Author

javagl commented Dec 11, 2022

One question about how the validated content types should be configured appears to be a very high-level one. Namely, whether the content types should be included or excluded explicitly via the configuration settings. Both cases can have different effects when the validator is updated:

  • When the content types are included explicitly in the config file, and a new version of the validator supports a new content type, then this content type will by default not be validated
  • When the content types are excluded explicitly in the config file, and a new version of the validator supports a new content type, then this content type may cause new validation issues

It's hard to predict which behavior is the more desirable one. Therefore, the ValidationOptions offer both options right now, via includeContentTypes and excludeContentTypes properties. I have added a validationOptionsDemo.ts showing the usage (from a "library usage" point of view). The output of this demo (with validation issue details omitted) is

Result of validating a tileset with an invalid B3DM, 
with the B3MD content type being INcluded:
{
  "date": "2022-12-11T15:22:03.312Z",
  "numErrors": 1,
  "numWarnings": 0,
  "numInfos": 0,
  "issues": [
    {
      "type": "CONTENT_VALIDATION_ERROR",
       ...
    }
  ]
}
--------------------------------------------------------------------------------
Result of validating a tileset with an invalid B3DM,
with the B3MD content type NOT being INcluded:
{
  "date": "2022-12-11T15:22:03.322Z",
  "numErrors": 0,
  "numWarnings": 0,
  "numInfos": 0
}
--------------------------------------------------------------------------------
Result of validating a tileset with an invalid B3DM,
with the B3MD content type being EXcluded:
{
  "date": "2022-12-11T15:22:03.325Z",
  "numErrors": 0,
  "numWarnings": 0,
  "numInfos": 0
}
--------------------------------------------------------------------------------
Result of validating a tileset with an invalid B3DM,
the B3MD content type NOT being EXcluded:
{
  "date": "2022-12-11T15:22:03.327Z",
  "numErrors": 1,
  "numWarnings": 0,
  "numInfos": 0,
  "issues": [
    {
      "type": "CONTENT_VALIDATION_ERROR",
       ...
    }
  ]
}
--------------------------------------------------------------------------------

@javagl
Copy link
Contributor Author

javagl commented Dec 13, 2022

The current state is part of the 0.2.0 release. This does not involve stability guarantees, and is offered as a "preview" and to gather feedback. Therefore, I'll leave this issue open for now, but any comments about the current approach should still be posted here.

@javagl javagl mentioned this issue Dec 15, 2022
53 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant