
Figure out a way of supporting tests in Iglu #1

Open
alexanderdean opened this issue Jun 18, 2014 · 8 comments

@alexanderdean (Member)

Fred wrote some nice tests for the core Snowplow schemas when these were a part of snowplow/snowplow. These can be seen here:

https://github.com/snowplow/snowplow/tree/40a5037563e729c67a922a3e2e67c4e5bb917809/0-common/schemas/jsonschema/tests

Fundamentally, tests divide into:

  • Good tests - pass validation
  • Bad tests - fail validation

So we need to think about how to store tests inside Iglu. Starter for 10: what about:

com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/good
com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/bad

The idea is that tests that are good for 1-0-0 must also (by definition) be good for 1-0-1, 1-0-2, etc. Tests which are bad for 1-0-0 could be good for 1-0-1, so there's nothing we can reason about there.
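
For concreteness, a minimal sketch of a runner for such a layout (assuming the com.github.fge json-schema-validator, a common JVM choice; purely illustrative, not a committed Iglu design):

import java.io.File
import com.github.fge.jackson.JsonLoader
import com.github.fge.jsonschema.main.JsonSchemaFactory

object TestSuiteSketch {
  val factory = JsonSchemaFactory.byDefault()

  // A schema passes its suite when every good instance validates
  // and every bad instance fails validation.
  def runSuite(schemaFile: File, goodDir: File, badDir: File): Boolean = {
    val schema = factory.getJsonSchema(JsonLoader.fromFile(schemaFile))
    def instances(dir: File) = dir.listFiles.toList.map(f => JsonLoader.fromFile(f))
    instances(goodDir).forall(j => schema.validate(j).isSuccess) &&
      instances(badDir).forall(j => !schema.validate(j).isSuccess)
  }
}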

@fblundun thoughts?

@alexanderdean (Member, Author)

I guess it would be like this:

tests/good/1
tests/good/2
tests/bad/1
tests/bad/2
etc

and tests/good would return all the good tests.

ALSO: once we have good tests for 1-0, we can validate that a new schema uploaded to jsonschema/1-0-1 passes all the existing good 1-0 tests.
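
A sketch of that gate, building on the runner above (the method name is hypothetical):

// Only register the new ADDITION if it still validates every
// existing good test for its MODEL-REVISION.
def acceptAddition(newSchemaFile: java.io.File, goodDir: java.io.File): Boolean = {
  val schema = TestSuiteSketch.factory.getJsonSchema(JsonLoader.fromFile(newSchemaFile))
  goodDir.listFiles.forall(f => schema.validate(JsonLoader.fromFile(f)).isSuccess)
}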

@alexanderdean (Member, Author)

I think we should use valid and invalid rather than good and bad, as per Fred's existing tests.

@alexanderdean (Member, Author)

Probably use UUIDs instead of /1, /2, etc. This would also allow us to remove existing tests if we want to.

http://stackoverflow.com/questions/7114694/should-i-use-uuids-for-resources-in-my-public-api
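
e.g., purely illustratively:

import java.util.UUID

// Key each test by a random UUID rather than a sequence number
val testPath = s"com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/good/${UUID.randomUUID()}"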

@fblundun (Contributor)

This all sounds good.

I wonder if it would be worth having a system where, whenever a schema's version is bumped (e.g. from 1-0-0 to 1-0-1), we have a place for tests designed specifically to be validated by the new version but not by the old, to highlight the difference between the two.

The structure could be something like this:

com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/good/3 would contain JSONs which should be validated by schema 1-0-x where x >= 3

com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/bad/3 would contain JSONs which should be rejected by schema 1-0-x where x <= 3

Then if we want to test schema 1-0-3, we check that it:

  • validates all JSONs in com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/good/y for all y <= 3
  • rejects all JSONs in com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/bad/y for all y >= 3

In fact we could alternatively do away with the "good" and "bad" distinction and just have com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/z containing all examples which should be validated by 1-0-z but not by 1-0-(z-1).

Then to test schema version 1-0-3, we would check that it:

  • validates all JSONs in com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/w for all w <= 3
  • rejects all JSONs in com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/w for all w > 3

The disadvantage of this is that it might involve moving some test JSONs to a new directory when a new version is published, and that it's pretty complicated...
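
A sketch of the check for the single-directory variant (reusing the imports from the earlier sketch, and assuming each subdirectory of tests/ is named after the bare ADDITION number at which its instances first become valid):

import com.github.fge.jsonschema.main.JsonSchema

// Schema 1-0-x must accept everything in tests/w for w <= x
// and reject everything in tests/w for w > x.
def checkAdditionDirs(schema: JsonSchema, testsRoot: java.io.File, x: Int): Boolean =
  testsRoot.listFiles.filter(_.isDirectory).forall { dir =>
    val w = dir.getName.toInt
    dir.listFiles.forall { f =>
      val valid = schema.validate(JsonLoader.fromFile(f)).isSuccess
      if (w <= x) valid else !valid
    }
  }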

@alexanderdean (Member, Author)

Hey Fred, lots of great thoughts there. I think what we're saying is that, fundamentally, for a given MODEL-REVISION, tests fall into one of three categories:

  1. valid-from-ADDITION
  2. invalid-until-ADDITION
  3. invalid-forever

Is that a helpful taxonomy?

@fblundun (Contributor)

I think that's a helpful way to think about it. In terms of file structure, we could group test JSONs into two categories:

  1. invalid-forever, located in com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/bad/
  2. valid-from-ADDITION, located in com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/ADDITION

Then to test schema 1-0-x we make sure it validates everything in com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/y for y <= x, and that it invalidates every other test JSON in com.snowplowanalytics.self-desc/instance/jsonschema/1-0/tests/.
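
In code, this two-category check might look like the following (same assumptions as the earlier sketches):

// tests/bad/ is invalid-forever; tests/<y>/ is valid-from-ADDITION y.
def checkSchema(schema: JsonSchema, testsRoot: java.io.File, x: Int): Boolean =
  testsRoot.listFiles.filter(_.isDirectory).forall { dir =>
    // "bad" must always fail; tests/<y> must pass iff y <= x
    val shouldValidate = dir.getName != "bad" && dir.getName.toInt <= x
    dir.listFiles.forall { f =>
      schema.validate(JsonLoader.fromFile(f)).isSuccess == shouldValidate
    }
  }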

@alexanderdean (Member, Author)

Interesting! Possible simplification: we add all tests simply as:

com.snowplowanalytics.self-desc/instance/jsonschema/tests/f47ac10b-58cc-4372-a567-0e02b2c3d479

etc.

Then when you submit a new JSON Schema, all existing tests are run against it, and the response to the new JSON Schema registration contains a listing of all test statuses.

Going further, when you request an individual test, its metadata would list which schemas it succeeds against.

Going even further, there should be the opportunity to run a new potential schema against all tests without actually committing it.
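
Purely to make that concrete, the response to a registration (or to a dry run that doesn't commit the schema) might carry something like the following; every field name here is invented:

{
  "schema": "com.snowplowanalytics.self-desc/instance/jsonschema/1-0-3",
  "committed": false,
  "testResults": [
    {
      "test": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
      "expected": "valid",
      "actual": "valid",
      "pass": true
    }
  ]
}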

@alexanderdean added this to the Version 0.3.0 milestone on Jul 10, 2014
@alexanderdean (Member, Author)

Assigning to @BenFradet
