split test cases into specific levels of functionality? #497

Closed
ianstormtaylor opened this issue Dec 12, 2020 · 11 comments · Fixed by #819
Comments

ianstormtaylor (Contributor) commented Dec 12, 2020

Hey, thanks for this benchmark! It's a great idea.

I think the existing test cases aren't testing for the same behavior in each library, though. Depending on the library, it might support "guarding", "validating", "parsing", "transforming", etc. But if you compare the guarding of one library to the parsing of another, it's not a direct comparison.

For example, as far as I can tell right now toi is simply guarding a value and returning a boolean in its test case, whereas superstruct is parsing an input, cloning it, and returning the parsed value in its test case (instead of using the simpler is export).

It might make sense to add more specific levels of functionality to each test case. And some libraries simply won't allow certain functionality. For example (using superstruct because I'm the author so I know it best):

import { is, assert, coerce, object, ... } from 'superstruct'

const type = object({ ... })

class SuperstructCase extends Case {
  name = 'superstruct'

  // The `is` test is for simply returning a boolean indicating whether 
  // a value is valid. It doesn't need to create errors, parse values, etc.
  // This is the most basic and the minimum to be included.
  is(value) {
    return is(value, type)
  }

  // The `assert` test is for throwing a proper `Error` object, with a 
  // reason for why validation failed. It shouldn't be re-implemented
  // for libraries that only support `is` type guards because the errors
  // won't actually contain any information.
  assert(value) {
    assert(value, type)
  }

  // The `parse` test is for libraries that allow some parsing step on
  // the value to ensure that it is valid—for things like default values,
  // coercing inputs, etc. It should return the parsed value, and the
  // value *should not* equal the input (ie. cloned).
  parse(value) {
    return coerce(value, type)
  }
}

You'd be able to show a section of graphs for each level of functionality. Then people can make more informed decisions, e.g. a library might be much faster at is, but not support parse at all.

I think this would make the comparisons much more accurate.

moltar (Owner) commented Dec 18, 2020

@ianstormtaylor thank you for your input. I completely agree. This started small, comparing just a few libs, then it grew into this monster, and now we are comparing apples to oranges. I like your idea of having extra test cases.

ianstormtaylor (Contributor, Author) commented

@moltar 😄 glad to hear it! I think the graphs and stuff that you'll be able to show once it's split will be really cool. And I'm looking forward to digging into Superstruct's performance when it's apples-to-apples.

moltar (Owner) commented Mar 3, 2021

Let's think about what kind of test buckets we want, then. Any ideas?

ianstormtaylor (Contributor, Author) commented

I think is, assert and parse are good to start.
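
A minimal sketch of what a shared case shape for those three buckets could look like (the names are illustrative, not an actual API from this repo):

// Illustrative only: a possible shape for a three-level benchmark case.
interface LeveledCase {
  name: string;

  // Boolean type guard; every library should be able to implement this.
  is(value: unknown): boolean;

  // Throws a proper Error with a reason when validation fails; optional
  // for libraries that only expose boolean guards.
  assert?(value: unknown): void;

  // Returns a parsed/coerced copy of the input; optional for libraries
  // without a parsing step.
  parse?(value: unknown): unknown;
}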

hoeck (Collaborator) commented Apr 26, 2021

In addition to the three modes above I'd like to see a second dimension: strict vs non-strict validation of objects.

Strict:
only allowing known attributes in an object, reporting an error when unknown keys are present.

Non-strict:
similar to how TypeScript checks interface type compatibility: if all known attributes are present, everything is good.

Strict vs non-strict checking has performance implications, and I'd like to see how the different libraries tackle them.

Unfortunately there is a third mode in this dimension: allowing non-strict objects as input but returning a strict object with all unknown keys stripped.

As a side note, I think that non-strict runtype checking actually leads to a false sense of security, as you essentially allow arbitrary attributes on objects that are passed into your application and maybe even deeper into the database or other services. So, maybe only benchmark the strict and non-strict-input-strict-output cases?
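
To make the three modes concrete, here is a minimal sketch using zod as one example (zod is only an illustration here; other libraries expose similar switches):

import { z } from 'zod';

const shape = { id: z.number(), name: z.string() };
const input = { id: 1, name: 'a', extra: true };

// Strict: unknown keys are an error.
z.object(shape).strict().safeParse(input).success; // false

// Non-strict: unknown keys are allowed and kept, similar to TypeScript's
// structural interface compatibility.
z.object(shape).passthrough().parse(input); // { id: 1, name: 'a', extra: true }

// Non-strict input, strict output: unknown keys are silently stripped
// (zod's default behavior for z.object()).
z.object(shape).parse(input); // { id: 1, name: 'a' }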

I'd love to help once it is clear what we need and how to report it.

hoeck (Collaborator) commented Jun 28, 2021

Found some time over the weekend to try to solidify my ideas: hoeck@eef6d89

It basically introduces:

  1. Support for more than one benchmark case per library: https://github.com/hoeck/typescript-runtime-type-benchmarks/blob/eef6d89fc5d7d8ef249b47bd8a8f3474c66d5a55/cases/valita.ts
  2. Interactive visualization using gh-pages: https://hoeck.github.io/typescript-runtime-type-benchmarks/

Next steps I can think of are:

  • add a parsing benchmark case
  • update the tests to work with the multiple benchmark cases
  • add node-version and library filter to the visualization
  • decide what to do with the existing graphs in the readme: keep as is / use the new group graph / keep only a single graph and refer to the gh-pages UI for detailed info about node versions and outlier removal?

Any ideas, suggestions, feedback, code-complaints?

Should I move forward with this @moltar ?

moltar (Owner) commented Jun 29, 2021

@hoeck love the progress, especially the interactive visualization using gh-pages! I was dreaming of this, but didn't think I'd ever have time to get to it!

What I might have done differently for the additional benchmark cases is to have a single interface that every benchmark implements, rather than two tests per class.

We can then group and organize different classes of test cases into folders and perhaps run the benchmarking tool with options to point to different folders, or just loop through the folders programmatically and run all benchmarks.

Another option is to add a concept of tags or features to the interface, which can be just an array of strings, and then use that as part of the UI to select/unselect different options. Then we can organize the test cases however we want, e.g. group them by package in a dir, or put them all in one dir with different filenames, and use tags to organize them.
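
A rough sketch of what such a tagged interface could look like (the names here are made up for illustration, not a final design):

// Hypothetical shape for a shared benchmark-case interface.
interface BenchmarkCase {
  // Library name, e.g. 'superstruct' or 'valita'.
  name: string;

  // Free-form feature tags the UI can filter on, e.g. ['is', 'parse', 'strict'].
  tags: string[];

  // The single function that actually gets benchmarked.
  run(data: unknown): unknown;
}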

hoeck (Collaborator) commented Jun 29, 2021

Thanks for the feedback 😁!

I agree that grouping the cases can be improved. I was just not sure how much of the existing structure you'd like to keep, or why everything was written the way it is in the first place (e.g. using an abstract class over an interface).

Having a single interface and multiple files per package sounds like a good and common way to do it. I would like to keep the setup simple though, so no automatic reading and importing of directories. In my experience that leads to more complex builds than just having a few explicit import statements. Automatic importing could probably be added later once the folder structure is in place anyway.
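
So the entry point could stay as plain as something like this (the file names are only examples):

// cases/index.ts: explicit imports instead of scanning the directory.
import './superstruct';
import './valita';
import './zod';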

Tags sound like a nice add-on layer over choosing packages and benchmarks individually. I thought of this too, but would first like to get the basics off the ground.

I'll update my fork with said way of grouping benchmarks when I have some spare time (probably in about a week) and keep you updated.

moltar mentioned this issue Jul 7, 2021

hoeck (Collaborator) commented Aug 31, 2021

Hey there,

found some time to work on this further. @moltar could you please have a look at the latest commit at https://github.com/hoeck/typescript-runtime-type-benchmarks/tree/refactor-benchmarks?

I've simplified adding new benchmarks and libraries. It's now done via a typed register function. Benchmark cases now look like this:

import { register } from '../benchmarks';
import * as v from '@badrap/valita';

const dataType = v.object({
  number: v.number(),
  negNumber: v.number(),
  maxNumber: v.number(),
  string: v.string(),
  longString: v.string(),
  boolean: v.boolean(),
  deeplyNested: v.object({
    foo: v.string(),
    num: v.number(),
    bool: v.boolean(),
  }),
});

register('valita', 'validate', data => {
  return dataType.parse(data);
});

register('valita', 'validateStrict', data => {
  return dataType.parse(data, { mode: 'strict' });
});

The benchmark case implementation is just the single function passed to register. Grouping benchmarks into different files is trivial. You could even add them dynamically, but for now I am just using plain import statements.
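
For reference, a register helper along those lines could be as small as the following sketch (this is my reading of the idea, not the actual code in the branch):

// Sketch of a typed register function and the case store behind it.
type BenchmarkKind = 'validate' | 'validateStrict';

interface RegisteredCase {
  lib: string;
  kind: BenchmarkKind;
  fn: (data: unknown) => unknown;
}

const cases: RegisteredCase[] = [];

export function register(
  lib: string,
  kind: BenchmarkKind,
  fn: (data: unknown) => unknown
): void {
  cases.push({ lib, kind, fn });
}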

For the strict & non-strict benchmarks it's easier to keep them in one file, as the runtype definition is the same and only the check arguments differ for most libraries.

Benchmarks also have their own tests which are executed for each registered case.

If you give me a thumbs up, I'll continue to clean this up, migrate all existing cases, and create a mergeable PR.

moltar (Owner) commented Nov 1, 2021

@hoeck Can you please make a PR and I'll review it! Thanks! 😄

hoeck (Collaborator) commented Dec 30, 2021

Just FYI, I finally found some time and I am working on this right now. I am also really curious about the results 😁.
