Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#1480 Output key validation #1628

Merged
merged 12 commits into from Oct 12, 2021
Merged

#1480 Output key validation #1628

merged 12 commits into from Oct 12, 2021

Conversation

BALEHOK
Copy link
Contributor

@BALEHOK BALEHOK commented Oct 11, 2021

Related to #1480

@BALEHOK BALEHOK requested a review from BLoe October 11, 2021 17:07

const REQUIRED_FIELD_REGEX = /^Instance does not have required property "(?<property>.+)"\.$/;
export type PipelineErrors = string | Record<string, unknown>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a documentation comment of what PipelineErrors is supposed to hold?

  • What are the keys in the record? Are they Formik field paths?
  • When would it be a string vs. an object?
  • How to represent if there are no errors? Is it an empty object? Using null?

NIT: We have an UnknownObject alias defined you can use for Record<string, unknown>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors are confusing, I agree. Here PipelineErrors is Formik error... thing. It can be a string, a record of strings, or a record of records... I.e. it is dynamic and depends on the level of the state tree where the error happens. Addressing a comment regarding pipelineErrors[0], the error state normally is not an array but since the pipeline is an array we use numbers (index) to get the error related to a block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite it looks like an array (the top-level may look like an array - have numbers for property names), it is an object. I believe we should not type it as an array, that would be even more confusing, for instance, it doesn't have a `length property.

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open for discussion and suggestions here.
Basically we need a typing for the Formik errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @BLoe

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is a reasonable explanation - but let's put the explanation in the code instead of the PR so that people who are using the type understand how it's used and why it is the way that it. @BLoe thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agree we could add a code comment explaining how this works. It took me a while to understand at first for sure. I think formik is just designed to be super dynamic in the structure here and it doesn't map very well to typescript, or they haven't put a ton of effort yet into getting the typing right for the error object structure.

Btw, the way this works is what makes it possible to use our joinName() logic to construct the field paths to the formik errors, otherwise blockPipeline.2.config.<whatever> wouldn't work properly when we run this:

propertyNameInPipeline = joinName(
  String(blockIndex),
  "config",
  property
);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it appears that the lodash set() function works fine with regular arrays and field paths that include the .<index>.<field> syntax, so that's just lodash magic and doesn't have to do with the error object structure 🤷

if (!traceError) {
return;
}
const errorTraceEntry = useSelector(selectTraceError);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarifying my performance idea from the outdated pull request:

You can add a useMemo here to convert allBlocks into a Map for faster lookups in the validation. The blockMap variable can then be referenced in the useCallback

const blockMap = useMemo(() => new Map(allBlocks.map(x => [x.id, x]), [allBlocks])

allBlocks: IBlock[]
) {
return Promise.all(
pipeline
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • See comment about passing the allBlocks around as a Map instead of as an array
  • You could also just pre-compute the types of all the blocks as well and pass the types around with the block definitions. I think that would eliminate all the memoizeGetPipelineBlockTypes logic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memoizeGetPipelineBlockTypes did look ugly to me too.


for (let blockIndex = 0; blockIndex !== pipeline.length; ++blockIndex) {
let errorMessage: string;
const pipelineBlock = pipeline[blockIndex];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you wanted to you could use lodashs's zip here instead of manually maintaining the index logic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could. But I need to know the index, hence for-of doesn't nicely work here. What's left is the regular for. Then it doesn't make much sense to me to zip and run the for loop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the point is that you could do zip(pipeline, blockTypes).forEach((block, type, index) => ...

Definitely mostly a style thing though, so I don't have strong feelings either way.

Copy link
Contributor Author

@BALEHOK BALEHOK Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love forEach, not allowed on the project though, so only 2 options: for-of or for.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, that gets a log uglier when you have to use for (const <something> of zip()) { ... }`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and remember about the index ;)

Copy link
Collaborator

@BLoe BLoe Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I mean you can probably do:

for (const [index, value] of zip(...)) {
  const { block, type } = value;
  ...
}

But, you aren't allowed to move the destructure inside the for loop definition, so at this point it's harder to read than a basic for loop.

@twschiller
Copy link
Contributor

twschiller commented Oct 12, 2021

I'm OK to get this merged in, but I have a pretty strong preference of avoiding cache logic by just doing the work for the block definitions once:

  • Put them in a Map from id -> block
  • Enrich them with the block type information

That would allow constant time (actually linear in the length of the pipeline) validation without the complexity of managing cache invalidation

@BALEHOK BALEHOK force-pushed the feature/1480-output-key-validation branch from 3e7085d to 3379f3b Compare October 12, 2021 12:51
Copy link
Collaborator

@BLoe BLoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally in favor of the approach here. I think the logic around allBlocks can definitely be simplified a bit (see my comment about possibly using resolvedBlocks instead). I'm a little concerned with naming conventions with some of this stuff. I don't like the fact that something called a validator is not actually performing any validation, but instead translating/mapping errors from one system to another (trace --> formik).

Otherwise, I think the organization of stuff here makes a lot of sense 👍

const formikField = useField<BlockPipeline>({
name: "extension.blockPipeline",
const [{ value: blockPipeline }, { error: blockPipelineErrors }] = useField<
BlockConfig[]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use the BlockPipeline type alias for BlockConfig[] if you want to

* Gets Input validation error from the Trace
* @param pipelineErrors Pipeline validation errors for the Formik context.
* @param traceError Serialized error from running the block ({@link TraceError.error}).
* @param blockInstanceId Index of block that generated the Trace Error.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo blockInstanceId --> blockIndex

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed it after merge)


import { TraceError } from "@/telemetry/trace";

function traceErrorGeneralValidator(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that big of an issue, but why is this called "General Validator" when it's only mapping trace errors to pipeline error indexes? It doesn't seem like it does any validating itself. To me, a "validator" is something that is running the validation logic, directly. Should we call this and the input one applyTraceInputErrors() and applyTraceGeneralErrors() or mapTraceGeneralErrors() or something like that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

Comment on lines 44 to 45
.map(({ id }) => allBlocks.find((block) => block.id === id))
.map(async (block) => (block ? getType(block) : null))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same lookup/map logic used to define resolvedBlocks in EditTab. Should we pass around resolvedBlocks into the validator logic instead of allBlocks? resolvedBlocks is also an array with matching indexes to blockPipeline so it might be easier to use rather than allBlocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. resolvedBlocks depends on pipeline, pipeline depends on validation, validation depends on resolvedBlocks. Didn't look nice.

Copy link
Collaborator

@BLoe BLoe Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah hmm okay, I see what you're saying.

Here's an idea: BlockRegistry is already basically a cache system for block lookups. What if you just searched the registry directly with each block id using the BlockRegistry.lookup() function? Shouldn't this work just as well as a local cache somewhere in the UI component tree?

The only other usage of allBlocks in EditTab is for the add-brick modal. I think that code could be cleaned up in the future too. It could probably be moved down into a child component at some point later.


for (let blockIndex = 0; blockIndex !== pipeline.length; ++blockIndex) {
let errorMessage: string;
const pipelineBlock = pipeline[blockIndex];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the point is that you could do zip(pipeline, blockTypes).forEach((block, type, index) => ...

Definitely mostly a style thing though, so I don't have strong feelings either way.

@BALEHOK BALEHOK force-pushed the feature/1480-output-key-validation branch from 8984380 to a0e32bd Compare October 12, 2021 17:36
Comment on lines +54 to +55
validateOutputKey(formikErrors, pipeline, allBlocks);
applyTraceError(formikErrors, errorTraceEntry, pipeline);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like these names a lot more 👍

Copy link
Collaborator

@BLoe BLoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are still more improvements to the code to make in the future here, but I'm cool with getting this merged in for now as it is 👍

@BLoe BLoe merged commit 5f0c4f9 into main Oct 12, 2021
@BLoe BLoe deleted the feature/1480-output-key-validation branch October 12, 2021 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants