Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cascading validation #995

Merged
merged 11 commits into from
Apr 12, 2021
Merged

Conversation

LWprogramming
Copy link
Collaborator

@LWprogramming LWprogramming commented Feb 3, 2021

As described in PR #994:

Schema renaming currently has some technical debt where there is a separate validation step before actually attempting to use the renamings to modify the schema. For this to work, it requires us to keep the validation code and the processing code in sync, which we'd like to avoid in favor of attempting to modify the schema directly and gathering errors if something goes wrong.

This PR continues cleaning up this technical debt. It takes cascading suppression errors and moves the checks into the renaming code rather than having a separate validation step, simplifying the renaming process.

This PR lets us remove the entire CascadingSuppressionCheckVisitor class. If it'd be significantly easier to understand the changes here if I tried moving the functionality out of the CascadingSuppressionCheckVisitor piecemeal, just let me know and I can do that-- here I decided not to do that initially since all the code in the visitor follows the same pattern of "foo in the schema depends on bar where bar is to be suppressed".

@@ -235,7 +235,7 @@ def rename_schema(
schema = build_ast_schema(schema_ast)
query_type = get_query_type_name(schema)

_validate_renamings(schema_ast, type_renamings, field_renamings, query_type)
_ensure_no_unsupported_suppressions(schema_ast, type_renamings)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this PR, the only separate validation step left would be this one, so there's no need for a special _validate_renamings function.

@codecov
Copy link

codecov bot commented Feb 9, 2021

Codecov Report

Merging #995 (08100d0) into main (c645238) will increase coverage by 0.05%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #995      +/-   ##
==========================================
+ Coverage   94.33%   94.38%   +0.05%     
==========================================
  Files         113      113              
  Lines        9070     9080      +10     
==========================================
+ Hits         8556     8570      +14     
+ Misses        514      510       -4     
Impacted Files Coverage Δ
...ql_compiler/schema_transformation/rename_schema.py 100.00% <100.00%> (+0.35%) ⬆️
...phql_compiler/compiler/ir_lowering_sql/__init__.py 100.00% <0.00%> (ø)
graphql_compiler/compiler/emit_sql.py 98.57% <0.00%> (+0.05%) ⬆️
...r/schema_generation/sqlalchemy/edge_descriptors.py 98.66% <0.00%> (+0.42%) ⬆️
graphql_compiler/schema/schema_info.py 90.47% <0.00%> (+0.89%) ⬆️
..._compiler/schema_generation/sqlalchemy/__init__.py 100.00% <0.00%> (+33.33%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c645238...08100d0. Read the comment docs.

type_name = node.name.value
current_type_fields_to_suppress = {}
for field_node in node.fields:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop strongly resembles the one below on line 806, but I'm not sure of how to organize it best. The problem here is with the function structure. Let's say we're at a type T with a field foo, and foo is of type Bar`. Then this function is essentially doing the following:

  1. For each field foo of type Bar: Is the type Bar going to be suppressed? If so, and if foo isn't going to be suppressed, the renamings are invalid and we should do the bare minimum necessary to continue the schema renaming and collect any other independent errors while recording the problem. Return from this node-- no more to be done here.
  2. Do we expect to renaming T's fields at all? If not, we're done and can return.
  3. For each field foo: rename foo if necessary.

I considered trying something along the lines of implementing step 2 via "if T's fields are not to be renamed, this is equivalent to using the identity mapping for all of T's fields with the reasoning that this means the loops in steps 1 and 3 can probably combined into one single loop, but it's not clear to me that this is necessarily easier to understand. Would definitely invite any feedback!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR comment would be useful to integrate into the code itself. The logic around line 780 is quite complex and it's not really obvious from the source code itself what is being done and why.

In particular, I'd love to see complex expressions like self.type_renamings.get(field_type_name, field_type_name) is None get assigned to well-named local variables, because they are complex and surprising — in this case, that's because it's unusual to see a .get() check with an explicit non-None default value be checked against a None result.

What I'd recommend is writing something similar to your PR comment as a code comment in a continuous block just above this for loop, and then reference the individual parts of the algorithm the comment describes from the appropriate place inside the for loop. Together with good naming of intermediate expression values, I think this code will become much clearer.

@LWprogramming LWprogramming marked this pull request as ready for review February 9, 2021 22:20
type_name = node.name.value
current_type_fields_to_suppress = {}
for field_node in node.fields:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR comment would be useful to integrate into the code itself. The logic around line 780 is quite complex and it's not really obvious from the source code itself what is being done and why.

In particular, I'd love to see complex expressions like self.type_renamings.get(field_type_name, field_type_name) is None get assigned to well-named local variables, because they are complex and surprising — in this case, that's because it's unusual to see a .get() check with an explicit non-None default value be checked against a None result.

What I'd recommend is writing something similar to your PR comment as a code comment in a continuous block just above this for loop, and then reference the individual parts of the algorithm the comment describes from the appropriate place inside the for loop. Together with good naming of intermediate expression values, I think this code will become much clearer.

path: List[Any],
ancestors: List[Any],
) -> None:
"""Check that each union still has at least one non-suppressed member."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This docstring doesn't really capture what's going on. The implementation does an action if that check fails, but the docstring doesn't make any mention of that.

Consider instead:

Suggested change
"""Check that each union still has at least one non-suppressed member."""
"""Suppress unions that do not have any remaining non-suppressed members."""

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action that happens if the check fails is actually raising an error instead of automatically suppressing the union (since renaming is defined so that unions in such situations have to get explicitly suppressed) and the error handling code is outside of the visitor itself, so I'm inclined to keep the docstring as-is. Worth adding a bit to mention what the union_types_to_suppress field is going to be used for, even if it doesn't happen in this function itself?

Copy link
Contributor

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't clear to me whether you were done tweaking this PR — when you're done, hit that "re-request review" button next to my name in the reviewers list to put this PR back on my review queue.

I have two minor suggestions, and otherwise this seems fine to merge.

graphql_compiler/schema_transformation/rename_schema.py Outdated Show resolved Hide resolved
graphql_compiler/schema_transformation/rename_schema.py Outdated Show resolved Hide resolved
LWprogramming and others added 2 commits April 9, 2021 11:37
Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
Copy link
Contributor

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@LWprogramming LWprogramming merged commit 44c63aa into main Apr 12, 2021
@LWprogramming LWprogramming deleted the refactor_cascading_validation branch April 12, 2021 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants