Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test(GraphService): Thorough graph service tests #3011

Merged

Conversation

EnricoMi
Copy link
Contributor

@EnricoMi EnricoMi commented Aug 3, 2021

Adds a more complex test graph and tests all GraphService methods thoroughly. In the current implementation, tests depend on the order of test execution, as they share the state of the tested service.

As the interface is not documented, a couple of questions on which results to expect from the given graph arose:

  • Is it expected that findRelatedUrns returns duplicates, i.e. Urns may occur multiple times?
    • Answer: findRelatedUrns should not return duplicates.
  • Should findRelatedUrns with RelationshipDirection.UNDIRECTED return the union of RelationshipDirection.OUTGOING and RelationshipDirection.INCOMING?
    • Answer: It should return the union of outgoing & incoming.
  • The sourceType and destinationType of findRelatedUrns are explicitly @Nullable. Does null represent any type?
    • Answer: Yes, null sourceType or destinationType represents any type.
  • From current tests it looks like "" is also interpreted as any type. Should this be deprecated in favour of null and disallow empty types? When a type is given (non-null), it should be a useful type string.
    • Answer: Yes, we should use null to reference any type.
  • Should removeEdgesFromNode with RelationshipDirection.UNDIRECTED be equivalent to calling it with RelationshipDirection.OUTGOING and RelationshipDirection.INCOMING?
    • Answer: Yes, removeEdgesFromNode with RelationshipDirection.UNDIRECTED should be equivalent to calling it with RelationshipDirection.OUTGOING and RelationshipDirection.INCOMING.
  • Is it expected to always have at least one relationship type given when calling findRelatedUrns and removeEdgesFromNode? Is an implementation allowed to enforce this limitation?
    • Answer: Yes, it is expected to have at least one relationship type for a query.

ElasticSearchGraphServiceTest fails these tests while above questions are open.

It further seems like no Neo4j tests are running. This includes a fix for #2678.

@gabe-lyons
Copy link
Contributor

@EnricoMi-

  1. findRelatedUrns should not return duplicates- if we add a findEdges method, that would be the place to return duplicates. The elasticsearch service may not be abiding by this- we should update it to dedupe.

  2. RelationshipDirection.UNDIRECTED is not supported in Elasticsearch Graph Service at the moment- Datahub does not use it currently. However, to implement it, it should return the union of outgoing & incoming.

  3. Yes, null sourceType or destinationType represents any type.

  4. Yes, we should use null to reference any type. I agree with your proposal here.

  5. Yes- removeEdgesFromNode with RelationshipDirection.UNDIRECTED should be equivalent to calling it with RelationshipDirection.OUTGOING and RelationshipDirection.INCOMING. UNDIRECTED is not currently used by DataHub and Elasticsearch Graph Service does not support Undirected at the moment.

  6. Yes, it is expected to have at least one relationship type for a query. I would advise simply returning an empty list (or no-oping) to handle this case. I would not advise throwing an exception- this should be a valid call.

@gabe-lyons
Copy link
Contributor

I'll also look into the graph service test issue- those should be running. We recently refactored the logic to run graph service tests, that may have done it.

@EnricoMi
Copy link
Contributor Author

Thanks for answering those questions. I will adjust the tests to reflect that behaviour.

  • Yes, we should use null to reference any type. I agree with your proposal here.

Would you say an empty type string "" should throw an IllegalArgumentException or simply return results for the empty string type? So if you were to add nodes with an empty string as their type, you would retrieve those Urns. Or does Urn disallow empty type strings?

  • Yes, it is expected to have at least one relationship type for a query. I would advise simply returning an empty list (or no-oping) to handle this case. I would not advise throwing an exception- this should be a valid call.

You are right. The relationship types argument is a non-nullable list, so the semantics of providing an empty list is to ask for no types. If any relationship type should be supported in the future, than this could be done through a null for a nullable-list.

@EnricoMi
Copy link
Contributor Author

EnricoMi commented Aug 17, 2021

@gabe-lyons Given existing implementations currently fail these tests I would like to suggest the following approach:

  1. I will rebase the PR to incorporate latest changes done to the API in f40bf1c. ✔️
  2. I will document the API to reflect the behaviour as described in this PR's description: 018f6de. ✔️
  3. We should disable all Elasticsearch and Neo4J tests that currently fail. ✔️
  4. Create tickets to fix those issues separately to make implementations comply with the API. The disabled tests serve as a test-driven-development approach of improving the implementations: ✔️

This way, this PR is not blocked by fixing ES and Neo4J implementations, test from this PR reproduce above issues, and upcoming changes to the GraphService API do not conflict with this PR in the meanwhile.

@EnricoMi
Copy link
Contributor Author

CC @jjoyce0510

@EnricoMi EnricoMi force-pushed the branch-thorough-graph-service-tests branch 2 times, most recently from 3939d47 to b25f443 Compare August 18, 2021 13:17
@EnricoMi EnricoMi force-pushed the branch-thorough-graph-service-tests branch 2 times, most recently from 5e5a3d9 to a08f80d Compare August 18, 2021 21:35
@EnricoMi EnricoMi changed the title test(GraphService): thorough graph service tests test(GraphService): Thorough graph service tests Aug 19, 2021
@EnricoMi EnricoMi force-pushed the branch-thorough-graph-service-tests branch from a08f80d to 58f46de Compare August 19, 2021 10:32
Copy link
Contributor

@gabe-lyons gabe-lyons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding all tests- I will get to reviewing ASAP!

Copy link
Contributor

@gabe-lyons gabe-lyons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just read through all these changes- thanks for all the additional tests & documentation. This looks great @EnricoMi 👍

If you have a chance, would you mind rebasing this? After that I would be good to merge

@EnricoMi EnricoMi force-pushed the branch-thorough-graph-service-tests branch 6 times, most recently from b47be69 to 26fb848 Compare September 13, 2021 21:52
@EnricoMi EnricoMi force-pushed the branch-thorough-graph-service-tests branch from 26fb848 to 4bac19a Compare September 17, 2021 20:10
Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shirshanka shirshanka merged commit c0aa3ec into datahub-project:master Sep 17, 2021
@EnricoMi EnricoMi deleted the branch-thorough-graph-service-tests branch September 18, 2021 12:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants