Add support for multiple batch keys #261

gyDBD · 2021-05-11T17:33:04Z

issue: #75

The endpoint that would benefit from this would be an endpoint that takes multiple batch ids and returns a nested object. See more explanation in the issue linked above.

Solution:

I add two new configurable key: propertyBatchKey and propertyNewKey to config the property key we want to batch up.

Here are the 3 cases that we support for now:
They all expect to have same input parameters format:

ids: str[] (should be the batchKey)
properties: str[] (should be the propertyBatchKey)

But they could have different response contract:

case 1: properties are returned in a nested object, id is at the top level.
e.g.: { id: 'id_1', properties: { property_1: 'foo', property_2: 'boo' } }

case 2: properties is not returned in a nested object, but spread at top level as well.
e.g.: { id: 'id_1', property_1: 'foo', property_2: 'boo' }

case 3: id and its properties are returned as a key-value pair at the top level.
e.g.: { 'id_1': { property_1: 'foo', property_2: 'boo' } }

Testing

update: I added more unit tests so that we know the new propertyBatchKey works fine with existing config.
e2e test with star war example: (data masking is correct, and when we query three fields, they only call the function once)
output: https://fluffy.yelpcorp.com/i/MpJQFFdv18k4jnc6tHCRV2Br5qVpVmjd.html

magicmark · 2021-05-11T18:13:01Z

Looking good!

Couple questions:

1. Merging Strategies

Will this support use cases as described in the original issue? #75 ?

Looks like there's different shapes for how secondary stuff (arguments/response) could be exposed. It could be an array that needs merging (your use case), or object properties (the use case described in the original issue description). Maybe even another strategy like boolean merging (e.g. if include_extra_info was your secondary batch key, you'd use "true" as the superset value to merge)

even if we don't support the original use case for now, given we've seen different "behaviours", I wonder if we want an option to control that (even if this this is the only behaviour we support right now) - something like secondaryMergeStrategy, or maybe it's just something we can infer by the types, and we don't need this key 🤔

2. Data Masking

Assuming we make the following set of network requests:

getBusiness({ foo_ids: [1,2], properties: ['rating', 'name'], include_extra_info: false});
getBusiness({ foo_ids: [3], properties: ['rating'], include_extra_info: true});

What is the response to the callsite that made this initial loader request?

{ foo_id: 2, property: 'name', include_extra_info: false },

They only requested the name property - but would they get back 'rating' too?

(can't tell from the tests yet)

I think we'd want to make sure we don't leak extra info that wasn't requested - to avoid screwing up the types and introducing buggy responses that people rely on that we may want to fix one day.

this is awesome stuff!!!

src/implementation.ts

src/runtimeHelpers.ts

magicmark · 2021-05-11T18:29:25Z

src/runtimeHelpers.ts

+export function getBatchKeyForPartitionItems(
+    batchKey: string,
+    ignoreKeys: Array<string>,
+    items: ReadonlyArray<object>,
+): ReadonlyArray<ReadonlyArray<number>> {


Could partitionItems be re-used? What is this doing differently?

instead of returning the order, here we return the batchKey value of that item, the main difference is line 135. https://github.com/Yelp/dataloader-codegen/pull/261/files#diff-92b6f52e2c1b302cd9d15133831c595f151215aa1dfda274ff29fbd33d89a9b2R135

For example, if we have (the batch key here is bar_id and secondary batch key is property) :

{ bar_id: 100, property: 'name', include_extra_info: true }, { bar_id: 101, property: 'rating', include_extra_info: false }, { bar_id: 100, property: 'rating', include_extra_info: true },

function partitionItems will return [ [ 0, 2 ], [ 1 ] ]
while, function getBatchKeyForPartitionItems with return [ [ 100, 100 ], [ 101 ] ]
We use this information to help us match results object to requests. see https://github.com/Yelp/dataloader-codegen/pull/261/files#diff-92b6f52e2c1b302cd9d15133831c595f151215aa1dfda274ff29fbd33d89a9b2R349-R350

gyDBD · 2021-05-11T21:44:09Z

1. Merging Strategies

Will this support use cases as described in the original issue? #75 ?

Looks like there's different shapes for how secondary stuff (arguments/response) could be exposed. It could be an array that needs merging (your use case), or object properties (the use case described in the original issue description). Maybe even another strategy like boolean merging (e.g. if include_extra_info was your secondary batch key, you'd use "true" as the superset value to merge)

even if we don't support the original use case for now, given we've seen different "behaviours", I wonder if we want an option to control that (even if this this is the only behaviour we support right now) - something like secondaryMergeStrategy, or maybe it's just something we can infer by the types, and we don't need this key 🤔

@magicmark I believe I did cover the use cases as described in the original issue #75. See this unit test the response is like { id1: {rating: 3, name: 'Burger King'}} vs my use case {foo_id: id1, rating: 3, name: 'Burger King'}. So either the batchKey value is as a key or as a value in the response, it's covered. However, if the response doesn't contain batchKey in it, e.g. {rating: 3, name: 'Burger King'} that cannot be covered.

__tests__/implementation.test.js

src/config.ts

src/runtimeHelpers.ts

__tests__/implementation.test.js

…nto yueguo_testing

schema.json

src/implementation.ts

src/runtimeHelpers.ts

magicmark · 2021-05-21T19:31:53Z

src/runtimeHelpers.ts

+                            result = Object.assign(
+                                {},
+                                ...[newKey, propertyBatchKeyPartion[i][j]].map((key) => ({ [key]: element[key] })),
+                            );


this syntax is really confusing to me

we usually don't need to explicitly call Object.assign for merging - we can use normal spread syntax:

const foo = { ...myObj1, ...myObj2, };

I'm a bit more confused about what's going on here though.

> Object.assign({}, ...['hello', 'world'] ) { '0': 'w', '1': 'o', '2': 'r', '3': 'l', '4': 'd' }

:thonk:

Could we separate out these lines or something? this is making my brain hurt a bit

I searched how to only keep selected key-value pair in an object, https://stackoverflow.com/questions/54907549/keep-only-selected-keys-in-every-object-from-array and stackoverflow gave me this.

I'm still not following :/

Could you give a sample demo input/output of what's happening here?

To just keep newKey ('id') and the property ('name') from the response. So that we only return what we asked for.

For example:

var data = {id: '1', name: 'John', age: 42}; var keys_to_keep = ['id', 'name'] var data_new=Object.assign({}, ...keys_to_keep.map(key => ({[key]: data[key]}))); console.log(data_new)

output:

{ id: '1', name: 'John' }

Gotcha - thanks. (The simple example helped a lot!)

does https://lodash.com/docs/4.17.15#pick work?

the Object.assign({}, ... logic feels sufficiently tricky to warrant using the lodash method instead (I think we're already using lodash in here anyway)

src/runtimeHelpers.ts

examples/swapi/swapi-server.js

magicmark · 2021-06-02T23:10:38Z

examples/swapi/swapi.js

+        getFilmsV2: ({ film_ids, properties }) => {
+            return [


might be worth a comment to explain what's going on here / why v2 exists

magicmark · 2021-06-02T23:18:38Z

examples/swapi/swapi-server.js

+            const response = await swapiLoaders.getFilmsV2.load({ film_id: this.id, property: 'title' });
+


can you show the output of $ flow check when you try to read response.director?

hopefully it errors, which would verify the data masking is correct

I added console.log(response) there, and it only gives me film_id and title. I could also do the flow check.

I added console.log(response) and console.log(response.director).
got this

{ film_id: 4, title: 'The Phantom Menace' } undefined

data masking is correct, but flow doesn't complain, which is something I should fix.

I checked the response type just matches the SWAPI_Film_v2 type I defined here
which is

export type SWAPI_Film_v2 = $ReadOnly<{| film_id: number, title: string, episode_id: number, director: string, producer: string, |}>;

so it won't blow up when I do response.director.

Thanks!

I think we want flow to blow up here though don’t we, otherwise there’s a mismatch between what actually exists vs what flow thinks exists?

@magicmark I have trouble with generate correct flow type. Here's a try flow reproduce
Could you help me with that? I don't know how to make id not nullable while change other fields nullable.

On the other hand,

Hmm, here, we only get nullable because the swagger endpoint defines each as property so. That's just what the example endpoint does - there's no guarantee other endpoints will do that.

I think we can document supported resource must return nullable properties? If some endpoint require all properties be non-nullable, and we remove lots of properties in the dataloader-codegen, that seems wrong to me and we shouldn't support that case. So we require properties being nullable make sense to me.

there's a (subtle) difference between getting the superset of all possible null properties and the subset of known guaranteed non-null properties

So if I did:

const properties1 = loaders.fetchStuff({ properties: ['foo', 'bar'] }); const properties2 = loaders.fetchStuff({ properties: ['baz'] });

If I understand what you're saying correctly, you'd expect both properties1 and properties2 to have the following identical type:

{ foo?: string, bar?: string, baz?: string }

If we're being super nitpicky, this is wrong:

there's no way that properties1 should ever have baz present since we didn't request it. Flow is kinda lying to us.

Inproperties2, baz should always be present. Again, flow is kinda lying to us.

These aren't catastrophic problems. The stock price of the company won't go up or down as a result either way here :)

But again, as this is foundational infrastructure that hundreds of our resolvers use, I think we probably want to get the flow typings as "correct" as possible.

You commented this before

At a very minumum, we should mark all properties as nullable - the more correct thing to do would be to omit all non-requested properties from the response entirely.

So I thought just having all properties nullable is fine.
I have trouble with getting it as "correct" as possible.

we should mark all properties as nullable

Is that what the current PR does? Or does it rely on the endpoint already doing that for us? That's a key difference.

I think we can document supported resource must return nullable properties?

Is a reasonable alternative.

I'm happy to set up some time to pair on this to see if we can get flow to do the right thing? We can timebox it, and if not, fall back to your suggestion :)

Is that what the current PR does? Or does it rely on the endpoint already doing that for us? That's a key difference.

We rely on the endpoint returns nullable properties. And that's the reasonable alternative thing you commented on.

I'm happy to set up some time to pair on this to see if we can get flow to do the right thing? We can timebox it, and if not, fall back to your suggestion :)

Thanks!

magicmark

looking good! couple thoughts:

the examples only show one style of the property merging interface - given the complexity here, might be worth e2e testing all styles
can we verify the generated flow types are correct? (by showing it blowing up when you try to read a value we didn't request - a screenshot in the PR description would be good)
can we verify the runtime is correct, by doing the above and console.log() the bad value, and verify it's empty
do we need an extra config param to control what style of endpoint it is?

magicmark · 2021-06-03T00:35:47Z

Thanks! I think we want flow to blow up here though don’t we, otherwise there’s a mismatch between what actually exists vs what flow thinks exists

…

On Wed, Jun 2, 2021 at 4:48 PM Yue Guo ***@***.***> wrote: ***@***.***DBD* commented on this pull request. ------------------------------ In examples/swapi/swapi-server.js <#261 (comment)> : > + const response = await swapiLoaders.getFilmsV2.load({ film_id: this.id, property: 'title' }); + I checked the response type just matches the SWAPI_Film_v2 type I defined here <https://github.com/Yelp/dataloader-codegen/pull/261/files#diff-e461ae547272d09d3b85a44a7ab8de37f10cc4a08ae23a9ffb78b247b4425b7eR66-R73> which is export type SWAPI_Film_v2 = $ReadOnly<{| film_id: number, title: string, episode_id: number, director: string, producer: string, |}>; so it won't blow up when I do response.director. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#261 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAMELZHHPDHBWCJ4CG5A67DTQ27NRANCNFSM44WLEJMA> .

gyDBD added 7 commits May 7, 2021 14:02

testing

fd4a729

testing

553dd81

second -> secondar, and remove console.log

3b85fe3

add secondaryBatchKey

c283cca

add back all tests

f967006

rename some functions and add comments for new functions

23a427a

edit documentation for new functions

b286a39

gyDBD requested review from magicmark and kkellyy May 11, 2021 17:47

magicmark reviewed May 11, 2021

View reviewed changes

src/implementation.ts Outdated Show resolved Hide resolved

magicmark reviewed May 11, 2021

View reviewed changes

src/implementation.ts Outdated Show resolved Hide resolved

magicmark reviewed May 11, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

magicmark reviewed May 11, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

magicmark reviewed May 11, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

magicmark reviewed May 11, 2021

View reviewed changes

src/runtimeHelpers.ts Show resolved Hide resolved

magicmark reviewed May 11, 2021

View reviewed changes

__tests__/implementation.test.js Outdated Show resolved Hide resolved

magicmark reviewed May 11, 2021

View reviewed changes

__tests__/implementation.test.js Outdated Show resolved Hide resolved

gyDBD added 2 commits May 12, 2021 14:14

data masking

2d4a6bd

remove console.log in /examples/swapi

6116e05

kkellyy reviewed May 13, 2021

View reviewed changes

src/config.ts Outdated Show resolved Hide resolved

magicmark reviewed May 14, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

magicmark reviewed May 14, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

magicmark reviewed May 14, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

kkellyy reviewed May 14, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

gyDBD added 2 commits May 19, 2021 11:56

add error handling

88f8b6d

add cases that secondaryBatchKey is returned in a nested object

0a393f7

magicmark reviewed May 20, 2021

View reviewed changes

__tests__/implementation.test.js Outdated Show resolved Hide resolved

magicmark reviewed May 20, 2021

View reviewed changes

__tests__/implementation.test.js Outdated Show resolved Hide resolved

gyDBD added 4 commits May 20, 2021 10:37

change to propertyBatchKey and propertyNewKey

3fb6e34

Merge branch 'master' of https://github.com/Yelp/dataloader-codegen i…

f75e68f

…nto yueguo_testing

add propertyBatchKey and propertyNewKey to schema.json

a5e440f

add more tests to prove this work with other existing config

5d2b19d

magicmark reviewed May 21, 2021

View reviewed changes

schema.json Outdated Show resolved Hide resolved

magicmark reviewed May 21, 2021

View reviewed changes

src/implementation.ts Outdated Show resolved Hide resolved

magicmark reviewed May 21, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

magicmark reviewed May 21, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

magicmark reviewed May 21, 2021

View reviewed changes

src/runtimeHelpers.ts Outdated Show resolved Hide resolved

gyDBD added 2 commits May 21, 2021 15:56

fix some review feedback and add more tests

8366711

add more comments

15f5670

gyDBD marked this pull request as ready for review June 1, 2021 20:03

gyDBD added 2 commits June 2, 2021 11:39

preserve typescript typing.

a1a70dd

add star war end-to-end test

acf6fff

magicmark reviewed Jun 2, 2021

View reviewed changes

examples/swapi/swapi-server.js Outdated Show resolved Hide resolved

magicmark reviewed Jun 2, 2021

View reviewed changes

gyDBD added 7 commits June 4, 2021 13:20

add e2e tests for all three cases

bf485de

add comments and change return value to Promise

16d3180

update API_DOCS.md

812c846

add enum mergePropertyOptions to control what style of endpoint it is

92765fb

update swapi examples

7b08aa2

merge with master

1aae929

merge with master

c25816b

gyDBD mentioned this pull request Jul 1, 2021

add new config to batch properties calls #299

Merged

gyDBD closed this Jul 22, 2021

		const response = await swapiLoaders.getFilmsV2.load({ film_id: this.id, property: 'title' });

Add support for multiple batch keys #261

Add support for multiple batch keys #261

Uh oh!

Conversation

gyDBD commented May 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

issue: #75

Solution:

Testing

Uh oh!

magicmark commented May 11, 2021

1. Merging Strategies

2. Data Masking

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gyDBD May 11, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gyDBD commented May 11, 2021

1. Merging Strategies

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

magicmark Jun 1, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

magicmark Jun 1, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gyDBD Jun 2, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gyDBD Jun 2, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gyDBD commented May 11, 2021 •

edited

Loading

gyDBD May 11, 2021 •

edited

Loading

magicmark Jun 1, 2021 •

edited

Loading

magicmark Jun 1, 2021 •

edited

Loading

gyDBD Jun 2, 2021 •

edited

Loading

gyDBD Jun 2, 2021 •

edited

Loading

gyDBD Jun 16, 2021 •

edited

Loading

magicmark left a comment •

edited

Loading