query.map() and query.map_async() are no longer implemented #210

chmoder · 2019-09-24T18:45:57Z

Are the map functions something we want to add in the future? I have a use case that I can either refactor, or I can implement query.map* myself. I am guessing there was some low-level reason behind the choice not to implement them; if so, a comment explaining why would help prevent others from trying.


cguardia · 2019-09-26T16:55:29Z

Hi, sorry we took some time to answer. There's really no technical reason for not implementing this. We looked around and it seemed that it was not in use. We are open to considering re-adding this at some point in the beta process, or if you would like to try to implement it, we would be happy to review a PR.

chmoder · 2019-09-27T16:24:59Z

We will think about it. For now we went with this approach:

pool = Pool(50)
it = query.iter(keys_only=True)
items = pool.map(SomeClass.apply_some_values, it)
pool.close()

chrisrossi · 2019-09-27T18:53:49Z

@chmoder What is the Pool class from?

Are/were you using any of the arguments to Query.map besides the callback?

chmoder · 2019-09-27T20:42:38Z

from multiprocessing.pool import Pool
No, we were not using any other arguments in Query.map and Query.map_async.

I would be happy to hear any other suggestions you have if there is a better choice than Pool. The reason we used map here is to fetch the entities referenced by keys stored on the current entity (a relationship type of thing).

For example: putting the Car entity on the Person entity, but only at query time. This is because the Car entities change over time.

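from google.cloud import ndb

class Person(ndb.Model):
    car = ndb.KeyProperty(kind="Car")  # key of the related Car entity

class Car(ndb.Model):
    model = ndb.StringProperty()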

chrisrossi · 2019-10-01T20:31:09Z

Can I ask what you're doing in apply_some_values?

chmoder · 2019-10-01T20:50:28Z

The idea of apply_some_values is basically a way to set other entities on this one like a foreign key relationship.

The example above is contrived, but the point is that when getting a Person we also get that person's Car and set it on the Person instance.

chrisrossi · 2019-10-04T14:12:30Z

Hi @chmoder,

As Carlos says above, the main reason we didn't implement this is that we didn't think anyone was using it. (EDIT: And also because there was a large amount of supporting infrastructure just for this one feature.) Having learned otherwise, I don't see any reason why we shouldn't just go ahead and implement it.

In the meantime, I think your workaround is fine, but it achieves parallelism in a fundamentally different way than NDB does and is relatively expensive in terms of computing resources. Since these operations are going to be I/O bound rather than CPU bound, multiprocessing may not be as good a fit as a plain old-fashioned thread pool. In either case, I think NDB's coroutine-based, single-threaded parallelism is still going to give the best performance.

I will, provisionally, add this to our queue.
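For anyone weighing the thread pool suggestion, here is a minimal sketch of what it might look like using the standard library's concurrent.futures. It does not touch NDB at all: CARS, fetch_car, and the dict-based people records are hypothetical stand-ins for Datastore entities, and time.sleep stands in for network latency. Only the name apply_some_values comes from the thread; its body here is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for Datastore: car key -> car model.
CARS = {"car-1": "Civic", "car-2": "Model 3"}


def fetch_car(car_key):
    """Simulate an I/O-bound lookup (a real version would hit the datastore)."""
    time.sleep(0.05)  # stands in for network latency
    return CARS[car_key]


def apply_some_values(person):
    """Hypothetical callback: attach the referenced car to the person record."""
    return {**person, "car": fetch_car(person["car_key"])}


people = [
    {"name": "Ann", "car_key": "car-1"},
    {"name": "Bob", "car_key": "car-2"},
]

# Threads live in one process, so they are cheaper to start than
# multiprocessing workers and overlap well while blocked on I/O.
with ThreadPoolExecutor(max_workers=8) as executor:
    # executor.map yields results in the same order as the inputs.
    results = list(executor.map(apply_some_values, people))
```

Unlike multiprocessing.pool.Pool, nothing has to be pickled and sent to another process, which is part of why threads tend to be the lighter option for I/O-bound callbacks.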

chmoder · 2019-10-04T16:24:03Z

We will certainly switch to python-ndb native methods and test them when the PR for this issue comes through. I understand and appreciate your explanation; however, I am not sure I am familiar enough with NDB's "single threaded parallelism" to write the implementation myself. (EDIT: please feel free to ask if there is something we can do to help.)

In the meantime, the pool preserves the previous behavior while we refactor our projects. Thank you for the excellent work!
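For readers who are likewise unfamiliar with "single threaded parallelism", the general idea can be illustrated with the standard library's asyncio. This is only an analogy: NDB uses its own tasklets and event loop, not asyncio, and every name below except apply_some_values is a hypothetical stand-in. The sketch shows several simulated lookups waiting on I/O at the same time on a single thread.

```python
import asyncio

# Hypothetical in-memory stand-in for Datastore: car key -> car model.
CARS = {"car-1": "Civic", "car-2": "Model 3"}


async def fetch_car(car_key):
    """Simulate an I/O-bound lookup; awaiting lets other coroutines run."""
    await asyncio.sleep(0.05)  # stands in for network latency
    return CARS[car_key]


async def apply_some_values(person):
    """Hypothetical callback: attach the referenced car to the person record."""
    car = await fetch_car(person["car_key"])
    return {**person, "car": car}


async def map_people(people):
    # gather runs all the coroutines concurrently on one thread and
    # returns their results in the same order as the inputs.
    return await asyncio.gather(*(apply_some_values(p) for p in people))


people = [
    {"name": "Ann", "car_key": "car-1"},
    {"name": "Bob", "car_key": "car-2"},
]

results = asyncio.run(map_people(people))
```

Because every lookup yields control while it waits, the total wall-clock time is roughly that of one lookup rather than the sum of all of them, without starting extra threads or processes.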

chrisrossi · 2019-10-04T18:48:46Z

I'm already on the case. Thanks!

yoshi-automation added the "triage me" label Sep 25, 2019

cguardia added the "type: question" label Sep 26, 2019

yoshi-automation removed the "triage me" label Sep 26, 2019

chrisrossi self-assigned this Oct 4, 2019

chrisrossi mentioned this issue Oct 8, 2019

Implement Query.map and Query.map_async. #218

Merged

chrisrossi closed this as completed in #218 Oct 8, 2019
