[master < ] Add python & cpp batching #964

antoniofilipovic · 2023-06-06T10:19:27Z

[master < Task] PR

Check, and update documentation if necessary
Provide the full content or a guide for the final git message

To keep docs changelog up to date, one more thing to do:

Write a release note here
Tag someone from docs team in the comments

antoniofilipovic · 2023-06-19T14:16:27Z

Some decisions made along the way:

The initializer receives the same parameters as the main function of a batched procedure, so the type checks are done in mgp.py. The same parameters are sometimes needed for proper initialization, e.g. to read from a file.
In py_module.cpp there is an option not to invalidate the PyGraph object. It would let users cache vertices, edges, and other objects from storage and return them in batches from the procedure. There are two perspectives on this:
1. Tech perspective: we would need two different memory allocators, since objects that are meant to be cached must stay in memory until the batched function is called again. Currently everything is allocated from the same memory resource, which is reset to a clean state (memory deallocated) after each batch to reduce memory usage. The code has a lot of checks that the same memory resource is used for PyGraph and PyVertex, and these would need to be decoupled.
2. User story: batching works best for streams and for returning large objects, where it reduces memory usage. Vertices and edges are not large objects, so batching their return seems unnecessary. Also, to cache vertices the user would first need to fetch all of them from storage, which would cause a memory spike that seems unnecessary.
[DISCUSS] During the execution of a batched procedure there is an option to copy the shared pointer to the read lock on all modules, if we want the batched procedure not to be restarted (a restart requires the global lock). This seems to be more of a bug than a feature in regular procedures as well: a user could run MATCH (n) CALL proc.do_smth() and, during its execution, call mg.load(proc) from another session, which would cause an issue because the code only checks the size of the results. Return types are checked so that they match what was registered.

Josipmrden

Some comments but mostly good work

include/mgp.hpp

include/mgp.py

tests/e2e/batched_procedures/simple_read.py

src/query/plan/operator.cpp

tests/e2e/batched_procedures/procedures/batch_c_read.cpp

Josipmrden

Changes fixed. Minor comment, don't need to re-ask for review again

include/mgp.hpp

andrejtonev

Just a few comments.

src/query/plan/operator.cpp

src/query/plan/operator.hpp

src/query/procedure/mg_procedure_impl.cpp

src/query/procedure/py_module.cpp

antoniofilipovic · 2023-06-26T13:52:01Z

@vpavicic for release note:
Add an option to batch results returned from both Python and C++ procedures

I will shortly open a PR on docs for the procedures

antoniofilipovic added 3 commits June 5, 2023 18:29

add buildable batching procedure option

afed75d

add initializer on procedure with same arguments as proc

6b1a7e4

add mgp py with initializer

43c3e39

antoniofilipovic self-assigned this Jun 6, 2023

antoniofilipovic added 10 commits June 6, 2023 12:23

add _mgp.hpp

542dd38

fix bug on procedure.h

e24cf91

fix bugs on num of arguments and mgp.py wrong wrapper

243eb51

add working version

5d72e81

add working version of cleaning resurces

830653c

fix mgp add write for read

23f224a

add code reuse

7ec6142

add use pool resource for batched proc

7f7a75d

add e2e test for python batched procedure

232230d

fix operator to call cleanup after 1 batch, add python tests for read and write, add test not to store vertices

3508f48

antoniofilipovic and others added 3 commits June 20, 2023 17:02

add cpp api, add support in mgp.hpp and add tests in cpp

0df533a

add final touches on tests

5c2102f

Merge branch 'master' into MG-add-python-cpp-batching-procedure

4d43ae2

antoniofilipovic marked this pull request as ready for review June 21, 2023 09:02

antoniofilipovic requested review from Josipmrden and andrejtonev June 21, 2023 09:18

gitbuda added this to the mg-v2.9.0 milestone Jun 22, 2023

Josipmrden reviewed Jun 23, 2023

antoniofilipovic and others added 5 commits June 26, 2023 09:40

Merge branch 'master' into MG-add-python-cpp-batching-procedure

c274291

fix pr comments and failing test on replicas

c76550c

Merge branch 'MG-add-python-cpp-batching-procedure' of github.com:memgraph/memgraph into MG-add-python-cpp-batching-procedure

f5e6a70

add empty lines

d9a7d94

add empty lines

d7a17b4

antoniofilipovic requested a review from Josipmrden June 26, 2023 09:42

antoniofilipovic added the Ready for review label Jun 26, 2023

Merge branch 'master' into MG-add-python-cpp-batching-procedure

de32b0e

Josipmrden reviewed Jun 26, 2023

include/mgp.hpp

andrejtonev reviewed Jun 26, 2023

include/mgp.hpp

include/mgp.hpp

andrejtonev reviewed Jun 26, 2023

antoniofilipovic and others added 2 commits June 26, 2023 13:06

resolve pr comments and clang tidy fix

d1479b7

Merge branch 'master' into MG-add-python-cpp-batching-procedure

43e01a9

Josipmrden approved these changes Jun 26, 2023

fix clang tidy error

3242fc0

antoniofilipovic merged commit d573eda into master Jun 26, 2023
6 checks passed

antoniofilipovic deleted the MG-add-python-cpp-batching-procedure branch June 26, 2023 13:46

antoniofilipovic mentioned this pull request Jun 26, 2023

[master < T] Fix multiple include error #1043

Merged

4 tasks
