Parallel object serialization #1714

Closed
larshp opened this issue Jul 29, 2018 · 11 comments

@larshp
Member

larshp commented Jul 29, 2018

Improve performance by serializing objects in parallel. This will require a function group, so I suggest not including this feature in the compiled report, only in the full development installation.

Prerequisite: larshp/abapmerge#56

Performance can be measured using https://github.com/abapGit/api_examples/blob/master/src/zabapgit_api_local_files.prog.abap

@larshp larshp added the new feature New feature or request label Jul 29, 2018
@larshp larshp self-assigned this Jul 29, 2018
@larshp
Member Author

larshp commented Jul 29, 2018

Notes:

test code in https://github.com/larshp/parallel_test

  • refactor logic in get_files to new class?
  • refactor cache handling into new method(s)
  • delete HAS_CHANGED_SINCE? See "Serialization Cache for Classes does not work" (#1021)
  • activate parallel when the number of objects is larger than x, activate via global setting?
  • general framework for parallel processing?
  • move responsibility for setting path and SHA1 into SERIALIZE; currently the caller does it:
      " serialize one object, then fill in path and SHA1 in the caller
      lt_files = zcl_abapgit_objects=>serialize(
        is_item     = ls_item
        iv_language = get_dot_abapgit( )->get_master_language( )
        io_log      = io_log ).
      LOOP AT lt_files ASSIGNING <ls_file>.
        <ls_file>-path = <ls_tadir>-path.
        <ls_file>-sha1 = zcl_abapgit_hash=>sha1(
          iv_type = zif_abapgit_definitions=>gc_type-blob
          iv_data = <ls_file>-data ).
      ENDLOOP.
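
A minimal sketch of the last note, written in Python for illustration and not taken from abapGit: the serialize step itself returns files with path and SHA1 already filled in, so the caller no longer loops over the result. It assumes the SHA1 is the standard Git blob object hash (header "blob <length>\0" followed by the data). The names SerializedFile, git_blob_sha1 and serialize are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class SerializedFile:
    # Hypothetical stand-in for one entry of lt_files.
    filename: str
    data: bytes
    path: str = ""
    sha1: str = ""


def git_blob_sha1(data: bytes) -> str:
    """Hash data the way Git hashes a blob: 'blob <len>\\0' + content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def serialize(raw_files, path):
    """Return files with path and SHA1 set, so callers need no extra loop.

    raw_files is an iterable of (filename, bytes) pairs; in the real code
    these would come from the object serializer.
    """
    return [
        SerializedFile(filename=name, data=data, path=path,
                       sha1=git_blob_sha1(data))
        for name, data in raw_files
    ]
```

With this shape, a worker process could return complete file entries, which is convenient when the results come back from parallel tasks.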

@larshp
Member Author

larshp commented Jul 29, 2018

@mkaesemann / @christianguenter2 try running the code in https://github.com/larshp/parallel_test, I have only ~1100 objects across my installed repos

Speedup is around 2x with 4 CPUs:
[image]

and around 3x with 8 CPUs:
[image]

@christianguenter2
Member

7.51 SP02 dev edition inside VM and docker container with one CPU

image

image

@christianguenter2
Member

Two CPUs
image

@mkaesemann
Contributor

I will try to do so as soon as possible.
The parallelization code I am currently using brings our repository down from 10 minutes to 1 minute, but that is in conjunction with other optimizations (many already submitted via PR) and with 27 usable work processes in the system.

@larshp
Member Author

larshp commented Jul 30, 2018

There are a few things that need to be reorganized in order to implement parallel serialization properly; the example is there to determine if it's worth the effort.

@larshp
Member Author

larshp commented Aug 5, 2018

Suggest deleting the time-based cache after/if parallel serialization is implemented:

mv_last_serialization

@larshp
Member Author

larshp commented Nov 20, 2018

Some numbers, reproduce with https://github.com/larshp/parallel_test, running on a big box

@joymike

4 threads:
image

10 threads:
image

@larshp
Member Author

larshp commented Nov 20, 2018

Follow progress here: https://github.com/larshp/parallel_test

The plan is to add a new class ZCL_ABAPGIT_SERIALIZE which takes care of the parallel processing.

@larshp
Member Author

larshp commented Nov 21, 2018

okay, more or less done

#2122 to be merged

then retrofit the code from https://github.com/larshp/parallel_test into abapGit core, and we'll have parallel serialization

larshp added a commit that referenced this issue Nov 22, 2018
larshp added a commit that referenced this issue Nov 23, 2018
* parallel serialization #1714

* use latest abaplint

* Update abaplint.json

* add function group

* fallback to sequential

* fix error when running in background mode
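
The threshold idea from the notes and the "fallback to sequential" commit can be sketched roughly as below. This is an illustration in Python, not the abapGit implementation: the threshold value, the function names and the use of a thread pool are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical value for the "x" in "number of objects larger than x".
PARALLEL_THRESHOLD = 100


def serialize_object(item):
    # Hypothetical placeholder for serializing a single object.
    return f"serialized:{item}"


def serialize_all(items, max_workers=4, threshold=PARALLEL_THRESHOLD):
    """Serialize items, in parallel only when it is likely to pay off.

    Returns a (mode, results) tuple; results keep the input order.
    """
    items = list(items)
    if max_workers > 1 and len(items) > threshold:
        try:
            with ThreadPoolExecutor(max_workers=max_workers) as pool:
                # pool.map yields results in the order of the inputs.
                return "parallel", list(pool.map(serialize_object, items))
        except RuntimeError:
            # Workers could not be started: fall through to sequential.
            pass
    return "sequential", [serialize_object(item) for item in items]
```

Small repositories stay on the sequential path, so the overhead of starting workers is only paid when there are enough objects to benefit.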
@larshp larshp closed this as completed Nov 23, 2018
@joymike

joymike commented Nov 23, 2018

Awesome. Will try it out next week. Great stuff
