Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize memory used by the publish calls #289

Merged
merged 9 commits into from Apr 5, 2023
Merged

Conversation

jreidinger
Copy link
Member

@jreidinger jreidinger commented Mar 30, 2023

Problem

Agame and also yast consume quite bit of memory, so lets try to optimize it.

Solution

Add some profiling code and based on it, apply some optimization.

Optimization results ( testing on Leap15.4 so with ruby 2.5 ) just totals:

Original result:

Total allocated: 33846260 bytes (365250 objects)
Total retained:  5387014 bytes (47980 objects)

after first optimization:

Total allocated: 32843452 bytes (364422 objects)
Total retained:  5383110 bytes (47979 objects)

after second optimization:

Total allocated: 31132062 bytes (351286 objects)
Total retained:  4337427 bytes (41452 objects)

So result is roughly saved 2650 KiB of allocated and in persistent it is 1021 KiB on testing sample that just require yast and imports some yast modules.

@mvidner
Copy link
Member

mvidner commented Apr 3, 2023

Reproduced the failure with:

rm -rf build
mkdir build
cd build
cmake ..
make -j`nproc`
../tests/yast_spec.rb

@jreidinger jreidinger marked this pull request as ready for review April 3, 2023 10:47
@mvidner
Copy link
Member

mvidner commented Apr 3, 2023

BTW I've tried enabling frozen string literals to see what memory effect it has, but fast_gettext does not like it, starting here

$ ruby --enable=frozen-string-literal -r ../tests/test_helper.rb ../profiling/yast_import.rb  |& o
/usr/lib64/ruby/gems/2.5.0/gems/fast_gettext-1.6.0/lib/fast_gettext/vendor/mofile.rb:52:in `force_encoding': can't modify frozen String (FrozenError)
        from /usr/lib64/ruby/gems/2.5.0/gems/fast_gettext-1.6.0/lib/fast_gettext/vendor/mofile.rb:52:in `<class:MOFile>'
...

https://github.com/grosser/fast_gettext/blob/a4628d516cfd511bef3acc0762d453934f5a9acd/lib/fast_gettext/vendor/mofile.rb#L52

@mvidner
Copy link
Member

mvidner commented Apr 3, 2023

First, frozen string literals are possible if I apply this diff
--- /usr/lib64/ruby/gems/2.5.0/gems/fast_gettext-1.6.0/lib/fast_gettext/vendor/mofile.rb.orig   2023-04-03 14:03:16.711532621 +0200
+++ /usr/lib64/ruby/gems/2.5.0/gems/fast_gettext-1.6.0/lib/fast_gettext/vendor/mofile.rb        2023-04-03 14:06:38.496909306 +0200
@@ -46,11 +46,14 @@
         :trans_sysdep_tab_offset
       end
 
-      MAGIC_BIG_ENDIAN    = "\x95\x04\x12\xde"
-      MAGIC_LITTLE_ENDIAN = "\xde\x12\x04\x95"
+      mbe = "\x95\x04\x12\xde"
+      mle = "\xde\x12\x04\x95"
       if "".respond_to?(:force_encoding)
-        MAGIC_BIG_ENDIAN.force_encoding("ASCII-8BIT")
-        MAGIC_LITTLE_ENDIAN.force_encoding("ASCII-8BIT")
+        MAGIC_BIG_ENDIAN = mbe.dup.force_encoding("ASCII-8BIT")
+       MAGIC_LITTLE_ENDIAN = mle.dup.force_encoding("ASCII-8BIT")
+      else
+        MAGIC_BIG_ENDIAN = mbe
+       MAGIC_LITTLE_ENDIAN = mle
       end
 
       def self.open(arg = nil, output_charset = nil)

Then, as I was documenting it, I looked at your Exportable optimization and realized that it can be improved 😄

(I've added the under_scores manually)

before frozen path
Total allocated: 31_308_339 bytes (353285 objects)
Total retained: 4_718_083 bytes (46366 objects)

with frozen path: note that it does not affect the retained values
Total allocated: 31_289_098 bytes (352803 objects)
Total retained: 4_718_083 bytes (46366 objects)

with global string freeze (would need the above fix in fast_gettext)
Total allocated: 30_608_658 bytes (335792 objects)
Total retained: 4_685_883 bytes (45561 objects)

pairs instead of name+type hashes, without global string freeze
Total allocated: 31_116_547 bytes (352806 objects)
Total retained: 4_545_259 bytes (46366 objects)

pairs instead of name+type hashes, with global string freeze
Total allocated: 30_436_107 bytes (335795 objects)
Total retained: 4_513_059 bytes (45561 objects)

I really wonder how this performs on a real workload

instead of two-keyed hashes,
to further reduce memory consumption for this overhead
Copy link
Member

@mvidner mvidner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more improvements

profiling/yast_import.rb Show resolved Hide resolved
src/binary/YRubyNamespace.cc Show resolved Hide resolved
src/binary/YRubyNamespace.cc Show resolved Hide resolved
@mvidner
Copy link
Member

mvidner commented Apr 3, 2023

@jreidinger later, I think it is worth looking at the retained foo part of the memory_profiler report, but for a real workload, not just for an initial import like your test

@mvidner mvidner changed the title Optimize memory Optimize memory used by the publish calls Apr 3, 2023
Co-authored-by: Martin Vidner <mvidner@suse.cz>
@jreidinger jreidinger merged commit 9a50d95 into master Apr 5, 2023
2 checks passed
@jreidinger jreidinger deleted the optimize_memory branch April 5, 2023 10:40
@yast-bot
Copy link
Contributor

yast-bot commented Apr 5, 2023

✔️ Public Jenkins job #70 successfully finished
✔️ Created OBS submit request #1077500

@yast-bot
Copy link
Contributor

yast-bot commented Aug 7, 2023

✔️ Internal Jenkins job #58 successfully finished
✔️ Created IBS submit request #304735

@mvidner mvidner mentioned this pull request Sep 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants