Memory leak in update_all_products.pl #2053

Closed
stephanegigandet opened this issue Jul 4, 2019 · 9 comments
Labels
🐛 bug This is a bug, not a feature request.

Comments

@stephanegigandet
Contributor

stephanegigandet commented Jul 4, 2019

There are memory leaks when updating products with update_all_products.pl (not through Apache and mod_perl).

When the process starts, after the taxonomies are loaded:

 PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
6835 off  30 10  855956 665756 11352 R 100.0  2.0 0:21.84 update_all+

after 5 minutes:

 PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
6835 off  30 10 1177888 987596 11352 R 100.0  3.0 2:11.81 update_all+

another 5 minutes:

 PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
6835 off  30 10 1531100 1.279g 11352 R  99.0  4.1 4:14.73 update_all+

The process eventually gets killed, or we run out of memory and swap.
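A rough check of the growth rate from the three top samples above. This is only a sketch: it assumes RES is in top's default KiB units (with "1.279g" meaning GiB) and that the samples are 5 minutes apart, as stated.

```perl
use strict;
use warnings;

# Resident memory (RES) from the three samples, in KiB.
# Assumption: top reports KiB by default and "1.279g" is GiB.
my @res_kib = (665_756, 987_596, 1.279 * 1024 * 1024);
my $interval_min = 5;    # minutes between samples, as stated in the report

my @growth_mib;          # growth per interval, in MiB
for my $i (1 .. $#res_kib) {
    push @growth_mib, ($res_kib[$i] - $res_kib[$i - 1]) / 1024;
}

for my $g (@growth_mib) {
    printf "+%.0f MiB in %d min (about %.0f MiB/min)\n",
        $g, $interval_min, $g / $interval_min;
}
```

Both intervals come out at a few hundred MiB, so the growth rate looks roughly constant, which would be consistent with memory leaking on every product processed rather than a one-off allocation.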

@stephanegigandet stephanegigandet added the 🐛 bug This is a bug, not a feature request. label Jul 4, 2019
@stephanegigandet
Contributor Author

Trying Devel::MAT

https://tech.binary.com/tracing-perl-memory-leaks-with-devel-mat/

I'm testing update_all_products.pl with 1 product updated, 10 products, and 100 products. The heap dump does grow with the number of products:

-rw-r--r-- 1 off off 477508290 Jul 4 17:47 file.pmat
-rw-r--r-- 1 off off 485072433 Jul 4 18:00 file50.pmat
-rw-r--r-- 1 off off 521910527 Jul 4 18:01 file100.pmat
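For reference, a heap dump like the .pmat files above can be written from inside a script with Devel::MAT::Dumper. A minimal sketch, assuming the module is installed from CPAN; the helper name dump_heap and the file name in the comment are hypothetical, and this is not necessarily how the dumps in this issue were produced.

```perl
use strict;
use warnings;

# Hypothetical helper: write a heap dump to $path if Devel::MAT::Dumper
# is available. Returns 1 if a dump file was written, 0 otherwise.
sub dump_heap {
    my ($path) = @_;

    # Load the dumper lazily so the script still runs without the module.
    return 0 unless eval { require Devel::MAT::Dumper; 1 };

    Devel::MAT::Dumper::dump($path);
    return -e $path ? 1 : 0;
}

# Example: dump once after processing a batch of products.
# dump_heap("file100.pmat");
```

The resulting file can then be opened with the pmat shell that ships with Devel::MAT to run commands such as count and largest.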

with file50.pmat:

pmat [more]> count
Kind       Count  (blessed)      Bytes  (blessed)
ARRAY     215105         26   21.3 MiB    2.3 KiB
CODE       16455         10    2.0 MiB    1.2 KiB
GLOB       29074          2    4.2 MiB  304 bytes
HASH      304464       4423  177.3 MiB    1.2 MiB
INVLIST      435               1.1 MiB
IO            33         33    5.2 KiB    5.2 KiB
PAD        13950               4.5 MiB
REF       519553              11.9 MiB
REGEXP      3305        197  723.0 KiB   43.1 KiB
SCALAR   3678287         17  200.9 MiB    1.2 KiB
STASH       1919               1.3 MiB

(total)  4782580       4708  425.3 MiB    1.3 MiB

pmat [more]> largest
HASH(552404) at 0x556666d6db70=strtab: 37.7 MiB
SCALAR(PV) at 0x5566680b32a8: 4.1 MiB
HASH(89343) at 0x5566674ed120: 3.0 MiB
HASH(53763) at 0x556686a328f8: 1.7 MiB
HASH(45280) at 0x5566674f69d8: 1.5 MiB
others: 377.2 MiB

with file100.pmat:

pmat> count
Kind       Count  (blessed)      Bytes  (blessed)
ARRAY     215131         26   21.4 MiB    2.3 KiB
CODE       16455         10    2.0 MiB    1.2 KiB
GLOB       29075          2    4.2 MiB  304 bytes
HASH      304475       4424  177.3 MiB    1.2 MiB
INVLIST      436               1.1 MiB
IO            33         33    5.2 KiB    5.2 KiB
PAD        13950               4.5 MiB
REF       519572              11.9 MiB
REGEXP      3311        201  724.3 KiB   44.0 KiB
SCALAR   4286054         17  249.0 MiB    1.2 KiB
STASH       1919               1.3 MiB

(total)  5390411       4713  473.5 MiB    1.3 MiB

pmat> largest
HASH(552425) at 0x55bcd4fb3b70=strtab: 37.7 MiB
SCALAR(PV) at 0x55bcd62f5ca8: 4.1 MiB
HASH(89343) at 0x55bcd5734790: 3.0 MiB
HASH(53763) at 0x55bcf4c78410: 1.7 MiB
HASH(45280) at 0x55bcd573e048: 1.5 MiB
others: 425.4 MiB
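Comparing the two count outputs kind by kind makes the pattern easier to see. The sketch below only subtracts the object counts copied from the file50.pmat and file100.pmat tables above; the variable names are illustrative.

```perl
use strict;
use warnings;

# Object counts per kind, copied from the two "count" outputs above.
my %count_50 = (
    ARRAY => 215105, CODE => 16455, GLOB   => 29074, HASH   => 304464,
    INVLIST => 435,  IO   => 33,    PAD    => 13950, REF    => 519553,
    REGEXP => 3305,  SCALAR => 3678287, STASH => 1919,
);
my %count_100 = (
    ARRAY => 215131, CODE => 16455, GLOB   => 29075, HASH   => 304475,
    INVLIST => 436,  IO   => 33,    PAD    => 13950, REF    => 519572,
    REGEXP => 3311,  SCALAR => 4286054, STASH => 1919,
);

# Difference per kind between the two dumps.
my %delta = map { $_ => $count_100{$_} - $count_50{$_} } keys %count_50;

# Print the kinds that grew, largest growth first.
for my $kind (sort { $delta{$b} <=> $delta{$a} || $a cmp $b } keys %delta) {
    next unless $delta{$kind};
    printf "%-8s %+d\n", $kind, $delta{$kind};
}
```

Almost all of the growth is in SCALAR: 607767 of the 607831 additional objects, and about 48 MiB (200.9 MiB to 249.0 MiB) of the roughly 48 MiB total increase. The other kinds are essentially flat.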

@stephanegigandet
Contributor Author

The leak seems to come from the spellcheck of unknown ingredients, which checks whether they are additives. Dump sizes with different steps enabled or disabled:

-rw-r--r-- 1 off off 521900462 Jul 4 19:36 file100.pmat.process_ingredients
-rw-r--r-- 1 off off 470279244 Jul 4 21:35 file100.pmat.extract_ingredients
-rw-r--r-- 1 off off 521891103 Jul 4 21:36 file100.pmat.extract_ingredients_plus_classes
-rw-r--r-- 1 off off 521867238 Jul 4 21:37 file100.pmat.extract_ingredients_classes
-rw-r--r-- 1 off off 470267487 Jul 4 21:50 file100.pmat.no_spellcheck

@stephanegigandet
Contributor Author

More precisely, this call:

my $tagid = get_fileid($candidate);

(called on a very large number of candidates)

@stephanegigandet
Contributor Author

Maybe a Perl 5.24 issue: https://rt.perl.org/Public/Bug/Display.html?id=130254

5.24.1 is used on the dev server and production.

@stephanegigandet
Contributor Author

stephanegigandet commented Jul 4, 2019

# cat /etc/debian_version
9.8
# apt-get install perl
perl is already the newest version (5.24.1-3+deb9u5).

@stephanegigandet
Contributor Author

We can replace some of the regular expressions in get_fileid with tr///, which should also be faster.

It makes the memory usage go down:

-rw-r--r-- 1 off off 469421679 Jul 4 23:32 file100.pmat.new_get_fileid_with_
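To illustrate the kind of change meant here, the sketch below shows a character-class substitution rewritten with tr///. It is not the actual get_fileid code: both functions and the set of characters they normalize are hypothetical, chosen only to show the technique.

```perl
use strict;
use warnings;

# Hypothetical regex-based normalization (NOT the real get_fileid).
sub fileid_regex {
    my ($s) = @_;
    $s = lc $s;
    $s =~ s/[ _.']/-/g;    # separators become hyphens
    $s =~ s/-+/-/g;        # collapse runs of hyphens
    $s =~ s/^-//;          # trim a leading hyphen
    $s =~ s/-$//;          # trim a trailing hyphen
    return $s;
}

# Same mapping with the two character-level substitutions done by tr///.
sub fileid_tr {
    my ($s) = @_;
    $s = lc $s;
    # Map each listed character (and '-' itself) to '-'; the /s modifier
    # squeezes runs of the resulting '-' into a single one.
    $s =~ tr/ _.'-/-/s;
    $s =~ s/^-//;          # anchored trims stay as regexes in this sketch
    $s =~ s/-$//;
    return $s;
}

for my $candidate ("Sodium  Benzoate", "_E.330_", "a - b") {
    printf "%-20s %s\n", $candidate, fileid_tr($candidate);
}
```

tr/// works on a fixed character table instead of going through the regex engine, which is why it can replace simple per-character substitutions. Patterns that need anchors or alternation still have to stay as regular expressions.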

stephanegigandet added a commit that referenced this issue Jul 5, 2019
…ory-leak

replace regexps by tr, alleviate regexp memory leak, bug #2053
@VaiTon
Member

VaiTon commented Jul 5, 2019

[...] but 'no warnings' pragma eliminates memory leak.
https://rt.perl.org/Public/Bug/Display.html?id=130254

@VaiTon
Member

VaiTon commented Jul 5, 2019

Same discussion:
[..] but I note that when I uncommented 'no warnings' in the block, I did not eliminate *all* memory leaks.

@stephanegigandet
Contributor Author

@VaiTon : thanks! I'm not sure we want to disable the warnings though.
The fix implemented seems to have removed the bulk of the memory leaks, so I'll close this.
