too-large rpm hangs when processed #52

Open
trel opened this issue Sep 24, 2015 · 12 comments

Contributor

trel commented Sep 24, 2015

I have been working through automating our build process, and our existing development package is the only one that does not succeed when processed by prm. Investigating further, I generated test rpms to rule out anything specific to our package, and found that some size threshold is being crossed.

I am using EPM to generate RPM files.

                             Works         Hangs
EPM's test.list filesize     4,742,089     4,744,637
RPM's test.spec filesize     5,130,897     5,133,653
RPM filesize                 13,062,868    13,070,664

beginning and end of test.list

%product EPM testing
%copyright BSD
%vendor Testing <testing@example.org>
%license foo
%readme foo
%description Example description
%version 1
%format all

d 755 root root /usr/lib/epmtesting -
f 644 root root /usr/lib/epmtesting/foo1 foo
f 644 root root /usr/lib/epmtesting/foo2 foo
f 644 root root /usr/lib/epmtesting/foo3 foo
...
f 644 root root /usr/lib/epmtesting/foo96998 foo
f 644 root root /usr/lib/epmtesting/foo96999 foo
f 644 root root /usr/lib/epmtesting/foo97000 foo

instrumented prm and arr-pm output for hanging case

Built Yum repository for centos6
x86_64
1 write begin 16384
  write complete
  read begin
  reading...
  ---- SKIPPING
  read complete
2 write begin 16384
  write complete
  read begin
  reading...
  ---- SKIPPING
  read complete
3 write begin 16384
  write complete
  read begin
  reading...
  ---- SKIPPING
  read complete
4 write begin 16384
  write complete
  read begin
  reading...
  ---- SKIPPING
  read complete
5 write begin 9412

htop output for hanging case

11902 trel        20   0 1304M 1203M  3732 S  0.0  0.5  0:00.00 │  │  │  │  │     ├─ /usr/bin/ruby1.9.1 /usr/local/bin/prm -t rpm -p pool -r centos6 -a x86_64
11901 trel        20   0  4440   640   540 S  0.0  0.0  0:00.02 │  │  │  │  │     └─ sh -c xz -d | cpio -it --quiet
11904 trel        20   0  7304   624   524 S  0.0  0.0  0:00.00 │  │  │  │  │        ├─ cpio -it --quiet
11903 trel        20   0 12012  1084   752 S  0.0  0.0  0:00.00 │  │  │  │  │        └─ xz -d

strace output for hanging case

$ sudo strace -s 300 -p 11904
Process 11904 attached
write(1, "sr/lib/epmtesting/foo70470\n./usr/lib/epmtesting/foo70471\n./usr/lib/epmtesting/foo70472\n./usr/lib/epmtesting/foo70473\n./usr/lib/epmtesting/foo70474\n./usr/lib/epmtesting/foo70475\n./usr/lib/epmtesting/foo70476\n./usr/lib/epmtesting/foo70477\n./usr/lib/epmtesting/foo70478\n./usr/lib/epmtesting/foo70479\n./u"..., 4096

prm hangs here:
https://github.com/dnbert/prm/blob/master/lib/prm/rpm.rb#L242

The strace write() call is from the ruby-arr-pm library here:
https://github.com/jordansissel/ruby-arr-pm/blob/master/lib/arr-pm/file.rb#L210

It feels like a logic error once the filelist gets too long - but I've hit the edge of what I can diagnose by inspection. And of course, if this is a bug in arr-pm proper, apologies here.

Contributor Author

trel commented Sep 28, 2015

@jordansissel Do you have any insight into whether prm is calling arr-pm incorrectly, or this is an internal arr-pm bug?

Owner

dnbert commented Sep 29, 2015

Hi @trel sorry for my reply being so late!

I think this is a bug in arr-pm, but it might be due to this payload assignment here: https://github.com/jordansissel/ruby-arr-pm/blob/master/lib/arr-pm/file.rb#L208

How large is the list in the rpm? Any chance you could provide me with a sample rpm (something at the least similar to your RPM)?

Contributor Author

trel commented Sep 29, 2015

The list is 97k lines long:

$ rpm -qpl linux-2.6-x86_64/epmtest-1-linux-2.6-x86_64.rpm | wc -l
97001

first of the file list

$ rpm -qpl linux-2.6-x86_64/epmtest-1-linux-2.6-x86_64.rpm | head
/usr/lib/epmtesting
/usr/lib/epmtesting/foo1
/usr/lib/epmtesting/foo10
/usr/lib/epmtesting/foo100
/usr/lib/epmtesting/foo1000
/usr/lib/epmtesting/foo10000
/usr/lib/epmtesting/foo10001
/usr/lib/epmtesting/foo10002
/usr/lib/epmtesting/foo10003
/usr/lib/epmtesting/foo10004

last of the file list

$ rpm -qpl linux-2.6-x86_64/epmtest-1-linux-2.6-x86_64.rpm | tail
/usr/lib/epmtesting/foo9990
/usr/lib/epmtesting/foo9991
/usr/lib/epmtesting/foo9992
/usr/lib/epmtesting/foo9993
/usr/lib/epmtesting/foo9994
/usr/lib/epmtesting/foo9995
/usr/lib/epmtesting/foo9996
/usr/lib/epmtesting/foo9997
/usr/lib/epmtesting/foo9998
/usr/lib/epmtesting/foo9999

I've been generating an EPM list file with this script...

$ cat generate_test_epm.sh 
#!/bin/bash -e

NUMBEROFLINES=97000
# increase this until prm hangs
# NUMBEROFLINES=98000

# local file
FILENAME=foo
dd if=/dev/zero of=$FILENAME bs=1k count=10
echo "foo" > $FILENAME

# prepare preamble
echo -e "
%product EPM testing
%copyright BSD
%vendor Testing <testing@example.org>
%license foo
%readme foo
%description Example description
%version 1
%format all
"

# list of files
echo "d 755 root root /usr/lib/epmtesting -"
for (( i=1; i<=$NUMBEROFLINES; i++ )); do
    echo "f 644 root root /usr/lib/epmtesting/$FILENAME$i $FILENAME"
done

Then generating an RPM via EPM

$ ./generate_test_epm.sh > test.list
$ epm -k -f rpm epmtest test.list

The RPM will then be local

$ ls -l linux*/epmtest*
-rw-r--r-- 1 trel trel 13062880 Sep 29 14:33 epmtest-1-linux-2.6-x86_64.rpm
-rw-r--r-- 1 trel trel  5130897 Sep 29 14:32 epmtest.spec

A sample file can be found here for 30 days...

This will hang when processed by prm (run here with the new -d option, which should be irrelevant to the hang)

$ prm -t rpm -p pool -r centos6 -a x86_64 -d .

I've instrumented arr-pm with the following

$ git diff -w
diff --git a/arr-pm.gemspec b/arr-pm.gemspec
index 4645b97..d515237 100644
--- a/lib/arr-pm/file.rb
+++ b/lib/arr-pm/file.rb
@@ -204,17 +204,26 @@ class RPM::File
     end
     payload_fd = payload.clone
     output = ""
+    count = 0
     loop do
+      count += 1
       data = payload_fd.read(16384, buffer)
       break if data.nil? # listerextractor.write(data)
+      puts "#{count} write begin #{data.length}"
       lister.write(data)
+      puts "  write complete"

       # Read output from the pipe.
+      puts "  read begin"
       begin
+        puts "  reading..."
         output << lister.read_nonblock(16384)
       rescue Errno::EAGAIN
         # Nothing to read, move on!
+        puts "  ---- SKIPPING"
       end
+      puts "  read complete"
     end
     lister.close_write

Owner

dnbert commented Oct 1, 2015

Hey @trel I couldn't get an EPM build to work. I'll have an strace of the EPM error, but it essentially said it couldn't build the package. I'll try with an FPM build later!

Contributor Author

trel commented Oct 1, 2015

thanks, a large enough fpm-produced package should hit this as well.

in the meantime, we've got a local temporary workaround with rpm2cpio.
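
For anyone wanting a similar stopgap, here is a minimal sketch. How the workaround was actually wired up is not described in this thread, so the helper names `rpm_list_command` and `list_rpm_files` and the exact pipeline are assumptions. It lists the payload with `rpm2cpio` plus the same `cpio -it --quiet` call seen in the htop output, and lets backticks read the pipeline's output through to EOF, so no pipe is left undrained.

```ruby
require "shellwords"

# Hypothetical helper: build a shell pipeline that lists the files in an
# RPM payload without going through arr-pm.
def rpm_list_command(rpm_path)
  "rpm2cpio #{Shellwords.escape(rpm_path)} | cpio -it --quiet"
end

# Hypothetical helper: run the pipeline and return one entry per file.
# Backticks keep reading the child's stdout until it closes, so the child
# never stalls on a full pipe.
def list_rpm_files(rpm_path)
  `#{rpm_list_command(rpm_path)}`.lines.map(&:chomp)
end
```

Calling `list_rpm_files("epmtest-1-linux-2.6-x86_64.rpm")` requires `rpm2cpio` and `cpio` on the PATH.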

Owner

dnbert commented Oct 2, 2015

@trel definitely replicated with an fpm produced package. Hoping to dig in more tonight

Contributor Author

trel commented Oct 2, 2015

excellent news.

Owner

dnbert commented Oct 6, 2015

I've looked a bit more and it is definitely an issue on the arr-pm side, or at least that's where it's hitting. I tested a few different things, and after lowering the read string length I was able to build out a repo with an RPM that had 95000+ files in it. My guess is that the pipe for the IO object is not being flushed, or is too small, for the type of content we're throwing at it, but again that's a guess.

I tried various sizes for the string length: 8000, 1000, 500, 100, and 10.

I can't imagine @jordansissel will want to reduce the read string length from 16k to 1; generating the repository for a single package takes quite a bit of time at that size. I was able to build the repository after changing the string length to 1 and adding some stdout content (puts "test") to my arr-pm library.
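
The "too small" part of that guess can be checked directly. A minimal sketch, assuming a POSIX system and Ruby 2.1 or newer: it fills a pipe that nobody reads from using non-blocking writes, and counts the bytes accepted before the kernel reports the pipe is full. `pipe_capacity` is a hypothetical helper, not part of arr-pm or prm.

```ruby
# Hypothetical helper: measure how many bytes an unread pipe accepts.
def pipe_capacity(chunk_size = 4096)
  reader, writer = IO.pipe
  chunk = "x" * chunk_size
  total = 0
  loop do
    # With exception: false, a full pipe yields :wait_writable instead of
    # raising Errno::EAGAIN.
    written = writer.write_nonblock(chunk, exception: false)
    break if written == :wait_writable
    total += written
  end
  total
ensure
  reader.close if reader
  writer.close if writer
end

puts "pipe filled after #{pipe_capacity} bytes"
```

On Linux the default capacity is commonly 64 KiB, much smaller than a listing of 97k file names. A child process whose stdout is not drained would therefore block in write() once that much output is pending, which is consistent with the strace output above.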

Contributor Author

trel commented Oct 6, 2015

I was also able to get the original code to work with extra puts when trying to work out what was happening. Comment out the debugging, and it would hang again. It does feel like a flush-related thing.

Lowering the buffer size did not seem to help for me, even when set to 1.

Owner

dnbert commented Oct 13, 2015

@trel so while it does look like the pipe size is the issue on the read, there's not much more for me to go on, unfortunately. I've kinda tapped my expertise, but yeah, it's definitely an arr-pm issue.

Contributor Author

trel commented Oct 13, 2015

I'll file an issue and point back here. Thanks @dnbert


quanah commented Nov 18, 2015

Hm, I was interested in possibly using this for Zimbra, but this would be a total blocker. :(

davydotcom added a commit to davydotcom/ruby-arr-pm that referenced this issue Sep 7, 2016
…read buffer being too full to allow writes therefore blocking write
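
The cause named in that commit message, a full read buffer stalling the writer, can be avoided by never writing unless something is draining the child's output at the same time. Below is a minimal sketch of one such approach. It is not the code from the referenced commit: `filter_through` is a hypothetical helper, and `cat` stands in for the `xz -d | cpio -it --quiet` pipeline.

```ruby
require "open3"

# Hypothetical helper: feed `input` to a child process and return
# everything the child prints. A separate thread does the writing, so the
# calling thread is always free to drain the child's stdout and neither
# pipe can fill up and stall the other side.
def filter_through(command, input, chunk_size = 16384)
  Open3.popen2(*command) do |stdin, stdout, _wait_thread|
    writer = Thread.new do
      offset = 0
      while offset < input.bytesize
        stdin.write(input.byteslice(offset, chunk_size))
        offset += chunk_size
      end
      stdin.close # signal EOF so the child can finish
    end
    output = stdout.read # returns once the child closes its stdout
    writer.join
    output
  end
end

payload = "./usr/lib/epmtesting/foo\n" * 100_000 # about 2.5 MB
listing = filter_through(["cat"], payload)
puts listing.bytesize == payload.bytesize
```

A single-threaded loop that waits on both ends with `IO.select` would be another option. The thread keeps the sketch short, at the cost of an exception in the writer only surfacing at `writer.join`.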