pkgin segfaults on upgrade #1

Closed
ghost opened this issue May 29, 2013 · 28 comments

Comments

@ghost

ghost commented May 29, 2013

I created a binary package using pkg_create and added to the repo.

pkg_info -X >> pkg_summary was used to update the summary file.

pkgin install was used to install the package.

All worked fine.

pkgin upgrade caused a segfault:

[root@2f52fae8-983d-4425-a8bd-29f929cc995e /var/db/pkgin]# mdb /opt/local/bin/pkgin core
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]

$C
fffffd7fffdffa20 libc.so.1`strcmp+0x16()
fffffd7fffdffa40 pkgin_upgrade+0x36()
fffffd7fffdffb20 main+0x55f()
fffffd7fffdffb30 _start+0x6c()

truss:

67017: read(4, 0x006AE018, 1024) = 1024
67017: \r\0\0\005\0 !\003 302 m01B0\0DA\0 !\0\0\0\0\0\0\0\0\0\0\0\0\0\0
67017: \081 61E12\0 7 '1B E1D1D [15\0 /\01F17\017\0 p 5 - N e t - L i b
67017: I D N - 0 . 1 2 n b 4 p 5 - N e t - L i b I D N 0 . 1 2 n b 4 P
67017: e r l b i n d i n g s f o r G N U L i b i d n a r t i s
67017: t i c 2 0 0 9 1 1 1 5 h t t p : / / s e a r c h . c p a n . o r
67017: g / d i s t / N e t - L i b I D N / 5 . 1 1 n e t / p 5 - N e t
67017: - L i b I D N n e t p e r l 5 5 4 5 5 4 S u n O S81 S1D12\0 ;
67017: -19 g\01D m15\0 ?\01D17\017\0 p k g _ i n s t a l l - i n f o -
67017: 4 . 5 n b 3 p k g _ i n s t a l l - i n f o 4 . 5 n b 3 S t a n
67017: d a l o n e G N U i n f o f i l e i n s t a l l a t i o
67017: n u t i l i t y 2 0 0 9 1 1 1 5 h t t p : / / w w w . g n u .
67017: o r g / s o f t w a r e / t e x i n f o / t e x i n f o . h t m
67017: l 5 . 1 1 p k g t o o l s / p k g _ i n s t a l l - i n f o p k
67017: g t o o l s 3 2 4 4 8 S u n O S81 :1C12\0 9 # ! e !1D 315\0 /\0
67017: 1F1F\017\0 s c m g i t - b a s e - 1 . 8 . 0 . 1 n b 1 s c m g i
67017: t - b a s e 1 . 8 . 0 . 1 n b 1 G I T T r e e H i s t o r y
67017: S t o r a g e T o o l ( b a s e p a c k a g e ) g n u -
67017: g p l - v 2 2 0 0 9 1 1 1 5 h t t p : / / g i t - s c m . c o m
67017: / 5 . 1 1 d e v e l / s c m g i t - b a s e d e v e l s c m 2
67017: 0 7 2 7 8 0 3 8 S u n O S81 C1B12\0 )151F ] 91D 515\0 ! / -1D\0
67017: 17\0 p e r l - 5 . 1 6 . 2 n b 2 p e r l 5 . 1 6 . 2 n b 2 P r a
67017: c t i c a l E x t r a c t i o n a n d R e p o r t L a n
67017: g u a g e g n u - g p l - v 2 O R a r t i s t i c 2 0 0 9 1
67017: 1 1 5 h t t p : / / w w w . p e r l . o r g / 5 . 1 1 l a n g /
67017: p e r l 5 6 4 b i t a u t o t h r e a d s l a n g d e v e l
67017: p e r l 5 6 1 5 8 3 5 3 5 S u n O S81 J1A12\0 5 #1D y %1D 915
67017: \0 5171D1B\017\t p k g _ i n s t a l l - 2 0 1 2 0 2 2 1 p k g _
67017: i n s t a l l 2 0 1 2 0 2 2 1 P a c k a g e m a n a g e m e n
67017: t a n d a d m i n i s t r a t i o n t o o l s f o r p
67017: k g s r c m o d i f i e d - b s d 2 0 0 9 1 1 1 5 h t t p : / /
67017: w w w . p k g s r c . o r g / 5 . 1 1 p k g t o o l s / p k g _
67017: i n s t a l l i n e t 6 p k g t o o l s 2 2 5 7 5 1 2 S u n O S
67017: brk(0x006C8000) = 0
67017: Incurred fault #6, FLTBOUNDS %pc = 0xFFFFFD7FFF1782D6
67017: siginfo: SIGSEGV SEGV_MAPERR addr=0x00000000
67017: Received signal #11, SIGSEGV [default]
67017: siginfo: SIGSEGV SEGV_MAPERR addr=0x00000000

FIX:

pkgin unkeep <package>

pkgin upgrade

--> works?!?!

@iMilnb
Contributor

iMilnb commented Jun 16, 2013

What version of pkgin are you running? On which platform?

@Licenser

I'm having the same issue with 0.6.0nb1 on SmartOS. For me it seems to be related to having multiple repositories. The issue is also documented in TritonDataCenter/pkgsrc#32, with a crash dump attached.

Cheers,
Heinz

@ghost
Author

ghost commented Jul 19, 2013

Yum - multiple repos would be nice but I rsync to my own location.


@iMilnb
Contributor

iMilnb commented Jul 21, 2013

@Licenser @khushil well yes, multiple repos is a feature that's not actually implemented. I plan to work on it at some point, as it is the #1 requested feature as of now :)
Thanks for the feedback, I'll update this thread whenever the work really starts on this subject :)

@jacques

jacques commented Sep 14, 2013

Multiple repositories work fine until pkgin update no longer works. Deleting the pkgin.db file and running pkgin up brings no joy. Imil, any idea as to when multiple repository support will work correctly out of the box?

@ghost
Author

ghost commented Sep 14, 2013

This is actually a big bugbear. Also, I get random core dumps when upgrading packages created using pkg_create, though this may be because there is scant information on what must be included in build-info...


@ghost
Author

ghost commented Sep 15, 2013

iMil also, the problem seems to be 'pkgin upgrade' when the packages I build are marked as keepable. If I mark them non keepable the upgrades work fine.

@Licenser

Given this is a bigger problem for me (or Project FiFo), I've opened a Bountysource bounty for this issue:

https://www.bountysource.com/issues/914106-pkgin-segfaults-on-upgrade

Cheers,
Heinz

@bahamat

bahamat commented Dec 14, 2013

@khushil I'm getting bit by this too, but marking the package unkeep works.

@iMilnb
Contributor

iMilnb commented Apr 9, 2014

All, I've changed the way remote packages are looked up when multiple repositories are declared. That may have a positive effect on the bug you're witnessing. Could anyone give the master/GitHub version of pkgin a try? Thanks.

@bahamat

bahamat commented Apr 20, 2014

Is there an easy way to pick up these changes for testing on SmartOS?

@iMilnb
Contributor

iMilnb commented Dec 27, 2014

No news from reporters for 6+ months, closing this issue.

@iMilnb iMilnb closed this as completed Dec 27, 2014
@bahamat

bahamat commented Dec 27, 2014

Sorry, I thought I had commented.

I did manage to get this compiled to test on SmartOS. As of the last time I tested I still got segfaults even with these changes.

@bahamat

bahamat commented Dec 27, 2014

I just tried again with the most recent code. Still segfaults.

@iMilnb iMilnb reopened this Dec 27, 2014
@iMilnb
Contributor

iMilnb commented Dec 27, 2014

Ok, could you paste here a scenario that fails so I can reproduce it? I've seen that report TritonDataCenter/pkgsrc#32 but I can't reproduce it as I don't have an x86_64 zone.

@bahamat

bahamat commented Dec 27, 2014

After a bit of experimentation I discovered that the segfault only occurs if an installed package has the same name as a package available from the pkgsrc repository. This happens whether the package is installed with pkg_add or additional repositories are added to etc/pkgin/repositories.conf.

To demonstrate I created this Makefile to build a fake package named AnonymousPro. The fake package merely contains a single test file. AnonymousPro from the pkgsrc repository contains a truetype font. The package name isn't important so long as it overlaps with an existing package. I chose AnonymousPro because it won't really break anything if the package contents are replaced.

AnonymousPro-2.001fake1.tgz: build-info pkls comment desc r/tmp/testfile
        pkg_create -B build-info -c comment -d desc -I / -f pkls -U $@

build-info:
        pkg_info -X pkg_install | egrep '^(MACHINE_ARCH|OPSYS|OS_VERSION|PKGTOOLS_VERSION)' > $@

pkls: r/tmp/testfile
        echo '@cwd r' > $@
        ( cd r ; find . -type f ) >> $@

r/tmp/testfile:
        mkdir -p r/tmp
        echo "Test" > $@

comment:
        echo "Test Package" > $@

desc:
        echo "Test Package" > $@

.PHONY: test
test:
        pkg_add ./AnonymousPro-2.001fake1.tgz
        pkg_info | grep AnonymousPro-2.001fake1
        pkgin up -f
        pkgin ls | grep AnonymousPro-2.001fake1
        pkgin fug

.PHONY: clean
clean:
        rm -r r build-info pkls comment desc AnonymousPro-2.001fake1.tgz

This Makefile will reliably reproduce the crash on a clean SmartOS 14.3.0 image without installing any additional software. A free dev tier instance in the Joyent cloud can be provisioned for testing. If you've already used up your dev-tier let me know and I'll see if I can get you some billing credits (since I'm a Joyent employee).

To crash, run:

bmake
bmake test

And after a bit of digging through the source, I think the crash happens in actions.c in narrow_match() at line 723, 724 or 730.

Here's a stack trace of the core:

[root@wasp (barovia) /zones/592c6444-67ac-ec04-db16-d4f85fe76e35/cores]# pstack core.pkgin.87167
core 'core.pkgin.87167' of 87167:   ./pkgin fug
 fffffd7fff189ed6 strcmp () + 16
 0000000000410494 pkgin_upgrade () + a4
 000000000041f39e main () + 5be
 000000000040be3c _start () + 6c

@iMilnb
Contributor

iMilnb commented Dec 27, 2014

Wow, now this is a perfect report! I'll definitely be able to work with this :)
Thanks a lot!

@iMilnb
Contributor

iMilnb commented Dec 28, 2014

So I figured out that this particular segfault is caused by PKGPATH being NULL for this fake package. I secured the strcmp calls, but please note that malformed packages like this one can lead to more inconsistencies.
Check out the latest commit for the fix.
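
For readers following along, "securing" a strcmp call against a NULL PKGPATH can be sketched roughly as below. This is only an illustration, not the code from the actual commit: the helper name `safe_strcmp` and the ordering it assigns to NULL values are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not pkgin's actual code): a strcmp() that
 * tolerates NULL arguments. Passing NULL to strcmp() is undefined
 * behaviour and typically ends in SIGSEGV, as in the traces above.
 *
 * Assumed ordering: NULL equals NULL, and NULL sorts before any string.
 */
int
safe_strcmp(const char *a, const char *b)
{
	if (a == NULL || b == NULL) {
		if (a == b)
			return 0;               /* both NULL */
		return (a == NULL) ? -1 : 1;    /* exactly one NULL */
	}
	return strcmp(a, b);
}
```

A caller that only cares about equality can test `safe_strcmp(x, y) == 0` without first checking either pointer.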

@mamash

mamash commented Dec 28, 2014

Emile, this more or less confirms my suspicions. Segfaults were often reported when packages were involved that had been manually constructed using pkg_create. Most people would likely have used jperkin's guide here:

http://www.perkin.org.uk/posts/creating-local-smartos-packages.html

Creating packages this way is most attractive to 3rd party vendors (possibly even those who used to provide SVR4 packages for closed source software), as they do not want to carry the burden of a full pkgsrc building tree and actually build static binaries with no non-platform dependencies.

@iMilnb
Contributor

iMilnb commented Dec 28, 2014

Hmm, makes sense then, but I fear this could lead to a couple of upgrade issues: PKGPATH is used to check that there's no abusive automatic upgrade, e.g. mysql 5.1 -> 5.5.
Is this packaging method widely used at Joyent? If yes, I'd rather add a specific !NULL check.

Update

I've added a specific check for those packages that will not prevent their upgrade. Not sure whether PKGPATH being NULL will bite somewhere else, though...

@bahamat

bahamat commented Dec 28, 2014

I use it for building newer binaries for existing packages. E.g., Cfengine in pkgsrc is 3.4.2, but the latest is 3.6.3 and as of 3.6 depends on lmdb which isn't in pkgsrc.

I tried creating an lmdb package and updating Cfengine the pkgsrc way, but I wasn't able to get it working properly. Using jperkin's guide, however, provides usable packages. jperkin also provides a tutorial for creating a repository, which I also do. I used to add my private repo to /opt/local/etc/pkgin/repositories.conf until I realized that I was getting segfaults. I switched to using pkg_add http://repo/package instead, but since my package names still overlap I'm still getting segfaults.

So, is there something about jperkin's guide that can be updated so that packages will get a proper PKGPATH? Or, if there were better tutorials (or perhaps easier-to-find tutorials?) about creating/updating pkgsrc packages, then we wouldn't need to update them outside of pkgsrc. Can you recommend a good tutorial on pkgsrc?

I'll also try out the new pkgin as soon as I can get to it.

@bahamat

bahamat commented Dec 28, 2014

I've confirmed that the new version doesn't segfault on full-upgrade anymore. But I am still interested in seeing if there's something else or better we should be doing.

@mamash

mamash commented Dec 28, 2014

I'd say that bahamat's usage above is not exactly legitimate, but understandable given pkgsrc's steep learning curve. The proper approach there would of course be to import lmdb into pkgsrc and update cfengine.

However, it makes sense for closed source software vendors if they feel the need to provide SmartOS binaries. I know Basho does release Riak this way, although they mess things up in other ways too, and we prefer to build Riak ourselves (based on wip/riak).

We can certainly suggest that people put some foo/bar nonsense as PKGPATH, but maybe it's just as easy to have pkgin assume a non-match if PKGPATH is empty?
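
The "assume a non-match" idea could look roughly like the following sketch. It is not pkgin's implementation: `struct pkg_entry`, `pkgpath_match`, and the PKGPATH values in the comments are hypothetical, and pkgin's real data structures differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified package record used only for illustration. */
struct pkg_entry {
	const char *name;    /* e.g. "mysql-server" */
	const char *pkgpath; /* e.g. "databases/mysql51-server", or NULL */
};

/*
 * Return 1 only when both records carry the same non-empty PKGPATH.
 * A missing (NULL) or empty PKGPATH counts as "no match" and is never
 * handed to strcmp(). Both pointers must refer to valid records.
 */
int
pkgpath_match(const struct pkg_entry *local, const struct pkg_entry *remote)
{
	if (local->pkgpath == NULL || local->pkgpath[0] == '\0')
		return 0;
	if (remote->pkgpath == NULL || remote->pkgpath[0] == '\0')
		return 0;
	return strcmp(local->pkgpath, remote->pkgpath) == 0;
}
```

Under this policy a package built by hand without a PKGPATH never matches a repository package of the same name, and two packages that share a name but come from different PKGPATHs (the mysql 5.1 -> 5.5 case mentioned earlier) do not match either. Whether a non-match should block the upgrade or merely skip the PKGPATH check is a separate policy decision that this sketch does not settle.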

@iMilnb
Contributor

iMilnb commented Dec 28, 2014

@bahamat the latest commit should not crash anymore when pkgin fug'ing with NULL PKGPATH packages, let me know how it goes!

@bahamat

bahamat commented Dec 28, 2014

@iMilnb Confirmed, does not crash anymore on fug.

@iMilnb
Contributor

iMilnb commented Dec 28, 2014

@mamash indeed, I've added what's needed today in order to ignore those packages. It is also already available in wip/pkgin.

@iMilnb
Contributor

iMilnb commented Dec 28, 2014

@bahamat yay! thanks for the report. I'll close that issue unless anything shows up.

@iMilnb iMilnb closed this as completed Dec 28, 2014
@Licenser

Licenser commented Apr 1, 2015

bahamat added a commit to bahamat/basic-pkgsrc-repo that referenced this issue Sep 16, 2019