Document how to just get the 'medicinal' effects of strace with no overhead #14

Closed
jidanni opened this issue Sep 27, 2017 · 34 comments

Comments

@jidanni

jidanni commented Sep 27, 2017

Users trying to find out why a program often segfaults, but sometimes
works, frequently turn to strace.

But when testing under strace they find the program always works.

Therefore they write a wrapper script that just does

strace program 2>/dev/null

enabling their program to always work. Original bug not solved, but at
least they can get on with their jobs.

However the above wrapper is not optimized.

I wish the strace man page would say what to do in such cases: what
combination of options is best when one doesn't even care about what
strace is doing. All one wants is the 'medicinal' effects of strace's
signal blocking or thread untangling or race condition preventing or
whatever it is apparently doing right, without needing even anything
printed to /dev/null. Or maybe there should be an extra --option for
such cases. Or maybe simply an example of what trap(1)s one can use to
emulate what strace is doing will suffice.

What are some examples of programs that always run under strace, but
often fail otherwise?:
https://serverfault.com/questions/594080/process-works-under-strace-but-not-normally
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=869746

@esyr
Member

esyr commented Nov 3, 2017

Well, strace tries to stay as transparent as possible in terms of signal delivery. Usually the absence of segfaults under strace is the result of frequent ptrace-syscall-stops that significantly lower the probability of races in some cases, as strace has to at least inspect each syscall to figure out whether it is of interest. You can try -e trace=none -e signal=none -qq, for example, to minimise the amount of output (and code executed) by strace. I see no reason for documenting it, as this is definitely a very bad and unreliable practice.
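
For illustration only, here is a minimal sketch of a wrapper built from the options mentioned above. The function name `quiet_strace` and the `DRY_RUN` variable are hypothetical. The `-o /dev/null` option is an assumption carried over from the redirect in the original report, and it leaves the program's own stderr untouched. As the comment says, this remains an unreliable practice.

```shell
#!/bin/sh
# Hypothetical helper: run a command under strace with tracing output minimised.
# Set DRY_RUN=1 to print the command line instead of executing it.
quiet_strace() {
    # -e trace=none   : decode no syscalls
    # -e signal=none  : report no signals
    # -qq             : suppress informational messages, including exit status
    # -o /dev/null    : discard anything strace would still write (assumption)
    set -- strace -e trace=none -e signal=none -qq -o /dev/null "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

Usage would look like `quiet_strace ./program arg1 arg2`. Because the apparent benefit comes from the ptrace stops themselves, trimming the output this way may or may not preserve the effect.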

@jidanni
Author

jidanni commented Nov 3, 2017

@esyr hmmm, maybe there would be a "market" for an "app" that is specially designed just as a stopgap app to stop/reduce races while users are waiting for new versions of offending apps.
(app=program, market=usage case).

@Kogl1n

Kogl1n commented Jan 23, 2018

There is already a programmer's job market.

@1358

1358 commented Jan 23, 2018

m(
Heisenbugs appear. Nothing special. Fix your code. Abusing strace is not a solution for your problem.

@zackw

zackw commented Jan 23, 2018

The thing you should understand is that the "medicinal effects" are caused by the overhead. Running the program under strace makes it go slower and more serially, and that makes it harder to hit timing-sensitive bugs. Reducing the amount of overhead might very well make the problem come back.

The thing to try is running the program under valgrind instead, in both the default and "helgrind" modes; that will be even slower but has a fighting chance of finding the actual bug so it can be fixed properly.
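
A minimal sketch of the two valgrind runs suggested above, assuming valgrind is installed. The function name `valgrind_both` and the `DRY_RUN` variable are hypothetical. Memcheck is valgrind's default tool, so naming it explicitly is equivalent to the "default mode".

```shell
#!/bin/sh
# Hypothetical helper: run a program under valgrind twice, once per tool.
# Set DRY_RUN=1 to print the command lines instead of executing them.
valgrind_both() {
    # memcheck (the default tool) looks for memory errors;
    # helgrind looks for thread synchronisation errors such as data races.
    for tool in memcheck helgrind; do
        if [ -n "${DRY_RUN:-}" ]; then
            printf 'valgrind --tool=%s %s\n' "$tool" "$*"
        else
            valgrind --tool="$tool" "$@"
        fi
    done
}
```

Usage would look like `valgrind_both ./program arg1`. Expect each run to be far slower than a native run.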

@phillipp

image

@AnsisMalins

AnsisMalins commented Jan 23, 2018

What if @jidanni does not have the source code and can't fix his application, but must keep it running by any means? On that point: I would try running it in a single core VM.

@smammy

smammy commented Jan 23, 2018

> Heisenbugs appear. Nothing special. Fix your code. Abusing strace is not a solution for your problem.

This attitude ignores the completely valid use case where it's not your code, and maybe you don't even have the source, and some critical service is falling over, and your manager is breathing down your neck, and you just need to make it stop.

You can say "oh that shouldn't've made it to production, your processes are b0rked, don't run closed code" all you like, but it doesn't change the fact that the poor sap who has to deal with it is going to have a much better day with an strace kludge.

@pRiVi

pRiVi commented Jan 23, 2018

Fun

@Kogl1n

Kogl1n commented Jan 23, 2018

@smammy
If there are no consequences of bad practice it will become standard.
A critical service w/o source on Linux? You can just hope it is compiled with debug symbols.

@CatCookie

If it works, it's not stupid.
Sometimes it is necessary to be pragmatic.

@andreacampi

andreacampi commented Jan 23, 2018

I think you can have your cake and eat it too, just:

alias strace="bash -c" 

@cashell

cashell commented Jan 23, 2018

Yes, the right fix is to fix the code, but as someone who's spent a lot of years dealing with other people's software (including third-party software), that you are responsible to keep running, I also understand the desire to find alternate fixes when you can't fix the code.

However, there are multiple better solutions than abusing strace for this. Numerous tools have been written to start, monitor, and restart processes. If you have something crashing, you're almost always better off using something like that, while also working to find a replacement or permanent fix, as opposed to an ugly hack solution like trying to run it under strace.

Tools like these also make it easier to log, track, and report on the crashes, which may help in getting better solutions in place.

Among the ones I've seen and used:

  • daemontools
  • supervisord
  • restartd
  • launchtool
  • systemd
  • runit

I'm sure there are others, too. I'd use any of these, and probably a shell script wrapper that watches and restarts a process, before I'd rely on trying to run it under strace to "prevent" crashes.
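
The "shell script wrapper that watches and restarts a process" could be sketched roughly as follows. This is a simplified illustration, not a substitute for the supervisors listed above. The function name `supervise`, its argument order, and the `RESTART_DELAY` variable are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command whenever it exits with a failure,
# giving up after max_restarts restarts.
# Usage: supervise MAX_RESTARTS COMMAND [ARGS...]
supervise() {
    max_restarts=$1
    shift
    restarts=0
    while :; do
        # A clean exit ends supervision.
        "$@" && return 0
        status=$?
        # Log each failure so crashes can be tracked and reported.
        echo "supervise: '$1' exited with status $status" >&2
        if [ "$restarts" -ge "$max_restarts" ]; then
            return "$status"
        fi
        restarts=$((restarts + 1))
        # Pause briefly so a program that fails at once does not spin.
        sleep "${RESTART_DELAY:-1}"
    done
}
```

Usage would look like `supervise 5 ./flaky-daemon --config /path/to/conf`. Unlike the strace approach, this makes every failure visible in the log instead of hiding it.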

@jidanni
Author

jidanni commented Jan 23, 2018

@cashell are there any wrappers to prevent crashes, not just start again?
No I don't want to restart an e.g., browser when it crashes (losing all kinds of half entered data, etc.)
With the trick mentioned I was successfully able to make it not crash.

@sroas

sroas commented Jan 23, 2018

Did you actually file a bug against the software causing the problem in the first place? I don't see the point of abusing strace to cover up a bug in other software. Documenting how strace could most efficiently be used to do that seems like the worst idea you can come up with. Force upstream to fix their software or replace the software if they refuse to do so. Covering up shitty software will not make the situation any better.

@smammy

smammy commented Jan 24, 2018

> I don't see the point of abusing strace to cover up a bug in another software.

The point is: the software that was broken works now.

@jidanni
Author

jidanni commented Jan 24, 2018 via email

@jidanni
Author

jidanni commented Jan 24, 2018 via email

@toolforger

You may think that strace fixed your problem, but your program might still have silent data corruption (i.e. using a random value from somewhere else in memory rather than what should be used).

Running the program on a single CPU core is likely to fix those, too.
Single-core VMs would fix these, too.
Another approach would be using "processor affinity" to force the program to one CPU core. I never used that because I didn't have your problem, but that would be the first thing to try if I had.
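
On Linux, the processor-affinity approach can be tried with `taskset` from util-linux. A minimal sketch follows, in which the function name `run_on_one_core` and the `DRY_RUN` variable are hypothetical. Pinning removes true parallelism between threads, but the kernel can still preempt them, so races remain possible.

```shell
#!/bin/sh
# Hypothetical helper: pin a command and all of its threads to one CPU core.
# Usage: run_on_one_core CORE COMMAND [ARGS...]
# Set DRY_RUN=1 to print the command line instead of executing it.
run_on_one_core() {
    core=$1
    shift
    # taskset -c takes a CPU list; a single number selects exactly one core.
    set -- taskset -c "$core" "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

Usage would look like `run_on_one_core 0 ./program arg1`.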

@sroas

sroas commented Jan 24, 2018

The software is still broken. Running it under strace is a kludge to cover the broken state. And while I can imagine situations where you need to deploy a hack like that to keep the system running until a real fix is available or some replacement can be found, I don't think it is a good idea for strace to document how it can be used efficiently to cover up for broken software.

@jidanni
Author

jidanni commented Jan 24, 2018 via email

@maunzCache

I don't understand why this is an issue filed for this project. There is no bug or feature request in this thread. The requested documentation is not required to use this software. If you want to know how it works look into the code and/or ask someone who can explain the code to you.

The discussion above is very much off-topic or mocking the author which is pretty useless or could be done at one of the many software forums we have today.

@jidanni jidanni closed this as completed Jan 24, 2018
@Kogl1n

Kogl1n commented Jan 24, 2018

@jidanni
So you filed a bug in July last year: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=869746
FYI: A Geforce 1030 with passive cooling is 50 bucks.

@jidanni
Author

jidanni commented Jan 24, 2018 via email

@cashell

cashell commented Jan 24, 2018

@jidanni: "are there any wrappers to prevent crashes, not just start again?"

No, because there are no magic crash prevention tools. strace doesn't prevent crashes, it just potentially reduces them in very specific and incredibly rare cases. strace isn't magic. It doesn't stop an application from crashing. In the best case, for very, very rare circumstances, it might reduce the chance of hitting specific bugs, but you're still playing Russian roulette. You're also introducing a significant performance penalty, too.

If this weird and ugly hack works for you for something, that's good, I suppose. I've never seen a case where this kind of behavior was consistent or reliable, however, and it's not something I would ever rely on or put into regular use.

@jidanni
Author

jidanni commented Jan 24, 2018 via email

@RouL

RouL commented Jan 25, 2018

Since some people just don't seem to get it:
„Fixing“ a problem like that is like „curing“ cancer with pain meds (since you used the term medicinal effects) or „securing“ vulnerable software by changing the port number of an open port.

Not trying to mock, but simply stating the facts.

@eirikbakke

Someone did a study of these kinds of techniques; see "Enhancing Server Availability and Security Through Failure-Oblivious Computing" by Rinard et al. (http://www.usenix.net/legacy/events/osdi04/tech/full_papers/rinard/rinard.pdf)

@repomaa

repomaa commented Mar 6, 2018

I wrote this little script which will fix your whole system! Simply execute and then add this to your .bashrc: source ~/fixes.sh

#!/bin/bash

# This
IFS=: read -ra dirs <<< "$PATH"
for dir in "${dirs[@]}"; do
    while IFS= read -r exe; do
        # is
        name=$(basename "$exe")
        # "$@" forwards the caller's arguments to the wrapped program
        printf '%s () {\n  strace -o /dev/null "%s" "$@"\n}\n' "$name" "$exe" >> ~/fixes.sh
    done < <(find "$dir" -maxdepth 1 ! -type d -executable)
    # sarcasm (also tiny bit shocked that these comments were necessary)
done

@jidanni
Author

jidanni commented Mar 7, 2018

Maybe you can document your script.

@jidanni
Author

jidanni commented Mar 7, 2018

Seems overkill if there is only one trouble program out of 1500 on the system.

@toolforger

Applying strace moves race conditions around. It may solve more problems than it creates, and it can be a stopgap solution.
I think the script defines an strace wrapper shell function for every executable in $PATH. Which means applying a stopgap to everything. This can be useful for testing system stability (watch what breaks and what heals), but I wouldn't put that in .profile or anything.

@repomaa

repomaa commented Mar 7, 2018

@jidanni well, if only one program out of 1500 is causing trouble, that leaves 1499 programs just waiting to cause random bugs in the future. Better to be safe and just wrap everything with strace.

@repomaa

repomaa commented Mar 7, 2018

@jidanni also, yes, of course I will document my script. Good documentation is almost as important as writing programs which don't randomly segfault!

EDIT: there! all done!

ldv-alt pushed a commit that referenced this issue Apr 2, 2022
After des Strausses awareness has been raised sufficiently,
it is time for den Strauss to raise the awareness about strace,
and to do so, the most modern and contemporary method has been elected:
displaying tips, tricks and tweaks on each run.

* src/strace.c (parse_tips_args): New function.
(init) <enum>: Add GETOPT_TIPS.
<longopts>: Add  "tips" option.
(init) <case GETOPT_TIPS>: Call parse_tips_args.
(terminate): Call print_totd before exit.
(usage): Document --tips.
* doc/strace.1.in (.SS Miscellaneous): Ditto.
* strauss.c (MAX_TIP_LINES): New enum.
(tips_tricks_tweaks, tip_top, tip_bottom, tip_left, tip_right): New
static constants.
(show_tips, tip_id): New variable.
(print_totd): New function.
* strauss.h (tips_fmt, tip_ids): New enumerations.
(show_tips, tip_id, print_totd): New declarations.
* tests/Makefile.am (MISC_TESTS): Add strace--tips.test,
strace--tips-full.test.
(EXTRA_DIST): Add strace--tips.exp.
* tests/strace--tips-full.test: New test.
* tests/strace--tips.test: Ditto.
* tests/strace--tips.exp: New file.
* tests/options-syntax.test: Add --tips syntax checks.
* NEWS: Mention it.

Suggested-by: Elvira Khabirova <lineprinter@altlinux.org>
References: #14
esyr added a commit that referenced this issue Apr 2, 2022
ANOLASC pushed a commit to ANOLASC/strace that referenced this issue Sep 30, 2022