Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bogus backtraces due to fake GL calls #191

Closed
urkle opened this issue Nov 21, 2013 · 10 comments
Closed

Bogus backtraces due to fake GL calls #191

urkle opened this issue Nov 21, 2013 · 10 comments
Labels

Comments

@urkle
Copy link
Contributor

urkle commented Nov 21, 2013

I (finally) got the backtrace dumping code working on linux.. I'll be sending a separate pull request with some documentation updates and CMake tweaks.

However the issue I'm now having is the "skip frame" detection code doesn't quite work correctly. The application I am testing uses SDL2 and is written in C++ for everything else. I had it trace calls to glViewport which worked mostly. the "C" calls (e.g. from SDL2 initialization) were "trimmed" correctly. However the calls from C++ code skipped 2 extra frames that shouldn't have been trimmed. I'm not sure why the code-path varies yet.. I've currently modified the libbacktraceProvider to simply back down the skipFrames var by 2. which fixes the C++ code, and simply adds 2 extra calls in the C code..

dumps of "info1" in the bt_countskip method (order is fname, sname, fbase, saddr)

Skip: /home/urkle/bin/wrappers/glxtrace32.so :: (null) :: 0xf7468000, (nil)
Skip: /home/urkle/bin/wrappers/glxtrace32.so :: (null) :: 0xf7468000, (nil)
Skip: /home/urkle/bin/wrappers/glxtrace32.so :: (null) :: 0xf7468000, (nil)
Skip: /home/urkle/bin/wrappers/glxtrace32.so :: glViewport :: 0xf7468000, 0xf752bf10
Skip: /home/urkle/bin/wrappers/glxtrace32.so :: (null) :: 0xf7468000, (nil)
Skip: /home/urkle/bin/wrappers/glxtrace32.so :: glXMakeCurrent :: 0xf7468000, 0xf75172dc
@amonakov
Copy link
Member

Huh. I wonder if a glXMakeCurrent-glViewport sibcall in your code could cause that.

@urkle
Copy link
Contributor Author

urkle commented Nov 21, 2013

that seemed odd that those were even in there when it was "detecting".. But the real oddity is why there are 2 extra calls when a GL method was called form SDL2 vs being called from the game code.

@amonakov
Copy link
Member

Detection is done when the first backtrace is requested.

@amonakov
Copy link
Member

This is due to apitrace calling glViewport from inside glXMakeCurrent to record initial drawable size. See this FIXME.

It's possible to fix the problem by passing the function to be backtraced to get_backtrace and use that in bt_countskip.

@amonakov
Copy link
Member

Alternatively, the problem can be fixed by addressing the FIXME and emitting a fake call rather than recursing to glViewport from glXMakeCurrent. José, please let me know what you prefer. Apropos, why is glScissor call there? glViewport looks enough on its own.

@urkle
Copy link
Contributor Author

urkle commented Nov 22, 2013

Sweet.. that makes sense as to why it's messed up:) So.. essentially if I had NOT backtraced the glViewport call I would not have encountered this issue:) HA!

@amonakov
Copy link
Member

glViewport is not the only one that can be a "fake function"; there's also glScissor, glGet, and probably something else.

@jrfonseca
Copy link
Member

Alternatively, the problem can be fixed by addressing the FIXME and emitting a fake call rather than recursing to glViewport from glXMakeCurrent. José, please let me know what you prefer.

This route would be my preference. Even ignoring backtracing issues, we really shouldn't be calling into OpenGL implementation while tracing. It is dangerous.

But because this is in a separate source file as the generated wrappers, it's not straightforward. We'll need to declare the glViewport/glScissor signatures somehow...

Apropos, why is glScissor call there? glViewport looks enough on its own.

The scissor state, just like viewport, is defined to match the size of the first bound drawable, as explained https://github.com/apitrace/apitrace/blob/master/wrappers/gltrace_state.cpp#L156 . So if we don't record the glScissor call here, then when replaying the sicssor will have a bogus value (typical 16x16, as that is the initial window size used in replays), even if the window is later resized to be the right value.

@amonakov
Copy link
Member

amonakov commented Dec 1, 2013

Indeed, obtaining access to _glViewport_sig is not that easy. Here's the patch that implements the other approach:

diff --git a/common/os_backtrace.cpp b/common/os_backtrace.cpp
index c95692b..03e0f33 100644
--- a/common/os_backtrace.cpp
+++ b/common/os_backtrace.cpp
@@ -268,7 +268,7 @@ public:
     }
 };

-std::vector<RawStackFrame> get_backtrace() {
+std::vector<RawStackFrame> get_backtrace(const char* fname) {
     static DalvikBacktraceProvider backtraceProvider;
     return backtraceProvider.parseBacktrace(backtraceProvider.getBacktrace());
 }
@@ -332,6 +332,7 @@ static void dumpFrame(const RawStackFrame &frame)
 #define BT_DEPTH 10

 class libbacktraceProvider {
+    const char *fname;
     struct backtrace_state *state;
     int skipFrames;
     Id nextFrameId;
@@ -364,6 +365,9 @@ class libbacktraceProvider {
         if (info1.dli_fbase != info2.dli_fbase)
             return 1;
         this_->skipFrames++;
+        if (info1.dli_sname && this_->fname
+            && !strcmp(info1.dli_sname, this_->fname))
+          return 1;
         return 0;
     }

@@ -440,7 +444,8 @@ class libbacktraceProvider {
     }

 public:
-    libbacktraceProvider():
+    libbacktraceProvider(const char *fname = NULL):
+        fname(fname),
         state(backtrace_create_state(NULL, 0, bt_err_callback, NULL))
     {
         backtrace_simple(state, 0, bt_countskip, bt_err_callback, this);
@@ -460,8 +465,8 @@ public:
     }
 };

-std::vector<RawStackFrame> get_backtrace() {
-    static libbacktraceProvider backtraceProvider;
+std::vector<RawStackFrame> get_backtrace(const char* fname) {
+    static libbacktraceProvider backtraceProvider(fname);
     return backtraceProvider.getParsedBacktrace();
 }

@@ -473,7 +478,7 @@ void dump_backtrace() {

 #else /* !HAVE_BACKTRACE */

-std::vector<RawStackFrame> get_backtrace() {
+std::vector<RawStackFrame> get_backtrace(const char *fname) {
     return std::vector<RawStackFrame>();
 }

diff --git a/common/os_backtrace.hpp b/common/os_backtrace.hpp
index 27dcc90..59d9298 100644
--- a/common/os_backtrace.hpp
+++ b/common/os_backtrace.hpp
@@ -36,7 +36,7 @@ namespace os {
 using trace::RawStackFrame;


-std::vector<RawStackFrame> get_backtrace();
+std::vector<RawStackFrame> get_backtrace(const char* fname);
 bool backtrace_is_needed(const char* fname);

 void dump_backtrace();
diff --git a/common/trace_writer_local.cpp b/common/trace_writer_local.cpp
index 529f83b..3a6081e 100644
--- a/common/trace_writer_local.cpp
+++ b/common/trace_writer_local.cpp
@@ -174,7 +174,7 @@ unsigned LocalWriter::beginEnter(const FunctionSig *sig, bool fake) {
     unsigned thread_id = this_thread_num - 1;
     unsigned call_no = Writer::beginEnter(sig, thread_id);
     if (!fake && os::backtrace_is_needed(sig->name)) {
-        std::vector<RawStackFrame> backtrace = os::get_backtrace();
+        std::vector<RawStackFrame> backtrace = os::get_backtrace(sig->name);
         beginBacktrace(backtrace.size());
         for (unsigned i = 0; i < backtrace.size(); ++i) {
             writeStackFrame(&backtrace[i]);

@jrfonseca
Copy link
Member

@amonakov Thanks for the patch.

I plan to start making the *_sig non-static. Not just for the sake of this bug, but so I can better organize the code, as currently the abuse of static variables ends up forcing a lot of hand-written code to be defined in the python generating scripts, where it could be perfectly happy in a ordinary .cpp file.

I'm not sure if it's worth adding the kludge to os::get_backtrace until then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants