Missing a function in musl's PointsTo #15

Closed
anh-q opened this issue Apr 6, 2017 · 8 comments

Comments


anh-q commented Apr 6, 2017

Hi,

I've been using WPA in SVF to analyze a library using Andersen's algorithm. The library I'm using is musl-libc version 1.1.15 since it can be compiled using LLVM.

I noticed that in musl libc there is an indirect call from vfprintf to sn_write that is not captured in WPA's output. Specifically, when a program invokes vsnprintf, vsnprintf prepares a FILE struct containing a pointer to the sn_write function and then makes a direct call to vfprintf, passing a pointer to this struct as an argument. Finally, vfprintf invokes sn_write at an indirect call site.

Source code for vsnprintf and vfprintf.

The attached musl.tar.gz contains the bitcode file and the LLVM assembly file of musl, generated by the LLVM gold plugin.

Thank you for your help.

Collaborator

yuleisui commented Apr 6, 2017

Hello Anh,

A very good question!

When analyzing a C/C++ application, the function bodies of standard C library functions are not included in the LLVM bitcode files. SVF therefore summarizes the side effects of many standard library APIs. Please see "MemoryModel/PAGBuilder.cpp" (lines 716-988) and "Util/ExtAPI.cpp".

If you are instead analyzing a C library itself, such as "libc", SVF needs to treat those functions as ordinary code rather than as summarized external APIs. To achieve this, some of the library summaries (e.g., vsnprintf and vfprintf) should be disabled (lines 412-413 in "Util/ExtAPI.cpp").
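To illustrate why a summary can hide calls made inside a library function, here is a minimal sketch in C. It is not SVF's actual code: the `Effect` kinds, the `summaries` table, and the helpers `lookup_summary` and `should_analyze_body` are all hypothetical names, and the sketch assumes that a summarized function's body is skipped in favor of its summary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical effect kinds; SVF's real ExtAPI categories differ. */
typedef enum {
    EFF_NONE,                /* no summary: treat as ordinary code */
    EFF_NO_POINTER_EFFECT,   /* assumed not to change points-to facts */
    EFF_COPIES_ARG1_TO_ARG0  /* e.g. memcpy(dst, src, n) */
} Effect;

typedef struct {
    const char *name;
    Effect effect;
} Summary;

/* Illustrative table, not the contents of Util/ExtAPI.cpp. */
static const Summary summaries[] = {
    { "vsnprintf", EFF_NO_POINTER_EFFECT },
    { "vfprintf",  EFF_NO_POINTER_EFFECT },
    { "memcpy",    EFF_COPIES_ARG1_TO_ARG0 },
};

/* Return the summarized effect of fn, or EFF_NONE if it has no summary. */
Effect lookup_summary(const char *fn) {
    assert(fn != NULL);
    for (size_t i = 0; i < sizeof summaries / sizeof summaries[0]; i++) {
        if (strcmp(summaries[i].name, fn) == 0) {
            return summaries[i].effect;
        }
    }
    return EFF_NONE;
}

/* A body is analyzed only if it exists and no summary overrides it. */
bool should_analyze_body(const char *fn, bool has_body) {
    return has_body && lookup_summary(fn) == EFF_NONE;
}
```

Under these assumptions, `should_analyze_body("vfprintf", true)` is false even though musl provides the body, so the indirect call inside it would never be seen. Removing the table entry is the analogue of disabling the summary.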

Good luck!

Author

anh-q commented Apr 7, 2017

Thank you for the reply.

I followed your instructions: I commented out the two lines in "Util/ExtAPI.cpp", recompiled WPA, and re-analyzed the bitcode file attached to my first message. The result of Andersen's analysis is unchanged: sn_write still does not appear in vfprintf's points-to information.

Collaborator

yuleisui commented Apr 8, 2017

Hi Anh,

Could you please make a small test case regarding the issue you have found?

It is impossible for me to debug such a large program from bitcode alone (without source).

It would be good to extract a small code example (a few hundred lines) from your case, using the same function names you have mentioned.

My guess is that you may need to disable some more ext APIs when analyzing musl in order to discover the call relations in the library.

Thanks

Author

anh-q commented Apr 10, 2017

Here is a simple test case that produces the same issue I've been having:

#include <stdio.h>

static int my_sn_write() {
  printf("Executing my_sn_write\n");
  return 0;
}

struct MYFILE {
  int (*pt) (void);
};

void my_vfprintf(struct MYFILE *pts) {
  printf("Executing bar\n");
  pts->pt();
}

int my_vsnprintf() {
  struct MYFILE pts = { .pt = my_sn_write };
  my_vfprintf(&pts);
  return 0;
}

int main() {
  my_vsnprintf();
  return 0;
}

Please compile with -O0 since that is required for the project I'm working on.

Below is the final callgraph after applying Andersen analysis:

[image: callgraph_final]

The call graph has no edge from my_vfprintf to my_sn_write, although we should expect one.

Thank you for your help!

Collaborator

yuleisui commented Apr 11, 2017

Hi Anh,

A very good test case. We have found the problem: LLVM translates the initialization of your local structure into the initialization of a global constant.

If you modify your "my_vsnprintf" function to be the following, then the indirect call edge will be connected.

int my_vsnprintf() {
  struct MYFILE pts;
  pts.pt = my_sn_write;
  my_vfprintf(&pts);
  return 0;
}

A closer look at the two bc files shows the difference. The initialization

  struct MYFILE pts = { .pt = my_sn_write };

is, somewhat surprisingly, translated by LLVM into a global constant variable:

@my_vsnprintf.pts = private unnamed_addr constant %struct.MYFILE { i32 (i32*)* @my_sn_write }, align 8

By contrast, the following code (the common initialization pattern)

  struct MYFILE pts;
  pts.pt = my_sn_write;

is translated into a local function pointer assignment:

  %pt = getelementptr inbounds %struct.MYFILE, %struct.MYFILE* %pts, i32 0, i32 0, !dbg !184
  store i32 (i32*)* @my_sn_write, i32 (i32*)** %pt, align 8, !dbg !185

For your original case (the initializer-list form), we will fix this global constant issue and submit a patch later.
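The effect of the global constant on an inclusion-based (Andersen-style) analysis can be sketched with a toy constraint solver in C. This is not SVF's implementation or the eventual patch. The location names, the `Constraint` type, and the functions `solve` and `resolves_callee` are hypothetical, and the model is field-insensitive. It also assumes that the local struct is filled by copying from the global constant, as clang typically does at -O0.

```c
#include <assert.h>
#include <stdbool.h>

/* Abstract locations in this toy model; all names are illustrative. */
enum { FN_MY_SN_WRITE, GLOBAL_CONST, LOCAL_PTS, CALL_TARGET, NUM_NODES };

typedef enum { ADDR_OF, COPY } Kind;
typedef struct { Kind kind; int dst; int src; } Constraint;

/* Solve inclusion constraints to a fixpoint.
 * pts[n] is a bitmask: bit k set means location n may hold the address of k. */
static unsigned solve(const Constraint *cs, int count, int query) {
    unsigned pts[NUM_NODES] = {0};
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < count; i++) {
            assert(cs[i].dst >= 0 && cs[i].dst < NUM_NODES);
            assert(cs[i].src >= 0 && cs[i].src < NUM_NODES);
            /* ADDR_OF adds src itself; COPY adds everything src may hold. */
            unsigned add = (cs[i].kind == ADDR_OF) ? (1u << cs[i].src)
                                                   : pts[cs[i].src];
            if ((pts[cs[i].dst] | add) != pts[cs[i].dst]) {
                pts[cs[i].dst] |= add;
                changed = true;
            }
        }
    }
    return pts[query];
}

/* Returns true if the indirect call in my_vfprintf resolves to my_sn_write.
 * model_global_init toggles whether the constant's initializer is modeled. */
bool resolves_callee(bool model_global_init) {
    Constraint cs[3];
    int n = 0;
    if (model_global_init) {
        /* @my_vsnprintf.pts = constant { @my_sn_write } */
        cs[n++] = (Constraint){ ADDR_OF, GLOBAL_CONST, FN_MY_SN_WRITE };
    }
    /* Assumption: the local struct is filled by copying the global constant. */
    cs[n++] = (Constraint){ COPY, LOCAL_PTS, GLOBAL_CONST };
    /* pts->pt is loaded and then called indirectly. */
    cs[n++] = (Constraint){ COPY, CALL_TARGET, LOCAL_PTS };
    return ((solve(cs, n, CALL_TARGET) >> FN_MY_SN_WRITE) & 1u) != 0;
}
```

In this model, `resolves_callee(false)` returns false because nothing ever flows into `GLOBAL_CONST`, which mirrors the missing call edge. `resolves_callee(true)` returns true once the initializer contributes its address-of constraint.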

Thanks for reporting it!

Author

anh-q commented Apr 11, 2017

Unfortunately, modifying the source code is not allowed, so I'm looking forward to your patch.
Thank you.

Collaborator

yuleisui commented Apr 12, 2017

Hi Anh,

Fixed (14d9b9f). Please pull the new update and re-analyze your test case.

Thanks

Author

anh-q commented Apr 12, 2017

As far as I can tell, the patch fixes all issues I have with musl libc.
Thank you.

yuleisui closed this as completed May 7, 2017
yuleisui added a commit that referenced this issue Aug 30, 2019
fix crash of AndersenHCD and some method of OCG
yuleisui added a commit that referenced this issue Jun 12, 2020
add libsvf_xxx.a for mac or ubuntu fixed