Character encoding issue ? #2

Open
mathaefele opened this issue Nov 13, 2019 · 8 comments
Labels
bug Something isn't working

Comments

@mathaefele

mathaefele commented Nov 13, 2019

Describe the bug

I am new to pdwfs. It looks really nice but the result of the simple example I tried to build is different with and without pdwfs.

How to reproduce

redis-server --daemonize yes
echo "########### Launching simu ##############"
pdwfs -p $PWD/staged -- ./simu
redis-cli dump "/local/home/mhaefele/ownCloud/work/dev/hello_worlds/pdwfs/C/staged/Cpok:0"
echo "########### Launching post-process ##############"
pdwfs -p $PWD/staged -- ./post-process
echo "########### Done ##############"
redis-cli shutdown
  • ./simu is a C program writing 10 times Hello444 in staged/Cpok
  • ./post-process reads staged/Cpok and writes only its first line in ./resC
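For readers without the attached archive, here is a minimal sketch of what the writer side might look like, assuming it uses plain stdio calls as the trace output further down this thread suggests (fopen in "w" mode, ten fwrite calls of 9 bytes each, then fclose). The helper name write_hello is hypothetical, and this is not the code from hello_pdwfs_C.zip.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "Hello444\n" `count` times to `path`,
 * using one fwrite(ptr, 1, 9, stream) call per line.
 * Returns the number of bytes written, or -1 if the file cannot be
 * opened or closed. */
long write_hello(const char *path, int count)
{
    static const char line[] = "Hello444\n";
    const size_t len = strlen(line);   /* 9 bytes per line */
    long total = 0;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    for (int i = 0; i < count; i++)
        total += (long)fwrite(line, 1, len, f);

    if (fclose(f) != 0)
        return -1;
    return total;
}
```

Calling write_hello("staged/Cpok", 10) would correspond to the ./simu step described above.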

Expected behaviour

I expect resC to contain a single "Hello444", which is what I get without pdwfs. With pdwfs, I get the following:

mhaefele@mdlspc113:C $ ./launch.sh 
15287:C 13 Nov 14:08:15.990 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
15287:C 13 Nov 14:08:15.990 # Redis version=4.0.9, bits=64, commit=00000000, modified=0, pid=15287, just started
15287:C 13 Nov 14:08:15.990 # Configuration loaded
########### Launching simu ##############
"\x00\xc3\x11@Z\tHello444\nH\xe0E\b\x014\n\b\x00\xc4\xa2\r\xfe\x05f\xedy"
########### Launching post-process ##############
post-process: �
########### Done ##############
mhaefele@mdlspc113:C $ cat resC 
�

Am I doing something wrong? Is this related to character encoding?

Thanks for your help
Mat

JCapul added a commit that referenced this issue Nov 19, 2019
when getline() is inlined by the compiler, it resorts to using and linking against
the __getdelim function from the standard library.
However only the getdelim function was covered by pdwfs, not the __getdelim

Besides no tests were actually covering getline and getdelim...bad, very bad.
@JCapul
Collaborator

JCapul commented Nov 19, 2019

Hi Mat,
Thanks for looking into pdwfs!

I think I managed to reproduce your issue. Actually this is not an encoding issue, but it's definitely a bug.
I assume that you used the standard C function getline in your post-process program to read lines from staged/Cpok?
If so, the issue is that pdwfs did not actually intercept this function. Instead, the "real" getline from the libc was called on a file stream that the libc does not manage, which results in undefined behaviour.
What your post-process read was the resulting undefined content of the memory buffer passed to getline.

getline was not intercepted because the compiler inlines it and replaces it with a call to the libc function __getdelim, which pdwfs did not intercept.
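To illustrate why both entry points matter: POSIX specifies getline as equivalent to getdelim with a newline delimiter, so an interception layer that covers only one of the two names can miss calls that reach the libc through the other. The sketch below is plain C run against an ordinary file, not pdwfs code, and first_line is a hypothetical helper. It reads the first line both ways so the results can be compared.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: reads the first line of `path` with either
 * getline() or getdelim(..., '\n', ...) and copies at most cap-1 bytes
 * of it into `out`. Returns the length of the line as reported by the
 * libc call, or -1 on error or end of file. */
long first_line(const char *path, int use_getdelim, char *out, size_t cap)
{
    if (cap == 0)
        return -1;

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char *line = NULL;   /* buffer allocated by getline/getdelim */
    size_t n = 0;
    ssize_t len = use_getdelim ? getdelim(&line, &n, '\n', f)
                               : getline(&line, &n, f);
    if (len >= 0) {
        strncpy(out, line, cap - 1);
        out[cap - 1] = '\0';
    }

    free(line);          /* must be freed even when the call fails */
    fclose(f);
    return (long)len;
}
```

On a regular file, both variants return the same first line, newline included.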

I have pushed the branch fix-github-issue-2 that is fixing the problem.
Could you try this branch?

I hope that you actually used getline! If so, I think I have solved your issue. If not, well, at least I fixed one bug...

@JCapul
Collaborator

JCapul commented Nov 19, 2019

btw, you can easily check what calls are intercepted by pdwfs by running it with the -t (trace) option:
pdwfs -t -p $PWD/staged -- ./post-process

@JCapul JCapul added the bug Something isn't working label Nov 19, 2019
@mathaefele
Author

I am using

fscanf(f, "%s", buffer);

and it does not appear to be intercepted either, according to a run with the trace option enabled:

mhaefele@mdlspc113:C $ ./launch.sh 
30425:C 19 Nov 14:12:30.600 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
30425:C 19 Nov 14:12:30.601 # Redis version=4.0.9, bits=64, commit=00000000, modified=0, pid=30425, just started
30425:C 19 Nov 14:12:30.601 # Configuration loaded
########### Launching simu ##############
[PDWFS][30436][TRACE][C] intercepting fopen(path=staged/Cpok, mode=w)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fwrite(ptr=0x5566d0d337d2, size=1, nmemb=9, stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting fclose(stream=0x5566d168b040)
[PDWFS][30436][TRACE][C] intercepting close(fd=5)
[PDWFS][30436][TRACE][C] intercepting close(fd=5)
[PDWFS][30436][TRACE][C] calling libc close
"\x00\xc3\x11@Z\tHello444\nH\xe0E\b\x014\n\b\x00\xc4\xa2\r\xfe\x05f\xedy"
########### Launching post-process ##############
[PDWFS][30453][TRACE][C] intercepting fopen(path=staged/Cpok, mode=r)
[PDWFS][30453][TRACE][C] intercepting fclose(stream=0x564475d11040)
[PDWFS][30453][TRACE][C] intercepting close(fd=5)
[PDWFS][30453][TRACE][C] intercepting close(fd=5)
[PDWFS][30453][TRACE][C] calling libc close
post-process: �
[PDWFS][30453][TRACE][C] intercepting fopen(path=resC, mode=w)
[PDWFS][30453][TRACE][C] calling libc fopen
[PDWFS][30453][TRACE][C] intercepting fprintf(stream=0x564475d126c0, ...)
[PDWFS][30453][TRACE][C] intercepting fputs(s=�
, stream=0x564475d126c0)
[PDWFS][30453][TRACE][C] calling libc fputs
[PDWFS][30453][TRACE][C] intercepting fclose(stream=0x564475d126c0)
[PDWFS][30453][TRACE][C] calling libc fclose
########### Done ##############

Anyway, I was just trying out how this works with hello world examples. But I agree, in real code you would probably read the full buffer with fread and parse the ASCII content in memory...

I'll give your branch a try, but first I need to compile pdwfs. Until now I have made the minimum effort and used the available binaries 😊

@mathaefele
Author

I've just tested with fread and parsing it in memory, it works !
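A minimal sketch of what such an fread-based workaround might look like, assuming the file is small enough to fit in a fixed buffer: the content is loaded with a single fread, and the first whitespace-delimited token is then extracted in memory with sscanf, which mirrors what fscanf(f, "%s", buffer) would return on a regular file. The names first_token_fread and TOKEN_MAX are hypothetical, and this is not the code from the attached archive.

```c
#include <stdio.h>

#define TOKEN_MAX 256   /* hypothetical upper bound on the token size */

/* Hypothetical helper: reads up to 4095 bytes of `path` with one fread
 * call, then parses the first whitespace-delimited token from the
 * in-memory buffer. Returns 0 on success, -1 if the file cannot be
 * opened or contains no token. */
int first_token_fread(const char *path, char token[TOKEN_MAX])
{
    char buf[4096];

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    size_t got = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[got] = '\0';    /* make the buffer a valid C string for sscanf */

    /* The width 255 leaves room for the terminating NUL in token[256]. */
    return sscanf(buf, "%255s", token) == 1 ? 0 : -1;
}
```

With the file written by ./simu, the token extracted this way would be "Hello444".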

@JCapul
Collaborator

JCapul commented Nov 19, 2019

yay!
I'll check what's going on with fscanf, probably a similar story as getline.

@JCapul
Collaborator

JCapul commented Nov 19, 2019

I'll check what's going on with fscanf, probably a similar story as getline.

ok, my bad: interception of fscanf was actually never implemented. I guess none of the applications we have run so far needed it.

but failing a hello world example looks pretty bad... so we'll make it work!

@mathaefele
Author

ok, I managed to compile and run my hello world with the fix-github-issue-2 version of pdwfs.

  • fscanf still broken (not surprising as only getline has been fixed)
  • fread still works

Tell me when I can try again.

@mathaefele
Author

The small hello world example in C I am using
hello_pdwfs_C.zip
