error compiling module for program "cachestat" #3

Closed
SamuelGong opened this issue Feb 21, 2021 · 8 comments


@SamuelGong

Hi TEEMon Team,

Following the instructions in README.md, I finally managed to start the six containers (i.e., prometheus, grafana, sgx-exporter, node-exporter, ebpf-exporter, cadvisor) after working around Issue #1 with #2. However, the next problem I ran into was that the ebpf_exporter container kept restarting, as shown in this sample output of docker ps:

6b93ec2bcc7f   sgx_exporter:latest      "/bin/sh -c 'python …"   17 minutes ago   Up 17 minutes                   0.0.0.0:9441->9441/tcp                           teemon_sgx-exporter_1
a6179b8563d6   prom/node-exporter       "/bin/node_exporter …"   17 minutes ago   Up 17 minutes                   9100/tcp, 0.0.0.0:9442->9442/tcp                 teemon_node-exporter_1
f613eb7234eb   google/cadvisor          "/usr/bin/cadvisor -…"   17 minutes ago   Up 17 minutes                   0.0.0.0:8080->8080/tcp, 0.0.0.0:9443->9443/tcp   teemon_cadvisor_1
df05fce653a6   ebpf_exporter:latest     "ebpf_exporter --con…"   17 minutes ago   Restarting (1) 24 seconds ago                                                    teemon_ebpf-exporter_1
a3bb09949093   prom/prometheus:v2.8.1   "/bin/prometheus --c…"   17 minutes ago   Up 17 minutes                   0.0.0.0:9090->9090/tcp                           teemon_prometheus_1
4772022ecc6e   grafana/grafana:6.5.0    "/run.sh"                17 minutes ago   Up 17 minutes                   3000/tcp, 0.0.0.0:9091->9091/tcp                 teemon_grafana_1
5e8b190b75fb   test                     "/bin/sh -c sh"          7 hours ago      Up 7 hours                                                                       quizzical_napier
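For anyone reproducing this, a minimal sketch of how the restarting container can be picked out and its log pulled. The helper name restarting_containers is hypothetical; the docker calls are left as comments because they need a running Docker daemon, and the container name is the one from the listing above.

```shell
# Print the names of containers whose status starts with "Restarting".
# Reads tab-separated "name<TAB>status" lines on stdin.
restarting_containers() {
    awk -F '\t' '$2 ~ /^Restarting/ { print $1 }'
}

# Intended use (commented out because it needs a running Docker daemon):
#   docker ps -a --format '{{.Names}}\t{{.Status}}' | restarting_containers
#   docker logs --timestamps --tail 50 teemon_ebpf-exporter_1
```

The --timestamps flag is what yields the RFC 3339 prefixes seen in the log excerpts in this thread.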

I then tried to take a look at ebpf_exporter's log. It contained something like

2021-02-21T10:02:51.168811897Z In file included from /virtual/main.c:1:
2021-02-21T10:02:51.168827597Z In file included from include/uapi/linux/ptrace.h:142:
2021-02-21T10:02:51.168830297Z In file included from ./arch/x86/include/asm/ptrace.h:5:
2021-02-21T10:02:51.168832397Z ./arch/x86/include/asm/segment.h:266:2: error: expected '(' after 'asm'
2021-02-21T10:02:51.168834997Z         alternative_io ("lsl %[seg],%[p]",
2021-02-21T10:02:51.168837197Z         ^
2021-02-21T10:02:51.168839197Z ./arch/x86/include/asm/alternative.h:240:2: note: expanded from macro 'alternative_io'
2021-02-21T10:02:51.168841197Z         asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature)   \
2021-02-21T10:02:51.168843297Z         ^
2021-02-21T10:02:51.168845197Z include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
2021-02-21T10:02:51.168847297Z #define asm_inline asm __inline
2021-02-21T10:02:51.168849197Z                        ^
2021-02-21T10:02:51.174461433Z 1 error generated.
2021-02-21T10:02:51.175202938Z 2021/02/21 10:02:51 Error attaching exporter: error compiling module for program "cachestat"
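The last note in this log points at asm_inline being expanded to asm __inline in include/linux/compiler_types.h, which the compiler used inside the container evidently does not accept. A small sketch for checking whether a given kernel header tree carries that definition; the function name is hypothetical and the /lib/modules path in the usage comment is an assumption that differs between distributions.

```shell
# Return 0 if the header tree rooted at $1 defines asm_inline as "asm __inline".
# Assumed layout: $1/include/linux/compiler_types.h
headers_use_asm_inline() {
    grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+asm_inline[[:space:]]+asm[[:space:]]+__inline' \
        "$1/include/linux/compiler_types.h" 2>/dev/null
}

# Intended use on the host (the path is an assumption):
#   headers_use_asm_inline "/lib/modules/$(uname -r)/build" && echo "asm_inline is defined as asm __inline"
```

If the check succeeds for the headers mounted into the container, the compile error above is likely to reappear with any bcc build that does not handle that macro.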

It seems to be an "inherent" problem that has nothing to do with how I set things up.

Has your team run into this issue before? If so, could you tell me how to work around it? Thanks a lot!

@rcrane
Owner

rcrane commented Feb 21, 2021

Thanks for pointing this out!

ebpf_exporter needs libbcc/bcc.
The Dockerfile for the ebpf_exporter uses a version of bcc that is probably too old to be compatible with the latest (host) kernel.

Here is a similar issue with more information: Sysinternals/ProcMon-for-Linux#13

To make this work again, we need to update the Dockerfile for the ebpf_exporter and will most likely have to compile libbcc/bcc ourselves, as the Debian package has not been updated.

The bug is in https://github.com/rcrane/TEEMon/blob/master/exporters/ebpf_exporter/Dockerfile at lines 10-12.

For clarification, could you state the kernel version of your host machine?

@SamuelGong
Author

Thanks for your prompt follow-up. I am currently using 5.4.0-1039-azure.
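The release string above is what uname -r prints. As a rough sketch, it can be split into major and minor numbers and compared against whatever kernel version a given bcc build is known to support. The function name is hypothetical, and the 5.4 threshold is only an illustration taken from the version reported here, not a documented compatibility boundary.

```shell
# Succeed if kernel release $1 (e.g. "5.4.0-1039-azure") is at least $2.$3.
kernel_at_least() {
    release=$1 want_major=$2 want_minor=$3
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

if kernel_at_least "$(uname -r)" 5 4; then
    echo "kernel is 5.4 or newer"
else
    echo "kernel is older than 5.4"
fi
```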

@rcrane
Owner

rcrane commented Feb 21, 2021

Integrating https://github.com/iovisor/bcc/releases/tag/v0.18.0 into the Dockerfile (compilation part) of the ebpf_exporter might solve the issue.

@SamuelGong
Author

SamuelGong commented Feb 21, 2021

Integrating https://github.com/iovisor/bcc/releases/tag/v0.18.0 into the Dockerfile (compilation part) of the ebpf_exporter might solve the issue.

According to our discussion, does it suffice to modify both Line 12 and Line 31 in the Dockerfile of ebpf_exporter by replacing

apt-get install -y libbcc linux-headers-amd64

with

apt-get install -y libbcc=0.18.0 linux-headers-amd64

?
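One caveat on the pinning idea: apt's package=version form expects the full version string exactly as published in the configured repositories, so libbcc=0.18.0 only resolves if such a package version actually exists there. A hedged sketch for checking that before editing the Dockerfile; has_candidate_version is a hypothetical helper that does a simple prefix match on the version column of apt-cache madison output.

```shell
# Succeed if any version on stdin starts with the prefix $1.
# Expects `apt-cache madison <pkg>` output: " pkg | version | source" per line.
has_candidate_version() {
    awk -F '|' -v want="$1" '
        { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v) }   # trim the version field
        index(v, want) == 1 { found = 1 }            # prefix match
        END { exit (found ? 0 : 1) }
    '
}

# Intended use inside the image (commented out because it needs apt):
#   apt-cache madison libbcc | has_candidate_version 0.18.0 \
#       || echo "no libbcc 0.18.0 package in the configured repositories"
```

If no matching version is listed, the pin cannot work and bcc would have to be built from source instead, as suggested earlier in the thread.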

rcrane added a commit that referenced this issue Feb 21, 2021
@rcrane
Owner

rcrane commented Feb 21, 2021

@SamuelGong
Author

Thanks for your comprehensive help! The previous error seems to be mitigated. However, the ebpf_exporter container still keeps restarting, with only two lines of error messages in its log:

2021-02-22T06:30:51.277185249Z perf_event_open failed: No such file or directory
2021-02-22T06:30:51.277229149Z 2021/02/22 06:30:51 Error attaching exporter: failed to attach perf event 0:3 to "on_cache_miss" in program "llcstat": failed to attach BPF perf event: invalid argument
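Assuming the "0:3" in this message is a type:config pair as used by perf_event_open (an assumption about the exporter's message format), it can be decoded against the generic constants in linux/perf_event.h. The helper below is a hypothetical sketch that only covers the generic hardware events.

```shell
# Decode a "type:config" perf event pair into the names from linux/perf_event.h.
decode_perf_event() {
    ev_type=${1%%:*}
    ev_config=${1#*:}
    case "$ev_type" in
        0) type_name=PERF_TYPE_HARDWARE ;;
        1) type_name=PERF_TYPE_SOFTWARE ;;
        *) type_name="type $ev_type" ;;
    esac
    config_name="config $ev_config"
    if [ "$ev_type" = 0 ]; then
        case "$ev_config" in
            0) config_name=PERF_COUNT_HW_CPU_CYCLES ;;
            1) config_name=PERF_COUNT_HW_INSTRUCTIONS ;;
            2) config_name=PERF_COUNT_HW_CACHE_REFERENCES ;;
            3) config_name=PERF_COUNT_HW_CACHE_MISSES ;;
        esac
    fi
    echo "$type_name / $config_name"
}

decode_perf_event 0:3   # prints "PERF_TYPE_HARDWARE / PERF_COUNT_HW_CACHE_MISSES"
```

Under that reading, llcstat is asking for a hardware cache-miss counter, which a virtual machine may not expose to the guest. If perf is installed on the host, running perf stat -e cache-misses true is one way to check whether the counter is available at all.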

I ran the latest update-ebpf_exporter branch and additionally worked around Issue #1 with the solution mentioned in #2. Apart from that, I made no other modifications during the experiment. Am I still missing anything?

@rcrane
Owner

rcrane commented Feb 22, 2021

Your latest issue sounds like something different.
Could you put the 'perf_event_open' problem into a new issue and close this one?
For the new issue, please provide detailed information about your host system so that we can reproduce it.
The new issue didn't occur on my machine, and I suspect that your host kernel is somehow not compatible.
Maybe there are some restrictions on Azure that we didn't have on our machines. If possible, could you try to run TEEMon on your local machine?

@SamuelGong
Author

Alright. I will open another issue and try to give you a basis for reproducing it. Thanks again for your support!
