Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perfetto crash #213

Closed
miklelappo opened this issue Dec 16, 2021 · 5 comments
Closed

Perfetto crash #213

miklelappo opened this issue Dec 16, 2021 · 5 comments

Comments

@miklelappo
Copy link
Contributor

miklelappo commented Dec 16, 2021

Ubuntu 20.04, x86_64, master

-----BEGIN PERFETTO PRE-CRASH LOG-----
proc_count: 1575
[225.324] trace_controller.cc:306 Reposting immediate ReadTick as there's more work.
[230.033] ory_arbiter_impl.cc:192 Shared memory buffer overrun! Stalling
[230.076] ory_arbiter_impl.cc:167 Recovered from stall after 11 iterations
[230.157] trace_controller.cc:306 Reposting immediate ReadTick as there's more work.
[230.267] tats_data_source.cc:676 ProcessStatsDataSource clearing incremental state.
[230.710] trace_controller.cc:306 Reposting immediate ReadTick as there's more work.
[230.866] trace_controller.cc:306 Reposting immediate ReadTick as there's more work.
[230.876] trace_controller.cc:297 PERFETTO_CHECK(pages_read <= max_pages) (errno: 0, Success)

-----END PERFETTO PRE-CRASH LOG-----

------------------ BEGINNING OF CRASH ------------------
Signal: Illegal instruction (possibly unaligned access)
Fault addr: 000055C7363DE5D5

Backtrace:

#00  0x00007EFC7F7C93C0

#01  perfetto::FtraceController::ReadTick(int)
     ../../src/traced/probes/ftrace/ftrace_controller.cc:297

#02  operator()
     ../../src/traced/probes/ftrace/ftrace_controller.cc:310

#03  decltype(std::__1::forward<perfetto::FtraceController::ReadTick(int)::$_1&>(fp)()) std::__1::__invoke<perfetto::FtraceController::ReadTick(int)::$_1&>(perfetto::FtraceController::ReadTick(int)::$_1&)
     ../../buildtools/libcxx/include/type_traits:3529

#04  void std::__1::__invoke_void_return_wrapper<void>::__call<perfetto::FtraceController::ReadTick(int)::$_1&>(perfetto::FtraceController::ReadTick(int)::$_1&)
     ../../buildtools/libcxx/include/__functional_base:349

#05  std::__1::__function::__alloc_func<perfetto::FtraceController::ReadTick(int)::$_1, std::__1::allocator<perfetto::FtraceController::ReadTick(int)::$_1>, void ()>::operator()()
     ../../buildtools/libcxx/include/functional:1540

#06  std::__1::__function::__func<perfetto::FtraceController::ReadTick(int)::$_1, std::__1::allocator<perfetto::FtraceController::ReadTick(int)::$_1>, void ()>::operator()()
     ../../buildtools/libcxx/include/functional:1714

#07  std::__1::__function::__value_func<void ()>::operator()() const
     ../../buildtools/libcxx/include/functional:1867

#08  std::__1::function<void ()>::operator()() const
     ../../buildtools/libcxx/include/functional:2473

#09  perfetto::base::RunTaskWithWatchdogGuard(std::__1::function<void ()> const&)
     ../../include/perfetto/ext/base/watchdog.h:67

#0A  perfetto::base::UnixTaskRunner::RunImmediateAndDelayedTask()
     ../../src/base/unix_task_runner.cc:152

#0B  perfetto::base::UnixTaskRunner::Run()
     ../../src/base/unix_task_runner.cc:56

#0C  perfetto::ProbesMain(int, char**)
     ../../src/traced/probes/probes.cc:140

#0D  perfetto::(anonymous namespace)::TraceboxMain(int, char**)
     ../../src/tracebox/tracebox.cc:86

#0E  main
     ../../src/tracebox/tracebox.cc:196

#0F  0x00007EFC7F46E0B3

#10  0x000055C73606915E
------------------ END OF CRASH ------------------
@chromy
Copy link
Contributor

chromy commented Dec 16, 2021

Hi Mikhail, thanks for the bug report!

You're hitting this D(ebug)CHECK. Cursory inspection of the code suggests that the DCHECK might be incorrect (it's not clear to me that ReadCycle is guaranteed to return <= max_pages) - if so we'll update the DCHECK and/or code as necessary.

In the mean time you can build in release mode (is_debug = false) which ignores DCHECKS (and also will improve performance of everything quite a bit).

@miklelappo
Copy link
Contributor Author

@chromy , thanks! The problem is also reproducible with 22.1

@miklelappo
Copy link
Contributor Author

miklelappo commented Dec 16, 2021

@chromy in release it's even worse... tracebox exits few seconds after client connects

[236.276]          service.cc:237 Started traced, listening on /tmp/perfetto-producer /tmp/perfetto-consumer
[236.340]           probes.cc:104 Starting traced_probes service
[236.341]  probes_producer.cc:114 Connected to the service
[236.341]     perfetto_cmd.cc:914 Connected to the Perfetto traced service, TTL: 1800s
[236.342] ing_service_impl.cc:956 Configured tracing session 1, #sources:4, duration:1800000 ms, #buffers:1, total buffer size:409600 KB, total sessions:1, uid:12031245 session name: ""
[236.796]  probes_producer.cc:251 Ftrace setup (target_buf=1)
[236.822]    ftrace_procfs.cc:217 enabled ftrace in /sys/kernel/tracing/
[250.003] ory_arbiter_impl.cc:192 Shared memory buffer overrun! Stalling
[250.046] ory_arbiter_impl.cc:167 Recovered from stall after 11 iterations
[252.504] ory_arbiter_impl.cc:192 Shared memory buffer overrun! Stalling
[252.847] ory_arbiter_impl.cc:167 Recovered from stall after 17 iterations
[252.881]  probes_producer.cc:360 Producer stop (id=1)
[252.883]    ftrace_procfs.cc:224 disabled ftrace in /sys/kernel/tracing/
[253.488]  probes_producer.cc:360 Producer stop (id=2)
[253.488]  probes_producer.cc:360 Producer stop (id=3)
[253.488]    perfetto_cmd.cc:1035 Trace written into the output file

Do you have an idea?

@miklelappo
Copy link
Contributor Author

miklelappo commented Dec 16, 2021

Config file:

buffers: {
    size_kb: 409600
    fill_policy: RING_BUFFER
}

duration_ms: 1800000
write_into_file: true
file_write_period_ms: 2500
max_file_size_bytes: 100000000
flush_period_ms: 30000

incremental_state_config {
    clear_period_ms: 5000
}

# Ftrace data from the kernel, mainly the process scheduling events.
data_sources {
  config {
    name: "linux.ftrace"
    target_buffer: 0
    ftrace_config {
      ftrace_events: "sched/sched_switch"
      ftrace_events: "power/suspend_resume"
      ftrace_events: "sched/sched_wakeup"
      ftrace_events: "sched/sched_wakeup_new"
      ftrace_events: "sched/sched_waking"
      ftrace_events: "raw_syscalls/sys_enter"
      ftrace_events: "raw_syscalls/sys_exit"
      ftrace_events: "sched/sched_process_exit"
      ftrace_events: "sched/sched_process_free"
      ftrace_events: "task/task_newtask"
      ftrace_events: "task/task_rename"
      ftrace_events: "sched/sched_blocked_reason"
      buffer_size_kb: 2048
      drain_period_ms: 250
    }
  }
}

data_sources: {
    config {
        name: "linux.process_stats"
        target_buffer: 0
        process_stats_config {
            scan_all_processes_on_start: true
        }
    }
}
data_sources: {
    config {
        name: "linux.sys_stats"
        sys_stats_config {
            stat_period_ms: 1000
            stat_counters: STAT_CPU_TIMES
            stat_counters: STAT_FORK_COUNT
        }
    }
}

data_sources: {
    config {
        name: "track_event"
    }
}

@chromy
Copy link
Contributor

chromy commented Jul 11, 2022

Sorry I can't reproduce this or see any hints in that log. I appreciate it's been a while. If you are still running into this issue please reopen.

@chromy chromy closed this as completed Jul 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants