KarthikeyaAnna/SamplingProfiler

Linux Sampling Profiler

A lightweight, low-overhead sampling profiler for Linux applications. It uses the perf_event_open system call to sample the instruction pointer (IP) of a child process at high frequency, then aggregates the samples into a human-readable report showing the percentage of samples attributed to each function.
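
As a rough illustration of the perf_event_open setup described above, the sketch below fills a perf_event_attr for instruction-pointer sampling. The function names init_sampling_attr and open_sampling_event, the period argument, and the disabled/enable_on_exec choice are assumptions made for this example and are not taken from the project's source.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill `attr` for IP sampling on retired instructions.
 * `period` is chosen by the caller; the project's real value is not assumed here. */
void init_sampling_attr(struct perf_event_attr *attr, unsigned long long period)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_INSTRUCTIONS;
    attr->sample_period = period;        /* one sample every `period` instructions */
    attr->sample_type = PERF_SAMPLE_IP;  /* record only the instruction pointer */
    attr->exclude_kernel = 1;            /* user-space samples only */
    attr->exclude_hv = 1;                /* skip hypervisor samples */
    attr->disabled = 1;                  /* assumption: start disabled ... */
    attr->enable_on_exec = 1;            /* ... and enable when the child calls exec */
}

/* glibc ships no wrapper for perf_event_open, so it is invoked through syscall(2).
 * Returns a file descriptor, or -1 with errno set. */
long open_sampling_event(struct perf_event_attr *attr, pid_t pid)
{
    return syscall(SYS_perf_event_open, attr, pid,
                   -1 /* any CPU */, -1 /* no event group */, 0UL /* flags */);
}
```

A caller would typically run init_sampling_attr(&attr, period) and then pass the child's PID to open_sampling_event; a return value of -1 with errno set to EACCES usually points to the perf_event_paranoid setting.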

🚀 Features

  • Non-Intrusive Profiling: Uses hardware performance counters (via perf_event_open) to sample the CPU without modifying the target binary.
  • Symbol Resolution: Automatically maps instruction addresses to function names using dladdr.
  • Aggregation by Function: Groups multiple instruction pointers belonging to the same function to show total function-level impact.
  • Percentage-Based Reporting: Lists each function with its sample count and its percentage of all samples, so the hottest functions stand out.
  • Minimal Overhead: Reads samples from a ring buffer shared with the kernel via mmap, which avoids a system call per sample.
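
To make the ring-buffer idea more concrete, here is a minimal sketch of walking perf records between a tail and head position and extracting the IP from each sample. It assumes the event was opened with sample_type set to PERF_SAMPLE_IP only, so each sample record is a perf_event_header followed by one 64-bit IP. The names ring_copy and drain_samples are hypothetical, and the project's own reader may be structured differently.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `len` bytes starting at logical position `pos`, wrapping at the
 * end of the data area. */
static void ring_copy(void *dst, const uint8_t *data, uint64_t data_size,
                      uint64_t pos, size_t len)
{
    uint64_t off = pos % data_size;
    size_t first = (len < data_size - off) ? len : (size_t)(data_size - off);
    memcpy(dst, data + off, first);
    memcpy((uint8_t *)dst + first, data, len - first);
}

/* Consume records in [*tail, head) and store the IP of each
 * PERF_RECORD_SAMPLE in `out`. Returns the number of IPs stored and
 * advances *tail past every record that was consumed. */
size_t drain_samples(const uint8_t *data, uint64_t data_size,
                     uint64_t *tail, uint64_t head,
                     uint64_t *out, size_t max_out)
{
    size_t n = 0;
    assert(data_size > 0);
    while (n < max_out && head - *tail >= sizeof(struct perf_event_header)) {
        struct perf_event_header hdr;
        ring_copy(&hdr, data, data_size, *tail, sizeof(hdr));
        if (hdr.size < sizeof(hdr) || hdr.size > head - *tail)
            break; /* malformed or not yet fully written */
        if (hdr.type == PERF_RECORD_SAMPLE &&
            hdr.size >= sizeof(hdr) + sizeof(uint64_t)) {
            uint64_t ip;
            ring_copy(&ip, data, data_size, *tail + sizeof(hdr), sizeof(ip));
            out[n++] = ip;
        }
        *tail += hdr.size; /* other record types are skipped */
    }
    return n;
}
```

In a real reader, head would come from the data_head field of struct perf_event_mmap_page (read with acquire semantics), and the updated tail would be written back to data_tail once the records have been consumed.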

🛠 Architecture Overview

The profiler operates in two main phases:

  1. Sampling Phase:

    • The profiler forks a child process and executes the target binary.
    • It configures a PERF_COUNT_HW_INSTRUCTIONS event with a specific sample_period.
    • The Linux kernel periodically writes the current Instruction Pointer (IP) into a ring-buffer shared via mmap.
    • The profiler reads these samples asynchronously while the child runs.
  2. Reporting Phase:

    • Samples are stored in a custom Linear Probing Hash Map for efficient counting.
    • The profiler uses dladdr (Dynamic Linker API) to translate raw memory addresses into function symbols.
    • It aggregates all instruction pointers belonging to the same symbol and calculates each symbol's percentage of the total samples, which approximates its share of execution.
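
The counting step in the reporting phase can be sketched as a fixed-size, open-addressing table keyed by IP. This is an illustrative sketch only: the TABLE_SIZE value, the struct layout, the hash function, and the names ip_map_increment and ip_map_get are assumptions and may not match ip_hashmap.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 4096 /* assumed capacity; the project's value may differ */

struct ip_entry {
    uint64_t ip;    /* sampled instruction pointer */
    uint64_t count; /* number of samples; 0 marks an empty slot */
};

struct ip_map {
    struct ip_entry slots[TABLE_SIZE]; /* must start zero-initialized */
};

/* Multiplicative hash; any reasonable mixing function would do here. */
static size_t hash_ip(uint64_t ip)
{
    return (size_t)((ip * 0x9E3779B97F4A7C15ULL) >> 32) % TABLE_SIZE;
}

/* Add one sample for `ip`. Returns 0 on success, -1 if the table is full. */
int ip_map_increment(struct ip_map *map, uint64_t ip)
{
    assert(map != NULL);
    size_t start = hash_ip(ip);
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        struct ip_entry *e = &map->slots[(start + i) % TABLE_SIZE];
        if (e->count == 0) {  /* empty slot: claim it */
            e->ip = ip;
            e->count = 1;
            return 0;
        }
        if (e->ip == ip) {    /* existing entry: bump the counter */
            e->count++;
            return 0;
        }
        /* occupied by another IP: probe the next slot */
    }
    return -1;
}

/* Return the sample count for `ip`, or 0 if it was never recorded. */
uint64_t ip_map_get(const struct ip_map *map, uint64_t ip)
{
    size_t start = hash_ip(ip);
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        const struct ip_entry *e = &map->slots[(start + i) % TABLE_SIZE];
        if (e->count == 0)
            return 0;
        if (e->ip == ip)
            return e->count;
    }
    return 0;
}
```

Because entries are never deleted, a zero count is enough to mark an empty slot and no tombstones are needed. Once the table fills up, ip_map_increment reports failure instead of overwriting existing counts.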

📋 Prerequisites

  • Linux Kernel: Requires perf_events support (standard on most modern distros).
  • Permissions: Linux restricts access to performance counters by default. You may need to run:
    sudo sysctl -w kernel.perf_event_paranoid=1
    (-1 is the most permissive setting; 1 lets unprivileged users profile their own processes, including kernel samples; 2 limits them to user-space samples.)

📥 Installation

  1. Clone the repository:

    git clone https://github.com/KarthikeyaAnna/SamplingProfiler.git
    cd SamplingProfiler
  2. Build the project: The provided Makefile handles the specific flags required for symbol resolution.

    make

🖥 Usage

To profile a program, simply pass its path and arguments to the profiler:

./profiler ./test_target

Compilation Flags for Targets

For the profiler to resolve symbols correctly, your target programs should be compiled with:

  • -g: Include debug information.
  • -rdynamic: Export symbols to the dynamic symbol table (critical for dladdr).
  • -no-pie: Disable Position Independent Executable (PIE) generation so that code addresses stay the same across runs.

📊 Example Output

Test program started (pid=452607)
Done. sink=130049999695000000

Function Symbol / Address                Samples    Percentage
--------------------------------------------------------------------------
my_expensive_loop                        29602      78.54%
compute_hash                             6124       16.24%
main                                     1955       5.19%
_start                                   10         0.03%
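
The Percentage column relates each row's sample count to the total across all rows. A minimal sketch of that calculation, with rows sorted from hottest to coldest, might look like the following. The struct report_row type, the function names, and the exact column widths are assumptions for illustration and are not copied from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct report_row {
    const char *symbol; /* function name, or a formatted address */
    uint64_t samples;   /* samples attributed to this symbol */
};

/* Percentage of `total` represented by `samples`; 0 when there are no samples. */
double sample_share(uint64_t samples, uint64_t total)
{
    return total ? 100.0 * (double)samples / (double)total : 0.0;
}

/* qsort comparator: larger sample counts sort first. */
static int by_samples_desc(const void *a, const void *b)
{
    const struct report_row *ra = a, *rb = b;
    return (ra->samples < rb->samples) - (ra->samples > rb->samples);
}

/* Sort `rows` in place and print one line per symbol. */
void print_report(struct report_row *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += rows[i].samples;

    qsort(rows, n, sizeof(rows[0]), by_samples_desc);

    printf("%-40s %-10s %s\n", "Function Symbol / Address", "Samples", "Percentage");
    for (size_t i = 0; i < n; i++)
        printf("%-40s %-10llu %.2f%%\n", rows[i].symbol,
               (unsigned long long)rows[i].samples,
               sample_share(rows[i].samples, total));
}
```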

🧩 File Structure

File                  Description
sampling_profiler.c   Core logic: process forking, perf_event_open setup, and ring-buffer processing.
ip_hashmap.c          Implementation of the fixed-size hash map for storing IP counts and symbol aggregation.
ip_hashmap.h          Header definitions for the hash map and reporting functions.
test_code.c           A sample target program used to verify profiler accuracy.
Makefile              Build automation with correct linker flags.

⚠️ Limitations

  • User-Space Only: This implementation currently excludes kernel and hypervisor samples (exclude_kernel = 1).
  • Static Buffer: The sample storage is currently limited by TABLE_SIZE. For extremely long runs, a dynamic resizing hash map or periodic flushing would be required.
  • PIE Support: Position Independent Executables (PIE) are typically loaded at a different base address on each run, so profiling them may require additional offset calculation via /proc/<pid>/maps.
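
As a hedged sketch of the offset calculation mentioned for PIE targets, the code below parses one line of /proc/<pid>/maps and converts a runtime IP into an offset within the mapped file. The struct map_region type and both function names are hypothetical; the project does not currently implement this.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct map_region {
    uint64_t start;       /* first address of the mapping */
    uint64_t end;         /* one past the last address */
    uint64_t file_offset; /* offset of the mapping within the backing file */
    char perms[5];        /* e.g. "r-xp" */
};

/* Parse the leading "start-end perms offset" fields of a maps line.
 * Returns 1 on success, 0 if the line does not match. */
int parse_maps_line(const char *line, struct map_region *r)
{
    unsigned long long start, end, off;
    assert(line != NULL && r != NULL);
    if (sscanf(line, "%llx-%llx %4s %llx", &start, &end, r->perms, &off) != 4)
        return 0;
    r->start = start;
    r->end = end;
    r->file_offset = off;
    return 1;
}

/* If `ip` lies inside the region, store its offset within the backing
 * file in *out and return 1; otherwise return 0. */
int ip_to_file_offset(const struct map_region *r, uint64_t ip, uint64_t *out)
{
    if (ip < r->start || ip >= r->end)
        return 0;
    *out = ip - r->start + r->file_offset;
    return 1;
}
```

Turning the resulting file offset into a symbol name would additionally require reading the ELF headers and symbol table of the binary, which this sketch leaves out.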

📜 License

This project is open-source and available under the MIT License.

