fix(iast): wrong memory address in subprocess in mcp servers [backport 4.0] #15566
IAST-enabled applications using Gunicorn/Uvicorn workers were experiencing segmentation faults (~33% crash rate on MCP streaming requests) due to memory corruption when processes fork.

- C++ global singletons (`taint_engine_context`, `initializer`) initialized at module load
- Taint maps storing PyObject pointers by memory address
- Child processes after fork inherited stale pointers from parent process memory
- Accessing these stale pointers → use-after-free → SIGSEGV crash

- Implemented `pthread_atfork` handler that automatically resets C++ global state in child processes after every fork:
- Added comprehensive null-check wrappers around all native functions to prevent crashes when native state is
- Fixed test regression issues where context slots weren't being freed:

**[AddressSanitizer (ASAN)](https://github.com/google/sanitizers/wiki/AddressSanitizer)** is a fast memory error detector that catches use-after-free, buffer overflows, and other memory corruption bugs at runtime.

**1. Runtime Environment (No Recompilation Required)**

The simplest way to test is using LD_PRELOAD with the system's libasan:

```bash
ASAN_LIB=$(gcc -print-file-name=libasan.so)
LD_PRELOAD=$ASAN_LIB \
ASAN_OPTIONS="detect_leaks=0:symbolize=1:abort_on_error=0" \
python3 -m pytest tests/appsec/iast/test_fork_handler_regression.py -v
```

**ASAN_OPTIONS explained:**

- `detect_leaks=0` - Disable leak detection (Python has many false positives)
- `symbolize=1` - Show human-readable stack traces
- `abort_on_error=0` - Continue after first error (collect all errors)

**2. Build with ASAN (Optional, for deeper analysis)**

For more thorough testing, compile the native extension with ASAN:

```bash
export CFLAGS="-fsanitize=address -fno-omit-frame-pointer -g"
export CXXFLAGS="-fsanitize=address -fno-omit-frame-pointer -g"
export LDFLAGS="-fsanitize=address"
pip install --no-build-isolation --force-reinstall -e .

ASAN_OPTIONS="detect_leaks=0:symbolize=1:abort_on_error=0" \
python3 -m pytest tests/appsec/iast/test_fork_handler_regression.py -v
```

This script demonstrates the fork safety fix and can be used to verify ASAN finds no errors:

```python
"""Minimal fork safety reproduction test for ASAN verification."""
import os

from ddtrace.appsec._iast._taint_tracking import OriginType
from ddtrace.appsec._iast._taint_tracking._native import initialize_native_state
from ddtrace.appsec._iast._taint_tracking._taint_objects import taint_pyobject
from ddtrace.appsec._iast._taint_tracking._context import (
    start_request_context,
    debug_context_array_free_slots_number,
    debug_num_tainted_objects
)


def main():
    print(f"[Parent PID {os.getpid()}] Initializing IAST...")

    # Initialize native state
    initialize_native_state()

    # Create context and tainted objects in parent
    ctx_id = start_request_context()
    print(f"[Parent] Context created: {ctx_id}")

    # Create some tainted objects (populates native maps)
    for i in range(10):
        taint_pyobject(f"data_{i}", "source", f"value_{i}", OriginType.PARAMETER)

    tainted_count = debug_num_tainted_objects(ctx_id)
    print(f"[Parent] Tainted objects: {tainted_count}")

    # Fork
    pid = os.fork()

    if pid == 0:
        # Child process
        print(f"[Child PID {os.getpid()}] Started after fork")

        # Verify pthread_atfork reset worked
        free_slots = debug_context_array_free_slots_number()
        print(f"[Child] Free slots: {free_slots}")

        if free_slots > 0:
            print("[Child] ✅ Context slots were reset (pthread_atfork worked!)")
            os._exit(0)  # Success
        else:
            print("[Child] ❌ Context slots NOT reset")
            os._exit(1)  # Failure
    else:
        # Parent waits for child
        _, status = os.waitpid(pid, 0)
        exit_code = os.WEXITSTATUS(status)

        if exit_code == 0:
            print(f"[Parent] ✅ Child exited cleanly - Fork safety verified!")
            return 0
        else:
            print(f"[Parent] ❌ Child failed with exit code {exit_code}")
            return 1


if __name__ == "__main__":
    exit(main())
```

**Run with ASAN:**

```bash
LD_PRELOAD=$(gcc -print-file-name=libasan.so) \
ASAN_OPTIONS="detect_leaks=0:symbolize=1:abort_on_error=0" \
python3 test_fork_asan.py
```

**Expected output (success):**

```
[Parent PID 12345] Initializing IAST...
[Parent] Context created: 0
[Parent] Tainted objects: 10
[Child PID 12346] Started after fork
[Child] Free slots: 2
[Child] ✅ Context slots were reset (pthread_atfork worked!)
[Parent] ✅ Child exited cleanly - Fork safety verified!
```

**What ASAN would report WITHOUT this fix:**

```
==12346==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040
==12346==The signal is caused by a READ memory access.
    #0 0x7f... in get_tainted_object_map_by_ctx_id
    #1 0x7f... in debug_context_array_free_slots_number
```

(cherry picked from commit d1b4fd8)
**Bootstrap import analysis**

Comparison of import times between this PR and base.

**Summary**

The average import time from this PR is: 248 ± 4 ms.

The average import time from base is: 251 ± 4 ms.

The import time difference between this PR and base is: -2.6 ± 0.2 ms.

**Import time breakdown**

The following import paths have shrunk:
**Performance SLOs**

Comparing candidate backport-15514-to-4.0 (7f93a47) with baseline 4.0 (c9959f6)

**🟡 Near SLO Breach (2 suites)**

🟡 flasksimple - 18/18 ✅

| Benchmark | Time | Time SLO (margin) | Time vs baseline | Memory | Memory SLO (margin) | Memory vs baseline |
|---|---|---|---|---|---|---|
| appsec-get | 4.594ms | <4.750ms (-3.3%) | +0.3% | 63.856MB | <66.500MB (-4.0%) | +4.7% |
| appsec-post | 6.620ms | <6.750ms (🟡 -1.9%) | ~same | 63.991MB | <66.500MB (-3.8%) | +4.7% |
| appsec-telemetry | 4.588ms | <4.750ms (-3.4%) | -0.1% | 63.837MB | <66.500MB (-4.0%) | +4.6% |
| debugger | 1.854ms | <2.000ms (-7.3%) | -0.4% | 47.860MB | <49.500MB (-3.3%) | +5.3% |
| iast-get | 1.859ms | <2.000ms (-7.1%) | +0.2% | 44.650MB | <49.000MB (-8.9%) | +4.9% |
| profiler | 1.917ms | <2.100ms (-8.7%) | +0.3% | 48.629MB | <50.000MB (-2.7%) | +5.0% |
| resource-renaming | 3.370ms | <3.650ms (-7.7%) | ~same | 54.228MB | <56.000MB (-3.2%) | +4.9% |
| tracer | 3.361ms | <3.650ms (-7.9%) | +0.1% | 54.248MB | <56.500MB (-4.0%) | +4.9% |
| tracer-native | 3.358ms | <3.650ms (-8.0%) | +0.1% | 54.207MB | <60.000MB (-9.7%) | +4.7% |

🟡 telemetryaddmetric - 30/30 ✅

| Benchmark | Time | Time SLO (margin) | Time vs baseline | Memory | Memory SLO (margin) | Memory vs baseline |
|---|---|---|---|---|---|---|
| 1-count-metric-1-times | 2.977µs | <20.000µs (📉 -85.1%) | +1.6% | 34.544MB | <35.500MB (-2.7%) | +4.7% |
| 1-count-metrics-100-times | 200.724µs | <220.000µs (-8.8%) | +0.3% | 34.465MB | <35.500MB (-2.9%) | +4.8% |
| 1-distribution-metric-1-times | 3.589µs | <20.000µs (📉 -82.1%) | +9.7% | 34.544MB | <35.500MB (-2.7%) | +4.9% |
| 1-distribution-metrics-100-times | 215.319µs | <230.000µs (-6.4%) | +0.6% | 34.465MB | <35.500MB (-2.9%) | +4.9% |
| 1-gauge-metric-1-times | 2.190µs | <20.000µs (📉 -89.1%) | -1.1% | 34.485MB | <35.500MB (-2.9%) | +4.9% |
| 1-gauge-metrics-100-times | 138.303µs | <150.000µs (-7.8%) | +1.1% | 34.544MB | <35.500MB (-2.7%) | +5.0% |
| 1-rate-metric-1-times | 3.120µs | <20.000µs (📉 -84.4%) | +2.1% | 34.524MB | <35.500MB (-2.7%) | +4.8% |
| 1-rate-metrics-100-times | 215.045µs | <250.000µs (📉 -14.0%) | +0.1% | 34.544MB | <35.500MB (-2.7%) | +4.7% |
| 100-count-metrics-100-times | 20.246ms | <22.000ms (-8.0%) | -0.6% | 34.544MB | <35.500MB (-2.7%) | +4.8% |
| 100-distribution-metrics-100-times | 2.255ms | <2.300ms (🟡 -2.0%) | +0.9% | 34.544MB | <35.500MB (-2.7%) | +4.9% |
| 100-gauge-metrics-100-times | 1.408ms | <1.550ms (-9.1%) | +0.3% | 34.564MB | <35.500MB (-2.6%) | +5.0% |
| 100-rate-metrics-100-times | 2.204ms | <2.550ms (📉 -13.6%) | -0.5% | 34.524MB | <35.500MB (-2.7%) | +5.0% |
| flush-1-metric | 4.498µs | <20.000µs (📉 -77.5%) | +1.3% | 34.524MB | <35.500MB (-2.7%) | +4.9% |
| flush-100-metrics | 175.996µs | <250.000µs (📉 -29.6%) | +0.2% | 34.603MB | <35.500MB (-2.5%) | +5.1% |
| flush-1000-metrics | 2.140ms | <2.500ms (📉 -14.4%) | +0.3% | 35.291MB | <36.500MB (-3.3%) | +4.7% |

**📉 Performance Improvements (1 suite)**

📉 iastaspects - 118/118 ✅

| Benchmark | Time | Time SLO (margin) | Time vs baseline | Memory | Memory SLO (margin) | Memory vs baseline |
|---|---|---|---|---|---|---|
| add_aspect | 0.385µs | <10.000µs (📉 -96.2%) | 📉 -14.0% | 37.809MB | <41.500MB (-8.9%) | +0.8% |
| add_inplace_aspect | 0.383µs | <10.000µs (📉 -96.2%) | 📉 -14.2% | 37.825MB | <41.500MB (-8.9%) | +0.8% |
| add_inplace_noaspect | 0.291µs | <10.000µs (📉 -97.1%) | +0.9% | 37.789MB | <41.500MB (-8.9%) | +0.6% |
| add_noaspect | 0.358µs | <10.000µs (📉 -96.4%) | +0.8% | 37.967MB | <41.500MB (-8.5%) | +0.9% |
| bytearray_aspect | 1.317µs | <10.000µs (📉 -86.8%) | -1.4% | 37.717MB | <41.500MB (-9.1%) | +0.3% |
| bytearray_extend_aspect | 1.562µs | <10.000µs (📉 -84.4%) | -1.5% | 37.837MB | <41.500MB (-8.8%) | +0.5% |
| bytearray_extend_noaspect | 0.613µs | <10.000µs (📉 -93.9%) | -0.7% | 37.739MB | <41.500MB (-9.1%) | +0.3% |
| bytearray_noaspect | 0.484µs | <10.000µs (📉 -95.2%) | -0.2% | 37.768MB | <41.500MB (-9.0%) | +0.4% |
| bytes_aspect | 1.354µs | <10.000µs (📉 -86.5%) | +3.9% | 37.627MB | <41.500MB (-9.3%) | ~same |
| bytes_noaspect | 0.494µs | <10.000µs (📉 -95.1%) | -0.2% | 37.773MB | <41.500MB (-9.0%) | +0.6% |
| bytesio_aspect | 1.349µs | <10.000µs (📉 -86.5%) | +2.4% | 37.734MB | <41.500MB (-9.1%) | +0.2% |
| bytesio_noaspect | 0.497µs | <10.000µs (📉 -95.0%) | -1.0% | 37.816MB | <41.500MB (-8.9%) | +0.6% |
| capitalize_aspect | 0.741µs | <10.000µs (📉 -92.6%) | -0.1% | 37.834MB | <41.500MB (-8.8%) | +0.7% |
| capitalize_noaspect | 0.438µs | <10.000µs (📉 -95.6%) | +1.0% | 37.798MB | <41.500MB (-8.9%) | +0.6% |
| casefold_aspect | 0.740µs | <10.000µs (📉 -92.6%) | +0.3% | 37.768MB | <41.500MB (-9.0%) | +0.4% |
| casefold_noaspect | 0.368µs | <10.000µs (📉 -96.3%) | -0.2% | 37.823MB | <41.500MB (-8.9%) | +0.8% |
| decode_aspect | 0.738µs | <10.000µs (📉 -92.6%) | +1.2% | 37.812MB | <41.500MB (-8.9%) | +0.4% |
| decode_noaspect | 0.422µs | <10.000µs (📉 -95.8%) | +0.6% | 37.849MB | <41.500MB (-8.8%) | +0.8% |
| encode_aspect | 0.712µs | <10.000µs (📉 -92.9%) | -0.6% | 37.792MB | <41.500MB (-8.9%) | +0.7% |
| encode_noaspect | 0.410µs | <10.000µs (📉 -95.9%) | +1.7% | 37.786MB | <41.500MB (-9.0%) | +0.4% |
| format_aspect | 3.470µs | <10.000µs (📉 -65.3%) | +2.1% | 37.830MB | <41.500MB (-8.8%) | +0.5% |
| format_map_aspect | 3.697µs | <10.000µs (📉 -63.0%) | +3.6% | 37.704MB | <41.500MB (-9.1%) | ~same |
| format_map_noaspect | 0.828µs | <10.000µs (📉 -91.7%) | +1.8% | 37.792MB | <41.500MB (-8.9%) | +0.5% |
| format_noaspect | 0.598µs | <10.000µs (📉 -94.0%) | -3.1% | 37.799MB | <41.500MB (-8.9%) | +0.7% |
| index_aspect | 0.343µs | <10.000µs (📉 -96.6%) | -4.7% | 37.649MB | <41.500MB (-9.3%) | ~same |
| index_noaspect | 0.313µs | <10.000µs (📉 -96.9%) | -0.4% | 37.825MB | <41.500MB (-8.9%) | +0.5% |
| join_aspect | 1.271µs | <10.000µs (📉 -87.3%) | 📉 -12.5% | 37.751MB | <41.500MB (-9.0%) | +0.3% |
| join_noaspect | 0.532µs | <10.000µs (📉 -94.7%) | -0.6% | 37.841MB | <41.500MB (-8.8%) | +0.7% |
| ljust_aspect | 2.580µs | <20.000µs (📉 -87.1%) | +1.6% | 37.791MB | <41.500MB (-8.9%) | +0.5% |
| ljust_noaspect | 0.410µs | <10.000µs (📉 -95.9%) | +0.9% | 37.959MB | <41.500MB (-8.5%) | +1.1% |
| lower_aspect | 2.271µs | <10.000µs (📉 -77.3%) | +2.6% | 37.752MB | <41.500MB (-9.0%) | +0.5% |
| lower_noaspect | 0.372µs | <10.000µs (📉 -96.3%) | +0.5% | 37.688MB | <41.500MB (-9.2%) | +0.3% |
| lstrip_aspect | 2.214µs | <20.000µs (📉 -88.9%) | +1.4% | 37.714MB | <41.500MB (-9.1%) | +0.2% |
| lstrip_noaspect | 0.385µs | <10.000µs (📉 -96.2%) | ~same | 37.826MB | <41.500MB (-8.9%) | +0.6% |
| modulo_aspect | 0.969µs | <10.000µs (📉 -90.3%) | 📉 -10.9% | 37.810MB | <41.500MB (-8.9%) | +0.6% |
| modulo_aspect_for_bytearray_bytearray | 1.487µs | <10.000µs (📉 -85.1%) | -7.8% | 37.743MB | <41.500MB (-9.1%) | +0.6% |
| modulo_aspect_for_bytes | 0.955µs | <10.000µs (📉 -90.5%) | -6.9% | 37.858MB | <41.500MB (-8.8%) | +0.8% |
| modulo_aspect_for_bytes_bytearray | 1.160µs | <10.000µs (📉 -88.4%) | 📉 -11.9% | 37.757MB | <41.500MB (-9.0%) | +0.2% |
| modulo_noaspect | 0.674µs | <10.000µs (📉 -93.3%) | -1.2% | 37.820MB | <41.500MB (-8.9%) | +0.9% |
| replace_aspect | 5.388µs | <10.000µs (📉 -46.1%) | +9.5% | 37.799MB | <41.500MB (-8.9%) | +0.4% |
| replace_noaspect | 0.463µs | <10.000µs (📉 -95.4%) | -1.0% | 37.844MB | <41.500MB (-8.8%) | +0.9% |
| repr_aspect | 0.947µs | <10.000µs (📉 -90.5%) | ~same | 37.801MB | <41.500MB (-8.9%) | +0.5% |
| repr_noaspect | 0.452µs | <10.000µs (📉 -95.5%) | ~same | 37.802MB | <41.500MB (-8.9%) | +0.2% |
| rstrip_aspect | 1.868µs | <20.000µs (📉 -90.7%) | +0.5% | 37.793MB | <41.500MB (-8.9%) | +0.7% |
| rstrip_noaspect | 0.385µs | <10.000µs (📉 -96.1%) | +0.6% | 37.772MB | <41.500MB (-9.0%) | +0.6% |
| slice_aspect | 0.490µs | <10.000µs (📉 -95.1%) | -1.5% | 37.784MB | <41.500MB (-9.0%) | +0.5% |
| slice_noaspect | 0.449µs | <10.000µs (📉 -95.5%) | -0.4% | 37.856MB | <41.500MB (-8.8%) | +0.7% |
| stringio_aspect | 1.686µs | <10.000µs (📉 -83.1%) | -1.7% | 37.780MB | <41.500MB (-9.0%) | +0.5% |
| stringio_noaspect | 0.918µs | <10.000µs (📉 -90.8%) | -0.4% | 37.797MB | <41.500MB (-8.9%) | +0.5% |
| strip_aspect | 2.162µs | <20.000µs (📉 -89.2%) | -1.3% | 37.825MB | <41.500MB (-8.9%) | +0.8% |
| strip_noaspect | 0.387µs | <10.000µs (📉 -96.1%) | +0.1% | 37.838MB | <41.500MB (-8.8%) | +0.7% |
| swapcase_aspect | 2.575µs | <10.000µs (📉 -74.2%) | +7.4% | 37.694MB | <41.500MB (-9.2%) | +0.1% |
| swapcase_noaspect | 0.535µs | <10.000µs (📉 -94.7%) | -0.5% | 37.780MB | <41.500MB (-9.0%) | +0.5% |
| title_aspect | 2.395µs | <10.000µs (📉 -76.0%) | +3.2% | 37.777MB | <41.500MB (-9.0%) | +0.4% |
| title_noaspect | 0.501µs | <10.000µs (📉 -95.0%) | +0.3% | 37.812MB | <41.500MB (-8.9%) | +0.6% |
| translate_aspect | 3.256µs | <10.000µs (📉 -67.4%) | +1.6% | 37.800MB | <41.500MB (-8.9%) | +0.5% |
| translate_noaspect | 1.046µs | <10.000µs (📉 -89.5%) | +0.4% | 37.804MB | <41.500MB (-8.9%) | +0.6% |
| upper_aspect | 2.281µs | <10.000µs (📉 -77.2%) | +1.9% | 37.810MB | <41.500MB (-8.9%) | +0.7% |
| upper_noaspect | 0.375µs | <10.000µs (📉 -96.2%) | +2.6% | 37.830MB | <41.500MB (-8.8%) | +0.4% |
Backport d1b4fd8 from #15514 to 4.0.