From db188f77bde4a5b718c5434482e1c62bb172b3a1 Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Wed, 15 Apr 2026 07:16:46 -0400
Subject: [PATCH 1/3] [GH 49760]: Document the difference between bug and security vulnerability

---
 docs/source/cpp/security.rst    |  4 ++++
 docs/source/format/Security.rst | 38 +++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/docs/source/cpp/security.rst b/docs/source/cpp/security.rst
index ee35f7b296f..727d9feeead 100644
--- a/docs/source/cpp/security.rst
+++ b/docs/source/cpp/security.rst
@@ -63,6 +63,10 @@ is always assumed to be :ref:`valid `. If your program may encounter
 invalid data, it must explicitly check its validity by calling one of
 the following validation APIs.

+Note that even if a library crashes or hangs when encountering invalid data, it is
+generally considered a bug rather than a security vulnerability, unless the behavior
+is exploitable (see :ref:`Bugs vs. Security Vulnerabilities `).
+
 Structural validity
 '''''''''''''''''''

diff --git a/docs/source/format/Security.rst b/docs/source/format/Security.rst
index 8e630ea9a55..94a992c002b 100644
--- a/docs/source/format/Security.rst
+++ b/docs/source/format/Security.rst
@@ -51,6 +51,44 @@ You should read this document if you belong to either of these two categories:
    documented on https://arrow.apache.org.


+Bugs vs. Security Vulnerabilities
+=================================
+
+The Arrow project aims for robustness when processing data from untrusted
+sources. However, it is important to distinguish between functional bugs
+and security vulnerabilities.
+
+Invalid input files (such as malformed IPC streams or Parquet files) that
+cause an Arrow implementation to misbehave, (for example, by triggering
+a segmentation fault or an infinite loop) are generally considered **bugs**,
+not security vulnerabilities, unless the behavior is **exploitable**.
+
+Such uncontrolled behavior is considered **exploitable** if it
+can be leveraged by an attacker to:
+
+* Perform arbitrary code execution (e.g. Remote Code Execution);
+* Access or exfiltrate sensitive information from the process memory
+  (Information Disclosure);
+* Cause a sustained Denial of Service (DoS) affecting the broader system
+  (beyond the individual process processing the data).
+
+Examples of behaviors that are bugs but generally **not** security vulnerabilities:
+
+* A segmentation fault (SIGSEGV) or null pointer dereference occurring within
+  the process parsing an invalid file, provided it cannot be leveraged for
+  code execution or information disclosure;
+* An assertion failure or abortion (`std::abort`) triggered by an internal
+  sanity check when encountering malformed data;
+* An infinite loop or excessive CPU/memory usage that only affects the local
+  process and does not impact the availability of the overall system.
+
+We encourage users to report such uncontrolled behavior on invalid data as
+regular bugs in our `public issue tracker `_ so they can be fixed. If you suspect
+an issue is exploitable, please follow the
+`ASF security reporting process `_
+and report it privately.
+
+
 Columnar Format
 ===============

From e36cd16f09606c436428770905d27a32f8ece5d6 Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Wed, 15 Apr 2026 07:20:14 -0400
Subject: [PATCH 2/3] tweaks

---
 docs/source/cpp/security.rst    | 2 +-
 docs/source/format/Security.rst | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/source/cpp/security.rst b/docs/source/cpp/security.rst
index 727d9feeead..0208ca4a5e1 100644
--- a/docs/source/cpp/security.rst
+++ b/docs/source/cpp/security.rst
@@ -65,7 +65,7 @@ the following validation APIs.

 Note that even if a library crashes or hangs when encountering invalid data, it is
 generally considered a bug rather than a security vulnerability, unless the behavior
-is exploitable (see :ref:`Bugs vs. Security Vulnerabilities `).
+is exploitable (see :ref:`Bugs vs. Security Vulnerabilities `).

 Structural validity
 '''''''''''''''''''
diff --git a/docs/source/format/Security.rst b/docs/source/format/Security.rst
index 94a992c002b..0a65b8bf8a1 100644
--- a/docs/source/format/Security.rst
+++ b/docs/source/format/Security.rst
@@ -51,6 +51,8 @@ You should read this document if you belong to either of these two categories:
    documented on https://arrow.apache.org.


+.. _bugs_vs_security:
+
 Bugs vs. Security Vulnerabilities
 =================================

@@ -59,7 +61,7 @@ sources. However, it is important to distinguish between functional bugs
 and security vulnerabilities.

 Invalid input files (such as malformed IPC streams or Parquet files) that
-cause an Arrow implementation to misbehave, (for example, by triggering
+cause an Arrow implementation to misbehave (for example, by triggering
 a segmentation fault or an infinite loop) are generally considered **bugs**,
 not security vulnerabilities, unless the behavior is **exploitable**.

From 705ce5c97a31e340d665373f8a500e65989ef627 Mon Sep 17 00:00:00 2001
From: Andrew Lamb
Date: Wed, 15 Apr 2026 07:22:42 -0400
Subject: [PATCH 3/3] make concise

---
 docs/source/cpp/security.rst    |  4 +--
 docs/source/format/Security.rst | 54 +++++++++++++--------------------
 2 files changed, 23 insertions(+), 35 deletions(-)

diff --git a/docs/source/cpp/security.rst b/docs/source/cpp/security.rst
index 0208ca4a5e1..4a754cc36c0 100644
--- a/docs/source/cpp/security.rst
+++ b/docs/source/cpp/security.rst
@@ -63,8 +63,8 @@ is always assumed to be :ref:`valid `. If your program may encounter
 invalid data, it must explicitly check its validity by calling one of
 the following validation APIs.

-Note that even if a library crashes or hangs when encountering invalid data, it is
-generally considered a bug rather than a security vulnerability, unless the behavior
+Note that library crashes or hangs triggered by invalid data are generally
+considered bugs rather than security vulnerabilities, unless the behavior
 is exploitable (see :ref:`Bugs vs. Security Vulnerabilities `).

 Structural validity
diff --git a/docs/source/format/Security.rst b/docs/source/format/Security.rst
index 0a65b8bf8a1..253a9f4c8ca 100644
--- a/docs/source/format/Security.rst
+++ b/docs/source/format/Security.rst
@@ -56,39 +56,27 @@ You should read this document if you belong to either of these two categories:
 Bugs vs. Security Vulnerabilities
 =================================

-The Arrow project aims for robustness when processing data from untrusted
-sources. However, it is important to distinguish between functional bugs
-and security vulnerabilities.
-
-Invalid input files (such as malformed IPC streams or Parquet files) that
-cause an Arrow implementation to misbehave (for example, by triggering
-a segmentation fault or an infinite loop) are generally considered **bugs**,
-not security vulnerabilities, unless the behavior is **exploitable**.
-
-Such uncontrolled behavior is considered **exploitable** if it
-can be leveraged by an attacker to:
-
-* Perform arbitrary code execution (e.g. Remote Code Execution);
-* Access or exfiltrate sensitive information from the process memory
-  (Information Disclosure);
-* Cause a sustained Denial of Service (DoS) affecting the broader system
-  (beyond the individual process processing the data).
-
-Examples of behaviors that are bugs but generally **not** security vulnerabilities:
-
-* A segmentation fault (SIGSEGV) or null pointer dereference occurring within
-  the process parsing an invalid file, provided it cannot be leveraged for
-  code execution or information disclosure;
-* An assertion failure or abortion (`std::abort`) triggered by an internal
-  sanity check when encountering malformed data;
-* An infinite loop or excessive CPU/memory usage that only affects the local
-  process and does not impact the availability of the overall system.
-
-We encourage users to report such uncontrolled behavior on invalid data as
-regular bugs in our `public issue tracker `_ so they can be fixed. If you suspect
-an issue is exploitable, please follow the
-`ASF security reporting process `_
-and report it privately.
+Arrow aims for robustness when processing untrusted data, but it is important to
+distinguish functional bugs from security vulnerabilities.
+
+Unexpected behavior (e.g., crashes or infinite loops) triggered by malformed
+input is generally considered a **bug**, not a security vulnerability, unless it
+is **exploitable**. An issue is exploitable if an attacker can:
+
+* Execute arbitrary code (RCE);
+* Exfiltrate sensitive information from process memory (Information Disclosure);
+* Cause a sustained Denial of Service (DoS) affecting the broader system.
+
+Examples of bugs that are typically **not** security vulnerabilities:
+
+* Process-local crashes (SIGSEGV, null pointer dereference, or `std::abort`)
+  that cannot be leveraged for code execution or information disclosure;
+* Resource exhaustion (infinite loops, high CPU/memory usage) that only
+  affects the local process.
+
+Report such issues on our `public issue tracker `_.
+If you suspect an issue is exploitable, report it privately via the
+`ASF security process `_.


 Columnar Format