4 changes: 4 additions & 0 deletions docs/source/cpp/security.rst
@@ -63,6 +63,10 @@ is always assumed to be :ref:`valid <format-invalid-data>`. If your program may
encounter invalid data, it must explicitly check its validity by calling one of
the following validation APIs.

Note that library crashes or hangs triggered by invalid data are generally
considered bugs rather than security vulnerabilities, unless the behavior
is exploitable (see :ref:`Bugs vs. Security Vulnerabilities <bugs_vs_security>`).

Structural validity
'''''''''''''''''''

28 changes: 28 additions & 0 deletions docs/source/format/Security.rst
@@ -51,6 +51,34 @@ You should read this document if you belong to either of these two categories:
documented on https://arrow.apache.org.


.. _bugs_vs_security:

Bugs vs. Security Vulnerabilities
=================================

Arrow aims for robustness when processing untrusted data, but it is important to
distinguish functional bugs from security vulnerabilities.

Unexpected behavior (e.g., crashes or infinite loops) triggered by malformed
input is generally considered a **bug**, not a security vulnerability, unless it
is **exploitable**. An issue is exploitable if an attacker can:

* Execute arbitrary code (RCE);
* Exfiltrate sensitive information from process memory (Information Disclosure);
* Cause a sustained Denial of Service (DoS) affecting the broader system.
Contributor

Here we define DoS as an exploitable issue, but then say that process crashes are not exploitable?

I think what gets tricky is that technically any Arrow API could be exposed by a client application, and therefore could in theory be exploitable in that application's context.

I wonder if we need to distinguish between network APIs, e.g. Arrow Flight, and internal APIs?

Contributor Author

Yeah, I was hedging with "sustained" and "affecting the broader system". I don't think we should treat panics or OOMs as security issues (they are certainly bugs).

I want it to be clear to downstream users that they need to take other precautions (such as process sandboxing and cgroups) to make their systems resilient, rather than assume we will treat every bug as a security issue.
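
As a rough illustration of the kind of precaution meant here, the sketch below runs untrusted-input handling in a child process with CPU and memory limits, so a crash or runaway loop stays contained. It assumes a Linux/POSIX system and uses only Python's standard `resource` and `subprocess` modules; `run_isolated` and the limit values are hypothetical and not something Arrow provides.

```python
import resource
import subprocess
import sys


def _lower_soft_limit(which: int, value: int) -> None:
    """Lower the soft limit for `which`, never exceeding the current hard limit."""
    _, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(which, (value, hard))


def run_isolated(
    code: str,
    cpu_seconds: int = 5,
    memory_bytes: int = 1024 * 1024 * 1024,
    timeout: float = 30.0,
) -> subprocess.CompletedProcess:
    """Run `code` in a separate Python process with CPU and address-space limits.

    A crash or resource blow-up in the child shows up as a non-zero return code
    in the parent instead of taking the parent down with it.
    """

    def apply_limits() -> None:
        # Executed in the child between fork and exec; the parent is unaffected.
        _lower_soft_limit(resource.RLIMIT_CPU, cpu_seconds)
        _lower_soft_limit(resource.RLIMIT_AS, memory_bytes)

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    # Hypothetical stand-in for "decode untrusted data": here the child aborts.
    result = run_isolated("import os; os.abort()")
    # A negative return code means the child was terminated by a signal.
    print("child return code:", result.returncode)
```

Per-process limits like these are only one layer; cgroups or a container runtime would cap resources for a whole group of processes.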

Contributor Author

Maybe we should just remove "denial of service" from the list 🤔


Examples of bugs that are typically **not** security vulnerabilities:

* Process-local crashes (SIGSEGV, null pointer dereference, or ``std::abort``)
  that cannot be leveraged for code execution or information disclosure;
* Resource exhaustion (infinite loops, high CPU/memory usage) that only
  affects the local process.

Report such issues on our `public issue tracker <https://github.com/apache/arrow/issues>`_.
If you suspect an issue is exploitable, report it privately via the
`ASF security process <https://apache.org/security/#reporting-a-vulnerability>`_.


Columnar Format
===============
