Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for extracting attachments from OneNote section files #1048

Merged
merged 8 commits into from Dec 11, 2023

Conversation

micahsnyder
Copy link
Contributor

@micahsnyder micahsnyder commented Oct 8, 2023

Resolves: #819

Also includes rudimentary support in libclamav Rust code for getting slices from FMap's and for interacting with libclamav's context structure.

@micahsnyder micahsnyder changed the title WIP: Support for extracting attachments from OneNote section files Support for extracting attachments from OneNote section files Oct 10, 2023
@micahsnyder
Copy link
Contributor Author

I added 3 test tiles to demonstrate extraction of the clam.exe program from OneNote section files exported by OneNote 2007, OneNote 2010, and a recent Office364 OneNote export. If extraction were to fail, it would fail the libclamav unit tests as well as some clamd and clamscan tests.

I have temporarily set the onenote_parser module to pull from https://github.com/micahsnyder/onenote.rs/tree/CLAM-2329-new-from-slice pending this PR: msiemens/onenote.rs#12

I think it's ready for review.

libclamav/scanners.c Outdated Show resolved Hide resolved
libclamav_rust/build.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@ragusaa ragusaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to address the hardcoded path before merging?

@micahsnyder
Copy link
Contributor Author

Ok fixed to remove hardcoded path and dead code.

Copy link
Contributor

@shutton shutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't flag all the spots where you could probably pull out some const's in onenote.rs for all those hardcoded hex strings.

libclamav_rust/Cargo.toml Outdated Show resolved Hide resolved
libclamav_rust/src/onenote.rs Outdated Show resolved Hide resolved
libclamav_rust/src/onenote.rs Show resolved Hide resolved
libclamav_rust/src/onenote.rs Outdated Show resolved Hide resolved
libclamav_rust/src/onenote.rs Show resolved Hide resolved
libclamav_rust/src/scanners.rs Outdated Show resolved Hide resolved
libclamav_rust/src/scanners.rs Outdated Show resolved Hide resolved
libclamav_rust/src/scanners.rs Outdated Show resolved Hide resolved
libclamav_rust/src/scanners.rs Outdated Show resolved Hide resolved
libclamav_rust/src/util.rs Outdated Show resolved Hide resolved
libclamav_rust/src/ctx.rs Show resolved Hide resolved
libclamav_rust/src/ctx.rs Show resolved Hide resolved
libclamav_rust/Cargo.toml Outdated Show resolved Hide resolved
libclamav_rust/src/util.rs Show resolved Hide resolved
libclamav_rust/src/ctx.rs Outdated Show resolved Hide resolved
libclamav_rust/build.rs Show resolved Hide resolved
libclamav_rust/src/fmap.rs Outdated Show resolved Hide resolved
libclamav_rust/src/onenote.rs Outdated Show resolved Hide resolved
libclamav_rust/src/fmap.rs Outdated Show resolved Hide resolved
libclamav_rust/src/fmap.rs Show resolved Hide resolved
@Cisco-Talos Cisco-Talos deleted a comment from micahsnyder Nov 6, 2023
libclamav_rust/src/util.rs Show resolved Hide resolved
libclamav_rust/src/onenote.rs Outdated Show resolved Hide resolved
libclamav_rust/src/fuzzy_hash.rs Outdated Show resolved Hide resolved
libclamav_rust/src/fuzzy_hash.rs Outdated Show resolved Hide resolved
libclamav_rust/src/fuzzy_hash.rs Show resolved Hide resolved
libclamav_rust/src/fmap.rs Outdated Show resolved Hide resolved
libclamav_rust/src/ctx.rs Outdated Show resolved Hide resolved
libclamav/scanners.c Outdated Show resolved Hide resolved
@micahsnyder micahsnyder force-pushed the CLAM-2330-onenote-c-api branch 2 times, most recently from ca55c59 to 2633cbc Compare November 28, 2023 03:37
Includes rudimentary support for getting slices from FMap's and for
interacting with libclamav's context structure.

For now will use a Cisco-Talos org fork of the onenote_parser
until the feature to read open a onenote section from a slice (instead
of from a filepath) is added to the upstream.
In order to generate Rust bindings for C code, Rust's bindgen module
needs to know where to find all headers included by the API.
If they're all inside the project or inside the standard include path
(e.g. /usr/include and /usr/local/include) that's fine. But for third-
party C library headers from outside the standard include path, that's
a problem.
We didn't really notice this problem when generating on Unix systems
until we switched to use OpenSSL 3.1 and tested on systems that have
the OpenSSL 1.1.1 dev package installed.

The ability to find headers outside the project path is also needed to
generate bindings on Windows, if desired.

This commit solves the problem by passing include directories for the
ClamAV::libclamav CMake build target to the Rust build via the
CARGO_INCLUDE_DIRECTORIES environment variable.
Then, in the `libclamav_rust/build.rs` script, where we run bindgen,
we split that `;` separated string into invididual paths and add each
to the bindgen builder.
The fmap structure has some stuff that differs in size in memory between
Linux and Windows, and between 32bit and 64bit architectures.

Notably, `time_t` appears to be defined by the Rust bindgen module as
`ulong` which may be either 8 bytes or 4 bytes, depending architecture
(thanks, C). To resolve this, we'll store time as a uint64_t instead.

The other problem in the fmap structure is the windows file and map
handles should always be exist, and may only be used in Windows, for
consistency in sizing of the structure.
Fix Coverity issues 192935, 192932, 192928, and 192917.

None of these are particularly serious. I thought I'd clean them up
while trying to track down a strange crash that occurs in Windows debug
builds with my specific setup when freeing the metadata filename pointer
malloced by the UnRAR iface "peek" function.

I wasn't able to figure out why freeing that causes a crash, so instead
I converted it to an array that need not be freed, and my troubles
melted away.
@micahsnyder micahsnyder merged commit cb45aeb into Cisco-Talos:main Dec 11, 2023
23 of 24 checks passed
@micahsnyder micahsnyder deleted the CLAM-2330-onenote-c-api branch December 11, 2023 21:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: Extraction of attachments inside OneNote documents
4 participants