Copilot AI commented Feb 10, 2026

ZIP files can contain entries with valid local file headers that are deliberately omitted from the central directory index. Standard extraction reads only the central directory, so these steganographic or tampered entries never surface.
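To make the mismatch concrete, here is a minimal sketch (not code from this PR) that scans a ZIP byte array for the two record signatures defined by the ZIP format and collects the file names each record type carries. The class and method names are hypothetical. A naive signature scan like this can produce false positives if compressed data happens to contain the same four bytes, so treat it as an illustration of the layout rather than a detector.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of RecursiveExtractor.
public static class ZipHeaderScan
{
    private const uint LocalHeaderSig = 0x04034b50;   // bytes "PK\x03\x04"
    private const uint CentralHeaderSig = 0x02014b50; // bytes "PK\x01\x02"

    // Returns the names found in local file headers and in central directory records.
    public static (List<string> Local, List<string> Central) Scan(byte[] zip)
    {
        var local = new List<string>();
        var central = new List<string>();

        for (int i = 0; i + 4 <= zip.Length; i++)
        {
            uint sig = BinaryPrimitives.ReadUInt32LittleEndian(zip.AsSpan(i, 4));

            if (sig == LocalHeaderSig && i + 30 <= zip.Length)
            {
                // Local file header: name length at offset 26, name starts at offset 30.
                int nameLen = BinaryPrimitives.ReadUInt16LittleEndian(zip.AsSpan(i + 26, 2));
                if (i + 30 + nameLen <= zip.Length)
                    local.Add(Encoding.UTF8.GetString(zip, i + 30, nameLen));
            }
            else if (sig == CentralHeaderSig && i + 46 <= zip.Length)
            {
                // Central directory record: name length at offset 28, name starts at offset 46.
                int nameLen = BinaryPrimitives.ReadUInt16LittleEndian(zip.AsSpan(i + 28, 2));
                if (i + 46 + nameLen <= zip.Length)
                    central.Add(Encoding.UTF8.GetString(zip, i + 46, nameLen));
            }
        }

        return (local, central);
    }
}
```

Any name that appears in `Local` but not in `Central` is a candidate non-indexed entry.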

Approach

SharpCompress's ZipArchive.Open reads the central directory, while ReaderFactory.Open walks local file headers sequentially. Comparing the entries each one reports surfaces anything present in the local headers but missing from the central directory. The scan is opt-in so that callers who do not need it avoid the cost of a second stream pass.
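The comparison can be sketched roughly as follows. This is a standalone illustration and not the ZipExtractor code in this PR: `NonIndexedFinder` and `FindNonIndexed` are hypothetical names, the stream is assumed to be seekable, and the `Open` overloads are those of the SharpCompress versions that expose `ZipArchive.Open` and `ReaderFactory.Open` as described above.

```csharp
using System.Collections.Generic;
using System.IO;
using SharpCompress.Archives.Zip;
using SharpCompress.Readers;

// Hypothetical helper for illustration only; assumes a seekable input stream.
public static class NonIndexedFinder
{
    public static List<string> FindNonIndexed(Stream zipStream)
    {
        var options = new ReaderOptions { LeaveStreamOpen = true };

        // Pass 1: keys listed in the central directory.
        var indexed = new HashSet<string>();
        using (var archive = ZipArchive.Open(zipStream, options))
        {
            foreach (var entry in archive.Entries)
            {
                if (entry.Key != null)
                    indexed.Add(entry.Key);
            }
        }

        // Pass 2: walk local file headers from the start of the stream.
        zipStream.Position = 0;
        var hidden = new List<string>();
        using (var reader = ReaderFactory.Open(zipStream, options))
        {
            while (reader.MoveToNextEntry())
            {
                var key = reader.Entry.Key;
                if (key != null && !reader.Entry.IsDirectory && !indexed.Contains(key))
                    hidden.Add(key);
            }
        }

        return hidden;
    }
}
```

Matching on the entry key alone is a simplification: a hidden entry that reuses the name of an indexed one would not be reported by this sketch.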

Changes

  • FileEntryStatus — new NonIndexedEntry enum value for flagging hidden entries
  • ExtractorOptions — new ExtractNonIndexedZipEntries property (default false)
  • ZipExtractor — collects central directory keys during normal extraction; when opted in, does a second pass via ReaderFactory and yields any entries absent from the central directory set, tagged as NonIndexedEntry
  • Tests — 4 xUnit tests using a programmatically crafted tampered ZIP (central directory references only one of two local entries)
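For readers who want to reproduce the kind of fixture the tests describe, the sketch below builds a small ZIP by hand with two stored (uncompressed) local entries and a central directory that references only the first. It is an illustration written against the ZIP record layout, not the test helper from this PR; the class name, entry names, and contents are all made up.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical fixture builder for illustration only; not the PR's test code.
public static class TamperedZip
{
    // Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit.
    private static uint Crc32(byte[] data)
    {
        uint crc = 0xFFFFFFFF;
        foreach (byte b in data)
        {
            crc ^= b;
            for (int k = 0; k < 8; k++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB88320 : crc >> 1;
        }
        return ~crc;
    }

    // Writes one local file header followed by the stored (method 0) data.
    private static void WriteLocal(BinaryWriter w, byte[] name, byte[] data)
    {
        w.Write(0x04034b50u);            // local file header signature
        w.Write((ushort)20);             // version needed to extract
        w.Write((ushort)0);              // general purpose flags
        w.Write((ushort)0);              // compression method: stored
        w.Write((ushort)0);              // DOS time
        w.Write((ushort)0x21);           // DOS date: 1980-01-01
        w.Write(Crc32(data));
        w.Write((uint)data.Length);      // compressed size
        w.Write((uint)data.Length);      // uncompressed size
        w.Write((ushort)name.Length);
        w.Write((ushort)0);              // extra field length
        w.Write(name);
        w.Write(data);
    }

    public static byte[] Build()
    {
        byte[] visName = Encoding.ASCII.GetBytes("visible.txt");
        byte[] visData = Encoding.ASCII.GetBytes("hello");
        byte[] hidName = Encoding.ASCII.GetBytes("hidden.txt");
        byte[] hidData = Encoding.ASCII.GetBytes("secret");

        using var ms = new MemoryStream();
        using var w = new BinaryWriter(ms);

        uint visOffset = (uint)w.BaseStream.Position;
        WriteLocal(w, visName, visData);
        WriteLocal(w, hidName, hidData); // present on disk, never indexed below

        // Central directory with a single record, for visible.txt only.
        uint cdStart = (uint)w.BaseStream.Position;
        w.Write(0x02014b50u);            // central directory record signature
        w.Write((ushort)20);             // version made by
        w.Write((ushort)20);             // version needed to extract
        w.Write((ushort)0);              // general purpose flags
        w.Write((ushort)0);              // compression method: stored
        w.Write((ushort)0);              // DOS time
        w.Write((ushort)0x21);           // DOS date
        w.Write(Crc32(visData));
        w.Write((uint)visData.Length);   // compressed size
        w.Write((uint)visData.Length);   // uncompressed size
        w.Write((ushort)visName.Length);
        w.Write((ushort)0);              // extra field length
        w.Write((ushort)0);              // comment length
        w.Write((ushort)0);              // disk number start
        w.Write((ushort)0);              // internal attributes
        w.Write(0u);                     // external attributes
        w.Write(visOffset);              // offset of the local header
        w.Write(visName);
        uint cdSize = (uint)w.BaseStream.Position - cdStart;

        // End of central directory record claiming one entry in total.
        w.Write(0x06054b50u);
        w.Write((ushort)0);              // this disk number
        w.Write((ushort)0);              // disk holding the central directory
        w.Write((ushort)1);              // entries on this disk
        w.Write((ushort)1);              // total entries
        w.Write(cdSize);
        w.Write(cdStart);
        w.Write((ushort)0);              // archive comment length
        w.Flush();
        return ms.ToArray();
    }
}
```

A central-directory-based reader such as System.IO.Compression.ZipArchive should list only visible.txt for these bytes, while the hidden.txt header and data remain in the file for a sequential reader to find.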

Usage

var extractor = new Extractor();
var results = extractor.Extract("suspect.zip", new ExtractorOptions
{
    ExtractNonIndexedZipEntries = true
});

foreach (var entry in results)
{
    if (entry.EntryStatus == FileEntryStatus.NonIndexedEntry)
        Console.WriteLine($"Hidden entry found: {entry.FullPath}");
}
Original prompt

This section details the original issue you should resolve

<issue_title>Find Non-Indexed files in zips</issue_title>
<issue_description>It is possible to hide extra files in a zip file that are not referenced in the directory. These may be missed by standard extraction, but it should be possible to find them (and perhaps flag them as suspicious).

https://www.codeproject.com/Articles/13808/Steganography-16-Hiding-additional-files-in-a-ZIP</issue_description>

<agent_instructions>Please investigate how we could accomplish - or at least detect this kind of situation, ideally we'd want to avoid having to have a complete second read of the stream - for performance reasons - and we also want to continue using external libraries for zip parsing - but we do reference a number of dependencies between sharpcompress, the standard lib and discutils, so we can consider switching between libraries if needed.</agent_instructions>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits February 10, 2026 23:17
Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
… directory

Add opt-in ExtractNonIndexedZipEntries option that uses a forward-only
SharpCompress reader to walk local file headers and discover entries absent
from the central directory. Such entries are flagged with the new
NonIndexedEntry status on FileEntryStatus.

Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
…onsistent with async

Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add detection for non-indexed files in zip archives" to "Detect and extract non-indexed ZIP entries hidden outside the central directory" on Feb 10, 2026
Copilot AI requested a review from gfs February 10, 2026 23:32