Skip to content

fixes #20#47

Merged
rlerdorf merged 5 commits intomasterfrom
fix/gh20-content-id-parentheses
Apr 5, 2026
Merged

fixes #20#47
rlerdorf merged 5 commits intomasterfrom
fix/gh20-content-id-parentheses

Conversation

@rlerdorf
Copy link
Copy Markdown
Member

@rlerdorf rlerdorf commented Apr 5, 2026

Fix Content-ID parsing stripping parenthesized text (GH-20)

Fixes #20

Problem

mailparse_msg_get_part_data() returns the wrong content-id value when the Content-ID header contains parentheses. For example:

Content-ID: <Facebook_32x32(1)_aa284ba9-f148-4698-9c1f-c8e92bdb842e.png>

Returns Facebook_32x32 _aa284ba9-f148-4698-9c1f-c8e92bdb842e.png instead of the correct Facebook_32x32(1)_aa284ba9-f148-4698-9c1f-c8e92bdb842e.png.

The code was passing the Content-ID value through the RFC 822 address tokenizer (php_mailparse_rfc822_tokenize + php_rfc822_parse_address_tokens), which interprets parenthesized text as RFC 822 comments and strips them. Content-ID is a message identifier, not an email address, so full address parsing is inappropriate here.

Fix

Replace the RFC 822 address parser with simple whitespace trimming and angle bracket stripping. This preserves the existing behavior for standard RFC-compliant Content-IDs like <part1@example.com> (angle brackets removed, inner value returned) while fixing the bug for values containing parentheses.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes incorrect Content-ID extraction in mailparse_msg_get_part_data() when the header value contains parentheses by avoiding RFC 822 address parsing (which treats parenthesized text as comments).

Changes:

  • Replace RFC 822 address-token parsing for Content-ID with direct trimming + bracket stripping in mailparse.c.
  • Add a new PHPT regression test covering parenthesized Content-ID values and bracketed/unbracketed forms.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
mailparse.c Updates Content-ID extraction logic to preserve parenthesized text by no longer using the RFC 822 address parser.
tests/gh20.phpt Adds regression coverage for GH-20 to ensure parentheses in Content-ID are preserved.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mailparse.c Outdated
Comment thread mailparse.c Outdated
Comment thread tests/gh20.phpt
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mailparse.c Outdated
Comment thread mailparse.c
Comment thread tests/gh20.phpt
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mailparse.c
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mailparse.c Outdated
@rlerdorf rlerdorf merged commit 8788b36 into master Apr 5, 2026
12 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unexpected parsed value of content-id

2 participants