feat: ability to recover partial content from broken encoding #262

Merged


@fcisnerosrojo (Contributor) commented Sep 5, 2022

--> Problem

What I did:
I wrote a simple email with two parts (text/plain and text/html), both base64-encoded, with one quirk: the content of each part has one extra base64 character, so it is slightly broken. Instead of 4*n characters, each content is (4*n)+1 characters long.

What I expected:
ReadParts() returns the root Part, which has a FirstChild, which in turn has a NextSibling. Both FirstChild and NextSibling hold the partially decoded base64 data (Content != nil; the parser recovers partial data from parts with broken encoding).

What I got:
ReadParts() returns the root Part, which has a FirstChild, which in turn has a NextSibling. Both FirstChild and NextSibling have Content == nil (the parser does not recover partial data).
The reason is that decoding each part produces a base64.CorruptInputError, which is then caught by base64CorruptInputCheck(); this leaves the part content empty and returns a nil error (setting the skipMalformedParts flag to true or false does not change the output).
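
To illustrate the (4*n)+1 case outside the parser, here is a minimal sketch using only Go's standard encoding/base64 package. The helper names decodePartial and isCorruptInput are hypothetical and are not part of enmime; enmime's own decoding path may differ. The sketch only shows that the bytes decoded before the error are still available to the caller.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// decodePartial (hypothetical helper) decodes encoded and returns the bytes
// that were successfully decoded, together with any decoding error.
func decodePartial(encoded string) ([]byte, error) {
	dst := make([]byte, base64.StdEncoding.DecodedLen(len(encoded)))
	// Decode reports how many bytes it wrote to dst, even when it fails.
	n, err := base64.StdEncoding.Decode(dst, []byte(encoded))
	return dst[:n], err
}

// isCorruptInput (hypothetical helper) reports whether err is, or wraps,
// a base64.CorruptInputError.
func isCorruptInput(err error) bool {
	var cie base64.CorruptInputError
	return errors.As(err, &cie)
}

func main() {
	// "aGVsbG8gd29ybGQh" is the base64 encoding of "hello world!" (16 = 4*4
	// characters); the trailing "x" makes the input (4*n)+1 characters long.
	data, err := decodePartial("aGVsbG8gd29ybGQh" + "x")
	fmt.Printf("recovered=%q corrupt=%v err=%v\n", data, isCorruptInput(err), err)
}
```

Discarding the content at this point throws away data that was decoded correctly, which is the behavior this PR makes configurable.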

--> Solution

In this particular scenario, with text/plain and text/html contents, I would like to ignore that specific error (base64.CorruptInputError) and keep whatever had been read into the buffer at that point.
However, different users may have different preferences (from now on, "policies"). For example:

  • keep the buffer read only when base64.CorruptInputError is raised
  • keep the buffer read only when another error is raised
  • keep the buffer read only on a base64.CorruptInputError error and when ContentType is text/plain
  • keep the buffer read only on a base64.CorruptInputError error and when ContentType is text/plain or text/html
  • always keep the buffer read, no matter the error
  • and so on ...

To solve this, I added a callback function, readPartErrorPolicy, to the Parser type. It makes it possible to decide whether or not to keep the buffer when an error occurs while reading a part's content.

When no policy is passed to the parser, the default behavior is kept.
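
The policy idea can be sketched as follows. This is a self-contained illustration, not the code from this PR: the types part and errorPolicy and the functions keepCorruptText and decodeInto are hypothetical stand-ins, and the real callback signature in the library may differ. The example policy keeps the buffer only on a base64.CorruptInputError when the content type is text/plain or text/html.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

// part is a hypothetical stand-in for a parsed MIME part.
type part struct {
	ContentType string
	Content     []byte
}

// errorPolicy (hypothetical) reports whether the bytes decoded before err
// should be kept as the part's content.
type errorPolicy func(p *part, err error) bool

// keepCorruptText keeps partial content only for base64.CorruptInputError
// on text/plain or text/html parts.
func keepCorruptText(p *part, err error) bool {
	var cie base64.CorruptInputError
	if !errors.As(err, &cie) {
		return false
	}
	return p.ContentType == "text/plain" || p.ContentType == "text/html"
}

// decodeInto decodes encoded into p.Content. On a decoding error it asks
// policy whether to keep the partial result. With a nil policy the content
// is dropped; as a simplification, this sketch also returns the error then.
func decodeInto(p *part, encoded string, policy errorPolicy) error {
	dst := make([]byte, base64.StdEncoding.DecodedLen(len(encoded)))
	n, err := base64.StdEncoding.Decode(dst, []byte(encoded))
	if err == nil || (policy != nil && policy(p, err)) {
		p.Content = dst[:n]
		return nil
	}
	p.Content = nil
	return err
}

func main() {
	broken := "aGVsbG8gd29ybGQh" + "x" // (4*n)+1 characters

	html := &part{ContentType: "text/html"}
	fmt.Println(decodeInto(html, broken, keepCorruptText), string(html.Content))

	png := &part{ContentType: "image/png"}
	fmt.Println(decodeInto(png, broken, keepCorruptText), png.Content == nil)
}
```

Keeping the decision in a callback means the other preferences listed above (always keep, keep only for a given error type, and so on) become different functions rather than additional parser flags.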

Release or branch I am using:
master

@jhillyerd self-requested a review September 6, 2022 17:52

@jhillyerd (Owner) left a comment

Thanks, code looks good overall, but I have some suggestions.

Resolved review threads: part.go (2, outdated), part_test.go (2, outdated), parser.go (1).

@fcisnerosrojo (Contributor, Author) commented

> Thanks, code looks good overall, but I have some suggestions.

Thanks for your feedback, @jhillyerd! All suggestions were included in the last commit.

@jhillyerd jhillyerd merged commit 5bf7c5f into jhillyerd:master Sep 13, 2022
@jhillyerd (Owner) commented

Thanks!

@fcisnerosrojo (Contributor, Author) commented

@jhillyerd, can we get a release that includes this feature?

@jhillyerd (Owner) commented

done
