Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add option skip malformed parts #248

Conversation

dmytrokasianenko-outreach
Copy link
Contributor

This PR is proposal for the feature which allows to skip parts that can't be parsed. It won't return error, just add them to "problems"

This can be useful if you need to parse everything what is possible to parse and turned off by default.

For example, we have emails which have body of the email starting at the beginning of the part (without headers). Currently parser returns error, but it can be beneficial to skip this part.

...
Content-Type: multipart/report; report-type=delivery-status;
	boundary="12345/example.com"
Auto-Submitted: auto-generated (failure)
MIME-Version: 1.0

--12345/example.com
[EXTERNAL EMAIL]

The original message was received
...

@jhillyerd
Copy link
Owner

jhillyerd commented Apr 29, 2022

I'm in favor of this, will take a bit of time to review, and may hold off merging until after the next release.

Edit: or if you think you will have follow-ups, we can merge into a dedicated branch for now

@iredmail
Copy link
Contributor

iredmail commented Apr 29, 2022

How about a new ReadEnvelopeWithOptions() function instead of modifying existing ReadEnvelope()?

I'd like to add options like:

  • skip reading / parsing mail body (just parse headers in some cases)
  • Skip (base64) decoding attachments (no need to decode it in some cases, so why not save some system resources)

@jhillyerd
Copy link
Owner

My initial plan for options was like @iredmail describes above, however I think if I designed it today, I'd have a options builder struct with ReadEnvelope/etc as methods on it. This would make it easier for people to mock in tests. The original package funcs could just forward to those methods on a default options object.

That being said, I haven't thought about the strategy @dmytrokasianenko-outreach used before, so I will spend some time on it. Probably not this weekend though. Most code would continue to work with that change, but I think I've seen breakage from that type of change in the past. I expect we'd want to go from 0.9 series to 0.10 or 1.0 in that case.

@dmytrokasianenko-outreach
Copy link
Contributor Author

Probably better solutions could be to use option interfaces, see Uber Go Style Guide for functional options. Mainly:

Note that there's a method of implementing this pattern with closures but we believe that the pattern above provides more flexibility for authors and is easier to debug and test for users. In particular, it allows options to be compared against each other in tests and mocks, versus closures where this is impossible. Further, it lets options implement other interfaces, including fmt.Stringer which allows for user-readable string representations of the options.

WDYT?

@dmytrokasianenko-outreach dmytrokasianenko-outreach force-pushed the dmytrokasianenko-outreach/fix/skip-malformed-parts branch from 90ae0da to 671b2cb Compare May 2, 2022 10:57
@kadukf-outreach
Copy link

Q1: @jhillyerd so that it would be possible to replace some part of the message parsing flow? So that we will have open/close principle in place - open for extension but closed for modifications - closed part would be message parsing orchestration (method ReadParts) which drives boundary reader but the particular steps would be replaceable (use defaults it nothing provided), right? IMO the question is how much of the openings is needed/desired.

Q2: would it be possible to introduce custom error for malformed parts so that when processing "problems" it would be possible to get a pointer to the malformed part, type of issue (header, content, etc).

Q3: Shouldn't explicitly "skip malformed parts" be part of the default behavior so that after parsing we get some results with references to the failed parts so that we can handle it accordingly, wdyt?

@jhillyerd
Copy link
Owner

jhillyerd commented May 10, 2022

Probably better solutions could be to use option interfaces, see Uber Go Style Guide for functional options. Mainly:

Note that there's a method of implementing this pattern with closures but we believe that the pattern above provides more flexibility for authors and is easier to debug and test for users. In particular, it allows options to be compared against each other in tests and mocks, versus closures where this is impossible. Further, it lets options implement other interfaces, including fmt.Stringer which allows for user-readable string representations of the options.

WDYT?

After giving it some thought, I'm OK with the functional options pattern you are using, and it seems more acceptable in Go than a builder pattern. What I'm not OK with is forcing our users to build their options from scratch for every call to enmime, most users will be using identical options for every parse. Thoughts on making config public (while keeping the fields private)?

Edit: I didn't quite grok the interfaces part on the first read, you're right that would be better than public config struct.

@jhillyerd
Copy link
Owner

Q1: @jhillyerd so that it would be possible to replace some part of the message parsing flow? So that we will have open/close principle in place - open for extension but closed for modifications - closed part would be message parsing orchestration (method ReadParts) which drives boundary reader but the particular steps would be replaceable (use defaults it nothing provided), right? IMO the question is how much of the openings is needed/desired.

This seems way out of scope for this PR, or even the 1.0 release of enmime.

Q2: would it be possible to introduce custom error for malformed parts so that when processing "problems" it would be possible to get a pointer to the malformed part, type of issue (header, content, etc).

Probably, but that can come later.

Q3: Shouldn't explicitly "skip malformed parts" be part of the default behavior so that after parsing we get some results with references to the failed parts so that we can handle it accordingly, wdyt?

No, this may have security implications for existing users. I expect very few users ever inspect the soft errors enmime returns.

@dmytrokasianenko-outreach
Copy link
Contributor Author

What I'm not OK with is forcing our users to build their options from scratch for every call to enmime, most users will be using identical options for every parse. Thoughts on making config public (while keeping the fields private)?

I have a idea, but it would require API changes:

// build parser once, with all configs
p := enmime.NewParser(ops...)
// use multiple times
p.Parse(...)
p.Parse(...)

@iredmail
Copy link
Contributor

What I'm not OK with is forcing our users to build their options from scratch for every call to enmime, most users will be using identical options for every parse. Thoughts on making config public (while keeping the fields private)?

I have a idea, but it would require API changes:

// build parser once, with all configs
p := enmime.NewParser(ops...)
// use multiple times
p.Parse(...)
p.Parse(...)

A new ReadEnvelopeWithOptions() is better IMO, no backward-comptability concern.

@jhillyerd
Copy link
Owner

jhillyerd commented May 11, 2022

I have a idea, but it would require API changes

I like that. We could modify the existing ReadEnvelope + ReadParts to use a default set of options (private), so the old API continues to work.

@dmytrokasianenko-outreach
Copy link
Contributor Author

A new ReadEnvelopeWithOptions() is better IMO, no backward-comptability concern.

function will still require to pass options and build config every time you want to parse an email.

This commit introduces Parser, which allows to configure how MIME will
be parsed. Default parser behavior isn't changed - ReadEvelope and
ReadParts will work as before.

Currently only adding option to skip malformed parts.
@dmytrokasianenko-outreach dmytrokasianenko-outreach force-pushed the dmytrokasianenko-outreach/fix/skip-malformed-parts branch from 671b2cb to 93f5dfc Compare May 12, 2022 10:35
@dmytrokasianenko-outreach
Copy link
Contributor Author

dmytrokasianenko-outreach commented May 12, 2022

@jhillyerd updated code to use functional options as interfaces and added parser constructor which consumes options.
How does it look to you now?

Copy link
Owner

@jhillyerd jhillyerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks great, I added one question and a couple changes.

My plan is to merge this into a separate branch, release the final 0.9.x, and then merge this to master in the next week or so. This will kick off 0.10, which I hope will be a rather short series of releases before going 1.0. 🚀

options.go Outdated Show resolved Hide resolved
part_test.go Show resolved Hide resolved
part_test.go Outdated Show resolved Hide resolved
@jhillyerd jhillyerd changed the base branch from master to option-support May 13, 2022 15:42
@jhillyerd jhillyerd merged commit d050c0f into jhillyerd:option-support May 16, 2022
@jhillyerd
Copy link
Owner

Thanks! This is probably one of the most exciting changes for enmime in years. I'll try to release 0.9.x & merge it to master this week.

jhillyerd pushed a commit that referenced this pull request May 17, 2022
* feat: add option skip malformed parts

This commit introduces Parser, which allows to configure how MIME will
be parsed. Default parser behavior isn't changed - ReadEnvelope and
ReadParts will work as before.

Currently only adding option to skip malformed parts.

Co-authored-by: Dmytro Kasianenko <dmytro.kasianeneko@outreach.io>
@jhillyerd
Copy link
Owner

Merged to master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants