Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify Mail to allow folding of long header fields #123

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

sbeitzel
Copy link

@sbeitzel sbeitzel commented Dec 22, 2021

Description

Add a folding function to the Mail struct and apply it to headers which might generate lines longer than the RFC-suggested 78 characters (or mandated 998 characters).

Motivation and Context

This change fixes issue #121.

How Has This Been Tested?

I have added unit tests for the wrap/unwrap functions.

Checklist:

  • I have submitted a CLA form
  • If applicable, I have updated the documentation accordingly.
  • If applicable, I have added tests to cover my changes.

@CLAassistant
Copy link

CLAassistant commented Dec 22, 2021

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@dannys42 dannys42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this PR and making a unit test! Just a few minor comments below.

Sources/SwiftSMTP/String-Wrap.swift Outdated Show resolved Hide resolved
Sources/SwiftSMTP/Mail.swift Outdated Show resolved Hide resolved
Sources/SwiftSMTP/String-Wrap.swift Outdated Show resolved Hide resolved
Tests/SwiftSMTPTests/TestWrapUnwrap.swift Outdated Show resolved Hide resolved
@sbeitzel sbeitzel changed the title Extend String to support wrap/unwrap of header fields Modify Mail to allow folding of long header fields Jan 3, 2022
Copy link
Contributor

@dannys42 dannys42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating and putting thought into this.

Sources/SwiftSMTP/Mail.swift Show resolved Hide resolved
Comment on lines 147 to 197
private func foldHeaderValue(key: String, value: String) -> String {
let initialHeader = "\(key): \(value)"
if initialHeader.count < 79 {
return value
}
// if we're here, it means that RFC 5322 SUGGESTS that we fold this header
var foldedHeader = ""
var register = "\(key): "
var linePosition = 0
let foldableCharacters = CharacterSet(charactersIn: " ,")
for char in value {
// append the character to the register
register.append(char)
// this test is to detect the end of a token, mid-stream
if let _ = String(char).rangeOfCharacter(from: foldableCharacters) {
if linePosition > 1 && (register.count + linePosition > 78) {
// we already have stuff on the line and the register is too long
// to continue on the current line. So we fold and start a new line.
foldedHeader.append("\r\n ")
linePosition = 1
}
// now, register contains a complete token, so we put it up to the line
linePosition += register.count
if linePosition > 998 {
Log.error("Header line length exceeds the specified maximum (998 chars) - see RFC 5322 section 2.2.3")
}
foldedHeader.append(register)
register = ""
}
}
// we have the last of the value characters in register, so we put them up
// to the line. We still want to apply the same logic as that inside the loop
// though, and apply folding if it's appropriate.
if linePosition > 1 && (register.count + linePosition > 78) {
foldedHeader.append("\r\n ")
}
if register.count > 997 {
Log.error("Header line length exceeds the specified maximum (998 chars) - see RFC 5322 section 2.2.3")
}
foldedHeader.append(register)

let valueIndex = foldedHeader.index(foldedHeader.startIndex, offsetBy: key.count + 2)
return String(foldedHeader.suffix(from: valueIndex))
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there's a lot of complexity here that's a little hard to follow. While your approach is certainly memory efficient, I think it would be good to at separates the two logical concepts: splitting the input value into elements and joining them into proper length lines.

Perhaps it might make things easier to work with arrays? So, split the input value into elements, trim leading and training whitespaces in each element, then use that to construct a line of text until it reaches the maximum length, and appending that to an array of output lines. On return, simply join the array with commas. I see two additional points of complexity remaining:

  • An element itself may exceeds the 997 limit.
  • Separating the text into elements. Section 3.2.2 talks about "folding white space and comments" that I think make this a bit more complicated and probably deserves it's own function.

The first one seems easier to solve as the note in RFC 2.2.3 indicates, it sounds like we can just force a fold anywhere we need, and that should be better than allowing it to go above the maximum.

The second one seems like you need a simple lexer... which I always find annoying to write, but can probably be done with a simple state machine.

Also, I suggest having some let constants somewhere representing recommendedLineLength=78 and maximumLineLength=997 rather than leaving them as magic numbers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't entirely disagree with you, and that's really the first approach I thought of. However as I went down that path, it turned out to generate more complexity. This way, at least, there's minimal backtracking and removing and then restoring characters. If you think a state machine and a tokenizer are more readable than a stream processor, I invite you to write them. :)

As for the magic numbers, you're quite right.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm not 100% it'd be more readable either... Since you tried that first, I think I'll just trust that you picked the better approach. :-) I doubt we'll have a "clean" solution to the problem until someone makes a good parser generator for Swift.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more... I believe the RFC talks about trying not to fold during quotes. So for example, if we have a "To" line consisting of something like:

To: "Doe, Jane <jane.doe@nowhere.com>", "Doe, John <john.doe@nowhere.com>"  

We'd ideally want the break outside of the quotes. Do you see an easy way of taking this into account?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For starters, I agree with your understanding of the RFC -- but then as I thought about it, it struck me that we're not just folding the email address lines -- we should be folding any header line that gets too long. Now, in an email address, we know that the quote character should always be matched by another quote character, and these are delimiters for a string that we should treat like an atom. But in some other header it might not be special in that way. Suddenly, it's not just a simple arithmetic problem, it's bigger: the folding code needs to know about the semantics of each header field. Okay, we could do that -- at least we're passing in the header name, so the folding code could look up what the delimiter characters are -- but that gets us back to really wanting a custom state machine for parsing the field.

One approach that occurred to me was that we could say, "If you want folding to work on your field the way it works for a list of email addresses, then let us do it; if you want some other rules to apply, then do it yourself." That seems kind of reasonable, and would probably be acceptable for most cases. However, again, we'd probably want to change how we store headers -- as some smarter object rather than just a String. Also, this starts to feel like a behavior/API change that would go into a major release, rather than a bug fix point release.

And at about this point, I reminded myself that the point of this change is to allow for a reasonably long list of recipients. Is there a place where people are talking about the roadmap? If so, I think this is an excellent candidate for discussion there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and something I discovered while debugging the code and writing the unit tests: because we MIME encode the name part of the email address, and the way email addresses are written, this will only wind up folding between the name and the address, never inside the name. So, even if we do fold inside a single address, we won't be introducing any significant whitespace.

"Some Fellow" <fellow@place.local> - the folding code winds up seeing Some__Fellow and would not add extra whitespace into the name, which an email client might pick up when creating a contact card.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's the slack group... You can try there and it might be good for us to see others thoughts on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular I think it would be nice to get this reviewed by other users of Swift-SMTP.

Sources/SwiftSMTP/Mail.swift Show resolved Hide resolved
Tests/SwiftSMTPTests/TestHeaderFolding.swift Outdated Show resolved Hide resolved
@dannys42 dannys42 requested a review from mbarnach January 8, 2022 06:16
@sonarcloud
Copy link

sonarcloud bot commented Jan 10, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@dannys42
Copy link
Contributor

dannys42 commented Feb 7, 2022

@sbeitzel I had to make some changes to CI, would you mind rebasing?

@sbeitzel
Copy link
Author

sbeitzel commented Feb 9, 2022

@dannys42 When you say, "rebasing," um, what is it that you would like me to do? I have never performed that git operation before. I've just read some documentation that sort of explains what rebasing does to the history, but I'm not entirely sure what that means in the context of a pull request. Does it mean, create a new branch and merge this PR branch into it via rebase, then close this PR and open a new PR with the new branch? Or something else? I mean, I'm happy to cooperate and make your life easier; I just need to understand what I'm supposed to do.

@dannys42
Copy link
Contributor

@sbeitzel Ah sorry... Yeah there's different approaches to this. My preference is rebasing. So my strategy here would be to:

git pull --rebase origin master

This will pull in any changes that have occurred on master. Then resolve any conflicts that occur (there shouldn't be any in your case).

Then when you're done, you'll have to do a git push --force or git push -f. This is because the history of your branch no longer matches your original commit.

Note you normally wouldn't have to do this if there are no merge conflicts (as is the case right now)... but in this case we have to bring in the changes to the CI configuration for the build to run correctly.

@sonarcloud
Copy link

sonarcloud bot commented Feb 12, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants