"End of stream" problem while trying to parse eml file #348

Closed
humbledeveloper43 opened this issue Oct 23, 2017 · 29 comments
Labels
bug Something isn't working

Comments

@humbledeveloper43

Hello,

I'm having issues with some emails. The exception is similar to the one in issue #261, which was "Failed to parse message headers".

I've tried your suggestion:

using (FileStream stream = new FileStream (@"mbox.txt", FileMode.Open)) {
    MimeParser parser = new MimeParser (stream, MimeFormat.Mbox);
    MimeMessage current = null;

    while (!parser.IsEndOfStream) {
        try {
            current = parser.ParseMessage ();
        } catch (FormatException e) {
            // Reset parser state and continue parsing from the current parser position
            stream.Position = parser.Position;
            parser.SetStream (stream, MimeFormat.Entity); // Also tried other MimeFormats too.
            continue;
        }
    }
}

But no success. I've checked the mail content and it doesn't have a body, only headers (From, To, Cc, etc.). I can still open it with Outlook, though.

How can I solve the parsing problem?

Thanks

@humbledeveloper43
Author

An update: I've printed the stream position to understand which part of the email is causing the problem. On the exception it prints 3538 as the stream position, but I've checked the email and it has 3476 characters in total. How can that be?

@humbledeveloper43
Author

Hello again,
Sublime Text showed the wrong count. I've checked with Notepad++ and the file has 3538 characters.

I've checked the last two characters, which are 0x0a 0x0d (the 3537th and 3538th characters).
Can these characters cause a problem?

@humbledeveloper43
Author

I've tried deleting those characters to see if they are causing trouble, but this time it gave the same exception at position 3536. So I don't think they are the problem.

Still researching a solution..

@jstedfast
Owner

The stream.Position property is not likely to be useful since the parser buffers blocks of 4K at a time, so it might be more helpful to check parser.Position instead.

@jstedfast
Owner

Ok, so the problem is that if the last message in an mbox does not contain a message body (like yours), it will throw "End of stream".

I've just modified the code such that it doesn't throw anymore.

Could you grab the latest source code and test that it solves your problem and let me know how it works? I'm planning to make a release this week(end), so try to get back to me as soon as possible.

@jstedfast added the bug label Oct 23, 2017
@humbledeveloper43
Author

Can you give me a compiled NuGet package for testing?

@jstedfast
Owner

I'm not on a Windows machine right now, so no :(

@humbledeveloper43
Author

I'm working with .NET Core. So which class library project should I compile?

@jstedfast
Owner

You'll want the MimeKit.NetStandard.csproj

@humbledeveloper43
Author

I've successfully compiled the MimeKit.NetStandard.csproj project and added it as a reference to my project.

I'm trying to parse the same message, but this time I get "Failed to parse message headers."

(MimeParser.cs: line 1535)

@jstedfast
Owner

Do you have an mbox or just a message file?

An mbox starts with "From ..." (i.e. "From" + SPACE + more text)
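
For readers who are unsure which kind of file they have, the check described above can be sketched in a few lines. This is only an illustration, not part of MimeKit: the MboxSniffer class and LooksLikeMbox method are hypothetical names, and the sketch assumes a seekable stream and looks only at the first five bytes.

```csharp
using System;
using System.IO;
using System.Text;

static class MboxSniffer
{
    // Hypothetical helper (not a MimeKit API): returns true when the stream
    // begins with the ASCII bytes "From " (i.e. "From" + SPACE).
    // The stream must be seekable; its position is restored before returning.
    public static bool LooksLikeMbox (Stream stream)
    {
        byte[] marker = Encoding.ASCII.GetBytes ("From ");
        byte[] head = new byte[marker.Length];
        long start = stream.Position;

        try {
            // Read may return fewer bytes than requested, so loop until full.
            int total = 0;
            while (total < head.Length) {
                int n = stream.Read (head, total, head.Length - total);
                if (n == 0)
                    return false; // shorter than the marker
                total += n;
            }

            for (int i = 0; i < marker.Length; i++) {
                if (head[i] != marker[i])
                    return false;
            }

            return true;
        } finally {
            stream.Position = start;
        }
    }
}
```

A file that starts with a "From:" header (note the colon) is not treated as an mbox by this check, which matches the "From" + SPACE rule above.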

@jstedfast
Owner

Is there any way you could email me this message so I can test it?

@jstedfast
Owner

Looking at the parser logic, the only way this can happen is if you have invalid headers and the parser is bailing because it thinks you gave it something that isn't a MIME message.

@humbledeveloper43
Author

I only have eml files, no mbox. So I'm trying to find out whether a header is causing the trouble. If I find the header that causes the problem, I'll send you the sample to investigate soon.

@jstedfast
Owner

BTW, you want to use MimeFormat.Entity for .eml files.
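
As a minimal sketch of that advice, the following parses a single .eml stream with MimeFormat.Entity. The EmlReader class and ParseEml method are illustrative names only, and the sketch assumes the stream holds exactly one message and is already positioned at its start.

```csharp
using System;
using System.IO;
using MimeKit;

static class EmlReader
{
    // Illustrative wrapper: parses one message from a stream that holds a
    // single .eml file, so the format is MimeFormat.Entity (not Mbox).
    public static MimeMessage ParseEml (Stream stream)
    {
        var parser = new MimeParser (stream, MimeFormat.Entity);
        return parser.ParseMessage ();
    }
}
```

It could be called with something like File.OpenRead on the .eml path inside a using block, reading message.Subject afterwards. No retry loop is needed here because an .eml file contains only one message.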

@jstedfast
Owner

If it helps, the part of the header that will be invalid is the field name (not the value).

@humbledeveloper43
Author

humbledeveloper43 commented Oct 23, 2017

Well, I'm already using MimeFormat.Entity to parse .eml files.

I've found that the last character must be 0x0A.

In my example, the file ended with 0x0A 0x0D.

When I used a hex editor to add 0x0A as the last character of the file, it parsed without any problem. (So the ending becomes 0x0A 0x0D 0x0A.)

It's an interesting issue.

@jstedfast
Owner

Not really, that just terminates the header block.
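
For anyone stuck on a build without the parser fix, the hex-editor workaround described above could be done in code before parsing. The sketch below is an assumption-laden illustration, not a MimeKit feature: HeaderTermination and WithTrailingLineFeed are hypothetical names, and it simply appends one 0x0A when the data does not already end with one.

```csharp
using System;
using System.IO;

static class HeaderTermination
{
    // Hypothetical workaround (not a MimeKit API): copies the raw message
    // into a MemoryStream, appends a single LF (0x0A) when the data does not
    // already end with one, and rewinds the copy to the start.
    public static MemoryStream WithTrailingLineFeed (byte[] raw)
    {
        var memory = new MemoryStream (raw.Length + 1);
        memory.Write (raw, 0, raw.Length);

        if (raw.Length == 0 || raw[raw.Length - 1] != 0x0A)
            memory.WriteByte (0x0A);

        memory.Position = 0; // rewind so a parser reads from the beginning
        return memory;
    }
}
```

With input ending in 0x0A 0x0D, the result ends in 0x0A 0x0D 0x0A, which is the byte sequence that parsed successfully above. The returned stream can then be handed to MimeMessage.Load or a MimeParser.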

@jstedfast
Owner

Just send me the message and I can probably have it fixed in a matter of minutes :)

@humbledeveloper43
Author

I've sent an email to jestedfa@microsoft.com containing the sample mail message that you wanted to see.

@jstedfast
Owner

It works fine for me. Are you sure that it's just:

Content-Type: text/plain
MIME-Version: 1.0

I've tried with and without the blank line after the MIME-Version header, and still no exception is thrown.

@humbledeveloper43
Author

For example, can you get the subject of the mail? In my code it throws a FormatException, which I then catch like this:

catch (FormatException e) {
    // Reset parser state and continue parsing from the current parser position
    stream.Position = parser.Position;
    parser.SetStream (stream, MimeFormat.Entity); // Also tried other MimeFormats too.
    continue;
}

Then it leaves the while loop and the message object seems empty. From, To, Cc, Subject and the other properties are empty or have default values, but in the eml file those fields are not empty.

@jstedfast
Owner

The attachment that you sent to me was only the 2 headers I pasted above. Maybe you sent me the wrong file?

@humbledeveloper43
Author

I resent the mail.

@jstedfast
Owner

Ok, just got it. Thanks. I can reproduce what you are seeing.

I've already traced the problem to https://github.com/jstedfast/MimeKit/blob/master/MimeKit/MimeParser.cs#L896

As you can tell from the comment in the code, the problem is that the headers aren't properly terminated (which is why when you add another newline character, it works).

@humbledeveloper43
Author

Great. So how can we solve it? :) Are you going to update the library to add, say, an optional parameter on parsing to fix the header termination problem? Or something else?

jstedfast added a commit that referenced this issue Oct 24, 2017
@jstedfast
Owner

parser.ParseHeaders() essentially already worked with that input so I just took out the if (!headersOnly) logic and just made it always fail gracefully rather than just when ParseHeaders() was called.

@jstedfast
Owner

@bogdansantaGam that typically means you forgot to rewind the stream before trying to parse it.

In other words, you're doing something like this:

using (var memory = new MemoryStream ()) {
    memory.Write (buffer, 0, buffer.Length);

    var message = MimeMessage.Load (memory);
}

What you need to do is rewind the stream (aka memory.Position = 0;) like this:

using (var memory = new MemoryStream ()) {
    memory.Write (buffer, 0, buffer.Length);
    memory.Position = 0;

    var message = MimeMessage.Load (memory);
}

@jstedfast
Owner

I would have to see the content of the stream to provide any further help...

Are you sure there's a MIME message in that stream?
