Handling msg within an msg #14

Open
tballison opened this issue May 9, 2024 · 3 comments

@tballison

Thank you so much for an awesome library. While writing a wrapper for readpst for Apache Tika, we noticed a small number of cases where we get fewer attachments with the .msg output option than with the other output formats. Tika's Jira issue: https://issues.apache.org/jira/browse/TIKA-4250

We were able to reproduce this with a test file we have in our unit tests: https://github.com/apache/tika/blob/main/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testPST.pst

The last email "8" is an email with an embedded email, and inside that embedded email is a docx file.

This is processed correctly with rfc822 and mbox output. However, there is no msg attachment within the 8.msg file.
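One way to confirm what the rfc822 output contains is to walk the `.eml` file and list the embedded message and any named attachments. The following is a minimal sketch using only Python's standard `email` package; the helper names `load_eml` and `list_attachments` are made up for illustration and are not part of libpst or Tika.

```python
from email import message_from_binary_file, policy


def load_eml(path):
    """Parse an .eml file from disk into an EmailMessage."""
    with open(path, "rb") as fh:
        return message_from_binary_file(fh, policy=policy.default)


def list_attachments(msg):
    """Return (content_type, filename) for embedded messages and named parts.

    walk() descends into message/rfc822 parts, so attachments of an
    embedded email are reported as well.
    """
    return [
        (part.get_content_type(), part.get_filename())
        for part in msg.walk()
        if part.get_content_type() == "message/rfc822" or part.get_filename()
    ]
```

For an email like "8", a listing from the `.eml` output would be expected to show a `message/rfc822` entry followed by the docx entry; the same check has no direct equivalent here for `.msg`, which needs a separate parser.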

@tballison
Author

test-pst.zip

I'm including the original .pst, the mbox, the .msg, the .eml, and the debug file.

@tballison
Author

Separately, we noticed that we're getting non-deterministic output when we select the .msg option. Sometimes we get 7 files and sometimes we get 8.
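A rough way to quantify this is to repeat the conversion into fresh directories and tally how many `.msg` files each run produces. This is only a sketch: `repeat_and_count` and `count_files` are hypothetical helpers, and the command line is supplied by the caller, so nothing here assumes particular readpst flags.

```python
import subprocess
import tempfile
from collections import Counter
from pathlib import Path


def count_files(root, suffix=".msg"):
    """Count files under root (recursively) whose name ends with suffix."""
    return sum(1 for p in Path(root).rglob(f"*{suffix}") if p.is_file())


def repeat_and_count(make_command, runs=10, suffix=".msg"):
    """Run a command several times and tally the output file counts.

    make_command(out_dir) must return the argv list to execute; each run
    writes into its own temporary directory, which is removed afterwards.
    """
    counts = Counter()
    for _ in range(runs):
        with tempfile.TemporaryDirectory() as out_dir:
            subprocess.run(make_command(out_dir), check=True)
            counts[count_files(out_dir, suffix)] += 1
    return counts
```

Usage would be along the lines of `repeat_and_count(lambda out: [...your readpst invocation writing to out...])`. A result with more than one key, for example both 7 and 8, would indicate the non-determinism.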

@pabs3
Member

pabs3 commented Jul 1, 2024

To be clear: the libpst library has a long history with many contributors. The current maintainers didn't create the library, but they try to merge patches promptly and work on it when they are able to.

Thanks for the report and the test files, we'll take a look when we can.

The issue with non-deterministic output is known and has a workaround in git master. Please comment on that issue if you still see it with the latest commit:

#7
