Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding issue #36

Closed
wants to merge 5 commits into from
Closed

Encoding issue #36

wants to merge 5 commits into from

Conversation

kdm9
Copy link
Contributor

@kdm9 kdm9 commented Aug 8, 2016

Hi @kdeldycke

Since the changes we made yesterday I have been able to analyse a bigger set of my maildir. It appears that we missed a few encoding issues. The attached seems to fix them, but checking this on another test-set (specifically a non-English maildir) would be wise.

Cheers,
K

@codecov-io
Copy link

codecov-io commented Aug 8, 2016

Current coverage is 61.63%

Merging #36 into develop will not change coverage

@@            develop        #36   diff @@
==========================================
  Files             3          3          
  Lines           318        318          
  Methods           0          0          
  Messages          0          0          
  Branches         71         71          
==========================================
  Hits            196        196          
  Misses           91         91          
  Partials         31         31          

Powered by Codecov. Last update c96c027...c96c027

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 8, 2016

Looks like the approach I take is python-2 incompatible. Any ideas on how to port this @kdeldycke?

@kdeldycke
Copy link
Owner

Isn't it because message_from_binary_file was introduced in Python 3.2?

In which case you should try to hard. Let's revert to the old behaviour for Python 2. What do you think?

@kdeldycke
Copy link
Owner

FYI, I've reverted to your previous behaviour in c96c027 for tests to pass in Python 2.

Can you try to rebase this PR on top of the develop branch?

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 10, 2016

@kdeldycke Thanks heaps. I've rebased and pushed.

@kdeldycke
Copy link
Owner

Thanks. Tests still not passing though... :(

Is there a reason to not keep using the get_payload() method in 2e43a69 ?

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 11, 2016

Ah, not really, other than that it kept raising UnicodeError and going the route I did always worked (on python 3).

@kdeldycke
Copy link
Owner

kdeldycke commented Aug 12, 2016

Sorry to be skeptical, it's just that it feels like the issue is a .decode('utf-8') away of being solved. That's why I'm not certain we need heavy-duty encoding guessing via chardet.

So here is what I propose: if you can isolate one message triggering a UnicodeDecodeError, I'll try to solve it through a much more restrained decoding path. In the mean time, I'll keep your current PR open and I'll merge it as a last resort. Unless you find a way to make the unittests pass in Python 2. What do you think?

@kdeldycke
Copy link
Owner

Handling of unparseable email has been added in #47, which supersedes and invalidates half of this PR. As a result, and because it's currently un mergeable as-is, I'll close this PR for now. Sorry @kdmurray91. :(

That being said, your effort was not vain as I pointed to your code in #46 as a possible way to enhance parsing of badly encoded mails.

@kdeldycke kdeldycke closed this Jan 13, 2017
@github-actions
Copy link

github-actions bot commented Oct 5, 2020

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 5, 2020
@kdeldycke kdeldycke added ✨ enhancement Improvement or change to an existing feature and removed enhancement labels Nov 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
✨ enhancement Improvement or change to an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants