Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP, HELP NEEDED] Fix many issues #35

Merged
merged 13 commits into from
Aug 7, 2016
Merged

Conversation

kdm9
Copy link
Contributor

@kdm9 kdm9 commented Aug 7, 2016

Hi,

This started as a fix for #19 , but has grown somewhat.

So far I have:

Cheers,
Kevin

This patch makes the dict of mail hash: messages store only the mail file paths.
Then, for duplicate mail clusters, we load the messages into RAM before
investigating and possibly removing duplicates.
Also fix diff function argument types
@codecov-io
Copy link

codecov-io commented Aug 7, 2016

Current coverage is 61.02% (diff: 66.66%)

Merging #35 into develop will increase coverage by 28.78%

@@            develop        #35   diff @@
==========================================
  Files             3          3          
  Lines           304        313     +9   
  Methods           0          0          
  Messages          0          0          
  Branches         67         70     +3   
==========================================
+ Hits             98        191    +93   
+ Misses          194         91   -103   
- Partials         12         31    +19   

Powered by Codecov. Last update 6773cad...3915d00

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

For the encoding challenge, I suggest we try decoding with ascii, then utf-8 then run chardet. Failing all that I'm not sure what the best method would be.

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

Also, it would appear that unlike the rest of python3, the iteritems() method has not been removed from Maildir objects, and the items() method returns a list. This will cause additional excess memory usage. See the source for more detail.

@kdeldycke
Copy link
Owner

Thanks for the PR! I'm going to review it soon. Do you plan to commit some more stuff or is it ready to merge?

Do you have the courage of adding a couple of unit-tests? Even one or two would be great.

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

Ah, it appears to be working for me, but some review and testing especially of the encoding handling would be good. I'll take a look at tests.

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

There is another issue. I have vim automatically running pylint, and it detects an undefined variable. Clearly I haven't reached this code path, and I can't tell what the variable should be. Could you take a look at this line please?

@kdeldycke
Copy link
Owner

Tracked the date_only variable back to c87818c , but it looks like I introduced the issue right after in e847797 .

This really makes me sad. There nothing to prevent anybody to push mistakes. Maildir-deduplicate is really helpful for a lot of people, but the code quality is not very good. Essentially because of its patchy history. If you plan to rely on it for important data, I'll be really happy to merge some unit-tests. Even a 10% in increased coverage would be a great leap forward.

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

Working on a super basic test. Don't have heaps of time, but should at least test maildir handling.

kdeldycke added a commit that referenced this pull request Aug 7, 2016
date_only was added in commit c87818c but was made obsolete in commit e847797 .

Thanks to @kdmurray91 for pointing this out: #35 (comment)
@kdeldycke
Copy link
Owner

@kdmurray91 That's the spirit! If you can only bootstrap the unit-tests of the project, it would be easier, little by little, to improve the whole thing.

As for the date_only return, I cleaned it up in 7c191f5 .

@kdeldycke
Copy link
Owner

This PR is good to merged IMHO. Just waiting for the last unittest commit.

I'll probably do a proper release on PyPi after this one is merged.

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

The test I have written is really basic, and could do with extension if you have a little time. (it's bedtime in Australia).

Things to do:

  • test other strategies
  • test use of message ids
  • test on maildir without any dups
  • make the testing function check the presence of a list of files, and assert that no extras are present.
  • make the test class copy maildirs to a temp directory, then test without dry-runs

I'm happy for you to merge & release this as you see fit. There can always be more PRs.
Cheers,
K

@kdeldycke
Copy link
Owner

Thanks @kdmurray91 ! Just waiting for the tests to pass. Then I'll merge it right away.

@kdeldycke
Copy link
Owner

Looks like maildirs are expected to have their new subfolder.

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

Ah, of course. git won't keep them as they are empty. Doh.

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

Fix incoming

@kdm9
Copy link
Contributor Author

kdm9 commented Aug 7, 2016

Looks like I've screwed up the silencing of the logs. If you can easily work this out, go ahead and merge and fix this then merge (I've added you as a collaborator on my repo). Otherwise I'll do it tomorow.

@kdeldycke
Copy link
Owner

I'll merge the PR and try to fix the issue.

@kdeldycke kdeldycke merged commit 088e8b6 into kdeldycke:develop Aug 7, 2016
@kdm9
Copy link
Contributor Author

kdm9 commented Aug 8, 2016

Thanks mate!

@github-actions
Copy link

github-actions bot commented Oct 5, 2020

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 5, 2020
@kdeldycke kdeldycke added 🐛 bug Something isn't working, or a fix is proposed ✨ enhancement Improvement or change to an existing feature and removed bug labels Nov 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
🐛 bug Something isn't working, or a fix is proposed ✨ enhancement Improvement or change to an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants