Object has no attribute '_subdir' error #191

Closed
dschrempf opened this issue Feb 9, 2021 · 8 comments
Labels: 🐛 bug (Something isn't working, or a fix is proposed)

Comments

@dschrempf

After execution of the following command, I get the mentioned error message (see logs):

 + ~/Downloads/Temp/mail-deduplicate/bin/mdedup -s select-oldest -a move-selected -t ctime -E Linux-Dedup/ -e maildir Linux/

● Phase #0 - Load mails

Opening /home/dominik/Maildir/gmail/Linux ...
maildir detected.
1063 mails found.

● Phase #1 - Compute hashes and group duplicates
Use [date, from, to, subject, mime-version, content-type, content-disposition, user-agent, x-priority, message-id] headers to compute hashes.
Hashed mails  [####################################]  1063/1063

● Phase #2 - Select mails in each group
select-oldest strategy will be applied on each duplicate set to select candidates.

● Phase #3 - Perform action on selected mails
Perform move-selected action...
1063 mails selected for action.
Creating new maildir box at /home/dominik/Maildir/gmail/Linux-Dedup ...
Traceback (most recent call last):
  File "/home/dominik/Downloads/Temp/mail-deduplicate/bin/mdedup", line 8, in <module>
    sys.exit(mdedup())
  File "/home/dominik/Downloads/Temp/mail-deduplicate/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/dominik/Downloads/Temp/mail-deduplicate/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/dominik/Downloads/Temp/mail-deduplicate/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dominik/Downloads/Temp/mail-deduplicate/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/dominik/Downloads/Temp/mail-deduplicate/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/dominik/Downloads/Temp/mail-deduplicate/lib/python3.8/site-packages/mail_deduplicate/cli.py", line 388, in mdedup
    perform_action(dedup)
  File "/home/dominik/Downloads/Temp/mail-deduplicate/lib/python3.8/site-packages/mail_deduplicate/action.py", line 114, in perform_action
    method(dedup)
  File "/home/dominik/Downloads/Temp/mail-deduplicate/lib/python3.8/site-packages/mail_deduplicate/action.py", line 62, in move_selected
    box.add(mail)
  File "/nix/store/wkw6fsjasr7jbbrlakxxpbiapa8hws42-python3-3.8.7/lib/python3.8/mailbox.py", line 300, in add
    subdir = message.get_subdir()
  File "/nix/store/wkw6fsjasr7jbbrlakxxpbiapa8hws42-python3-3.8.7/lib/python3.8/mailbox.py", line 1537, in get_subdir
    return self._subdir
AttributeError: 'MaildirDedupMail' object has no attribute '_subdir'
@dschrempf
Author

So it turns out this was caused by me executing mdedup from outside the virtual environment. Pretty stupid that this can actually be done :).

@dschrempf
Author

Sorry I have to reopen. This was not my fault. The error does not happen when using -n.

@dschrempf dschrempf reopened this Feb 9, 2021
@kaz-yos

kaz-yos commented Feb 10, 2021

Same as #135?

I got the same error with version 6.1.2.

mdedup 6.1.2
{'username': '-', 'guid': '7d002aa8ff457a7721f6a7ad164505f', 'hostname': '-', 'hostfqdn': '-', 'uname': {'system': 'Darwin', 'node': '-', 'release': '20.3.0', 'version': 'Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64', 'machine': 'x86_64', 'processor': 'i386'}, 'linux_dist_name': '', 'linux_dist_version': '', 'cpu_count': 12, 'fs_encoding': 'utf-8', 'ulimit_soft': 256, 'ulimit_hard': 9223372036854775807, 'cwd': '-', 'umask': '0o2', 'python': {'argv': '-', 'bin': '-', 'version': '3.7.1 (default, Oct 23 2018, 14:07:42) [Clang 4.0.1 (tags/RELEASE_401/final)]', 'compiler': 'Clang 4.0.1 (tags/RELEASE_401/final)', 'build_date': 'Oct 23 2018 14:07:42', 'version_info': [3, 7, 1, 'final', 0], 'features': {'openssl': 'OpenSSL 1.1.1d  10 Sep 2019', 'expat': 'expat_2.2.6', 'sqlite': '3.25.3', 'tkinter': '8.6', 'zlib': '1.2.11', 'unicode_wide': True, 'readline': True, '64bit': True, 'ipv6': True, 'threading': True, 'urandom': True}}, 'time_utc': '2021-02-10 23:31:27.276025', 'time_utc_offset': -5.0, '_eco_version': '1.0.1'}

@alisraza

I think this is likely the same as #135 as @kaz-yos pointed out.

Description:

Running the following command:

"$basedir"/code/forks/mail-deduplicate/.venv/bin/mdedup \
	--input-format maildir \
	--size-threshold 0 \
	--content-threshold 0 \
	--strategy discard-all-but-one \
	--action move-selected \
	--export "$output_path" \
	--export-format maildir \
	--verbosity debug \
	"$mail_source_1" "$mail_source_2"

Yields (truncated output):

● Phase #3 - Perform action on selected mails
Perform move-selected action...
232 mails selected for action.
Creating new maildir box at [$output_path] ...
debug: Locking box...
debug: Move <MaildirDedupMail ["$mail_source_1"]:[NNNNNNNNNN].[NNNNN]_[NNN].["$hostname"],U=[NNN]> form ["$mail_source_1"] to ["$output_path"]...

With stacktrace:

  File "["$basedir"]/code/forks/mail-deduplicate/mail_deduplicate/cli.py", line 388, in mdedup
    perform_action(dedup)
  File "["$basedir"]/code/forks/mail-deduplicate/mail_deduplicate/action.py", line 114, in perform_action
    method(dedup)
  File "["$basedir"]/code/forks/mail-deduplicate/mail_deduplicate/action.py", line 62, in move_selected
    box.add(mail)
  File "[~]/.pyenv/versions/3.7.10/lib/python3.7/mailbox.py", line 300, in add
    subdir = message.get_subdir()
  File "[~]/.pyenv/versions/3.7.10/lib/python3.7/mailbox.py", line 1537, in get_subdir
    return self._subdir
AttributeError: 'MaildirDedupMail' object has no attribute '_subdir'

Debugging Information:

Coding is a side hobby and I haven't looked at Python code for a while. From stepping through the code, my best guess is that when the mail object is created as a subclass, it runs the __init__ function of the Python standard library's Message class rather than that of the MaildirMessage class. That would explain the missing attribute, because the __init__ function of the MaildirMessage class is what sets _subdir:

class MaildirMessage(Message):
    """Message with Maildir-specific properties."""

    _type_specific_attributes = ['_subdir', '_info', '_date']

    def __init__(self, message=None):
        """Initialize a MaildirMessage instance."""
        self._subdir = 'new'
        self._info = ''
        self._date = time.time()
        Message.__init__(self, message)

However, based on the stacktrace, when I look at action.py in the move_selected function:

def move_selected(dedup):
    # truncated [...]
            box.add(mail)
            dedup.sources[mail.source_path].remove(mail.mail_id)
            logger.info(f"{mail!r} copied.")
    # truncated [...]

When pausing at box.add(mail), the box object has the mailbox.Maildir class and the mail object has the MaildirDedupMail class. Both appear to be correct, yet mail is indeed missing the mail._subdir attribute. I would need more time to look into how mail is instantiated, but I hope the information so far is somewhat helpful. I may be slow to respond in the next few days, but I appreciate anyone who is able to look into this issue.
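The hypothesis above can be checked with the standard library alone. The following is a minimal sketch, not code from mail-deduplicate: SkipsMaildirInit and missing_maildir_attributes are hypothetical names invented for illustration, and the sketch relies on the private _type_specific_attributes list shown in the MaildirMessage source quoted earlier.

```python
import mailbox


class SkipsMaildirInit(mailbox.MaildirMessage):
    """Hypothetical subclass that initialises only the generic Message part."""

    def __init__(self, message=None):
        # Deliberately bypass MaildirMessage.__init__, which is the method
        # that sets _subdir, _info and _date.
        mailbox.Message.__init__(self, message)


def missing_maildir_attributes(mail):
    """Return the Maildir-specific attributes that were never set on mail."""
    return [
        name
        for name in mailbox.MaildirMessage._type_specific_attributes
        if not hasattr(mail, name)
    ]


good = mailbox.MaildirMessage()
bad = SkipsMaildirInit()

print(missing_maildir_attributes(good))  # []
print(missing_maildir_attributes(bad))   # ['_subdir', '_info', '_date']

try:
    bad.get_subdir()
except AttributeError as exc:
    # Same kind of failure as in the traceback above.
    print(exc)
```

Running a check like missing_maildir_attributes on the real mail object at the box.add(mail) breakpoint would show whether only _subdir is missing or all three attributes are, which would point at a skipped constructor.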

Additional Information:

Code running with cwd as "$basedir"/code/forks/mail-deduplicate.
Virtual environment created with poetry install in .venv subdir.

poetry --version
# Poetry version 1.1.4
python --version
# Python 3.7.10
pyenv version
# 3.7.10 (set by "$basedir"/code/forks/mail-deduplicate/.python-version)
"$basedir"/code/forks/mail-deduplicate/.venv/bin/mdedup --version
# mdedup 6.1.3
# {'username': '-', 'guid': '82f4afc3ac75c9fa8c7849ab3364986', 'hostname': '-', 'hostfqdn': '-', 'uname': {'system': 'Linux', 'node': '-', 'release': '5.10.16-arch1-1', 'version': '#1 SMP PREEMPT Sat, 13 Feb 2021 20:50:18 +0000', 'machine': 'x86_64', 'processor': ''}, 'linux_dist_name': 'arch', 'linux_dist_version': 'Arch', 'cpu_count': 8, 'fs_encoding': 'utf-8', 'ulimit_soft': 8192, 'ulimit_hard': 524288, 'cwd': '-', 'umask': '0o2', 'python': {'argv': '-', 'bin': '-', 'version': '3.7.10 (default, Feb 18 2021, 17:50:07) [GCC 10.2.0]', 'compiler': 'GCC 10.2.0', 'build_date': 'Feb 18 2021 17:50:07', 'version_info': [3, 7, 10, 'final', 0], 'features': {'openssl': 'OpenSSL 1.1.1j  16 Feb 2021', 'expat': 'expat_2.2.8', 'sqlite': '3.34.1', 'tkinter': '', 'zlib': '1.2.11', 'unicode_wide': True, 'readline': True, '64bit': True, 'ipv6': True, 'threading': True, 'urandom': True}}, 'time_utc': '2021-02-19 10:10:34.969315', 'time_utc_offset': -5.0, '_eco_version': '1.0.1'}

For convenience, corresponding JSON:

{
	"username": "-",
	"guid": "82f4afc3ac75c9fa8c7849ab3364986",
	"hostname": "-",
	"hostfqdn": "-",
	"uname": {
		"system": "Linux",
		"node": "-",
		"release": "5.10.16-arch1-1",
		"version": "#1 SMP PREEMPT Sat, 13 Feb 2021 20:50:18 +0000",
		"machine": "x86_64",
		"processor": ""
	},
	"linux_dist_name": "arch",
	"linux_dist_version": "Arch",
	"cpu_count": 8,
	"fs_encoding": "utf-8",
	"ulimit_soft": 8192,
	"ulimit_hard": 524288,
	"cwd": "-",
	"umask": "0o2",
	"python": {
		"argv": "-",
		"bin": "-",
		"version": "3.7.10 (default, Feb 18 2021, 17:50:07) [GCC 10.2.0]",
		"compiler": "GCC 10.2.0",
		"build_date": "Feb 18 2021 17:50:07",
		"version_info": [3, 7, 10, "final", 0],
		"features": {
			"openssl": "OpenSSL 1.1.1j  16 Feb 2021",
			"expat": "expat_2.2.8",
			"sqlite": "3.34.1",
			"tkinter": "",
			"zlib": "1.2.11",
			"unicode_wide": true,
			"readline": true,
			"64bit": true,
			"ipv6": true,
			"threading": true,
			"urandom": true
		}
	},
	"time_utc": "2021-02-19 10:10:34.969315",
	"time_utc_offset": -5.0,
	"_eco_version": "1.0.1"
}

Thank you!

@kaz-yos

kaz-yos commented Feb 19, 2021

@alisraza, thanks for the detailed investigation!

@pechfunk
Contributor

pechfunk commented Apr 12, 2021

It looks like the problem is in the DedupMail constructor which tries to auto-detect which of the superclasses is the one that contributes Message-ness.

    def __init__(self, message=None):
        """Initialize a pre-parsed ``Message`` instance the same way the default
        factory in Python's ``mailbox`` module does.
        """
        # Hunt down in our parent classes (but ourselve) the first one inheriting the
        # mailbox.Message class. That way we can get to the original factory.
        orig_message_klass = None
        for klass in inspect.getmro(self.__class__)[1:]:
            if issubclass(klass, mailbox.Message):
                orig_message_klass = klass
                break
        assert orig_message_klass

        # Call original object initialization from the right message class we
        # inherits from mailbox.Message.
        super(orig_message_klass, self).__init__(message)

Once the search finds a Message-like class orig_message_klass, the call super(orig_message_klass, self) starts the method lookup at the successor of orig_message_klass in the MRO, not at orig_message_klass itself. For Maildir messages this means the plain Message ctor gets called, but MaildirMessage's does not, so _subdir is never set.
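The effect of the two-argument super call can be reproduced outside mdedup. This is a hedged sketch: BuggyMixin imitates the constructor quoted above, while first_message_class, DirectCallMixin and the two mail classes are hypothetical names. The direct call shown is only one possible repair and is not necessarily what PR #222 does.

```python
import inspect
import mailbox


def first_message_class(cls):
    """Return the first parent of cls (cls itself excluded) that is a mailbox.Message."""
    for klass in inspect.getmro(cls)[1:]:
        if issubclass(klass, mailbox.Message):
            return klass
    raise TypeError(f"{cls.__name__} does not inherit from mailbox.Message")


class BuggyMixin:
    """Imitates the constructor quoted above."""

    def __init__(self, message=None):
        klass = first_message_class(type(self))
        # super(klass, self) resolves __init__ starting *after* klass in the
        # MRO, so klass.__init__ itself never runs.
        super(klass, self).__init__(message)


class DirectCallMixin:
    """One possible repair: invoke the located class's own __init__."""

    def __init__(self, message=None):
        klass = first_message_class(type(self))
        klass.__init__(self, message)


class BuggyMaildirMail(BuggyMixin, mailbox.MaildirMessage):
    pass


class DirectCallMaildirMail(DirectCallMixin, mailbox.MaildirMessage):
    pass


# The MRO lists MaildirMessage before the plain mailbox.Message.
print([k.__name__ for k in inspect.getmro(BuggyMaildirMail)])
print(hasattr(BuggyMaildirMail(), "_subdir"))  # False
print(DirectCallMaildirMail().get_subdir())    # new
```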

I've tried to repair the clever construction in PR #222. I'm not sure that the cleverness is necessary here, with only a handful of message classes to support, and little innovation in the field of Mbox dialects going on in general. But at least mdedup runs for me again!
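For comparison, a version without the MRO search could rely on plain cooperative super(). This is only an illustrative sketch under assumptions: DedupMixin, SimpleMaildirMail, SimpleMboxMail and add_to_new_maildir are hypothetical names, the source_path attribute is a placeholder, and none of this is taken from the mail-deduplicate code base or from the PR.

```python
import mailbox
import os
import tempfile


class DedupMixin:
    """Hypothetical mixin carrying dedup-specific state."""

    def __init__(self, message=None):
        # Zero-argument super() continues with the next class in the MRO,
        # which is the concrete mailbox message class listed after the mixin.
        super().__init__(message)
        self.source_path = None  # illustrative attribute only


class SimpleMaildirMail(DedupMixin, mailbox.MaildirMessage):
    pass


class SimpleMboxMail(DedupMixin, mailbox.mboxMessage):
    pass


def add_to_new_maildir(mail, parent_dir):
    """Create a Maildir under parent_dir, add mail, and report where it landed."""
    box = mailbox.Maildir(os.path.join(parent_dir, "Dedup"), create=True)
    key = box.add(mail)  # the call that raised AttributeError in this issue
    return box.get_message(key).get_subdir()


with tempfile.TemporaryDirectory() as tmp:
    stored_subdir = add_to_new_maildir(SimpleMaildirMail(), tmp)

print(stored_subdir)
print(SimpleMboxMail().get_from().startswith("MAILER-DAEMON"))
```

The trade-off is one explicit subclass per supported box format, which seems manageable given the handful of message classes in the mailbox module.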

@kdeldycke
Owner

> little innovation in the field of Mbox dialects going on in general

Indeed! I apologize for that part being well over-engineered. I wanted that part to be future-proof, with the vague idea of extending it to other sources of mail (Gmail? S3?). But it ended up increasing complexity with little benefit.

Anyway, thanks a lot @pechfunk for diving deep into the root cause and proposing a fix! I just merged it upstream and will try to cut a new release.

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 19, 2021
@kdeldycke kdeldycke added 🐛 bug Something isn't working, or a fix is proposed and removed bug labels Nov 23, 2022