For the part that parses and transcodes mail content, could a customizable target encoding be added? #337

Closed
mostanily opened this issue May 29, 2019 · 6 comments
Labels
needs investigation This will be tested / debugged or checked out.

Comments

@mostanily

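First, the problem I ran into.

I'm using Outlook mail. Because the mail server's encoding is US-ASCII rather than the more common UTF-8, I chose US-ASCII as the encoding at initialization.

Initialization

Luckily, my first debugging run succeeded, but all Chinese text in the returned result was garbled. My first guess was an encoding problem, and reading the source confirmed it: everywhere decodeMimeStr() is called, the second argument is the encoding set when the class was initialized, i.e. US-ASCII. I then tried replacing the second argument of those calls with UTF-8, and all the Chinese in the parsed mail content displayed correctly.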
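To illustrate why that second argument matters, here is a minimal sketch that is independent of php-imap's actual implementation. `decodeHeaderTo()` is a hypothetical helper (not a library function) that decodes RFC 2047 encoded-words and converts each one to a target charset with `mb_convert_encoding()`; it assumes the mbstring extension is loaded and the file is saved as UTF-8.

```php
<?php
// Hypothetical helper, NOT part of php-imap: decodes RFC 2047 encoded-words
// (=?charset?B|Q?payload?=) and converts each one to $targetCharset.
function decodeHeaderTo(string $header, string $targetCharset): string
{
    $decoded = preg_replace_callback(
        '/=\?([^?]+)\?([BbQq])\?([^?]*)\?=/',
        function (array $m) use ($targetCharset): string {
            [, $charset, $encoding, $payload] = $m;
            // "B" is base64; "Q" is quoted-printable with "_" standing for a space.
            $raw = strtoupper($encoding) === 'B'
                ? base64_decode($payload)
                : quoted_printable_decode(str_replace('_', ' ', $payload));
            // The target charset decides which characters can survive the conversion.
            return mb_convert_encoding($raw, $targetCharset, $charset);
        },
        $header
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $decoded ?? $header;
}

$subject = '=?UTF-8?B?5Lit5paH?= report'; // "中文 report", base64-encoded as UTF-8

echo decodeHeaderTo($subject, 'UTF-8'), "\n";    // Chinese characters are preserved
echo decodeHeaderTo($subject, 'US-ASCII'), "\n"; // Chinese characters cannot be represented
```

With a US-ASCII target the Chinese characters have no representation and are replaced, which matches the garbled output described above.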

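While making that change, I noticed that decodeRFC2231() also receives the initialization encoding as its second argument, so I changed every call to that method as well.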
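The same reasoning applies to RFC 2231 parameter values such as attachment file names. The sketch below uses a hypothetical `decodeRfc2231Value()` function; it is not the library's decodeRFC2231() and only assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper, NOT php-imap's decodeRFC2231(): decodes an RFC 2231
// extended parameter value of the form  charset'language'percent-encoded-data
function decodeRfc2231Value(string $value, string $targetCharset): string
{
    // Leave the value untouched if it is not in extended form or names no charset.
    if (!preg_match("/^([^']*)'[^']*'(.*)$/s", $value, $m) || $m[1] === '') {
        return $value;
    }

    // Undo the percent-encoding, then convert from the declared charset
    // to the charset the caller wants to work with.
    return mb_convert_encoding(rawurldecode($m[2]), $targetCharset, $m[1]);
}

$param = "UTF-8''%E4%B8%AD%E6%96%87.txt";
echo decodeRfc2231Value($param, 'UTF-8'), "\n";
```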

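What I changed

At initialization, I added a new parameter that specifies which encoding the mail should be decoded into; it defaults to UTF-8.

I added one property and two methods:
protected $localToEncoding
setLocalToEncoding()
getLocalToEncoding()

I then replaced the second argument of every decodeMimeStr() call, and the second argument of the convertStringEncoding() call inside DataPartInfo::fetch(), with the new getter: $this->getServerEncoding() => $this->getLocalToEncoding(). I may have missed some places, since I did not do a global search.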
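The change described above could be sketched roughly as follows. `EncodingSettings` and `toLocal()` are hypothetical names used only for illustration; only the property and the getter/setter names come from the description, and the real php-imap classes are not reproduced here. The sketch assumes PHP 7.4+ and the mbstring extension.

```php
<?php
// Hypothetical class for illustration only; it is NOT php-imap's Mailbox.
class EncodingSettings
{
    /** Encoding used when talking to the mail server. */
    protected string $serverEncoding;

    /** Encoding that decoded mail content should be converted into. */
    protected string $localToEncoding;

    public function __construct(string $serverEncoding = 'UTF-8', string $localToEncoding = 'UTF-8')
    {
        $this->serverEncoding = $serverEncoding;
        $this->setLocalToEncoding($localToEncoding);
    }

    public function getServerEncoding(): string
    {
        return $this->serverEncoding;
    }

    public function setLocalToEncoding(string $encoding): void
    {
        $encoding = trim($encoding);
        if ($encoding === '') {
            throw new InvalidArgumentException('Target encoding must not be empty.');
        }
        $this->localToEncoding = $encoding;
    }

    public function getLocalToEncoding(): string
    {
        return $this->localToEncoding;
    }

    /** Converts decoded bytes to the local target, never to the server encoding. */
    public function toLocal(string $raw, string $sourceCharset): string
    {
        return mb_convert_encoding($raw, $this->getLocalToEncoding(), $sourceCharset);
    }
}

// The server connection can stay on US-ASCII while content is decoded to UTF-8.
$settings = new EncodingSettings('US-ASCII');
echo $settings->getServerEncoding(), ' -> ', $settings->getLocalToEncoding(), "\n";
```

Keeping the two values separate means the connection encoding no longer dictates how message content is decoded.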

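I'm not sure whether my change is reasonable, but it did solve my problem. If there is a better official way, I hope it can be added in an update.
Below is the result returned after my change (the returned content has already been filtered):

Returned result

All the Chinese text now displays correctly.
Finally, thank you.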

@mostanily mostanily added the needs investigation This will be tested / debugged or checked out. label May 29, 2019
@mostanily
Author

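Oh, one more thing I forgot to mention: the attachment download logic. In the current version, if an attachment save directory is set, the same attachment is downloaded again every time the mail is fetched, so the attachment directory fills up with duplicate files.
I hope this can be optimized: for any given email, its saved attachments should match the ones it actually has.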
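One possible way to avoid the duplicates is to derive the file name deterministically from the mail and attachment identifiers, so that fetching the same mail again maps to the same path. This is only a sketch under assumptions: `stableAttachmentPath()` and `saveAttachmentOnce()` are hypothetical helpers, the identifiers are assumed to be stable across fetches, and the library may solve this differently.

```php
<?php
// Hypothetical helpers, NOT part of php-imap.

// Builds the same path every time for the same mail ID, attachment ID and name.
function stableAttachmentPath(string $dir, string $mailId, string $attachmentId, string $fileName): string
{
    // Strip directory components and replace anything outside a conservative
    // ASCII set; the IDs in the prefix keep the result unique per attachment.
    $safeName = preg_replace('/[^A-Za-z0-9._-]+/', '_', basename($fileName));

    return rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $mailId . '_' . $attachmentId . '_' . $safeName;
}

// Writes the file only if it is not already on disk.
// Returns true when a new file was written, false when it was skipped or failed.
function saveAttachmentOnce(string $path, string $contents): bool
{
    if (is_file($path)) {
        return false;
    }

    return file_put_contents($path, $contents) !== false;
}

// The path is identical on every fetch of mail 42, attachment 7.
echo stableAttachmentPath(sys_get_temp_dir(), '42', '7', 'report.pdf'), "\n";
```

Overwriting the existing file instead of skipping it would work equally well; the key point is that the name no longer changes between fetches.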

@Sebbo94BY
Collaborator

Fortunately Google Chrome translates the texts diligently, otherwise we would have a small understanding problem here. :D

The issue you describe is already known. See #306.

I haven't tested it yet, but have you already tried to solve the encoding issue by changing the charset / encoding later in your code?

$mailbox = new PhpImap\Mailbox(
	'{imap.gmail.com:993/imap/ssl}INBOX', // IMAP server and mailbox folder
	'some@gmail.com', // Username for the mailbox configured above
	'*********', // Password for that username
	__DIR__, // Directory, where attachments will be saved (optional)
	'US-ASCII' // Set encoding to us-ascii for Microsoft mail servers
);

try {
	// Get emails
	$mails_ids = $mailbox->searchMailbox('ALL');
} catch(PhpImap\Exceptions\ConnectionException $ex) {
	echo "IMAP connection failed: " . $ex;
	die();
}

// Change server encoding
$mailbox->setServerEncoding('UTF-8');

// Loop through all emails
foreach($mails_ids as $mail_id) {
	// Print a separator so the start of each email is visible in the output
	echo "+------ P A R S I N G ------+\n";

	// Get mail by $mail_id
	$email = $mailbox->getMail(
		$mail_id, // ID of the email, you want to get
		false // Do NOT mark emails as seen
	);

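	// Parenthesize the whole ternary: "." binds tighter than "?:", so otherwise the label is swallowed by the condition
	echo "from-name: " . (isset($email->fromName) ? $email->fromName : $email->fromAddress) . "\n";
	echo "from-email: " . $email->fromAddress . "\n";
	// $email->to is an array keyed by address, so join the keys instead of echoing the array
	echo "to: " . implode(', ', array_keys($email->to)) . "\n";
	echo "subject: " . $email->subject . "\n";
	echo "message_id: " . $email->messageId . "\n";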
}

// Disconnect from mailbox
$mailbox->disconnect();

Regarding the attachments: You're right. Those should be only saved once. We need to update the code logic to keep the same file name in order to update those files instead of creating duplicates. I've created a new issue for this topic: #338

@mostanily
Author

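Thanks for the reply. I did indeed overlook that line of code. However, I have already modified the source code o(╯□╰)o, and that did solve the problem. I didn't realize an official solution already existed, but having made the changes, I don't want to revert them, including the attachment part, haha O(∩_∩)O.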

@Sebbo94BY
Collaborator

Sorry, I didn't get your last comment. Can you please let me / us know what it means in English? Unfortunately, Google translates it to something very strange, which doesn't make sense at all. Thank you!

@mostanily
Author

This problem has been solved. My last comment was just written in a joking tone, so you don't have to worry about it.

@Sebbo94BY
Collaborator

Ah, ok. Thanks for the feedback!
