Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[messagingframework] Ignore Re: and Fwd: when sort by subject. Contributes to JB#6059 #1

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

bshmarin
Copy link

Add sort key for sort by trimmed subject (without Re, Fwd, e.t.c prefixes).

@okodron
Copy link

okodron commented Jun 21, 2021

Copy link
Member

@llewelld llewelld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My SQL sorting skills are no great shakes, so I have a couple of queries. As I've also commented, it would be good to have a header added to the patch.

+ };
+ static const QString trimTemplate = QStringLiteral("replace(%1, '%2', '')");
+ for (const QLatin1String &prefix : prefixes)
+ field = trimTemplate.arg(field, prefix);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be misreading this, but am I correct in thinking that this will remove all instances of "Re:" etc. from the string before sorting, independent of the position? So for example "HERE: a message" will be sorted as if it were "HE a message"?

Is there any way to have it only remove prefixed strings?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

+ for (const QLatin1String &prefix : prefixes)
+ field = trimTemplate.arg(field, prefix);
+ }
+ sortColumns.append(QLatin1String("ltrim(ltrim(") + field + QLatin1String(",'\\\"')) COLLATE NOCASE ") +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of having two nested ltrim statements? It looks like one may be redundant.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -0,0 +1,77 @@
diff --git a/src/libraries/qmfclient/qmailmessagesortkey.cpp b/src/libraries/qmfclient/qmailmessagesortkey.cpp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be helpful if you could include a header similar to the other patches here, so that they can be applied using git am. The easiest way to generate the appropriate patch file that I'm aware of is to commit your changes to a temporary branch and then use git format-patch -N HEAD~ to create the patch file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@chriadam
Copy link
Contributor

Also, can you please add a commit message in the patch file and in the commit itself, which clearly states that this patch should NOT be upstreamed to Qt QMF, due to OMP having no CLA with TQC? Just to prevent any possible mistakes, when we upstream patches in future.

{
- QSqlQuery query(simpleQuery(QLatin1String("SELECT id FROM mailmessages"),
+ bool sortByTrimmedSubject = false;
+ for (const QMailMessageSortKey::ArgumentType &arg : sortKey.arguments())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiline for loop needs braces by qt coding style.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

map.insert(QMailMessageSortKey::Sender, QMailMessageKey::Sender);
map.insert(QMailMessageSortKey::Recipients, QMailMessageKey::Recipients);
map.insert(QMailMessageSortKey::Subject, QMailMessageKey::Subject);
+ map.insert(QMailMessageSortKey::TrimmedSubject, QMailMessageKey::TimeStamp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is Timestamp intentional? Can TrimmedSubject be added to QMailMessageKey?

Copy link
Author

@bshmarin bshmarin Nov 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is Timestamp intentional?

Yes - the order of the time is used at the SQL level when many equal subjects.

Can TrimmedSubject be added to QMailMessageKey?

For what?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I correct with the following: currently, your PR only supports using the trimmed subject for sorting; when performing a query from the database, the ORDER BY clause will use the Timestamp field in the query, and then the method which performs the query will then iterate over the (timestamp-sorted) results and then perform an additional sorting using the trimmed subject (via map insertion).

If so, there are a couple of potential issues:

  1. specifying a DESC sort order will result in the timestamp ordering returned by the query being DESC order, but the additional sorting via map insertion will not be inverted.
  2. sorting by the trimmed subject field will result in a potentially significant performance reduction, as the ordering cannot be done at the database query level, but is done in memory.

The existing QMF code basically uses that one mapping of sort key to mail message key to determine both the sort field and the filter field (at least, if I am reading the comment in messageSortMapInit() correctly), and for all other cases the mapping is defined such that the ORDER BY and SELECT fields are the same. Your PR breaks this assumption, because it defines that the ORDER BY field is Timestamp but handles SELECT in code rather than via expansion from the mapping.

Also, the word "trimmed" already has a well-defined meaning in Qt code (i.e. trimmed of whitespace). A better name might be simplifiedSubject or similar.

Conceptually, I would prefer if we can support this simplified subject more generally (that is, for both sorting and for filtering), without custom codepaths to handle this case specifically. Presumably, it would require modifying QMailMessageHeader, adding the appropriate keys (which would most likely map to the normal Subject field, with a flag for subject simplification, and then adding a method somewhere like QString simplifiedSubject(const QString &subj) const which is used every time we query the subject for these cases - there are probably only a couple of places, e.g. when querying headers, and when querying messages). Of course, this doesn't solve the performance hit, but that can be documented.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the additional sorting via map insertion will not be inverted.

Agree.

potentially significant performance reduction

SQLite also works in memory, so the performance hit shouldn't be too noticeable, but by means of sqlite we will not be able to carry out the sorting we need without creating additional fields in the database.

...that is, for both sorting and for filtering...

I do not see any advantages of filtering by a simplified subject compared to a subject.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the word "trimmed" already has a well-defined meaning in Qt code (i.e. trimmed of whitespace). A better name might be simplifiedSubject or similar.

May be originalSubject ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding originalSubject - I agree, that sounds good to me.

Regarding the advantages of supporting filtering as well as sorting - the only real advantage is that it keeps the code consistent for all keys. The current PR adds some cognitive load for future maintainers, because this specific key works differently to the rest.

Regarding the performance - ordinarily I would say: fair enough, better to avoid additional fields and complexity, so long as the performance overhead of current solution is very low or negligible. However, maybe this is related to the above point about "consistent codepaths for all keys" - it might be cognitively simpler to add them. I'm not sure.

@pvuorela @dcaliste I'd appreciate your thoughts here. I am not familiar enough with QMF to have a hard stance on what is best; that said, the unique semantics for this key type does seem suboptimal to me, from a maintenance perspective. A comprehensive commit message might help with that, I suppose.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding originalSubject - I agree, that sounds good to me.

Changed.

@chriadam
Copy link
Contributor

One comment, aside from that I would still like the commit message to include a warning that the patch should not be upstreamed due to lacking a CLA.

@bshmarin
Copy link
Author

bshmarin commented Nov 17, 2021

One comment, aside from that I would still like the commit message to include a warning that the patch should not be upstreamed due to lacking a CLA.

I'm waiting...

Added.

…subject

without Re: and Fwd: prefixes. Contributes to JB#6059

!!! WARNING !!!
Don't upstream this patch to the QTC due to lacking a CLA.
@krnlyng
Copy link

krnlyng commented Nov 22, 2021

This doesn't account for languages it seems. In german "AW:" is a common prefix to reply emails.

I understand that localizing this might be hard or even impossible, but i wanted to mention it. In case it was already discussed in https://git.sailfishos.org/mer-core/messagingframework/merge_requests/58 then ignore this, but the link is not working anymore.

@pvuorela
Copy link
Contributor

This doesn't account for languages it seems. In german "AW:" is a common prefix to reply emails.

What I gather the standard is "Re:" and if something else is used the client is broken. Broken clients unfortunately including Exchange.

Would be nice to work with all sorts of usage, but as you said that can get problematic if every language has its own strings.

@dcaliste
Copy link
Contributor

I agree, reminding what was originally in git.sailfishos.org MR would be helpfull.

I'm wondering for instance what is the purpose of this MR. I mean, its use case. Is it because the "sort by object" entry in the pulldown menu is broken in the sense that mails with the same object, but having these prefix, are not sorted together ? In that case, the purpose is actually to group by thread. A better implementation may rely then on References header. But well, since people often reply to an email to start a new thread, this solution is doomed also.

@chriadam
Copy link
Contributor

Well, FWIW, this PR now looks fine to me. I don't think it would be accepted upstream (after discussing with Matt, the concern about the differing semantics of this key would probably be a showstopper there) but given that this isn't being upstreamed, I think that isn't a big issue.

@pvuorela
Copy link
Contributor

I'm wondering for instance what is the purpose of this MR. I mean, its use case. Is it because the "sort by object" entry in the pulldown menu is broken in the sense that mails with the same object, but having these prefix, are not sorted together ?

Yes, that's the case.

In that case, the purpose is actually to group by thread. A better implementation may rely then on References header. But well, since people often reply to an email to start a new thread, this solution is doomed also.

Also a good point and maybe not a bad alternative. Though because of that thread splitting into new conversation thinking if the UI should then be something else than "sort by subject".

+ if (sortByOriginalSubject) {
+ QMap<QString, quint64> map;
+ while (query.next())
+ map.insertMulti(extractValue<QString>(query.value(1)).replace(QRegExp(QStringLiteral("^(re:|fw:|fwd:|\\s*|\\\")*"), Qt::CaseInsensitive), QString()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppose it would be better creating the regexp outside the loop, i.e. not separately for every result.

Also on new code it could be better using QRegularExpression instead of QRegexp.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants