New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String length limit with "ctrl-shift-v" #645

Closed
vanleo2001 opened this Issue Jul 27, 2017 · 24 comments

Comments

Projects
None yet
3 participants
@vanleo2001

vanleo2001 commented Jul 27, 2017

When I paste html with shortcut "ctrl-shift-v", I find a string length limit which may break the correct paste text formatting.
For an example, I write a simple html file (utf-8 encode)

<html>
<head>
</head>

<body>
<h1 id="toc_0">Note 2017-07-24</h1>
<p><strong>123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789</strong></p>

</body>
</html>

I open this html page with browser. I select the different length of content and paste (ctrl-shift-v) into QON, then I find it will not format correctly if the length of selected text exceed 128 characters.

snap627

@bepolymathe

This comment has been minimized.

bepolymathe commented Jul 27, 2017

Hi,

I confirm the same for me with Ctrl-shift-v but no problem with simple Ctrl-v

@pbek

This comment has been minimized.

Owner

pbek commented Jul 27, 2017

That's interesting, maybe it's a limitation of Qt regular expression parser. I will take a look at it.

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 28, 2017

There is the 1024 characters length limitation in QRegExp module, but not in QRegularExpression module. Also I write a demo to test, it works correctly and shows none length limitation.

QString text.replace(QRegularExpression(
            "<strong.*?>(.+?)<\\/strong>",
            QRegularExpression::CaseInsensitiveOption), "**\\1**");
@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 29, 2017

I know the why. When copying html, if copy content exceed a length limit, Win OS will insert some linebreaks \r\n in clipboard content. You can confirm this with "Free Clipboard Viewer" software. Then Qt regular expression parser can not match.

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 29, 2017

So add "remove \r\n in clipboard text" first before html tag converting https://github.com/pbek/QOwnNotes/blob/develop/src/mainwindow.cpp#L5888 ?

@pbek

This comment has been minimized.

Owner

pbek commented Jul 30, 2017

Strange, I cannot reproduce that problem under mac OS, regardless of how many characters there are in the strong-tag, I always was able to paste them correctly.

@pbek

This comment has been minimized.

Owner

pbek commented Jul 30, 2017

So add "remove \r\n in clipboard text" first before html tag converting

That would influence the interpretation of <pre> and <code> tags.

I neither can reproduce this issue under Windows 8.1. I was able to copy and paste html of any length. :/

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 30, 2017

Would you like to test the html file in zip attachment?
Test.zip

I make this special html page with some CR LF add. You can open it with browser and try "paste html or media" function in QON. Is it same as shown in browser?
I'm in win7, that shows many line breaks in QON. That is the problem. I try in MAC OS, that's no problem.

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 30, 2017

I google and test again. It seems the problem of browser. In windows 7, When copy (ctrl-c) some content in Firefox, it will auto insert some \r\n into copy data. So it will cause line breaks in QON when paste html (ctrl-shift-v}. While I do same in IE, it will not insert \r\n, even it will auto remove unnecessary \r\n.
You can test Test.zip above either in IE or Firefox to distinguish the difference.

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 31, 2017

Now I test QON in MAC OS virtual machine. <pre> and <code> tags are not replaced.

  1. In MAC OS, I open the html in below zip file with safari browser.
    Note 2017-07-24.zip

  2. I select some content which use <pre> tags in source.
    snap747

  3. Then ctrl-shift-v in QON. You can see the pasted text doesn't format as it in browser. The markdown "code block" tags are missing.
    snap748

  4. The problem is QRegularExpression will match in one line. If you want match in multilines, you need use param globalMatch .

@pbek

This comment has been minimized.

Owner

pbek commented Jul 31, 2017

I tested your Test.html in Windows under Chrome and IE, where I didn't had any problems, but it didn't work with pasting from FF.

@pbek

This comment has been minimized.

Owner

pbek commented Jul 31, 2017

The problem is QRegularExpression will match in one line. If you want match in multilines, you need use param globalMatch .

@vanleo2001, where did you find that problem? QString::replace replaces all occurrences (http://doc.qt.io/qt-5/qstring.html#replace-12)

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 31, 2017

I review the QT source. You are right. But it seems QON exactly doesn't match in multilines. Strange!

QString &QString::replace(const QRegularExpression &re, const QString &after)
{
......
    QRegularExpressionMatchIterator iterator = re.globalMatch(copy);
@pbek

This comment has been minimized.

Owner

pbek commented Jul 31, 2017

Strange, but the other replaces in Utils::Misc::htmlToMarkdown definitely work multiple times...

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 31, 2017

I don't think so. All above what I said copy problem in FF is also caused by mismatch in multilines.

@pbek

This comment has been minimized.

Owner

pbek commented Jul 31, 2017

I have not reverse engineered yet what FF does to the html code in the clipboard (and if it can be undone), but transforming multiple hits of all kind of tags with QString::replace seems to work with code from other browsers.

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 31, 2017

Would you like to read #645 (comment). The copy problem in Chrome, IE or Safari is they will auto remove the line breaks to fit the layout. QString::replace actually matchs in single line.

So I structure <pre>...</pre> with multilines, it fails.

@pbek

This comment has been minimized.

Owner

pbek commented Jul 31, 2017

If QString::replace is the cause of this we have to file a bug for Qt, because http://doc.qt.io/qt-5/qstring.html#replace-12 states Replaces every occurrence of the regular expression re in the string with after. ;)

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 31, 2017

I get it. Please add param in Utils::Misc::htmlToMarkdown
QRegularExpression::DotMatchesEverythingOption

For example:

text.replace(QRegularExpression(
            "<pre.*?>(.+?)<\\/pre>",
            QRegularExpression::CaseInsensitiveOption | QRegularExpression::DotMatchesEverythingOption), "\n```\n\\1\n```\n");
@pbek

This comment has been minimized.

Owner

pbek commented Jul 31, 2017

Yes, that is the equivalent of the multiline-regexp-modifier. That definitely is needed for <pre> and surely for other tags too... Thank you. (It also came into my mind right before you were writing your comment)

@vanleo2001

This comment has been minimized.

vanleo2001 commented Jul 31, 2017

👍

@pbek

This comment has been minimized.

Owner

pbek commented Jul 31, 2017

It was more complex than I though...

17.08.0

  • when pasting html from the clipboard as markdown null characters and Windows
    line breaks will now be removed as well as all tags will be interpreted over
    multiple lines
@pbek

This comment has been minimized.

Owner

pbek commented Jul 31, 2017

There now is a new release, could you please test it and report if it works for you?

@pbek

This comment has been minimized.

Owner

pbek commented Aug 2, 2017

I will close this issue until there is more information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment