Editing BLOB as Text truncates value at first NULL byte #19

Closed
stevehodgett opened this issue May 25, 2014 · 11 comments
Labels
bug Confirmed bugs or reports that are very likely to be bugs.

Comments

@stevehodgett

This is more a comment than a bug report, but I noticed this when testing the fix for issue #16.
Suppose you have a BLOB containing multiple null-terminated strings. When you edit the blob, everything is fine as type:Binary, but if you change the type to "Text" in the EditDialog the length of the data remains the same but only the data up to the first null is shown (understandable, but not what the user might expect). Switching back to type:Binary shows all the data again.
However, if you change the data in the type:Text view it is immediately truncated at the first null (the new length shown at the bottom of the EditDialog window), and switching back to type:Binary only shows the truncated data.
I do think it's a valuable facility to be able to edit text within a blob - editing UTF-8 encoded text in hex mode is only fun once :-)
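The truncation described above is typical of binary data being handled as a C-style string somewhere along the way. The following is only a minimal sketch of that general mechanism in standard C++; it is not the code path in sqlitebrowser or Qt, and the helper names `asCString` and `asSizedBuffer` are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: treats the buffer as a C string, so the length is
// determined by scanning for the first NUL byte and everything after it is lost.
std::string asCString(const char* data)
{
    return std::string(data);
}

// Hypothetical helper: carries the length explicitly, so embedded NUL bytes
// are kept as ordinary data.
std::string asSizedBuffer(const char* data, std::size_t size)
{
    return std::string(data, size);
}
```

For a blob holding `"one\0two\0three"` (13 bytes), the first helper keeps only `one`, while the second keeps all 13 bytes.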

MKleusberg added a commit that referenced this issue May 25, 2014
Make the EditDialog a bit more user friendly when editing binary data in
text mode:

Don't change back to the hex editor after changing any character.

Show the full binary data even if it contains a NULL byte.

Also (though a bit unrelated) disable rich text input for the text widget.

This partially fixes issue #19.
@MKleusberg
Member

This is a difficult one because Qt does a lot of UTF-8 conversion automatically. What I could do is show the full data in the text editor and keep the dialog from switching back to the hex editor all the time.
However, once you change a BLOB which contains a non-UTF-8 character it is replaced by the UTF-8 replacement character (EF BF BD) and thus rendered useless... This happens somewhere in QTextEdit or QTextDocument. NULL characters work though :)
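To illustrate why such a round trip through a text widget is lossy, here is a rough sketch in standard C++ of a "bytes to text and back" pass that substitutes the replacement character (EF BF BD) for anything that is not valid UTF-8. It is an assumption-laden stand-in, not what QTextEdit or QTextDocument actually do internally: the function names are hypothetical, and the validation is simplified (overlong encodings and surrogates are not rejected).

```cpp
#include <cstddef>
#include <string>

// Length of the UTF-8 sequence introduced by lead byte b, or 0 if b cannot
// start a sequence. Simplified: overlong forms and surrogates are not checked.
static std::size_t sequenceLength(unsigned char b)
{
    if (b < 0x80) return 1;               // ASCII, including the NUL byte
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;                             // continuation byte or invalid lead byte
}

// Hypothetical stand-in for loading bytes into a text widget and reading them
// back: valid sequences are copied, each invalid byte becomes EF BF BD.
std::string roundTripThroughUtf8(const std::string& bytes)
{
    static const std::string replacement = "\xEF\xBF\xBD";
    std::string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const std::size_t len = sequenceLength(static_cast<unsigned char>(bytes[i]));
        bool valid = len != 0 && i + len <= bytes.size();
        for (std::size_t k = 1; valid && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            valid = (c & 0xC0) == 0x80;   // continuation bytes look like 10xxxxxx
        }
        if (valid) {
            out.append(bytes, i, len);
            i += len;
        } else {
            out += replacement;           // the original byte value is gone for good
            ++i;
        }
    }
    return out;
}
```

In this sketch a NUL byte survives unchanged because it is valid UTF-8, whereas a byte such as FF comes back as three different bytes, which is why the edited BLOB no longer matches the original.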

@stevehodgett
Author

Yes, it seems to be a very difficult area. Full function UTF8-aware hex/text editors seem very few and far between.
Perhaps it would be reasonable to settle for disabling Text editing for blobs :-(
By the way, I only discovered by accident that the binary edit mode has the text displayed on the right hand side - it's completely hidden when the EditDialog opens at its default size!
Maybe make the default width enough to show the 16 bytes in both hex and text?

@MKleusberg
Member

Thanks for the hint - commit 41296e3 increases the size of that dialog to fit the entire hex editor widget (at least on my system, that is).
In commit ae4d04f I've also added a warning which appears when the user attempts to change binary data in text mode.

I think these changes make the dialog a lot more user friendly but I'd still like to keep Qt away from inserting replacement characters. We'll have to see if that's even possible though 😒 Still, looking forward to your feedback on the new version tomorrow 😄

@stevehodgett
Author

@MKleusberg both commits appear to have achieved what was intended - looks good, thanks!
May I make a couple more (final?) comments? (although I understand you may have better things to do than attend to this level of detail)

  1. there is no visible representation of a null byte in the text view, and
  2. "Select all" highlights all the text, but "copy" only copies up to the first null

Seeing where there is a null means I'm less likely to remove it by mistake.
Being able to copy the entire blob value at least gives me the opportunity to take the data for processing elsewhere (and a really nice feature of the EditDialog is that I can get my binary data back into sqlitebrowser by pasting it as hex pairs).
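As a sketch of the hex-pair round trip mentioned above, the following standard C++ converts a byte buffer to hex pairs and back. The exact format the EditDialog accepts when pasting is not stated in this thread, so the space-separated uppercase pairs and the function names `toHexPairs` and `fromHexPairs` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Render every byte as two uppercase hex digits, separated by single spaces.
std::string toHexPairs(const std::string& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}

// Value of a single hex digit, or -1 if c is not one.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parse hex digits back into bytes. Whitespace is skipped wherever it occurs,
// so this sketch is more lenient than a strict "pairs only" parser.
std::string fromHexPairs(const std::string& text)
{
    std::string out;
    int high = -1;                        // pending high nibble, or -1 if none
    for (char c : text) {
        if (c == ' ' || c == '\n' || c == '\t' || c == '\r') continue;
        const int v = hexValue(c);
        if (v < 0) throw std::invalid_argument("not a hex digit");
        if (high < 0) {
            high = v;
        } else {
            out += static_cast<char>((high << 4) | v);
            high = -1;
        }
    }
    if (high >= 0) throw std::invalid_argument("odd number of hex digits");
    return out;
}
```

Because every byte, including 00, has an unambiguous two-digit form, a value such as `test`, a null byte, then `test` becomes `74 65 73 74 00 74 65 73 74` and parses back to the same nine bytes.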

@MKleusberg
Member

Don't worry, I do care about this level of detail and all input is welcome :)

  1. This might be because of your font or so. I get at least some sort of representation on my system and if you don't have a similar result on yours we might be able to fix it.
  2. Hmm, I can reproduce this. The problem here is that this bug lies somewhere in the Qt code. While it's possible to rewrite that code inside the application I'm not sure if it's worth the effort to be honest. I think the better approach might be to make editing binary data as comfortable as possible without usage of the text editor. If you haven't noticed them you might want to try the Export and Import buttons. And I'll check if it's feasible to extend the hex editor widget to allow input on the right side where the text representation is. I'd also be interested in any other suggestions!

@justinclift
Member

As a side thought, is there a difference in the way Qt4 vs Qt5 do this? At the moment we can compile against either, but if Qt5 is better then we could potentially become Qt5-only?

@stevehodgett
Author

Here's what I see on Windows 8.1:
[screenshot]
To my mind, the best solution is, as you suggest, to extend the hex editor widget if possible to allow input on the right side.
By the way, the hex editor doesn't appear to make any attempt to display multi-byte UTF-8 characters ("café" will appear as "caf.."). Another Qt constraint?

@justinclift
Member

@stevehodgett We've made a huge amount of changes and bug fixes since this was reported, including many in our unicode handling. Would you have a few moments to try our new release (3.8.0) and make sure it's working properly now for you? 😄

@justinclift justinclift added the bug Confirmed bugs or reports that are very likely to be bugs. label Jan 7, 2016
@MKleusberg
Member

Closing this as it's been almost a year without updates. Chances are that the copy-paste situation improved when we moved to Qt5, but either way there isn't much we can do about it on our side. Adding an edit option to the right side of the hex editor turns out to be difficult, too: we would have to improve the QHexEdit widget, but that's an external project by different people. The "caf.." problem isn't actually a bug but correct behavior: the "é" is encoded using two separate bytes and neither of them represents the "é" on its own, thus the two dots. And finally, you can now set the font for the edit dialog, choosing one which doesn't omit NULL bytes but prints a box or something instead. This way "testtest" should become "test[]test".
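The "caf.." behavior can be sketched with a byte-per-cell renderer of the kind a hex editor's text column typically uses. This is a guess at the general rule, not QHexEdit's actual code: the function name `asciiColumn` and the printable range 0x20 to 0x7E are assumptions.

```cpp
#include <string>

// Hypothetical rendering rule for the text column of a hex view: one output
// character per byte, with anything outside printable ASCII shown as a dot.
std::string asciiColumn(const std::string& bytes)
{
    std::string out;
    for (char ch : bytes) {
        const unsigned char b = static_cast<unsigned char>(ch);
        out += (b >= 0x20 && b <= 0x7E) ? ch : '.';
    }
    return out;
}
```

Under this rule the UTF-8 bytes C3 A9 that make up "é" are each drawn as a dot, and a null byte is drawn as a dot as well, since each cell only ever sees one byte.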

@MKleusberg
Member

It's been almost three years, so this is probably just for the record, but: Good news! I've just updated the QHexEdit library in our repository and editing the ASCII code on the right side of the hex editor is now finally supported 😉

@justinclift
Member

@MKleusberg That's awesome. 😄
