Export blobs confus(ed,ing) - of (at least) two minds #1438
This is almost a feature request, but I came down on the side of a UI bug...
The browser provides a "binary" view of blob data that is handy for humans.
Sometimes it's useful to extract the blob as binary for analysis (or formatting) by external tools.
The Export button in the "Edit Database Cell" pane is of two minds on this -- maybe more.
On one hand, it displays a text hexdump - which is handy for humans.
But Export provides a filename of ".txt", so one expects a human-readable form to be written.
Then again, the output isn't the text dump, but appears to be the raw data. I haven't tried a complex test case, but hopefully the output is written in binary ("wb") mode. (I encountered this with a record that has a few bare ^Ms - which seems to be enough for you to decide it's a blob.)
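To illustrate why binary ("wb") mode matters here, the following is a minimal Python sketch, not the application's actual export code. The helper names `export_blob` and `roundtrip` are hypothetical. It writes bytes containing bare carriage returns in binary mode and checks that they come back unchanged.

```python
import os
import tempfile

def export_blob(path, data):
    """Write the cell's bytes unchanged; "wb" disables newline translation."""
    with open(path, "wb") as f:
        f.write(data)

def roundtrip(data):
    """Export to a temporary file and read the same bytes back."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        export_blob(path, data)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

# A record with bare ^M (carriage return) bytes plus other binary values.
sample = b"line one\rline two\r\x00\xff\n"
print(roundtrip(sample) == sample)
```

In text mode on Windows, the line-feed byte would typically be rewritten on output, which is why a binary export should not go through a text-mode stream.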
The following observations seem apropos:
As long as you're providing a hex dump, it would also be helpful to be able to select formatting as (16-bit, 32-bit or 64-bit) words. Saves manually grouping and flipping bytes on a little-endian machine.
Hope this is useful.
All this makes sense.
@tlhackque is the extension automatically added to the filename for you? In my case (Ubuntu 16.04), I have to add whatever extension I want myself; otherwise the file is created without an extension. Maybe it is different under Windows. In any case, it makes sense to set the appropriate filter for each case.
Under Windows, the Save As box comes up with "Save as type: Text files (*.txt)".
If I enter just a filename, .txt is appended.
If I enter an explicit extension, it is honored. But the view is of any .txt files in the save path, not .bin files. And it's suggesting that the content will be text.
The dialog box also allows "All files (*.*)" as a (file) browser filter, but that isn't terribly helpful.
BTW, I just extracted a blob of deflated (pure binary) data & recovered the contents correctly.
So - either you're writing both text and binary in binary mode (windoze cares & will turn LF into CRLF in text mode), or you already have some cleverness...
My current project is Windoze based, so I can't say anything about what you do in the Unix environment.
FWIW, I'm using a nightly build from a few daze ago (I wanted the latest SQLite).
@mgrojo Agreed. That sounds better. Our translators will have a bunch of stuff to change anyway, and I'm pretty sure they'd prefer to have better source text in the UI too.
For exporting a hex dump, should we use the extension .hex?
No, because .hex traditionally means a hex-encoded binary stream in Intel or Motorola format used for burning a (P)ROM, FLASH or PLD device, e.g. S-records.
.txt would be fine - note that it's also associated (all OS) with text editors that can read it.
(Well, except for the Notepad/wordpad mess on windoze - but I hear that's about to be fixed after a few decades...)
Suggest you make the Export dialog be something like
Or anything else that comes to mind as useful. I don't think you want a huge list, but I bet there are some common formats for blobs, like graphics (.jpg, .png, .gif), that might be worth considering - they don't cost anything except a menu entry. (For extra credit, you could even run the stream against one of the magic recognizers, e.g. the database used by the Unix 'file' command, and pick one automagically.) As long as it doesn't go into the dozens of entries in the pulldown...
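The "magic recognizer" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `guess_extension` and the short signature table are hypothetical, and a real implementation would more likely use the database behind the Unix 'file' command than a hand-written list.

```python
# Leading byte signatures for a few common image formats.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
]

def guess_extension(blob, default=".bin"):
    """Return a suggested file extension based on the blob's first bytes."""
    for signature, extension in MAGIC_SIGNATURES:
        if blob.startswith(signature):
            return extension
    # Unknown content falls back to a generic binary extension.
    return default

print(guess_extension(b"GIF89a" + bytes(10)))
print(guess_extension(b"\x00\x01\x02"))
```

The result would only pick the default entry in the pulldown; the user could still choose any other filter.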
added a commit
Jun 20, 2018
Sure -- let me know when it's available (I'm actually working on something else so looking at this is interrupt-driven), & I'll give it a whirl.
Thank you for taking this on so quickly and being so responsive.
In case my last (non-bulleted) note in .0 wasn't clear:
What I mean is that the hex bytestrings in a dump (written low address to high left to right):
as words is read (little endian) as:
The easiest approach is to reverse the bytes in the listing, so the highest address in each row
So dump generators traditionally provide the option to view the data in either of these byte-swapped formats. Some also will insert spaces every 2, 4, or 8 bytes to show word boundaries. This tends to work even with a variety of elements packed into a structure, since they (hopefully) are usually naturally aligned for performance.
For big-endian (Network Order) data, the current left to right order works - grouping in word-sized chunks is the same as for little-endian.
So a simple toggle (big/little = left to right or right to left) would satisfy everyone. :-)
Well, to the extent that anyone is ever satisfied ;-0
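As a rough illustration of the word-grouping option, here is a Python sketch. The function `dump_words` and its parameters are hypothetical, not part of the hex editor widget. It implements one variant only: bytes are swapped within each word while the words stay in left-to-right address order. Reversing the whole row, as described above, is a different presentation of the same idea.

```python
def dump_words(data, word_size=1, little_endian=False, row_bytes=16):
    """Return hex dump lines with bytes grouped into words of word_size bytes."""
    if word_size not in (1, 2, 4, 8):
        raise ValueError("word_size must be 1, 2, 4 or 8")
    lines = []
    for offset in range(0, len(data), row_bytes):
        row = data[offset:offset + row_bytes]
        words = []
        for i in range(0, len(row), word_size):
            chunk = row[i:i + word_size]
            if little_endian:
                # Show the most significant byte first, as a human reads the value.
                chunk = chunk[::-1]
            words.append(chunk.hex())
        lines.append(f"{offset:08x}  {' '.join(words)}")
    return lines

# Two 32-bit values, 0x12345678 and 0x89abcdef, stored little-endian.
data = bytes([0x78, 0x56, 0x34, 0x12, 0xEF, 0xCD, 0xAB, 0x89])
print(dump_words(data, word_size=4, little_endian=True))
# ['00000000  12345678 89abcdef']
print(dump_words(data, word_size=4, little_endian=False))
# ['00000000  78563412 efcdab89']
```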
@mgrojo Thanks for the work.
Sorry for the delay.
I got this morning's build (MSI, W64). Save as now prompts for .bin & the messages are changed.
Both look improved.
But - I noticed the Import button. And after clicking on it, I realized that things aren't quite right yet :-(
It prompts for .txt files, even in binary mode. It does offer .bin, but the default selection should match the window's mode. E.g., if the Mode is binary, .bin should be the default; if the mode is image, the default should be images - etc.
This is also true - and somewhat worse - on Export: the default selection for image mode is .bin. This is an improvement over TXT, but inappropriate for images. Worse is that in this case, the "save as type" pulldown only includes "bin" and "all files". This doesn't only control the default name; it also controls the default view (of existing files that one might want to use/supersede/save).
So, what I think we want is:
I think the mapping looks like this (same for import and export):
Note that the text modes don't offer binary; image mode is a special case of binary. JSON and XML don't overlap each other, but both are special cases of text files.
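One possible mapping that satisfies the constraints in the note above can be sketched in Python. Everything here is an assumption made for illustration: the mode names, the filter strings, and the helper functions are hypothetical and are not taken from the application.

```python
# Hypothetical filter labels, written in the style of a file dialog.
FILTERS = {
    "text": "Text files (*.txt)",
    "json": "JSON files (*.json)",
    "xml": "XML files (*.xml)",
    "binary": "Binary files (*.bin)",
    "image": "Image files (*.jpg *.png *.gif)",
    "all": "All files (*)",
}

# For each editor mode, the filters to offer; the first entry is the default.
MODE_FILTERS = {
    "text": ["text", "all"],
    "json": ["json", "text", "all"],      # JSON is a special case of text
    "xml": ["xml", "text", "all"],        # XML is a special case of text
    "binary": ["binary", "all"],
    "image": ["image", "binary", "all"],  # image is a special case of binary
}

def filters_for_mode(mode):
    """Return the filter labels to offer for the given editor mode."""
    return [FILTERS[key] for key in MODE_FILTERS[mode]]

def default_filter(mode):
    """Return the filter that should be preselected for the given mode."""
    return filters_for_mode(mode)[0]

print(default_filter("image"))
print(filters_for_mode("json"))
```

Keeping "All files" in every list preserves the escape hatch while the default still matches the mode.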
Hope this makes sense.
I know I'm being somewhat picky - but these things do make a difference when you try to use the tool.
The list of filters for the import button is always the same, independently of the selected mode or the detected data type already in the editor.
The rationale is that importing can change the data type. You can import any kind of data, in any mode. But this could be changed so that only data matching the selected mode can be loaded. I don't have a strong opinion about it.
The export filters should be chosen intelligently, based on the data type that we currently detect, not on the current mode. So if the offered filter was *.bin, the data was detected as binary. Was it indeed an image in your test? Then it might be a bug.
The only current caveat of this approach is that we are not currently detecting the XML data. We should probably do it.
You are correct, of course. My point is that changing the type of data in a field is unusual, even during development. I don't mean to disallow it, but I do think that the default should follow what is already in the cell. The idea (and the reason to always allow *.*) is that with sharp tools, people can certainly hurt themselves, but the browser shouldn't guide them to mistakes.
I don't remember for sure what was in the cell when I looked for export - I think it was NULL.
I didn't realize that you look at the actual data when selecting the output filters. That's probably better than following the editor mode. But I'm not sure exactly how that works, given SQLite's loose typing.
If I click on an integer cell, it's detected as text. Which is fuzzy. I think of Integer as a binary field - it's stored that way. But it is human-readable, and it's probably natural to edit it that way.
Speaking of which, if I click on a blob and use binary mode - the binary isn't editable. Which might be a nice thing to have. (e.g. click on the hex or text, change a byte and it updates the cell)
I don't know what you would export from a NULL cell - but it certainly isn't text. (I verified that last night's build definitely does suggest text for NULL.) I suppose it could be a zero-length binary file. Or you could just refuse to export - after all, NULL is nothing.
As you DO detect what's in a cell, why not change the edit mode to match when a cell is selected?
Which led me to try the next "obvious" thing. I selected a row and clicked export. I got only one cell - the one I happened to have clicked last. That isn't unreasonable.
But there's no context menu choice to export an entire row, which seems odd.
It might be useful to be able to select a few rows and export them in that way...
referenced this issue
Jul 29, 2018
added a commit
Sep 15, 2018
referenced this issue
Sep 20, 2018
This is improved now. The default filter in the Import is following the editor mode. Would you want to test it in the nightly?
When the value is NULL, the export will have text and binary filters, but it will make an empty file in any case.
The application has its own detection algorithms.
There is a single data type detected for text and numeric data, since both are updated using the text editor. It would make sense to see the real binary representation of the integer, though. There is already an issue: #1416.
This is only expected for read-only databases or views. Can you reproduce it in a read-write table?
Yes, maybe we should disable the export in that case, but it is harmless. Maybe in the future.
This was separated to a new issue (#1537) and it's already in the nightlies. Would you like to try it?
Yes, because the editor loads the last selected cell. It only understands about cells.
You can copy the data and paste it into other applications. There is also an SQL version that copies the insert statement for those cells. You can now also print them to a PDF file or to a real printer (unless you have problems with it like those reported in #760).
added a commit
Sep 29, 2018
By the way, there is an option in the hex editor widget library for dynamically changing the number of columns. Instead of a horizontal scrollbar and a fixed number of columns, we can control the number of columns by adjusting the width of the panel. This seems more useful to me. What do others think? It's easy to activate it if we don't need an option for it.
I'll try to look at the new stuff in the next few days; it's a busy time.
With respect to dynamically changing the number of columns: I'm OK IF the result is an offset that increments by an even multiple of 8 for each row (though I prefer 10 hex, i.e. 16 decimal). Otherwise, one has to do too much math to find a byte (or offset). I wouldn't want to have to fuss with the panel width to get a convenient offset increment; the dump should snap to one.
E.g. if the panel width can fit 17-31 bytes/row, the dump should use 16. If it would fit 9-15, the dump should use 8.
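The snapping rule in this example can be sketched in Python as "use the largest power of two that fits". This is one reading of the examples given, not a description of what the widget library does, and the function name `snap_columns` is hypothetical.

```python
def snap_columns(fit):
    """Snap the number of bytes that fit in a row down to a power of two."""
    if fit < 1:
        raise ValueError("at least one byte must fit in a row")
    # bit_length() - 1 is the exponent of the largest power of two <= fit.
    return 1 << (fit.bit_length() - 1)

for fit in (9, 15, 16, 17, 31, 32):
    print(fit, "->", snap_columns(fit))
```

With this rule the row offsets always increase by 8, 16, 32 and so on, so locating a byte never requires arithmetic on odd row widths.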