Mangled attachment/blob filenames #3817

Closed
hpk42 opened this issue Dec 6, 2022 · 5 comments

@hpk42
Contributor

hpk42 commented Dec 6, 2022

Some filename extensions are case-sensitive. For example, with the "R" programming language, as noted here, the file extensions need to be '.Rmd' and '.R'; otherwise the files are not recognized. Core currently always lowercases the extension part.

Moreover, core also creates new, longer filenames to disambiguate when the filename already exists in the blobdir. So I might send "proposal.md" to someone, and they receive/save it as "proposal-0192830912830.md" because they already have a "proposal.md" in their blobdir. If they save and modify it and then send it back, it's going to be "proposal-0192830912830-10923810923.md" on my side. It's quite irritating even though I know why it happens.
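To make the two effects above concrete, here is a minimal sketch in Python. It is a toy model of the described behaviour, not core's actual code: the function name `mangle`, the in-memory `blobdir` set, and the explicit `suffix` argument (passed in only to keep the example deterministic) are all assumptions made for illustration.

```python
import os

def mangle(name: str, blobdir: set[str], suffix: str) -> str:
    """Toy model (hypothetical): lowercase the extension, then append a
    suffix to the stem if the resulting name is already taken."""
    stem, ext = os.path.splitext(name)
    ext = ext.lower()  # this is the step that turns ".Rmd" into ".rmd"
    candidate = stem + ext
    if candidate in blobdir:
        candidate = f"{stem}-{suffix}{ext}"
    blobdir.add(candidate)
    return candidate

# The receiver already has a "proposal.md" in their blobdir.
blobdir = {"proposal.md"}
received = mangle("proposal.md", blobdir, "0192830912830")
# They save the file under the mangled name, edit it and attach it again,
# so the mangled name collides with the blob that is already stored.
sent_back = mangle(received, blobdir, "10923810923")
# Case-sensitive extensions are lost even without any collision.
r_file = mangle("analysis.Rmd", set(), "unused")
```

In this model `received` ends up as "proposal-0192830912830.md" and `sent_back` as "proposal-0192830912830-10923810923.md", which matches the round trip described above, and `r_file` becomes "analysis.rmd".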

It's arguably a UX bug that deltachat invents or mangles filenames for blobs instead of trying to preserve the original incoming/user-provided filename and transferring precisely that when sending out messages with attachments. User-level filenames should be independent of how DC stores them internally.

A possible way to mostly avoid the above filename-mangling issues is to introduce an SQL blobfiles table that keeps the original filename and points to the current "mangled" filename, but as an internal detail that is not exposed. Later, such a blobfiles table could also hold a hash for each blob file row, and we could have two rows with the same hash, thus pointing to the same internal filename (but with different user-visible filenames). This would provide deduplication for all blob files.
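A rough sketch of what such a table could look like, using Python's `sqlite3` with an in-memory database. The column names, the `open_db` and `add_blob` helpers, and the choice of SHA-256 are assumptions for illustration; nothing here reflects an existing core schema.

```python
import hashlib
import sqlite3

def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE blobfiles (
               id            INTEGER PRIMARY KEY,
               original_name TEXT NOT NULL,  -- user-visible, never mangled
               internal_name TEXT NOT NULL,  -- storage detail, not exposed
               hash          TEXT NOT NULL   -- digest of the file content
           )"""
    )
    return conn

def add_blob(conn: sqlite3.Connection, original_name: str,
             data: bytes, new_internal_name: str) -> str:
    """Register a blob and return the internal name actually used.

    If a row with the same content hash exists, its internal name is reused,
    so identical content is stored only once (writing the bytes to disk is
    left out of this sketch).
    """
    digest = hashlib.sha256(data).hexdigest()
    row = conn.execute(
        "SELECT internal_name FROM blobfiles WHERE hash = ? LIMIT 1", (digest,)
    ).fetchone()
    internal_name = row[0] if row else new_internal_name
    conn.execute(
        "INSERT INTO blobfiles (original_name, internal_name, hash) VALUES (?, ?, ?)",
        (original_name, internal_name, digest),
    )
    return internal_name
```

Two attachments with identical bytes but different user-visible names would then share one internal file while each row keeps its own original filename.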

hpk42 added the "bug" (Something is not working) label Dec 6, 2022
@adbenitez
Member

+100 I definitely see these filename changes as bugs, and they affect and disturb a lot of users of my bots. For example, when sending multi-part compressed volumes (e.g. RAR or 7z archives can be compressed and split into several parts), the parts must keep the same name and differ only in the extension, which carries an incremental number. But Delta Chat core inserts these weird numbers into the names, and then when the user downloads the files, the archiver fails to extract because some part has a different filename.

@flub
Member

flub commented Dec 8, 2022

FWIW purely preserving the original name for UI purposes does not require a full storage reconsideration as proposed here. We could add this to the existing database already as a new Param without any other changes. You'd then have to figure out if this can be used by the UIs without any extra work or not - I'm not sure of that part (but exploring that is also useful for the future storage work described here).

However, if we are doing something about storage it would indeed be a shame to not include this functionality.

@link2xt
Collaborator

link2xt commented Mar 16, 2023

The current problem, at least on Desktop, is that when you open a file in Delta Chat, the file is opened in the blobdir directly. The dc_msg_get_file API should be replaced with an API that saves the file to a user-provided path, with the UI defaulting to Downloads and the unmangled filename.

I propose to change the API like this:

  1. dc_msg_get_file API should create a temporary directory inside the blobdir, save a copy with the unmangled filename, and return its path. In housekeeping we regularly clear these directories. This API is kept for compatibility, but deprecated.
  2. Add a new API dc_msg_save_file(msg, filename) which saves a copy of the file at the user-provided filename. The call should fail if a file already exists at the provided path. When asked to save the file, the UI should open the file saving dialog, defaulting to Downloads and the original filename. After confirmation it should call dc_msg_save_file.
  3. dc_msg_get_filename should return the original filename for use as the default in the file saving dialog.
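The semantics proposed in points 1 and 2 can be sketched in Python as follows. This only models the behaviour; `save_file` and `get_file_copy` are hypothetical stand-ins, not the C API, and the "tmp-" directory prefix is an assumption.

```python
import shutil
import tempfile
from pathlib import Path

def save_file(blob_path: Path, dest: Path) -> None:
    """Model of the proposed dc_msg_save_file: copy the blob to dest and
    fail if something already exists at that path."""
    # Mode "xb" creates the file exclusively and raises FileExistsError
    # if dest already exists, so an existing file is never overwritten.
    with open(blob_path, "rb") as src, open(dest, "xb") as dst:
        shutil.copyfileobj(src, dst)

def get_file_copy(blob_path: Path, original_name: str, blobdir: Path) -> Path:
    """Model of the proposed dc_msg_get_file: put a copy with the unmangled
    name into a fresh temporary directory inside the blobdir."""
    tmpdir = Path(tempfile.mkdtemp(prefix="tmp-", dir=blobdir))
    # Keep only the final path component of the sender-provided name.
    target = tmpdir / Path(original_name).name
    shutil.copyfile(blob_path, target)
    return target
```

Housekeeping would later delete the temporary directories; that part is not shown. Using exclusive creation rather than a separate "does it exist" check avoids a race between the check and the write.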

@link2xt
Collaborator

link2xt commented Apr 5, 2023

I have filed #4309 after a related bug in the chatbot. The whole filename mangling is tricky and may lead to sending the wrong files to the wrong contacts if the blobdir is unreliable, so I suggest we use random or content-defined filenames.
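As an illustration of the two naming schemes suggested above, a minimal Python sketch. The helper names, SHA-256, and the 16-byte random token are assumptions; the original extension is kept as-is here, including its case.

```python
import hashlib
import os
import secrets

def content_defined_name(data: bytes, original_name: str) -> str:
    """Name derived only from the content: same bytes, same internal name."""
    _, ext = os.path.splitext(original_name)
    return hashlib.sha256(data).hexdigest() + ext

def random_name(original_name: str) -> str:
    """Unpredictable name that is independent of the user-visible filename."""
    _, ext = os.path.splitext(original_name)
    return secrets.token_hex(16) + ext
```

Either way the internal name no longer depends on what the sender called the file, so a name collision in the blobdir cannot make one message point at another message's content.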

@adbenitez
Member

@link2xt I think this issue could be closed in favor of #4309
