Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docsettings: add centralized sdr storage #10074

Merged
merged 52 commits into from
Feb 17, 2023
Merged

Conversation

hius07
Copy link
Member

@hius07 hius07 commented Feb 3, 2023

Discussed in #9265.
New centralized sdr storage for a book metadata:
koreader/sidecars/full/path/to/book/<book_name_without_suffix>.sdr
Also used as a fallback if book folder is readonly.
Default storage: book folder.

01

02

Closes #4831
Closes #5222
Closes #7648
Closes #8577
Closes #9265


This change is Reviewable

@hius07
Copy link
Member Author

hius07 commented Feb 3, 2023

If the conception is approved I will add support of it to file copy/move/rename/delete, Exporter etc.

@hius07
Copy link
Member Author

hius07 commented Feb 3, 2023

I will fix docsettings unit tests after discussion.

datastorage.lua Outdated Show resolved Hide resolved
frontend/docsettings.lua Outdated Show resolved Hide resolved
Comment on lines 166 to 175
-- New sidecar file in doc folder
sidecar_file_doc or "",
-- Backup file of new sidecar file in doc folder
sidecar_file_doc and (sidecar_file_doc..".old") or "",
-- Legacy sidecar file
new.legacy_sidecar_file or "",
legacy_sidecar_file or "",
-- New sidecar file in sidecars folder
sidecar_file_sidecars or "",
-- Backup file of new sidecar file in sidecars folder
sidecar_file_sidecars and (sidecar_file_sidecars..".old") or "",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the ordering here doesn't matter and dont enforce a priority - and that it's the mtime of the files found that will set the priority order, right?

Copy link
Member

@NiLuJe NiLuJe Feb 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The priority is always enforced; as a practical example, if you were to migrate from sidecar to centralized sidecar while preserving mtime and leaving the old sidecar files in place (and not touching the document so as to affect said sidecar data), the doc's sidecar would be preferred because they have higher priority (that rule only applies if mtime matches, otherwise the primary sort order is by MRU).

(Which leads to an obvious conclusion: if you do intend to create a migration path, the previous location should be cleared, or the new location files "touched" so as to get an updated mtime).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd better have sidecar.lua before sidecar.lua.old

local mtime = lfs.attributes(file_path, "modification")
-- NOTE: Extra trickery: if we're inserting a "backup" file, and its primary buddy exists,
-- make sure it will *never* sort ahead of it by using the same mtime.
-- This aims to avoid weird UTC/localtime issues when USBMS is involved,
-- c.f., https://github.com/koreader/koreader/issues/9227#issuecomment-1345263324
if file_path:sub(-4) == ".old" and previous_entry_exists then
local primary_mtime = candidates[#candidates].mtime
-- Only proceed with the switcheroo when necessary, and warn about it.
if primary_mtime < mtime then
logger.warn("DocSettings: Backup", file_path, "is newer (", mtime, ") than its primary (", primary_mtime, "), fudging timestamps!")
-- Use the most recent timestamp for both (i.e., the backup's).
candidates[#candidates].mtime = mtime
end
end

And finally the candidates are sorted by mtime, yes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think about the migration of sidecar files for all books. Is it needed?
The migration for a certain book is done by opening it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please, no migration ! :)
We don't know all possible book locations.
And there's always the case of removable SD cards, that may not be plugged in at the time when the migration is done.

The priority is always enforced; as a practical example, if you were to migrate from sidecar to centralized sidecar while preserving mtime and leaving the old sidecar files in place (and not touching the document so as to affect said sidecar data), the doc's sidecar would be preferred because they have higher priority (that rule only applies if mtime matches, otherwise the primary sort order is by MRU).

So, I guess this is bad ? :/
The case we want to handle is:

  • we have book A and book B with "alongside doc" sidecars.
  • user changes setting to "centralized" (no migration is done)
  • user opens book A: "no centralized sidecar" found, "alongside doc" is found, use that
  • user closes book A: "centralized sidecar" is saved (should "alongside doc" be deleted?)
  • user changes setting to "alongside docs" (no migration is done)
  • user opens book A: "alongside doc" is found and used first, but "centralized sidecar" is newer and should have been used

Scratch that, if you really mean MRU (Most Recently User), which is what I wanted :)
(that is, the order the sidecar candidates are listed in the code does not matter - we don't copy preserving mtime, so there will always be a newer, that will be used).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the order the sidecar candidates are listed in the code does not matter

It does (in a technical sense, at least), but it's only used as a tie-breaker if there's an mtime conflict.

If there's no mtime conflict, mtime wins ;).

frontend/docsettings.lua Outdated Show resolved Hide resolved
@@ -246,77 +242,47 @@ function DocSettings:flush()
ffiutil.fsyncOpenedFile(f_out) -- force flush to the storage device
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor (?) issue with koreader/sidecars/: we won't be fsync'ing the intermediate directories created by util.makePath(s[1]).
(Which might be a non-issue, as we only create them the first time, and it's not really an issue if the device soon after crash and we lose nearly new docsettings - it only matters for later flush when the user had the time to make some highlights, etc...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safer to replace util.makePath with mkdir -p?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Safer in what way ?
I don't think mkdir -p would do any sync.
Moreover, there's more risks executing other programs (like launching mkdir) than doing things internally from the koreader process in Lua, because executing needs more memory and so may fail if not able to get it. See #2239 (comment)

Comment on lines 508 to 511
local metadata_folder_str = {
[":doc"] = _("book folder"),
[":sidecars"] = "koreader/sidecars",
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These menu items will probably need a help_text (or the menu needs a 3rd item to display this help_text) explaining the pros and cons of each option (or/and even a confirmation when changeing it?).

@mergen3107
Copy link
Contributor

That’s a lot of changes :D
While I overall like it, I have questions:

  1. Can the current sdr folders be moved to the new place?
  2. If sdr is moved and the settings is set to the new location, will the old reading statistics still work?
  3. If both places have sdr folders, which one takes precedence?

@mergen3107
Copy link
Contributor

  1. Will the folder structure be preserved?

frontend/docsettings.lua Outdated Show resolved Hide resolved
frontend/docsettings.lua Outdated Show resolved Hide resolved
@hius07
Copy link
Member Author

hius07 commented Feb 3, 2023

1. Can the current sdr folders be moved to the new place?
2. If sdr is moved and the settings is set to the new location, will the old reading statistics still work?
3. If both places have sdr folders, which one takes precedence?
4. Will the folder structure be preserved?

  1. Yes, automatically while opening a book.
  2. I do not know much about the statistics but it is not stored in a sidecar file I think.
  3. The latest one.
  4. Which "folder structure"?

@mergen3107
Copy link
Contributor

Thanks @hius07
For 1), I have in mind settinng the setting to the new sidecar then moving all sdr folder to the new place.

For 4), I mean currently my home folder is at /storage/emulated/0/Books. Inside, there are books as well as other folders with books. That “folder structure I meant”.

E.g. will the sdr at $HOME/Collection1/Book1.sdr move to /koreader/sidecar or to /koreader/sidecar/Collection1 ?

@hius07
Copy link
Member Author

hius07 commented Feb 4, 2023

E.g. will the sdr at $HOME/Collection1/Book1.sdr move to /koreader/sidecar or to /koreader/sidecar/Collection1 ?

To /koreader/sidecar/$HOME/Collection1/Book1.sdr

@hius07
Copy link
Member Author

hius07 commented Feb 4, 2023

Updates in accordance with the reviews.
Centralized folder now is koreader/docsettings.

frontend/docsettings.lua Outdated Show resolved Hide resolved
@mergen3107
Copy link
Contributor

Thank you! Looks very promising then, can’t wait to test

@hius07
Copy link
Member Author

hius07 commented Feb 5, 2023

Generally, what is better to check if a file exists:
lfs.attributes(path, "mode") == "file" or util.fileExists(path)?

koreader/frontend/util.lua

Lines 745 to 751 in c5997a6

function util.fileExists(path)
local file = io.open(path, "r")
if file ~= nil then
file:close()
return true
end
end

@poire-z
Copy link
Contributor

poire-z commented Feb 5, 2023

I guess lfs.attributes(path, "mode") == "file" is our canonical way to check it (if it is a file, and so that it exists).
util.fileExists() checks if a file is readable (so, ok, that it exists too :) It was added by @pazos in #6177 (not sure why :), and used by @NiLuJe in kobo/device.lua to check /sys and /dev files where knowing if it's readable probably make sense.
Also, I don't know if any of these is cheaper than the other.

NiLuJe added a commit to NiLuJe/koreader that referenced this pull request Feb 11, 2023
The previous iteration, besides failing to handle leaf-only input,
was relying on splitFilePathName, which just doesn't do what's required
to incrementally build the directory tree the right way around ;).

This should more closely match mkdir -p, i.e., it will *fail* if any
part (or all of it) of the path exists but is *not* a directory.

Re koreader#10074
NiLuJe added a commit to NiLuJe/koreader that referenced this pull request Feb 11, 2023
The previous iteration, besides failing to handle leaf-only input,
was relying on splitFilePathName, which just doesn't do what's required
to incrementally build the directory tree the right way around ;).

This should more closely match mkdir -p, i.e., it will *fail* if any
part (or all of it) of the path exists but is *not* a directory.

Re koreader#10074
@poire-z poire-z added this to the 2023.02 milestone Feb 12, 2023
NiLuJe added a commit that referenced this pull request Feb 12, 2023
The previous iteration, besides failing to handle leaf-only input,
was relying on splitFilePathName, which just doesn't do what's required
to incrementally build the directory tree the right way around ;).

This should more closely match mkdir -p, i.e., it will *fail* if any
part (or all of it) of the path exists but is *not* a directory.

Re #10074
Copy link
Contributor

@poire-z poire-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good for you @NiLuJe ?

Comment on lines +525 to +529
common_settings.document_metadata_folder = {
text_func = function()
local value = G_reader_settings:readSetting("document_metadata_folder", "doc")
return T(_("Book metadata folder: %1"), metadata_folder_str[value])
end,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather see "Book settings folder", document_settings_folder, used instead of "metadata" all around there, it feels less ambiguous than book metadata (ie. author, title).
Unless it's confusing in other ways to use "settings" in the code .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But may be it expresses more clearly where the "metadata".epub.lua will be kept ? :)

@mergen3107
Copy link
Contributor

(maybe not for this PR, but something to think about)
@hius07
I just had a thought about the .sdr folder migration. I remembered that Kindle native reader also uses the sdr folder to store .mbp, .apnx, and other Kindle-related stuff.

Do you think there should be at least warning about the migration if:
0) Device is Kindle, and

  1. Home folder is the same as on Kindle (/mnt/us/documents)
  2. .sdr directory contains not only .lua files (and others?) created by KOReader.

Or maybe the migration of these specific file formats should be avoided if that’s not too hard?

Not that it concerns myself since I don’t have a Kindle anymore, but I think KOReader should probably be friendly to native reader as well :D

@poire-z
Copy link
Contributor

poire-z commented Feb 13, 2023

(By migration, you mean individual migration on book opening right ? I hope you got that we won't do a batch migration :) - or if we do (or a pluging does), it should be a batch opening of books in History.)

Oh, and yes, there's the help_text with pros & cons to write :/ Or in some popup confirmation box.
(And I'm not really in the mood to write about it :\ )

@mergen3107
Copy link
Contributor

The initial plan AFAIK is yes, to migrate only on book-by-book basis when the book is reopened after the new settings is enabled.

Copy link
Member

@NiLuJe NiLuJe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r12, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @hius07 and @poire-z)


frontend/ui/elements/common_settings_menu_table.lua line 529 at r12 (raw file):

Previously, poire-z wrote…

But may be it expresses more clearly where the "metadata".epub.lua will be kept ? :)

It kinda does both in practice, and metadata feels slightly more generic and safe to me?

@NiLuJe
Copy link
Member

NiLuJe commented Feb 13, 2023

Good for you @NiLuJe ?

Nope, some of the previous review comments still need to be addressed ;).

(Plus, help_text stuff. I might try my hand at it tomorrow if nobody beats me to it).

@hius07
Copy link
Member Author

hius07 commented Feb 13, 2023

(1) Fixed: build serials without table.insert.
(2) Fixed: use makePath instead of mkdir -p.
(3) Added: DocSettings:update(), to be used for file rename/copy/move/delete operations (a separate PR).
(4) The main reason for "Book metadata folder" naming was the next menu item: "Save book metadata" (recently discussed and updated).

for _, f in ipairs(serials) do
local s_out = dump(data or self.data, nil, true)
for _, s in ipairs(serials) do
util.makePath(s[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i understand correctly, this would create a folder under koreader/docsettings for no reason even if using doc
Also purgeDir will only delete the immediate folder koreader/docsettings/path/to/file.sdr/ only file.sdr is deleted, leaving empty folders.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i understand correctly, this would create a folder under koreader/docsettings for no reason even if using doc

I believe no, we have a break of the loop if the "doc" sidecar has been saved.

Also purgeDir will only delete the immediate folder koreader/docsettings/path/to/file.sdr/ only file.sdr is deleted, leaving empty folders.

That's true.

Copy link
Member

@NiLuJe NiLuJe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good ;).

I forgot about help_text again, might have some time for it later.

The purgeDir issue that was just raised may indeed require more thinking ;).

Reviewed 2 of 2 files at r13, all commit messages.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @hius07, @poire-z, and @yparitcher)

@hius07
Copy link
Member Author

hius07 commented Feb 15, 2023

The purgeDir issue that was just raised may indeed require more thinking ;).

I do not think it is a big issue because usually library subfolders contain several books and after deleting a book we can keep a (empty) sdr path to be used by other books now or in the future. But yes, for the sake of good order we need smth like removePath(path) opposite to makePath(), iterating upper on the path and removing empty levels.

This (and help_text) can be added later within a new (big) PR that includes support of cerntralized sdr foder in file operations (almost finished, to be presented after this PR merging).

@NiLuJe NiLuJe mentioned this pull request Feb 16, 2023
@poire-z
Copy link
Contributor

poire-z commented Feb 16, 2023

@NiLuJe : OK with merging and to be continued in another PR?

@NiLuJe
Copy link
Member

NiLuJe commented Feb 16, 2023

Yup, fine with me ;).

@hius07 hius07 merged commit d1081fa into koreader:master Feb 17, 2023
@hius07 hius07 deleted the centralize-sdr branch February 17, 2023 06:36
Comment on lines -61 to +65
"data", "data/dict", "data/tessdata",
"history", "ota", "plugins",
"data", "data/dict", "data/tessdata", "docsettings",
"history", "ota", "patches", "plugins",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any special reason to have added "patches" in there ?
I think it was supposed to be explicitely created by the hardcore users who would need user patches, and the absence of this directory keeps the PatchManager plugin disabled, so to not lead softcore users into temptation :)
(Pinging @zwim)

@poire-z
Copy link
Contributor

poire-z commented Feb 18, 2023

Minor thought:
datastorage.lua creates the directory "history" (alongside "ota", "patches", "plugins",), and (at least on Kobo), we ship an empty history/ in the zip.
With this PR, I think we won't ever write anything into this history/ subdirectory.
So, it could be removed from these 2 things.

@hius07
Copy link
Member Author

hius07 commented Feb 18, 2023

The same for "clipboard" folder used by the Exporter plugin only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants