-
-
Notifications
You must be signed in to change notification settings - Fork 296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feature] Add media list command #1943
Conversation
This is an attempt to help alleviate superseriousbusiness#1776. Using admin media list --local the full path to each local media file will be printed, with a newline. The output of this should be feadable into backup tools in order to allow to backup local media too. Together with the database this should allow to fully recover from the loss of an instance. The list command also gets a --remote flag for symmetry. In the case of --remote we print the RemoteURL instead, the location the asset can be retrieved from. To get all media, you can run with --local and --remote.
test/envparsing.sh
Outdated
@@ -105,6 +105,7 @@ EXPECT=$(cat <<"EOF" | |||
"letsencrypt-email-address": "", | |||
"letsencrypt-enabled": true, | |||
"letsencrypt-port": 80, | |||
"local": false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to how flags are declared through the config module, we now end up with these somewhat generic names in the config. It's not very terrible, but I could easily see this clashing with a different command at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could change the flags to have a more verbose and unique name, like --list-local
, but it starts to stutter a bit with admin media list --list-local
etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're going to be having a tidy-up of CLI and environment stuff in a little while anyway so i wouldn't worry about it too much until then. we're still only alpha too so we can afford to have breaking changes here and there when it comes to CLI usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually thinking about it. What if we had admin media [list-local|list-remote]
as two separate commands? As I'm not sure I could see a situation where you would want both filepath outputs and remote URLs output at the same time 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking the same. I'll do that, that also avoids the flag issue while we're at it.
|
||
for _, a := range attachments { | ||
if local && a.RemoteURL == "" { | ||
if _, err := f.WriteString(path.Join(mediaPath, a.File.Path) + "\n"); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we should return the error here, or just ignore it entirely. Likelihood of this causing an error is essentially 0, unless os.Stdout
gets banged up somehow. It does have the advantage that if we do run into an issue, the command will exit non-zero so it's possible to detect that generating the list experienced an error. That seems desirable as the output of this command is meant to feed into other systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
honestly i would just drop the error here. a stdout error indicates a serious issue that is going to stop this entire process from working. chances are stderr would be fucked too which is why i wouldn't bother with a panic either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also makes the code here much more concise :)
return fmt.Errorf("failed to retrieve attachments: %w", err) | ||
} | ||
|
||
f := bufio.NewWriter(os.Stdout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to buffer stdout here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually grab a buffered one when I expect to be flushing a lot of output. I'm happy to change it though.
Edit: Just realised that answer wasn't very clear. I typically buffer when I expect a lot of output to limit the amount of write
system calls that we end up needing. Depending on the size of the instance the amount of local or remote media can be significant.
// Set the state DB connection | ||
state.DB = dbConn | ||
|
||
attachments, err := dbConn.GetAttachments(ctx, "", 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should probably use one of the other paged functions here? e.g. if you look at how we iterate over media in internal/cleaner
. on large instances this will be loading a lot of attachments into memory at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah indeed. I'll take a look at that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm, so looking at internal/cleaner
it seems to iterate on the files in storage. But I can take the same approach with iterating using db.GetAttachments
by starting without a maxID
but with a limit and then go from there until I've made it through. That would let me do something like take 50 files at a time and not blow up memory usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's what I was thinking essentially :)
This changes the implementation of admin media list --<variant> to two separate top-level commands, list-local and list-remote. The implementation now iterates over over the database in batches of 200 in order to avoid loading all media metadata into memory.
|
||
var errListDone = errors.New("no more") | ||
|
||
func (l *list) Next(ctx context.Context) ([]*gtsmodel.MediaAttachment, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super stoked about this "iterator" thing, but I wanted to avoid having all this code duplicated between the ListLocal and ListRemote commands.
Very open to better suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I've been thinking that maybe I could add a local bool
to this and based on that do the writing to l.out
directly. Only reason I didn't is because I thought it might be nicer to keep the fetching from the DB and the printing to stdout separate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done away with the iterator-like thing because it just didn't feel right. I like the new approach with passing in a filter func a lot more.
You nerds seem to have this covered pretty well already, but ping me if you want my input on anything :) |
This does away with the somewhat odd iterator-like structure we had before and does away with most of the loop duplication in list-local and list-remote. Instead they call GetAllMediaPaths with a filter func to select the media they want. That's accumulated into a slice and eventually returned.
Since we don't append the empty string anywhere, the remote filter can be limited to returning RemoteURL, as that'll be an empty string for local media.
I'm happy to merge this if you're done tweaking, it looks good to me :) |
I think the docs was the last thing I wanted to add. So yap, I'm good at this point for a squerge. |
ACTIVATING SQUERGE |
Description
This was brought on by my attempt to set up backups for my own instance and the update to the docs in #1942.
This is an attempt to help alleviate #1776. Using
admin media list --local
the full path to each local media file will be printed, with a newline. The output of this should be feadable into backup tools in order to allow to backup local media too. Together with the database this should allow to fully recover from the loss of an instance.The list command also gets a
--remote
flag for symmetry. In the case of--remote
we print theRemoteURL
instead, the location the asset can be retrieved from.To get all media, you can run with
--local
and--remote
.Checklist
Please put an x inside each checkbox to indicate that you've read and followed it:
[ ]
->[x]
If this is a documentation change, only the first checkbox must be filled (you can delete the others if you want).
go fmt ./...
andgolangci-lint run
.