Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Temporarily missing media attachments #2175

Closed
firescry opened this issue Sep 3, 2023 · 8 comments · Fixed by #2331
Closed

[bug] Temporarily missing media attachments #2175

firescry opened this issue Sep 3, 2023 · 8 comments · Fixed by #2331
Assignees
Labels
bug Something isn't working

Comments

@firescry
Copy link
Contributor

firescry commented Sep 3, 2023

Describe the bug with a clear and concise description of what the bug is.

Media attachments are sometimes not visible in posts.

I observed this problem on this post. There are three images attached.

Screenshot of original post

Screenshot of original post with three images attached

Original post was posted at 9:30, although nearly 2h later image attachments were still not visible on my GTS instance.

Screenshot of post as visible on my GTS instance (attachments not visible)

In Tusky client:
Screenshot of original post with attachments missing

In Elk client:
Screenshot of original post with attachments missing

Eventually attached images become visible.

Screenshot of post as visible on my GTS instance (attachments visible)

In Tusky client:
Screenshot_20230903-125748

In Elk client:
Screenshot_20230903-125758

What's your GoToSocial Version?

0.11.1 git-c7a46e0

GoToSocial Arch

arm64

What happened?

Original post comes from account with many followers, so when my instance tried to fetch attachments, it timed-out (I assume that original instance was quite busy for some time trying to serve the attachments to all remote servers). This is what I see in the logs:

timestamp="03/09/2023 09:35:48.315" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.network.europa.eu/.well-known/webfinger?resource=acct%3AEU_Commission%40social.network.europa.eu requestID=<REDACTED> pubKeyID=https://<REDACTED>/users/<REDACTED>/main-key msg="performing request"
timestamp="03/09/2023 09:35:48.701" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.network.europa.eu/users/EU_Commission requestID=<REDACTED> pubKeyID=https://<REDACTED>/users/<REDACTED>/main-key msg="performing request"
timestamp="03/09/2023 09:35:49.013" func=workers.(*Processor).ProcessFromFediAPI level=INFO activityType=Create objectType=Note toAccount=<REDACTED> requestID=<REDACTED> msg="processing from fedi API"
timestamp="03/09/2023 09:35:49.014" func=server.glob..func1.Logger.func13.1 level=INFO latency="734.500391ms" userAgent="http.rb/5.1.1 (Mastodon/4.1.3; +https://social.network.europa.eu/)" method=POST statusCode=202 path=/users/<REDACTED>/inbox clientIP=141.94.105.157 requestID=<REDACTED> msg="Accepted: wrote 45B"
timestamp="03/09/2023 09:35:49.064" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.network.europa.eu/system/media_attachments/files/110/992/230/387/522/834/original/bb578aad8b827af8.png requestID=<REDACTED> pubKeyID=https://<REDACTED>/users/<REDACTED>/main-key msg="performing request"
timestamp="03/09/2023 09:35:49.314" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.network.europa.eu/users/EU_Commission/statuses/110971498619145849 pubKeyID=https://<REDACTED>/users/<REDACTED>/main-key msg="performing request"
timestamp="03/09/2023 09:35:49.639" func=server.glob..func1.Logger.func13.1 level=INFO latency="1.710453ms" userAgent="http.rb/5.1.1 (Mastodon/4.1.3; +https://social.network.europa.eu/)" method=GET statusCode=200 path=/users/<REDACTED>/main-key clientIP=141.94.105.157 requestID=<REDACTED> msg="OK: wrote 561B"
timestamp="03/09/2023 09:35:49.979" func=server.glob..func1.Logger.func13.1 level=INFO latency="7.252236ms" userAgent="http.rb/5.1.1 (Mastodon/4.1.3; +https://social.network.europa.eu/)" method=GET statusCode=200 path=/users/<REDACTED> clientIP=141.94.105.157 requestID=<REDACTED> msg="OK: wrote 1.06kiB"
timestamp="03/09/2023 09:35:50.821" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.network.europa.eu/system/media_attachments/files/110/971/497/172/216/939/original/2b783294bcc68fe6.jpeg pubKeyID=https://<REDACTED>/users/<REDACTED>/main-key msg="performing request"
context deadline exceeded (Client.Timeout or context cancellation while reading body)
timestamp="03/09/2023 09:36:19.237" func=dereferencing.(*deref).fetchStatusAttachments level=ERROR requestID=<REDACTED> msg="error loading attachment: finish: error decoding image: png: invalid format: not enough pixel data"
timestamp="03/09/2023 09:36:19.265" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.network.europa.eu/system/media_attachments/files/110/992/231/711/866/305/original/9b23321b35457aa2.png requestID=<REDACTED> pubKeyID=https://<REDACTED>/users/<REDACTED>/main-key msg="performing request"
context deadline exceeded (Client.Timeout or context cancellation while reading body)
timestamp="03/09/2023 09:36:49.427" func=dereferencing.(*deref).fetchStatusAttachments level=ERROR requestID=<REDACTED> msg="error loading attachment: finish: error decoding image: png: invalid format: not enough pixel data"
timestamp="03/09/2023 09:36:49.456" func=httpclient.(*Client).DoSigned level=INFO method=GET url=https://social.network.europa.eu/system/media_attachments/files/110/992/232/089/383/884/original/badfcd7f9fbba61a.png requestID=<REDACTED> pubKeyID=https://<REDACTED>/users/<REDACTED>/main-key msg="performing request"
context deadline exceeded (Client.Timeout or context cancellation while reading body)
timestamp="03/09/2023 09:37:19.655" func=dereferencing.(*deref).fetchStatusAttachments level=ERROR requestID=<REDACTED> msg="error loading attachment: finish: error decoding image: png: invalid format: not enough pixel data"

After some time my instance was able to fetch missing attachments.

What you expected to happen?

The biggest issue for me was that I didn't know that there are any attachments in that post. I guess that with distributed nature of Fediverse, problems with fetching remote media may happen from time to time, but it would be great if in such situations user is informed about part of the post missing (e.g. placeholder images could be shown) :)

How to reproduce it?

Setting timeout value to very low value and following accounts with many followers probably increases the chance of the issue to reoccur :)

Anything else we need to know?

Default value for timeout is set to 10s (https://github.com/superseriousbusiness/gotosocial/blob/v0.11.1/example/config.yaml#L748). I increased that value on my instance to 30s (the issue described above happened when that increased timeout was already set). I think I'll try to increase that value even more, but I'm not sure what side effects it may have (e.g. on instance performance). Also, should the default value in example config be changed?

@firescry firescry added the bug Something isn't working label Sep 3, 2023
@NyaaaWhatsUpDoc
Copy link
Member

What platform are you running this GoToSocial install on? With what sort of network connection? I'm just trying to figure out if these slowly running requests are a GoToSocial code thing or perhaps your instance is saturating the amount it's trying to download on the available connection at any one time 🤔

@firescry
Copy link
Contributor Author

firescry commented Sep 4, 2023

My instance runs on Raspberry Pi 4 and is connected via ethernet to the router. Theoretical transfer limit of internet connection is 500 Mbps.
When I fetch the video file attached to this post, transfer is usually fast enough:

 curl https://social.network.europa.eu/system/media_attachments/files/110/949/939/218/577/000/original/143e5abbb5cb66dd.mp4 > /dev/null 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 36.7M  100 36.7M    0     0  15.0M      0  0:00:02  0:00:02 --:--:-- 15.0M

but sometimes is a bit slower:

$ curl https://social.network.europa.eu/system/media_attachments/files/110/949/939/218/577/000/original/143e5abbb5cb66dd.mp4 > /dev/null 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 36.7M  100 36.7M    0     0  5241k      0  0:00:07  0:00:07 --:--:-- 5168k

@alok0
Copy link

alok0 commented Sep 13, 2023

I sometimes see something similar(?), on posts with videos. Typically what I see is that the version of the post on my timeline never gets the video.

Eventually loading the individual post will load the video, but the timeline version never shows the video.

I'm running on a VPS and am not bottlenecked afaik.

Maybe the remote cdn is just going to be slow when a user with large following posts a video and every instance in the world pulls?

I most commonly see this problem with stux's videos.

@NyaaaWhatsUpDoc
Copy link
Member

in which case I think you can try tweaking the http client timeout in your settings. should be under http-client -> timeout

@alok0
Copy link

alok0 commented Sep 14, 2023

I don't know how the api is supposed to work, but if we aren't able to cache the media locally, is there a way for the api to return the remote url.

I know that for mastodon if media is slow to load or if reject media is on, the frontend shows a blurhash placeholder and a link that goes to the remote media url. Though I'm not sure how it works in the api.

@tsmethurst
Copy link
Contributor

Mm, probably showing a blurhash placeholder would be a good idea yeah. Right now if we can't get a media attachment on first fetch, I think we just don't show any attachments (iirc) until the next time the status is refreshed (cooldown of two hours). Placeholder would definitely help there to give the user more information.

@tsmethurst tsmethurst added this to the v0.12.0 Spaghetti Sloth milestone Sep 19, 2023
@tsmethurst
Copy link
Contributor

tsmethurst commented Sep 30, 2023

I did a bit of poking into this and didn't find a solution yet that I'm 100% happy with (do I ever?). So I'm planning to come back to it later.

For now, my thinking was along the lines of altering this function to put the db update/store inside the defer function:

func (p *ProcessingMedia) load(ctx context.Context) (*gtsmodel.MediaAttachment, bool, error) {
var (
done bool
err error
)
err = p.proc.Process(func() error {
if p.done {
// Already proc'd.
return p.err
}
defer func() {
// This is only done when ctx NOT cancelled.
done = err == nil || !errors.Comparable(err,
context.Canceled,
context.DeadlineExceeded,
)
if !done {
return
}
// Store final values.
p.done = true
p.err = err
}()
// Attempt to store media and calculate
// full-size media attachment details.
if err = p.store(ctx); err != nil {
return err
}
// Finish processing by reloading media into
// memory to get dimension and generate a thumb.
if err = p.finish(ctx); err != nil {
return err
}
if p.recache {
// Existing attachment we're recaching, so only update.
err = p.mgr.state.DB.UpdateAttachment(ctx, p.media)
return err
}
// First time caching this attachment, insert it.
err = p.mgr.state.DB.PutAttachment(ctx, p.media)
return err
})
if err != nil {
return nil, done, err
}
return p.media, done, nil
}

As such, if a load failed, the attachment database entry would still be stored, and the status would still display with the attachment's blurhash (I think), but the attachment would never have cached marked as true, and so a refetch/recache could be attempted immediately.

The downside here is that because we never parsed the image data (because we couldn't retrieve it), we don't know the dimensions and stuff, and can only guess at the extension. But perhaps that's OK?

@NyaaaWhatsUpDoc do you have any onions on that possible solution?

@firescry
Copy link
Contributor Author

Thank you! ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants