Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate post thumbnail/metadata in background (ref #4529) #4564

Merged
merged 5 commits into from Mar 27, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
22 changes: 17 additions & 5 deletions api_tests/src/image.spec.ts
Expand Up @@ -17,13 +17,16 @@ import {
deleteAllImages,
delta,
epsilon,
followCommunity,
gamma,
getSite,
imageFetchLimit,
registerUser,
resolveBetaCommunity,
resolveCommunity,
resolvePost,
setupLogins,
waitForPost,
unfollows,
} from "./shared";
const downloadFileSync = require("download-file-sync");
Expand Down Expand Up @@ -209,6 +212,11 @@ test("Images in remote post are proxied if setting enabled", async () => {
test("No image proxying if setting is disabled", async () => {
let user = await registerUser(beta, betaUrl);
let community = await createCommunity(alpha);
let betaCommunity = await resolveCommunity(
beta,
community.community_view.community.actor_id,
);
await followCommunity(beta, true, betaCommunity.community!.community.id);

const upload_form: UploadImage = {
image: Buffer.from("test"),
Expand All @@ -228,15 +236,19 @@ test("No image proxying if setting is disabled", async () => {
).toBeTruthy();
expect(post.post_view.post.body).toBe("![](http://example.com/image2.png)");

let gammaPost = await resolvePost(delta, post.post_view.post);
expect(gammaPost.post).toBeDefined();
let betaPost = await waitForPost(
beta,
post.post_view.post,
res => res?.post.alt_text != null,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
res => res?.post.alt_text != null,
res => res?.post.alt_text !== null,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails with !== but passes with !=. Not sure why but Ive changed it back.

);
expect(betaPost.post).toBeDefined();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found the problem, gamma was fetching the post here before metadata was generated. This means the post url was federated as Link which doesnt include alt text. Have to subscribe to the community and wait for it to send the post out after metadata is generated. Also had to change from delta to beta because follow didnt go through from delta for some reason.


// remote image doesnt get proxied after federation
expect(
gammaPost.post!.post.url?.startsWith("http://127.0.0.1:8551/pictrs/image/"),
betaPost.post.url?.startsWith("http://127.0.0.1:8551/pictrs/image/"),
).toBeTruthy();
expect(gammaPost.post!.post.body).toBe("![](http://example.com/image2.png)");
expect(betaPost.post.body).toBe("![](http://example.com/image2.png)");

// Make sure the alt text got federated
expect(post.post_view.post.alt_text).toBe(gammaPost.post!.post.alt_text);
expect(post.post_view.post.alt_text).toBe(betaPost.post.alt_text);
});
27 changes: 19 additions & 8 deletions api_tests/src/post.spec.ts
Expand Up @@ -55,7 +55,18 @@ afterAll(() => {
unfollows();
});

function assertPostFederation(postOne?: PostView, postTwo?: PostView) {
async function assertPostFederation(postOne: PostView, postTwo: PostView) {
// Link metadata is generated in background task and may not be ready yet at this time,
// so wait for it explicitly. For removed posts we cant refetch anything.
postOne = await waitForPost(beta, postOne.post, res => {
return res === null || res?.post.embed_title !== null;
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alt_text issue might be a logical problem here. Its returning even if the res is null.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The failing test is "No image proxying if setting is disabled" which doesnt use this function. The null check is necessary because theres a removed post which cant be fetched.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrm... gotta be somewhere else then. Otherwise all the other PRs would have alt_text issues too, because that's already in main.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to happen specifically with this PR, so be careful with merging even if CI randomly passes.

postTwo = await waitForPost(
beta,
postTwo.post,
res => res === null || res?.post.embed_title !== null,
);

expect(postOne?.post.ap_id).toBe(postTwo?.post.ap_id);
expect(postOne?.post.name).toBe(postTwo?.post.name);
expect(postOne?.post.body).toBe(postTwo?.post.body);
Expand Down Expand Up @@ -109,7 +120,7 @@ test("Create a post", async () => {
expect(betaPost?.community.local).toBe(true);
expect(betaPost?.creator.local).toBe(false);
expect(betaPost?.counts.score).toBe(1);
assertPostFederation(betaPost, postRes.post_view);
await assertPostFederation(betaPost, postRes.post_view);

// Delta only follows beta, so it should not see an alpha ap_id
await expect(
Expand Down Expand Up @@ -157,7 +168,7 @@ test("Unlike a post", async () => {
expect(betaPost?.community.local).toBe(true);
expect(betaPost?.creator.local).toBe(false);
expect(betaPost?.counts.score).toBe(0);
assertPostFederation(betaPost, postRes.post_view);
await assertPostFederation(betaPost, postRes.post_view);
});

test("Update a post", async () => {
Expand All @@ -178,7 +189,7 @@ test("Update a post", async () => {
expect(betaPost.community.local).toBe(true);
expect(betaPost.creator.local).toBe(false);
expect(betaPost.post.name).toBe(updatedName);
assertPostFederation(betaPost, updatedPost.post_view);
await assertPostFederation(betaPost, updatedPost.post_view);

// Make sure lemmy beta cannot update the post
await expect(editPost(beta, betaPost.post)).rejects.toStrictEqual(
Expand Down Expand Up @@ -329,7 +340,7 @@ test("Delete a post", async () => {
throw "Missing beta post 2";
}
expect(betaPost2.post.deleted).toBe(false);
assertPostFederation(betaPost2, undeletedPost.post_view);
await assertPostFederation(betaPost2, undeletedPost.post_view);

// Make sure lemmy beta cannot delete the post
await expect(deletePost(beta, true, betaPost2.post)).rejects.toStrictEqual(
Expand Down Expand Up @@ -372,7 +383,7 @@ test("Remove a post from admin and community on different instance", async () =>
// Make sure lemmy beta sees post is undeleted
let betaPost2 = (await resolvePost(beta, postRes.post_view.post)).post;
expect(betaPost2?.post.removed).toBe(false);
assertPostFederation(betaPost2, undeletedPost.post_view);
await assertPostFederation(betaPost2!, undeletedPost.post_view);
});

test("Remove a post from admin and community on same instance", async () => {
Expand Down Expand Up @@ -403,7 +414,7 @@ test("Remove a post from admin and community on same instance", async () => {
p => p?.post_view.post.removed ?? false,
);
expect(alphaPost?.post_view.post.removed).toBe(true);
assertPostFederation(alphaPost.post_view, removePostRes.post_view);
await assertPostFederation(alphaPost.post_view, removePostRes.post_view);

// Undelete
let undeletedPost = await removePost(beta, false, betaPost.post);
Expand All @@ -416,7 +427,7 @@ test("Remove a post from admin and community on same instance", async () => {
p => !!p && !p.post.removed,
);
expect(alphaPost2.post.removed).toBe(false);
assertPostFederation(alphaPost2, undeletedPost.post_view);
await assertPostFederation(alphaPost2, undeletedPost.post_view);
await unfollowRemotes(alpha);
});

Expand Down
56 changes: 54 additions & 2 deletions crates/api_common/src/request.rs
@@ -1,16 +1,24 @@
use crate::{
context::LemmyContext,
lemmy_db_schema::traits::Crud,
post::{LinkMetadata, OpenGraphData},
utils::proxy_image_link,
send_activity::{ActivityChannel, SendActivityData},
utils::{local_site_opt_to_sensitive, proxy_image_link, proxy_image_link_opt_apub},
};
use activitypub_federation::config::Data;
use encoding::{all::encodings, DecoderTrap};
use lemmy_db_schema::{
newtypes::DbUrl,
source::images::{LocalImage, LocalImageForm},
source::{
images::{LocalImage, LocalImageForm},
local_site::LocalSite,
post::{Post, PostUpdateForm},
},
};
use lemmy_utils::{
error::{LemmyError, LemmyErrorType},
settings::structs::{PictrsImageMode, Settings},
spawn_try_task,
version::VERSION,
REQWEST_TIMEOUT,
};
Expand Down Expand Up @@ -82,6 +90,50 @@ pub async fn fetch_link_metadata_opt(
_ => Default::default(),
}
}
/// Generate post thumbnail in background task, because some sites can be very slow to respond.
///
/// Takes a callback to generate a send activity task, so that post can be federated with metadata.
pub fn generate_post_link_metadata(
post: Post,
custom_thumbnail: Option<Url>,
send_activity: impl FnOnce(Post) -> Option<SendActivityData> + Send + 'static,
local_site: Option<LocalSite>,
context: Data<LemmyContext>,
) {
spawn_try_task(async move {
let allow_sensitive = local_site_opt_to_sensitive(&local_site);
let page_is_sensitive = post.nsfw;
let allow_generate_thumbnail = allow_sensitive || !page_is_sensitive;
let mut thumbnail_url = custom_thumbnail.or_else(|| post.thumbnail_url.map(Into::into));
let do_generate_thumbnail = thumbnail_url.is_none() && allow_generate_thumbnail;

// Generate local thumbnail only if no thumbnail was federated and 'sensitive' attributes allow it.
let metadata = fetch_link_metadata_opt(
post.url.map(Into::into).as_ref(),
do_generate_thumbnail,
&context,
)
.await;
if let Some(thumbnail_url_) = metadata.thumbnail {
thumbnail_url = Some(thumbnail_url_.into());
}
let thumbnail_url = proxy_image_link_opt_apub(thumbnail_url, &context).await?;

let form = PostUpdateForm {
embed_title: Some(metadata.opengraph_data.title),
embed_description: Some(metadata.opengraph_data.description),
embed_video_url: Some(metadata.opengraph_data.embed_video_url),
thumbnail_url: Some(thumbnail_url),
url_content_type: Some(metadata.content_type),
..Default::default()
};
let updated_post = Post::update(&mut context.pool(), post.id, &form).await?;
if let Some(send_activity) = send_activity(updated_post) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

ActivityChannel::submit_activity(send_activity, &context).await?;
}
Ok(())
});
}

/// Extract site metadata from HTML Opengraph attributes.
fn extract_opengraph_data(html_bytes: &[u8], url: &Url) -> Result<OpenGraphData, LemmyError> {
Expand Down
33 changes: 11 additions & 22 deletions crates/api_crud/src/post/create.rs
Expand Up @@ -4,8 +4,8 @@ use lemmy_api_common::{
build_response::build_post_response,
context::LemmyContext,
post::{CreatePost, PostResponse},
request::fetch_link_metadata_opt,
send_activity::{ActivityChannel, SendActivityData},
request::generate_post_link_metadata,
send_activity::SendActivityData,
utils::{
check_community_user_action,
generate_local_apub_endpoint,
Expand Down Expand Up @@ -75,6 +75,7 @@ pub async fn create_post(
is_url_blocked(&url, &url_blocklist)?;
check_url_scheme(&url)?;
check_url_scheme(&custom_thumbnail)?;
let url = proxy_image_link_opt_apub(url, &context).await?;

check_community_user_action(
&local_user_view.person,
Expand All @@ -98,18 +99,6 @@ pub async fn create_post(
}
}

// Only generate the thumbnail if there's no custom thumbnail provided,
// otherwise it will save it in pictrs
let generate_thumbnail = custom_thumbnail.is_none();

// Fetch post links and pictrs cached image
let metadata = fetch_link_metadata_opt(url.as_ref(), generate_thumbnail, &context).await;
let url = proxy_image_link_opt_apub(url, &context).await?;
let thumbnail_url = proxy_image_link_opt_apub(custom_thumbnail, &context)
.await?
.map(Into::into)
.or(metadata.thumbnail);

// Only need to check if language is allowed in case user set it explicitly. When using default
// language, it already only returns allowed languages.
CommunityLanguage::is_allowed_community_language(
Expand All @@ -134,18 +123,13 @@ pub async fn create_post(

let post_form = PostInsertForm::builder()
.name(data.name.trim().to_string())
.url_content_type(metadata.content_type)
.url(url)
.body(body)
.alt_text(data.alt_text.clone())
.community_id(data.community_id)
.creator_id(local_user_view.person.id)
.nsfw(data.nsfw)
.embed_title(metadata.opengraph_data.title)
.embed_description(metadata.opengraph_data.description)
.embed_video_url(metadata.opengraph_data.embed_video_url)
.language_id(language_id)
.thumbnail_url(thumbnail_url)
.build();

let inserted_post = Post::create(&mut context.pool(), &post_form)
Expand All @@ -170,6 +154,14 @@ pub async fn create_post(
.await
.with_lemmy_type(LemmyErrorType::CouldntCreatePost)?;

generate_post_link_metadata(
updated_post.clone(),
custom_thumbnail,
|post| Some(SendActivityData::CreatePost(post)),
Some(local_site),
context.reset_request_count(),
);

// They like their own post by default
let person_id = local_user_view.person.id;
let post_id = inserted_post.id;
Expand All @@ -183,9 +175,6 @@ pub async fn create_post(
.await
.with_lemmy_type(LemmyErrorType::CouldntLikePost)?;

ActivityChannel::submit_activity(SendActivityData::CreatePost(updated_post.clone()), &context)
.await?;

// Mark the post as read
mark_post_as_read(person_id, post_id, &mut context.pool()).await?;

Expand Down
46 changes: 9 additions & 37 deletions crates/api_crud/src/post/update.rs
Expand Up @@ -4,8 +4,8 @@ use lemmy_api_common::{
build_response::build_post_response,
context::LemmyContext,
post::{EditPost, PostResponse},
request::fetch_link_metadata,
send_activity::{ActivityChannel, SendActivityData},
request::generate_post_link_metadata,
send_activity::SendActivityData,
utils::{
check_community_user_action,
get_url_blocklist,
Expand Down Expand Up @@ -84,40 +84,11 @@ pub async fn update_post(
Err(LemmyErrorType::NoPostEditAllowed)?
}

// Fetch post links and thumbnail if url was updated
let (embed_title, embed_description, embed_video_url, metadata_thumbnail, metadata_content_type) =
match &url {
Some(url) => {
// Only generate the thumbnail if there's no custom thumbnail provided,
// otherwise it will save it in pictrs
let generate_thumbnail = custom_thumbnail.is_none() || orig_post.thumbnail_url.is_none();

let metadata = fetch_link_metadata(url, generate_thumbnail, &context).await?;
(
Some(metadata.opengraph_data.title),
Some(metadata.opengraph_data.description),
Some(metadata.opengraph_data.embed_video_url),
Some(metadata.thumbnail),
Some(metadata.content_type),
)
}
_ => Default::default(),
};

let url = match url {
Some(url) => Some(proxy_image_link_opt_apub(Some(url), &context).await?),
_ => Default::default(),
};

let custom_thumbnail = match custom_thumbnail {
Some(custom_thumbnail) => {
Some(proxy_image_link_opt_apub(Some(custom_thumbnail), &context).await?)
}
_ => Default::default(),
};

let thumbnail_url = custom_thumbnail.or(metadata_thumbnail);

let language_id = data.language_id;
CommunityLanguage::is_allowed_community_language(
&mut context.pool(),
Expand All @@ -129,15 +100,10 @@ pub async fn update_post(
let post_form = PostUpdateForm {
name: data.name.clone(),
url,
url_content_type: metadata_content_type,
body: diesel_option_overwrite(body),
alt_text: diesel_option_overwrite(data.alt_text.clone()),
nsfw: data.nsfw,
embed_title,
embed_description,
embed_video_url,
language_id: data.language_id,
thumbnail_url,
updated: Some(Some(naive_now())),
..Default::default()
};
Expand All @@ -147,7 +113,13 @@ pub async fn update_post(
.await
.with_lemmy_type(LemmyErrorType::CouldntUpdatePost)?;

ActivityChannel::submit_activity(SendActivityData::UpdatePost(updated_post), &context).await?;
generate_post_link_metadata(
updated_post.clone(),
custom_thumbnail,
|post| Some(SendActivityData::UpdatePost(post)),
Some(local_site),
context.reset_request_count(),
);

build_post_response(
context.deref(),
Expand Down