federation: parallel sending per instance #4623
Open
phiresky wants to merge 29 commits into main from federation-send-parallel
+692 −315
Changes from 9 commits
Commits
539f06a federation: parallel sending (phiresky)
491daab federation: some comments (phiresky)
987174a lint and set force_write true when a request fails (phiresky)
a66aec6 inbox_urls return vec (phiresky)
a3d705f split inbox functions into separate file (phiresky)
7eedcb7 cleanup (phiresky)
e719baf extract sending task code to separate file (phiresky)
5e986ef move federation concurrent config to config file (phiresky)
c1932f9 off by one issue (phiresky)
a7c7abd improve msg (phiresky)
13ff059 fix both permanent stopping of federation queues and multiple creatio… (phiresky)
7cb4e82 Merge branch 'fix-dupe-activity-sending' into federation-send-parallel (phiresky)
10d3b7d fix after merge (phiresky)
ffb99cd Merge remote-tracking branch 'origin/main' into federation-send-parallel (phiresky)
a0b0a7a lint fix (phiresky)
cdff275 Update crates/federate/src/send.rs (phiresky)
175133f comment about reverse ordering (phiresky)
2acdc78 remove crashable, comment (phiresky)
9d87921 comment (phiresky)
5538794 move comment (phiresky)
7ee63f4 run federation tests twice (phiresky)
3784b7f fix test run (phiresky)
c2d18d3 prettier (phiresky)
5a418ac fix config default (phiresky)
c66bf26 upgrade rust to 1.78 to fix diesel cli (phiresky)
1c1018b Merge remote-tracking branch 'origin/upgrade-rust' into federation-se… (phiresky)
101901b fix clippy (phiresky)
dfccf3e delay (phiresky)
2dd7b71 Merge branch 'main' into federation-send-parallel (phiresky)
@@ -0,0 +1,149 @@
use crate::util::LEMMY_TEST_FAST_FEDERATION;
use anyhow::Result;
use chrono::{DateTime, TimeZone, Utc};
use lemmy_db_schema::{
  newtypes::{CommunityId, InstanceId},
  source::{activity::SentActivity, site::Site},
  utils::{ActualDbPool, DbPool},
};
use lemmy_db_views_actor::structs::CommunityFollowerView;
use once_cell::sync::Lazy;
use reqwest::Url;
use std::collections::{HashMap, HashSet};

/// interval with which new additions to community_followers are queried.
///
/// The first time some user on an instance follows a specific remote community (or, more precisely: the first time a (followed_community_id, follower_inbox_url) tuple appears),
/// this delay limits the maximum time until the follow actually results in activities from that community id being sent to that inbox url.
/// This delay currently needs to not be too small because the DB load is currently fairly high because of the current structure of storing inboxes for every person, not having a separate list of shared_inboxes, and the architecture of having every instance queue be fully separate.
/// (see https://github.com/LemmyNet/lemmy/issues/3958)
static FOLLOW_ADDITIONS_RECHECK_DELAY: Lazy<chrono::TimeDelta> = Lazy::new(|| {
  if *LEMMY_TEST_FAST_FEDERATION {
    chrono::TimeDelta::try_seconds(1).expect("TimeDelta out of bounds")
  } else {
    chrono::TimeDelta::try_minutes(2).expect("TimeDelta out of bounds")
  }
});
/// The same as FOLLOW_ADDITIONS_RECHECK_DELAY, but triggering when the last person on an instance unfollows a specific remote community.
/// This is expected to happen pretty rarely and updating it in a timely manner is not too important.
static FOLLOW_REMOVALS_RECHECK_DELAY: Lazy<chrono::TimeDelta> =
  Lazy::new(|| chrono::TimeDelta::try_hours(1).expect("TimeDelta out of bounds"));

pub(crate) struct CommunityInboxCollector {
  // load site lazily because if an instance is first seen due to being on allowlist,
  // the corresponding row in `site` may not exist yet since that is only added once
  // `fetch_instance_actor_for_object` is called.
  // (this should be unlikely to be relevant outside of the federation tests)
  site_loaded: bool,
  site: Option<Site>,
  followed_communities: HashMap<CommunityId, HashSet<Url>>,
  last_full_communities_fetch: DateTime<Utc>,
  last_incremental_communities_fetch: DateTime<Utc>,
  instance_id: InstanceId,
  domain: String,
  pool: ActualDbPool,
}
impl CommunityInboxCollector {
  pub fn new(
    pool: ActualDbPool,
    instance_id: InstanceId,
    domain: String,
  ) -> CommunityInboxCollector {
    CommunityInboxCollector {
      pool,
      site_loaded: false,
      site: None,
      followed_communities: HashMap::new(),
      last_full_communities_fetch: Utc.timestamp_nanos(0),
      last_incremental_communities_fetch: Utc.timestamp_nanos(0),
      instance_id,
      domain,
    }
  }
  /// get inbox urls of sending the given activity to the given instance
  /// most often this will return 0 values (if instance doesn't care about the activity)
  /// or 1 value (the shared inbox)
  /// > 1 values only happens for non-lemmy software
  pub async fn get_inbox_urls(&mut self, activity: &SentActivity) -> Result<Vec<Url>> {
    let mut inbox_urls: HashSet<Url> = HashSet::new();

    if activity.send_all_instances {
      if !self.site_loaded {
        self.site = Site::read_from_instance_id(&mut self.pool(), self.instance_id).await?;
        self.site_loaded = true;
      }
      if let Some(site) = &self.site {
        // Nutomic: Most non-lemmy software wont have a site row. That means it cant handle these activities. So handling it like this is fine.
        inbox_urls.insert(site.inbox_url.inner().clone());
      }
    }
    if let Some(t) = &activity.send_community_followers_of {
      if let Some(urls) = self.followed_communities.get(t) {
        inbox_urls.extend(urls.iter().cloned());
      }
    }
    inbox_urls.extend(
      activity
        .send_inboxes
        .iter()
        .filter_map(std::option::Option::as_ref)
        .filter(|&u| (u.domain() == Some(&self.domain)))
        .map(|u| u.inner().clone()),
    );
    Ok(inbox_urls.into_iter().collect())
  }

  pub async fn update_communities(&mut self) -> Result<()> {
    if (Utc::now() - self.last_full_communities_fetch) > *FOLLOW_REMOVALS_RECHECK_DELAY {
      tracing::debug!("{}: fetching full list of communities", self.domain);
      // process removals every hour
      (self.followed_communities, self.last_full_communities_fetch) = self
        .get_communities(self.instance_id, Utc.timestamp_nanos(0))
        .await?;
      self.last_incremental_communities_fetch = self.last_full_communities_fetch;
    }
    if (Utc::now() - self.last_incremental_communities_fetch) > *FOLLOW_ADDITIONS_RECHECK_DELAY {
      // process additions every 2 minutes (FOLLOW_ADDITIONS_RECHECK_DELAY)
      let (news, time) = self
        .get_communities(self.instance_id, self.last_incremental_communities_fetch)
        .await?;
      if !news.is_empty() {
        tracing::debug!(
          "{}: fetched {} incremental new followed communities",
          self.domain,
          news.len()
        );
      }
      self.followed_communities.extend(news);
      self.last_incremental_communities_fetch = time;
    }
    Ok(())
  }

  /// get a list of local communities with the remote inboxes on the given instance that cares about them
  async fn get_communities(
    &mut self,
    instance_id: InstanceId,
    last_fetch: DateTime<Utc>,
  ) -> Result<(HashMap<CommunityId, HashSet<Url>>, DateTime<Utc>)> {
    let new_last_fetch =
      Utc::now() - chrono::TimeDelta::try_seconds(10).expect("TimeDelta out of bounds"); // update to time before fetch to ensure overlap. subtract 10s to ensure overlap even if published date is not exact

    Ok((
      CommunityFollowerView::get_instance_followed_community_inboxes(
        &mut self.pool(),
        instance_id,
        last_fetch,
      )
      .await?
      .into_iter()
      .fold(HashMap::new(), |mut map, (c, u)| {
        map.entry(c).or_default().insert(u.into());
        map
      }),
      new_last_fetch,
    ))
  }
  fn pool(&self) -> DbPool<'_> {
    DbPool::Pool(&self.pool)
  }
}
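The refresh schedule in update_communities can be modelled in isolation. The following is a minimal sketch, not the PR's code: it uses std::time::Duration offsets from an arbitrary start in place of chrono timestamps, leaves out the database entirely, and the names RefreshState, Refresh and update are hypothetical. It assumes the production delays from the diff (2 minutes for additions, 1 hour for removals) and the 10-second overlap from get_communities.

```rust
use std::time::Duration;

// Values taken from the diff (production mode, not LEMMY_TEST_FAST_FEDERATION).
const ADDITIONS_RECHECK: Duration = Duration::from_secs(2 * 60);
const REMOVALS_RECHECK: Duration = Duration::from_secs(60 * 60);
// get_communities records "now - 10s" so that consecutive fetches overlap.
const OVERLAP: Duration = Duration::from_secs(10);

/// Which kind of fetch a call would perform (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Refresh {
  Skip,
  Incremental,
  Full,
}

/// Both timestamps start at zero, like `Utc.timestamp_nanos(0)` in the diff,
/// so the very first call always performs a full fetch.
#[derive(Debug, Default)]
pub struct RefreshState {
  last_full: Duration,
  last_incremental: Duration,
}

impl RefreshState {
  pub fn update(&mut self, now: Duration) -> Refresh {
    let fetched_at = now.saturating_sub(OVERLAP);
    if now.saturating_sub(self.last_full) > REMOVALS_RECHECK {
      // A full fetch replaces the whole map, which is how removals are picked up.
      self.last_full = fetched_at;
      self.last_incremental = fetched_at;
      // The code in the diff falls through to the incremental check here.
      // With these constants that check cannot fire right after a full
      // fetch (10s is not greater than 2min), so returning early is equivalent.
      return Refresh::Full;
    }
    if now.saturating_sub(self.last_incremental) > ADDITIONS_RECHECK {
      // An incremental fetch only asks for rows newer than the last fetch.
      self.last_incremental = fetched_at;
      return Refresh::Incremental;
    }
    Refresh::Skip
  }
}
```

Calling update with increasing offsets shows the pattern: a full fetch on the first call, incremental fetches once more than 2 minutes have passed since the last recorded fetch time, and another full fetch once more than an hour has passed since the last full one.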
As a note, I've not made any changes to this code, just moved it into a separate struct.
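One detail of the moved code that may be worth checking: the incremental path calls self.followed_communities.extend(news), and HashMap's extend replaces the value stored under an existing key rather than merging into it. Whether that matters depends on what the incremental query returns for a community that already has entries, which this diff does not show; for instances with a single shared inbox per community the two behaviours give the same result. The sketch below only illustrates the difference. The type aliases and both function names are hypothetical stand-ins, not code from the PR.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for the real CommunityId and Url types.
type CommunityId = u32;
type Inbox = &'static str;
type Followed = HashMap<CommunityId, HashSet<Inbox>>;

/// Same call shape as in the diff: an existing key gets its whole set replaced.
pub fn apply_with_extend(current: &mut Followed, news: Followed) {
  current.extend(news);
}

/// Hypothetical alternative: union the new inboxes into the existing set.
pub fn apply_with_merge(current: &mut Followed, news: Followed) {
  for (community, inboxes) in news {
    current.entry(community).or_default().extend(inboxes);
  }
}
```

If community 1 already maps to one inbox and an incremental result for community 1 contains a different inbox, apply_with_extend leaves one inbox in the set and apply_with_merge leaves two.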