Don't follow type tables for incremental sitemaps (#2393)
Incremental sitemaps ingest replication packets and follow the tables
therein to find changes to pages in the sitemaps.  If a type table is
updated, like release_group_primary_type for example, we'd end up trying
to fetch every page for every release group using a modified type.  (It
doesn't matter which columns were changed.)  I can't think of a
situation where changing a type would significantly affect the content
of so many pages that we'd want search engines to re-index them all, so
I'm having the code skip these type tables entirely.
mwiencek committed Jan 24, 2022
1 parent 4f5f74d commit e774903
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion lib/MusicBrainz/Server/Sitemap/Incremental.pm
@@ -198,12 +198,15 @@ sub build_page_url_from_row {
sub should_follow_table {
    my ($self, $table) = @_;

    return 0 if $table eq 'cover_art_archive.cover_art_type';
    return 0 if $table eq 'musicbrainz.cdtoc';
    return 0 if $table eq 'musicbrainz.language';
    return 0 if $table eq 'musicbrainz.medium_cdtoc';
    return 0 if $table eq 'musicbrainz.medium_index';
    return 0 if $table eq 'musicbrainz.release_packaging';
    return 0 if $table eq 'musicbrainz.release_status';
    return 0 if $table eq 'musicbrainz.script';

    return 0 if $table =~ /_type$/;
    return 0 if $table =~ qr'[._](tag_|tag$)';
    return 0 if $table =~ qr'_(meta|raw|gid_redirect)$';
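
To illustrate the effect of the new `/_type$/` check, here is a minimal standalone sketch of the checks visible in this hunk. It is not the actual method: `should_follow_table_sketch` is a hypothetical name, the exact-match list is folded into a hash for brevity, and the final `return 1` is an assumption, since the real `should_follow_table` continues past the end of the hunk. The sample table names are illustrative, with the `musicbrainz.` schema prefix assumed for `release_group_primary_type`.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the checks shown in the hunk above.
sub should_follow_table_sketch {
    my ($table) = @_;

    # Tables skipped by exact name (same list as in the hunk).
    my %skipped = map { $_ => 1 } qw(
        cover_art_archive.cover_art_type
        musicbrainz.cdtoc
        musicbrainz.language
        musicbrainz.medium_cdtoc
        musicbrainz.medium_index
        musicbrainz.release_packaging
        musicbrainz.release_status
        musicbrainz.script
    );
    return 0 if $skipped{$table};

    # New in this commit: skip every table whose name ends in "_type".
    return 0 if $table =~ /_type$/;

    # Single-quoted qr'' disables interpolation, so "$)" is not read as
    # the effective-GID variable.
    return 0 if $table =~ qr'[._](tag_|tag$)';
    return 0 if $table =~ qr'_(meta|raw|gid_redirect)$';

    # Assumption: anything not excluded above is followed.
    return 1;
}

# Illustrative table names only.
for my $table (qw(
    musicbrainz.release_group_primary_type
    musicbrainz.release_group
    musicbrainz.artist_tag
    musicbrainz.release_meta
)) {
    printf "%-42s %s\n", $table,
        should_follow_table_sketch($table) ? 'follow' : 'skip';
}
```

With the `_type` pattern in place, the explicit `cover_art_archive.cover_art_type` entry is already covered by the regex; the sketch keeps it only to mirror the hunk.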
