Skip to content

Tag visibility-permission scope regresses by 30+ extra DB queries per discussion list (2.0-rc.1 vs 1.8) #4605

@ekumanov

Description

@ekumanov

Problem

On a forum with the flarum/tags extension enabled, listing discussions on 2.0-rc.1 issues dozens more SQL queries than 1.8 for the same content. We measured 120 Execute statements on /api/discussions?page[limit]=20 under 2.0-rc.1 vs 29 on 1.8.16 with identical schema and data — a 4.1× query-count regression. The two new query templates that account for almost all of the regression:

Template 2.0 calls 1.8 calls
SELECT permission FROM group_permission WHERE group_id IN (?) 36 1
SELECT id, is_restricted FROM tags 6 0

Root cause

Flarum\Tags\Tag::scopeWhereHasPermission was rewritten between 1.x and 2.0:

1.8 (extensions/tags/src/Tag.php#L243 in flarum/framework v1.8.16): builds a single SQL subquery that resolves permitted tag IDs inline. The actor's getPermissions() is called once (and is User-instance-cached); no other DB hit per scope application.

2.0-rc.1 (extensions/tags/src/Tag.php#L233-L289): pulls user permissions, fetches all tags from the DB into PHP, filters in PHP, then builds an IN-clause. The result is wrapped in a WeakMap<User, array<string,mixed>> cache (Tag::$permittedTagIdCache).

$userCache = static::$permittedTagIdCache[$user] ?? [];
if (array_key_exists($currPermission, $userCache)) return $userCache[$currPermission];
// cache miss path:
$user->getPermissions();                                        // → 1 SELECT (per User instance)
$allTags = static::query()->select(['id', 'is_restricted'])->get();  // → 1 SELECT (per cache miss)
$permitted = $allTags->filter(...)->pluck('id')->all();
// builds IN-clause from $permitted

The WeakMap cache is keyed by the User object instance. Every fresh User instance loaded by Eloquent — and a discussion list loads many: discussion.user, discussion.lastPostedUser, firstPost.user, lastPostedUser.groups — gets its own cache slot. Even though most of those users share an identical group composition (e.g. [guest, member, group6]), each instance independently misses the cache and re-runs both queries.

We captured the parameters of the 30 group_permission selects in one request — they are all IN (2, 3, 6) (the actor's groups), repeated 30 times because 30 distinct User instances each independently triggered the cache miss:

30× group_id IN (2, 3, 6)    ← actor's groups, repeated for every fresh User instance
 3× group_id IN (2, 3)
 1× group_id IN (2, 3, 6, 7)
 1× group_id IN (2, 3, 4, 6)
 1× group_id IN (2, 3, 4)
 1× group_id IN (2)          ← guest

Reproducer

Stock 2.0-rc.1 install with flarum/tags, ≥20 discussions across multiple tags (some restricted, some not), at least one non-trivial group:

mysql -e "SET GLOBAL general_log='ON'; SET GLOBAL general_log_file='/tmp/g.log';"
curl -s -o /dev/null 'http://your-forum/api/discussions?page%5Blimit%5D=20'
mysql -e "SET GLOBAL general_log='OFF';"
grep -c 'group_permission' /tmp/g.log   # expect ~30+
grep -c 'select `id`, `is_restricted` from `tags`' /tmp/g.log   # expect ~6

Suggested fix directions

  1. Restore the 1.8 subquery approach (preferred). The per-PHP filter is the regression — 1.8's all-in-SQL path was bounded to one query per request regardless of how many User instances exist.
  2. Re-key the cache by the group-id-set string (e.g. "2,3,6" → permission list) rather than by User instance. Multiple User instances sharing groups would then share the cache. Trivial change; almost as good as Interface redesign #1 in practice.
  3. Preload Permission::all() at request boot into a request-scoped registry. Permissions are a small table; a single load eliminates per-User-instance queries entirely.

Impact

On a real production forum (~927 users, ~2266 discussions, ~54k posts), /forum/ HTML render: 1.8.16 = 0.38s, 2.0-rc.1 = 1.27s on identical hardware. Removing this regression alone (independent of an unrelated polls extension regression filed separately at FriendsOfFlarum/polls) should drop 2.0 to ~0.85–0.90s based on the query-count delta and observed per-query cost.

Profiling was done with php-spx cookie-gated, plus MySQL general_log for query capture. Happy to share raw flame graphs if useful.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions