Skip to content

Conversation

@maparent
Copy link
Collaborator

@maparent maparent commented Nov 10, 2025

https://linear.app/discourse-graphs/issue/ENG-1012/update-query-builder-relation-conditions-to-use-reified-relation

Update query builder relation conditions.

Notes:
Right now, we use one of two query paths, according to the use-reified-relations setting.
I had to make changes to the optimizer. (Otherwise queries with or-clause would just fail, or even crash.)
Performance is clearly better than before but a bit less than hoped for.

Summary by CodeRabbit

  • New Features
    • Admin panel redesigned with tabbed interface: "Node list" and "Experiments" sections for improved organization.
    • Experiment settings now toggleable via checkbox with persistent storage.
    • Reified relations support for managing complex node relationships with deduplication control.
    • Relation migration utility to transform existing relation data.

@linear
Copy link

linear bot commented Nov 10, 2025

@supabase
Copy link

supabase bot commented Nov 10, 2025

This pull request has been ignored for the connected project zytfjzqyijgagqxrzbmz because there are no changes detected in packages/database/supabase directory. You can change this behaviour in Project Integrations Settings ↗︎.


Preview Branches by Supabase.
Learn more about Supabase Branching ↗︎.

@maparent maparent marked this pull request as draft November 10, 2025 00:36
@maparent
Copy link
Collaborator Author

@CodeRabbit review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 10, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 10, 2025

📝 Walkthrough

Walkthrough

This PR introduces reified relations support to the Discourse Graph by adding utilities for creating and querying reified blocks, extending the Datalog type system via a local module, refactoring AdminPanel into tabs with feature flag control, consolidating type imports, and conditionally routing Datalog translation through a new reified-relations path.

Changes

Cohort / File(s) Summary
UI Component Updates
apps/roam/src/components/settings/AdminPanel.tsx, apps/roam/src/components/settings/Settings.tsx
AdminPanel now accepts onloadArgs prop and implements a tabbed interface with "Node list" and "Experiments" tabs. Experiments tab includes a checkbox to toggle reified-relations feature. Settings passes onloadArgs to AdminPanel.
New Reified Relations Utilities
apps/roam/src/utils/createReifiedBlock.ts
Adds full workflow for creating and managing reified blocks including query utilities (queryForReifiedBlocks, strictQueryForReifiedBlocks), block creation with deduplication, and relation page management (getOrCreateRelationPageUid, getExistingRelationPageUid).
Relation Data and Migration
apps/roam/src/utils/getRelationData.ts, apps/roam/src/utils/migrateRelations.ts
getRelationData queries relation data by delegating to fireQuery for each relation. migrateRelations orchestrates sequential migration of relations by calling createReifiedRelation for each item.
Extended Datalog Type System
apps/roam/src/utils/extendedDatalogClause.ts
Introduces comprehensive Datalog type definitions including DatalogPredExpr, DatalogExpressionClause, and composite clause types (DatalogAndClause, DatalogOrClause, DatalogNotClause, etc.), consolidating extended Datalog support.
Type Import Consolidation
apps/roam/src/utils/compileDatalog.ts, apps/roam/src/utils/conditionToDatalog.ts, apps/roam/src/utils/discourseNodeFormatToDatalog.ts, apps/roam/src/utils/gatherDatalogVariablesFromClause.ts, apps/roam/src/utils/replaceDatalogVariables.ts, apps/roam/src/utils/predefinedSelections.ts
Refactors DatalogClause imports from external roamjs-components/types module to local ./extendedDatalogClause module. Removes type assertion on sourceUid in predefinedSelections.
Datalog Translator Integration
apps/roam/src/utils/registerDiscourseDatalogTranslators.ts
Adds feature-flag-driven conditional path for reified-relations translation. When enabled, constructs relation metadata queries and forward/reverse clauses via reified-relation page; otherwise uses original edge-triple construction.
Query and Export Utilities
apps/roam/src/utils/fireQuery.ts, apps/roam/src/utils/getExportTypes.ts
fireQuery adjusts clause-scoring logic to downgrade disjunctions/negations more aggressively. getExportTypes refactors to use getRelationDataUtil instead of inlined retrieval, converts async functions to arrows, and adds flatten: true to markdown generation.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant AdminPanel
    participant extensionAPI
    participant createReifiedBlock
    participant fireQuery
    participant roamAlphaAPI

    User->>AdminPanel: Toggle reified-relations setting
    AdminPanel->>extensionAPI: setSetting("use-reified-relations", true)
    
    Note over User,AdminPanel: Migration Workflow (via migrateRelations)
    User->>createReifiedBlock: migrateRelations() called
    createReifiedBlock->>fireQuery: getRelationData() queries all relations
    fireQuery-->>createReifiedBlock: returns relation data array
    
    loop For each relation
        createReifiedBlock->>createReifiedBlock: strictQueryForReifiedBlocks(params)
        alt Block exists (dedup check)
            createReifiedBlock-->>createReifiedBlock: return existing UID
        else Block not found
            createReifiedBlock->>roamAlphaAPI: Generate new UID & create block
            roamAlphaAPI->>roamAlphaAPI: Set discourse-graph properties
            roamAlphaAPI-->>createReifiedBlock: return new block UID
        end
    end
    createReifiedBlock-->>User: Migration complete
Loading
sequenceDiagram
    participant Query
    participant registerDiscourseDatalogTranslators
    participant getSetting
    participant createReifiedBlock
    participant fireQuery

    Query->>registerDiscourseDatalogTranslators: translateEdge(relation)
    registerDiscourseDatalogTranslators->>getSetting: check "use-reified-relations"
    
    alt Reified-relations enabled
        registerDiscourseDatalogTranslators->>createReifiedBlock: getExistingRelationPageUid()
        alt Relation page exists
            registerDiscourseDatalogTranslators->>registerDiscourseDatalogTranslators: Build relClauseBasis queries
            registerDiscourseDatalogTranslators->>registerDiscourseDatalogTranslators: Construct forward/reverse clauses
            registerDiscourseDatalogTranslators->>fireQuery: Execute reified-relations path
            fireQuery-->>registerDiscourseDatalogTranslators: Reified query results
        else Page not found
            registerDiscourseDatalogTranslators->>registerDiscourseDatalogTranslators: Fall back to original path
        end
    else Reified-relations disabled
        registerDiscourseDatalogTranslators->>registerDiscourseDatalogTranslators: Use original edge-triple construction
    end
    
    registerDiscourseDatalogTranslators-->>Query: Return clause set
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • registerDiscourseDatalogTranslators.ts requires careful review of the new reified-relations translation path, variable mapping logic, and interaction with the feature flag. The conditional branching and clause construction logic are dense.
  • createReifiedBlock.ts introduces deduplication logic, state management via cached relationPageUid, and dependency on extension settings; ensure consistency and error handling.
  • Type system changes across six files consolidate imports from external to local module; verify all usages and type safety.
  • getExportTypes.ts refactoring delegates relation-data retrieval; ensure the new pattern maintains expected behavior and error propagation.
  • AdminPanel signature change is a breaking change to a public component; verify all call sites pass onloadArgs correctly.

Possibly related PRs

Pre-merge checks

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'ENG-1012 Update Query Builder Relation Conditions to Use Reified Relation Triples' is clear, specific, and accurately summarizes the main objective of the changeset. It reflects the core work of refactoring relation conditions to use reified relation triples across multiple utility files and components.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
apps/roam/src/utils/migrateRelations.ts (1)

10-17: Guard against duplicate relation triples before creating.

Given the inline note about “likely to create duplicates,” it’d be safer to track processed (source, relUid, target) combos before calling createReifiedRelation. That way reruns or redundant query rows won’t stamp extra reified blocks.

-  for (const rel of relationData) {
-    const uid = await createReifiedRelation({
-      sourceUid: rel.source,
-      destinationUid: rel.target,
-      relationBlockUid: rel.relUid,
-    });
-    // created.add(uid);
-  }
+  const processed = new Set<string>();
+  for (const rel of relationData) {
+    const key = `${rel.source}:${rel.relUid}:${rel.target}`;
+    if (processed.has(key)) continue;
+    processed.add(key);
+    await createReifiedRelation({
+      sourceUid: rel.source,
+      destinationUid: rel.target,
+      relationBlockUid: rel.relUid,
+    });
+  }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08113a9 and 5e38e3d.

📒 Files selected for processing (15)
  • apps/roam/src/components/settings/AdminPanel.tsx (4 hunks)
  • apps/roam/src/components/settings/Settings.tsx (1 hunks)
  • apps/roam/src/utils/compileDatalog.ts (1 hunks)
  • apps/roam/src/utils/conditionToDatalog.ts (1 hunks)
  • apps/roam/src/utils/createReifiedBlock.ts (1 hunks)
  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts (1 hunks)
  • apps/roam/src/utils/extendedDatalogClause.ts (1 hunks)
  • apps/roam/src/utils/fireQuery.ts (4 hunks)
  • apps/roam/src/utils/gatherDatalogVariablesFromClause.ts (1 hunks)
  • apps/roam/src/utils/getExportTypes.ts (5 hunks)
  • apps/roam/src/utils/getRelationData.ts (1 hunks)
  • apps/roam/src/utils/migrateRelations.ts (1 hunks)
  • apps/roam/src/utils/predefinedSelections.ts (2 hunks)
  • apps/roam/src/utils/registerDiscourseDatalogTranslators.ts (3 hunks)
  • apps/roam/src/utils/replaceDatalogVariables.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/main.mdc)

**/*.{ts,tsx}: Prefer type over interface
Use explicit return types for functions
Avoid any types when possible

Files:

  • apps/roam/src/utils/migrateRelations.ts
  • apps/roam/src/utils/replaceDatalogVariables.ts
  • apps/roam/src/utils/compileDatalog.ts
  • apps/roam/src/utils/getRelationData.ts
  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts
  • apps/roam/src/utils/conditionToDatalog.ts
  • apps/roam/src/components/settings/Settings.tsx
  • apps/roam/src/utils/registerDiscourseDatalogTranslators.ts
  • apps/roam/src/utils/createReifiedBlock.ts
  • apps/roam/src/components/settings/AdminPanel.tsx
  • apps/roam/src/utils/gatherDatalogVariablesFromClause.ts
  • apps/roam/src/utils/fireQuery.ts
  • apps/roam/src/utils/extendedDatalogClause.ts
  • apps/roam/src/utils/getExportTypes.ts
  • apps/roam/src/utils/predefinedSelections.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/main.mdc)

**/*.{ts,tsx,js,jsx}: Prefer arrow functions over regular function declarations
Use named parameters (object destructuring) when a function has more than 2 parameters
Use Prettier with the project's configuration
Maintain consistent naming conventions: PascalCase for components and types
Maintain consistent naming conventions: camelCase for variables and functions
Maintain consistent naming conventions: UPPERCASE for constants

Files:

  • apps/roam/src/utils/migrateRelations.ts
  • apps/roam/src/utils/replaceDatalogVariables.ts
  • apps/roam/src/utils/compileDatalog.ts
  • apps/roam/src/utils/getRelationData.ts
  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts
  • apps/roam/src/utils/conditionToDatalog.ts
  • apps/roam/src/components/settings/Settings.tsx
  • apps/roam/src/utils/registerDiscourseDatalogTranslators.ts
  • apps/roam/src/utils/createReifiedBlock.ts
  • apps/roam/src/components/settings/AdminPanel.tsx
  • apps/roam/src/utils/gatherDatalogVariablesFromClause.ts
  • apps/roam/src/utils/fireQuery.ts
  • apps/roam/src/utils/extendedDatalogClause.ts
  • apps/roam/src/utils/getExportTypes.ts
  • apps/roam/src/utils/predefinedSelections.ts
apps/roam/**/*.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)

apps/roam/**/*.{js,jsx,ts,tsx}: Use BlueprintJS 3 components and Tailwind CSS for platform-native UI in the Roam Research extension
Use the roamAlphaApi documentation from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when working with the Roam API
Use Roam Depot/Extension API documentation from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when working with the Roam Extension API

Files:

  • apps/roam/src/utils/migrateRelations.ts
  • apps/roam/src/utils/replaceDatalogVariables.ts
  • apps/roam/src/utils/compileDatalog.ts
  • apps/roam/src/utils/getRelationData.ts
  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts
  • apps/roam/src/utils/conditionToDatalog.ts
  • apps/roam/src/components/settings/Settings.tsx
  • apps/roam/src/utils/registerDiscourseDatalogTranslators.ts
  • apps/roam/src/utils/createReifiedBlock.ts
  • apps/roam/src/components/settings/AdminPanel.tsx
  • apps/roam/src/utils/gatherDatalogVariablesFromClause.ts
  • apps/roam/src/utils/fireQuery.ts
  • apps/roam/src/utils/extendedDatalogClause.ts
  • apps/roam/src/utils/getExportTypes.ts
  • apps/roam/src/utils/predefinedSelections.ts
🧠 Learnings (11)
📚 Learning: 2025-07-19T22:34:23.619Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use Roam Depot/Extension API documentation from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when working with the Roam Extension API

Applied to files:

  • apps/roam/src/utils/compileDatalog.ts
  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts
  • apps/roam/src/utils/conditionToDatalog.ts
  • apps/roam/src/utils/createReifiedBlock.ts
  • apps/roam/src/components/settings/AdminPanel.tsx
  • apps/roam/src/utils/extendedDatalogClause.ts
  • apps/roam/src/utils/predefinedSelections.ts
📚 Learning: 2025-06-22T10:40:52.752Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:18-31
Timestamp: 2025-06-22T10:40:52.752Z
Learning: In apps/roam/src/utils/getAllDiscourseNodesSince.ts, the user confirmed that querying for `?title` with `:node/title` and mapping it to the `text` field in the DiscourseGraphContent type is the correct implementation for retrieving discourse node content from Roam Research, despite it appearing to query page titles rather than block text content.

Applied to files:

  • apps/roam/src/utils/getRelationData.ts
  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts
  • apps/roam/src/utils/registerDiscourseDatalogTranslators.ts
  • apps/roam/src/utils/createReifiedBlock.ts
  • apps/roam/src/utils/getExportTypes.ts
📚 Learning: 2025-06-17T23:42:29.279Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 220
File: apps/roam/src/utils/conceptConversion.ts:95-104
Timestamp: 2025-06-17T23:42:29.279Z
Learning: In the DiscourseGraphs/discourse-graph codebase, DiscourseRelation type properties are either string or Triple[], and the STANDARD_ROLES filter in conceptConversion.ts excludes Triple[] properties, so only string values remain after filtering.

Applied to files:

  • apps/roam/src/utils/getRelationData.ts
📚 Learning: 2025-06-22T10:40:21.679Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:15-16
Timestamp: 2025-06-22T10:40:21.679Z
Learning: In the getAllDiscourseNodesSince function in apps/roam/src/utils/getAllDiscourseNodesSince.ts, date validation is performed before the function is called, so additional date validation within the function is not needed.

Applied to files:

  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts
📚 Learning: 2025-06-17T23:37:45.289Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 220
File: apps/roam/src/utils/conceptConversion.ts:42-56
Timestamp: 2025-06-17T23:37:45.289Z
Learning: In the DiscourseNode interface from apps/roam/src/utils/getDiscourseNodes.ts, the field `type` serves as the unique identifier field, not a type classification field. The interface has no `uid` or `id` field, making `node.type` the correct field to use for UID-related operations.

Applied to files:

  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts
📚 Learning: 2025-07-19T22:34:23.619Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use the roamAlphaApi documentation from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when working with the Roam API

Applied to files:

  • apps/roam/src/utils/discourseNodeFormatToDatalog.ts
  • apps/roam/src/utils/createReifiedBlock.ts
  • apps/roam/src/components/settings/AdminPanel.tsx
  • apps/roam/src/utils/predefinedSelections.ts
📚 Learning: 2025-11-05T21:57:14.909Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 534
File: apps/roam/src/utils/createReifiedBlock.ts:40-48
Timestamp: 2025-11-05T21:57:14.909Z
Learning: In the discourse-graph repository, the function `getPageUidByPageTitle` from `roamjs-components/queries/getPageUidByPageTitle` is a synchronous function that returns a string directly (the page UID or an empty string if not found), not a Promise. It should be called without `await`.

Applied to files:

  • apps/roam/src/utils/registerDiscourseDatalogTranslators.ts
  • apps/roam/src/utils/createReifiedBlock.ts
  • apps/roam/src/utils/getExportTypes.ts
📚 Learning: 2025-11-06T13:48:35.007Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 0
File: :0-0
Timestamp: 2025-11-06T13:48:35.007Z
Learning: In apps/roam/src/utils/createReifiedBlock.ts, the `setBlockProps` function does not need await as it is synchronous or fire-and-forget.

Applied to files:

  • apps/roam/src/utils/createReifiedBlock.ts
📚 Learning: 2025-07-19T22:34:23.619Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use BlueprintJS 3 components and Tailwind CSS for platform-native UI in the Roam Research extension

Applied to files:

  • apps/roam/src/components/settings/AdminPanel.tsx
📚 Learning: 2025-08-11T19:09:58.252Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 337
File: apps/roam/src/components/DiscourseFloatingMenu.tsx:43-43
Timestamp: 2025-08-11T19:09:58.252Z
Learning: The roam subdirectory (apps/roam) is constrained to React 17 and cannot use React 18+ features like createRoot API. ReactDOM.render should be used instead of createRoot in this subdirectory.

Applied to files:

  • apps/roam/src/components/settings/AdminPanel.tsx
📚 Learning: 2025-09-01T19:09:07.052Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 0
File: :0-0
Timestamp: 2025-09-01T19:09:07.052Z
Learning: User prefers to avoid making types explicit in each require statement due to anti-DRY concerns when working with module imports in TypeScript.

Applied to files:

  • apps/roam/src/utils/gatherDatalogVariablesFromClause.ts
🔇 Additional comments (16)
apps/roam/src/components/settings/Settings.tsx (1)

232-233: Prop wiring looks good.

Passing onloadArgs keeps the settings modal aligned with the updated AdminPanel API—no further action needed.

apps/roam/src/utils/discourseNodeFormatToDatalog.ts (1)

1-1: Local clause type import keeps things consistent.

Thanks for migrating this file to the shared extended clause types; it stays aligned with the rest of the refactor.

apps/roam/src/utils/replaceDatalogVariables.ts (1)

1-2: Shared clause typing looks good.

Pulling DatalogClause from the local extended definition keeps the replacement helper in sync with the new type system.

apps/roam/src/utils/fireQuery.ts (1)

65-160: Heuristic downgrade for uncaptured OR/NOT clauses makes sense.

Giving uncaptured or/not (and the residual and) clauses a worse score should stop them from jumping ahead of selective patterns, which fits the optimizer’s intent.

apps/roam/src/utils/getRelationData.ts (1)

8-60: Nice extraction of relation lookup logic.

Lifting the shared querying into getRelationDataUtil plus a thin default export keeps migration and export paths in sync without duplicating the fireQuery plumbing.

apps/roam/src/utils/predefinedSelections.ts (1)

1-2: Import realignment matches the other utils.

Good to see the selection helpers sharing the extended clause typing as well.

apps/roam/src/utils/getExportTypes.ts (5)

16-16: LGTM—delegating relation data retrieval to a shared utility.

This import change aligns with the PR objective to centralize relation data logic.


44-47: LGTM—simplified children handling.

The direct array access without intermediate casting is cleaner while maintaining the same functionality.


376-377: LGTM—extracted relation data logic to a shared utility.

Moving this logic to getRelationDataUtil improves code reuse and maintainability, consistent with the PR's objective to consolidate relation handling.


422-535: LGTM—refactored to arrow function per coding guidelines.

The conversion to an arrow function aligns with the project's coding guidelines. The logic for handling discourse context, references, and frontmatter remains unchanged.


639-749: LGTM—improved code organization with helper functions.

The refactor introduces well-structured helper functions (treeNodeToMarkdown, getMarkdownContent, getDiscourseResultsContent, getReferenceResultsContent) that make the PDF export logic more modular and maintainable. The addition of flatten: true to the toMarkdown options is appropriate for PDF rendering, where a flat structure without indentation is typically preferred.

apps/roam/src/utils/registerDiscourseDatalogTranslators.ts (5)

26-27: LGTM—new imports for reified relations feature.

The imports of getSetting and getExistingRelationPageUid are appropriate for the feature-flagged reified relations support being introduced.


32-42: LGTM—useful helper for clause construction.

The singleClause helper provides a clean way to collapse clause arrays into a single clause, with appropriate handling for empty, single, and multiple clause scenarios.


638-848: LGTM—preserved fallback logic with improved OR-join variable handling.

The non-reified path maintains the original triple-based implementation as a fallback. The enhanced OR-join variable collection at lines 828-834 correctly computes the intersection of variables across all alternatives, ensuring only variables present in all branches are included in the join clause:

const orJoinedVars = collectVariables(andParts[0]);
andParts.slice(1).forEach((a) => {
  const freeVars = collectVariables(a);
  Array.from(orJoinedVars).forEach((v) => {
    if (!freeVars.has(v)) orJoinedVars.delete(v);
  });
});

This is the correct approach for OR-joins in Datalog, where only shared variables can be safely joined.


384-637: Remove the flagged concern—the code correctly handles both-UIDs.

The reified relations logic properly constructs clauses when both source and target are UIDs. When sourceAsUid && targetAsUid, the code correctly produces:

  • forwardClauses: relSource = sourceAsUid + relTarget = targetAsUid
  • reverseClauses: relTarget = sourceAsUid + relSource = targetAsUid

The else blocks skip data-pattern clauses because both endpoints are already constrained as UIDs—this is the correct behavior, not an error. The console.warn is a development-time indicator, not a sign of unhandled logic. No exception should be thrown; this is a valid, working scenario.

Likely an incorrect or invalid review comment.


2-6: Import change verified and backward compatible.

All usages of DatalogClause, DatalogAndClause, and DatalogOrClause throughout the codebase are already referencing the local extendedDatalogClause module. The extended type definitions maintain the same structural contract while adding support for the equality predicate ("="). No breaking changes detected across seven utility files that consume these types.

Comment on lines 11 to 25
const query = `[:find ?u ?d
:in $ ${paramsAsSeq.map(([k]) => "?" + k).join(" ")}
:where [?s :block/uid ?u] [?s :block/props ?p] [(get ?p :discourse-graph) ?d]
${paramsAsSeq.map(([k]) => `[(get ?d :${k}) ?${k}]`).join(" ")} ]`;
Copy link
Contributor

@coderabbitai coderabbitai bot Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Harden query builder against unsanitized parameter keys

parameterUids keys come straight from schema definitions (i.e., user-authored page titles). As soon as a key contains whitespace, quotes, brackets, or any other EDN-reserved character, the interpolated Datalog snippet becomes syntactically invalid, and maliciously crafted names can even splice extra clauses into the query. We need to stop embedding raw keys in the query text and treat them as parameters instead (convert to keywords inside the query).

Please refactor this block along these lines:

-  const query = `[:find ?u ?d
-  :in $ ${paramsAsSeq.map(([k]) => "?" + k).join(" ")}
-  :where [?s :block/uid ?u] [?s :block/props ?p] [(get ?p :discourse-graph) ?d]
-  ${paramsAsSeq.map(([k]) => `[(get ?d :${k}) ?${k}]`).join(" ")} ]`;
-  return (await window.roamAlphaAPI.data.async.q(
-    query,
-    ...paramsAsSeq.map(([, v]) => v),
-  )) as [string, Record<string, string>][];
+  const inBindings = paramsAsSeq
+    .map((_, index) => `?key${index} ?value${index}`)
+    .join(" ");
+  const whereKeyClauses = paramsAsSeq
+    .map(
+      (_, index) =>
+        `[(keyword ?key${index}) ?kw${index}] [(get ?d ?kw${index}) ?value${index}]`,
+    )
+    .join("\n    ");
+  const query = `[:find ?u ?d
+  :in $${inBindings ? ` ${inBindings}` : ""}
+  :where
+    [?s :block/uid ?u]
+    [?s :block/props ?p]
+    [(get ?p :discourse-graph) ?d]${
+      whereKeyClauses ? `\n    ${whereKeyClauses}` : ""
+    }]`;
+  const args = paramsAsSeq.flatMap(([k, v]) => [k, v]);
+  return (await window.roamAlphaAPI.data.async.q(query, ...args)) as [
+    string,
+    Record<string, string>,
+  ][];

This keeps the query well-formed regardless of the characters present in schema keys and closes the injection vector.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const query = `[:find ?u ?d
:in $ ${paramsAsSeq.map(([k]) => "?" + k).join(" ")}
:where [?s :block/uid ?u] [?s :block/props ?p] [(get ?p :discourse-graph) ?d]
${paramsAsSeq.map(([k]) => `[(get ?d :${k}) ?${k}]`).join(" ")} ]`;
const inBindings = paramsAsSeq
.map((_, index) => `?key${index} ?value${index}`)
.join(" ");
const whereKeyClauses = paramsAsSeq
.map(
(_, index) =>
`[(keyword ?key${index}) ?kw${index}] [(get ?d ?kw${index}) ?value${index}]`,
)
.join("\n ");
const query = `[:find ?u ?d
:in $${inBindings ? ` ${inBindings}` : ""}
:where
[?s :block/uid ?u]
[?s :block/props ?p]
[(get ?p :discourse-graph) ?d]${
whereKeyClauses ? `\n ${whereKeyClauses}` : ""
}]`;
const args = paramsAsSeq.flatMap(([k, v]) => [k, v]);
return (await window.roamAlphaAPI.data.async.q(query, ...args)) as [
string,
Record<string, string>,
][];

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting point. I do not know that we have a good entry point to add parameters; but this could be useful. Worth discussing, probably in another scope. Meanwhile, I propose we check against a regex at input.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maparent, understood—happy to defer the parameterization discussion to a follow-up. For the regex validation approach, you'll want to ensure parameter keys are valid Datalog identifiers. A pattern like /^[a-zA-Z][a-zA-Z0-9_-]*$/ would allow safe alphanumeric identifiers with hyphens and underscores, blocking whitespace, quotes, brackets, and other EDN-reserved characters.

If you'd like, I can open a follow-up issue to track the parameterization discussion for a future PR. Just let me know!


🧠 Learnings used
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 0
File: :0-0
Timestamp: 2025-11-06T13:48:35.007Z
Learning: In apps/roam/src/utils/createReifiedBlock.ts, the `setBlockProps` function does not need await as it is synchronous or fire-and-forget.

Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 534
File: apps/roam/src/utils/createReifiedBlock.ts:40-48
Timestamp: 2025-11-05T21:57:14.909Z
Learning: In the discourse-graph repository, the function `getPageUidByPageTitle` from `roamjs-components/queries/getPageUidByPageTitle` is a synchronous function that returns a string directly (the page UID or an empty string if not found), not a Promise. It should be called without `await`.

Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:18-31
Timestamp: 2025-06-22T10:40:52.752Z
Learning: In apps/roam/src/utils/getAllDiscourseNodesSince.ts, the user confirmed that querying for `?title` with `:node/title` and mapping it to the `text` field in the DiscourseGraphContent type is the correct implementation for retrieving discourse node content from Roam Research, despite it appearing to query page titles rather than block text content.

Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use Roam Depot/Extension API documentation from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when working with the Roam Extension API

Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use the roamAlphaApi documentation from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when working with the Roam API

@maparent maparent changed the base branch from main to eng-1018-create-functionui-to-migrate-existing-relations-to-reified November 10, 2025 00:53
@mdroidian
Copy link
Contributor

I had to redefine DatalogClause to be able to use an equality predicate, will signal upstream.
( RoamJS/roamjs-components#49 )
I had to tweak the optimizer, this is a bit ad hoc but feels right. (Otherwise queries with or-clause would just fail.)

Could you elaborate on why these changes needed to be made? I was under the assumption that all of the machinery already existed to simply switch over to the new reified relation triples.

Why, specifically, did you need to use an equality predicate?
We have many queries that already use or-clauses, is it the case that those don't fail but the ones you were trying did fail?

@maparent maparent force-pushed the eng-1018-create-functionui-to-migrate-existing-relations-to-reified branch from ae4c1cd to 3d84fea Compare November 10, 2025 15:35
@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch from 5e38e3d to 0ecb5e9 Compare November 10, 2025 15:35
@maparent maparent force-pushed the eng-1018-create-functionui-to-migrate-existing-relations-to-reified branch from 3d84fea to 26bfd23 Compare November 10, 2025 17:37
@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch 5 times, most recently from 56389de to 4510617 Compare November 12, 2025 17:02
@maparent
Copy link
Collaborator Author

maparent commented Nov 12, 2025

First question: Why do I need equality?
Thank you for fixing it upstream, appreciated, it makes for less changes here.
Still illustrating the issue with data in the discourse-graphs graph, can be tried live.
Given that the uid of the roam/js/discourse-graph/relations page is OjyVh_WEb, we can ask for relations with a given source:

window.roamAlphaAPI.q(`[:find
  ?rel ?reldata
 :where
  [?relpage :block/uid "OjyVh_WEb"]
  [?rel :block/page ?relpage]
  [?rel :block/props ?relprops]
  [(get ?relprops :discourse-graph) ?reldata]
  [(get ?reldata :sourceUid) ?relSource]
  [(= ?relSource "vnvlZMNeF")]]`)

That was using equality. Without equality, the similar query

window.roamAlphaAPI.q(`[:find
  ?rel ?reldata
 :where
  [?relpage :block/uid "OjyVh_WEb"]
  [?rel :block/page ?relpage]
  [?rel :block/props ?relprops]
  [(get ?relprops :discourse-graph) ?reldata]
  [(get ?reldata :sourceUid) "vnvlZMNeF"]]`)

fails with the following message: Cannot parse binding, expected (bind-scalar | bind-tuple | bind-coll | bind-rel)

ETA:
This is also correctly enforced in
https://github.com/RoamJS/roamjs-components/blob/b03819970e77f4c8a55506484b03a7a647d5a080/src/types/native.ts#L169C12-L169C26
where the DatalogFnExpr.binding expects a DatalogBinding, which does not include constants.

@maparent
Copy link
Collaborator Author

Ok, here are experiments and reflexions on the clause optimizer in fireQuery.ts.
The experiments are based on a query triggered by discourse context on

https://roamresearch.com/#/app/plugin-testing-akamatsulab/page/VKvPktVnD

Here is an instance of a compiled function, with clauses ordered as I pass it to fireQuery.
I remove the function Uid prefix for clarity.

[:find
  (pull ?Evidence [:block/string :node/title :block/uid])
  (pull ?Evidence [:block/uid])
:where
  [?relpage :block/uid "L8UGMqGnz"]
  [?rel :block/page ?relpage]
  [?rel :block/props ?relprops]
  [(get ?relprops :discourse-graph) ?reldata]
  [(get ?reldata :hasSchema) ?relSchema]
  [(get ?reldata :sourceUid) ?relSource]
  [(get ?reldata :destinationUid) ?relTarget]
  [(= ?relTarget "W0oEwCbzb")]
  [?Evidence :block/uid ?relSource]
  (or [(= ?relSchema "WRCE-4nr9")]
      [(= ?relSchema "WRCE-4nr9")])
]

(works perfectly)

Here is the reordering done by the optimizer in its current form.

[:find
  (pull ?Evidence [:block/string :node/title :block/uid])
  (pull ?Evidence [:block/uid])
:where
  [?relpage :block/uid "L8UGMqGnz"]
  [?rel :block/page ?relpage]
  [?rel :block/props ?relprops]
  [(get ?relprops :discourse-graph) ?reldata]
  [?Evidence :block/uid ?relSource]
  (or [(= ?relSchema "WRCE-4nr9")]
      [(= ?relSchema "WRCE-4nr9")])
  [(get ?reldata :hasSchema) ?relSchema]
  [(get ?reldata :sourceUid) ?relSource]
  [(get ?reldata :destinationUid) ?relTarget]
  [(= ?relTarget "W0oEwCbzb")]
]

This fails with this error message:
clojure.lang.ExceptionInfo: Insufficient bindings: #{?relSchema} not bound in [(= ?relSchema "WRCE-4nr9")] {:error :query/where, :form [(= ?relSchema "WRCE-4nr9")], :vars #{?relSchema}}

Indeed, the or comes before the variable is bound.

My first attempt was a slightly hacky tinker: It is known in logic that or and not are more costly than and. (Line 111)

[:find
  (pull ?Evidence [:block/string :node/title :block/uid])
  (pull ?Evidence [:block/uid])
:where
  [?relpage :block/uid "L8UGMqGnz"]
  [?rel :block/page ?relpage]
  [?rel :block/props ?relprops]
  [(get ?relprops :discourse-graph) ?reldata]
  [?Evidence :block/uid ?relSource]
  [(get ?reldata :hasSchema) ?relSchema]
  [(get ?reldata :sourceUid) ?relSource]
  [(get ?reldata :destinationUid) ?relTarget]
  [(= ?relTarget "W0oEwCbzb")]
  (or [(= ?relSchema "WRCE-4nr9")]
      [(= ?relSchema "WRCE-4nr9")])
]

And indeed this works fine.

Today I made more experiments, and added a bit of code that considers the last variable of a function-expression as bound. (Line 161). But if I apply it without the or-reorder, I got this:

[:find
  (pull ?Evidence [:block/string :node/title :block/uid])
  (pull ?Evidence [:block/uid])
:where
  [?relpage :block/uid "L8UGMqGnz"]
  [?rel :block/page ?relpage]
  [?rel :block/props ?relprops]
  [(get ?relprops :discourse-graph) ?reldata]
  [(get ?reldata :hasSchema) ?relSchema]
  (or [(= ?relSchema "WRCE-4nr9")]
      [(= ?relSchema "WRCE-4nr9")])
  [(get ?reldata :sourceUid) ?relSource]
  [?Evidence :block/uid ?relSource]
  [(get ?reldata :destinationUid) ?relTarget]
  [(= ?relTarget "W0oEwCbzb")]
]

Crash. message: "took too long to run Please request less data or add limits. You might want to add limited recursion limits to pull specs: https://docs.datomic.com/pro/query/pull.html#recursive-specifications ."

If I do both, it fails even more brutally:

[:find
  (pull ?Evidence [:block/string :node/title :block/uid])
  (pull ?Evidence [:block/uid])
:where
  [?relpage :block/uid "L8UGMqGnz"]
  [?rel :block/page ?relpage]
  [?rel :block/props ?relprops]
  [(get ?relprops :discourse-graph) ?reldata]
  [(get ?reldata :hasSchema) ?relSchema]
  (or [(= ?relSchema "WRCE-4nr9")]
      [(= ?relSchema "WRCE-4nr9")])
  [(get ?reldata :sourceUid) ?relSource]
  [?Evidence :block/uid ?relSource]
  [(get ?reldata :destinationUid) ?relTarget]
  [(= ?relTarget "W0oEwCbzb")]
]

Crash. message: "java.lang.OutOfMemoryError: Java heap space"

I realized that my binding alteration forced the optimizer ot follow the order of my original binding clauses:

  [(get ?reldata :hasSchema) ?relSchema]
  [(get ?reldata :sourceUid) ?relSource]
  [(get ?reldata :destinationUid) ?relTarget]

Ideally this should be combined with giving a lower cost to equality (line 129) but without reordering the Or, I still get a crash.

[:find
  (pull ?Evidence [:block/string :node/title :block/uid])
  (pull ?Evidence [:block/uid])
:where
  [?relpage :block/uid "L8UGMqGnz"]
  [?rel :block/page ?relpage]
  [?rel :block/props ?relprops]
  [(get ?relprops :discourse-graph) ?reldata]
  [(get ?reldata :hasSchema) ?relSchema]
  (or [(= ?relSchema "WRCE-4nr9")]
      [(= ?relSchema "WRCE-4nr9")])
  [(get ?reldata :sourceUid) ?relSource]
  [?Evidence :block/uid ?relSource]
  [(get ?reldata :destinationUid) ?relTarget]
  [(= ?relTarget "W0oEwCbzb")]
]

Reordering this to put the hasSchema at the end helps, of course. But it would be interesting to be able to order those clauses according to the downstream cost... certainly out of scope now. I just put the relSchema at the end to make things easier.

In all cases: Those ors are useless in this case. (Not in all cases; I am quite sure there are cases where asking for a relation name will give more than one relation Id. But I did not find a good example of that.) Still, I simplified the queries by detecting the case of repeated Ids.

The current code has all the optimizations, which I believe are valid ; but of course we see that they have impact anad we should measure it on more than one sample.

@maparent
Copy link
Collaborator Author

Ah, one more note: You noted that the query results came one after another. The queries are launched in parallel, but I think there is queuing happening server-side which yields a linearity.

And a question: You said that using data.async.q would not be helpful to the UI. But I remember you asking Sid to do so, what was the rationale then? Or is it after he did that that you realized that it was not fully helpful?

@maparent maparent marked this pull request as ready for review November 12, 2025 23:57
@maparent
Copy link
Collaborator Author

This is now at the stage where it's functional and worth discussing. I do expect discussion to happen, there's a lot there.

@maparent maparent requested a review from mdroidian November 12, 2025 23:58
@mdroidian
Copy link
Contributor

And a question: You said that using data.async.q would not be helpful to the UI. But I remember you asking Sid to do so, what was the rationale then? Or is it after he did that that you realized that it was not fully helpful?

Sorry, I cannot recall the specific instance you are referring too, and without the full context I would just be guessing.

Copy link
Contributor

@mdroidian mdroidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the required loom video.

@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch from 09850d7 to 527c31b Compare November 13, 2025 18:49
@maparent maparent marked this pull request as draft November 14, 2025 15:55
@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch 2 times, most recently from 356bbc5 to d862c88 Compare November 16, 2025 16:09
@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch 2 times, most recently from a046d40 to efd8e11 Compare November 16, 2025 16:19
@maparent maparent force-pushed the eng-1018-create-functionui-to-migrate-existing-relations-to-reified branch from 8c53cec to 09fdaef Compare November 16, 2025 16:19
@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch from efd8e11 to 1bf50eb Compare November 16, 2025 16:33
@maparent maparent force-pushed the eng-1018-create-functionui-to-migrate-existing-relations-to-reified branch from 09fdaef to 018d39a Compare November 16, 2025 16:33
@maparent maparent marked this pull request as ready for review November 17, 2025 14:48
@maparent maparent requested a review from mdroidian November 17, 2025 14:48

export const DISCOURSE_GRAPH_PROP_NAME = "discourse-graph";

const SANE_ROLE_NAME_RE = new RegExp(/^[a-zA-Z][a-zA-Z0-9_-]*$/);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we validating via this regex? I don't believe Roam uid's are required to start with a letter. (can be a number, dash, underscore, maybe more)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a coderabbit suggestion. I think some sanitization is good, so I will broaden the regexp, but happy to remove if you feel it's not necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roam probably handles this, we should let them do it with a simple try. Unless there is a specific case in which we are arbitrarily creating UID's, it is all being controlled by them, no need for us to haphazardly add a check here.

parameterUids: Record<string, string>,
): Promise<[string, Record<string, string>][]> => {
const paramsAsSeq = Object.entries(parameterUids);
// sanitize parameter names
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is validating, not sanitizing. (But we should probably sanitize instead of throw an error)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling it validate for now; I think that sanitization might create discrepancies with caller's expectations and more obscure errors, I think failing early is better.
That said, you also have a point that throwing an error without helping the user is not terribly useful either.
On the other other hand, I do think that user-side error handling is scope creep.
I mean... from my pov, none of this is known to happen, I was just trying to make the rabbit not complain.
BUT if it happens, we want to know about it, and throwing errors should lead us to see something in birdseatbugs, so I take that as a win.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple try/catch would suffice. Let Roam handle this, we use their functions to create UID's, create blocks, etc, etc. They are in charge of what is validated and what isn't, not us. We are just adding unnecessary complexity and risking false positive/negatives here.

@maparent maparent requested a review from mdroidian November 22, 2025 22:02
const typeOfTarget = typeOfValue(target);
const clauses: DatalogClause[] = [...relClauseBasis];

// todo: It could be a title or a node type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this todo waiting for? Does this still need to be completed for this PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the todo is a leftover, thanks for spotting it.

@maparent maparent force-pushed the eng-1018-create-functionui-to-migrate-existing-relations-to-reified branch from 018d39a to 7c5dcd5 Compare November 27, 2025 18:41
@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch 2 times, most recently from c08d306 to a7ec49f Compare November 27, 2025 20:33
@maparent maparent force-pushed the eng-1018-create-functionui-to-migrate-existing-relations-to-reified branch 2 times, most recently from 49b8b93 to cb13e01 Compare November 27, 2025 20:54
@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch from a7ec49f to 048ce21 Compare November 27, 2025 20:55
Base automatically changed from eng-1018-create-functionui-to-migrate-existing-relations-to-reified to main November 27, 2025 21:01
@maparent maparent force-pushed the eng-1012-update-query-builder-relation-conditions-to-use-reified branch from 048ce21 to e5f66c1 Compare November 27, 2025 21:04
@maparent maparent merged commit 20199dc into main Nov 27, 2025
5 checks passed
@maparent maparent deleted the eng-1012-update-query-builder-relation-conditions-to-use-reified branch November 27, 2025 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

No open projects
Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants