-
Notifications
You must be signed in to change notification settings - Fork 3
ENG-1012 Update Query Builder Relation Conditions to Use Reified Relation Triples #538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENG-1012 Update Query Builder Relation Conditions to Use Reified Relation Triples #538
Conversation
|
This pull request has been ignored for the connected project Preview Branches by Supabase. |
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
📝 WalkthroughWalkthroughThis PR introduces reified relations support to the Discourse Graph by adding utilities for creating and querying reified blocks, extending the Datalog type system via a local module, refactoring AdminPanel into tabs with feature flag control, consolidating type imports, and conditionally routing Datalog translation through a new reified-relations path. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant AdminPanel
participant extensionAPI
participant createReifiedBlock
participant fireQuery
participant roamAlphaAPI
User->>AdminPanel: Toggle reified-relations setting
AdminPanel->>extensionAPI: setSetting("use-reified-relations", true)
Note over User,AdminPanel: Migration Workflow (via migrateRelations)
User->>createReifiedBlock: migrateRelations() called
createReifiedBlock->>fireQuery: getRelationData() queries all relations
fireQuery-->>createReifiedBlock: returns relation data array
loop For each relation
createReifiedBlock->>createReifiedBlock: strictQueryForReifiedBlocks(params)
alt Block exists (dedup check)
createReifiedBlock-->>createReifiedBlock: return existing UID
else Block not found
createReifiedBlock->>roamAlphaAPI: Generate new UID & create block
roamAlphaAPI->>roamAlphaAPI: Set discourse-graph properties
roamAlphaAPI-->>createReifiedBlock: return new block UID
end
end
createReifiedBlock-->>User: Migration complete
sequenceDiagram
participant Query
participant registerDiscourseDatalogTranslators
participant getSetting
participant createReifiedBlock
participant fireQuery
Query->>registerDiscourseDatalogTranslators: translateEdge(relation)
registerDiscourseDatalogTranslators->>getSetting: check "use-reified-relations"
alt Reified-relations enabled
registerDiscourseDatalogTranslators->>createReifiedBlock: getExistingRelationPageUid()
alt Relation page exists
registerDiscourseDatalogTranslators->>registerDiscourseDatalogTranslators: Build relClauseBasis queries
registerDiscourseDatalogTranslators->>registerDiscourseDatalogTranslators: Construct forward/reverse clauses
registerDiscourseDatalogTranslators->>fireQuery: Execute reified-relations path
fireQuery-->>registerDiscourseDatalogTranslators: Reified query results
else Page not found
registerDiscourseDatalogTranslators->>registerDiscourseDatalogTranslators: Fall back to original path
end
else Reified-relations disabled
registerDiscourseDatalogTranslators->>registerDiscourseDatalogTranslators: Use original edge-triple construction
end
registerDiscourseDatalogTranslators-->>Query: Return clause set
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Pre-merge checks✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
apps/roam/src/utils/migrateRelations.ts (1)
10-17: Guard against duplicate relation triples before creating.Given the inline note about “likely to create duplicates,” it’d be safer to track processed
(source, relUid, target)combos before callingcreateReifiedRelation. That way reruns or redundant query rows won’t stamp extra reified blocks.- for (const rel of relationData) { - const uid = await createReifiedRelation({ - sourceUid: rel.source, - destinationUid: rel.target, - relationBlockUid: rel.relUid, - }); - // created.add(uid); - } + const processed = new Set<string>(); + for (const rel of relationData) { + const key = `${rel.source}:${rel.relUid}:${rel.target}`; + if (processed.has(key)) continue; + processed.add(key); + await createReifiedRelation({ + sourceUid: rel.source, + destinationUid: rel.target, + relationBlockUid: rel.relUid, + }); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
apps/roam/src/components/settings/AdminPanel.tsx(4 hunks)apps/roam/src/components/settings/Settings.tsx(1 hunks)apps/roam/src/utils/compileDatalog.ts(1 hunks)apps/roam/src/utils/conditionToDatalog.ts(1 hunks)apps/roam/src/utils/createReifiedBlock.ts(1 hunks)apps/roam/src/utils/discourseNodeFormatToDatalog.ts(1 hunks)apps/roam/src/utils/extendedDatalogClause.ts(1 hunks)apps/roam/src/utils/fireQuery.ts(4 hunks)apps/roam/src/utils/gatherDatalogVariablesFromClause.ts(1 hunks)apps/roam/src/utils/getExportTypes.ts(5 hunks)apps/roam/src/utils/getRelationData.ts(1 hunks)apps/roam/src/utils/migrateRelations.ts(1 hunks)apps/roam/src/utils/predefinedSelections.ts(2 hunks)apps/roam/src/utils/registerDiscourseDatalogTranslators.ts(3 hunks)apps/roam/src/utils/replaceDatalogVariables.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/main.mdc)
**/*.{ts,tsx}: Prefertypeoverinterface
Use explicit return types for functions
Avoidanytypes when possible
Files:
apps/roam/src/utils/migrateRelations.tsapps/roam/src/utils/replaceDatalogVariables.tsapps/roam/src/utils/compileDatalog.tsapps/roam/src/utils/getRelationData.tsapps/roam/src/utils/discourseNodeFormatToDatalog.tsapps/roam/src/utils/conditionToDatalog.tsapps/roam/src/components/settings/Settings.tsxapps/roam/src/utils/registerDiscourseDatalogTranslators.tsapps/roam/src/utils/createReifiedBlock.tsapps/roam/src/components/settings/AdminPanel.tsxapps/roam/src/utils/gatherDatalogVariablesFromClause.tsapps/roam/src/utils/fireQuery.tsapps/roam/src/utils/extendedDatalogClause.tsapps/roam/src/utils/getExportTypes.tsapps/roam/src/utils/predefinedSelections.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/main.mdc)
**/*.{ts,tsx,js,jsx}: Prefer arrow functions over regular function declarations
Use named parameters (object destructuring) when a function has more than 2 parameters
Use Prettier with the project's configuration
Maintain consistent naming conventions: PascalCase for components and types
Maintain consistent naming conventions: camelCase for variables and functions
Maintain consistent naming conventions: UPPERCASE for constants
Files:
apps/roam/src/utils/migrateRelations.tsapps/roam/src/utils/replaceDatalogVariables.tsapps/roam/src/utils/compileDatalog.tsapps/roam/src/utils/getRelationData.tsapps/roam/src/utils/discourseNodeFormatToDatalog.tsapps/roam/src/utils/conditionToDatalog.tsapps/roam/src/components/settings/Settings.tsxapps/roam/src/utils/registerDiscourseDatalogTranslators.tsapps/roam/src/utils/createReifiedBlock.tsapps/roam/src/components/settings/AdminPanel.tsxapps/roam/src/utils/gatherDatalogVariablesFromClause.tsapps/roam/src/utils/fireQuery.tsapps/roam/src/utils/extendedDatalogClause.tsapps/roam/src/utils/getExportTypes.tsapps/roam/src/utils/predefinedSelections.ts
apps/roam/**/*.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/roam.mdc)
apps/roam/**/*.{js,jsx,ts,tsx}: Use BlueprintJS 3 components and Tailwind CSS for platform-native UI in the Roam Research extension
Use the roamAlphaApi documentation from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when working with the Roam API
Use Roam Depot/Extension API documentation from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when working with the Roam Extension API
Files:
apps/roam/src/utils/migrateRelations.tsapps/roam/src/utils/replaceDatalogVariables.tsapps/roam/src/utils/compileDatalog.tsapps/roam/src/utils/getRelationData.tsapps/roam/src/utils/discourseNodeFormatToDatalog.tsapps/roam/src/utils/conditionToDatalog.tsapps/roam/src/components/settings/Settings.tsxapps/roam/src/utils/registerDiscourseDatalogTranslators.tsapps/roam/src/utils/createReifiedBlock.tsapps/roam/src/components/settings/AdminPanel.tsxapps/roam/src/utils/gatherDatalogVariablesFromClause.tsapps/roam/src/utils/fireQuery.tsapps/roam/src/utils/extendedDatalogClause.tsapps/roam/src/utils/getExportTypes.tsapps/roam/src/utils/predefinedSelections.ts
🧠 Learnings (11)
📚 Learning: 2025-07-19T22:34:23.619Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use Roam Depot/Extension API documentation from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when working with the Roam Extension API
Applied to files:
apps/roam/src/utils/compileDatalog.tsapps/roam/src/utils/discourseNodeFormatToDatalog.tsapps/roam/src/utils/conditionToDatalog.tsapps/roam/src/utils/createReifiedBlock.tsapps/roam/src/components/settings/AdminPanel.tsxapps/roam/src/utils/extendedDatalogClause.tsapps/roam/src/utils/predefinedSelections.ts
📚 Learning: 2025-06-22T10:40:52.752Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:18-31
Timestamp: 2025-06-22T10:40:52.752Z
Learning: In apps/roam/src/utils/getAllDiscourseNodesSince.ts, the user confirmed that querying for `?title` with `:node/title` and mapping it to the `text` field in the DiscourseGraphContent type is the correct implementation for retrieving discourse node content from Roam Research, despite it appearing to query page titles rather than block text content.
Applied to files:
apps/roam/src/utils/getRelationData.tsapps/roam/src/utils/discourseNodeFormatToDatalog.tsapps/roam/src/utils/registerDiscourseDatalogTranslators.tsapps/roam/src/utils/createReifiedBlock.tsapps/roam/src/utils/getExportTypes.ts
📚 Learning: 2025-06-17T23:42:29.279Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 220
File: apps/roam/src/utils/conceptConversion.ts:95-104
Timestamp: 2025-06-17T23:42:29.279Z
Learning: In the DiscourseGraphs/discourse-graph codebase, DiscourseRelation type properties are either string or Triple[], and the STANDARD_ROLES filter in conceptConversion.ts excludes Triple[] properties, so only string values remain after filtering.
Applied to files:
apps/roam/src/utils/getRelationData.ts
📚 Learning: 2025-06-22T10:40:21.679Z
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:15-16
Timestamp: 2025-06-22T10:40:21.679Z
Learning: In the getAllDiscourseNodesSince function in apps/roam/src/utils/getAllDiscourseNodesSince.ts, date validation is performed before the function is called, so additional date validation within the function is not needed.
Applied to files:
apps/roam/src/utils/discourseNodeFormatToDatalog.ts
📚 Learning: 2025-06-17T23:37:45.289Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 220
File: apps/roam/src/utils/conceptConversion.ts:42-56
Timestamp: 2025-06-17T23:37:45.289Z
Learning: In the DiscourseNode interface from apps/roam/src/utils/getDiscourseNodes.ts, the field `type` serves as the unique identifier field, not a type classification field. The interface has no `uid` or `id` field, making `node.type` the correct field to use for UID-related operations.
Applied to files:
apps/roam/src/utils/discourseNodeFormatToDatalog.ts
📚 Learning: 2025-07-19T22:34:23.619Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use the roamAlphaApi documentation from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when working with the Roam API
Applied to files:
apps/roam/src/utils/discourseNodeFormatToDatalog.tsapps/roam/src/utils/createReifiedBlock.tsapps/roam/src/components/settings/AdminPanel.tsxapps/roam/src/utils/predefinedSelections.ts
📚 Learning: 2025-11-05T21:57:14.909Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 534
File: apps/roam/src/utils/createReifiedBlock.ts:40-48
Timestamp: 2025-11-05T21:57:14.909Z
Learning: In the discourse-graph repository, the function `getPageUidByPageTitle` from `roamjs-components/queries/getPageUidByPageTitle` is a synchronous function that returns a string directly (the page UID or an empty string if not found), not a Promise. It should be called without `await`.
Applied to files:
apps/roam/src/utils/registerDiscourseDatalogTranslators.tsapps/roam/src/utils/createReifiedBlock.tsapps/roam/src/utils/getExportTypes.ts
📚 Learning: 2025-11-06T13:48:35.007Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 0
File: :0-0
Timestamp: 2025-11-06T13:48:35.007Z
Learning: In apps/roam/src/utils/createReifiedBlock.ts, the `setBlockProps` function does not need await as it is synchronous or fire-and-forget.
Applied to files:
apps/roam/src/utils/createReifiedBlock.ts
📚 Learning: 2025-07-19T22:34:23.619Z
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use BlueprintJS 3 components and Tailwind CSS for platform-native UI in the Roam Research extension
Applied to files:
apps/roam/src/components/settings/AdminPanel.tsx
📚 Learning: 2025-08-11T19:09:58.252Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 337
File: apps/roam/src/components/DiscourseFloatingMenu.tsx:43-43
Timestamp: 2025-08-11T19:09:58.252Z
Learning: The roam subdirectory (apps/roam) is constrained to React 17 and cannot use React 18+ features like createRoot API. ReactDOM.render should be used instead of createRoot in this subdirectory.
Applied to files:
apps/roam/src/components/settings/AdminPanel.tsx
📚 Learning: 2025-09-01T19:09:07.052Z
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 0
File: :0-0
Timestamp: 2025-09-01T19:09:07.052Z
Learning: User prefers to avoid making types explicit in each require statement due to anti-DRY concerns when working with module imports in TypeScript.
Applied to files:
apps/roam/src/utils/gatherDatalogVariablesFromClause.ts
🔇 Additional comments (16)
apps/roam/src/components/settings/Settings.tsx (1)
232-233: Prop wiring looks good.Passing
onloadArgskeeps the settings modal aligned with the updated AdminPanel API—no further action needed.apps/roam/src/utils/discourseNodeFormatToDatalog.ts (1)
1-1: Local clause type import keeps things consistent.Thanks for migrating this file to the shared extended clause types; it stays aligned with the rest of the refactor.
apps/roam/src/utils/replaceDatalogVariables.ts (1)
1-2: Shared clause typing looks good.Pulling
DatalogClausefrom the local extended definition keeps the replacement helper in sync with the new type system.apps/roam/src/utils/fireQuery.ts (1)
65-160: Heuristic downgrade for uncaptured OR/NOT clauses makes sense.Giving uncaptured
or/not(and the residualand) clauses a worse score should stop them from jumping ahead of selective patterns, which fits the optimizer’s intent.apps/roam/src/utils/getRelationData.ts (1)
8-60: Nice extraction of relation lookup logic.Lifting the shared querying into
getRelationDataUtilplus a thin default export keeps migration and export paths in sync without duplicating the fireQuery plumbing.apps/roam/src/utils/predefinedSelections.ts (1)
1-2: Import realignment matches the other utils.Good to see the selection helpers sharing the extended clause typing as well.
apps/roam/src/utils/getExportTypes.ts (5)
16-16: LGTM—delegating relation data retrieval to a shared utility.This import change aligns with the PR objective to centralize relation data logic.
44-47: LGTM—simplified children handling.The direct array access without intermediate casting is cleaner while maintaining the same functionality.
376-377: LGTM—extracted relation data logic to a shared utility.Moving this logic to
getRelationDataUtilimproves code reuse and maintainability, consistent with the PR's objective to consolidate relation handling.
422-535: LGTM—refactored to arrow function per coding guidelines.The conversion to an arrow function aligns with the project's coding guidelines. The logic for handling discourse context, references, and frontmatter remains unchanged.
639-749: LGTM—improved code organization with helper functions.The refactor introduces well-structured helper functions (
treeNodeToMarkdown,getMarkdownContent,getDiscourseResultsContent,getReferenceResultsContent) that make the PDF export logic more modular and maintainable. The addition offlatten: trueto thetoMarkdownoptions is appropriate for PDF rendering, where a flat structure without indentation is typically preferred.apps/roam/src/utils/registerDiscourseDatalogTranslators.ts (5)
26-27: LGTM—new imports for reified relations feature.The imports of
getSettingandgetExistingRelationPageUidare appropriate for the feature-flagged reified relations support being introduced.
32-42: LGTM—useful helper for clause construction.The
singleClausehelper provides a clean way to collapse clause arrays into a single clause, with appropriate handling for empty, single, and multiple clause scenarios.
638-848: LGTM—preserved fallback logic with improved OR-join variable handling.The non-reified path maintains the original triple-based implementation as a fallback. The enhanced OR-join variable collection at lines 828-834 correctly computes the intersection of variables across all alternatives, ensuring only variables present in all branches are included in the join clause:
const orJoinedVars = collectVariables(andParts[0]); andParts.slice(1).forEach((a) => { const freeVars = collectVariables(a); Array.from(orJoinedVars).forEach((v) => { if (!freeVars.has(v)) orJoinedVars.delete(v); }); });This is the correct approach for OR-joins in Datalog, where only shared variables can be safely joined.
384-637: Remove the flagged concern—the code correctly handles both-UIDs.The reified relations logic properly constructs clauses when both source and target are UIDs. When
sourceAsUid && targetAsUid, the code correctly produces:
forwardClauses:relSource = sourceAsUid+relTarget = targetAsUidreverseClauses:relTarget = sourceAsUid+relSource = targetAsUidThe
elseblocks skipdata-patternclauses because both endpoints are already constrained as UIDs—this is the correct behavior, not an error. Theconsole.warnis a development-time indicator, not a sign of unhandled logic. No exception should be thrown; this is a valid, working scenario.Likely an incorrect or invalid review comment.
2-6: Import change verified and backward compatible.All usages of
DatalogClause,DatalogAndClause, andDatalogOrClausethroughout the codebase are already referencing the localextendedDatalogClausemodule. The extended type definitions maintain the same structural contract while adding support for the equality predicate ("="). No breaking changes detected across seven utility files that consume these types.
| const query = `[:find ?u ?d | ||
| :in $ ${paramsAsSeq.map(([k]) => "?" + k).join(" ")} | ||
| :where [?s :block/uid ?u] [?s :block/props ?p] [(get ?p :discourse-graph) ?d] | ||
| ${paramsAsSeq.map(([k]) => `[(get ?d :${k}) ?${k}]`).join(" ")} ]`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Harden query builder against unsanitized parameter keys
parameterUids keys come straight from schema definitions (i.e., user-authored page titles). As soon as a key contains whitespace, quotes, brackets, or any other EDN-reserved character, the interpolated Datalog snippet becomes syntactically invalid, and maliciously crafted names can even splice extra clauses into the query. We need to stop embedding raw keys in the query text and treat them as parameters instead (convert to keywords inside the query).
Please refactor this block along these lines:
- const query = `[:find ?u ?d
- :in $ ${paramsAsSeq.map(([k]) => "?" + k).join(" ")}
- :where [?s :block/uid ?u] [?s :block/props ?p] [(get ?p :discourse-graph) ?d]
- ${paramsAsSeq.map(([k]) => `[(get ?d :${k}) ?${k}]`).join(" ")} ]`;
- return (await window.roamAlphaAPI.data.async.q(
- query,
- ...paramsAsSeq.map(([, v]) => v),
- )) as [string, Record<string, string>][];
+ const inBindings = paramsAsSeq
+ .map((_, index) => `?key${index} ?value${index}`)
+ .join(" ");
+ const whereKeyClauses = paramsAsSeq
+ .map(
+ (_, index) =>
+ `[(keyword ?key${index}) ?kw${index}] [(get ?d ?kw${index}) ?value${index}]`,
+ )
+ .join("\n ");
+ const query = `[:find ?u ?d
+ :in $${inBindings ? ` ${inBindings}` : ""}
+ :where
+ [?s :block/uid ?u]
+ [?s :block/props ?p]
+ [(get ?p :discourse-graph) ?d]${
+ whereKeyClauses ? `\n ${whereKeyClauses}` : ""
+ }]`;
+ const args = paramsAsSeq.flatMap(([k, v]) => [k, v]);
+ return (await window.roamAlphaAPI.data.async.q(query, ...args)) as [
+ string,
+ Record<string, string>,
+ ][];This keeps the query well-formed regardless of the characters present in schema keys and closes the injection vector.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const query = `[:find ?u ?d | |
| :in $ ${paramsAsSeq.map(([k]) => "?" + k).join(" ")} | |
| :where [?s :block/uid ?u] [?s :block/props ?p] [(get ?p :discourse-graph) ?d] | |
| ${paramsAsSeq.map(([k]) => `[(get ?d :${k}) ?${k}]`).join(" ")} ]`; | |
| const inBindings = paramsAsSeq | |
| .map((_, index) => `?key${index} ?value${index}`) | |
| .join(" "); | |
| const whereKeyClauses = paramsAsSeq | |
| .map( | |
| (_, index) => | |
| `[(keyword ?key${index}) ?kw${index}] [(get ?d ?kw${index}) ?value${index}]`, | |
| ) | |
| .join("\n "); | |
| const query = `[:find ?u ?d | |
| :in $${inBindings ? ` ${inBindings}` : ""} | |
| :where | |
| [?s :block/uid ?u] | |
| [?s :block/props ?p] | |
| [(get ?p :discourse-graph) ?d]${ | |
| whereKeyClauses ? `\n ${whereKeyClauses}` : "" | |
| }]`; | |
| const args = paramsAsSeq.flatMap(([k, v]) => [k, v]); | |
| return (await window.roamAlphaAPI.data.async.q(query, ...args)) as [ | |
| string, | |
| Record<string, string>, | |
| ][]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting point. I do not know that we have a good entry point to add parameters; but this could be useful. Worth discussing, probably in another scope. Meanwhile, I propose we check against a regex at input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maparent, understood—happy to defer the parameterization discussion to a follow-up. For the regex validation approach, you'll want to ensure parameter keys are valid Datalog identifiers. A pattern like /^[a-zA-Z][a-zA-Z0-9_-]*$/ would allow safe alphanumeric identifiers with hyphens and underscores, blocking whitespace, quotes, brackets, and other EDN-reserved characters.
If you'd like, I can open a follow-up issue to track the parameterization discussion for a future PR. Just let me know!
🧠 Learnings used
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 0
File: :0-0
Timestamp: 2025-11-06T13:48:35.007Z
Learning: In apps/roam/src/utils/createReifiedBlock.ts, the `setBlockProps` function does not need await as it is synchronous or fire-and-forget.
Learnt from: maparent
Repo: DiscourseGraphs/discourse-graph PR: 534
File: apps/roam/src/utils/createReifiedBlock.ts:40-48
Timestamp: 2025-11-05T21:57:14.909Z
Learning: In the discourse-graph repository, the function `getPageUidByPageTitle` from `roamjs-components/queries/getPageUidByPageTitle` is a synchronous function that returns a string directly (the page UID or an empty string if not found), not a Promise. It should be called without `await`.
Learnt from: sid597
Repo: DiscourseGraphs/discourse-graph PR: 232
File: apps/roam/src/utils/getAllDiscourseNodesSince.ts:18-31
Timestamp: 2025-06-22T10:40:52.752Z
Learning: In apps/roam/src/utils/getAllDiscourseNodesSince.ts, the user confirmed that querying for `?title` with `:node/title` and mapping it to the `text` field in the DiscourseGraphContent type is the correct implementation for retrieving discourse node content from Roam Research, despite it appearing to query page titles rather than block text content.
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use Roam Depot/Extension API documentation from https://roamresearch.com/#/app/developer-documentation/page/y31lhjIqU when working with the Roam Extension API
Learnt from: CR
Repo: DiscourseGraphs/discourse-graph PR: 0
File: .cursor/rules/roam.mdc:0-0
Timestamp: 2025-07-19T22:34:23.619Z
Learning: Applies to apps/roam/**/*.{js,jsx,ts,tsx} : Use the roamAlphaApi documentation from https://roamresearch.com/#/app/developer-documentation/page/tIaOPdXCj when working with the Roam API
Could you elaborate on why these changes needed to be made? I was under the assumption that all of the machinery already existed to simply switch over to the new reified relation triples. Why, specifically, did you need to use an equality predicate? |
ae4c1cd to
3d84fea
Compare
5e38e3d to
0ecb5e9
Compare
3d84fea to
26bfd23
Compare
56389de to
4510617
Compare
|
First question: Why do I need equality? That was using equality. Without equality, the similar query fails with the following message: ETA: |
|
Ok, here are experiments and reflexions on the clause optimizer in https://roamresearch.com/#/app/plugin-testing-akamatsulab/page/VKvPktVnD Here is an instance of a compiled function, with clauses ordered as I pass it to fireQuery. (works perfectly) Here is the reordering done by the optimizer in its current form. This fails with this error message: Indeed, the or comes before the variable is bound. My first attempt was a slightly hacky tinker: It is known in logic that or and not are more costly than and. (Line 111) And indeed this works fine. Today I made more experiments, and added a bit of code that considers the last variable of a function-expression as bound. (Line 161). But if I apply it without the or-reorder, I got this: Crash. message: "took too long to run Please request less data or add limits. You might want to add limited recursion limits to pull specs: https://docs.datomic.com/pro/query/pull.html#recursive-specifications ." If I do both, it fails even more brutally: Crash. message: "java.lang.OutOfMemoryError: Java heap space" I realized that my binding alteration forced the optimizer ot follow the order of my original binding clauses: Ideally this should be combined with giving a lower cost to equality (line 129) but without reordering the Or, I still get a crash. Reordering this to put the hasSchema at the end helps, of course. But it would be interesting to be able to order those clauses according to the downstream cost... certainly out of scope now. I just put the relSchema at the end to make things easier. In all cases: Those ors are useless in this case. (Not in all cases; I am quite sure there are cases where asking for a relation name will give more than one relation Id. But I did not find a good example of that.) Still, I simplified the queries by detecting the case of repeated Ids. The current code has all the optimizations, which I believe are valid ; but of course we see that they have impact anad we should measure it on more than one sample. |
|
Ah, one more note: You noted that the query results came one after another. The queries are launched in parallel, but I think there is queuing happening server-side which yields a linearity. And a question: You said that using |
|
This is now at the stage where it's functional and worth discussing. I do expect discussion to happen, there's a lot there. |
Sorry, I cannot recall the specific instance you are referring too, and without the full context I would just be guessing. |
mdroidian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the required loom video.
09850d7 to
527c31b
Compare
356bbc5 to
d862c88
Compare
a046d40 to
efd8e11
Compare
8c53cec to
09fdaef
Compare
efd8e11 to
1bf50eb
Compare
09fdaef to
018d39a
Compare
|
|
||
| export const DISCOURSE_GRAPH_PROP_NAME = "discourse-graph"; | ||
|
|
||
| const SANE_ROLE_NAME_RE = new RegExp(/^[a-zA-Z][a-zA-Z0-9_-]*$/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we validating via this regex? I don't believe Roam uid's are required to start with a letter. (can be a number, dash, underscore, maybe more)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was a coderabbit suggestion. I think some sanitization is good, so I will broaden the regexp, but happy to remove if you feel it's not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roam probably handles this, we should let them do it with a simple try. Unless there is a specific case in which we are arbitrarily creating UID's, it is all being controlled by them, no need for us to haphazardly add a check here.
| parameterUids: Record<string, string>, | ||
| ): Promise<[string, Record<string, string>][]> => { | ||
| const paramsAsSeq = Object.entries(parameterUids); | ||
| // sanitize parameter names |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is validating, not sanitizing. (But we should probably sanitize instead of throw an error)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling it validate for now; I think that sanitization might create discrepancies with caller's expectations and more obscure errors, I think failing early is better.
That said, you also have a point that throwing an error without helping the user is not terribly useful either.
On the other other hand, I do think that user-side error handling is scope creep.
I mean... from my pov, none of this is known to happen, I was just trying to make the rabbit not complain.
BUT if it happens, we want to know about it, and throwing errors should lead us to see something in birdseatbugs, so I take that as a win.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simple try/catch would suffice. Let Roam handle this, we use their functions to create UID's, create blocks, etc, etc. They are in charge of what is validated and what isn't, not us. We are just adding unnecessary complexity and risking false positive/negatives here.
| const typeOfTarget = typeOfValue(target); | ||
| const clauses: DatalogClause[] = [...relClauseBasis]; | ||
|
|
||
| // todo: It could be a title or a node type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this todo waiting for? Does this still need to be completed for this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the todo is a leftover, thanks for spotting it.
018d39a to
7c5dcd5
Compare
c08d306 to
a7ec49f
Compare
49b8b93 to
cb13e01
Compare
a7ec49f to
048ce21
Compare
048ce21 to
e5f66c1
Compare
https://linear.app/discourse-graphs/issue/ENG-1012/update-query-builder-relation-conditions-to-use-reified-relation
Update query builder relation conditions.
Notes:
Right now, we use one of two query paths, according to the
use-reified-relationssetting.I had to make changes to the optimizer. (Otherwise queries with or-clause would just fail, or even crash.)
Performance is clearly better than before but a bit less than hoped for.
Summary by CodeRabbit