ENG-270, ENG-328 Creation of the supabase embedding schema, integration with turbo and CI/CD #165

maparent · 2025-05-19T20:18:22Z

First draft of the database schema, with the machinery to automate the build and deployment steps.

Summary by CodeRabbit

New Features
- Introduced a comprehensive database schema for discourse platforms, including accounts, agents, spaces, documents, content, concepts, embeddings, and synchronization tasks.
- Added advanced vector search and similarity matching functions for content embeddings.
- Implemented synchronization task management with concurrency control, timeout handling, and failure backoff.
- Provided TypeScript type definitions covering the entire database schema.
Documentation
- Added detailed setup and usage instructions, schema diagrams, and example datasets for database management and development workflows.
Chores
- Added configuration files for Supabase local development, linting, and deployment automation.
- Updated deployment scripts and environment variable configurations for streamlined CI/CD integration.

linear · 2025-05-19T20:18:24Z

ENG-328 Look into integrating Vercel and Turborepo with Supabase

ENG-270 Define embeddings schema

vercel · 2025-05-19T20:18:26Z

The latest updates on your projects.

1 Skipped Deployment

Name	Status	Preview	Comments	Updated (UTC)
discourse-graph	⬜️ Skipped (Inspect)			May 22, 2025 11:57pm

coderabbitai · 2025-05-19T20:18:28Z

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This update introduces a new database package for a discourse graph system, including a full PostgreSQL schema, Supabase configuration, migration scripts, LinkML and PlantUML schema definitions, TypeScript types, and supporting documentation. It also adds CI/CD workflow automation, linter configuration, and deployment scripts, establishing a comprehensive foundation for managing discourse data, concepts, content, and synchronization tasks.

Changes

File(s)	Change Summary
.github/workflows/database-deploy.yaml, turbo.json	Added GitHub Actions workflow for Supabase database deployment, triggered on workflow dispatch and main branch pushes. Updated `turbo.json` to pass Supabase environment variables during deployment.
packages/database/.sqruff	Introduced Sqruff linter configuration for SQL files in the database package, specifying dialect, rules, and indentation settings.
packages/database/README.md	Added documentation detailing setup, usage, and development workflow for the database schema, including instructions for Supabase, Docker, linter, and deployment processes.
packages/database/example.md	Added a comprehensive example markdown file illustrating the discourse graph data model, including nodes, edges, documents, concepts, and their relationships.
packages/database/package.json	Added `package.json` for the `@repo/database` package, configuring scripts for Supabase workflows, linting, type generation, and deployment. Declared dev dependencies and package exports.
packages/database/schema.puml	Added PlantUML class diagram representing the database schema, entities, attributes, relationships, and inheritance hierarchy for the discourse platform.
packages/database/schema.yaml	Introduced LinkML schema defining enumerations, types, slots, and classes for discourse graph data, including agents, platforms, accounts, spaces, content, concepts, and embeddings.
packages/database/supabase/.gitignore	Added `.gitignore` to exclude Supabase temp directories and dotenv environment files from version control.
packages/database/supabase/config.toml	Added detailed Supabase local development configuration, specifying API, database, authentication, storage, analytics, and experimental settings with environment variable support.
packages/database/supabase/migrations/20250504195841_remote_schema.sql, .../extensions.sql	Added migration and schema scripts for PostgreSQL extensions (pg_jsonschema, pg_stat_monitor, pgroonga, vector, and others), ensuring required extensions are present in the database.
packages/database/supabase/migrations/20250504202930_content_tables.sql, .../schemas/*.sql	Added core schema migrations and SQL files for creating tables, enums, sequences, and relationships: Agent, Person, AutomatedAgent, Account, Platform (renamed from DiscoursePlatform), Space (renamed from DiscourseSpace), SpaceAccess, Document, Content, Concept, ContentEmbedding, contributor join tables, and associated indexes and constraints.
packages/database/supabase/migrations/20250506174523_content_idx_id.sql	Added a unique partial index on Content for (space_id, source_local_id) where source_local_id is not null.
packages/database/supabase/migrations/20250512142307_sync_table.sql, .../schemas/sync.sql	Added schema, migration, and PL/pgSQL functions for synchronization task management: created `sync_info` table, `task_status` enum, and functions for proposing and ending sync tasks with concurrency, timeout, and backoff logic, plus a function to identify nodes needing synchronization.
packages/database/supabase/migrations/20250513173724_content_concept_key.sql, .../schemas/embedding.sql	Added migration and schema for concept/content relationship refactoring, embedding table for OpenAI vectors, and SQL functions for embedding similarity search and subset matching.
packages/database/supabase/migrations/20250517154122_plpgsql_linting.sql	Added updated PL/pgSQL functions for sync task management with improved linting and assertion logic.
packages/database/supabase/migrations/20250520132747_restrict_search_by_document.sql	Added SQL functions for vector embedding similarity search with optional restriction by document and subset of node UIDs, replacing a previous function.
packages/database/supabase/migrations/20250520133551_nodes_needing_sync.sql	Added function to identify nodes needing synchronization by comparing external node edit timestamps with local content modification times.
packages/database/supabase/migrations/20250522193823_rename_discourse_space.sql	Renamed tables and columns: `DiscoursePlatform` to `Platform`, `DiscourseSpace` to `Space`, and updated foreign key constraints accordingly. Added descriptive comments to `Space` and `Account` tables.
packages/database/supabase/schemas/account.sql	Created `Account` and `SpaceAccess` tables with foreign keys and constraints for user accounts and space access control.
packages/database/supabase/schemas/agent.sql	Created `Agent`, `Person`, and `AutomatedAgent` tables with inheritance and foreign key relationships.
packages/database/supabase/schemas/base.sql	Created base schema with session settings, enum `EntityType`, sequence `entity_id_seq`, and privileges for roles.
packages/database/supabase/schemas/concept.sql	Created `EpistemicStatus` enum and `Concept` table with fields for epistemic status, schema relations, content, and foreign keys.
packages/database/supabase/schemas/content.sql	Created `Scale` enum and `Document` and `Content` tables with hierarchical and relational fields, indexes, and constraints.
packages/database/supabase/schemas/contributor.sql	Created join tables `content_contributors` and `concept_contributors` linking contributors to content and concepts with foreign keys and primary keys.
packages/database/supabase/schemas/embedding.sql	Created `EmbeddingName` enum and `ContentEmbedding_openai_text_embedding_3_small_1536` table for vector embeddings, plus functions for embedding similarity search and subset matching.
packages/database/supabase/schemas/extensions.sql	Added SQL script to create PostgreSQL extensions including pg_cron, pgroonga, pg_jsonschema, vector, and others in designated schemas.
packages/database/supabase/schemas/space.sql	Created `Platform` and `Space` tables with foreign keys and constraints representing discourse platforms and spaces.
packages/database/supabase/schemas/sync.sql	Created `task_status` enum, `sync_info` table, sequence, and PL/pgSQL functions for synchronization task management and node sync determination.
packages/database/types.gen.ts	Added generated TypeScript types reflecting the complete database schema, including tables, enums, utility types, and constants for type-safe database interaction.
packages/database/scripts/deploy.ts	Added deployment script that validates branch and environment variables, links Supabase project, pushes database changes, and optionally deploys functions.
packages/database/scripts/lint.ts	Added linting script that checks for `sqruff` installation and runs lint or fix commands on SQL schema files.
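
The nodes-needing-sync row above describes comparing external node edit timestamps with local content modification times. A minimal in-memory sketch of that comparison follows. The PL/pgSQL body is not shown in this thread, so `ExternalNode`, `nodesNeedingSync`, and the epoch-millisecond representation are hypothetical names and assumptions made for illustration only.

```typescript
// Hypothetical shape of a node reported by the external platform (assumption).
interface ExternalNode {
  uid: string;
  editedAt: number; // last edit time, epoch milliseconds
}

// Returns the UIDs whose external edit time is newer than the local
// modification time, or that have no local copy at all.
function nodesNeedingSync(
  external: ExternalNode[],
  localModified: Record<string, number | undefined>
): string[] {
  return external
    .filter((node) => {
      const local = localModified[node.uid];
      return local === undefined || node.editedAt > local;
    })
    .map((node) => node.uid);
}
```

A node edited at the same instant as its local copy is treated as up to date here; whether the database function uses a strict or non-strict comparison is not stated in the summary.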

Sequence Diagram(s)

sequenceDiagram
    participant Developer
    participant GitHub Actions
    participant Supabase CLI
    participant Supabase DB

    Developer->>GitHub Actions: Push to main / Manual dispatch
    GitHub Actions->>Supabase CLI: Checkout code, setup CLI
    GitHub Actions->>Supabase CLI: Run turbo deploy command
    Supabase CLI->>Supabase DB: Deploy database schema/functions
    Supabase DB-->>GitHub Actions: Deployment status

sequenceDiagram
    participant Client
    participant Supabase DB
    participant PL/pgSQL Function

    Client->>Supabase DB: Call propose_sync_task(target, function, worker, timeout, interval)
    Supabase DB->>PL/pgSQL Function: Execute logic (insert/lock/check/backoff)
    PL/pgSQL Function-->>Supabase DB: Return interval/null (task acquired or wait)
    Supabase DB-->>Client: Response (interval or task ownership)

sequenceDiagram
    participant Client
    participant Supabase DB
    participant PL/pgSQL Function

    Client->>Supabase DB: Call match_content_embeddings(query_embedding, threshold, count, doc_id)
    Supabase DB->>PL/pgSQL Function: Vector similarity search
    PL/pgSQL Function-->>Supabase DB: Return matching content rows
    Supabase DB-->>Client: Embedding search results
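
As a rough illustration of what a thresholded similarity search like the one in the last diagram computes, here is an in-memory sketch. It assumes cosine similarity (pgvector's `<=>` operator returns cosine distance, which is 1 minus this value); the actual SQL body of `match_content_embeddings` is not shown in this thread, and `EmbeddedRow`, `Match`, and `matchEmbeddings` are hypothetical names.

```typescript
// Hypothetical row and result shapes, for illustration only.
interface EmbeddedRow {
  id: number;
  embedding: number[];
}

interface Match {
  id: number;
  similarity: number;
}

// Cosine similarity of two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // treat zero vectors as dissimilar to everything
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep rows above the threshold, most similar first, at most `count` rows.
function matchEmbeddings(
  rows: EmbeddedRow[],
  query: number[],
  threshold: number,
  count: number
): Match[] {
  return rows
    .map((row) => ({ id: row.id, similarity: cosineSimilarity(row.embedding, query) }))
    .filter((match) => match.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, count);
}
```

In the database the same filtering and ordering would be done by the vector index rather than a linear scan; the sketch only mirrors the semantics of the threshold and count parameters.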

Poem

🐇
In the warren of code, new roots now grow,
With schemas and types, our discourse will flow.
Diagrams, docs, and migrations in line,
Supabase ready, the structure is fine!
With scripts and with CI, we’re ready to hop—
This bunny’s database leap is a top-notch crop!
🥕

maparent · 2025-05-19T20:18:37Z

@CodeRabbit

coderabbitai · 2025-05-19T20:19:30Z

🧩 Analysis chain

I see that you've tagged me, but I don't have a specific question or instruction to respond to. Let me gather more information about this PR to better assist you.

🏁 Script executed:

#!/bin/bash
# Get PR details
gh pr view 165 --json title,body,author,baseRefName,headRefName,state

# List changed files in the PR
gh pr view 165 --json files --jq '.files[].path'

# For a quick overview of the PR, get the commit messages
gh pr view 165 --json commits --jq '.commits[].messageHeadline'

Length of output: 2439

@maparent Hi there! I see you've tagged me, but I'm not sure what specific help you're looking for.

I can see this PR is focused on setting up Supabase embedding schema and integration with Turbo/CI/CD. Looking at the changes, you've added:

Database schema definitions and migrations
Supabase configuration
SQL schemas for various entities (account, agent, concept, content, etc.)
GitHub workflow for database deployment
Generated TypeScript types

How can I assist you with this PR? I can:

Review specific files or code sections
Help optimize SQL schema design
Suggest improvements to the migration strategy
Generate documentation
Analyze potential performance issues
Help with TypeScript types or CI/CD workflow

Just let me know what you need!

(_/)
( •_•)
/ >📋

maparent · 2025-05-20T00:09:37Z

@CodeRabbit full review

coderabbitai · 2025-05-20T00:09:41Z

✅ Actions performed

Full review triggered.

maparent · 2025-05-20T00:53:26Z

@CodeRabbit full review

coderabbitai · 2025-05-20T00:53:32Z

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 22

🔭 Outside diff range comments (1)

packages/database/supabase/.gitignore (1)
1-9: ⚠️ Potential issue

Prevent committing base .env files
Currently, only .env.keys, .env.local, and .env.*.local are ignored. The primary .env file (which may contain secrets) isn’t excluded. Add:
+.env
to avoid leaking credentials.

♻️ Duplicate comments (2)

packages/database/supabase/migrations/20250513173724_content_concept_key.sql (2)

47-62: match_content_embeddings function duplicated here.
This logic overlaps with schemas/embedding.sql. Ensure both definitions stay in sync or consolidate into a single migration file.

63-90: match_embeddings_for_subset_nodes duplicated—keep DRY.
Same as above; consider centralizing function creation.

🧹 Nitpick comments (32)

packages/database/supabase/migrations/20250506174523_content_idx_id.sql (1)
1-2: Optimize index creation and naming

To avoid blocking writes, consider using CONCURRENTLY:
- CREATE UNIQUE INDEX "Content_space_and_id" ON "Content" (space_id, source_local_id) WHERE source_local_id IS NOT NULL;
+ CREATE UNIQUE INDEX CONCURRENTLY "Content_space_and_id" ON "Content" (space_id, source_local_id) WHERE source_local_id IS NOT NULL;
For consistency with PostgreSQL conventions, you may opt for snake_case naming (e.g., content_space_and_source_local_id_idx).
CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so ensure your migration runner supports non-transactional migrations, or document the expected locking behavior.
.github/workflows/database-deploy.yaml (1)

3-3: Make the workflow_dispatch mapping explicit.
Although GitHub Actions accepts workflow_dispatch: without a body, adding an explicit empty mapping improves readability and future extensibility.
packages/database/package.json (2)
14-14: Split and simplify the build script.
The current "build" chains many operations with &&, which makes troubleshooting harder. Extract subtasks (lint, type gen, copy, diff) into separate npm scripts and orchestrate them with a runner (e.g., npm-run-all). This will improve maintainability and logs.

19-20: Clarify dbdiff:save usage.
The script is defined as "dbdiff:save": "supabase stop && supabase db diff -f ", but it's unclear how to pass the filename. Typically users run:
npm run dbdiff:save -- <migration_name>.sql
Consider updating the script or documenting this pattern in the README.
packages/database/README.md (3)
1-2: Fix period placement in intro.
Change:
All CLI commands below should be run in this directory (`packages/database`.)
to
All CLI commands below should be run in this directory (`packages/database`).
This moves the period outside the closing parenthesis.

13-13: Correct typo in work session instruction.
Change:
End you work session with `npm run stop`
to:
End your work session with `npm run stop`
This fixes the pronoun.

🧰 Tools

🪛 LanguageTool

[grammar] ~13-~13: There is an agreement error between ‘work’ and ‘session’. Insert ‘a(n)’ or change the noun to plural.
Context: ... deployed while you work. 3. End you work session with npm run stop (alias for `supabas...

(PRP_VB_NN)

18-18: Rephrase for clarity and punctuation.
Instead of:
4. `turbo build`, which will do the following:
use:
Run `turbo build` (the following steps will be executed):
This reads more naturally.

🧰 Tools

🪛 LanguageTool

[uncategorized] ~18-~18: Loose punctuation mark.
Context: ...n placing your file. 4. turbo build, which will do the following: 1. C...

(UNLIKELY_OPENING_PUNCTUATION)
packages/database/example.md (2)
1-2: Update the heading and remove trailing ellipsis.
Rename # example... to a descriptive title, for example:
# Example Dataset for Discourse Graph
This improves clarity for readers.

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

1-1: Trailing punctuation in heading
Punctuation: '...'

(MD026, no-trailing-punctuation)

8-11: Fix Markdown list indentation.
Nested list items under (dgpgid) roam/js/discourse-graph should indent consistently (4 spaces per level). For example:
-* (dgpgid) roam/js/discourse-graph
-  * (et1bkid) Opposes
+* (dgpgid) roam/js/discourse-graph
+    * (et1bkid) Opposes
This ensures proper rendering in Markdown.

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

9-9: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)

10-10: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)

11-11: Inconsistent indentation for list items at the same level
Expected: 6; Actual: 4

(MD005, list-indent)
packages/database/supabase/schemas/space.sql (1)
34-39: Add index on discourse_platform_id for performance
Foreign key constraints don’t auto-index referencing columns. For faster lookups:
+CREATE INDEX ON public."DiscourseSpace"(discourse_platform_id);
packages/database/supabase/schemas/contributor.sql (1)

1-4: Inconsistent table naming convention
The join tables content_contributors and concept_contributors use snake_case unquoted, while other tables follow PascalCase with double quotes. For consistency, consider renaming or quoting them (e.g., public."ContentContributors").

Also applies to: 24-27
packages/database/supabase/schemas/account.sql (1)
17-24: Add indexes on foreign key columns in Account
Neither person_id nor platform_id is indexed, which can degrade join performance. Consider:
+CREATE INDEX ON public."Account"(person_id);
+CREATE INDEX ON public."Account"(platform_id);
packages/database/supabase/config.toml (6)

20-23: Optional: Enable HTTPS for local API.
You’ve disabled TLS locally (enabled = false). For scenarios testing auth flows or secure cookies, consider toggling this on and providing dev certificates.

24-32: Database and pooler settings are appropriate.
Pinning major_version = 15 aligns with your production DB. You’ve disabled the pooler; if concurrent connections grow, enabling it could improve performance.

Also applies to: 33-44

48-63: Migrations ordering is explicit but verbose.
You list each schema file individually. Consider using a glob pattern (e.g., "./schemas/*.sql") to simplify maintenance as new schemas are added.

226-233: Twilio secret in ENV, good—but check variable naming.
You use SUPABASE_AUTH_SMS_TWILIO_AUTH_TOKEN; ensure this matches your CI/CD secret and documentation.

272-294: Third-party auth blocks are placeholders.
Good that all are disabled by default. Remove unused stubs once providers go live to reduce config noise.

308-313: Analytics backend set to Postgres.
Ensure you have required retention and cleanup policies in place to manage table growth.

packages/database/supabase/schemas/embedding.sql (1)

21-29: Privileges granted broadly.
GRANT ALL to anon and authenticated is convenient for local dev but violates least privilege. Consider SELECT/INSERT/UPDATE only as needed.

packages/database/supabase/schemas/content.sql (2)

107-121: Indexing strategy: B-tree + full-text.
You’ve added GIN/Btree for ID lookups and a pgroonga index on text. Confirm pgroonga extension is loaded in migrations. Consider adding trigram or GIN on metadata if querying JSON.

142-149: Broad grants on tables.
As with embeddings, limit grants to necessary operations in production.

packages/database/types.gen.ts (1)

572-580: match_embeddings_for_subset_nodes signature aligns.
Good to see the string[] Roam UID mapping. Consider overloading with numeric IDs for efficiency.

packages/database/supabase/migrations/20250513173724_content_concept_key.sql (1)

25-28: Data migration uses represents_id.
Double-check that all rows in Content had represents_id set to avoid nulls. You may want a quick validation query before dropping the old column.

packages/database/supabase/migrations/20250517154122_plpgsql_linting.sql (3)

11-13: Handle missing sync_info entries gracefully
Using SELECT ... INTO STRICT will raise an exception if no matching row exists in sync_info. Consider adding a controlled exception block or an explicit check to provide clearer diagnostic messages or fallback behavior when no record is found.

57-57: Remove or translate non-English comment
The comment -- zut il renvoie null... is in French (roughly "darn, it returns null...") and unclear to non-French readers. Replace it with a clear English explanation or remove it to maintain consistency.

64-92: Consider using transaction-scoped advisory locks
Switch from session-level locks (pg_advisory_lock/unlock) to transaction-scoped locks (pg_advisory_xact_lock) to ensure locks are automatically released at the end of the transaction, reducing the risk of leaks on errors.
packages/database/supabase/migrations/20250512142307_sync_table.sql (2)
38-38: Remove or translate non-English comment
The comment -- zut il renvoie null... is in French (roughly "darn, it returns null...") and not descriptive. Replace it with an English comment or remove it.

45-46: Use transaction-scoped advisory locks
Replace the pair:
PERFORM pg_advisory_lock(s_id);
...
PERFORM pg_advisory_unlock(s_id);
with:
PERFORM pg_advisory_xact_lock(s_id);
to have locks automatically released at transaction end.

Also applies to: 91-92
packages/database/supabase/schemas/sync.sql (1)

95-157: Avoid duplication of function definitions
The PL/pgSQL functions propose_sync_task and end_sync_task appear both here and in the migration scripts. Maintaining two sources risks drift. Consider centralizing these definitions or generating them in one place to ensure consistency.

packages/database/supabase/schemas/concept.sql (1)

58-66: Standardize index naming conventions
The index "Concept_content" is quoted and in CamelCase. For consistency with the rest of the schema, consider using unquoted snake_case, e.g., idx_concept_content_jsonb.
packages/database/schema.puml (1)
80-91: Remove (i) suffix from inherited fields in ConceptSchema.
The (i) notation clutters the diagram—use inheritance arrows instead of inline markers.
-class "ConceptSchema" [[{None}]] {
-    {field} id(i) : integer  
-    {field} epistemic_status(i) : EpistemicStatus  
-    ...
-}
+class "ConceptSchema" [[{None}]] {
+    {field} id : integer  
+    {field} epistemic_status : EpistemicStatus  
+    ...
+}
packages/database/supabase/migrations/20250504202930_content_tables.sql (1)

186-193: Unify table naming conventions.
concept_contributors is lowercase and unquoted, whereas other tables use quoted PascalCase. Consider standardizing to avoid confusion.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b24fd08 and d8bf318.

⛔ Files ignored due to path filters (1)

packages/database/schema.svg is excluded by !**/*.svg

📒 Files selected for processing (27)

.github/workflows/database-deploy.yaml (1 hunks)
packages/database/.sqruff (1 hunks)
packages/database/README.md (1 hunks)
packages/database/example.md (1 hunks)
packages/database/package.json (1 hunks)
packages/database/schema.puml (1 hunks)
packages/database/schema.yaml (1 hunks)
packages/database/supabase/.gitignore (1 hunks)
packages/database/supabase/config.toml (1 hunks)
packages/database/supabase/migrations/20250504195841_remote_schema.sql (1 hunks)
packages/database/supabase/migrations/20250504202930_content_tables.sql (1 hunks)
packages/database/supabase/migrations/20250506174523_content_idx_id.sql (1 hunks)
packages/database/supabase/migrations/20250512142307_sync_table.sql (1 hunks)
packages/database/supabase/migrations/20250513173724_content_concept_key.sql (1 hunks)
packages/database/supabase/migrations/20250517154122_plpgsql_linting.sql (1 hunks)
packages/database/supabase/schemas/account.sql (1 hunks)
packages/database/supabase/schemas/agent.sql (1 hunks)
packages/database/supabase/schemas/base.sql (1 hunks)
packages/database/supabase/schemas/concept.sql (1 hunks)
packages/database/supabase/schemas/content.sql (1 hunks)
packages/database/supabase/schemas/contributor.sql (1 hunks)
packages/database/supabase/schemas/embedding.sql (1 hunks)
packages/database/supabase/schemas/extensions.sql (1 hunks)
packages/database/supabase/schemas/space.sql (1 hunks)
packages/database/supabase/schemas/sync.sql (1 hunks)
packages/database/types.gen.ts (1 hunks)
turbo.json (1 hunks)

🧰 Additional context used

🪛 LanguageTool

packages/database/README.md

[grammar] ~13-~13: There is an agreement error between ‘work’ and ‘session’. Insert ‘a(n)’ or change the noun to plural.
Context: ... deployed while you work. 3. End you work session with npm run stop (alias for `supabas...

(PRP_VB_NN)

[style] ~16-~16: Consider shortening or rephrasing this to strengthen your wording.
Context: ...u're working on a feature branch. 2. Make changes to the schema, by editing files in `projec...

(MAKE_CHANGES)

[uncategorized] ~18-~18: Loose punctuation mark.
Context: ...n placing your file. 4. turbo build, which will do the following: 1. C...

(UNLIKELY_OPENING_PUNCTUATION)

🔇 Additional comments (60)

packages/database/.sqruff (1)

1-8: SQL lint configuration looks solid
The PostgreSQL dialect is specified, unwanted rules (CP05, LT05) are excluded, and indentation settings are clear. This aligns well with your schema management workflow.

turbo.json (1)

35-38: Supabase environment variables integrated into Turbo deploy
The new SUPABASE_PROJECT_ID, SUPABASE_DB_PASSWORD, and SUPABASE_ACCESS_TOKEN entries look correct. Please verify that these names match the secrets defined in your GitHub Actions workflow (.github/workflows/database-deploy.yaml).
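
One way to catch a name mismatch early is to have the deploy script fail fast when any of the three variables is absent. The following is a minimal sketch under that assumption; `missingSupabaseVars` is a hypothetical helper and is not taken from `scripts/deploy.ts`, whose contents are not shown here.

```typescript
// The three variable names listed in the comment above.
const REQUIRED_SUPABASE_VARS = [
  "SUPABASE_PROJECT_ID",
  "SUPABASE_DB_PASSWORD",
  "SUPABASE_ACCESS_TOKEN",
] as const;

// Returns the names of required variables that are unset or blank.
// The environment is passed in explicitly so the check is easy to test.
function missingSupabaseVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_SUPABASE_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

A deploy script could call this with its process environment and abort with the list of missing names before invoking the Supabase CLI.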
packages/database/supabase/schemas/base.sql (1)

17-17: Verify publication existence or guard the statement
The statement
ALTER PUBLICATION supabase_realtime OWNER TO postgres;
will error if the supabase_realtime publication isn’t defined beforehand. ALTER PUBLICATION has no IF EXISTS clause, so either ensure the publication is created in an earlier migration or guard the statement to make it idempotent:
-ALTER PUBLICATION supabase_realtime OWNER TO postgres;
+DO $$
+BEGIN
+    IF EXISTS (SELECT 1 FROM pg_publication WHERE pubname = 'supabase_realtime') THEN
+        ALTER PUBLICATION supabase_realtime OWNER TO postgres;
+    END IF;
+END
+$$;
packages/database/supabase/schemas/space.sql (1)

18-18: Consider enforcing NOT NULL on url
The url column in DiscourseSpace is nullable. If every space must have a URL, change it to:
url character varying NOT NULL,
to maintain data integrity.
packages/database/supabase/schemas/account.sql (1)

36-37: Ensure space_id is NOT NULL if required
space_id in SpaceAccess is currently nullable. If every access entry must reference a valid space, declare it as space_id bigint NOT NULL.

packages/database/supabase/config.toml (10)

1-6: Project Identifier is explicit and clear.
Using project_id = "discourse-graphs" matches the repo name, ensuring isolation of Supabase artifacts.

7-19: API configuration looks solid.
Exposing public and graphql_public schemas with a sensible max_rows limit guards against large payloads. Consider bumping max_rows if your use case evolves.

64-70: Caution: Automatic seeding enabled.
enabled = true will run seed scripts on every reset, which can slow CI runs or leak test data. Verify this behavior is desired in CI/CD workflows.

78-86: Studio configuration is correct.
You’ve wired up openai_api_key via ENV, which is secure. Make sure the key has only necessary scopes.

147-160: Rate limits look reasonable.
Throttling email, SMS, and anonymous sign-in should help mitigate abuse.

167-183: Email auth settings are comprehensive.
Requiring double confirmations and OTP expiry ensures strong security.

199-207: SMS auth disabled by default.
That's consistent with your policies. If SMS sign-up is ever required, revisit enable_signup.

234-242: MFA and phone MFA tuned.
TOTP-based MFA is enabled and phone-based MFA is disabled, matching a balanced security model.

Also applies to: 243-250

257-271: OAuth providers scaffolded correctly.
You’ve stubbed Apple OAuth; ensure credentials are provisioned before enabling.

295-304: Edge runtime enabled for Deno.
Great for edge functions; pinning deno_version = 1 is safe. Verify compatibility when upgrading.

packages/database/supabase/schemas/embedding.sql (2)

1-8: Enum EmbeddingName is comprehensive.
All targeted models are listed. Naming is clear and consistent with Supabase vector conventions.

12-18: Embedding table key design: single-column PK.
Using target_id as the primary key enforces one embedding per content. If you later support multiple models per content, you may need a composite PK.

packages/database/supabase/schemas/content.sql (3)

1-12: Scale enum covers all granularity levels.
This matches your LinkML model and provides clear content hierarchies.

16-28: Document table schema is sound.
Using an OID for raw content is appropriate for large blobs. Ensure clients correctly fetch large objects.

60-75: Content table supports nesting and metadata.
Foreign keys for creator_id vs. author_id and self‐referencing part_of_id are well defined.

packages/database/types.gen.ts (3)

1-8: Json type is correctly recursive.
This covers all JSON structures you’ll encounter.

316-325: Enum constants mirror DB definitions.
Great for static validation. Keep this in sync with schema SQL.

558-571: Function match_content_embeddings Args/Returns mismatch.
The TS signature omits current_document_id from the return but includes it as optional arg—OK. However, ensure the type of query_embedding (string) matches actual binary/vector type; client libs may prefer number[].
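A small sketch may help with the query_embedding point. pgvector accepts a vector as a text literal of the form [1,2,3], which is probably why the generated type is string. The helper name toVectorLiteral is hypothetical and not part of this PR; it only shows one way a client holding a number[] could produce that string argument.

```typescript
// Hypothetical helper (not in the PR): serialize an embedding to the
// pgvector text form "[x1,x2,...]" for arguments typed as string.
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0) {
    throw new Error("embedding must not be empty");
  }
  for (const value of embedding) {
    // Assumption: non-finite components are not valid in a vector literal.
    if (!Number.isFinite(value)) {
      throw new Error(`invalid embedding component: ${value}`);
    }
  }
  return `[${embedding.join(",")}]`;
}

const literal = toVectorLiteral([0.25, -1, 3]);
console.log(literal); // prints "[0.25,-1,3]"
```

Keeping the conversion in one place means callers can work with number[] everywhere else and only serialize at the RPC boundary.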

packages/database/supabase/migrations/20250513173724_content_concept_key.sql (6)

3-6: Constraint rename is precise.
Updating AutomatedAgent to reference Agent(id) cascades correctly.

9-14: Dropping last_synced columns—confirm no dependencies.
Ensure any downstream logic or views aren’t broken. Check for leftover references in sync scripts.

17-22: represented_by_id relationship refined.
Moving representation logic to Concept and adding a unique index is a clean data model improvement.

31-36: Dropping old constraint and column is straightforward.
Looks correct.

39-46: Disabling function body checks for <=> is necessary.
You noted the vector operator issue; this is the right workaround.

91-92: Re-enabling function body checks.
Restoring check_function_bodies is correct hygiene after DDL.

packages/database/supabase/migrations/20250517154122_plpgsql_linting.sql (1)

14-16: Verify enum ordering for task_status
The comparison ASSERT s_status > 'active' depends on the internal ordering of the task_status enum. Please confirm that 'active' is defined first so that all subsequent statuses (timeout, failed, complete) correctly satisfy this check.
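PostgreSQL compares enum values by their declaration order, so the check above only holds if 'active' is declared first. The sketch below mirrors that comparison in TypeScript. The order used here is an assumption (taken from the status list quoted elsewhere in this review) and should be verified against the actual CREATE TYPE statement.

```typescript
// Assumed declaration order of task_status; verify against the real enum.
const TASK_STATUS_ORDER = ["active", "timeout", "complete", "failed"] as const;
type TaskStatus = (typeof TASK_STATUS_ORDER)[number];

// Mirrors PostgreSQL enum comparison: a > b when a is declared after b.
function isAfter(a: TaskStatus, b: TaskStatus): boolean {
  return TASK_STATUS_ORDER.indexOf(a) > TASK_STATUS_ORDER.indexOf(b);
}

// Every non-active status should compare greater than 'active'.
const nonActive: TaskStatus[] = ["timeout", "complete", "failed"];
console.log(nonActive.every((status) => isAfter(status, "active")));
```

If 'active' were moved later in the enum, the same check would start returning false for some statuses, which is the failure mode the comment warns about.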

packages/database/supabase/schemas/concept.sql (1)

16-32: Ensure sequence usage privileges for entity_id_seq
The Concept.id column relies on nextval('public.entity_id_seq'), but this migration does not grant USAGE on that sequence. Verify that the roles inserting into Concept have permission, or add GRANT USAGE ON SEQUENCE public.entity_id_seq TO ....

packages/database/schema.puml (13)

8-12: UML Account class is well-defined.
Fields map correctly to the SQL table, and relationships to Person and DiscoursePlatform are drawn below.

13-17: UML DiscourseSpace class matches the SQL schema.
Fields id, url, and name are correct.

18-19: Verify optional space cardinality for SpaceAccess.
The diagram shows 0..1 for the space link, which matches the nullable space_id in SQL. Confirm whether entries without a space are intended, or whether space_id should be NOT NULL.

20-29: UML Document class aligns with the migration.
All fields match the SQL definition (including the contents blob mapped from OID).

30-33: Abstract Agent is correctly modeled.
Fields id and type mirror the base table.

34-43: UML Content class is accurate.
Attributes correspond to the SQL Content table.

44-47: Document–Agent and Document–Space relationships look good.
Cardinalities reflect the DB constraints.

48-52: UML DiscoursePlatform class matches the SQL schema.
Fields id, name, and url are correct.

53-60: UML Concept class fields are consistent.
All properties align with the SQL Concept table.

61-67: Concept and Content space relationships are correct.
The 0..1 and 1 cardinalities match the migration.

74-79: Embedding and content relationships are correct.
Links to Content, Concept, and Agent use the proper cardinalities.

96-102: UML Person subclass is accurate.
Fields beyond the inherited ones map cleanly to the SQL Person table.

110-112: Agent inheritance and account→agent link are correct.
Subclasses and associations reflect the DB constraints.
packages/database/supabase/migrations/20250504202930_content_tables.sql (16)

2-8: Sequence creation is correct.
The entity_id_seq with BIGINT and caching aligns with global ID requirements.

11-11: EntityType enum looks complete.
All intended entity variants are included.

13-13: Scale enum covers content granularity.
Values match the design doc.

15-15: EmbeddingName enum is correct.
All supported embedding identifiers are listed.

17-17: EpistemicStatus enum matches requirements.
Statuses are exhaustive.

19-23: Agent table aligned with LinkML and UML.
Basic PK and type field are correct.

26-34: Person table FK to Agent is correct.
Constraints and metadata align.

49-55: DiscoursePlatform definition is solid.
PK, name, and URL enforced as NOT NULL.

57-66: Account table correctly links platform and person.
FKs, permissions, and active flag look good.

69-76: DiscourseSpace table matches UML.
FK to DiscoursePlatform is present and cascades as expected.

79-87: SpaceAccess table definition is correct.
Unique constraint, FKs, and editor flag align with design.

92-105: Document table and comments are accurate.
All metadata fields, OID type, and author FK are in place.

114-135: Concept table, indexes, and comments look good.
Referential behavior and GIN index on JSONB are appropriate.

145-166: Content table, indexes, and comments align with requirements.
Nested content, search index, and FK cascades are correct.

195-203: Content embedding table is correctly defined.
Model default, vector type, and PK/FK align.

205-211: ⚠️ Potential issue

Missing semicolon at end of content_contributors.
The final CREATE TABLE lacks a terminating ;, which will cause a syntax error.
 CREATE TABLE content_contributors (
     content_id BIGINT,
     contributor_id BIGINT,
     PRIMARY KEY (content_id, contributor_id),
     FOREIGN KEY(content_id) REFERENCES "Content" (id) ON DELETE CASCADE ON UPDATE CASCADE,
     FOREIGN KEY(contributor_id) REFERENCES "Agent" (id) ON DELETE CASCADE ON UPDATE CASCADE
-)
+);
Likely an incorrect or invalid review comment.

packages/database/supabase/migrations/20250504195841_remote_schema.sql

.github/workflows/database-deploy.yaml

packages/database/supabase/schemas/extensions.sql

packages/database/README.md

packages/database/supabase/schemas/base.sql

packages/database/schema.puml

packages/database/supabase/migrations/20250504202930_content_tables.sql

mdroidian · 2025-05-20T23:26:33Z

This is quite large, so I think it'll be best to walk through together at our next meeting.

mdroidian

🔥
Just some small things we discussed and only the sqruff to eslint is blocking.
Approving so that when the change is done you can merge. Ping me if you'd like a double check.

mdroidian · 2025-05-22T04:33:40Z

packages/database/supabase/schemas/space.sql

+ALTER TABLE ONLY public."DiscoursePlatform"
+ADD CONSTRAINT "DiscoursePlatform_pkey" PRIMARY KEY (id);
+
+ALTER TABLE ONLY public."DiscourseSpace"


@maparent renaming this to just Space (and Platform) instead of DiscourseSpace keeps nagging at me. "Discourse" feels redundant. Thoughts? Or is it named this way for a specific purpose (e.g., to avoid a clash of some sort)?

No anticipated conflict, and no objection. I'll do that rename later today.

maparent · 2025-05-22T14:02:55Z

🔥 Just some small things we discussed and only the sqruff to eslint is blocking. Approving so that when the change is done you can merge. Ping me if you'd like a double check.

Thank you. Reminder that the deploy mechanics also have to be made specific to main.

maparent · 2025-05-22T22:49:35Z

@CodeRabbit review

coderabbitai · 2025-05-22T22:49:39Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 9

♻️ Duplicate comments (2)

.github/workflows/database-deploy.yaml (1)
16-18: 🛠️ Refactor suggestion

Update actions/setup-node to v4

The current workflow uses actions/setup-node@v3, which is outdated according to the static analysis. Update to v4 for better compatibility with GitHub Actions runners.
-      - uses: actions/setup-node@v3
+      - uses: actions/setup-node@v4
        with:
          node-version: "20"
🧰 Tools

🪛 actionlint (1.7.7)

16-16: the runner of "actions/setup-node@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)
packages/database/supabase/schemas/sync.sql (1)

207-258: Restrict privileges on sync_info table and related functions

Granting ALL on the sync_info table, its sequence, and these functions to the anon and authenticated roles allows any user to propose or end sync tasks arbitrarily. This poses a security risk.

🧹 Nitpick comments (24)

packages/database/supabase/migrations/20250520133551_nodes_needing_sync.sql (3)
13-14: Ensure proper JSONB field existence validation

The code currently assumes that each node in the array contains 'uid' and 'roam_edit_time' fields. Consider adding validation to handle cases where these fields might be missing or null.
-            roam_node_uid := (node_info->>'uid')::text;
-            roam_node_edit_epoch_ms := (node_info->>'roam_edit_time')::bigint;
+            -- Skip nodes with missing uid or edit time
+            IF node_info->>'uid' IS NULL OR node_info->>'roam_edit_time' IS NULL THEN
+                CONTINUE;
+            END IF;
+            
+            roam_node_uid := (node_info->>'uid')::text;
+            roam_node_edit_epoch_ms := (node_info->>'roam_edit_time')::bigint;
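The same guard could also be applied on the client before the payload reaches the SQL function. The sketch below is a hypothetical TypeScript counterpart, not code from this PR; the RawNode shape is assumed from the two field names mentioned above.

```typescript
// Shape assumed from the review comment: each node carries 'uid' and 'roam_edit_time'.
interface RawNode {
  uid?: string | null;
  roam_edit_time?: number | string | null;
}

interface ValidNode {
  uid: string;
  editEpochMs: number;
}

// Drop nodes with a missing uid or an unusable edit time.
function validNodes(nodes: RawNode[]): ValidNode[] {
  const result: ValidNode[] = [];
  for (const node of nodes) {
    if (node.uid == null || node.roam_edit_time == null) continue;
    // An empty string would convert to 0 in JS but fail a ::bigint cast in SQL.
    if (typeof node.roam_edit_time === "string" && node.roam_edit_time.trim() === "") continue;
    const editEpochMs = Number(node.roam_edit_time);
    // Non-numeric or fractional values would not cast to bigint either.
    if (!Number.isInteger(editEpochMs)) continue;
    result.push({ uid: node.uid, editEpochMs });
  }
  return result;
}

console.log(validNodes([{ uid: "a", roam_edit_time: "1716" }, { uid: "b" }]));
```

Filtering on both sides is redundant by design: the SQL check protects the database from any caller, while the client check avoids sending rows that would be skipped anyway.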
20-21: Consider adding an index for performance

If this function will be run frequently with large datasets, consider adding an index on the source_local_id column in the Content table to improve query performance.

28-33: Consider logging for nodes requiring sync

For debugging and audit purposes, consider adding logging for nodes that need synchronization, especially when handling large datasets.
.github/workflows/database-deploy.yaml (3)
10-13: Add environment variable validation

Consider adding a step to validate that all required environment variables are set before proceeding with the deployment. This can help catch configuration issues early in the workflow.
    env:
      SUPABASE_ACCESS_TOKEN: ${{ secrets.SUPABASE_ACCESS_TOKEN }}
      SUPABASE_PROJECT_ID: ${{ secrets.SUPABASE_PROJECT_ID_PROD }}
      SUPABASE_DB_PASSWORD: ${{ secrets.SUPABASE_DB_PASSWORD_PROD }}
+    steps:
+      - name: Validate environment variables
+        run: |
+          if [ -z "$SUPABASE_ACCESS_TOKEN" ] || [ -z "$SUPABASE_PROJECT_ID" ] || [ -z "$SUPABASE_DB_PASSWORD" ]; then
+            echo "Missing required environment variables"
+            exit 1
+          fi
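Since the deploy script is already TypeScript, the same validation could live there instead of (or in addition to) a shell step. This is a minimal sketch under that assumption; missingEnvVars is a hypothetical helper, and only the variable names come from the workflow.

```typescript
// Names taken from the workflow's env block.
const REQUIRED_VARS = [
  "SUPABASE_ACCESS_TOKEN",
  "SUPABASE_PROJECT_ID",
  "SUPABASE_DB_PASSWORD",
] as const;

// Returns the names of required variables that are unset or empty.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_VARS,
): string[] {
  return required.filter((name) => !env[name]);
}

// In a script this would be called with process.env; a literal is used here
// so the example runs anywhere.
const missing = missingEnvVars({ SUPABASE_PROJECT_ID: "abc", SUPABASE_DB_PASSWORD: "" });
console.log(missing); // prints [ 'SUPABASE_ACCESS_TOKEN', 'SUPABASE_DB_PASSWORD' ]
```

Reporting all missing names at once, rather than exiting on the first, saves a round trip when several secrets are misconfigured.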
19-23: Add dependency caching

Adding a caching step for Node.js dependencies can significantly improve workflow run times on repeated executions.
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
+         cache: 'npm'
      - run: npm ci
      - uses: supabase/setup-cli@v1
4-6: Add path filtering to the workflow trigger

Consider adding path filters to the push trigger to only run this workflow when relevant database files change.
  push:
    branches:
      - main
+    paths:
+      - 'packages/database/**'
+      - '.github/workflows/database-deploy.yaml'
packages/database/scripts/lint.ts (3)
11-12: Add support for additional CLI arguments

The script only supports a -f flag for fixing issues, but it would be more flexible to support passing additional arguments to sqruff.
-      const command =
-        process.argv.length == 3 && process.argv[2] == "-f" ? "fix" : "lint";
+      // Default to lint, use fix if -f flag is present
+      let command = "lint";
+      let additionalArgs = "";
+      
+      // Process command line arguments
+      for (let i = 2; i < process.argv.length; i++) {
+        if (process.argv[i] === "-f") {
+          command = "fix";
+        } else {
+          // Pass through other arguments to sqruff
+          additionalArgs += " " + process.argv[i];
+        }
+      }
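A variant of this idea keeps the pass-through arguments as an array instead of concatenating them into a command string. The sketch below is hypothetical (parseLintArgs is not in the PR) and assumes the result would be handed to an API that takes an argument array, such as execFile or spawn, which avoids shell quoting issues.

```typescript
interface LintArgs {
  command: "lint" | "fix";
  passthrough: string[];
}

// argv is expected without the node binary and script path,
// i.e. process.argv.slice(2).
function parseLintArgs(argv: string[]): LintArgs {
  // Everything except -f is forwarded untouched.
  const passthrough = argv.filter((arg) => arg !== "-f");
  // If filtering removed something, -f was present.
  const command = passthrough.length < argv.length ? "fix" : "lint";
  return { command, passthrough };
}

console.log(parseLintArgs(["-f", "--verbose"]));
```

Separating parsing from execution also makes the flag handling easy to test without running sqruff.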
13-13: Make the target directory configurable

The script hardcodes the supabase/schemas directory. Consider making this configurable via an environment variable or command-line argument.
-      exec(`sqruff ${command} supabase/schemas`, {}, (err, stdout, stderr) => {
+      // Use directory from environment variable or default to supabase/schemas
+      const targetDir = process.env.SQL_SCHEMAS_DIR || "supabase/schemas";
+      exec(`sqruff ${command} ${targetDir}${additionalArgs}`, {}, (err, stdout, stderr) => {
13-17: Use promises for better error handling

The nested callback structure makes error handling more complex. Consider using promisified versions of exec or spawn for better control flow.
-      exec(`sqruff ${command} supabase/schemas`, {}, (err, stdout, stderr) => {
-        console.log(`${stdout}`);
-        console.log(`${stderr}`);
-        process.exit(err ? err.code : 0);
-      });
+      // Use promisified exec for better error handling
+      // (the import below belongs at the top of the file, next to the other imports)
+      import { promisify } from "node:util";
+      const execPromise = promisify(exec);
+      
+      execPromise(`sqruff ${command} supabase/schemas`)
+        .then(({ stdout, stderr }) => {
+          if (stdout) console.log(stdout);
+          if (stderr) console.log(stderr);
+          process.exit(0);
+        })
+        .catch((err) => {
+          console.error("Linting failed:", err.message);
+          process.exit(err.code || 1);
+        });
packages/database/scripts/deploy.ts (3)
41-53: Add documentation for the -f flag

The script deploys functions when the -f flag is provided, but there's no documentation explaining this. Add a comment or help message explaining the purpose of this flag.
+            // Check if we should also deploy functions
             if (process.argv.length == 3 && process.argv[2] == "-f") {
-              // Also push functions
+              console.log("Deploying Supabase functions as requested with -f flag");
               exec(
6-62: Refactor to use async/await for better readability

The nested callback structure creates a "callback pyramid" that's hard to read and maintain. Consider refactoring using async/await for a more linear and readable flow.

Here's a partial example of how this could be improved:
import { promisify } from "node:util";
import { exec as execCallback } from "node:child_process";
import dotenv from "dotenv";

const exec = promisify(execCallback);
dotenv.config();

const main = async () => {
  try {
    // Check if we're on the main branch
    const { stdout: gitStatus } = await exec("git status -s -b");
    const branchMatch = gitStatus.match(/## ([^\.]+)/);
    if (!branchMatch || branchMatch[1] !== "main") {
      console.log("Not on main branch, not deploying database");
      return;
    }

    // Verify environment variables
    const { SUPABASE_PROJECT_ID, SUPABASE_DB_PASSWORD } = process.env;
    if (!SUPABASE_PROJECT_ID) {
      console.log("Please define SUPABASE_PROJECT_ID");
      process.exit(1);
    }
    if (!SUPABASE_DB_PASSWORD) {
      console.log("Please define SUPABASE_DB_PASSWORD");
      process.exit(1);
    }

    // Link Supabase project
    try {
      const { stdout, stderr } = await exec(
        `supabase link --project-ref ${SUPABASE_PROJECT_ID}`,
        { env: { ...process.env, SUPABASE_DB_PASSWORD } }
      );
      console.log(stdout);
      console.error(stderr);
    } catch (err) {
      console.error("Failed to link Supabase project");
      process.exit(1);
    }

    // Push database changes
    try {
      const { stdout, stderr } = await exec("supabase db push");
      console.log(stdout);
      console.error(stderr);
    } catch (err) {
      console.error("Failed to push database changes");
      process.exit(1);
    }

    // Deploy functions if requested
    if (process.argv.length == 3 && process.argv[2] == "-f") {
      console.log("Deploying Supabase functions as requested with -f flag");
      try {
        const { stdout, stderr } = await exec(
          `supabase functions deploy --project-ref ${SUPABASE_PROJECT_ID}`
        );
        console.log(stdout);
        console.error(stderr);
      } catch (err) {
        console.error("Failed to deploy Supabase functions");
        process.exit(1);
      }
    }
  } catch (error) {
    console.error("error:", error);
    process.exit(1);
  }
};
3-5: Add TypeScript types for better code quality

Consider adding TypeScript types for the imported modules and function parameters/returns to improve code quality and maintainability.
 import { exec } from "node:child_process";
 import dotenv from "dotenv";
+import { ExecException } from "child_process";

 dotenv.config();

-const main = () => {
+const main = (): void => {
packages/database/README.md (4)
7-7: Clarify TODO for group access token.

The instruction on line 7 includes a TODO: Create a group access token. Consider tracking this in your issue tracker or linking to a specific task, since team members will need correct credentials for supabase login.

13-13: Fix grammar in step description.

Change “End you work session” to “End your work session”; the possessive “your” is needed here:
- 3. End you work session with `npm run stop`
+ 3. End your work session with `npm run stop`
🧰 Tools

🪛 LanguageTool

[uncategorized] ~13-~13: “you” seems less likely than “your” (belonging to you).
Context: ... are deployed while you work. 3. End you work session with npm run stop (alias...

(AI_HYDRA_LEO_CP_YOU_YOUR)

[grammar] ~13-~13: There is an agreement error between ‘work’ and ‘session’. Insert ‘a(n)’ or change the noun to plural.
Context: ... deployed while you work. 3. End you work session with npm run stop (alias for `supabas...

(PRP_VB_NN)

16-17: Tighten phrasing by removing unnecessary comma.

The phrase “Make changes to the schema, by editing files…” can be streamlined:
- 2. Make changes to the schema, by editing files in `packages/database/supabase/schemas`
+ 2. Make changes to the schema by editing files in `packages/database/supabase/schemas`
🧰 Tools

🪛 LanguageTool

[style] ~16-~16: Consider shortening or rephrasing this to strengthen your wording.
Context: ...u're working on a feature branch. 2. Make changes to the schema, by editing files in `packag...

(MAKE_CHANGES)

25-26: Normalize list numbering under “Development.”

Steps 10 and 11 follow after step 6, which can confuse readers. Consider using nested numbering (e.g., 3.7, 3.8) or rely on Markdown’s auto-numbering for subitems to maintain a clear sequence.
packages/database/example.md (2)
1-1: Remove trailing punctuation in heading.

Change # example... to a more conventional # Example (capitalized and without ellipsis) to adhere to Markdown style guidelines.

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

1-1: Trailing punctuation in heading
Punctuation: '...'

(MD026, no-trailing-punctuation)

8-17: Normalize list indentation.

Nested bullet points are inconsistently indented (some use 4 spaces, others 6). Align subitems using two spaces per level, for example:
-   * (et1bkid) Opposes
+    * (et1bkid) Opposes
Repeat for all nested entries under this block.

🧰 Tools

🪛 markdownlint-cli2 (0.17.2)

9-9: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)

10-10: Unordered list indentation
Expected: 4; Actual: 6

(MD007, ul-indent)

11-11: Inconsistent indentation for list items at the same level
Expected: 6; Actual: 4

(MD005, list-indent)
packages/database/supabase/migrations/20250520132747_restrict_search_by_document.sql (1)
21-22: Clarify or remove subjective comment.

The remark
-- Supabase wants to replace this function for no obvious reason. Letting it.
may be unclear to future maintainers. Either explain the context or remove the comment to keep the migration focused.
packages/database/supabase/schemas/concept.sql (1)
55-57: Refine comments for concept timestamps.

The comments on created and last_modified refer to “content.” Since this is the Concept table, consider adjusting to:
COMMENT ON COLUMN public."Concept".created IS 'Time when the concept was created in the remote source';
COMMENT ON COLUMN public."Concept".last_modified IS 'Time when the concept was last modified in the remote source';
packages/database/supabase/schemas/sync.sql (1)
121-121: Remove French debug comment

There's a French comment "zut il renvoie null..." that appears to be debugging-related. Consider removing it or translating to English for consistency.
-    -- zut il renvoie null...
+    -- Note: returns null for new rows
packages/database/schema.yaml (3)

122-122: Consider implementing the TODO for known skills

There's a TODO comment about implementing "known skills" to track what processes agents can confirm. Consider addressing this in a future update.

Would you like me to generate a draft implementation for the agent skills modeling?

310-338: Review commented Reference-related classes for future implementation

The commented-out Reference, ContentLink, and Occurrence classes appear to be planned for future implementation. Consider creating a ticket to track the implementation of these classes if they're still relevant to your roadmap.

Would you like me to create a draft implementation for these reference-related classes based on the comments?

296-306: Consider converting inline comments to proper documentation

The inline comments on lines 296-306 contain valuable design considerations about concepts. Consider converting these into proper documentation (perhaps in a separate markdown file) to preserve this knowledge for future developers.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8bf318 and 489a1d5.

⛔ Files ignored due to path filters (1)

packages/database/schema.svg is excluded by !**/*.svg

📒 Files selected for processing (32)

.github/workflows/database-deploy.yaml (1 hunks)
packages/database/.sqruff (1 hunks)
packages/database/README.md (1 hunks)
packages/database/example.md (1 hunks)
packages/database/package.json (1 hunks)
packages/database/schema.puml (1 hunks)
packages/database/schema.yaml (1 hunks)
packages/database/scripts/deploy.ts (1 hunks)
packages/database/scripts/lint.ts (1 hunks)
packages/database/supabase/.gitignore (1 hunks)
packages/database/supabase/config.toml (1 hunks)
packages/database/supabase/migrations/20250504195841_remote_schema.sql (1 hunks)
packages/database/supabase/migrations/20250504202930_content_tables.sql (1 hunks)
packages/database/supabase/migrations/20250506174523_content_idx_id.sql (1 hunks)
packages/database/supabase/migrations/20250512142307_sync_table.sql (1 hunks)
packages/database/supabase/migrations/20250513173724_content_concept_key.sql (1 hunks)
packages/database/supabase/migrations/20250517154122_plpgsql_linting.sql (1 hunks)
packages/database/supabase/migrations/20250520132747_restrict_search_by_document.sql (1 hunks)
packages/database/supabase/migrations/20250520133551_nodes_needing_sync.sql (1 hunks)
packages/database/supabase/migrations/20250522193823_rename_discourse_space.sql (1 hunks)
packages/database/supabase/schemas/account.sql (1 hunks)
packages/database/supabase/schemas/agent.sql (1 hunks)
packages/database/supabase/schemas/base.sql (1 hunks)
packages/database/supabase/schemas/concept.sql (1 hunks)
packages/database/supabase/schemas/content.sql (1 hunks)
packages/database/supabase/schemas/contributor.sql (1 hunks)
packages/database/supabase/schemas/embedding.sql (1 hunks)
packages/database/supabase/schemas/extensions.sql (1 hunks)
packages/database/supabase/schemas/space.sql (1 hunks)
packages/database/supabase/schemas/sync.sql (1 hunks)
packages/database/types.gen.ts (1 hunks)
turbo.json (1 hunks)

✅ Files skipped from review due to trivial changes (1)

packages/database/supabase/migrations/20250522193823_rename_discourse_space.sql

🚧 Files skipped from review as they are similar to previous changes (19)

packages/database/supabase/migrations/20250506174523_content_idx_id.sql
packages/database/supabase/migrations/20250504195841_remote_schema.sql
packages/database/.sqruff
packages/database/supabase/.gitignore
packages/database/package.json
turbo.json
packages/database/supabase/schemas/account.sql
packages/database/supabase/schemas/extensions.sql
packages/database/supabase/config.toml
packages/database/supabase/schemas/space.sql
packages/database/supabase/schemas/base.sql
packages/database/types.gen.ts
packages/database/supabase/schemas/contributor.sql
packages/database/supabase/schemas/agent.sql
packages/database/supabase/migrations/20250512142307_sync_table.sql
packages/database/supabase/schemas/embedding.sql
packages/database/supabase/schemas/content.sql
packages/database/supabase/migrations/20250517154122_plpgsql_linting.sql
packages/database/supabase/migrations/20250504202930_content_tables.sql

🧰 Additional context used

🧠 Learnings (2)

packages/database/supabase/migrations/20250520132747_restrict_search_by_document.sql (1)

Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#165
File: packages/database/supabase/schemas/embedding.sql:66-95
Timestamp: 2025-05-20T14:04:19.622Z
Learning: In the `match_embeddings_for_subset_nodes` SQL function in packages/database/supabase/schemas/embedding.sql, the number of results is implicitly limited by the length of the input array parameter `p_subset_roam_uids` since the function filters content using `WHERE c.source_local_id = ANY(p_subset_roam_uids)`.

packages/database/schema.yaml (1)

Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#165
File: packages/database/schema.yaml:116-121
Timestamp: 2025-05-20T03:06:16.564Z
Learning: In the discourse-graph project's LinkML schema (packages/database/schema.yaml), attributes and slots are equivalent constructs. Items can be defined either as slots or attributes without needing to duplicate them in both sections.

🪛 LanguageTool

packages/database/README.md

[uncategorized] ~18-~18: Loose punctuation mark.
Context: ...n placing your file. 4. turbo build, which will do the following: 1. C...

(UNLIKELY_OPENING_PUNCTUATION)

🔇 Additional comments (11)

packages/database/supabase/migrations/20250520133551_nodes_needing_sync.sql (1)

1-38: Function design is suitable for node synchronization with Roam

The get_nodes_needing_sync function effectively identifies nodes that need synchronization by comparing timestamps between Roam and the Content table. This is a clean implementation that follows the principle of single responsibility.

packages/database/supabase/schemas/sync.sql (6)

1-6: LGTM: Well-defined task status enum

The enum definition with appropriate states ('active', 'timeout', 'complete', 'failed') provides a clear structure for tracking synchronization task status.

10-20: Well-structured sync_info table with appropriate fields

Good job on defining a comprehensive table for tracking synchronization tasks. The table includes all necessary fields for effective task management: identifiers, status tracking, worker attribution, failure counting, and timestamps for task lifecycle events.

43-45: Good use of unique index for task identification

The unique index on (sync_target, sync_function) ensures that only one synchronization task can exist for a given target-function pair, which is essential for preventing duplicate tasks.

49-85: LGTM: Robust end_sync_task function with proper validations

This function correctly handles task completion with appropriate validations:

Ensures status progression (line 68)

Verifies worker identity (line 67)

Resets failure count on completion (line 71)

Increments failure count on non-completion status changes (line 74)

95-157: Well-implemented concurrency control in propose_sync_task

Excellent implementation of the task proposal system with:

Advisory locks for concurrency control (line 128)

Parameter validation (lines 113-115)

Timeout detection logic (lines 134-137)

Exponential backoff for failures (line 139)
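The timeout and backoff decisions listed above can be sketched as a pure function. The review does not quote the body of propose_sync_task, so the doubling rule, the base interval, and every field name below are assumptions made for illustration; the advisory lock is not modeled.

```typescript
// Hypothetical in-memory view of one sync_info row.
interface TaskState {
  status: "active" | "timeout" | "complete" | "failed";
  failureCount: number;
  taskTimesOutAtMs: number | null; // when an active task is considered dead
  lastTaskEndMs: number | null; // when the previous attempt ended
}

// Earliest time a new attempt may start: base interval doubled per failure.
function nextAllowedStartMs(state: TaskState, baseIntervalMs: number): number {
  if (state.lastTaskEndMs === null) return 0;
  return state.lastTaskEndMs + baseIntervalMs * 2 ** state.failureCount;
}

function mayStart(state: TaskState, nowMs: number, baseIntervalMs: number): boolean {
  if (state.status === "active") {
    // A running task blocks new workers unless its timeout has passed.
    return state.taskTimesOutAtMs !== null && nowMs > state.taskTimesOutAtMs;
  }
  return nowMs >= nextAllowedStartMs(state, baseIntervalMs);
}

const failedTwice: TaskState = {
  status: "failed",
  failureCount: 2,
  taskTimesOutAtMs: null,
  lastTaskEndMs: 1000,
};
console.log(mayStart(failedTwice, 1399, 100), mayStart(failedTwice, 1400, 100));
```

With a base interval of 100 ms and two failures, the wait is 400 ms, so the second call is the first one allowed to proceed.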

168-204: LGTM: Effective implementation of get_nodes_needing_sync

This function properly identifies nodes needing synchronization by comparing timestamps between external data and local content records. The logic correctly handles both new nodes and updated nodes.
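The comparison described here can be illustrated outside SQL. In this hypothetical sketch, storedEditTimes stands in for the Content table, keyed by source_local_id; the names are placeholders, and the actual function may compare against a different column.

```typescript
interface RoamNode {
  uid: string;
  roam_edit_time: number; // epoch milliseconds
}

// storedEditTimes maps source_local_id to the last modification time
// (epoch ms) already recorded locally.
function uidsNeedingSync(
  nodes: RoamNode[],
  storedEditTimes: Map<string, number>,
): string[] {
  return nodes
    .filter((node) => {
      const stored = storedEditTimes.get(node.uid);
      // New nodes have no stored time; updated nodes have a newer edit time.
      return stored === undefined || node.roam_edit_time > stored;
    })
    .map((node) => node.uid);
}

const stored = new Map([["a", 100], ["b", 200]]);
console.log(
  uidsNeedingSync(
    [
      { uid: "a", roam_edit_time: 150 }, // updated
      { uid: "b", roam_edit_time: 200 }, // unchanged
      { uid: "c", roam_edit_time: 50 }, // new
    ],
    stored,
  ),
); // prints [ 'a', 'c' ]
```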

packages/database/schema.yaml (3)

1-8: LGTM: Well-defined schema header with appropriate identifiers

The schema header correctly defines the namespace, prefixes, and imports needed for the LinkML schema.

10-79: Comprehensive enumeration definitions

The enumerations are well-structured with clear permissible values and helpful descriptions. This provides a solid foundation for the schema's type system.

80-105: Well-defined custom types with appropriate annotations

The custom types are properly defined with proper URI references, base types, and SQL-specific annotations where needed. The vector type annotation for pgvector is particularly valuable for embedding storage.

packages/database/schema.puml (1)

1-110: LGTM: Comprehensive UML diagram of the database schema

The PlantUML diagram provides a clear visual representation of the entities, their attributes, and relationships. This is valuable for understanding the overall structure of the database schema.

Since this file is auto-generated as noted in previous review comments, no direct edits are suggested.

packages/database/scripts/lint.ts

packages/database/scripts/deploy.ts

coderabbitai · 2025-05-22T22:55:50Z

packages/database/scripts/deploy.ts

+const main = () => {
+  try {
+    exec("git status -s -b", (err, stdout, stderr) => {
+      if (err) {
+        console.error("Is git installed?");
+        process.exit(1);
+      }
+      const lines = stdout.split("\n");
+      if (lines[0] != "## main...main") {
+        console.log("Not on main branch, not deploying database");
+        process.exit(0);
+      }
+      const { SUPABASE_PROJECT_ID, SUPABASE_DB_PASSWORD } = process.env;
+      if (!SUPABASE_PROJECT_ID) {
+        console.log("Please define SUPABASE_PROJECT_ID");
+        process.exit(1);
+      }
+      if (!SUPABASE_DB_PASSWORD) {
+        console.log("Please define SUPABASE_DB_PASSWORD");


🛠️ Refactor suggestion

Improve branch verification logic

The current branch check is very specific ("## main...main") and might not work in all Git setups. Consider a more robust approach that extracts the branch name from the Git status output.

-      const lines = stdout.split("\n");
-      if (lines[0] != "## main...main") {
+      // Extract branch name using regex for more robust parsing
+      const branchMatch = stdout.match(/## ([^\.]+)/);
+      if (!branchMatch || branchMatch[1] !== "main") {
         console.log("Not on main branch, not deploying database");
         process.exit(0);
       }


🤖 Prompt for AI Agents

In packages/database/scripts/deploy.ts around lines 6 to 24, the branch verification logic is too specific by checking for the exact string "## main...main" from git status output, which may fail in different Git setups. Modify the code to parse the first line of the git status output more robustly by extracting the current branch name dynamically, for example by splitting the line and retrieving the branch name portion, then compare it to "main" to decide whether to proceed with deployment.

The goal is not just to extract the branch name, but to ensure that the branch is pristine, with no un-pushed changes and ideally no uncommitted changes. (I pushed another change to handle the latter case.)

@maparent I understand your point now, thank you for the clarification. You're absolutely right that the goal is more comprehensive than just checking the branch name - ensuring the branch is pristine with no uncommitted or unpushed changes is crucial for a deployment script.

I see in the full file you've already implemented the check for uncommitted changes with:

if (lines.length > 1) {
  console.log(
    "You seem to have uncommitted changes, not deploying database",
  );
  process.exit(0);
}

This approach of using git status -s -b -uno and checking lines.length is an effective way to verify both the branch name and its pristine state. Sorry for not seeing the full context in my initial suggestion.
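As a rough illustration of the combined check discussed above, here is a minimal sketch that parses the output of `git status -s -b -uno` and reports whether it describes a pristine `main` branch. The helper name `checkPristineMain`, the `BranchCheck` type, and the reason strings are hypothetical and do not come from `deploy.ts`; the sketch also assumes that `main` tracks an upstream branch and treats a missing upstream as a reason not to deploy.

```typescript
// Hypothetical helper (not taken from deploy.ts): decide whether the output of
// `git status -s -b -uno` describes a pristine, fully pushed `main` branch.
type BranchCheck = { ok: true } | { ok: false; reason: string };

const checkPristineMain = (statusOutput: string): BranchCheck => {
  // Ignore empty lines so a trailing newline is not mistaken for a change.
  const lines = statusOutput.split("\n").filter((line) => line.trim() !== "");
  // The first line is the branch header, e.g. "## main...origin/main [ahead 1]".
  // The upstream part and the bracketed divergence note are both optional.
  const match = (lines[0] ?? "").match(
    /^## (.+?)(?:\.\.\.(\S+))?(?: \[(.+)\])?$/,
  );
  if (!match) return { ok: false, reason: "unrecognized git status header" };
  const [, branch, upstream, divergence] = match;
  if (branch !== "main") return { ok: false, reason: "not on main branch" };
  // Assumption: without an upstream we cannot tell whether commits are pushed.
  if (!upstream) return { ok: false, reason: "main has no upstream branch" };
  // "[ahead N]" or "[behind N]" means local and upstream have diverged.
  if (divergence) return { ok: false, reason: `branch is ${divergence}` };
  // With -uno, any line after the header is a tracked file with changes.
  if (lines.length > 1) return { ok: false, reason: "uncommitted changes" };
  return { ok: true };
};
```

Inside the `exec` callback, a script could call `checkPristineMain(stdout)` and exit without deploying when `ok` is false, logging `reason`. Keeping the parsing in a pure function makes it possible to test each case without invoking git.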

✏️ Learnings added

Learnt from: maparent
PR: DiscourseGraphs/discourse-graph#165
File: packages/database/scripts/deploy.ts:0-0
Timestamp: 2025-05-22T23:48:45.419Z
Learning: The database deployment script in packages/database/scripts/deploy.ts requires that the branch is 'main' and pristine (no uncommitted or unpushed changes) before allowing deployment to proceed.


packages/database/supabase/migrations/20250520132747_restrict_search_by_document.sql

packages/database/supabase/migrations/20250513173724_content_concept_key.sql

packages/database/supabase/schemas/concept.sql

packages/database/schema.yaml

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

github-project-automation bot added this to General May 19, 2025

vercel bot deployed to Preview May 19, 2025 20:22 View deployment

vercel bot deployed to Preview May 19, 2025 20:26 View deployment

vercel bot deployed to Preview May 19, 2025 21:01 View deployment

vercel bot deployed to Preview May 20, 2025 00:08 View deployment

vercel bot deployed to Preview May 20, 2025 00:10 View deployment

coderabbitai bot reviewed May 20, 2025


vercel bot deployed to Preview May 20, 2025 03:34 View deployment

vercel bot temporarily deployed to Preview May 20, 2025 14:02 Inactive

vercel bot temporarily deployed to Preview May 20, 2025 14:07 Inactive

vercel bot temporarily deployed to Preview May 20, 2025 15:49 Inactive

maparent requested a review from mdroidian May 20, 2025 18:03

maparent added 8 commits May 21, 2025 17:33

Database structure

75af65f

add an example

9d334a0

unique source_id

67c362f

add svg/puml

00cccac

changes to example

f0783c1

some corrections to example

34bb91e

github mardown tables

d75b1a0

github mardown tables

8014a87

clean redirect urls

bbc7099

maparent force-pushed the feature/supabase branch from 692e100 to bbc7099 Compare May 21, 2025 21:33

vercel bot deployed to Preview May 21, 2025 21:34 View deployment

mdroidian approved these changes May 21, 2025


mdroidian reviewed May 22, 2025


Rename DiscoursePlatform and DiscourseSpace without Discourse

3ea3b17

vercel bot temporarily deployed to Preview May 22, 2025 19:59 Inactive

tolerate absence of sqruff

9a44165

vercel bot temporarily deployed to Preview May 22, 2025 22:05 Inactive

Make deploy work only on main branch

489a1d5

vercel bot temporarily deployed to Preview May 22, 2025 22:45 Inactive

coderabbitai bot reviewed May 22, 2025


Avoid putting passwords in logs.

079b89a

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

vercel bot temporarily deployed to Preview May 22, 2025 23:41 Inactive

maparent added 2 commits May 22, 2025 19:46

minor changes to README

84a185c

Also check for undeployed changes

0d7ccda

vercel bot temporarily deployed to Preview May 22, 2025 23:48 Inactive

Use RESET ALL where needed, not otherwise

c71198f

vercel bot temporarily deployed to Preview May 22, 2025 23:57 Inactive

maparent merged commit 88699ef into main May 22, 2025
3 checks passed

github-project-automation bot moved this to Done in General May 22, 2025

maparent deleted the feature/supabase branch May 22, 2025 23:58

coderabbitai bot mentioned this pull request May 26, 2025

Add unique keys to schema #178

Merged

This was referenced Jul 23, 2025

ENG-479: Resizing qb columns make them disappear sometimes #270

Merged

ENG-625 Construct environment files to access supabase. #301

Merged

coderabbitai bot mentioned this pull request Aug 3, 2025

Roam: Eng-450 remove unused db functions #315

Merged

coderabbitai bot mentioned this pull request Aug 17, 2025

Roam: ENG-733 Glue Settings UI components to database operations #343

Merged
