
Add Oracle PL/SQL support #669

Description

@ouzsrcm

What problem does this solve?

codebase-memory-mcp currently has no PL/SQL support. Oracle source files
(.pks/.pkb/.fnc/.trg/...) aren't mapped to any language, so they're skipped
during discovery. If someone renames them to .sql, they fall through to the
generic DerekStride SQL grammar, which targets ANSI/PostgreSQL/MySQL and does
not model PL/SQL's procedural constructs — packages, package bodies, BEGIN/END
blocks, cursors, package-scoped procedures/functions, triggers.

The practical result: for enterprise Oracle codebases — which are often very
large and exactly where a structural knowledge graph pays off — the tool emits
no package/procedure/function nodes and no call edges. Engineers exploring a big
PL/SQL repo get nothing useful out of the graph.

Proposed solution

Add PL/SQL as a first-class structural language (CBM_LANG_PLSQL), following the
existing data-driven pattern: a vendored tree-sitter grammar, a lang_specs row,
extension mappings, and two small name-resolver special cases in extract_defs.c.
This is the same way C#, PHP, etc. are wired.

Scope (this PR): structural extraction only.

  • packages / object types / triggers -> Class nodes
  • procedures / functions (in packages and standalone) -> Function nodes
  • ref_call -> CALLS edges
  • if/case/loops/exception_handler -> branch (complexity)
  • assignment_statement -> WRITES, raise_statement -> THROWS

Not in scope: Hybrid LSP semantic type resolution. That can be a follow-up, as it
is for the majority of supported languages today.
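
For illustration, the scope list above boils down to a lookup from grammar node type to graph role. This is only a sketch, not the prototype's code: the enum, struct, and function names are made up here, and only ref_call, exception_handler, assignment_statement and raise_statement are node names taken from the list. The remaining node-type strings are placeholders that would need checking against the grammar's node types.

```c
#include <stddef.h>
#include <string.h>

/* Graph roles named in the scope list (type names are hypothetical). */
typedef enum {
    ROLE_NONE = 0,
    ROLE_CLASS,
    ROLE_FUNCTION,
    ROLE_CALLS,
    ROLE_BRANCH,
    ROLE_WRITES,
    ROLE_THROWS
} plsql_role_t;

typedef struct {
    const char *node_type;
    plsql_role_t role;
} plsql_role_row_t;

static const plsql_role_row_t PLSQL_ROLES[] = {
    /* Placeholder node names: NOT verified against the grammar. */
    {"create_package",       ROLE_CLASS},
    {"create_package_body",  ROLE_CLASS},
    {"create_type",          ROLE_CLASS},
    {"create_trigger",       ROLE_CLASS},
    {"create_procedure",     ROLE_FUNCTION},
    {"create_function",      ROLE_FUNCTION},
    {"if_statement",         ROLE_BRANCH},
    {"case_statement",       ROLE_BRANCH},
    {"loop_statement",       ROLE_BRANCH},
    /* Node names as written in the scope list. */
    {"exception_handler",    ROLE_BRANCH},
    {"ref_call",             ROLE_CALLS},
    {"assignment_statement", ROLE_WRITES},
    {"raise_statement",      ROLE_THROWS},
};

/* Linear scan over the table; unknown node types map to ROLE_NONE. */
static plsql_role_t plsql_role_for(const char *node_type) {
    size_t n = sizeof PLSQL_ROLES / sizeof PLSQL_ROLES[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(node_type, PLSQL_ROLES[i].node_type) == 0) {
            return PLSQL_ROLES[i].role;
        }
    }
    return ROLE_NONE;
}
```

Keeping this as a flat table rather than branching logic is what makes the change data-driven: adding or correcting a node type is a one-row edit.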

Grammar: AndreasMaierDe/tree-sitter-plsql (MIT, ABI 14, no external scanner).

Extensions: .pks .pkb .pck .pls .plb .plsql .fnc .trg .bdy .tps .tpb
(.sql stays generic SQL; .prc is left as-is since it already maps to FORM.)
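
A minimal sketch of the extension routing described above, assuming a simple table lookup. The enum values, table, and function names are hypothetical stand-ins (the real code would use CBM_LANG_PLSQL and the project's own mapping), and the case-insensitive match is an assumption on my part, not something the existing discovery code is known to do.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the project's language enum. */
typedef enum { LANG_UNKNOWN = 0, LANG_SQL, LANG_PLSQL, LANG_FORM } lang_t;

typedef struct {
    const char *ext;
    lang_t lang;
} ext_map_t;

static const ext_map_t EXT_MAP[] = {
    /* New PL/SQL extensions proposed in this issue. */
    {".pks", LANG_PLSQL}, {".pkb", LANG_PLSQL}, {".pck", LANG_PLSQL},
    {".pls", LANG_PLSQL}, {".plb", LANG_PLSQL}, {".plsql", LANG_PLSQL},
    {".fnc", LANG_PLSQL}, {".trg", LANG_PLSQL}, {".bdy", LANG_PLSQL},
    {".tps", LANG_PLSQL}, {".tpb", LANG_PLSQL},
    /* Unchanged: .sql stays generic SQL, .prc stays FORM. */
    {".sql", LANG_SQL},
    {".prc", LANG_FORM},
};

/* Case-insensitive string equality (assumption: extensions match
 * regardless of case, e.g. EMP_PKG.PKB). */
static int ext_equal_ci(const char *a, const char *b) {
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;
}

/* Map a file path to a language by its final extension. */
static lang_t lang_for_path(const char *path) {
    const char *dot = strrchr(path, '.');
    if (dot == NULL) {
        return LANG_UNKNOWN;
    }
    size_t n = sizeof EXT_MAP / sizeof EXT_MAP[0];
    for (size_t i = 0; i < n; i++) {
        if (ext_equal_ci(dot, EXT_MAP[i].ext)) {
            return EXT_MAP[i].lang;
        }
    }
    return LANG_UNKNOWN;
}
```

With a table like this, lang_for_path("pkg/emp_pkg.pks") resolves to the PL/SQL entry, while "schema.sql" and "load.prc" keep their current mappings.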

Candidate public OSS test beds:

  • utPLSQL/utPLSQL (large, well-maintained PL/SQL codebase)
  • oracle/db-sample-schemas (Oracle's official sample schemas)
  • mortenbra/alexandria-plsql-utils
  • OraOpenSource/oos-utils

I already have a working prototype: builds clean, full extraction suite green
(199/199, incl. 2 new PL/SQL tests), and verified end-to-end by indexing a sample
package (emp_pkg/util_pkg -> Class, hire/salary -> Function). Happy to open the PR
once you confirm the approach and grammar choice.

Alternatives considered

  1. iliasaz/tree-sitter-orasql — also covers Oracle SQL + PL/SQL, but its generated
    parser.c is ~30MB vs ~9MB for AndreasMaierDe, which bloats the binary more.
    Open to switching if you'd prefer its broader coverage.

  2. Reusing the existing DerekStride SQL grammar (remap PL/SQL extensions to
    CBM_LANG_SQL) — rejected: that grammar doesn't parse PL/SQL procedural syntax,
    so packages/procedures/functions wouldn't be extracted.

  3. Infra-pass pattern (like Dockerfile/K8s, no grammar) — N/A; PL/SQL needs real
    AST parsing, not YAML-reuse heuristics.

Known limitation to flag up front: the chosen grammar is still maturing upstream —
some standalone DDL (e.g. CREATE TYPE ... AS OBJECT) currently produces ERROR
nodes. Package specs/bodies, procedures, functions and triggers parse cleanly.

Confirmations

  • I searched existing issues and this is not a duplicate.

Metadata


Labels: enhancement (New feature or request), language-request (Request for new
language support), parsing/quality (Graph extraction bugs, false positives,
missing edges)
