feat: add Objective-C language support #424

Closed
aaaron7 wants to merge 1 commit into colbymchenry:main from aaaron7:feat/objc-support

aaaron7 commented May 26, 2026

Summary

Add Objective-C language support via tree-sitter-objc grammar, enabling proper callers/callees relationship extraction for Objective-C codebases.

Changes

  • src/types.ts — Add 'objc' to LANGUAGES union type
  • src/extraction/grammars.ts — Register tree-sitter-objc.wasm grammar, .m/.mm file extensions, and display name
  • src/extraction/languages/objc.ts (new) — Full ObjC extractor supporting:
    • message_expression (method calls) — creates call edges
    • class_interface, class_implementation, protocol_declaration — container nodes
    • method_definition / method_declaration — function nodes with signatures
    • extractBareCall — Critical fix: visitFunctionBody bypasses the main visitNode dispatch, which means message_expression nodes inside method bodies were never captured. extractBareCall handles this, taking call edges from ~735 to 33,000+ on a real codebase.
    • Method signatures built from parameter selectors (a leading - or +, followed by the selector parts and parameter labels)
  • src/extraction/languages/index.ts — Import and register ObjC extractor
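
To illustrate the extractBareCall item above, here is a minimal sketch of a body walker that skips the main dispatch and relies on a per-language hook to record calls. The SyntaxNode shape, the "selector" child type, and the function signatures are simplified assumptions for illustration; they are not the project's actual types or the real tree-sitter API.

```typescript
// Hypothetical, simplified syntax node; the real tree-sitter node API differs.
interface SyntaxNode {
  type: string;
  text: string;
  children: SyntaxNode[];
}

interface CallEdge {
  caller: string;
  callee: string;
}

// Hypothetical hook: returns a callee name when the node is a call the
// language extractor understands, otherwise undefined.
type BareCallExtractor = (node: SyntaxNode) => string | undefined;

// Walks a method body directly instead of going through the main dispatch.
// Without the hook, language-specific call nodes would pass through unseen.
function visitFunctionBody(
  node: SyntaxNode,
  caller: string,
  extractBareCall: BareCallExtractor,
  edges: CallEdge[] = [],
): CallEdge[] {
  const callee = extractBareCall(node);
  if (callee !== undefined) {
    edges.push({ caller, callee });
  }
  for (const child of node.children) {
    visitFunctionBody(child, caller, extractBareCall, edges);
  }
  return edges;
}

// Assumed ObjC hook: treat message_expression as a call and read the callee
// from a hypothetical "selector" child.
const objcBareCall: BareCallExtractor = (node) => {
  if (node.type !== "message_expression") return undefined;
  const selector = node.children.find((c) => c.type === "selector");
  return selector?.text;
};

// A tiny hand-built body standing in for: [[Foo alloc] init];
const sampleBody: SyntaxNode = {
  type: "compound_statement",
  text: "",
  children: [
    {
      type: "message_expression",
      text: "[[Foo alloc] init]",
      children: [
        {
          type: "message_expression",
          text: "[Foo alloc]",
          children: [{ type: "selector", text: "alloc", children: [] }],
        },
        { type: "selector", text: "init", children: [] },
      ],
    },
  ],
};

const sampleEdges = visitFunctionBody(sampleBody, "-[Bar setUp]", objcBareCall);
console.log(sampleEdges.map((e) => e.callee)); // prints [ 'init', 'alloc' ]
```

Passing a hook that always returns undefined yields no edges for the same body, which is the shape of the problem described above: the walker visits every node but nothing recognizes message_expression.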

Testing

  • npm run build passes cleanly with tsc --noEmit
  • Tested on a real iOS codebase: call edges increased from 735 → 33,000+ after this fix

Add ObjC extractor with tree-sitter-objc grammar support:
- Register .m and .mm file extensions
- Handle message_expression, class_interface, class_implementation,
  protocol_declaration, and method nodes
- Add extractBareCall for message_expression capture during
  visitFunctionBody traversal (critical fix for call edges)
- Build method signatures from parameter selectors

This enables proper callers/callees relationship extraction for
Objective-C codebases.

@colbymchenry (Owner) commented

Thanks for taking the time to put this together, @aaaron7 — and especially
for the work of running it on a real iOS codebase to validate the call-edge
extraction was firing at all.

Closing this in favor of #165, which was opened a week earlier and takes
the same approach with a few key differences that turn out to matter for
correctness on real Objective-C code:

  • Full multi-part selectors. ObjC method symbols need every keyword
    (e.g. setWidth:height:, not just setWidth); otherwise distinct methods
    collide. Tree-sitter-objc emits each keyword as a separate identifier
    sibling — they have to be joined back together with : separators. This
    one is the load-bearing fix: without it, the high call-edge count looks
    impressive but reflects collisions, not coverage. (Same correction applies
    to message_expression callsites — each [recv a:1 b:2] emits multiple
    field-named method children, all of which need to participate in the
    selector.)
  • Protocols extracted (@protocol Foo ... @end → protocol nodes).
  • @property names found by walking struct_declaration > struct_declarator >
    pointer_declarator > identifier; there's no name field on
    property_declaration.
  • @implementation reuses the @interface's class node instead of
    creating a duplicate.
  • .h content-sniffing so iOS headers are classified as objc rather
    than c.
  • 7 tests + README + CHANGELOG + agent-eval corpus entries, validated on
    Masonry / FMDB / SDWebImage.
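
As a rough sketch of the multi-part selector point in the first bullet, the snippet below joins keyword parts back into a full selector and shows how keying on the first keyword alone makes distinct methods collide. The SelectorPart shape and the buildSelector name are hypothetical, chosen for illustration; this is not the code from either PR.

```typescript
// Hypothetical, simplified view of one selector keyword taken from the parse tree.
interface SelectorPart {
  keyword: string;
  hasArgument: boolean; // true for "setWidth:", false for a unary selector like "count"
}

// Join every keyword, appending ":" after each keyword that takes an argument.
function buildSelector(parts: SelectorPart[]): string {
  return parts.map((p) => (p.hasArgument ? `${p.keyword}:` : p.keyword)).join("");
}

const setWidthHeight: SelectorPart[] = [
  { keyword: "setWidth", hasArgument: true },
  { keyword: "height", hasArgument: true },
];
const setWidthOnly: SelectorPart[] = [{ keyword: "setWidth", hasArgument: true }];
const count: SelectorPart[] = [{ keyword: "count", hasArgument: false }];

console.log(buildSelector(setWidthHeight)); // prints "setWidth:height:"
console.log(buildSelector(setWidthOnly)); // prints "setWidth:"
console.log(buildSelector(count)); // prints "count"

// Keying on the first keyword alone merges two different methods.
const firstKeywordCollides = setWidthHeight[0].keyword === setWidthOnly[0].keyword;
console.log(firstKeywordCollides); // prints true
```

The same joining would apply to both method definitions and message_expression callsites, so that a call to setWidth:height: resolves to that method rather than to anything starting with setWidth.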
