feat: qwen-code platform + Objective-C + Bash parser support #227

Merged
tirth8205 merged 2 commits into main from feat/qwen-objc-bash on Apr 11, 2026
Conversation

@tirth8205
Owner

Summary

Three feature-request closeouts from the post-v2.2.4 audit:

Both languages already have grammars in tree-sitter-language-pack, so no new runtime dependencies.

Commits

  1. fd56a3c feat: add qwen-code as a supported MCP install platform (Is qwen-code supported? #83)

    • New PLATFORMS["qwen"] entry in code_review_graph/skills.py with config_path = ~/.qwen/settings.json, key = "mcpServers", format = "object", needs_type = True.
    • Added qwen to the --platform choices in cli.py for both install and init.
    • Uses the existing object-format merge path — no new install logic.
    • Tests: test_install_qwen_config, test_install_qwen_preserves_existing_servers.
  2. b1fbcd5 feat: add Objective-C and Bash/Shell parser support (Could you add support for Objective-C? #88, Support Bash/Shell (.sh) parsing for script-heavy repositories #197)

    • Objective-C (.m → objc):
      • Class types: class_interface, class_implementation, category_interface, protocol_declaration
      • Function types: method_definition + C-style function_definition
      • Imports: preproc_include (#import and #include both)
      • Calls: message_expression + call_expression
      • _get_name now recognizes method_definition (first identifier = method name, e.g. add:to: keeps add) and threads objc into the existing C/C++ function_declarator path so top-level main() is picked up.
      • _get_call_name handles message_expression by skipping the [, skipping the receiver, and returning the next identifier — works for simple calls, multi-part selectors, and chained [[... alloc] init].
      • Note: .h stays mapped to c (Obj-C .h disambiguation is a separate can of worms), .mm stays mapped to cpp for now.
    • Bash (.sh / .bash / .zsh → bash):
      • Function type: function_definition (name is a word child — added to _get_name)
      • Calls: every command node; _get_call_name returns the command_name text
      • Imports: new _extract_bash_source_command() hook in _extract_from_tree detects source path / . path, strips quotes, and emits IMPORTS_FROM edges; _do_resolve_module has a new bash branch that resolves relative paths against the caller's directory when the target file exists on disk
      • No classes (shell has none)
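
The message_expression rule described above (skip the `[`, skip the receiver, return the next identifier) can be sketched roughly as follows. This is only an illustration: `Node`, `message_call_name`, `ident`, and `punct` are hypothetical names, and the child layout is a simplified assumption about what the Objective-C grammar emits, not the project's actual `_get_call_name` code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy stand-in for a syntax-tree node (not the real tree-sitter type)."""
    type: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def message_call_name(node: Node) -> Optional[str]:
    """Return the first selector part of a message_expression-like node."""
    # Drop the opening bracket; the first remaining child is treated as the receiver.
    rest = [child for child in node.children if child.type != "["]
    for child in rest[1:]:
        if child.type == "identifier":
            return child.text
    return None


def ident(name: str) -> Node:
    return Node("identifier", name)


def punct(symbol: str) -> Node:
    return Node(symbol, symbol)


# [self logResult:sum]
simple = Node("message_expression", children=[
    punct("["), ident("self"), ident("logResult"), punct(":"), ident("sum"), punct("]"),
])
# [[Calculator alloc] init]: the receiver is itself a message_expression
inner = Node("message_expression", children=[
    punct("["), ident("Calculator"), ident("alloc"), punct("]"),
])
chained = Node("message_expression", children=[
    punct("["), inner, ident("init"), punct("]"),
])
```

Because the receiver is skipped by position rather than by type, the chained case falls out naturally: the outer expression yields `init` and the inner one yields `alloc`.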

Verified end-to-end on fixtures

tests/fixtures/sample.m (Calculator with @interface + @implementation + C main):

  • 8 nodes, 18 edges
  • 5 functions: add, reset, logResult, sharedCalculator (all with parent_name="Calculator"), and main (top-level)
  • 2 IMPORTS_FROM: Foundation/Foundation.h, Logger.h
  • 9 CALLS including [self logResult:sum] → ::Calculator.logResult (resolved), [Calculator sharedCalculator] → ::Calculator.sharedCalculator, plus NSLog, alloc, init

tests/fixtures/sample.sh (sample script with source, 5 functions, and nested calls):

  • 6 nodes, 18 edges
  • 5 functions: log_info, log_error, ensure_dir, cleanup, main
  • 2 IMPORTS_FROM: sample_lib.sh resolved to absolute path (file exists), sample_config.sh kept as raw string
  • 11 CALLS: internal calls like main → log_info / ensure_dir / cleanup resolved to qualified names; external commands (echo, mkdir, rm) kept as bare names
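
The internal-versus-external split in the CALLS edges can be illustrated with a small sketch. It assumes a `path::name` qualified-name format and a hypothetical helper `resolve_call_targets`; the real resolver in the project may work differently.

```python
from typing import Iterable, List, Set, Tuple


def resolve_call_targets(
    file_path: str,
    defined: Set[str],
    calls: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Qualify callees defined in the same file; leave everything else bare."""
    edges = []
    for caller, callee in calls:
        source = f"{file_path}::{caller}"
        # Assumption: a name is "internal" only if this file defines it.
        target = f"{file_path}::{callee}" if callee in defined else callee
        edges.append((source, target))
    return edges


functions = {"log_info", "log_error", "ensure_dir", "cleanup", "main"}
edges = resolve_call_targets(
    "sample.sh",
    functions,
    [("main", "log_info"), ("ensure_dir", "mkdir"), ("log_info", "echo")],
)
```

Here `main → log_info` becomes a qualified target, while `mkdir` and `echo` stay as bare command names.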

Test plan

  • uv run ruff check code_review_graph/ → clean
  • uv run mypy code_review_graph/ --ignore-missing-imports --no-strict-optional → clean
  • uv run bandit -r code_review_graph/ -c pyproject.toml → 0 H/M/L
  • uv run pytest --cov-fail-under=65 → 717 passed, 1 skipped, 2 xpassed, coverage 74.84%
    • 16 new tests total: 2 qwen platform install tests, 7 Objective-C, 7 Bash
  • CI matrix (3.10 / 3.11 / 3.12 / 3.13)

Closes

Out of scope for this PR — from the same audit

🤖 Generated with Claude Code

tirth8205 and others added 2 commits April 11, 2026 21:06
Qwen Code reads MCP servers from ~/.qwen/settings.json using the same
mcpServers schema as Claude/Cursor/Windsurf. Added a PLATFORMS entry
plus a --platform qwen choice in cli.py, so `code-review-graph install
--platform qwen` now writes a merged settings.json without clobbering
existing Qwen config.

The existing object-format install path handles the merge + type=stdio
automatically; no new code was needed beyond the PLATFORMS entry and
the CLI choice.
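
A minimal sketch of what an object-format merge could look like, assuming the entry fields named in the PR description (`config_path`, `key`, `format`, `needs_type`). `QWEN_PLATFORM`, `merge_mcp_server`, and the `example-server` entry are hypothetical placeholders, not the code in skills.py.

```python
import json
from pathlib import Path

# Hypothetical stand-in for PLATFORMS["qwen"]; field names follow the PR text.
QWEN_PLATFORM = {
    "config_path": Path.home() / ".qwen" / "settings.json",
    "key": "mcpServers",
    "format": "object",
    "needs_type": True,
}


def merge_mcp_server(config_path, platform, name, server):
    """Add one server under platform["key"] without dropping other settings."""
    path = Path(config_path)
    # Start from whatever is already on disk so unrelated keys survive.
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault(platform["key"], {})
    entry = dict(server)
    if platform["needs_type"]:
        entry["type"] = "stdio"  # per the commit message: type=stdio is added
    servers[name] = entry
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling it with an existing settings.json leaves other `mcpServers` entries and top-level keys in place, which is the behavior the two tests below are described as checking.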

Tests:
- test_install_qwen_config — writes a fresh settings.json with the
  correct shape
- test_install_qwen_preserves_existing_servers — merges alongside
  existing mcpServers entries without overwriting them

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Closes #88 (Objective-C) and #197 (Bash/Shell). Both tree-sitter
grammars ship with tree-sitter-language-pack so no new runtime dep.

Objective-C (.m):
- Class types: class_interface, class_implementation,
  category_interface, protocol_declaration (@interface / @implementation
  both produce Class nodes; the store dedupes by qualified name)
- Function types: method_definition (Obj-C methods) + function_definition
  (C-style helpers and main())
- Imports: preproc_include (tree-sitter-objc emits these for both
  #import and #include)
- Calls: message_expression ([receiver method:arg]) + call_expression
  (C-style NSLog(...))
- Name extraction for method_definition picks the first `identifier`
  child (first part of the selector); multi-part selectors like
  add:to: keep `add` as the canonical method name
- main() and other C-style functions use the existing function_declarator
  resolution path (added "objc" to the c/cpp special case in _get_name)
- _get_call_name handles message_expression by skipping the `[`,
  skipping the receiver, and returning the next identifier child;
  this handles simple calls, multi-part selectors, and chained
  [[... alloc] init] correctly
- .m only; .mm (Objective-C++) is left mapped to c++ for now, .h stays
  mapped to c (Objective-C .h disambiguation is a separate can of worms)

Bash/Shell (.sh, .bash, .zsh):
- Function type: function_definition
- Name extraction: bash stores function names as `word` children, not
  `identifier`, so added a bash-specific branch in _get_name
- Calls: every `command` node emits a CALLS edge; _get_call_name reads
  the command_name child
- source / . file.sh imports: new _extract_bash_source_command helper
  dispatched from _extract_from_tree detects these commands and emits
  IMPORTS_FROM edges instead of CALLS
- _do_resolve_module: added a bash branch that resolves relative paths
  to real files when they exist on disk
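
The source-detection and path-resolution steps above might look roughly like this. It is a sketch that works on an already-split command (a list of words) instead of tree-sitter nodes; `bash_source_target` and `resolve_bash_import` are hypothetical names, not the helpers added in this commit.

```python
from pathlib import Path
from typing import List, Optional


def bash_source_target(words: List[str]) -> Optional[str]:
    """Return the sourced path for `source path` / `. path`, else None."""
    if len(words) < 2 or words[0] not in ("source", "."):
        return None
    target = words[1]
    # Strip one pair of matching surrounding quotes.
    if len(target) >= 2 and target[0] == target[-1] and target[0] in ("'", '"'):
        target = target[1:-1]
    return target or None


def resolve_bash_import(target: str, caller_file: str) -> str:
    """Resolve against the caller's directory, but only if the file exists."""
    candidate = Path(caller_file).parent / target
    if candidate.is_file():
        return str(candidate.resolve())
    return target  # nothing on disk: keep the raw string
```

Any command for which `bash_source_target` returns `None` would be treated as an ordinary CALLS edge under this scheme.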

Tests (14 new):
- TestObjectiveCParsing: language detection, class extraction,
  instance + class methods, C-style main(), imports, message_expression
  calls (including internal resolution)
- TestBashParsing: language detection across .sh/.bash/.zsh, function
  extraction with no parents, source/. IMPORTS_FROM edges with path
  resolution for existing files, command invocation CALLS edges with
  internal resolution

Verified on fixtures:
- sample.m: 8 nodes, 18 edges (5 functions, 1 class, 2 imports,
  7 contains, 9 calls — including [self logResult:sum] and
  [Calculator sharedCalculator] resolving to qualified names)
- sample.sh: 6 nodes, 18 edges (5 functions, 2 imports with
  sample_lib.sh resolved to absolute path, 11 calls with internal
  references correctly resolved)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@tirth8205 tirth8205 merged commit 8c1a7fb into main Apr 11, 2026
9 checks passed
azizur100389 added a commit to azizur100389/code-review-graph that referenced this pull request Apr 11, 2026
Register .ksh (Korn shell) with tree-sitter-bash alongside the existing
.sh / .bash / .zsh entries added in tirth8205#227. Korn shell is close enough to
bash syntactically that tree-sitter-bash handles the structural features
the graph captures (function definitions, commands, source/. includes)
correctly.

Context
-------
In the close comment on PR tirth8205#230, @tirth8205 explicitly flagged .ksh as a
missing extension:

    "The .ksh extension in particular looks worth adding — I didn't
     include it in tirth8205#227."

This PR addresses exactly that gap. Issue tirth8205#235 tracks the request.

Why it matters
--------------
Korn shell is still used in legacy AIX/Solaris operations, IBM internal
tooling, and enterprise CI scripts. Repositories that ship .ksh scripts
currently index to 0 nodes because the extension is unrecognized — the
same failure mode that motivated tirth8205#197.

Implementation
--------------
One line added to EXTENSION_TO_LANGUAGE in parser.py:
    ".ksh": "bash"

All of the bash parsing machinery shipped in tirth8205#227 (_FUNCTION_TYPES,
_CALL_TYPES, _extract_bash_source_command, name/call resolution) already
supports any file parsed through the "bash" language path, so no further
changes are needed.
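
For illustration, an extension lookup of this shape could be sketched as below. The table only lists the mappings mentioned in this thread, and the body of `detect_language` is an assumption (a plain suffix lookup), not a copy of parser.py.

```python
from pathlib import Path
from typing import Optional

# Subset of the mapping, limited to extensions discussed in this PR thread.
EXTENSION_TO_LANGUAGE = {
    ".m": "objc",
    ".h": "c",
    ".mm": "cpp",
    ".sh": "bash",
    ".bash": "bash",
    ".zsh": "bash",
    ".ksh": "bash",  # the one-line addition described above
}


def detect_language(filename: str) -> Optional[str]:
    """Map a file name to a language key by its final suffix, or None."""
    return EXTENSION_TO_LANGUAGE.get(Path(filename).suffix)
```

With a lookup like this, an unrecognized extension returns `None`, which matches the "indexes to 0 nodes" failure mode described for .ksh before the change.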

Tests added (tests/test_multilang.py::TestBashParsing)
------------------------------------------------------
1. test_detects_language — extended with a .ksh assertion to lock in
   the extension mapping (regression guard for tirth8205#235).
2. test_ksh_extension_parses_as_bash — end-to-end regression test that
   copies the existing tests/fixtures/sample.sh to a temp .ksh file,
   parses it through the real CodeParser, and asserts:
     - every node's language field is "bash"
     - the set of extracted Function names is identical to the .sh run
     - the CONTAINS / CALLS / IMPORTS_FROM edge counts per kind match
   The second assertion proves the .ksh path is fully wired through to
   the same structural extraction as .sh, not a degenerate zero-result
   read.

Test results
------------
Stage 1 (new targeted tests): 2/2 passed.
Stage 2 (tests/test_multilang.py full): 152/152 passed — zero regressions
  across any language.
Stage 3 (tests/test_parser.py adjacent): 67/67 passed.
Stage 4 (full suite): 733 passed. 8 pre-existing Windows failures in
  test_incremental (3) + test_main async coroutine detection (1) +
  test_notebook Databricks (4) — verified identical on unchanged main.
Stage 5 (ruff check on parser.py and test_multilang.py): clean.
Stage 6 (end-to-end smoke): detect_language("legacy.ksh") -> "bash";
  parsing a real .ksh file produces 6 Function nodes, 18 edges, all
  tagged language=bash.

Zero regressions. Single-line extension mapping change plus a targeted
regression guard against the specific issue the maintainer flagged.