Add PHP DTO converter with round-trip support #2
Agent-Logs-Url: https://github.com/voku/typeDiagram/sessions/dae66a18-9530-4cce-b933-6acbf1f38562 Co-authored-by: voku <264695+voku@users.noreply.github.com>
/gemini review
Code Review
This pull request introduces PHP support to the typeDiagram ecosystem, including a new bidirectional converter for PHP DTOs, CLI integration, web-based syntax highlighting, and updated documentation. The PHP converter implementation requires several improvements to its parsing logic, specifically regarding the handling of string literals and comments in splitTopLevel, support for namespaced types in parameter parsing, and the inclusion of double-quoted strings for union variant discriminators. Additionally, the mapping of the Unit type to void for parameters needs correction to comply with PHP standards, and the interpretation of single-argument array docblocks should be updated to map to List instead of a Map.
const splitTopLevel = (source: string, openChar: string, closeChar: string): string[] => {
  const parts: string[] = [];
  let depth = 0;
  let start = 0;
  for (let i = 0; i < source.length; i++) {
    const char = source.charAt(i);
    depth = adjustDepth(depth, char, openChar, closeChar);
    if (char === "," && depth === 0) {
      const part = source.slice(start, i).trim();
      if (part.length > 0) {
        parts.push(part);
      }
      start = i + 1;
    }
  }
  const last = source.slice(start).trim();
  return last.length > 0 ? [...parts, last] : parts;
};
The splitTopLevel function does not account for string literals or comments. If a parameter list or generic argument list contains a comma inside a string (e.g., public string $msg = "Hello, world"), this function will incorrectly split the string, leading to parsing failures for subsequent fields. A more robust implementation should track whether the cursor is inside a string and skip delimiters accordingly.
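A minimal sketch of the string-aware variant described above. The name `splitTopLevelQuoteAware` is hypothetical and not part of the PR; the sketch tracks single- and double-quoted literals with backslash escapes and still ignores comments, which would need similar treatment.

```typescript
// Hypothetical helper, not taken from the PR: split on top-level commas while
// skipping commas that sit inside quoted PHP string literals.
const splitTopLevelQuoteAware = (source: string, openChar: string, closeChar: string): string[] => {
  const parts: string[] = [];
  let depth = 0;
  let start = 0;
  let quote: string | null = null; // the quote character we are inside, if any
  for (let i = 0; i < source.length; i++) {
    const char = source.charAt(i);
    if (quote !== null) {
      if (char === "\\") {
        i++; // skip the escaped character
      } else if (char === quote) {
        quote = null; // closing quote reached
      }
      continue;
    }
    if (char === "'" || char === '"') {
      quote = char;
    } else if (char === openChar) {
      depth++;
    } else if (char === closeChar) {
      depth--;
    } else if (char === "," && depth === 0) {
      const part = source.slice(start, i).trim();
      if (part.length > 0) parts.push(part);
      start = i + 1;
    }
  }
  const last = source.slice(start).trim();
  return last.length > 0 ? [...parts, last] : parts;
};
```

With this version, `public string $msg = "Hello, world", public int $n` yields two parts instead of three.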
    return { nativeType: "array", docType: mapTdToPhpDocType(type), hasDefaultNull: false };
  }
  if (type.name === "Unit") {
    return { nativeType: "mixed", docType: "void", hasDefaultNull: false };
const docTypes = parseParamDocsFromDocblock(constructorDoc);
return splitParams(source)
  .map((part) => {
    const match = /public\s+([?A-Za-z_][A-Za-z0-9_]*)\s+\$(\w+)(?:\s*=\s*null)?/.exec(part.replace(/\s+/g, " "));
The regex for parsing parameters does not support namespaced types (e.g., \App\DTO\User). PHP types frequently include namespaces. The character class [?A-Za-z_] should be expanded to include the backslash \ character. Also, note that replace(/\s+/g, " ") on the entire parameter string can corrupt string literals containing multiple spaces.
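One way to address both points, as a sketch only: the pattern below accepts backslash-separated name segments and matches runs of whitespace with `\s+` directly, so the input no longer needs to be normalized first. `paramRe` and `parseParam` are hypothetical names; union and intersection types are out of scope here.

```typescript
// Hypothetical pattern, not the PR's final regex: optional "?", optional leading
// "\", then one or more backslash-separated identifier segments.
const paramRe =
  /public\s+(\??\\?[A-Za-z_][A-Za-z0-9_]*(?:\\[A-Za-z_][A-Za-z0-9_]*)*)\s+\$(\w+)(?:\s*=\s*null)?/;

const parseParam = (
  part: string
): { nativeType: string; name: string; hasDefaultNull: boolean } | null => {
  // No whitespace normalization: \s+ already tolerates newlines and repeated spaces.
  const match = paramRe.exec(part);
  if (match === null) return null;
  return {
    nativeType: match[1] ?? "",
    name: match[2] ?? "",
    // The optional "= null" group is last, so the match ends with it when present.
    hasDefaultNull: /=\s*null$/.test(match[0]),
  };
};
```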
  return keyType === undefined || valueType === undefined
    ? "Map<String, String>"
    : `Map<${mapPhpDocTypeToTd(keyType)}, ${mapPhpDocTypeToTd(valueType)}>`;
}
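The review summary asks for single-argument array docblocks to map to List rather than Map. A sketch of that branching, assuming the generic arguments have already been split; `mapArrayDocType` is a hypothetical name, and `mapInner` stands in for the converter's own `mapPhpDocTypeToTd`.

```typescript
// Hypothetical helper: `args` are the already-split generic arguments of a PHP
// docblock `array<...>` type; `mapInner` is a stand-in for the real type mapper.
const mapArrayDocType = (args: string[], mapInner: (phpType: string) => string): string => {
  const [first, second] = args;
  if (first === undefined) {
    return "Map<String, String>"; // fallback kept from the snippet above
  }
  if (second === undefined) {
    return `List<${mapInner(first)}>`; // array<T> is read as a list of T
  }
  return `Map<${mapInner(first)}, ${mapInner(second)}>`; // array<K, V>
};
```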
const toTypeRef = (param: ParsedParam) =>
  parseTypeRef(param.docType === null ? mapPhpNativeTypeToTd(param.nativeType) : mapPhpDocTypeToTd(param.docType));

const parseKindLiteral = (body: string) => /@var\s+'([^']+)'[\s\S]*?public string \$kind;/.exec(body)?.[1] ?? null;
Agent-Logs-Url: https://github.com/voku/typeDiagram/sessions/e1ea5ebf-8997-4e24-b55d-3ff1514ed8c5 Co-authored-by: voku <264695+voku@users.noreply.github.com>
/gemini review
Code Review
This pull request adds PHP as a supported language across the typeDiagram platform, implementing a new bidirectional converter, CLI arguments, and web interface components. The implementation includes support for records, unions, and aliases, along with comprehensive integration tests and updated documentation. The review feedback highlights several areas for improving the PHP parser's robustness, including better handling of nested delimiters in constructor parameters, support for namespaced types in declaration regexes, and more reliable extraction of kind literals from docblocks.
const adjustDepth = (depth: number, char: string, openChar: string, closeChar: string) => {
  if (char === openChar) {
    return depth + 1;
  }
  if (char === closeChar) {
    return depth - 1;
  }
  return depth;
};
The adjustDepth function currently only tracks the specific openChar and closeChar passed to it. This causes issues when splitting constructor parameters if a default value contains other delimiters, such as short array syntax []. For example, public array $v = [1, 2] would be incorrectly split at the comma inside the array because the depth counter wouldn't increment for [. It is better to track all common PHP delimiters to correctly identify the top-level scope for splitting.
const adjustDepth = (depth: number, char: string, openChar: string, closeChar: string) => {
  if ([openChar, "(", "[", "{", "<"].includes(char)) return depth + 1;
  if ([closeChar, ")", "]", "}", ">"].includes(char)) return depth - 1;
  return depth;
};
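A sketch of how multi-delimiter tracking plays out when splitting constructor parameters. The names `adjustDepthAll` and `splitParamsAll` are hypothetical. One deliberate difference from the suggestion above: angle brackets are left out, on the assumption that parameter defaults may contain PHP's `=>`, whose `>` would otherwise be counted as a closer; splitting generic docblock arguments would need its own pair.

```typescript
// Hypothetical variant: track (), [] and {} together. Angle brackets are
// omitted on purpose because `=>` in array defaults would unbalance the count.
const OPENERS = ["(", "[", "{"];
const CLOSERS = [")", "]", "}"];

const adjustDepthAll = (depth: number, char: string): number => {
  if (OPENERS.includes(char)) return depth + 1;
  if (CLOSERS.includes(char)) return depth - 1;
  return depth;
};

const splitParamsAll = (source: string): string[] => {
  const parts: string[] = [];
  let depth = 0;
  let start = 0;
  for (let i = 0; i < source.length; i++) {
    const char = source.charAt(i);
    depth = adjustDepthAll(depth, char);
    // Only commas outside every bracket pair separate parameters.
    if (char === "," && depth === 0) {
      parts.push(source.slice(start, i).trim());
      start = i + 1;
    }
  }
  parts.push(source.slice(start).trim());
  return parts.filter((part) => part.length > 0);
};
```

For `public array $v = [1, 2], public int $n` this returns two parameters, with the array default kept intact.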
const parseDeclarations = (source: string): ParsedDecl[] => {
  const declarations: ParsedDecl[] = [];
  const declarationRe = /interface\s+(\w+)\s*\{|final readonly class\s+(\w+)(?:\s+implements\s+(\w+))?\s*\{/g;
The regex for parsing declarations uses \w+ for names, which prevents it from matching namespaced interfaces in the implements clause (e.g., implements \App\Interfaces\Identifiable). Since DTOs are frequently namespaced in PHP, the regex should be updated to support backslashes in identifiers.
- const declarationRe = /interface\s+(\w+)\s*\{|final readonly class\s+(\w+)(?:\s+implements\s+(\w+))?\s*\{/g;
+ const declarationRe = /interface\s+([\\a-zA-Z0-9_]+)\s*\{|final readonly class\s+([\\a-zA-Z0-9_]+)(?:\s+implements\s+([\\a-zA-Z0-9_]+))?\s*\{/g;
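To illustrate what the widened pattern captures, here is a small sketch that runs it over a source string. The regex is the suggested one; `shortName` and `listDeclarations` are hypothetical helpers, and stripping the namespace before comparing names is an assumption about how a variant would be linked back to its interface.

```typescript
const declarationRe =
  /interface\s+([\\a-zA-Z0-9_]+)\s*\{|final readonly class\s+([\\a-zA-Z0-9_]+)(?:\s+implements\s+([\\a-zA-Z0-9_]+))?\s*\{/g;

// Hypothetical helper: drop the namespace so "\App\Shapes\Shape" and "Shape"
// compare equal.
const shortName = (name: string): string => name.slice(name.lastIndexOf("\\") + 1);

const listDeclarations = (source: string): string[] =>
  Array.from(source.matchAll(declarationRe)).map((m) => {
    // Group 1 is set for interfaces; groups 2 and 3 for classes.
    if (m[1] !== undefined) return `interface ${shortName(m[1])}`;
    const base = m[3] === undefined ? "" : ` implements ${shortName(m[3])}`;
    return `class ${shortName(m[2] ?? "")}${base}`;
  });
```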
const literalStart = varIndex + "@var ".length;
const quote = body.charAt(literalStart);
return quote === "'" || quote === '"' ? readQuotedLiteral(body, literalStart + 1, quote) : null;
The logic for extracting the @var literal is fragile as it assumes the quoted value starts immediately after the @var string. This will fail if there is a newline or extra whitespace (such as the * prefix in multi-line docblocks). Using a regular expression is more robust.
- const literalStart = varIndex + "@var ".length;
- const quote = body.charAt(literalStart);
- return quote === "'" || quote === '"' ? readQuotedLiteral(body, literalStart + 1, quote) : null;
+ const match = /@var\s+(['"])(.*?)\1/.exec(body.slice(varIndex));
+ return match ? match[2] : null;
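A short sketch exercising the suggested pattern against both quote styles and irregular whitespace. The wrapper name `readKindLiteral` is hypothetical, and it searches the whole body rather than slicing from `varIndex`.

```typescript
// The pattern is the one from the suggestion; the wrapper name is hypothetical.
const readKindLiteral = (body: string): string | null => {
  // (['"]) captures the opening quote and \1 requires the same closing quote.
  const match = /@var\s+(['"])(.*?)\1/.exec(body);
  return match ? (match[2] ?? null) : null;
};
```

Note that `\s+` covers spaces and newlines only; a docblock that places a `*` continuation between `@var` and the literal would still not match with this pattern.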
packages/typediagram/src/converters/php.ts and current PHP converter tests
typediagram-core build, typecheck, and test baseline
Unit field mapping, and array<T> docblock interpretation
npm run -w typediagram-core build, typecheck, and test