Plaintext mode; JSONC support; and mime detection with file#30
Plaintext mode; JSONC support; and mime detection with file#30taciturnaxolotl wants to merge 3 commits intoAtaraxy-Labs:mainfrom
file#30Conversation
|
Thanks for the work, I will review it soon. |
rs545837
left a comment
There was a problem hiding this comment.
Nice additions! A few things to address before merging.
| let beforeEntities: SemanticEntity[] = []; | ||
| let afterEntities: SemanticEntity[] = []; | ||
|
|
||
| const fallback = registry.getPluginById('fallback')!; |
There was a problem hiding this comment.
The ! non-null assertion here will crash if no fallback plugin is registered. Should guard against it:
const fallback = registry.getPluginById('fallback');Then check if (plugin !== fallback && fallback) before calling fallback.extractEntities.
|
|
||
| // Map MIME types returned by `file --mime-type` to parser plugin IDs | ||
| const MIME_TO_PLUGIN: Record<string, string> = { | ||
| 'application/json': 'json', |
There was a problem hiding this comment.
This only covers 9 of the 20+ supported languages. Missing at minimum:
'text/x-c': 'code',
'text/x-c++src': 'code',
'text/x-java-source': 'code',
'text/x-ruby': 'code',
'text/x-php': 'code',
'text/x-csharp': 'code',
'text/x-shellscript': 'code',If MIME detection only works for a few languages, it gives a false sense of coverage.
| @@ -8,7 +9,9 @@ export class JsonParserPlugin implements SemanticParserPlugin { | |||
| extensions = ['.json']; | |||
|
|
|||
| extractEntities(content: string, filePath: string): SemanticEntity[] { | |||
There was a problem hiding this comment.
Since you added JSONC parsing, should .jsonc be added to the extensions list? Right now .jsonc files still fall to the fallback parser.
| export function detectPluginByMime(absoluteFilePath: string): string | null { | ||
| if (!isFileCommandAvailable()) return null; | ||
| try { | ||
| const mime = execFileSync('file', ['--mime-type', '-b', absoluteFilePath], { |
There was a problem hiding this comment.
getPlugin() is called for every file in a diff. This shells out to file synchronously for every unrecognized file that exists on disk. On repos with many extensionless files (Makefile, Dockerfile, LICENSE, etc.) this could add up. Consider caching the result per path, or at least per MIME type lookup.
|
|
||
| extractEntities(content: string, filePath: string): SemanticEntity[] { | ||
| const parsed = JSON.parse(content); | ||
| const errors: unknown[] = []; |
There was a problem hiding this comment.
The check errors.length > 0 && parsed === undefined means partially parsed JSONC with errors will silently succeed. Regular JSON.parse would have thrown on malformed .json files. This changes strictness for normal JSON. Consider keeping strict mode for .json and only using lenient JSONC parsing for .jsonc files.
Closes #29
I also added support for detecting unknown files via the
filecommand, so long as it exists. I also noticed in the process of doing that that JSONC breaks the current parsing so I added a fallback on JSON parse that tries JSONC parsing.