-
Notifications
You must be signed in to change notification settings - Fork 79
Add -f codepage flag for input/output encoding #627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add -f/--code-page flag with ODBC-compatible format parsing - Support 50+ codepages: Unicode, Windows, OEM/DOS, ISO-8859, CJK, EBCDIC, Macintosh - Apply input codepage in IncludeFile() for :r command - Apply output codepage in outCommand() for :OUT file writes - Add --list-codepages flag to display all supported codepages - Add comprehensive unit tests for parsing and encoding lookup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds ODBC sqlcmd-compatible -f/--code-page support to control input/output file encodings (plus --list-codepages) for improved interoperability.
Changes:
- Introduces codepage parsing/validation and a mapping from Windows codepages to Go encodings.
- Applies configured codepages when reading
-i/:Rfiles and when writing-o/:OUT/:ERRORoutputs. - Adds CLI argument tests, unit tests for codepage parsing/lookup, and README documentation.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
cmd/sqlcmd/sqlcmd.go |
Adds -f/--code-page, --list-codepages, validation, and wiring into runtime Sqlcmd configuration. |
cmd/sqlcmd/sqlcmd_test.go |
Adds CLI parsing/validation test cases for the new flags. |
pkg/sqlcmd/codepage.go |
Implements ParseCodePage, GetEncoding, and SupportedCodePages. |
pkg/sqlcmd/codepage_test.go |
Adds unit tests for parsing and encoding lookup. |
pkg/sqlcmd/commands.go |
Applies output codepage transforms in :OUT and :ERROR. |
pkg/sqlcmd/sqlcmd.go |
Applies input codepage transforms (or BOM-based auto-detect fallback) when including files. |
README.md |
Documents -f usage and --list-codepages. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
- Fix nil encoding panic in errorCommand when OutputCodePage is 65001 (UTF-8) - Close file handle in outCommand when GetEncoding returns an error - Handle close error properly in errorCommand - Apply UTF-8 BOM stripping when input codepage is 65001 - Fix test subtest names to use strconv.Itoa instead of string(rune)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
- Use localizer.Errorf for all user-facing error messages - Fix UTF-16 BOM handling using ExpectBOM for input decoding - Add transformWriteCloser to properly close underlying file handles - Use transformWriteCloser in outCommand and errorCommand for both UnicodeOutputFile and CodePage transforms to prevent file handle leaks - Add integration tests for output/error codepage encoding
Use strconv.Itoa instead of %d to avoid locale-specific thousands separators in error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
- Create codepageRegistry map as single source of truth for codepages - GetEncoding() now uses the registry instead of switch statement - SupportedCodePages() now generates list from registry - Removes duplicate codepage definitions between the two functions - Sort SupportedCodePages result by codepage number for consistency
- Verify all returned codepages are valid in GetEncoding - Ensure results are sorted by codepage number - Check well-known codepages are present
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
| settings.OutputCodePage = cp | ||
| } | ||
| } | ||
|
|
Copilot
AI
Jan 25, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ParseCodePage currently accepts inputs that contain no actual codepage value (e.g. "," or only whitespace) and returns a non-nil CodePageSettings with both InputCodePage/OutputCodePage left as 0. That silently disables codepage handling even though the user supplied -f. Consider detecting this case (arg != "" but neither codepage parsed) and returning an error (and add a unit test for it).
| // If a non-empty argument was provided but no codepage was parsed, | |
| // treat this as an error rather than silently disabling codepage handling. | |
| if arg != "" && settings.InputCodePage == 0 && settings.OutputCodePage == 0 { | |
| return nil, localizer.Errorf("invalid codepage: %s", arg) | |
| } |
| // UTF-8 codepage: still apply BOM stripping | ||
| utf8bom := unicode.BOMOverride(unicode.UTF8.NewDecoder()) | ||
| reader = transform.NewReader(f, utf8bom) | ||
| } | ||
| } else { | ||
| // Default: auto-detect BOM for UTF-16, fallback to UTF-8 |
Copilot
AI
Jan 25, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment "UTF-8 codepage: still apply BOM stripping" is a bit misleading: unicode.BOMOverride will also switch decoders when it sees a UTF-16 BOM (not just strip a UTF-8 BOM). If the intent is broader BOM auto-detection, consider updating the comment; if the intent is only stripping a UTF-8 BOM, consider using a transformer that only removes the UTF-8 BOM.
| // UTF-8 codepage: still apply BOM stripping | |
| utf8bom := unicode.BOMOverride(unicode.UTF8.NewDecoder()) | |
| reader = transform.NewReader(f, utf8bom) | |
| } | |
| } else { | |
| // Default: auto-detect BOM for UTF-16, fallback to UTF-8 | |
| // UTF-8 codepage: use BOMOverride to strip UTF-8 BOM and auto-detect UTF-16 BOMs, defaulting to UTF-8 otherwise | |
| utf8bom := unicode.BOMOverride(unicode.UTF8.NewDecoder()) | |
| reader = transform.NewReader(f, utf8bom) | |
| } | |
| } else { | |
| // Default: auto-detect BOMs (UTF-8/UTF-16) and decode accordingly, falling back to UTF-8 when no BOM is present |
| name string | ||
| codepage int | ||
| expectedBytes []byte | ||
| inputText string | ||
| skipOnEncError bool |
Copilot
AI
Jan 25, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The skipOnEncError field in this test table is never used. Please remove it or use it (e.g., to skip assertions when a target encoding can’t represent the input text), to avoid dead/misleading test code.
| name string | |
| codepage int | |
| expectedBytes []byte | |
| inputText string | |
| skipOnEncError bool | |
| name string | |
| codepage int | |
| expectedBytes []byte | |
| inputText string |
|
PR replaced by #638 |
Summary
Implements the
-fflag from ODBC sqlcmd for specifying input and output file code pages.Changes
CodePageandListCodePagesfields,-f/--code-pageflag, validation, and runtime application:Rcommand to respect codepage settings when reading filesCodePagefield to Sqlcmd structUsage
Supported Codepages
Testing
Improves ODBC sqlcmd compatibility.