Search before asking
What would you like to be improved?
Motivation
AmsThriftUrl is used by clients to parse AMS thrift and ZooKeeper URLs. The current implementation has a few issues that can make URL parsing less robust:
parserThriftUrl() lowercases the entire URL before parsing.
socketTimeout query parameter parsing is duplicated.
- thrift URL validation is weak for missing host, missing port, or unsupported scheme.
- Some error messages are too generic, such as
invalid ams url.
Lowercasing the whole URL may unexpectedly change case-sensitive parts, such as catalog names or query parameter values.
Current behavior
For example:
AmsThriftUrl.parse("ThRiFt://LOCALHOST:1260/MyCatalog?socketTimeout=6000", null)
currently lowercases the URL before parsing, so values such as host/catalog may be changed unexpectedly.
Invalid URLs such as the following are not clearly rejected during parsing:
http://127.0.0.1:1260/catalog
thrift:///catalog
thrift://127.0.0.1/catalog
Expected behavior
AmsThriftUrl should:
- Preserve the original URL case for host, catalog name, and query values.
- Treat URL schemes case-insensitively.
- Validate that thrift URLs use the thrift scheme.
- Validate that thrift URLs contain both host and port.
- Parse
socketTimeout consistently in thrift and ZooKeeper URL paths.
- Provide clearer error messages for invalid URLs.
Proposed changes
- Avoid calling
toLowerCase() on the whole URL.
- Parse URL scheme with case-insensitive comparison.
- Add explicit thrift URL validation for:
- unsupported scheme
- missing host
- missing port
- Extract shared socketTimeout query parsing logic.
- Add unit tests for AmsThriftUrl.
How should we improve?
No response
Are you willing to submit PR?
Subtasks
No response
Code of Conduct
Search before asking
What would you like to be improved?
Motivation
AmsThriftUrlis used by clients to parse AMS thrift and ZooKeeper URLs. The current implementation has a few issues that can make URL parsing less robust:parserThriftUrl()lowercases the entire URL before parsing.socketTimeoutquery parameter parsing is duplicated.invalid ams url.Lowercasing the whole URL may unexpectedly change case-sensitive parts, such as catalog names or query parameter values.
Current behavior
For example:
currently lowercases the URL before parsing, so values such as host/catalog may be changed unexpectedly.
Invalid URLs such as the following are not clearly rejected during parsing:
Expected behavior
AmsThriftUrl should:
socketTimeoutconsistently in thrift and ZooKeeper URL paths.Proposed changes
toLowerCase()on the whole URL.How should we improve?
No response
Are you willing to submit PR?
Subtasks
No response
Code of Conduct