-
Notifications
You must be signed in to change notification settings - Fork 74
MLE-27077 Added fix for invalid header for empty doc #1899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Copyright Validation Results ⏭️ Skipped (Excluded) Files
✅ Valid Files
✅ All files have valid copyright headers! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds a client-side workaround for malformed Content-Disposition headers (notably on empty documents) and updates/extends existing tests around document paging and URI handling.
Changes:
- Refactors
ReadDocumentPageTestimports/class base and adds a new test for an empty text document. - Updates
OkHttpServices#getHeaderUrito recover fromjakarta.mailparsing failures caused by a trailingformat=parameter. - Adds a helper to parse
Content-Dispositionand extractfilenamesafely after cleaning.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| marklogic-client-api/src/test/java/com/marklogic/client/test/document/ReadDocumentPageTest.java | Updates test structure and adds an empty text document test case related to the header parsing fix. |
| marklogic-client-api/src/main/java/com/marklogic/client/impl/OkHttpServices.java | Implements defensive parsing for malformed Content-Disposition to avoid ParseException and extract filenames reliably. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * See MLE-15748, which pertains to issues with javax.mail only allowing US-ASCII characters. | ||
| */ | ||
| @Test | ||
| void test() { |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method name test is too generic for a test suite and makes failures harder to interpret. Rename it to something intent-revealing (e.g., readsUriWithNonAsciiCharacters or searchHandlesNonAsciiUriFilename) to reflect what the test is validating.
| void test() { | |
| void readsUriWithNonAsciiCharacters() { |
| void emptyTextDocument() { | ||
| final String uri = "/sample/empty-file.txt"; | ||
|
|
||
| try (DatabaseClient client = Common.newClient()) { | ||
| JSONDocumentManager documentManager = client.newJSONDocumentManager(); | ||
| StructuredQueryDefinition query = new StructuredQueryBuilder().document(uri); | ||
| DocumentRecord documentRecord; | ||
| try (DocumentPage documentPage = documentManager.search(query, 1)) { | ||
| assertTrue(documentPage.hasNext(), "Expected a document in the page, but none was found."); | ||
| documentRecord = documentPage.next(); | ||
| } | ||
| String actualUri = documentRecord.getUri(); | ||
| assertEquals(uri, actualUri, "The URI of the empty document should match the one written."); | ||
| } | ||
| } |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test asserts the document exists but does not create/write /sample/empty-file.txt within the test. That makes the test dependent on external state and likely flaky. Make the test self-contained by writing an empty document at uri before searching (and cleaning it up afterward), or by using a known fixture/setup that guarantees the document exists.
| void test() { | ||
| Common.deleteUrisWithPattern("/aaa-page/*"); | ||
|
|
||
| final String uri = "/aaa-page/太田佳伸のXMLファイル.xml"; | ||
| DocumentRecord documentRecord; | ||
| try (DatabaseClient client = Common.newClient()) { |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test writes/reads under the /aaa-page/ URI space, but there’s no cleanup visible in the updated method. To keep the suite repeatable and avoid cross-test interference, add a cleanup step (e.g., delete the created URI(s) in a finally block or in an @AfterEach) so reruns don’t depend on prior test state.
| static private String getHeaderUri(BodyPart part) { | ||
| try { | ||
| if (part != null) { | ||
| return part.getFileName(); | ||
| if (part == null) { | ||
| return null; | ||
| } | ||
| // if it's not found, just return null | ||
|
|
||
| try { | ||
| String filename = part.getFileName(); | ||
| if (filename != null) { | ||
| return filename; | ||
| } | ||
| } catch (ParseException e) { | ||
| // Jakarta Mail's parser failed due to malformed Content-Disposition header. | ||
| // Check if MarkLogic sent a malformed "format=" parameter at the end, which violates RFC 2183. | ||
| String contentDisposition = getHeader(part, "Content-Disposition"); | ||
| if (contentDisposition != null && contentDisposition.matches(".*;\\s*format\\s*=\\s*$")) { | ||
| // Remove the trailing "; format=" to fix the malformed header | ||
| String cleaned = contentDisposition.replaceFirst(";\\s*format\\s*=\\s*$", "").trim(); | ||
| logger.debug("Removed trailing 'format=' from malformed Content-Disposition header: {} -> {}", contentDisposition, cleaned); | ||
| return extractFilenameFromContentDisposition(cleaned); | ||
| } | ||
| throw e; | ||
| } |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new ParseException recovery path is important behavior (it changes failure into a best-effort filename extraction). Add a focused test that reproduces a malformed Content-Disposition ending in ; format= and asserts that getHeaderUri returns the expected filename (and does not throw).
| static private String extractFilenameFromContentDisposition(String contentDisposition) { | ||
| if (contentDisposition == null) { | ||
| return null; | ||
| } | ||
| try { | ||
| // Use Jakarta Mail's ContentDisposition parser to extract the filename parameter. This is the class | ||
| // that throws an error when "format=" exists in the value, but that has been removed already. | ||
| ContentDisposition cd = new ContentDisposition(contentDisposition); | ||
| return cd.getParameter("filename"); | ||
| } catch (ParseException e) { | ||
| logger.warn("Failed to parse cleaned Content-Disposition header: {}; cause: {}", | ||
| contentDisposition, e.getMessage()); | ||
| return null; | ||
| } | ||
| } |
Copilot
AI
Feb 11, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fallback only checks the filename parameter. For valid Content-Disposition headers, servers may provide filename* (RFC 5987/2231) instead of filename. In that case this method will return null even though a filename exists. Consider checking for filename* as well (and decoding it appropriately) before returning null.
1a98321 to
4c6216a
Compare
4c6216a to
b0058d3
Compare
No description provided.