@rjrudin commented Feb 11, 2026

No description provided.

Copilot AI review requested due to automatic review settings February 11, 2026 10:55

github-actions bot commented Feb 11, 2026

Copyright Validation Results
Total: 4 | Passed: 2 | Failed: 0 | Skipped: 2 | at: 2026-02-11 12:38:54 UTC | commit: b0058d3

⏭️ Skipped (Excluded) Files

  • .copyrightconfig
  • test-app/src/main/ml-data/sample/empty-file.txt

✅ Valid Files

  • marklogic-client-api/src/main/java/com/marklogic/client/impl/OkHttpServices.java
  • marklogic-client-api/src/test/java/com/marklogic/client/test/document/ReadDocumentPageTest.java

✅ All files have valid copyright headers!

Copilot AI left a comment

Pull request overview

Adds a client-side workaround for malformed Content-Disposition headers (notably on empty documents) and updates/extends existing tests around document paging and URI handling.

Changes:

  • Refactors ReadDocumentPageTest imports/class base and adds a new test for an empty text document.
  • Updates OkHttpServices#getHeaderUri to recover from jakarta.mail parsing failures caused by a trailing format= parameter.
  • Adds a helper to parse Content-Disposition and extract filename safely after cleaning.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| marklogic-client-api/src/test/java/com/marklogic/client/test/document/ReadDocumentPageTest.java | Updates test structure and adds an empty text document test case related to the header parsing fix. |
| marklogic-client-api/src/main/java/com/marklogic/client/impl/OkHttpServices.java | Implements defensive parsing for malformed Content-Disposition to avoid ParseException and extract filenames reliably. |

* See MLE-15748, which pertains to issues with javax.mail only allowing US-ASCII characters.
*/
@Test
void test() {

Copilot AI Feb 11, 2026

The method name test is too generic for a test suite and makes failures harder to interpret. Rename it to something intent-revealing (e.g., readsUriWithNonAsciiCharacters or searchHandlesNonAsciiUriFilename) to reflect what the test is validating.

Suggested change
- void test() {
+ void readsUriWithNonAsciiCharacters() {

Comment on lines 46 to 60
void emptyTextDocument() {
    final String uri = "/sample/empty-file.txt";

    try (DatabaseClient client = Common.newClient()) {
        JSONDocumentManager documentManager = client.newJSONDocumentManager();
        StructuredQueryDefinition query = new StructuredQueryBuilder().document(uri);
        DocumentRecord documentRecord;
        try (DocumentPage documentPage = documentManager.search(query, 1)) {
            assertTrue(documentPage.hasNext(), "Expected a document in the page, but none was found.");
            documentRecord = documentPage.next();
        }
        String actualUri = documentRecord.getUri();
        assertEquals(uri, actualUri, "The URI of the empty document should match the one written.");
    }
}

Copilot AI Feb 11, 2026

This test asserts the document exists but does not create/write /sample/empty-file.txt within the test. That makes the test dependent on external state and likely flaky. Make the test self-contained by writing an empty document at uri before searching (and cleaning it up afterward), or by using a known fixture/setup that guarantees the document exists.

Comment on lines 30 to 33
void test() {
    Common.deleteUrisWithPattern("/aaa-page/*");

    final String uri = "/aaa-page/太田佳伸のXMLファイル.xml";
    DocumentRecord documentRecord;
    try (DatabaseClient client = Common.newClient()) {

Copilot AI Feb 11, 2026

The test writes/reads under the /aaa-page/ URI space, but there’s no cleanup visible in the updated method. To keep the suite repeatable and avoid cross-test interference, add a cleanup step (e.g., delete the created URI(s) in a finally block or in an @AfterEach) so reruns don’t depend on prior test state.

Comment on lines 1811 to +1833
  static private String getHeaderUri(BodyPart part) {
      try {
-         if (part != null) {
-             return part.getFileName();
+         if (part == null) {
+             return null;
          }
-         // if it's not found, just return null
+
+         try {
+             String filename = part.getFileName();
+             if (filename != null) {
+                 return filename;
+             }
+         } catch (ParseException e) {
+             // Jakarta Mail's parser failed due to malformed Content-Disposition header.
+             // Check if MarkLogic sent a malformed "format=" parameter at the end, which violates RFC 2183.
+             String contentDisposition = getHeader(part, "Content-Disposition");
+             if (contentDisposition != null && contentDisposition.matches(".*;\\s*format\\s*=\\s*$")) {
+                 // Remove the trailing "; format=" to fix the malformed header
+                 String cleaned = contentDisposition.replaceFirst(";\\s*format\\s*=\\s*$", "").trim();
+                 logger.debug("Removed trailing 'format=' from malformed Content-Disposition header: {} -> {}", contentDisposition, cleaned);
+                 return extractFilenameFromContentDisposition(cleaned);
+             }
+             throw e;
+         }

Copilot AI Feb 11, 2026

The new ParseException recovery path is important behavior (it changes failure into a best-effort filename extraction). Add a focused test that reproduces a malformed Content-Disposition ending in ; format= and asserts that getHeaderUri returns the expected filename (and does not throw).
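
As a rough illustration of what such a test would exercise, the cleanup step can be isolated from Jakarta Mail entirely. The sketch below reuses the regular expression from the diff above; the class name `TrailingFormatCleanup`, the method `stripTrailingEmptyFormat`, and the sample header values are hypothetical and are not part of the client API. It does not call `getHeaderUri` itself, so it only covers the string cleanup, not the parsing fallback.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of OkHttpServices.
final class TrailingFormatCleanup {

    // Same expression as in the diff: a trailing "; format=" that has no value.
    private static final Pattern TRAILING_EMPTY_FORMAT =
        Pattern.compile(";\\s*format\\s*=\\s*$");

    // Returns the header without a trailing empty "format=" parameter.
    // Headers that do not end that way come back unchanged (apart from trimming).
    static String stripTrailingEmptyFormat(String contentDisposition) {
        if (contentDisposition == null) {
            return null;
        }
        return TRAILING_EMPTY_FORMAT.matcher(contentDisposition).replaceFirst("").trim();
    }

    public static void main(String[] args) {
        // Assumed shape of the malformed header; the real server output may differ.
        String malformed = "attachment; filename=\"/sample/empty-file.txt\"; format=";
        System.out.println(stripTrailingEmptyFormat(malformed));
    }
}
```

A focused unit test could assert on both branches: a header ending in `; format=` loses that suffix, while a header with a non-empty `format=text` parameter is left alone.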

Comment on lines +1841 to +1855
+ static private String extractFilenameFromContentDisposition(String contentDisposition) {
+     if (contentDisposition == null) {
+         return null;
+     }
+     try {
+         // Use Jakarta Mail's ContentDisposition parser to extract the filename parameter. This is the class
+         // that throws an error when "format=" exists in the value, but that has been removed already.
+         ContentDisposition cd = new ContentDisposition(contentDisposition);
+         return cd.getParameter("filename");
+     } catch (ParseException e) {
+         logger.warn("Failed to parse cleaned Content-Disposition header: {}; cause: {}",
+             contentDisposition, e.getMessage());
+         return null;
+     }
+ }

Copilot AI Feb 11, 2026

This fallback only checks the filename parameter. For valid Content-Disposition headers, servers may provide filename* (RFC 5987/2231) instead of filename. In that case this method will return null even though a filename exists. Consider checking for filename* as well (and decoding it appropriately) before returning null.
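
One possible shape for that extra check, sketched with the JDK only: locate a `filename*` parameter of the form `charset'language'percent-encoded-value` and decode it. The class `ExtendedFilenameSketch` and the method `extractExtendedFilename` are hypothetical names, the regular expression is a simplification that does not account for semicolons inside quoted values, and `URLDecoder.decode(String, Charset)` requires Java 10 or later. Whether MarkLogic ever sends `filename*` is not established in this thread.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of OkHttpServices.
final class ExtendedFilenameSketch {

    // Matches: filename*=<charset>'<language>'<percent-encoded value>
    private static final Pattern EXT_FILENAME = Pattern.compile(
        "(?i)(?:^|;)\\s*filename\\*\\s*=\\s*([^';\\s]+)'([^';]*)'([^;\\s]+)");

    // Returns the decoded filename* value, or null if absent or undecodable.
    static String extractExtendedFilename(String contentDisposition) {
        if (contentDisposition == null) {
            return null;
        }
        Matcher m = EXT_FILENAME.matcher(contentDisposition);
        if (!m.find()) {
            return null;
        }
        try {
            Charset charset = Charset.forName(m.group(1));
            // URLDecoder turns '+' into a space, which the header encoding does not,
            // so literal '+' characters are protected before decoding.
            String protectedValue = m.group(3).replace("+", "%2B");
            return URLDecoder.decode(protectedValue, charset);
        } catch (IllegalArgumentException e) {
            // Unknown charset name or malformed percent-encoding.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(extractExtendedFilename(
            "attachment; filename*=UTF-8''%E2%82%AC%20rates.txt"));
    }
}
```

In the fallback, a call like this could run when `cd.getParameter("filename")` returns null, so the plain `filename` parameter keeps priority.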

@rjrudin force-pushed the feature/27077-empty-doc branch 3 times, most recently from 1a98321 to 4c6216a, February 11, 2026 12:03
@rjrudin force-pushed the feature/27077-empty-doc branch from 4c6216a to b0058d3, February 11, 2026 12:38
@rjrudin merged commit c505138 into develop Feb 11, 2026
4 checks passed
@rjrudin deleted the feature/27077-empty-doc branch February 11, 2026 13:36

2 participants