Conversation

@xingfan-msft

No description provided.

@xingfan-msft xingfan-msft requested a review from a team as a code owner May 12, 2025 09:52
@tnaum-ms tnaum-ms self-requested a review May 12, 2025 09:57
@tnaum-ms tnaum-ms self-assigned this May 12, 2025
Collaborator

@tnaum-ms tnaum-ms left a comment

@xingfan-msft Thank you for your first PR. Congratulations! 🎉

My comments in this review are there to help us align with the way we work in this repository.

  • I suggested changes to the buffer implementation to make it easier to use.
  • I've not started the extension with your changes, but it looks like it would work.
  • I've spotted unexpected changes to package.json. This reminds me that I didn't share our dev-guide with you (note-to-self: we should really add it to this repo). In general, we prefer more commits over a single one where many changes are bundled together - we're aware that a clean cut is not always possible, but even if it's fuzzy, it still helps to follow the 'chain-of-thought' ;)

I'll follow up on this in our call later today.

Comment on lines +82 to +98
"@microsoft/vscode-azext-dev": "^2.1.0",
"@pmmmwh/react-refresh-webpack-plugin": "^0.5.15",
"@swc/cli": "^0.6.0",
"@swc/core": "^1.11.7",
"@swc/jest": "^0.2.37",
"@types/documentdb": "^1.10.13",
"@types/fs-extra": "^11.0.4",
"@types/jest": "^29.5.14",
"@types/lodash.debounce": "^4.0.9",
"@types/lodash.omit": "^4.5.9",
"@types/mocha": "^10.0.10",
"@types/node": "~20.9.5",
"@types/pg": "^8.11.11",
"@types/react": "^18.3.18",
"@types/react-dom": "^18.3.5",
"@types/semver": "^7.3.12",
"@types/uuid": "^10.0.0",
Collaborator

I wonder, did you have a conflict on this file?

We recently introduced some changes to package.json and updated a few packages. Your pull request seems to introduce changes here as well.

Was any change to package.json intended?

Comment on lines 57 to 58
export type InsertDocumentsResult = {
/** Indicates whether this write result was acknowledged. If not, then all other members of this result will be undefined */
acknowledged: boolean;
/** The number of inserted documents for this operations */
Collaborator

Why was acknowledged removed?

skip: number,
limit: number,
): Promise<WithId<Document>[]> {
// eslint-disable-next-line @typescript-eslint/no-unsafe-assignment
Collaborator

It looks like our lint configs differ, as this disable comment shouldn't be necessary. Let's discuss live.
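
For reference, a minimal sketch (not this repository's actual code) of how this spot can stay fully typed without the disable comment, assuming the MongoDB Node.js driver's bundled typings; the helper name queryPage and the empty filter are placeholders:

import { Collection, Document, WithId } from 'mongodb';

// Hypothetical helper: find() on a typed Collection already returns
// FindCursor<WithId<Document>>, so toArray() involves no unsafe assignment.
async function queryPage(
    collection: Collection<Document>,
    skip: number,
    limit: number,
): Promise<WithId<Document>[]> {
    return collection.find({}).skip(skip).limit(limit).toArray();
}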

Comment on lines +475 to +476
forceServerObjectId: true,
ordered: false,
Collaborator

forceServerObjectId: this comment is not strictly related to this ticket. Let's create a research ticket to figure out which setting is recommended for bulk-insert scenarios where high throughput is expected (I don't want to invest in measurements at this stage; we're building a GUI, not a migration library).
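
For context, a hedged sketch of how these two options read on an insertMany call with the MongoDB Node.js driver (the function and variable names are placeholders); whether this combination is the right default for high-throughput bulk inserts is exactly what the research ticket should settle:

import { Collection, Document } from 'mongodb';

// Sketch only: forceServerObjectId: true asks the server to assign _id values,
// and ordered: false keeps inserting remaining documents after one of them fails.
async function bulkInsert(collection: Collection<Document>, docs: Document[]) {
    return collection.insertMany(docs, {
        forceServerObjectId: true,
        ordered: false,
    });
}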

Comment on lines +31 to +32
constructor(node: CollectionItem) {
this.node = node;
Collaborator

It's not on you here: we started with a dependency on a node because it's part of a command, but let's try to get rid of it as we go deeper. Here, a ClusterModel is a good candidate.
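
To illustrate the direction with a purely hypothetical sketch (the interface fields and the class name below are made up; the real ClusterModel in this repository may look different):

// Illustration only - shapes invented for this sketch.
interface ClusterModel {
    connectionString: string;
}

class ImportSession {
    // Depend on the data the command needs (the cluster plus names),
    // not on the tree node that happened to trigger the command.
    constructor(
        private readonly cluster: ClusterModel,
        private readonly databaseName: string,
        private readonly collectionName: string,
    ) {}

    describe(): string {
        return `${this.databaseName}.${this.collectionName} @ ${this.cluster.connectionString}`;
    }
}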

Comment on lines +28 to +29
private maxDocumentSize: number = 16 * 1024 * 1024;
private maxTotalSize: number = 32 * 1024 * 1024;
Collaborator

Let's remove maxDocumentSize and the document size check; we'll let the backend/the database deal with it.

Today it's 16 MB, but, IIRC, that limit applies to the BSON encoding, so measuring JSON can be a bit off and we'd run into errors that would be hard to resolve in the future.
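
To illustrate the sizing mismatch, a small sketch assuming the bson package that the MongoDB driver depends on (the sample document is arbitrary): the 16 MB limit the server enforces applies to the BSON encoding, which is not what a JSON-based client-side check measures:

import { calculateObjectSize } from 'bson';

const doc = { name: 'example', values: [1, 2, 3], createdAt: new Date() };

// What the server's 16 MB document limit actually applies to:
const bsonBytes = calculateObjectSize(doc);

// What a client-side check based on the JSON text would measure:
const jsonBytes = Buffer.byteLength(JSON.stringify(doc), 'utf8');

console.log({ bsonBytes, jsonBytes }); // the two routinely disagree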

Comment on lines +44 to +51
let flushResponse: { count: number; error: Array<{ index: number; errmsg: string }> | undefined } | undefined;
if (this.documents.length >= this.maxBatchSize || this.totalSize + docSize > this.maxTotalSize) {
flushResponse = await this.flush();
}

this.documents.push(document);
this.totalSize += docSize;
return flushResponse;
Collaborator

I understand the logic here.

We insert documents but get no feedback, i.e. the return value is undefined and I'm supposed to interpret that as success: the document has been added to the buffer.

Then, once the buffer is flushed on its own because it's full, I get a response stating how many documents have been inserted.

On top of that, the flush function is public, so I can call it myself once I'm done with the buffer and there are no more documents to add. I see that you're doing that in another section.

However, it's hard to read, and the behavior of the class becomes unpredictable over time: some calls will be fast, some will be slow. I'd suggest splitting it so that addDocuments focuses on filling the buffer and returning success/failure information. A boolean would do, but a status response might be helpful as well (although today the only rejection reason would be that the buffer is full).

The user would then call flush when needed.

Having a size function on the buffer would be helpful, too (a rough sketch follows below).
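
To make the suggestion concrete, here is a rough sketch of the shape this could take; the names, the status type, and the defaults are illustrative, not a prescribed implementation. addDocument only fills the buffer and reports whether the document fit, size() exposes the fill level, and the caller decides when the (potentially slow) flush happens:

import { Collection, Document } from 'mongodb';

type AddResult = { accepted: true } | { accepted: false; reason: 'buffer-full' };

class DocumentBuffer {
    private documents: Document[] = [];
    private totalSize = 0;

    constructor(
        private readonly collection: Collection<Document>,
        private readonly maxBatchSize = 1000,
        private readonly maxTotalSize = 32 * 1024 * 1024,
    ) {}

    /** Only fills the buffer; never triggers I/O, so every call is fast. */
    addDocument(document: Document, docSize: number): AddResult {
        if (this.documents.length >= this.maxBatchSize || this.totalSize + docSize > this.maxTotalSize) {
            return { accepted: false, reason: 'buffer-full' };
        }
        this.documents.push(document);
        this.totalSize += docSize;
        return { accepted: true };
    }

    /** Number of buffered documents, so callers can decide when to flush. */
    size(): number {
        return this.documents.length;
    }

    /** The caller controls when the insert round-trip happens. */
    async flush(): Promise<number> {
        if (this.documents.length === 0) {
            return 0;
        }
        const result = await this.collection.insertMany(this.documents, { ordered: false });
        this.documents = [];
        this.totalSize = 0;
        return result.insertedCount;
    }
}

The calling loop would then be: if addDocument reports the buffer is full, await flush() and retry the same document; after the loop, call flush() once more for whatever remains.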

Comment on lines +12 to 13
import * as fse from 'fs-extra';
import * as vscode from 'vscode';
Collaborator

What is the benefit of fse here? (Out of interest - I'm not saying it's a bad choice.)

@tnaum-ms tnaum-ms requested a review from Copilot May 13, 2025 08:38
Contributor

Copilot AI left a comment

Pull Request Overview

This PR, titled "insert with buffer", introduces a buffering mechanism for Mongo document insertion and improves error reporting in the insert flow.

  • Extended the InsertDocumentsResult type to include error indicators and detailed write errors.
  • Added a new DocumentBufferforMongo class to batch document inserts with size checks and flushing.
  • Adjusted dependency versions and ESLint disable comments to align with the updated code.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

  • src/documentdb/ClustersClient.ts: Enhanced insert operation by extending error reporting and adding try–catch around document replacement.
  • src/commands/importDocuments/importDocuments.ts: Introduced the DocumentBufferforMongo class for batching document insertion and refactored insertion logic.
  • package.json: Updated dependency versions and adjusted the extension version to match the intended release state.

Comments suppressed due to low confidence (1)

src/commands/importDocuments/importDocuments.ts:23

  • [nitpick] Consider renaming 'DocumentBufferforMongo' to 'DocumentBufferForMongo' for consistent CamelCase styling.
export class DocumentBufferforMongo {

acknowledged: boolean;
/** The number of inserted documents for this operations */
insertedCount: number;
/** Indicates whether error occured duiring insertion. If not, then the member `errorMessage` will undefined */
Copilot AI May 13, 2025

Typographical errors in the comment; consider replacing 'occured' with 'occurred', 'duiring' with 'during', and inserting 'be' before 'undefined'.

Suggested change
/** Indicates whether error occured duiring insertion. If not, then the member `errorMessage` will undefined */
/** Indicates whether an error occurred during insertion. If not, then the member `errorMessage` will be undefined */

insertedCount: number;
/** Indicates whether error occured duiring insertion. If not, then the member `errorMessage` will undefined */
errorsOccured: boolean;
/** Status code of insertion opertion */
Copilot AI May 13, 2025

Typo in documentation; replace 'opertion' with 'operation'.

Suggested change
/** Status code of insertion opertion */
/** Status code of insertion operation */

errorMessage: string | undefined;
/** Array of writeError while processing the bulk write */
/** If writeErrors is valid, means the failure happened to specific documents, and the database acknowledged bulk write operation */
/** If writeErrors is undefined, means the failure happened to the whole operation, may related to doccument size or some other errors, where retry should be applied */
Copilot AI May 13, 2025

Typographical error: 'doccument' should be corrected to 'document'.

Suggested change
/** If writeErrors is undefined, means the failure happened to the whole operation, may related to doccument size or some other errors, where retry should be applied */
/** If writeErrors is undefined, means the failure happened to the whole operation, may related to document size or some other errors, where retry should be applied */
