Skip to content

Conversation

@jordan-smith721
Copy link
Collaborator

@jordan-smith721 jordan-smith721 commented Nov 6, 2025

Adds MongoDB Search, Vector Search, and Aggregation endpoints to the express API. Also adds unit testing for these endpoints and cleans up the unit test files.
This PR also fixes up an issue with the vector search results pagination not working correctly

Copy link
Collaborator

@dacharyc dacharyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of route issues we should fix to match the client, and a tsconfig Q related to a client file included here. Everything else is just a Q to consider.

"resolveJsonModule": true,
"isolatedModules": true,
"jsx": "react-jsx",
"jsx": "preserve",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I run the client, it's automatically switching this value back to "react-jsx" with this message:

   We detected TypeScript in your project and reconfigured your tsconfig.json file for you.
   The following mandatory changes were made to your tsconfig.json:

        - jsx was set to react-jsx (next.js uses the React automatic runtime)

Curious why you're changing it here, and if we need it to be "preserve", what we might need to change in the client to force it to respect this config field?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's interesting, every time I run it it's been automatically changing it to preserve. I'm not sure what's causing the change but I'll look into it. For now I'll revert to react-jsx since that's what I originally had in there before it started changing it

Copy link
Collaborator Author

@jordan-smith721 jordan-smith721 Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly it's almost the same message too:

We detected TypeScript in your project and reconfigured your tsconfig.json file for you.
The following mandatory changes were made to your tsconfig.json:

     - jsx was set to preserve (next.js implements its own optimized jsx transform)

Copy link
Collaborator Author

@jordan-smith721 jordan-smith721 Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh I see, it's a versioning issue. From Claude:

The Problem
The issue occurs because Next.js automatically modifies your tsconfig.json file when it starts up. Different versions of Next.js have different preferences for the jsx setting:

Your current setup: jsx: "preserve" (which Next.js 15+ prefers)
Others' setup: jsx: "react-jsx" (which older versions or different configurations prefer)
Why This Happens
Version Differences: Different Next.js versions have different default TSConfig preferences
Auto-Configuration: Next.js automatically "fixes" what it thinks are incorrect settings
Lock File Issues: Different package-lock.json or yarn.lock states can cause different dependency resolutions

Since this is using the latest version of Next, I think we should actually stick with preserve since later versions prefer it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't track with the usage, though. I ran npm install on the client in the last few days so it's pulling in the most recent version of Next.js. (Actually I checked and I've got 16.0.0 and it looks like Next released 16.0.1 recently.)

It may be a versioning difference issue, but that doesn't really hold up if we're both on the latest version of Next. So I think it's more likely it's some sort of configuration difference, or maybe a Node or other version issue, and we should probably try to get to the bottom of this as a fast follow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Okay yeah I'll add it to the list of fast follows and try and get to the bottom of what's going on here

}

// Validate and set limit (default: 10, min: 1, max: 50)
const limitNum = Math.min(Math.max(parseInt(limit as string) || 10, 1), 50);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious about this. When performing vector search, the client provides a default limit of 20 - should the backend default match the client default, or are we intentionally defaulting to a different value of 10 here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it should match, good catch!

index: "vector_index",
path: "plot_embedding_voyage_3_large",
queryVector: queryVector,
numCandidates: limitNum * 15, // Search more candidates for better results
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there something we should say about why we're multiplying the limit by 15 here? 15 feels like an arbitrary/magic number and I wonder if there's some way we arrived at this number or some guidance we should provide about balancing performance/quality here when setting this multiple?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out the number is actually supposed to be 20! From the docs here: https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-stage/#numcandidates-selection

I clarified the comment to explain better

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome - thanks for the doc link! Good to know what the guidance is. I haven't checked the other two backends but we should probably check what the others are using as part of the cleanup pass tomorrow, and update it if we're using different values there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, I'll check!

* Find similar movies using vector search based on a movie ID.
* Demonstrates finding semantically similar movies using plot embeddings.
*/
export async function findSimilarMovies(req: Request, res: Response): Promise<void> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed it but I can't find anywhere in the frontend that we're using this API. Are we supporting future development here? I wonder if we can/should omit this from the v1 of the app if we haven't implemented the corresponding frontend feature yet.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yep, this was a separate, earlier implementation of vector search that wasn't working out. I definitely thought I got rid of it but I guess not! Removed now.

* Aggregate movies with their most recent comments.
* Demonstrates MongoDB $lookup aggregation to join collections.
*/
router.get("/aggregations/comments", asyncHandler(movieController.getMoviesWithMostRecentComments));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, these aggregation routes don't match the frontend. We need some adjustments to match the client:

Suggested change
router.get("/aggregations/comments", asyncHandler(movieController.getMoviesWithMostRecentComments));
router.get("/aggregations/reportingByComments", asyncHandler(movieController.getMoviesWithMostRecentComments));

* Aggregate directors with the most movies.
* Demonstrates MongoDB $unwind and $group for array aggregation.
*/
router.get("/aggregations/directors", asyncHandler(movieController.getDirectorsWithMostMovies));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
router.get("/aggregations/directors", asyncHandler(movieController.getDirectorsWithMostMovies));
router.get("/aggregations/reportingByDirectors", asyncHandler(movieController.getDirectorsWithMostMovies));

router.get("/aggregations/years", asyncHandler(movieController.getMoviesByYearWithStats));

/**
* GET /api/movies/aggregations/directors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* GET /api/movies/aggregations/directors
* GET /api/movies/aggregations/reportingByDirectors

title: result.title,
year: result.year,
genres: result.genres,
imdb: result.imdbRating ? { rating: result.imdbRating } : undefined,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to match what the client is expecting:

Suggested change
imdb: result.imdbRating ? { rating: result.imdbRating } : undefined,
imdbRating: result.imdbRating,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

Comment on lines 235 to 238
imdb?: {
rating?: number;
votes?: number;
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the mismatch with what the client is expecting:

Suggested change
imdb?: {
rating?: number;
votes?: number;
};
imdbRating?: number;

* Find similar movies using vector search based on a movie ID.
* Demonstrates finding semantically similar movies using plot embeddings.
*/
router.get("/find-similar-movies", asyncHandler(movieController.findSimilarMovies));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted in the controller, but it looks like the frontend isn't using this route currently. Should we omit this until the frontend has implemented it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

Copy link
Collaborator

@dacharyc dacharyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome - thanks for the fixups, everything seems to be working well! ✅

@dacharyc dacharyc merged commit d7c83d3 into mongodb:development Nov 7, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants