Skip to content

fix(db-mongodb): improve compatability with Firestore database #12763

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 59 commits into
base: main
Choose a base branch
from

Conversation

elliott-w
Copy link

@elliott-w elliott-w commented Jun 11, 2025

What?

Adds four more arguments to the mongooseAdapter:

  useJoinAggregations?: boolean  /* The big one */
  useAlternativeDropDatabase?: boolean
  useBigIntForNumberIDs?: boolean
  usePipelineInSortLookup?: boolean

Also export a new compatabilityOptions object from @payloadcms/db-mongodb where each key is a mongo-compatible database and the value is the recommended mongooseAdapter settings for compatability.

Why?

When using firestore and visiting /admin/collections/media/payload-folders, we get:

MongoServerError: invalid field(s) in lookup: [let, pipeline], only lookup(from, localField, foreignField, as) is supported

Firestore doesn't support the full MongoDB aggregation API used by Payload which gets used when building aggregations for populating join fields.

There are several other compatability issues with Firestore:

  • The invalid pipeline property is used in the $lookup aggregation in buildSortParams
  • Firestore only supports number IDs of type Long, but Mongoose converts custom ID fields of type number to Double
  • Firestore does not support the dropDatabase command
  • Firestore does not support the createIndex command (not addressed in this PR)

How?

useJoinAggregations?: boolean  /* The big one */

When this is false we skip the buildJoinAggregation() pipeline and resolve the join fields through multiple queries. This can potentially be used with AWS DocumentDB and Azure Cosmos DB to support join fields, but I have not tested with either of these databases.

useAlternativeDropDatabase?: boolean

When true, monkey-patch (replace) the dropDatabase function so that it calls collection.deleteMany({}) on every collection instead of sending a single dropDatabase command to the database

useBigIntForNumberIDs?: boolean

When true, use mongoose.Schema.Types.BigInt for custom ID fields of type number which converts to a firestore Long behind the scenes

  usePipelineInSortLookup?: boolean

When false, modify the sortAggregation pipeline in buildSortParams() so that we don't use the pipeline property in the $lookup aggregation. Results in slightly worse performance when sorting by relationship properties.

Limitations

This PR does not add support for transactions or creating indexes in firestore.

Fixes

Fixed a bug (and added a test) where you weren't able to sort by multiple properties on a relationship field.

Future work

  1. Firestore supports simple $lookup aggregations but other databases might not. Could add a useSortAggregations property which can be used to disable aggregations in sorting.

@elliott-w elliott-w changed the title fix(db-mongodb): Improve compatability with Firestore database fix(db-mongodb): improve compatability with Firestore database Jun 11, 2025
@elliott-w elliott-w force-pushed the firestore branch 2 times, most recently from b67270c to da01bfb Compare June 12, 2025 00:43
@FinnIckler
Copy link

I just ran into the issue that documentdb doesn't support $lookup concise correlated subquery , this could be fixed with this PR? I would be open to test if I got instruction how to monkey patch this version of payload into my dockerfile

Copy link
Contributor

@DanRibbens DanRibbens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an incredible PR and I love the effort put in. I to see this work finished out in a way that is not too specific to Firebase and I have a clear picture on how that can look.

What do you all think of this?
If we had individual flags for different compatibility options we could support all the different adapters with one overarching setting.
For example in documentDB or firebase you might needj to do:

{ 
  useFacets: false,
  useJoinAggregations: false,
  transactionOptions: false,
  disableIndexHints: true, // this should probably be inverted for consistency
  // etc...
}

With each compatibility flag individually allowed to be set you could do it directly, but the easy mode for users would be to set their compatibilityMode: 'firebase' or 'documentDB' which we could then just spread the compatibility flags object that we have in code specific to each provider.

@elliott-w
Copy link
Author

@FinnIckler

I just ran into the issue that documentdb doesn't support $lookup concise correlated subquery , this could be fixed with this PR? I would be open to test if I got instruction how to monkey patch this version of payload into my dockerfile

I tried linking @payloadcms/db-mongodb in my node_modules to a local development folder, but ran into a bunch of issues - the main one being that packages/db-mongodb/**/*.ts files that import other payload packages first look in packages/db-mongodb/node_modules rather than your external project's node_modules.

I ended up just running tests directly in the payload repo with a .env file in root directory and set the DATABASE_URI.

If you checkout the firestore branch on my repo and copy-paste the test:int:firestore script to test:int:documentdb and then add a documentdb key in generateDatabaseAdapter.ts, you can run integration tests against DocumentDB. The main tests to run would be:

  • pnpm test:int:documentdb database
  • pnpm test:int:documentdb relationships
  • pnpm test:int:documentdb joins

@DanRibbens

This is an incredible PR and I love the effort put in.

Thank you! <3

With each compatibility flag individually allowed to be set you could do it directly, but the easy mode for users would be to set their compatibilityMode: 'firebase' or 'documentDB' which we could then just spread the compatibility flags object that we have in code specific to each provider.

I was going to go with this approach initially, but I thought there might be so many little flags per database it would clutter the mongooseAdapter options. For example, firestore would require 3 extra:

useAlternativeDropDatabase?: boolean
useBigIntForCustomIdNumberField?: boolean
usePipelineInSortLookup?: boolean

Maybe you could have a minorCompatabilityOptions object to hide some of that complexity, but seems weird idk.

I'll rename manualJoins to useJoinAggregations.

Copy link
Member

@r1tsuu r1tsuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really hugee PR! Thank you! Here's some feedback / questions

Comment on lines 99 to 149
const as = `__${relationshipPath.replace(/\./g, '__')}`

// If we have not already sorted on this relationship yet, we need to add a lookup stage
if (!sortAggregation.some((each) => '$lookup' in each && each.$lookup.as === as)) {
let localField = versions ? `version.${relationshipPath}` : relationshipPath

if (adapter.compatabilityMode === 'firestore') {
const flattenedField = `__${localField.replace(/\./g, '__')}_lookup`
sortAggregation.push({
$addFields: {
[flattenedField]: `$${localField}`,
},
})
localField = flattenedField
}

sortAggregation.push({
$lookup: {
as: `__${path}`,
as,
foreignField: '_id',
from: foreignCollection.Model.collection.name,
localField: versions ? `version.${relationshipPath}` : relationshipPath,
pipeline: [
{
$project: {
[sortFieldPath]: true,
localField,
...(adapter.compatabilityMode !== 'firestore' && {
pipeline: [
{
$project: {
[sortFieldPath]: true,
},
},
},
],
],
}),
},
})

sort[`__${path}.${sortFieldPath}`] = sortDirection
if (adapter.compatabilityMode === 'firestore') {
sortAggregation.push({
$unset: localField,
})
}
}

return true
if (adapter.compatabilityMode !== 'firestore') {
const lookup = sortAggregation.find(
(each) => '$lookup' in each && each.$lookup.as === as,
) as PipelineStage.Lookup
const pipeline = lookup.$lookup.pipeline![0] as PipelineStage.Project
pipeline.$project[sortFieldPath] = true
}

sort[`${as}.${sortFieldPath}`] = sortDirection
return true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need only 1 condition here with firestore, you should just ignore relationshipsSort:

if (
sortAggregation &&
relationshipSort({
adapter,
fields,
locale,
path: sortProperty,
sort: acc,
sortAggregation,
sortDirection,
versions,
})
) {
return acc
}

since aggregations are not supported there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r1tsuu Actually firestore does support aggregations, just not the full aggregation API. So when sorting on relationship properties, even though we can't project on relationship fields I think it's more efficient to $lookup the entire relationship document and use the $sort aggregation in mongodb, rather than doing the sorting client-side.

For the test 'should sort by multiple properties of a relationship', this is what the native aggregation looks like:

[
  { "$match": {} },
  {
    "$lookup": {
      "as": "__director",
      "foreignField": "_id",
      "from": "directors",
      "localField": "director",
      "pipeline": [ { "$project": { "name": true, "createdAt": true } } ]
    }
  },
  {
    "$sort": { "__director.name": 1, "__director.createdAt": 1, "createdAt": -1 }
  },
  { "$skip": 0 },
  { "$limit": 10 }
]

And for firestore:

[
  { "$match": {} },
  { "$addFields": { "__director_lookup": "$director" } },
  {
    "$lookup": {
      "as": "__director",
      "foreignField": "_id",
      "from": "directors",
      "localField": "__director_lookup"
    }
  },
  { "$unset": "__director_lookup" },
  {
    "$sort": { "__director.name": 1, "__director.createdAt": 1, "createdAt": -1 }
  },
  { "$skip": 0 },
  { "$limit": 10 }
]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could potentially unset the relationship field (e.g. __director) after sorting to reduce the amount of unecessary data sent back to the client. What do you think?

Comment on lines 24 to 42
firestore: `
import { mongooseAdapter } from '@payloadcms/db-mongodb'

if (!process.env.DATABASE_URI) {
throw new Error('DATABASE_URI must be set when using firestore')
}

export const databaseAdapter = mongooseAdapter({
ensureIndexes: false,
disableIndexHints: true,
useJoinAggregations: false,
url: process.env.DATABASE_URI,
collation: {
strength: 1,
},
compatabilityMode: 'firestore'
})`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this to the workflow so the tests with it will run on CI

strategy:
fail-fast: false
matrix:
database:
- mongodb
- postgres
- postgres-custom-schema
- postgres-uuid
- supabase
- sqlite
- sqlite-uuid

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r1tsuu Would you guys setup a firestore database for testing? Or do you want me to just set it up so it tests against in-memory mongodb with the feature flags specifically for firestore?

Either way it will fail some tests (mostly those related to indexes and transactions) so would need to add some way to skip those tests for firestore.

@@ -110,6 +110,8 @@ export interface Args {
collation?: Omit<CollationOptions, 'locale'>

collectionsSchemaOptions?: Partial<Record<CollectionSlug, SchemaOptions>>
/** Solves some common issues related to the specified database. Full compatability is not guaranteed. */
compatabilityMode?: 'firestore'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @DanRibbens here on compatiblityMode. This property shouldn't be used for the adapter's logic, but only to set some default properties, like useJoinAggregations: false. Another approach, instead of having this "useless" property would be something like:

import { mongooseAdapter, firestoreCompatibillity } from '@payloadcms/db-mognodb'

mongooseAdapter({ ...firestoreCompatibillity, url: process.env.DATABASE_URI })

what do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will implement this!

Comment on lines 146 to 150
_id:
idField.type === 'number'
? payload.db.compatabilityMode === 'firestore'
? mongoose.Schema.Types.BigInt
: Number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be a flag, like useBigIntForNumberIDs: true

Comment on lines 908 to 912
if (payload.db.compatabilityMode === 'firestore') {
return mongoose.Schema.Types.BigInt
} else {
return mongoose.Schema.Types.Number
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here as well

Comment on lines +753 to +805
it('should sort by multiple properties of a relationship', async () => {
await payload.delete({ collection: 'directors', where: {} })
await payload.delete({ collection: 'movies', where: {} })

const createDirector = {
collection: 'directors',
data: {
name: 'Dan',
},
} as const

const director_1 = await payload.create(createDirector)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this and your changes to the logic in buildSortParam are not related to firestore compatibility? I wonder if it'd be better to open a separate, fix PR for this, which should be easier for review and have more high priority for merge. But if we can merge this one quickly it's fine too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r1tsuu buildSortParam changes are required for compatability with firestore, see comment above

Comment on lines 261 to 265
// Add polymorphic joins
for (const join of collectionConfig.polymorphicJoins || []) {
// For polymorphic joins, we use the collections array as the target
joinMap[join.joinPath] = { ...join, targetCollection: join.field.collection as string }
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question here. One of the nice things about polymorphic joins is that like with SQL' UNION ALL, it can sort and so paginate documents from multiple collections at once, so your result isn't always just like:
[...Documents from collection 1, ...Documents from collection 2], rather you can have:
[Document from collection 1, Document from collection 2, Document from collection 1]

I suppose without aggregations this is not possible, right?

Copy link
Author

@elliott-w elliott-w Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you could do this with the $unionWith stage operator, but firestore doesn't support that either. My intuition is that most mongo-compatible nosql databases are unlikely to support the stage operators ($unionWith or pipeline in $lookup) required for polymorphic joins.

Comment on lines 452 to 473
// Sort the grouped results
const sortParam = joinQuery.sort || joinDef.field.defaultSort
for (const parentKey in grouped) {
if (sortParam) {
// Parse the sort parameter
let sortField: string
let sortDirection: 'asc' | 'desc' = 'asc'

if (typeof sortParam === 'string') {
if (sortParam.startsWith('-')) {
sortField = sortParam.substring(1)
sortDirection = 'desc'
} else {
sortField = sortParam
}
} else {
// For non-string sort params, fall back to default
sortField = 'createdAt'
sortDirection = 'desc'
}

grouped[parentKey]!.sort((a, b) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this is the answer to my question above. You did this at the app level, what about performance since you need to retrieve all the documents that match the query from the DB for each collection, and there's no any projection set either?

Comment on lines 667 to 668
// Execute the query to get all related documents
const results = await JoinModel.find(whereQuery, null).sort(mongooseSort).lean()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can use projection, at least for non polymorphic you surely need only _id, otherwise you need fields that are specified in sort I suppose.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me a while to figure out why you'd need a projection here, because I couldn't find anyway to project on the join documents in the Local API.

It seems that the initial find only queries join document ids (and sort fields) then there's a secondary set of finds (one for each collection in the join) that gets executed in the afterRead hook to populate those documents with the full data.

Is that correct? If so, why not just use a single query with aggregations? Is it because it's easier to support other databases that way?

I will add a projection for the initial find.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we do have additional finds for population. The main reason is for access control, but it also does simplify the db adapter contract and isolate concerns that are taken care of in field hooks.

@elliott-w elliott-w requested review from DanRibbens and r1tsuu June 16, 2025 13:02
@elliott-w
Copy link
Author

@DanRibbens

I removed the compatabilityMode arg and now export a compatabilityOptions object instead.

@r1tsuu

I've simplified the resolveJoins code quite a bit and added a projection.

I've added firestore to the testing matrix, but it doesn't use all of compatabilityOptions.firestore because some of them cause tests to fail (related to transactions and indexes). I don't think firestore provides github actions for their database since it's quite new, but you can always create a firestore database for testing purposes and just reference it with DATABASE_URI.

Also updated the original comment to represent the current state of this PR.

@elliott-w
Copy link
Author

@DanRibbens fixed int-firestore job, it was not starting the in-memory mongodb server. Could you please run the tests again?

Copy link
Member

@r1tsuu r1tsuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed everything and as long as tests do pass I think it's all good. My only concern is that resolveJoins may be slow for large collections, but we'll see that if it's true and can optimize later. Really huge work, thanks!

Looks like just 1 test fails with firestore - https://github.com/payloadcms/payload/actions/runs/15914464389/job/44916302036?pr=12763#step:11:20544, are you able to fix that and I think we can get this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this file

Comment on lines +39 to 58
if (this.useAlternativeDropDatabase) {
if (this.connection.db) {
// Firestore doesn't support dropDatabase, so we monkey patch
// dropDatabase to delete all documents from all collections instead
this.connection.db.dropDatabase = async function (): Promise<boolean> {
const existingCollections = await this.listCollections().toArray()
await Promise.all(
existingCollections.map(async (collectionInfo) => {
const collection = this.collection(collectionInfo.name)
await collection.deleteMany({})
}),
)
return true
}
this.connection.dropDatabase = async function () {
await this.db?.dropDatabase()
}
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this also drop existing indexes and collections themselves outside of just deleting all the documents?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@r1tsuu ideally yes, but firestore doesn't support managing (CRUD) indexes through the Mongo API (Wire Protocol). It also doesn't support the dropCollection command. It all has to be done through the Google Cloud Console or CLI.

I don't think this is that big of an issue though, because the only time payload ever drops a database is during testing and it's fine to pollute a test db with empty collections.

@elliott-w
Copy link
Author

@r1tsuu

My only concern is that resolveJoins may be slow for large collections

For join fields that reference a singular collection, we do sorting and limit in the database, so it'll be just as fast as the regular join aggregation pipeline.

For polymorphic join fields that reference multiple collections, it will be slow if there are many documents that can populate the join field. For example, let's say you are "Browsing all collections by folder" and you have a folder that has thousands of documents across multiple collections, getting the first ten documents for that folder requires retrieving every document (from every collection) in the folder and then sorting and limiting on the client-side.

There's probably some small optimizations you could make for niche scenarios, but ultimately by using these non-native mongodb databases that don't support:

  • let or pipeline in $lookup aggregation
  • $unionWith aggregation

The slow performance for certain complex queries is the tradeoff you have to accept.

@elliott-w
Copy link
Author

@r1tsuu

Looks like just 1 test fails with firestore - https://github.com/payloadcms/payload/actions/runs/15914464389/job/44916302036?pr=12763#step:11:20544, are you able to fix that and I think we can get this?

I think data is not being cleared up from other tests, probably due to the useAlternativeDropDatabase method. I've set this to false for the integration tests (although I think it's still useful to keep this option available for when developing/testing against an actual firestore database).

Please run tests again! ❤️

elliott-w and others added 28 commits June 28, 2025 14:18
The test was incorrectly expecting only 1 document when it should expect 2.
The documentsAndFolders join correctly returns documents from both folderPoly1
and folderPoly2 collections when querying with a relationTo filter that includes
both collection types.

Also cleaned up the test setup to remove redundant WHERE filters that were
causing the test to only return one collection type.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Add commonTitle field to FolderPoly1 and FolderPoly2 collections to support
the polymorphic join test case requirements. This field provides a shared
property across both collection types for testing purposes.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@elliott-w
Copy link
Author

Apologies for the mess of commits, if I merge from main then the PR gets polluted with file changes I didn't make, and if I rebase onto main then the PR convo gets flooded with the same commits but with different IDs. Please let me know if there's a better way 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants