-
Notifications
You must be signed in to change notification settings - Fork 2.6k
fix(db-mongodb): improve compatability with Firestore database #12763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
b67270c
to
da01bfb
Compare
I just ran into the issue that documentdb doesn't support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an incredible PR and I love the effort put in. I to see this work finished out in a way that is not too specific to Firebase and I have a clear picture on how that can look.
What do you all think of this?
If we had individual flags for different compatibility options we could support all the different adapters with one overarching setting.
For example in documentDB or firebase you might needj to do:
{
useFacets: false,
useJoinAggregations: false,
transactionOptions: false,
disableIndexHints: true, // this should probably be inverted for consistency
// etc...
}
With each compatibility flag individually allowed to be set you could do it directly, but the easy mode for users would be to set their compatibilityMode: 'firebase'
or 'documentDB'
which we could then just spread the compatibility flags object that we have in code specific to each provider.
I tried linking I ended up just running tests directly in the If you checkout the
Thank you! <3
I was going to go with this approach initially, but I thought there might be so many little flags per database it would clutter the
Maybe you could have a I'll rename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really hugee PR! Thank you! Here's some feedback / questions
const as = `__${relationshipPath.replace(/\./g, '__')}` | ||
|
||
// If we have not already sorted on this relationship yet, we need to add a lookup stage | ||
if (!sortAggregation.some((each) => '$lookup' in each && each.$lookup.as === as)) { | ||
let localField = versions ? `version.${relationshipPath}` : relationshipPath | ||
|
||
if (adapter.compatabilityMode === 'firestore') { | ||
const flattenedField = `__${localField.replace(/\./g, '__')}_lookup` | ||
sortAggregation.push({ | ||
$addFields: { | ||
[flattenedField]: `$${localField}`, | ||
}, | ||
}) | ||
localField = flattenedField | ||
} | ||
|
||
sortAggregation.push({ | ||
$lookup: { | ||
as: `__${path}`, | ||
as, | ||
foreignField: '_id', | ||
from: foreignCollection.Model.collection.name, | ||
localField: versions ? `version.${relationshipPath}` : relationshipPath, | ||
pipeline: [ | ||
{ | ||
$project: { | ||
[sortFieldPath]: true, | ||
localField, | ||
...(adapter.compatabilityMode !== 'firestore' && { | ||
pipeline: [ | ||
{ | ||
$project: { | ||
[sortFieldPath]: true, | ||
}, | ||
}, | ||
}, | ||
], | ||
], | ||
}), | ||
}, | ||
}) | ||
|
||
sort[`__${path}.${sortFieldPath}`] = sortDirection | ||
if (adapter.compatabilityMode === 'firestore') { | ||
sortAggregation.push({ | ||
$unset: localField, | ||
}) | ||
} | ||
} | ||
|
||
return true | ||
if (adapter.compatabilityMode !== 'firestore') { | ||
const lookup = sortAggregation.find( | ||
(each) => '$lookup' in each && each.$lookup.as === as, | ||
) as PipelineStage.Lookup | ||
const pipeline = lookup.$lookup.pipeline![0] as PipelineStage.Project | ||
pipeline.$project[sortFieldPath] = true | ||
} | ||
|
||
sort[`${as}.${sortFieldPath}`] = sortDirection | ||
return true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need only 1 condition here with firestore
, you should just ignore relationshipsSort
:
payload/packages/db-mongodb/src/queries/buildSortParam.ts
Lines 180 to 194 in 53835f2
if ( | |
sortAggregation && | |
relationshipSort({ | |
adapter, | |
fields, | |
locale, | |
path: sortProperty, | |
sort: acc, | |
sortAggregation, | |
sortDirection, | |
versions, | |
}) | |
) { | |
return acc | |
} |
since aggregations are not supported there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@r1tsuu Actually firestore does support aggregations, just not the full aggregation API. So when sorting on relationship properties, even though we can't project on relationship fields I think it's more efficient to $lookup
the entire relationship document and use the $sort
aggregation in mongodb, rather than doing the sorting client-side.
For the test 'should sort by multiple properties of a relationship'
, this is what the native aggregation looks like:
[
{ "$match": {} },
{
"$lookup": {
"as": "__director",
"foreignField": "_id",
"from": "directors",
"localField": "director",
"pipeline": [ { "$project": { "name": true, "createdAt": true } } ]
}
},
{
"$sort": { "__director.name": 1, "__director.createdAt": 1, "createdAt": -1 }
},
{ "$skip": 0 },
{ "$limit": 10 }
]
And for firestore:
[
{ "$match": {} },
{ "$addFields": { "__director_lookup": "$director" } },
{
"$lookup": {
"as": "__director",
"foreignField": "_id",
"from": "directors",
"localField": "__director_lookup"
}
},
{ "$unset": "__director_lookup" },
{
"$sort": { "__director.name": 1, "__director.createdAt": 1, "createdAt": -1 }
},
{ "$skip": 0 },
{ "$limit": 10 }
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could potentially unset the relationship field (e.g. __director
) after sorting to reduce the amount of unecessary data sent back to the client. What do you think?
test/generateDatabaseAdapter.ts
Outdated
firestore: ` | ||
import { mongooseAdapter } from '@payloadcms/db-mongodb' | ||
|
||
if (!process.env.DATABASE_URI) { | ||
throw new Error('DATABASE_URI must be set when using firestore') | ||
} | ||
|
||
export const databaseAdapter = mongooseAdapter({ | ||
ensureIndexes: false, | ||
disableIndexHints: true, | ||
useJoinAggregations: false, | ||
url: process.env.DATABASE_URI, | ||
collation: { | ||
strength: 1, | ||
}, | ||
compatabilityMode: 'firestore' | ||
})`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add this to the workflow so the tests with it will run on CI
payload/.github/workflows/main.yml
Lines 166 to 176 in 53835f2
strategy: | |
fail-fast: false | |
matrix: | |
database: | |
- mongodb | |
- postgres | |
- postgres-custom-schema | |
- postgres-uuid | |
- supabase | |
- sqlite | |
- sqlite-uuid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@r1tsuu Would you guys setup a firestore database for testing? Or do you want me to just set it up so it tests against in-memory mongodb with the feature flags specifically for firestore?
Either way it will fail some tests (mostly those related to indexes and transactions) so would need to add some way to skip those tests for firestore.
packages/db-mongodb/src/index.ts
Outdated
@@ -110,6 +110,8 @@ export interface Args { | |||
collation?: Omit<CollationOptions, 'locale'> | |||
|
|||
collectionsSchemaOptions?: Partial<Record<CollectionSlug, SchemaOptions>> | |||
/** Solves some common issues related to the specified database. Full compatability is not guaranteed. */ | |||
compatabilityMode?: 'firestore' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @DanRibbens here on compatiblityMode
. This property shouldn't be used for the adapter's logic, but only to set some default properties, like useJoinAggregations: false
. Another approach, instead of having this "useless" property would be something like:
import { mongooseAdapter, firestoreCompatibillity } from '@payloadcms/db-mognodb'
mongooseAdapter({ ...firestoreCompatibillity, url: process.env.DATABASE_URI })
what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will implement this!
_id: | ||
idField.type === 'number' | ||
? payload.db.compatabilityMode === 'firestore' | ||
? mongoose.Schema.Types.BigInt | ||
: Number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be a flag, like useBigIntForNumberIDs: true
if (payload.db.compatabilityMode === 'firestore') { | ||
return mongoose.Schema.Types.BigInt | ||
} else { | ||
return mongoose.Schema.Types.Number | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here as well
it('should sort by multiple properties of a relationship', async () => { | ||
await payload.delete({ collection: 'directors', where: {} }) | ||
await payload.delete({ collection: 'movies', where: {} }) | ||
|
||
const createDirector = { | ||
collection: 'directors', | ||
data: { | ||
name: 'Dan', | ||
}, | ||
} as const | ||
|
||
const director_1 = await payload.create(createDirector) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this and your changes to the logic in buildSortParam
are not related to firestore compatibility? I wonder if it'd be better to open a separate, fix
PR for this, which should be easier for review and have more high priority for merge. But if we can merge this one quickly it's fine too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@r1tsuu buildSortParam
changes are required for compatability with firestore, see comment above
// Add polymorphic joins | ||
for (const join of collectionConfig.polymorphicJoins || []) { | ||
// For polymorphic joins, we use the collections array as the target | ||
joinMap[join.joinPath] = { ...join, targetCollection: join.field.collection as string } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question here. One of the nice things about polymorphic joins is that like with SQL' UNION ALL
, it can sort and so paginate documents from multiple collections at once, so your result isn't always just like:
[...Documents from collection 1, ...Documents from collection 2]
, rather you can have:
[Document from collection 1, Document from collection 2, Document from collection 1]
I suppose without aggregations this is not possible, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you could do this with the $unionWith
stage operator, but firestore doesn't support that either. My intuition is that most mongo-compatible nosql databases are unlikely to support the stage operators ($unionWith
or pipeline
in $lookup
) required for polymorphic joins.
// Sort the grouped results | ||
const sortParam = joinQuery.sort || joinDef.field.defaultSort | ||
for (const parentKey in grouped) { | ||
if (sortParam) { | ||
// Parse the sort parameter | ||
let sortField: string | ||
let sortDirection: 'asc' | 'desc' = 'asc' | ||
|
||
if (typeof sortParam === 'string') { | ||
if (sortParam.startsWith('-')) { | ||
sortField = sortParam.substring(1) | ||
sortDirection = 'desc' | ||
} else { | ||
sortField = sortParam | ||
} | ||
} else { | ||
// For non-string sort params, fall back to default | ||
sortField = 'createdAt' | ||
sortDirection = 'desc' | ||
} | ||
|
||
grouped[parentKey]!.sort((a, b) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this is the answer to my question above. You did this at the app level, what about performance since you need to retrieve all the documents that match the query from the DB for each collection, and there's no any projection
set either?
// Execute the query to get all related documents | ||
const results = await JoinModel.find(whereQuery, null).sort(mongooseSort).lean() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can use projection
, at least for non polymorphic you surely need only _id
, otherwise you need fields that are specified in sort
I suppose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took me a while to figure out why you'd need a projection
here, because I couldn't find anyway to project on the join documents in the Local API.
It seems that the initial find
only queries join document ids (and sort fields) then there's a secondary set of find
s (one for each collection in the join) that gets executed in the afterRead
hook to populate those documents with the full data.
Is that correct? If so, why not just use a single query with aggregations? Is it because it's easier to support other databases that way?
I will add a projection for the initial find.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we do have additional find
s for population. The main reason is for access control, but it also does simplify the db adapter contract and isolate concerns that are taken care of in field hooks.
I removed the I've simplified the I've added Also updated the original comment to represent the current state of this PR. |
@DanRibbens fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed everything and as long as tests do pass I think it's all good. My only concern is that resolveJoins
may be slow for large collections, but we'll see that if it's true and can optimize later. Really huge work, thanks!
Looks like just 1 test fails with firestore
- https://github.com/payloadcms/payload/actions/runs/15914464389/job/44916302036?pr=12763#step:11:20544, are you able to fix that and I think we can get this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this file
if (this.useAlternativeDropDatabase) { | ||
if (this.connection.db) { | ||
// Firestore doesn't support dropDatabase, so we monkey patch | ||
// dropDatabase to delete all documents from all collections instead | ||
this.connection.db.dropDatabase = async function (): Promise<boolean> { | ||
const existingCollections = await this.listCollections().toArray() | ||
await Promise.all( | ||
existingCollections.map(async (collectionInfo) => { | ||
const collection = this.collection(collectionInfo.name) | ||
await collection.deleteMany({}) | ||
}), | ||
) | ||
return true | ||
} | ||
this.connection.dropDatabase = async function () { | ||
await this.db?.dropDatabase() | ||
} | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this also drop existing indexes and collections themselves outside of just deleting all the documents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@r1tsuu ideally yes, but firestore doesn't support managing (CRUD) indexes through the Mongo API (Wire Protocol). It also doesn't support the dropCollection command. It all has to be done through the Google Cloud Console or CLI.
I don't think this is that big of an issue though, because the only time payload ever drops a database is during testing and it's fine to pollute a test db with empty collections.
For join fields that reference a singular collection, we do sorting and limit in the database, so it'll be just as fast as the regular join aggregation pipeline. For polymorphic join fields that reference multiple collections, it will be slow if there are many documents that can populate the join field. For example, let's say you are "Browsing all collections by folder" and you have a folder that has thousands of documents across multiple collections, getting the first ten documents for that folder requires retrieving every document (from every collection) in the folder and then sorting and limiting on the client-side. There's probably some small optimizations you could make for niche scenarios, but ultimately by using these non-native mongodb databases that don't support:
The slow performance for certain complex queries is the tradeoff you have to accept. |
I think data is not being cleared up from other tests, probably due to the Please run tests again! ❤️ |
The test was incorrectly expecting only 1 document when it should expect 2. The documentsAndFolders join correctly returns documents from both folderPoly1 and folderPoly2 collections when querying with a relationTo filter that includes both collection types. Also cleaned up the test setup to remove redundant WHERE filters that were causing the test to only return one collection type. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Add commonTitle field to FolderPoly1 and FolderPoly2 collections to support the polymorphic join test case requirements. This field provides a shared property across both collection types for testing purposes. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
This reverts commit 853e0c6.
Apologies for the mess of commits, if I merge from main then the PR gets polluted with file changes I didn't make, and if I rebase onto main then the PR convo gets flooded with the same commits but with different IDs. Please let me know if there's a better way 😅 |
What?
Adds four more arguments to the
mongooseAdapter
:Also export a new
compatabilityOptions
object from@payloadcms/db-mongodb
where each key is a mongo-compatible database and the value is the recommendedmongooseAdapter
settings for compatability.Why?
When using firestore and visiting
/admin/collections/media/payload-folders
, we get:Firestore doesn't support the full MongoDB aggregation API used by Payload which gets used when building aggregations for populating join fields.
There are several other compatability issues with Firestore:
pipeline
property is used in the$lookup
aggregation inbuildSortParams
Long
, but Mongoose converts custom ID fields of type number toDouble
dropDatabase
commandcreateIndex
command (not addressed in this PR)How?
When this is
false
we skip thebuildJoinAggregation()
pipeline and resolve the join fields through multiple queries. This can potentially be used with AWS DocumentDB and Azure Cosmos DB to support join fields, but I have not tested with either of these databases.useAlternativeDropDatabase?: boolean
When
true
, monkey-patch (replace) thedropDatabase
function so that it callscollection.deleteMany({})
on every collection instead of sending a singledropDatabase
command to the databaseuseBigIntForNumberIDs?: boolean
When
true
, usemongoose.Schema.Types.BigInt
for custom ID fields of typenumber
which converts to a firestoreLong
behind the scenesusePipelineInSortLookup?: boolean
When
false
, modify the sortAggregation pipeline inbuildSortParams()
so that we don't use thepipeline
property in the$lookup
aggregation. Results in slightly worse performance when sorting by relationship properties.Limitations
This PR does not add support for transactions or creating indexes in firestore.
Fixes
Fixed a bug (and added a test) where you weren't able to sort by multiple properties on a relationship field.
Future work
$lookup
aggregations but other databases might not. Could add auseSortAggregations
property which can be used to disable aggregations in sorting.