diff --git a/concepts/user-folder-scoping.mdx b/concepts/user-folder-scoping.mdx
index 4b68f95..71cc542 100644
--- a/concepts/user-folder-scoping.mdx
+++ b/concepts/user-folder-scoping.mdx
@@ -103,6 +103,49 @@ async with AsyncMorphik() as db:
 - Scopes can be used for both reading and writing operations
 - User and folder information is stored as metadata with the documents, so you can still filter across scopes with explicit filter parameters if needed
 
+## Filtering by Folder Name in Metadata
+
+When you ingest documents in a folder with the `folder_name` parameter, that value is automatically available in the document's metadata for filtering. This enables powerful cross-folder queries using Morphik's metadata filter operators:
+
+```python
+from morphik import Morphik
+
+db = Morphik()
+
+# Filter documents from multiple folders
+filters = {
+    "folder_name": {"$in": ["legal", "hr", "finance"]}
+}
+docs = db.list_documents(filters=filters)
+
+# Combine folder filtering with other metadata
+filters = {
+    "$and": [
+        {"folder_name": {"$regex": {"pattern": "^project_", "flags": "i"}}},
+        {"status": "active"},
+        {"priority": {"$gte": 70}}
+    ]
+}
+response = db.query("What are the high-priority project updates?", filters=filters)
+
+# Exclude specific folders
+filters = {
+    "$and": [
+        {"folder_name": {"$nin": ["archived", "drafts"]}},
+        {"created_date": {"$gte": "2024-01-01"}}
+    ]
+}
+chunks = db.retrieve_chunks("quarterly report", filters=filters, k=10)
+```
+
+This approach is useful when you need to:
+- Query across multiple folders simultaneously
+- Use pattern matching on folder names
+- Combine folder filters with complex metadata conditions
+- Build dynamic queries where folder selection isn't known at scope creation time
+
+For more filtering examples, see the [Complex Metadata Filtering](/cookbooks/complex-metadata-filtering) cookbook and [Metadata Filtering](/concepts/metadata-filtering) reference.
+
 ## Use Cases
 
 ### Multi-Project Research Team
diff --git a/cookbooks/complex-metadata-filtering.mdx b/cookbooks/complex-metadata-filtering.mdx
index 8932ef0..546f06e 100644
--- a/cookbooks/complex-metadata-filtering.mdx
+++ b/cookbooks/complex-metadata-filtering.mdx
@@ -99,6 +99,40 @@ filters = {
 }
 ```
 
+### Filtering by Folder Name
+
+Documents ingested with a `folder_name` parameter can be filtered using that value in metadata. This enables cross-folder queries and pattern matching:
+
+```python
+# Filter specific folder
+filters = {"folder_name": "reports"}
+
+# Query multiple folders
+filters = {
+    "folder_name": {"$in": ["reports", "invoices", "contracts"]}
+}
+
+# Exclude archived folders
+filters = {
+    "folder_name": {"$nin": ["archived", "drafts", "test"]}
+}
+
+# Pattern matching on folder names
+filters = {
+    "folder_name": {"$regex": {"pattern": "^project_", "flags": "i"}}
+}
+
+# Combine folder with other metadata
+filters = {
+    "$and": [
+        {"folder_name": {"$in": ["legal", "compliance"]}},
+        {"priority": {"$gte": 70}},
+        {"status": "active"},
+        {"year": 2024}
+    ]
+}
+```
+
 ## 3. List Documents with Filters
 
 Find documents matching your criteria:
diff --git a/python-sdk/query.mdx b/python-sdk/query.mdx
index 51c9fe1..6730718 100644
--- a/python-sdk/query.mdx
+++ b/python-sdk/query.mdx
@@ -85,6 +85,23 @@ response = db.query(
 )
 ```
 
+You can also filter by folder name and use expressive operators like `$in`, `$regex`, and `$nin`:
+
+```python
+# Query across multiple folders
+filters = {
+    "$and": [
+        {"folder_name": {"$in": ["reports", "invoices"]}},
+        {"year": 2024},
+        {"priority": {"$gte": 50}}
+    ]
+}
+
+response = db.query("What are the key financial highlights?", filters=filters)
+```
+
+For more advanced filtering patterns, see the [Complex Metadata Filtering cookbook](/cookbooks/complex-metadata-filtering).
+
 ## Returns
 
 - `CompletionResponse`: Response containing the completion, source information, and potentially structured output.