Semantics of resource property where there is more than one backing resource #135
Replies: 1 comment
My reading of the current spec is that resource should be the canonical URI for the underlying asset when there is one. In your example, the OKF document sounds like it describes the logical table T, not one specific tenant's physical instance of T. So I would avoid putting only one concrete tenant path such as:

resource: s3://tenant1-datalake/t

because that would imply the concept is specifically about tenant 1’s physical table. Instead, I would model the OKF concept as the logical table and either set resource to a logical/canonical identifier or omit it if you do not have one. For example:

---
type: Datalake Table
title: T
description: Logical multitenant datalake table T.
resource: urn:example:datalake-table:T
physical_resources:
  pattern: s3://{tenant_id}-datalake/t
  examples:
    - s3://tenant1-datalake/t
    - s3://tenant2-datalake/t
---

or, if your organization already has a stable internal catalog URI:

---
type: Datalake Table
title: T
resource: catalog://datalake/tables/T
physical_resources:
- s3://tenant1-datalake/t
- s3://tenant2-datalake/t
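---

The important part is that resource remains the canonical identifier of the thing the document is about. If the document is about the logical table, use a logical URI. If the document is about one tenant’s physical table, use that tenant-specific S3 URI and create separate concepts per tenant. So I would choose between two patterns: (1) a single concept for the logical table, with a logical or catalog URI as resource and the physical locations listed in an extension field; or (2) one concept per tenant, each with its tenant-specific S3 URI as resource.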
For most multitenant schemas, I would prefer the first pattern unless tenant-specific schema, ownership, retention, policy, or quality metadata differ enough to deserve separate concepts. Since OKF explicitly allows producer-defined frontmatter keys, an extension field for the resource pattern seems like a reasonable approach unless and until the spec defines a first-class way to represent multiple backing resources.
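To make the first pattern concrete, here is a minimal Python sketch of how a consumer might expand the pattern into a tenant-specific location. It assumes the frontmatter has already been parsed into a dict. The `physical_resources` and `pattern` keys are the producer-defined extension fields suggested above, not something the OKF spec defines, and `physical_uri` is a hypothetical helper name.

```python
from string import Formatter

# Frontmatter from the first example, already parsed into a dict.
# `physical_resources` and `pattern` are producer-defined extension keys
# (an assumption of this sketch), not fields defined by the OKF spec.
frontmatter = {
    "type": "Datalake Table",
    "title": "T",
    "resource": "urn:example:datalake-table:T",
    "physical_resources": {
        "pattern": "s3://{tenant_id}-datalake/t",
        "examples": [
            "s3://tenant1-datalake/t",
            "s3://tenant2-datalake/t",
        ],
    },
}


def physical_uri(doc: dict, tenant_id: str) -> str:
    """Expand the extension-field pattern for one tenant (hypothetical helper)."""
    pattern = doc["physical_resources"]["pattern"]
    # Collect the placeholder names used in the pattern and fail early
    # if it contains anything other than {tenant_id}.
    fields = {name for _, name, _, _ in Formatter().parse(pattern) if name}
    if fields != {"tenant_id"}:
        raise ValueError(f"unexpected placeholders in pattern: {sorted(fields)}")
    return pattern.format(tenant_id=tenant_id)


print(physical_uri(frontmatter, "tenant1"))  # prints s3://tenant1-datalake/t
```

The canonical resource stays the same for every tenant; only the derived physical location changes, which is the point of keeping the two apart.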
Hello there, I'm wondering whether, and how, to properly populate the resource link in situations where a concept has more than one backing resource.
As an example, I'm working on a multitenant system. In this case, there's clearly a rationale for having an OKF-compliant description of some datalake table T. However, the physical location of T in AWS depends on the tenant ID. So you might have s3://tenant1-datalake/t, s3://tenant2-datalake/t etc.