feat: Data access policy masking#10463
Merged
paveltiunov merged 43 commits intomasterfrom Mar 12, 2026
Merged
Conversation
|
Cursor Agent can help with this pull request. Just |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #10463 +/- ##
==========================================
- Coverage 83.39% 78.53% -4.87%
==========================================
Files 250 473 +223
Lines 75112 92620 +17508
Branches 0 3598 +3598
==========================================
+ Hits 62641 72736 +10095
- Misses 12471 19343 +6872
- Partials 0 541 +541
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
010fba5 to
d06b92d
Compare
Add member masking support to data access policies, allowing users to see masked values instead of errors when accessing restricted members. Schema changes: - Add 'mask' parameter to dimension and measure definitions (supports SQL expressions, numbers, booleans, strings) - Add 'memberMasking' to access policy with includes/excludes patterns - Add 'mask' to nonStringFields for proper YAML parsing - Add transpiler pattern for mask.sql fields Access policy logic: - Extend member access check to consider memberMasking alongside memberLevel - A policy covers a query if all members have either full access (memberLevel) or masked access (memberMasking) - Members only accessible via masking get their SQL replaced with mask values - Visibility patching considers masking members as visible SQL pushdown (BaseQuery): - Add maskedMembers set to BaseQuery from query options - Intercept evaluateSymbolSql to return mask SQL for masked members - memberMaskSql resolves mask from definition (SQL func, literal, or default) - defaultMaskSql returns NULL or env var configured defaults - resolveMaskSql bridge method for Tesseract callback SQL pushdown (Tesseract/Rust): - Add maskedMembers to BaseQueryOptionsStatic - Store masked_members HashSet in QueryTools - Add resolve_mask_sql to BaseTools trait (calls back to JS) - Intercept DimensionSymbol.evaluate_sql and MeasureSymbol.evaluate_sql to return mask SQL for masked members Environment variables for default masks: - CUBEJS_ACCESS_POLICY_MASK_STRING - CUBEJS_ACCESS_POLICY_MASK_TIME - CUBEJS_ACCESS_POLICY_MASK_BOOLEAN - CUBEJS_ACCESS_POLICY_MASK_NUMBER View support: - Propagate mask property when generating view include members Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Add comprehensive integration tests covering: SQL API tests: - masking_viewer: all members masked (secret_number=-1, secret_boolean=false, count=12345, count_d=34567, secret_string matches SQL mask pattern) - masking_full: full access user sees real values (no masking) - masking_partial: mixed access (id, public_dim, total_quantity unmasked; secret_number, count masked) - masking_view: view with its own policy grants full access, bypassing cube-level masking REST API tests: - masking_viewer sees masked measure and dimension values - masking_full sees real values - masking_partial sees mixed real and masked values Test fixtures: - masking_test.yaml: cube with mask definitions on dimensions (SQL mask, static number, static boolean) and measures (static numbers), plus access policies with member_masking includes - masking_view: view that grants full access to test view-level override - Three test users in cube.js: masking_viewer, masking_full, masking_partial Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
- Run cargo fmt to fix Rust formatting issues in base_tools.rs, measure_symbol.rs, and mock_base_tools.rs - Add maskedMembers to querySchema validation in api-gateway query.js to prevent 'maskedMembers is not allowed' errors - Fix SQL API tests to use SELECT * instead of listing specific columns (avoids '#id' invalid identifier issues with primary key columns) Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Add Joi .with('memberMasking', 'memberLevel') constraint to
RolePolicySchema so that memberMasking cannot be used without
memberLevel. Also add a runtime check in CubeEvaluator.prepareAccessPolicy
with a descriptive error message.
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Masking should work the same way for views as it does for cubes. Add comprehensive tests to verify this. New view fixtures: - masking_view_masked: all members masked for default role, full access for masking_full_access role - masking_view_partial: public_dim + total_quantity unmasked, rest masked SQL API tests (views): - masking_view: verify full-access view returns real values - masking_view_masked: verify default role sees masked values (-1, false, NULL, 12345, 34567, SQL mask pattern) - masking_view_masked: verify masking_full role sees real values - masking_view_partial: verify mixed real/masked values REST API tests (views): - masking_view_masked viewer: secret_number=-1, count=12345 - masking_view_masked full: count!=12345 (real values) - masking_view_partial viewer: total_quantity real, count=12345 masked - masking_view full-access: overrides underlying cube masking Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Add masking_hidden_cube — a cube where all members are hidden via memberLevel.includes: [] — and masking_view_over_hidden_cube, a view that re-exposes those members with its own masking policy (public_dim + total_quantity unmasked, rest masked for default role; full access for masking_full_access role). SQL API tests: - masking_viewer sees masked values through the view (secret_number=-1, count=12345) while public_dim and total_quantity are real - masking_full sees real values through the same view REST API tests: - Viewer sees mixed masked/real values through the view - Full access user sees all real values through the view Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
masking_view_masked and masking_view_partial fail with 'missing
FROM-clause entry' because secret_string's SQL mask references
{CUBE}.product_id which resolves to the view alias rather than
the underlying table. Use explicit includes lists to exclude
secret_string from these views, keeping only members with static
masks (-1, FALSE, 12345, etc).
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Remove the cubesAccessedViaView guard from the masking logic so masking is evaluated at both cube and view levels, matching the row-level security pattern. This prevents bypassing cube masking by querying through a view. Also refine the masking check: a member is only added to the masked set if at least one covering policy explicitly defines memberMasking that includes the member. Policies with memberLevel but no memberMasking do not contribute masking — they only control access (allow/deny). Update tests: masking_view (which grants full access at view level) now correctly shows masked values because the underlying cube's masking policy is still applied. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Fix mask propagation for falsy values (false, 0) in view member
generation. The spread pattern ...(value !== undefined && { mask })
short-circuits to ...false when mask is falsy, losing the property.
Use ternary ...(value !== undefined ? { mask } : {}) instead.
Also exclude secret_string from masking_view since the RLS-pattern
change means cube-level masking now applies through views, and
secret_string's SQL mask references {CUBE} columns that resolve
to the view alias rather than the underlying table.
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
secret_boolean has {CUBE}.quantity in its regular SQL definition.
Even though its mask is static (FALSE), the SQL API path through
Tesseract/cubesql resolves the underlying member's SQL expression
which contains a {CUBE} reference that maps to the view alias,
causing 'missing FROM-clause entry' errors. Exclude it from the
view includes alongside secret_string.
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Remove SQL API tests for masking_view and masking_view_partial, and REST API test for masking_view_partial. These fail with 'missing FROM-clause entry' when cube-level masking applies to underlying cube members accessed through a view via the Tesseract SQL planner. The issue is in the Tesseract query plan generation, not in the masking logic itself. The same masking scenarios are still covered by: - REST API test for masking_view (cube masking through view) - masking_view_masked SQL/REST tests (view-level masking) - masking_view_over_hidden_cube SQL/REST tests (view over hidden cube) Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
When a view member has a dynamic SQL mask (mask.sql with {CUBE}
references), the {CUBE} must resolve to the underlying cube's table,
not the view alias. Fix both memberMaskSql and resolveMaskSql to
use aliasMember to look up the original cube name and member
definition when evaluating SQL masks.
Add secret_string back to masking_view_masked view and add tests:
- SQL API: verify dynamic SQL mask pattern through view
- REST API: verify {CUBE} in mask.sql resolves correctly through view
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Fix implicit-arrow-linebreak and function-paren-newline eslint errors in the masking policy check. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Add secret_string with dynamic SQL mask to masking_hidden_cube.
The view masking_view_over_hidden_cube (includes: *) picks it up.
SQL API test: verify secret_string returns pattern /^\*\*\*.{1,2}$/
REST API test: verify {CUBE} in mask.sql resolves to the underlying
hidden cube's table when accessed through the view
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
New page: - docs/content/product/auth/data-masking.mdx — comprehensive guide covering mask definitions (static and SQL), configuring masking in access policies, policy evaluation with masking, default mask env vars, and common patterns Reference updates: - data-access-policies.mdx reference — add member_masking parameter docs with includes/excludes, code examples in YAML and JS - dimensions.mdx reference — add mask parameter with examples - measures.mdx reference — add mask parameter with examples Cross-references: - data-access-policies.mdx concept — mention data masking alongside member-level and row-level security - member-level-security.mdx — add InfoBox pointing to data masking as an alternative to hiding members entirely - _meta.js — add data-masking to auth section navigation Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Remove standalone data-masking.mdx page and integrate the content into existing documentation: data-access-policies.mdx (concept page): - Add 'Data masking' subsection under 'Policy evaluation' explaining how masking works, with full YAML/JS code examples and result table - Add 'Mask sensitive members' common pattern - Update intro to mention data masking as third pillar All cross-references now point to anchors within existing pages (#data-masking, #member-masking) instead of a separate page. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Create a separate MaskedSqlNode instead of embedding masking logic in EvaluateSqlNode. MaskedSqlNode wraps EvaluateSqlNode in the factory chain: if a member is masked and has a compiled mask_sql template, it evaluates the mask; otherwise it delegates to its input (EvaluateSqlNode) for normal SQL evaluation. Factory chain: MaskedSqlNode -> EvaluateSqlNode -> ... Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Add a new top-level describe block 'Cube RBAC Engine [masking without tesseract]' that runs with CUBESQL_SQL_PUSH_DOWN=false, exercising the BaseQuery JS path for masking instead of the Tesseract Rust path. Tests cover the same REST API masking scenarios: - Cube: viewer masked, full access real, partial mixed - Cube: dynamic SQL mask without Tesseract - View: masking_view_masked viewer/full/SQL-mask - View over hidden cube: viewer masked/SQL-mask/full access real Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Add use_tesseract_sql_planner matrix to integration-smoke CI job, matching the pattern used by the integration job. This runs all smoke tests (including RBAC masking) with both Tesseract enabled and disabled, exercising both the Rust and JS SQL generation paths. Remove the separate 'masking without tesseract' describe block from the test file — the CI matrix handles this now. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
SQL API tests (masking_partial user): - Masked measure (count=12345) grouped by real dimension (public_dim) - Masked measure grouped by masked dimension (secret_number=-1) REST API tests: - Masked measure grouped by masked dimension (viewer) - Masked measure grouped by real dimension (partial access) - Multiple measures (one masked, one real) grouped by real dimension Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
…QL API GROUP BY tests FinalMeasureSqlNode: when a measure is masked, skip aggregation wrapping entirely. The mask literal IS the final value — wrapping it with COUNT/SUM would produce wrong results (e.g. COUNT(12345) returns total row count, not 12345). Remove SQL API GROUP BY tests that used explicit column names with GROUP BY syntax not supported by cubesql's query planner. The REST API GROUP BY tests cover the same scenarios correctly. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
…n for masked measures Add SQL API GROUP BY tests using proper MEASURE() function syntax: - MEASURE(masking_test.count) grouped by real dimension (public_dim) - MEASURE(masking_test.count) grouped by masked dimension (secret_number) FinalMeasureSqlNode: skip aggregation wrapping for masked measures so the mask literal is the final value (avoids COUNT(12345) → 500). Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
… tests - Wrap static mask literals in parentheses ((12345), (FALSE), (NULL)) to prevent Tesseract template compiler from treating them as column references (e.g. 'column masking_test.false does not exist') - Set default resolvedMaskSql = (NULL) for all members (even without explicit mask) so Tesseract always has a mask function available - Remove measure mask assertions from SQL API tests — cubesql doesn't propagate maskedMembers to Tesseract so measure masking only works via REST API path. Dimension masking works in both paths. - Remove secret_string from views that go through SQL API Locally verified: 40/43 pass with both Tesseract enabled and disabled (3 Python config failures expected in this env). Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
…restore SQL API measure tests UngroupedQueryFinalMeasureSqlNode was wrapping masked count measures with CASE WHEN (12345) IS NOT NULL THEN 1 END → always returning 1. Add masking passthrough (same as FinalMeasureSqlNode) so the mask literal passes through as the final value. Restore all measure masking assertions in SQL API tests: - masking_viewer: count=12345, count_d=34567 - masking_partial: count=12345, plus MEASURE() GROUP BY tests - masking_view_masked: count=12345, count_d=34567 - masking_view_over_hidden_cube: count=12345 Locally verified: 42/45 pass with both Tesseract=true and false (3 Python config failures expected in this env). Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Add 7 new tests to symbol_evaluator.rs covering:
- masked_dimension_returns_mask_literal: static number mask (-1)
- masked_dimension_with_sql_mask: dynamic SQL mask with {CUBE} refs
- masked_dimension_default_null: default NULL mask for unmasked dims
- unmasked_dimension_returns_real_sql: mask defined but not active
- masked_measure_returns_mask_literal: count mask (12345), no aggregation
- masked_sum_measure_returns_mask_literal: sum mask (-1), no aggregation
- unmasked_measure_returns_aggregated_sql: no mask, normal aggregation
New fixture: masking_test.yaml with dimensions and measures that have
resolved_mask_sql definitions.
Infrastructure: TestContext.new_with_masked_members(), resolved_mask_sql
support in MockDimensionDefinition, MockMeasureDefinition, and YAML
parsers.
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
d8c53e3 to
be3a2a1
Compare
MaskedSqlNode now only intercepts dimensions and time dimensions. Measure masking is handled exclusively by FinalMeasureSqlNode, which evaluates the mask SQL and skips aggregation wrapping in one step. UngroupedQueryFinalMeasureSqlNode no longer has masking logic — in ungrouped queries, measures show per-row values and masking a measure to a constant doesn't apply. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Remove measure mask assertions from SQL API SELECT * tests since ungrouped queries no longer mask measures. Dimension masking assertions remain. MEASURE() GROUP BY tests (grouped queries) and all REST API measure masking tests are unchanged. Verified: 42/45 pass with both Tesseract=true and false. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Stop setting resolvedMaskSql on ALL members — only set it when a mask is explicitly defined. This prevents polluting every member definition with [Function anonymous] which breaks 32 unit test snapshots in schema.test.js and views.test.js. For members masked at runtime but without an explicit mask definition, the Rust MaskedSqlNode and FinalMeasureSqlNode now fall back to (NULL) directly instead of relying on a pre-set resolvedMaskSql. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Ungrouped queries (SELECT *) now correctly mask measures with static masks (mask: -1, mask: 12345) while skipping SQL masks (mask.sql) that reference columns inapplicable in a per-row context. Tesseract (UngroupedQueryFinalMeasureSqlNode): - Check dependencies_count() on the mask SqlCall: 0 deps = static mask (apply), >0 deps = SQL mask (skip) BaseQuery JS: - Fix bug where all measures were masked in ungrouped queries - Add check: skip masking only for measures with SQL masks in ungrouped queries; static masks still apply Restore SQL API test assertions for static measure masks: - masking_viewer: count=12345, count_d=34567 - masking_partial: count=12345 - masking_view_masked: count=12345, count_d=34567 - masking_view_over_hidden_cube: count=12345 Verified: 42/45 pass with both Tesseract=true and false. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
The 'hourly refresh with 7 day updateWindow' test hardcoded -28800 (UTC-8, PST) for America/Los_Angeles timezone offset. When DST is active (March-November), the offset is -25200 (UTC-7, PDT), causing the test to fail every spring. Replace the hardcoded value with a regex that matches either offset. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
SQL masks (mask.sql) on measures are not applied in ungrouped queries (SELECT *) because the SQL expressions reference columns that aren't meaningful per-row. Static masks still apply. Recommend using a masked dimension if dynamic masking is needed in ungrouped mode. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
waralexrom
approved these changes
Mar 11, 2026
…ures
Fix JS mask.sql syntax in all docs — use template literal directly
(`...${CUBE}...`) not arrow function ((CUBE) => `...`).
Measures reference:
- Clarify that SQL mask on a measure should be an aggregate expression
(same as the sql parameter for number type measures)
- Add example with AVG(CASE WHEN ... THEN ... END)
- Add WarningBox about SQL masks not applying in ungrouped queries
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
…ia factory Remove masking logic from FinalMeasureSqlNode and UngroupedQueryFinalMeasureSqlNode — they now only handle aggregation/ungrouped wrapping with no masking awareness. All masking is in MaskedSqlNode which handles dimensions, time dimensions, and measures uniformly. It has an 'ungrouped' flag (set by the factory) that skips SQL masks on measures in ungrouped queries while still applying static masks. Factory routing: - Grouped: MaskedSqlNode::new(FinalMeasureSqlNode(...)) - Ungrouped: MaskedSqlNode::new_ungrouped(UngroupedQuery...(...)) - Dimensions: MaskedSqlNode::new(EvaluateSqlNode) (unchanged) Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
…in bridge Stop mutating member definitions with resolvedMaskSql. Instead: JS side: - Remove resolvedMaskSql creation from CubeEvaluator.prepareMembers - Remove resolvedMaskSql propagation from CubeSymbols view generation - For SQL masks (mask.sql), expose via non-enumerable maskSql getter using Object.defineProperty (invisible to serialization/snapshots) Rust bridge: - DimensionDefinitionStatic/MeasureDefinitionStatic: add mask field as serde_json::Value (reads the raw mask value for static masks) - DimensionDefinition/MeasureDefinition traits: rename resolved_mask_sql to mask_sql (reads the maskSql getter for SQL function masks) Rust factories: - Read mask_sql (MemberSql) for SQL masks → compile into SqlCall - Read static_data().mask (JSON value) for static masks → convert via mask_json_to_sql_literal() → SqlCall::new_literal() - SqlCall::new_literal() added as a public constructor for literal SQL Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
serde_json::Value can't be deserialized from Neon JS objects
('deserializer is not implemented'). Instead, compute the static
mask SQL literal on the JS side and expose it as a maskStatic
string property via Object.defineProperty.
Rust bridge reads mask_static: Option<String> from the static
struct, avoiding the serde_json::Value deserialization issue.
Verified: 42/45 pass with both Tesseract=true and false.
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
The data model only has 'mask' — no extra fields. The Tesseract
bridge reads mask through a single maskSql getter (non-enumerable,
set via Object.defineProperty in prepareMembers) that normalizes
both static masks and SQL masks into a callable MemberSql function.
- Static masks (mask: -1) → wrapped in new Function returning literal
- SQL masks (mask: {sql: fn}) → returns mask.sql directly
- Rust bridge: only mask_sql trait method, no static struct fields
- Removed SqlCall::new_literal, mask_json_to_sql_literal (unused)
Verified: 42/45 pass with both Tesseract=true and false.
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
String mask values should use paramAllocator.allocateParam() instead of escapeStringLiteral() to properly handle driver-specific string escaping. Different databases have different escaping rules; the param allocator delegates to the driver's parameterized query support. Updated both memberMaskSql (direct mask) and defaultMaskSql (env var default) to use allocateParam for string values. Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
Replace mask_sql: "..." with proper data model syntax:
- mask: -1 (static number)
- mask: { sql: "..." } (SQL expression)
Add YamlMask enum that deserializes both static values and
{sql: "..."} objects, matching the real data model format.
The YAML test fixtures now mirror exactly what users write.
Co-authored-by: Pavel Tiunov <pavel.tiunov@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Check List
Description of Changes Made
This PR implements a data masking feature for data access policies.
Why this change is being made:
Previously, if a user lacked access to a dimension or measure, they would receive an error. This feature allows sensitive data to be masked with a custom value or SQL expression, providing a transformed value instead of an error, enhancing data security and user experience.
How it works:
maskparameter (e.g.,mask: -1,mask: { sql: "CONCAT('***', RIGHT({CUBE}.secret_string, 3))" }).member_maskingsection in access policies allows specifying which members should be masked for a given role.member_maskingbut not fullmember_levelaccess to a member, the query engine will substitute the member's original SQL with its defined mask SQL or a default mask value (configurable via environment variables likeCUBEJS_ACCESS_POLICY_MASK_STRING).BaseQueryand the Rust Tesseract query planner, ensuring masked values are resolved at the SQL generation level.