ENG-50887: Mask value using a masking config #227
Test Results: 330 tests (+5), 330 ✅ (+5), 11s ⏱️ (±0s). Results for commit d38e08c. ± Comparison against base commit e092fbf. This pull request removes 72 and adds 18 tests. Note that renamed tests count towards both. ♻️ This comment has been updated with latest results.
| tenantScopedMaskingCriteria = [ | ||
| { | ||
| "tenantId": "testTenant", | ||
| "timeRangeAndMaskValues": [ |
So, we can have multiple TimeRange conditions, right?
Yes, each timeRange condition will have its own masking values.
| # ] | ||
| # } | ||
| # ] | ||
| tenantScopedMaskingCriteria = [ |
Let's comment this out in the main application.conf file. By default, no masking should be applied.
Yes, done. You've reviewed an older commit.
| public class HandlerScopedMaskingConfig { | ||
| private static final String TENANT_SCOPED_MASKS_CONFIG_KEY = "tenantScopedMaskingCriteria"; | ||
| private final Map<String, List<MaskValuesForTimeRange>> tenantToMaskValuesMap; | ||
| private HashMap<String, Boolean> shouldMaskAttribute = new HashMap<>(); |
What is the key in both of these maps, shouldMaskAttribute and maskedValue?
AttributeId is the key to both.
Removing shouldMaskAttribute as it is not needed.
| private static boolean isTimeRangeOverlap( | ||
| MaskValuesForTimeRange timeRangeAndMasks, Instant queryStartTime, Instant queryEndTime) { | ||
| boolean timeRangeOverlap = true; |
Default should be false, right?
The conditionals that follow check for no overlap, i.e. they set timeRangeOverlap to false, so this initial value is correct.
| timeRangeOverlap = false; | ||
| } | ||
| Instant endTimeInstant = Instant.ofEpochMilli(timeRangeAndMasks.getStartTimeMillis().get()); |
Should this be getEndTimeMillis?
I guess this is what we are looking for as the function?
    if (timeRangeAndMasks.getStartTimeMillis().isPresent()) {
      Instant startTimeInstant = Instant.ofEpochMilli(timeRangeAndMasks.getStartTimeMillis().get());
      Instant endTimeInstant = Instant.ofEpochMilli(timeRangeAndMasks.getEndTimeMillis().get());
      if (!(startTimeInstant.isAfter(queryEndTime) || endTimeInstant.isBefore(queryStartTime))) {
        return true;
      }
    }
    return false;
Right, fixed the condition.
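A minimal sketch of the overlap check discussed in this thread, for illustration only. `TimeRangeSketch` is a hypothetical stand-in for `MaskValuesForTimeRange` that models just the two `Optional` accessors seen in the diff. Unlike the snippet suggested above, it checks that both bounds are present before calling `get()`; treating an incomplete range as "no overlap" is an assumption, not something the PR confirms.

```java
import java.time.Instant;
import java.util.Optional;

// Hypothetical stand-in for the PR's MaskValuesForTimeRange; only the two
// accessors that appear in the thread are modeled.
final class TimeRangeSketch {
  private final Optional<Long> startTimeMillis;
  private final Optional<Long> endTimeMillis;

  TimeRangeSketch(Long startTimeMillis, Long endTimeMillis) {
    this.startTimeMillis = Optional.ofNullable(startTimeMillis);
    this.endTimeMillis = Optional.ofNullable(endTimeMillis);
  }

  Optional<Long> getStartTimeMillis() {
    return startTimeMillis;
  }

  Optional<Long> getEndTimeMillis() {
    return endTimeMillis;
  }

  // Returns true only when both bounds are configured and the closed interval
  // [start, end] intersects [queryStartTime, queryEndTime].
  static boolean isTimeRangeOverlap(
      TimeRangeSketch range, Instant queryStartTime, Instant queryEndTime) {
    if (!range.getStartTimeMillis().isPresent() || !range.getEndTimeMillis().isPresent()) {
      return false; // assumption: an incomplete range never matches
    }
    Instant start = Instant.ofEpochMilli(range.getStartTimeMillis().get());
    Instant end = Instant.ofEpochMilli(range.getEndTimeMillis().get());
    // Two intervals overlap unless one starts after the other has ended.
    return !(start.isAfter(queryEndTime) || end.isBefore(queryStartTime));
  }
}
```

Starting from "no overlap is proven" and returning early keeps the default unambiguous, which sidesteps the question of whether the flag should start as true or false.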
| } | ||
| public void parseColumns(ExecutionContext executionContext) { | ||
| shouldMaskAttribute.clear(); |
It seems like state is maintained per request, but we should only need to test against the timeRange condition.
Pre-compute from config:
- tenantId -> list of timeRanges
- timeRange -> set of attributes
- timeRange -> map(attributeId, maskValue)
During response processing:
- Check whether any time range matches.
- Pick the first match (or should we apply a UNION?).
- If a UNION is used and the same attribute appears in two time ranges with different mask values, which one should we use? I guess any one should be fine.
This is how I've done it.
When an attribute appears in multiple time ranges, I pick any one of the values.
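The pre-compute-then-lookup idea from this thread can be sketched roughly as follows. All names here (`TenantMaskLookupSketch`, `TimeRangeMasks`, `masksFor`) are hypothetical and not taken from the PR. The sketch takes the UNION of every overlapping range and, as an assumption, lets the first configured range win when an attribute appears twice; the thread only says that any value is acceptable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TenantMaskLookupSketch {
  // One configured window plus its attributeId -> mask value entries.
  static final class TimeRangeMasks {
    final long startMillis;
    final long endMillis;
    final Map<String, String> attributeToMask;

    TimeRangeMasks(long startMillis, long endMillis, Map<String, String> attributeToMask) {
      this.startMillis = startMillis;
      this.endMillis = endMillis;
      this.attributeToMask = attributeToMask;
    }

    boolean overlaps(long queryStartMillis, long queryEndMillis) {
      return !(startMillis > queryEndMillis || endMillis < queryStartMillis);
    }
  }

  // Built once from config: tenantId -> configured time ranges, in config order.
  private final Map<String, List<TimeRangeMasks>> tenantToRanges = new HashMap<>();

  void add(String tenantId, TimeRangeMasks range) {
    tenantToRanges.computeIfAbsent(tenantId, k -> new ArrayList<>()).add(range);
  }

  // Per request: union of masks from all overlapping ranges. On a conflict the
  // first configured range wins (an assumption made for this sketch).
  Map<String, String> masksFor(String tenantId, long queryStartMillis, long queryEndMillis) {
    Map<String, String> result = new LinkedHashMap<>();
    List<TimeRangeMasks> ranges =
        tenantToRanges.getOrDefault(tenantId, Collections.<TimeRangeMasks>emptyList());
    for (TimeRangeMasks range : ranges) {
      if (range.overlaps(queryStartMillis, queryEndMillis)) {
        range.attributeToMask.forEach(result::putIfAbsent);
      }
    }
    return result;
  }
}
```

Because the lookup returns a fresh map per call, no per-request state needs to be cleared on the config object.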
| return Observable.fromIterable(rowBuilderList) | ||
| .map(Builder::build) | ||
| // .map(row -> handlerScopedMaskingConfig.mask(row)) |
| # "tenantId": "testTenant", | ||
| # "timeRangeAndMaskValues": [ | ||
| # { | ||
| # "startTimeMillis": 0, |
I think we should make the timestamps mandatory.
- If startTime or endTime is missing, log a warning stating that the filter will be ignored.
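One way the "mandatory timestamps" suggestion could look, as a rough sketch. `RawRange` and `keepCompleteRanges` are hypothetical names; the real PR parses a config object, which is replaced here by plain nullable fields so the example stays self-contained. It uses `java.util.logging` only as a placeholder for whatever logger the project actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

final class MandatoryTimeRangeSketch {
  private static final Logger LOG =
      Logger.getLogger(MandatoryTimeRangeSketch.class.getName());

  // Hypothetical parsed entry; a null field means the key was absent in config.
  static final class RawRange {
    final Long startTimeMillis;
    final Long endTimeMillis;

    RawRange(Long startTimeMillis, Long endTimeMillis) {
      this.startTimeMillis = startTimeMillis;
      this.endTimeMillis = endTimeMillis;
    }
  }

  // Keeps only ranges with both bounds; logs a warning for each ignored entry.
  static List<RawRange> keepCompleteRanges(String tenantId, List<RawRange> configured) {
    List<RawRange> complete = new ArrayList<>();
    for (RawRange range : configured) {
      if (range.startTimeMillis == null || range.endTimeMillis == null) {
        LOG.warning(
            "Ignoring masking filter for tenant "
                + tenantId
                + ": both startTimeMillis and endTimeMillis are required");
        continue;
      }
      complete.add(range);
    }
    return complete;
  }
}
```

Dropping incomplete entries at parse time means the per-request overlap check never has to handle a missing bound.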
| if (indexToLogicalName.containsKey(colIdx)) { | ||
| return indexToLogicalName.get(colIdx); | ||
| } | ||
|
|
nit: remove additional space.
| return result; | ||
| } | ||
| String getLogicalNameFromColIdx(Integer colIdx) { |
nit: Let's see if we can use Optional as the return type.
| private static final String MASKED_VALUE = "*"; | ||
| // This is how empty list is represented in Pinot | ||
| private static final String PINOT_EMPTY_LIST = "[\"\"]"; |
PINOT_EMPTY_LIST -> ARRAY_TYPE_MASKED_VALUE
MASKED_VALUE -> DEFAULT_MASKED_VALUE
| public List<String> getMaskedAttributes(ExecutionContext executionContext) { | ||
| String tenantId = executionContext.getTenantId(); | ||
| List<String> maskedAttributes = new ArrayList<>(); | ||
| // maskedValue.clear(); |
| this.tenantId = config.getString(TENANT_ID_CONFIG_KEY); | ||
| this.maskValues = | ||
| config.getConfigList(TIME_RANGE_AND_MASK_VALUES_CONFIG_KEY).stream() | ||
| .map(MaskValuesForTimeRange::new) |
filter out the empty maskings
| // to retrieve data | ||
| String colVal = resultAnalyzer.getDataFromRow(rowId, logicalName); | ||
| String colVal = | ||
| !maskedAttributes.contains(logicalName) |
nit: get rid of the ! and invert the branches to simplify
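A small illustration of the inversion being asked for, under the assumption that the ternary chooses between the raw column value and a mask value. `MaskSelectionSketch.valueFor` and its parameters are hypothetical; the diff only shows the first line of the original expression.

```java
import java.util.Set;

final class MaskSelectionSketch {
  // Positive condition first: masked attributes get the mask, others the raw value.
  static String valueFor(
      Set<String> maskedAttributes, String logicalName, String rawValue, String maskedValue) {
    return maskedAttributes.contains(logicalName) ? maskedValue : rawValue;
  }
}
```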
…ry-service into mask-unbofuscated-cookies
| Observable<Row> convert(ResultSetGroup resultSetGroup, LinkedHashSet<String> selectedAttributes) { | ||
| Observable<Row> convert(ResultSetGroup resultSetGroup, ExecutionContext executionContext) { | ||
| LinkedHashSet<String> selectedAttributes = executionContext.getSelectedColumns(); |
Can you also move this inside the resultSetGroup.getResultSetCount() > 0 check?
| for (int colIdx = 0; colIdx < resultSet.getColumnCount(); colIdx++) { | ||
| for (int colIdx = 0, logicalColIdx = 0; | ||
| colIdx < resultSet.getColumnCount(); | ||
| colIdx++, logicalColIdx++) { |
Why do we need logicalColIdx?
There can be multiple map fields. When creating the idxToLogical name map, a map field increments the counter by only one, but here each map field increments colIdx by 2. That's why I created a new variable, which is incremented only once even when we go through a map's two columns (key and value).
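To make the two counters concrete, here is a rough sketch of the same loop shape. It assumes each map attribute occupies two adjacent physical columns (keys, then values) while every other attribute occupies one; `LogicalColumnIndexSketch` and its parameters are hypothetical and only mirror the explanation above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class LogicalColumnIndexSketch {
  // Returns, for every physical column, the logical attribute that owns it.
  static List<String> logicalNamePerPhysicalColumn(
      List<String> logicalNames, Set<String> mapAttributes, int physicalColumnCount) {
    List<String> owners = new ArrayList<>();
    for (int colIdx = 0, logicalColIdx = 0;
        colIdx < physicalColumnCount;
        colIdx++, logicalColIdx++) {
      String name = logicalNames.get(logicalColIdx);
      owners.add(name); // key column of a map, or the only column otherwise
      if (mapAttributes.contains(name)) {
        owners.add(name); // value column of the same map attribute
        colIdx++; // skip the extra physical column; logicalColIdx advances once
      }
    }
    return owners;
  }
}
```

With logical names `a`, `m`, `b`, where `m` is a map, and four physical columns, the physical columns are owned by `a`, `m`, `m`, `b`, so the logical index lags the physical one after the first map field.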
| } | ||
| Optional<String> getLogicalNameFromColIdx(Integer colIdx) { | ||
| return Optional.ofNullable(indexToLogicalName.get(colIdx)); |
In what scenario can it be null?
You're right, it shouldn't be.