Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Array in claims taken as duplicate claims with "Duplicate claims are not allowed within a request." #1957

Closed
1 task done
abratv opened this issue Jan 9, 2024 · 3 comments · Fixed by #2003
Closed
1 task done
Assignees
Labels
auth bug Something isn't working cri Customer Reported issue engine issues that require change in engine code
Milestone

Comments

@abratv
Copy link

abratv commented Jan 9, 2024

What happened?

The claim type is "scope", stored as "array" value
Data api builder simply crash because we have array in our access token

image
image
image

Version

Microsoft.DataApiBuilder 0.9.7+e560142426d1c080b9fd7b7fabff51a276f6bf61

What database are you using?

Azure SQL

What hosting model are you using?

Local (including CLI)

Which API approach are you accessing DAB through?

REST

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@abratv abratv added bug Something isn't working triage issues to be triaged labels Jan 9, 2024
@seantleonard seantleonard self-assigned this Jan 9, 2024
@seantleonard seantleonard added auth engine issues that require change in engine code and removed triage issues to be triaged labels Jan 9, 2024
@seantleonard
Copy link
Contributor

Thank you for reporting, @abratv. I agree this is an issue and I'm looking into it. Few questions:

  1. What identity provider are you using, EntraID/Azure AD?
  2. Out of curiosity, do you use the scope claims (scp) in any DAB Database authorization policy rules?

@abratv
Copy link
Author

abratv commented Jan 10, 2024

  1. It's our internal open id provider
    Like what I said in another issue, I'm aware officially its Az AD but there is no "technical" limitation other than "business decision" here, Msft promote this as open source and can run it on premise so... (does not mean i want to fork it :P)
  2. No, normally scope claim is used to authorize the api endpoint, say "api:person:read", then this indicate the access token can only be used to reach GET /person, it does not even hit the database yet
    Having saying that, because DAB simply throw all claims to sp_sessions_context? (there was discussion to configure this instead of using all claims but it does not made it?), it should be up to us to decide which claims to be used in session context
    DAB should simply give us "example/best practice" and not limit the possibility of using DAB

@seantleonard seantleonard added this to the 0.11rc milestone Jan 10, 2024
seantleonard added a commit that referenced this issue Jan 10, 2024
…on of duplicate claims. Addresses #1957. Update and add tests to exercise changed code.
@abratv
Copy link
Author

abratv commented Jan 11, 2024

@seantleonard i saw your PR, seems that means we can't use any stored claims with JSON array, why not simply using "scope_0", "scope_1",... but again this should be configurable

seantleonard added a commit that referenced this issue Feb 13, 2024
# Improved Claims Handling for DB Policies and MS SQL Session Context

## Why make this change?

Initially reported in #1957, DAB does not gracefully handle instances
where the provided JWT token has claims with value type JSON Array.

Two negative impacts:
1. Developers who don't override the default MS SQL
`set-session-context` value of `true` to `false` will observe that
requests fail for tokens that fit the above criteria.
2. Developers who write database policies with `@claims.claimType`
references will see requests fail when the claimType referenced fits the
above criteria.

**Note:** The issue raised in #1957 is unique because the user may be
using a historical version of Duende's IdentityServer which emits token
scopes in a `scope` claim here the value is a JSON array. That format
differs from Entra ID which emits tokens scopes in the 'scp' claim whose
value is a string of values delimited by spaces. Reference: [Entra ID
Access Token Claims
Reference](https://learn.microsoft.com/entra/identity-platform/access-token-claims-reference#payload-claims).

## What changes are introduced in this PR?

### 1. Prevents `@claims.claimType` references in a DB policy from
failing a request when the claimType has a value type of JSON array.

**Example:** 
Given the database policy: `@claims.groups eq @item.groupid`, DAB would
previously (before this PR's changes) fail a request if the provided
access token had >1 group because the `groups` claim is a JSON string
array in the JWT token which dotnet resolves into multiple `Claim`
objects. (one claim per group where claimType is `groups` and value is
`groupGUID`. The request failure no longer occurs.

**How is this implemented?** 
DAB now recognizes when multiple `Claim` objects exist for a single
claim type. Given the policy `@claims.groups eq @item.groupid` DAB will
replace `@claims.groups` with the first instance of the claimType
`groups` that DAB finds. If DAB has resolved two `groups` claim objects
such as `["groupGUID1", "groupGUID2"]`, DAB would only resolve the first
it finds: `groupGUID1`.

When/if implemented, issue #2004 addresses this behavior by adding the
OData operator `in` so that the query predicate is generated to be
`([dbo].[groupid] in ('tokenGroupGUID1', 'tokenGroupGUID2'))`.

### 2. Prevents multiple instances of a claimType from failing a request
utilizing MSSQL's `set-session-context` feature

**How?**
Aggregates multiple `Claim` objects of the same `claimType` (claim name)
into a JSON array serialized into a string. The original value type in
the JWT JSON array (bool, int, string) is preserved. DAB uses the
serialized JSON as the value for a session context variable whose key is
`claimType`.

When dotnet processes a JWT token, claims whose values are JSON arrays
will be split into distinct `Claim` objects where `claimType` is the
claim name and `value` is one of the values in the array. `Claim`
objects are created for each object in the array. DAB uses the session
context feature to pass token claims and claim values to the database as
session context variables.

#### This is not a breaking change.
This is not a breaking change because access tokens that had claims
which didn't result in DAB failing the request will still have the claim
values passed as is -> scalar values. This is because usable tokens
didn't contain claims with JSON arrays as values.

A breaking change would be modifying this behavior to pass the scalar by
its original type as present in the JWT token by setting
`DbConnectionParam.DbType` explicitly when creating `DbConnectionParam`.

#### How can developers write security policies in SQL? 
When the value of a session context variable is a serialized JSON string
containing JSON array, the value can be processed using the following
tsql functionality:
- `JSON_QUERY`(SQL Server 2016(13.X) and later, SQL MI, Azure SQL
Database, Azure Synapse Analytics)
- `JSON_VALUE` (SQL Server 2016(13.X) and later, SQL MI, Azure SQL
Database, Azure Synapse Analytics)
- `JSON_ARRAY` (SQL Server 2022 (16.X), Azure SQL Database)

#### Example JWT token that is now handled without error 
```json
{
  "aud": "00000003-0000-0000-c000-000000000000",
  "iss": "https://sts.windows.net/25fd4421-0fff-4ed6-ad2f-c2ca00ed7207/",
  "iat": 1706642510,
  "acr": "1",
  "int_array": [1,2,3],
  "bool_array": [true, false, true],
  "groups":"src1",
  "_claim_sources":{
     "src1" : { "endpoint" : "https://graph.microsoft.com/v1.0/users/{userID}/getMemberObjects" }},
  "wids": ["d74b8d81-39eb-4201-bd9f-9f1c4011e3c9","18d14519-c4da-4ad4-936d-9a2de69d33cf","9e513fc0-e8af-43b1-a6c7-949edb1967a3"],
  "roles": ["anonymous", "authenticated", "myCustomRole"],
  "scp": "email openid profile User.Read",
  "scope": ["idServerScope1", "idServerScope2"]
}
```

#### Example tsql created by dab for session context

The following SQL shows how an Entra ID `scp` claim is still passed as a
space delimited string and
how a `roles` claim, a JSON array in the JWT token, is properly
serialized into a JSON array string and passed to MS SQL session context
without the request failing:

Assumption: `x-ms-api-role` header value is `authenticated`
```sql
EXEC sp_set_session_context 'roles', @session_param0, @read_only = 1;
EXEC sp_set_session_context 'scp', @session_param1, @read_only = 1;
EXEC sp_set_session_context 'scopes', @session_param2, @read_only = 1;

SELECT TOP 1 [dbo_books].[id] AS [id], [dbo_books].[title] AS [title], [dbo_books].[publisher_id] AS [publisher_id] FROM [dbo].[books] AS [dbo_books] WHERE [dbo_books].[id] = @param0 ORDER BY [dbo_books].[id] ASC FOR JSON PATH, INCLUDE_NULL_VALUES,WITHOUT_ARRAY_WRAPPER

@param0 int,
@param1 nvarchar(2),
@Param2 nvarchar(5),
@param3 nvarchar(12),
@session_param0 nvarchar(13),
@session_param1 nvarchar(37)
@session_param2 nvarchar(44)

@param0=1,
@param1=N'id',
@Param2=N'title',
@param3=N'publisher_id',
@session_param0=N'["Authenticated"]',
@session_param1=N'GraphQL.ReadWrite REST.EndpointAccess'
@session_param2=N'["GraphQL.ReadWrite", "REST.EndpointAccess"]'
```

#### Matrix of how jwt claims are process by DAB and dotnet for session
context

`Dictionary<string, string> GetProcessedUserClaims(HttpContext?
context)` is populated based on the following rules:

| Jwt ClaimName | JwtClaimValue |

|--------------------|-------------------------------------------------------------------------------------------|
| scp | openid profile customScope |
| scopes (idserver4) | ["openid", "profile","customScope"] |
| intArrayClaim | [1,2,3] |
| boolArrayClaim | [true, false, true] |
| nullValueClaim | null |
| _claim_sources | {"src1":
{"endpoint":"https://graph.microsoft.com/v1.0/users/{userID}/getMemberObjects"}}
|

| DotNetProcessed Name | DotNet Processed Value |

|----------------------------------------------|-------------------------------------------------------------------------------------------|
| scp | openid profile customScope |
| scopes </br> scopes </br> scopes | openid </br> profile </br>
customScope |
| intArrayClaim </br> intArrayClaim </br> intArrayClaim | 1 </br> 2
</br> 3 |
| boolArrayClaim </br> boolArrayClaim </br> boolArrayClaim | true </br>
false </br> true |
| nullValueClaim | //empty string |
| _claim_sources | {"src1":
{"endpoint":"https://graph.microsoft.com/v1.0/users/{userID}/getMemberObjects"}}
|


| ProcessedValueForSessionCtx | Remarks |

|-------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| openid profile customScope | scp claim from Entra ID access token |
| ["openid", "profile", "customScope" | Non-Entra ID identity providers
may emit a "scopes" claim as a JSON string array. Dotnet processes the
array into individual scope claims and doesn't retain the space
delimited string. |
| [1,2,3] | |
| [true, false, true] | |
| | |
| {"src1":
{"endpoint":"https://graph.microsoft.com/v1.0/users/{userID}/getMemberObjects"}}
| |


## Tests

-[x] Unit tests
-[x] Integration tests
@seantleonard seantleonard added the cri Customer Reported issue label May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auth bug Something isn't working cri Customer Reported issue engine issues that require change in engine code
Projects
None yet
2 participants