[#10881] feat(authentication): Add local authentication design document #10883
jerryshao merged 42 commits into apache:main from
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds a new design document proposing a built-in “Local IDP” mode for Apache Gravitino, primarily aimed at POC/offline/emergency-fallback deployments where external OAuth IdPs are undesirable or unavailable.
Changes:
- Introduces a Local IDP proposal using HTTP Basic authentication and relational-store-backed identity data.
- Specifies password hashing/storage using Argon2id with PHC-style encoded hash strings.
- Describes proposed global (non-metalake-scoped) local user/group tables and management APIs, plus bootstrap/admin behavior and security notes.
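For orientation, a PHC-style encoded Argon2id string packs the algorithm, version, cost parameters, salt, and hash into one `$`-delimited value. The sketch below only splits such a string into its fields; it does not hash or verify anything. The function name `parse_phc` and the sample value are illustrative assumptions and are not taken from the design document.

```python
def parse_phc(encoded: str) -> dict:
    """Split a PHC-style Argon2 string into its fields (no verification)."""
    parts = encoded.split("$")
    # A leading "$" produces an empty first element, so six parts are expected.
    if len(parts) != 6 or parts[0] != "":
        raise ValueError("unexpected PHC layout")
    _, algorithm, version, params, salt, digest = parts
    return {
        "algorithm": algorithm,
        "version": int(version.split("=", 1)[1]),
        "params": dict(item.split("=", 1) for item in params.split(",")),
        "salt": salt,    # base64 without padding
        "hash": digest,  # base64 without padding
    }


# Illustrative value only; the hash part is a placeholder, not a real digest.
SAMPLE = "$argon2id$v=19$m=65536,t=3,p=4$c29tZXNhbHQ$cGxhY2Vob2xkZXJoYXNo"
fields = parse_phc(SAMPLE)
print(fields["algorithm"], fields["params"])
```

Storing the parameters next to the hash lets a verifier re-derive the hash with the same cost settings that were used at creation time.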
| Field | Type | Description |
|---|---|---|
| `name` | `String` | User name |
| `audit` | `AuditDTO` | Audit information |

Why do we need audit information here?
It is consistent with the existing similar interfaces, so audit information was added.
Maybe we can keep it simple at first and add it when users require it.
Would it be possible to remove the audit information from the local-auth-related interfaces?
We can remove it for now and add it back if users require it.
Got it, I have fixed it.
| Item | Value |
|---|---|
| Method | `PUT` |
| Path | `/api/users/{user}/password` |

Do you need `password` in the path here? Is `/api/users/{user}` enough?
OK, I will remove "password" from the end of the path.
- Only service admin can change any account password.
- The password cannot contain a colon (`:`). (RFC 7617)
- The password must be at least 12 characters long and at most 64 characters long. (OWASP)

A general cryptography association can remove this. I'll submit the code.
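The password rules quoted above can be sketched as a small validator. This is a minimal illustration of the two stated constraints (no colon, 12 to 64 characters); the function name `validate_password` and the message wording are assumptions, and the actual Gravitino implementation may differ.

```python
MIN_LEN, MAX_LEN = 12, 64  # limits taken from the quoted rules


def validate_password(password: str) -> list:
    """Return a list of rule violations; an empty list means the password passes."""
    problems = []
    if ":" in password:
        problems.append("must not contain ':'")
    if not MIN_LEN <= len(password) <= MAX_LEN:
        problems.append(f"length must be between {MIN_LEN} and {MAX_LEN}")
    return problems


print(validate_password("ChangeMeToAStrongPassword"))
print(validate_password("short:pw"))
```

Returning every violation at once, rather than failing on the first one, lets a client show all problems in a single response.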
- local IdP mode is recommended for POC, offline rather than as a
  replacement for enterprise-grade external identity systems

---

How does Trino handle Basic authentication?
I will test this further. If using Trino requires code changes for verification, I will open a separate issue/PR to address it.
You should add a section to explain this point.
I found trinodb/trino#29132, a Trino PR that can support this scenario. It was merged on April 29, 2026 at 6:00 (Beijing time).
You should add the section to explain this point.
My document has been updated.
You can find it in 8.3.
|---|---|
| Method | `GET` |
| Path | `/api/users/{user}` |
| Permission | Only service admin / owner can execute |

Are these interfaces optional? Do we have a configuration option to control them?
These interfaces are currently always available. Should we add a setting that determines whether Basic authentication is enabled, and use it to decide whether these interfaces can be accessed?
These interfaces should be optional.
Got it, I have modified this. These interfaces can only be called when the user actually uses "basic".
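One way to read the outcome of this thread is that the IdP management endpoints are reachable only when `basic` is among the configured authenticators. The sketch below shows such a check against the `gravitino.authenticators` key. The helper name `idp_api_enabled` and the comma-separated parsing are assumptions made for illustration.

```python
def idp_api_enabled(config: dict) -> bool:
    """Return True when 'basic' is one of the configured authenticators."""
    raw = config.get("gravitino.authenticators", "")
    # Assumption: the value is a comma-separated list of authenticator names.
    enabled = {name.strip().lower() for name in raw.split(",") if name.strip()}
    return "basic" in enabled


print(idp_api_enabled({"gravitino.authenticators": "basic"}))
print(idp_api_enabled({"gravitino.authenticators": "oauth"}))
```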
| Error case | HTTP status |
|---|---|
| Password verification failed | `401` |

I have concerns about these error codes. Are they necessary?
Yes, I removed 428 and other less commonly used HTTP codes, and only retained codes such as 401, which are common authentication failure codes.
Especially for 422 Unprocessable Content: is it commonly used?
This comment isn't resolved. You shouldn't mark it as resolved.
Source of the specification: 422 was initially defined in RFC 4918 (the WebDAV extension) and has since been incorporated into RFC 9110, the core HTTP semantics specification, which is an Internet Standard.
Systems That Use HTTP 422 Status Code
HTTP 422 (Unprocessable Entity) originated as a WebDAV extension but is now widely adopted in modern RESTful APIs for semantic validation errors.
🌐 API Endpoints / Gateways & Cloud Platforms
Mainstream Web Frameworks (provide built-in support for returning 422):
- Laravel (PHP): Returns 422 automatically for AJAX request validation failures.
- Rails (Ruby) & Devise: Returns 422 when Turbo-driven form validations fail.
- Drupal (PHP): Provides UnprocessableHttpEntityException class for throwing 422 exceptions.
- Yii (PHP): Offers yii\web\UnprocessableEntityHttpException.
- ASP.NET Core (C#): Includes UnprocessableEntity() result type.
- Apache HttpClient (Java): Defines SC_UNPROCESSABLE_ENTITY constant.
- Ruby: Gem::Net::HTTPUnprocessableEntity class for handling 422.
Specific Services & Components:
- Red Hat Satellite: Returns 422 via its API when creating host entries with invalid parameters.
- CnosDB: Uses 422 in its REST API to indicate operation execution failure.
- FOLIO: Returns 422 with "EMAIL_ALREADY_EXIST" error when a user registers with an existing email.
- NHS e-Referral Service: Returns 422 in its FHIR API when a request is unprocessable due to business logic (e.g., invalid relationship).
📡 Web Servers & Storage Systems
- WebDAV servers: Use 422 when handling complex or malformed XML request bodies.
- NetApp E-Series (SANtricity): Returns 422 when enabling/disabling AutoSupport with invalid parameters.
- Dell PowerProtect Data Manager (versions <19.18): Shows a 422 error in the UI when configuring network with an invalid "search domain" value.
- IBM Systems (AIX, z/OS): 422 may appear as a specific system error code.
📱 Operating Systems & Application Layer
- Microsoft AD FS: Web Application Proxy (WAP) servers log Event ID 422 when idle connection timeout occurs (e.g., 100 seconds of inactivity).
- Bittitan: Reports 422 Unprocessable Entity errors when interacting with Exchange WebDAV API on unprocessable mailbox folders.
- OutSystems (mobile app framework): Returns 422 when data sent to an API fails validation.
Do they use 422 to indicate that the old password is the same as the new one?
Maybe you can use 400 Bad Request here, because in our system this case represents an illegal argument rather than a format exception. I prefer 400. You can also ask for others' opinions.
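The status-code discussion can be summarized as a mapping from error case to HTTP status. In the sketch below, 401 comes from the quoted table and 400 follows the reviewer's suggestion for the "same password" case; the 404 entry, the enum, and the function name are assumptions made for illustration, and the thread does not state which code was finally adopted.

```python
from enum import Enum


class PasswordChangeError(Enum):
    VERIFICATION_FAILED = "password verification failed"
    ACCOUNT_NOT_FOUND = "account does not exist"
    SAME_AS_OLD = "new password equals the old one"


HTTP_STATUS = {
    PasswordChangeError.VERIFICATION_FAILED: 401,  # from the quoted table
    PasswordChangeError.ACCOUNT_NOT_FOUND: 404,    # assumption
    PasswordChangeError.SAME_AS_OLD: 400,          # reviewer's suggestion, not confirmed
}


def http_status(error: PasswordChangeError) -> int:
    """Look up the HTTP status for a password-change failure."""
    return HTTP_STATUS[error]


print(http_status(PasswordChangeError.SAME_AS_OLD))
```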
## 10. Security Considerations

The local IdP is intentionally lightweight, but the following constraints remain important:

Could you give me the user process, the bootstrap process, and the verification?
I have included this content in sections 6.2 and 7.1.
The code has been submitted and the design document has been revised.
You missed the point about the process. You should add the user process rather than the implementation process.
Got it, I have added it to the document: "### 6.3 Example Basic Authentication Bootstrap Flow"
You can find it with Ctrl + F.
- only one bootstrap account is created by default: **service admin**
- the default password is **123456**
- after initial login, the administrator is expected to change the password immediately

This is used for UI login. With the REST API it is hard to determine the first login.
The web filter can filter such requests. If the account is "serviceAdmin" and there is no password for "serviceAdmin" in the database, then only the "serviceAdmin" account is allowed to call the change-password interface.
I plan to adopt a different approach and use an interactive script to initialize the serviceAdmin credentials. The script will generate an SQL statement that is written directly into the database.
| Key | Default | Optional Values |
|---|---|---|
| `gravitino.authenticator.basic.algorithm` | `Argon2id` | `Argon2id` |

Is `gravitino.authenticator.basic.algorithm` a good name?
I have changed the config name to `gravitino.authenticator.basic.password-hash-algorithm`.
Could you borrow from other systems' experience? Note that we should use camel case.
Got it, I have resolved this and pushed the code.
2. Start Gravitino.

3. On first startup, no active `adminUser` record exists yet in `local_user_meta`, so Gravitino
   accepts the bootstrap credential `adminUser:123456` only for the bootstrap login and immediate

How do we determine the bootstrap login? This is not easy for REST.
Basic authentication is stateless. Therefore, logic can be added in the filter: check the `local_user_meta` table. If serviceAdmin has no data in this table, then serviceAdmin has not been registered in the built-in IdP. In that case, the filter intercepts all requests except "change password" and prompts the serviceAdmin account to change the password.
Collect others' suggestions.
I plan to adopt a different approach and use an interactive script to initialize the serviceAdmin credentials. The script will generate an SQL statement that is written directly into the database.
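The filter-based option described in this thread can be sketched as a pure decision function: until the service admin has a stored record, only that admin's own password-change call passes. This covers just one of the approaches discussed here. The function name, the `PUT /api/idp/users/{user}` route, and the in-memory `stored_users` set standing in for the user table are all assumptions for illustration.

```python
def allow_request(user: str, method: str, path: str,
                  admin_name: str, stored_users: set) -> bool:
    """Decide whether the bootstrap filter lets a request through."""
    if admin_name in stored_users:
        # Bootstrap is complete; regular authentication applies elsewhere.
        return True
    # Before bootstrap, only the admin's own password change is allowed.
    change_password_path = f"/api/idp/users/{admin_name}"  # assumed route
    return user == admin_name and method == "PUT" and path == change_password_path


print(allow_request("adminUser", "PUT", "/api/idp/users/adminUser", "adminUser", set()))
print(allow_request("adminUser", "GET", "/api/idp/users/alice", "adminUser", set()))
```

Because the decision depends only on stored state and the current request, it works with stateless Basic authentication and needs no notion of a "first login" session.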
These tables follow Gravitino's existing metadata table conventions:

- numeric primary keys,
- `audit_info`,

Got it, I have resolved it.
```shell
curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
  http://localhost:8090/api/idp/users/alice
```

Got it, I have added the responses for all the interfaces.
| Error case | HTTP status |
|---|---|
| Account doesn't exist | `404` |
| Password same with old | `422` |

I have a concern about this error code.
Got it, I have fixed it.
You can add users to a local group by providing the group name in the path and the target user
names in the request body.

The request path for REST API is `/api/idp/groups/{group}/users/add`.

This makes it look as if you have a user called `add`.
Got it. I removed "/users" from the URL and have submitted the code.
The recommended module name is:

- `authenticators:authenticator-local-authentication`

This is not aligned with the one mentioned above. Also, the name is too long.
Got it, `authenticators/authenticator-basic` is better. It is shorter and fits the context better.
Unlike Gravitino's existing `user_meta` and `group_meta` tables, `local_user_meta` and
`local_group_meta` are intentionally designed as **global identity tables** and therefore **do not

Make the table names more meaningful; `local_xxx` is easily confused with existing table names.
Maybe `idp_user_meta`/`idp_group_meta` is more meaningful. The URLs of the newly added interfaces also include "idp".
- only one bootstrap account is created by default: the configured **service admin** account (for
  example, **adminUser**)
- the default password is **123456**
- the default password **123456** is intended only for the initial bootstrap login and the immediate

I would suggest changing the password to something like "gravitino".
Got it, the document has been modified.
   admin account (for example, **adminUser**) with password **123456** to pass Basic verification
   only for the bootstrap login and immediate password reset flow.
3. Reject other management operations until the bootstrap password has been reset successfully.
4. During the first successful password reset, create the bootstrap service admin record and store

How is the password reset? Does it happen automatically, or should the service admin do it manually?
The process has been changed. An interactive script will be used, and the service admin will change the password themselves.
```properties
gravitino.authenticators=basic
gravitino.authenticator.basic.passwordHasher=Argon2id
```

Do you need to make this a configuration? I don't think it is necessary. If you want to make it a configuration, you need to provide different choices. Do you support this?
Got it. Currently only this one password hashing algorithm is supported and there are no other options, so this configuration can be removed.
The design document has been modified.
  -H 'Content-Type: application/json' \
  https://<gravitino-host>/api/idp/users/adminUser \
  -d '{
    "password": "ChangeMeToAStrongPassword"

Is it safe to transfer the password without hashing?
Yes, transmitting the password in plain text is secure, provided that certain conditions are met: 1. HTTPS is used. 2. TLS 1.2 or higher is used. 3. The server does not log passwords (log desensitization).
OWASP and NIST also recommend this approach.
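The same reasoning applies to the Basic `Authorization` header itself: RFC 7617 credentials are only Base64-encoded, not encrypted, so confidentiality comes entirely from TLS. The sketch below shows the round trip. The helper names are illustrative, and the user name and password are the sample values from this thread.

```python
import base64


def encode_basic(user: str, password: str) -> str:
    """Build an RFC 7617 Authorization header value."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic(header: str) -> tuple:
    """Recover (user, password); anyone who sees the header can do this."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    # The first ':' separates the user name from the password.
    user, sep, password = base64.b64decode(token).decode("utf-8").partition(":")
    if not sep:
        raise ValueError("missing ':' separator")
    return user, password


header = encode_basic("adminUser", "ChangeMeToAStrongPassword")
print(decode_basic(header))
```

Since decoding needs no secret, sending this header over plain HTTP would expose the password, which is why the HTTPS condition above matters.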
```shell
curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \
  http://localhost:8090/api/idp/groups/engineering
```

Can we remove a group if it still has users?
Yes, a group can be removed while it still has users. This interface first removes the relationships between the group and its users, and then removes the group. The design is based on the "remove role" interface.
This behavior has also been added to the design document.
This may be dangerous. You could add a query parameter such as `force` to let the user choose. You can refer to the schema/catalog deletion implementation.
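The `force` suggestion can be sketched with an in-memory model: deleting a non-empty group fails unless the caller opts in. The dictionary of group name to member set, the function name, and the exception type are assumptions for illustration, not the proposed Gravitino API.

```python
def delete_group(groups: dict, name: str, force: bool = False) -> bool:
    """Delete a group; refuse when it still has members unless force is set."""
    members = groups.get(name)
    if members is None:
        return False  # nothing to delete
    if members and not force:
        raise ValueError(f"group '{name}' still has {len(members)} user(s)")
    # Removing the entry also drops the group-to-user relations in this model.
    del groups[name]
    return True


groups = {"engineering": {"alice"}, "empty": set()}
print(delete_group(groups, "empty"))
print(delete_group(groups, "engineering", force=True))
```

Making the destructive path opt-in mirrors the reviewer's concern: the default call cannot silently remove memberships.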
```shell
curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
  -H "Content-Type: application/json" -d '{
  "userNames": ["alice"]
```

Simplify all the `userNames` fields to `users`.
Got it, `users` is better. I have modified the document.
- the JDBC URL
- the Gravitino database name
- the JDBC user name
- the JDBC password

This can be read from the configuration. Users who want to start Gravitino have to configure the JDBC information anyway, so it can be shared here; there is no need to ask the user to enter it again.
Besides, I think this script is only needed for local authentication, right?
1. Yes, JDBC information can be read from other configuration files.
2. Yes, this script is only for local authentication.
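Reading the JDBC settings from the existing configuration, as agreed above, might look like the following sketch. The key prefix `gravitino.entity.store.relational.` and the three key suffixes are assumptions about the server configuration, and the sample values are made up; check the actual configuration file before relying on them.

```python
JDBC_PREFIX = "gravitino.entity.store.relational."  # assumed key prefix


def load_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, because JDBC URLs may contain '='.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def jdbc_settings(props: dict) -> dict:
    """Pick the JDBC settings the initialization step needs."""
    wanted = ("jdbcUrl", "jdbcUser", "jdbcPassword")
    missing = [k for k in wanted if JDBC_PREFIX + k not in props]
    if missing:
        raise KeyError(f"missing JDBC settings: {missing}")
    return {k: props[JDBC_PREFIX + k] for k in wanted}


# Hypothetical configuration fragment with made-up values.
SAMPLE = """
gravitino.entity.store.relational.jdbcUrl = jdbc:mysql://localhost:3306/gravitino?useSSL=false
gravitino.entity.store.relational.jdbcUser = gravitino
gravitino.entity.store.relational.jdbcPassword = example-password
"""
settings = jdbc_settings(load_properties(SAMPLE))
print(settings["jdbcUrl"])
```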
7. Hash the password with Argon2id.
8. Generate the `INSERT` statement required for the target JDBC backend and execute it immediately
   so the service admin record is written into `idp_user_meta`.
9. Exit successfully only after the insert has been committed.

What happens if the initialization script is not run before startup?
Did you sync with @danhuawang and @geyanggang about how to use it in Docker and K8s environments?
Frankly speaking, I'm in favor of the previous solution: have a default password and let the service admin reset it the first time.
To avoid a hardcoded value, we can leave the initial password blank and then ask the service admin to reset it to a valid one.
Using a separate interactive script makes things complicated and is hard to use in a container deployment.
- If the initialization script has not been run, the filter will block all requests (because the service admin has no password).
- Got it. So far I have only discussed this with Rory. I will now discuss it with @danhuawang and @geyanggang.
I agree with leaving the initial password blank and asking the service admin to reset it on first access. Using a separate script only for password initialization is not convenient.
- **interactive service admin initialization**

The result is a self-contained authentication path that is easy to deploy, fast to evaluate, and
well aligned with Gravitino's lightweight quick-start experience.

Can you add a chapter describing your work plan and checklist? It will be helpful for reviewers and AI to check.
Got it, I will follow the style of other design documents to complete the plan and checklist.
What changes were proposed in this pull request?
This PR adds a new design document, `design-docs/gravitino-local-idp.md`, for Local IDP support in Gravitino.
The document covers:
- `local_user_meta`, `local_group_meta`, and `local_group_user_rel`
- `user_meta` / `group_meta`
Why are the changes needed?
Gravitino currently relies on external identity providers for OAuth-based authentication. That works well in enterprise environments, but it creates friction for POC, offline, isolated, and emergency-fallback scenarios.
This design document captures a lightweight Local IDP proposal so the feature scope, storage model, authentication flow, and management APIs are clearly defined before implementation.
Fix: N/A
Does this PR introduce any user-facing change?
Yes.
It adds a new design document describing the proposed Local IDP feature, including:
- `basic` authenticator mode
- `gravitino.authenticator.basic.algorithm` configuration key
How was this patch tested?
Documentation-only change.
Reviewed the document structure and aligned it with the existing `design-docs` style.