[feature][CGS][entrance] add support for controlling location clause usage in Hive tasks for security #5357

@v-kkhuang

Search before asking

  • I have searched the issues and found no similar feature request.

Problem Description

Currently, Hive tasks can use the LOCATION clause to specify custom data locations. This is a security risk: a user can point a table at a data path they are not authorized to access, or interfere with other users' data, which undermines data isolation.

Description

This feature adds a configuration option that controls whether Hive tasks may use the LOCATION clause. When the control is enabled, the system detects and blocks SQL statements in Hive tasks that contain a LOCATION clause.

Use case

Administrators want to prevent users from using the LOCATION clause in Hive tasks, so that users cannot access unauthorized data paths or interfere with other users' data through custom location specifications.

Solutions

  1. Add the configuration linkis.entrance.sql.explain.hive.location.control.enabled to enable or disable LOCATION clause validation
  2. Implement LOCATION clause detection in the Explain interceptor for Hive tasks
  3. Throw an exception when a LOCATION clause is detected in Hive SQL and the control is enabled
  4. Add comprehensive unit tests for the LOCATION control logic
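
A minimal sketch of what steps 2 and 3 could look like, written in plain Java so it stands alone. It is not the Linkis implementation: the class HiveLocationGuard, its methods, and the use of IllegalArgumentException are hypothetical names chosen for illustration, and the real interceptor would read the flag from the configuration above and throw its own exception type. The sketch blanks out string literals and comments first, so that a column named location or the word LOCATION inside a comment does not trigger a false positive.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Linkis.
final class HiveLocationGuard {

    // Configuration key named in this issue (default: false).
    static final String CONTROL_KEY =
        "linkis.entrance.sql.explain.hive.location.control.enabled";

    // Matches quoted string literals, "--" line comments and "/* */" block comments.
    private static final Pattern LITERAL_OR_COMMENT = Pattern.compile(
        "'(?:[^'\\\\]|\\\\.)*'"
            + "|\"(?:[^\"\\\\]|\\\\.)*\""
            + "|--[^\\n]*"
            + "|/\\*.*?\\*/",
        Pattern.DOTALL);

    // After normalization every string literal is '', so a LOCATION clause
    // appears as the keyword LOCATION followed by ''.
    private static final Pattern LOCATION_CLAUSE =
        Pattern.compile("\\bLOCATION\\s*''", Pattern.CASE_INSENSITIVE);

    // Returns true if the SQL text contains LOCATION followed by a string literal.
    static boolean containsLocationClause(String sql) {
        String normalized = LITERAL_OR_COMMENT.matcher(sql).replaceAll(m -> {
            String token = m.group();
            boolean isLiteral = token.startsWith("'") || token.startsWith("\"");
            return isLiteral ? "''" : " "; // blank out literals, drop comments
        });
        return LOCATION_CLAUSE.matcher(normalized).find();
    }

    // Rejects the statement only for Hive tasks and only when the control is enabled.
    static void validate(String engineType, String sql, boolean controlEnabled) {
        if (controlEnabled
                && "hive".equalsIgnoreCase(engineType)
                && containsLocationClause(sql)) {
            throw new IllegalArgumentException(
                "LOCATION clause is not allowed in Hive tasks");
        }
    }
}
```

With the control enabled, HiveLocationGuard.validate("hive", "CREATE TABLE t (id INT) LOCATION '/user/x'", true) throws, while the same call with controlEnabled set to false, or with a non-Hive engine type, passes through. A regex check like this is only an approximation; a production check may prefer to inspect the parsed statement.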

Anything else

Affected module: linkis-computation-governance/linkis-entrance
Configuration: linkis.entrance.sql.explain.hive.location.control.enabled (default: false)

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
