This repository has been archived by the owner on May 4, 2021. It is now read-only.

Support for HDFS files #3

Closed
mrunesson opened this issue Jul 27, 2018 · 2 comments
Labels: enhancement (New feature or request)

Comments

@mrunesson
Contributor

When I tag a Hive table, I want the same tag to be applied to the files the table is stored in.

This requires:

  • Looking up the table's storage path in the metastore.
  • Adding the tag to that path in Atlas when tagging the table.
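
The two steps above could be sketched roughly as follows. This is only an illustration, not the project's code: `location_to_path`, `tags_for_paths`, and the dictionary shapes are hypothetical, and the metastore lookup is assumed to have already produced a table-to-location mapping.

```python
from urllib.parse import urlparse


def location_to_path(location: str) -> str:
    """Strip scheme and authority from a metastore location URI (hypothetical helper)."""
    return urlparse(location).path.rstrip("/") or "/"


def tags_for_paths(table_tags, table_locations):
    """Pair every tag on a table with that table's storage path.

    table_tags:      {"db.table": ["TAG", ...]}      (assumed shape)
    table_locations: {"db.table": "hdfs://.../path"} (assumed result of a metastore lookup)
    """
    pairs = []
    for table, tags in table_tags.items():
        location = table_locations.get(table)
        if location is None:
            continue  # no known location for this table, so nothing to tag
        path = location_to_path(location)
        pairs.extend((path, tag) for tag in tags)
    return pairs


pairs = tags_for_paths(
    {"sales.orders": ["PII"]},
    {"sales.orders": "hdfs://nn:8020/warehouse/sales.db/orders"},
)
```

Each resulting pair would then be sent to Atlas by whatever client the tool already uses for table tags.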
mrunesson added the enhancement label Jul 27, 2018
mrunesson self-assigned this Sep 11, 2018
@mrunesson
Contributor Author

mrunesson commented Sep 11, 2018

The plan for tagging files is to add a command-line option --hdfs to the tags_to_atlas command. With this option, the program looks up the storage location of each table defined in table_tags.csv and tags that storage location as well.

The second step is to extend the rules_to_ranger command so that it can convert Hive resources in ranger_policy.json to HDFS resources with similar access rules. This is only done for entries in ranger_policy.json that contain the block:
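
A minimal sketch of how such a flag might gate the extra tagging step. The parser setup, the `targets_for` helper, and the use of the Atlas type names `hive_table` and `hdfs_path` are assumptions made for illustration, not the actual tags_to_atlas implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with only the option discussed here."""
    parser = argparse.ArgumentParser(prog="tags_to_atlas")
    parser.add_argument(
        "--hdfs",
        action="store_true",
        help="also tag the storage location of each table",
    )
    return parser


def targets_for(table, location, include_hdfs):
    """Return (entity type, name) pairs that should receive the table's tags."""
    targets = [("hive_table", table)]
    if include_hdfs and location:
        # Only add the storage location when --hdfs was given.
        targets.append(("hdfs_path", location))
    return targets


args = build_parser().parse_args(["--hdfs"])
result = targets_for("sales.orders", "/warehouse/sales.db/orders", args.hdfs)
```

Without --hdfs the same call would return only the Hive table entry, so existing behaviour would stay unchanged.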

"options": {
    "expandHiveResourceToHdfs": true,
    "hdfsService": "hadoop"
}
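
One way the expansion could work is sketched below. Only the two option keys come from this issue. Everything else is an assumption for illustration: the policy layout (`service`, `resources`, `policyItems`, `accesses`) follows the general shape of Ranger policy JSON, and the Hive-to-HDFS access mapping is one possible reading of "similar access rules".

```python
import copy

# Hypothetical mapping from Hive access types to HDFS access types.
HIVE_TO_HDFS_ACCESS = {
    "select": ["read", "execute"],
    "update": ["write", "execute"],
}


def expand_policy(policy, table_paths):
    """Return an HDFS variant of a Hive policy, or None if expansion is not requested."""
    options = policy.get("options", {})
    if not options.get("expandHiveResourceToHdfs"):
        return None

    hdfs_policy = copy.deepcopy(policy)
    hdfs_policy.pop("options", None)
    hdfs_policy["service"] = options["hdfsService"]
    # Replace database/table resources with the tables' storage paths.
    hdfs_policy["resources"] = {
        "path": {"values": sorted(table_paths), "isRecursive": True}
    }
    for item in hdfs_policy.get("policyItems", []):
        hdfs_types = []
        for access in item.get("accesses", []):
            for hdfs_type in HIVE_TO_HDFS_ACCESS.get(access["type"], []):
                if hdfs_type not in hdfs_types:
                    hdfs_types.append(hdfs_type)
        item["accesses"] = [{"type": t, "isAllowed": True} for t in hdfs_types]
    return hdfs_policy


hive_policy = {
    "service": "hive",
    "resources": {"database": {"values": ["sales"]}, "table": {"values": ["orders"]}},
    "policyItems": [{"users": ["alice"], "accesses": [{"type": "select", "isAllowed": True}]}],
    "options": {"expandHiveResourceToHdfs": True, "hdfsService": "hadoop"},
}
expanded = expand_policy(hive_policy, ["/warehouse/sales.db/orders"])
```

Policies without the options block return None here, matching the opt-in behaviour described above.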

For tag-based rules, the intention is to require the user to extend the policy block to also cover HDFS.

Note: Row filtering and masking will still not apply to jobs that read directly from HDFS.

@mrunesson
Contributor Author

Released in v1.1.0
