Can we implement a MetaStoreFilterHook to modify the path and support HMS that uses the same ns #320

Open
flaming-archer opened this issue Jun 24, 2024 · 9 comments

Comments

@flaming-archer
Contributor

Is your feature request related to a problem? Please describe.

One HMS uses a Hadoop cluster whose path is, say, hdfs://ns/path/to/hms1. Another HMS uses a different Hadoop cluster whose path is hdfs://ns/path/to/hms2. In this case hs2 receives two paths with the same ns, so it is unclear which Hadoop cluster to access.
Assuming we have RBF connecting these two Hadoop clusters, we can change both paths to RBF paths. For example, hdfs://ns/path/to/hms1 becomes hdfs://rbf/ns1/path/to/hms1, and hdfs://ns/path/to/hms2 becomes hdfs://rbf/ns2/path/to/hms2. The path can be rewritten according to a set of rules; the example above is a simple one. In this scenario, can we implement a MetaStoreFilterHook to achieve this?
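
To make the simple rule above concrete, here is a minimal sketch of the rewrite in plain Java. It is only an illustration: the class name RbfPathRewriter, the metastore names "hms1"/"hms2", and the hard-coded mapping are assumptions for this example, and it deliberately leaves out the wiring into a MetaStoreFilterHook.

```java
import java.util.Map;

public class RbfPathRewriter {
    // The namespace that both clusters share (taken from the example paths).
    private static final String SHARED_NS = "hdfs://ns";

    // Hypothetical mapping: metastore name -> RBF mount prefix.
    private static final Map<String, String> RBF_PREFIX = Map.of(
            "hms1", "hdfs://rbf/ns1",
            "hms2", "hdfs://rbf/ns2");

    /** Rewrites a location under the shared ns to the RBF path of the given metastore. */
    public static String rewrite(String metastore, String location) {
        String target = RBF_PREFIX.get(metastore);
        // Leave the location untouched if there is no rule or it is not under hdfs://ns/.
        if (target == null || location == null || !location.startsWith(SHARED_NS + "/")) {
            return location;
        }
        return target + location.substring(SHARED_NS.length());
    }

    public static void main(String[] args) {
        System.out.println(rewrite("hms1", "hdfs://ns/path/to/hms1"));
        System.out.println(rewrite("hms2", "hdfs://ns/path/to/hms2"));
    }
}
```

The point of the sketch is that the rule needs to know which metastore a path came from; the path alone is not enough when both clusters use the same ns.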

Describe the solution you'd like

I see an example of PrefixingMetastoreFilter that can be used to modify a path. Perhaps we can refer to this to add a new class to achieve such functionality.

Describe alternatives you've considered

Perhaps there are other ways, such as changing the paths in the existing HMS to RBF paths, but this may introduce operational risk to the existing environment.

Additional context
No, this is a new feature.

@patduin
Contributor

patduin commented Jun 24, 2024

Please have a look at this; maybe it satisfies the use case: https://github.com/ExpediaGroup/apiary-extensions/tree/main/hive-hooks

Either way, WD should already support loading such hooks; it's just a matter of loading it all up. The implementation of this hook can live in any project. I prefer to leave those out of WD.

@flaming-archer
Contributor Author

Thank you very much, this is very important to me, haha.

@patduin
Contributor

patduin commented Jun 24, 2024

sure, have a look and if it's ok we can close this issue.

@flaming-archer
Contributor Author

flaming-archer commented Jun 24, 2024

I took a look and it doesn't seem to be a perfect match. ApiaryMetastoreFilter simply replaces paths according to rules. What if the paths of the two HMS instances are exactly the same? It would be better to key the replacement rules on the db or the HMS, so that, for example, the paths of hms1 and hms2 are replaced with different paths.
Is there a better way to deal with this situation? Please do not hesitate to teach me...

@patduin
Contributor

patduin commented Jun 24, 2024 via email

Can't you add a hook in the metastores and return the path with the namenode, so the paths are unique again?

@flaming-archer
Contributor Author

Thank you, that's a very good idea. I'll give it a try.

@flaming-archer
Contributor Author

flaming-archer commented Jul 2, 2024

@patduin Hi, we have tested this many times and found that the hook takes effect on the client side of HMS; the modification on the server side of HMS has no effect. Two things need to be done to meet our needs:
1. Each client of WD needs its own regex replacement expressions, and currently WD cannot load different configuration items when loading hooks.
2. The hook implementation in apiary is compiled against hive2, and we need to port it to hive3.

It also seems that ApiaryNullAuthorizationProvider doesn't need to be configured.

Perhaps we can submit these two PRs to the two projects separately? Do you think our approach is correct, @patduin?
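
For point 1, a rough sketch of what "different regex replacements per client" could look like, in plain Java. This is not the Apiary or WD API: the class ScopedPathRules, its methods, and the metastore names are hypothetical, and how the rules would be loaded from WD configuration is left open.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScopedPathRules {
    /** One regex replacement rule (hypothetical helper type). */
    private static final class Rule {
        final Pattern pattern;
        final String replacement;

        Rule(String regex, String replacement) {
            this.pattern = Pattern.compile(regex);
            this.replacement = replacement;
        }
    }

    // Rules are scoped by metastore name, so identical paths can map differently.
    private final Map<String, List<Rule>> rulesByMetastore = new HashMap<>();

    public ScopedPathRules add(String metastore, String regex, String replacement) {
        rulesByMetastore.computeIfAbsent(metastore, k -> new ArrayList<>())
                .add(new Rule(regex, replacement));
        return this;
    }

    /** Applies the first matching rule of the given metastore; otherwise returns the input. */
    public String apply(String metastore, String location) {
        for (Rule rule : rulesByMetastore.getOrDefault(metastore, List.of())) {
            Matcher m = rule.pattern.matcher(location);
            if (m.find()) {
                return m.replaceFirst(rule.replacement);
            }
        }
        return location;
    }

    public static void main(String[] args) {
        ScopedPathRules rules = new ScopedPathRules()
                .add("hms1", "^hdfs://ns/", "hdfs://rbf/ns1/")
                .add("hms2", "^hdfs://ns/", "hdfs://rbf/ns2/");
        System.out.println(rules.apply("hms1", "hdfs://ns/path/to/table"));
        System.out.println(rules.apply("hms2", "hdfs://ns/path/to/table"));
    }
}
```

With rules scoped like this, the same location hdfs://ns/path/to/table resolves to a different RBF path depending on which metastore returned it.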

@patduin
Contributor

patduin commented Jul 3, 2024

On 1): it's possible to load multiple regexes, but yeah, I see they are not scoped per hook. The hook itself is set per metastore, though, so you could potentially load a different implementation per metastore that does what you need.
I'd probably try making your own extensions, where you just load a hook per metastore that does what you need. It's hard to make this very generic, and I'm not sure it's worth the effort. You'll have more control if you write your own hook and just use WD to hook it up, which it already supports. You won't have to depend on us for reviews, etc. You could potentially open source your extensions, and we would be happy to add a link from the WD readme as another example.
Consider the extensions just a potential example of how to do it; you don't necessarily have to use that project.

@flaming-archer
Contributor Author

On 1), we can make a modification so that different configurations can be loaded for different HMS instances in the future. @yangyuxia, you can try sending a PR; it looks very similar to what you submitted last time. On 2), we can create our own hook and use it ourselves.
