Data Catalog - Add ability to link to existing entries (or obtain the entryId) via linkedResource #6766

Open
Ryan-Lintern opened this issue Jul 9, 2020 · 6 comments


Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.

Description

When any BigQuery table or dataset is created on GCP, Data Catalog automatically creates an entry for it, with a name of the form

projects/{project_id}/locations/{location}/entryGroups/{entryGroupId}/entries/{entryId}

Currently, the google_data_catalog_entry resource only allows the creation of new entries; it cannot link to an existing entry. As a result, any tags created are attached to the new entry rather than to the existing entry/table.
Please add the ability to obtain the entryId for the existing BigQuery linkedResource (https://cloud.google.com/data-catalog/docs/reference/rest/v1/entries/lookup)

New or Affected Resource(s)

  • google_data_catalog_tag
  • google_data_catalog_entry

Potential Terraform Configuration

References

  • #0000

tylerwengerd-cr commented Aug 21, 2020

For anyone who needs this functionality now for BQ resources - the entryId field is a base64-encoded string generated from the resource id, minus any padding.

For example, projects/my-project/datasets/my-dataset/tables/my-table has the following base64 encoding:

> echo -n "projects/my-project/datasets/my-dataset/tables/my-table" | base64
cHJvamVjdHMvbXktcHJvamVjdC9kYXRhc2V0cy9teS1kYXRhc2V0L3RhYmxlcy9teS10YWJsZQ==

So that table in the US location has the following full entry name (the entryId itself is the final path segment):

projects/my-project/locations/us/entryGroups/@bigquery/entries/cHJvamVjdHMvbXktcHJvamVjdC9kYXRhc2V0cy9teS1kYXRhc2V0L3RhYmxlcy9teS10YWJsZQ

My current workaround to generate the entryId based on the above info is as follows:

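locals {
  # Lowercase the table's location (e.g. "US" becomes "us") for use in the entry name
  table_location       = lower(google_bigquery_table.default.location)
  # Base64-encode the table's resource id and strip the "=" padding
  base64_encoded_id    = trim(base64encode(google_bigquery_table.default.id), "=")
  # Full entry name under the @bigquery entry group
  datacatalog_entry_id = "projects/${google_bigquery_table.default.project}/locations/${local.table_location}/entryGroups/@bigquery/entries/${local.base64_encoded_id}"
}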

This is for a table but is basically the same for a dataset.
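
The encoding rule described in this comment can also be checked outside Terraform. Below is a minimal shell sketch, assuming the rule holds exactly as stated (standard base64 of the table's resource id, padding removed, lowercase location). The function name bq_entry_name and its argument order are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: builds the Data Catalog entry name for a BigQuery table,
# following the rule described in the comment above (an assumption, not a documented API).
# Usage: bq_entry_name PROJECT DATASET TABLE LOCATION
bq_entry_name() {
  project="$1"; dataset="$2"; table="$3"; location="$4"

  # Resource id in the same shape as the example in the comment
  resource_id="projects/${project}/datasets/${dataset}/tables/${table}"

  # Base64-encode without a trailing newline, then drop line wraps and "=" padding
  encoded=$(printf '%s' "$resource_id" | base64 | tr -d '=\n')

  # The entry name uses a lowercase location, e.g. "US" becomes "us"
  loc=$(printf '%s' "$location" | tr '[:upper:]' '[:lower:]')

  printf 'projects/%s/locations/%s/entryGroups/@bigquery/entries/%s\n' \
    "$project" "$loc" "$encoded"
}
```

Calling bq_entry_name my-project my-dataset my-table US should reproduce the entry name shown earlier in this comment, which makes it a quick way to compare the computed value against what the Data Catalog API reports for a real table.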


burnzy commented Aug 24, 2020

That's great to know! Thanks for the info @tylerwengerd-cr


burnzy commented Oct 20, 2020

@tylerwengerd-cr Any chance you know if this workaround also works for column-level resources?
I can't work out what the proper resource IDs are for columns; only datasets and tables seem to be supported, but maybe I'm missing something? (I can't find a full list of all GCP resource names/IDs.)


tylerwengerd-cr commented Oct 21, 2020

@burnzy unfortunately no, I never got into column-level resources when I was working with this. Here is the best relevant documentation I can find on resource names (that you've probably already read 😄 ) so it might take some poking around with the API to see if it's even supported. Good luck!

@27Bslash6

Can anyone confirm this workaround still works?

When trying to attach to an existing dataset/entry I'm seeing:

Error: Error creating Entry: googleapi: Error 400: "projects/my-project/locations/eu/entryG..." is an invalid value for CreateEntryRequest.entry_id. It must contain only English letters, numbers and underscores; and be at most 64 characters.

@AlexT-Ki

This is what I've done as a workaround (requires the gcloud CLI to be installed):

data "external" "data_catalog_lookup" {
  program = ["gcloud", "data-catalog", "--format", "json(name)", "entries", "lookup", "bigquery.table.`${google_bigquery_table.table.project}`.${google_bigquery_table.table.dataset_id}.${google_bigquery_table.table.table_id}"]
}

The entry name can be retrieved using data.external.data_catalog_lookup.result.name.
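
Before wiring the lookup into Terraform, it may help to run the same command by hand. The sketch below only builds the SQL-style resource string that the data source above passes to gcloud; the helper name bq_sql_resource is hypothetical, and the gcloud invocation is left as a comment because it needs an authenticated gcloud CLI.

```shell
# Hypothetical helper: builds the resource string used by the lookup above,
# with the project id wrapped in literal backticks.
# Usage: bq_sql_resource PROJECT DATASET TABLE
bq_sql_resource() {
  printf 'bigquery.table.`%s`.%s.%s\n' "$1" "$2" "$3"
}

# With gcloud installed and authenticated, the lookup could then be run as:
#   gcloud data-catalog entries lookup "$(bq_sql_resource my-project my-dataset my-table)" --format 'json(name)'
```

If the command returns a JSON object with a name field, the Terraform external data source should see the same value under result.name.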

modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Nov 22, 2022
…CR interface (hashicorp#6766)

Co-authored-by: Luca Prete <lucaprete@google.com>
Signed-off-by: Modular Magician <magic-modules@google.com>
modular-magician added a commit that referenced this issue Nov 22, 2022
…ace (#6766) (#13105)

Co-authored-by: Luca Prete <lucaprete@google.com>
Signed-off-by: Modular Magician <magic-modules@google.com>