Presto not getting deployed with type: gcp-types/dataproc-v1:projects.regions.clusters #546

Closed
Priyankasaggu11929 opened this issue Mar 20, 2020 · 25 comments

@Priyankasaggu11929

Priyankasaggu11929 commented Mar 20, 2020

[TL;DR] Here is the solution to the below problem: #546 (comment)

I tried creating a Dataproc cluster with Presto using the optionalComponents field under softwareConfig, but PRESTO is the only component that does not get installed. All the other components in the template below install successfully.

Besides, I see Presto is not in beta as well, so what could be the possible solution here?

{% set clusterName = (env["deployment"] + "-dataproc-cluster") %}

resources:
- name: {{ clusterName }}
  type: gcp-types/dataproc-v1:projects.regions.clusters
  properties:
    region: {{ properties["region"] }}
    projectId: {{ env["project"] }}
    clusterName: {{ clusterName }}
    config:
      configBucket: example-bucket
      gceClusterConfig:
        zoneUri: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/zones/{{ properties["zone"] }}
        tags: 
        - example-firewall-02
        subnetworkUri: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/regions//{{ properties["zone"] }}/subnetworks/example-subnet-1
        internalIpOnly: true
      masterConfig:
        numInstances: 1
        machineTypeUri: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/zones/{{ properties["zone"] }}/machineTypes/n1-standard-2
        diskConfig:
          bootDiskSizeGb: 200
          bootDiskType: pd-ssd
      workerConfig:
        numInstances: 2
        machineTypeUri: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/zones/{{ properties["zone"] }}/machineTypes/n1-standard-2
        diskConfig: 
          bootDiskSizeGb: 200
          bootDiskType: pd-ssd
      softwareConfig:
        imageVersion: 1.4.23-ubuntu18
        optionalComponents: 
        - ANACONDA
        - HIVE_WEBHCAT 
        - JUPYTER
        - PRESTO
        - ZEPPELIN 
        - ZOOKEEPER  
@ocsig
Member

ocsig commented Mar 20, 2020

Hi @Priyankasaggu11929,

I am not a Dataproc expert, but let me try to help you.
Can you clarify which components are installed? (I am confused by your current description. I have a feeling you mean that ANACONDA, HIVE_WEBHCAT, JUPYTER, ZEPPELIN and ZOOKEEPER are installed, but NOT PRESTO. Is this correct?)

@ocsig
Member

ocsig commented Mar 20, 2020

Taking a quick look at the REST APIs:
dataproc-v1
dataproc-v1beta2

It looks to me like the GA API does NOT support PRESTO; only the beta API does.
Unfortunately there is no official GCP type released for dataproc-v1beta, but you can easily use custom types.
Do you want me to help you create a custom type for dataproc-v1beta? (Looking at the dataproc-v1 type, it should be fairly simple.)

@Priyankasaggu11929
Author

@ocsig Thanks for the reply.

Yes, I meant, ANACONDA, HIVE_WEBHCAT, JUPYTER, ZEPPELIN, ZOOKEEPER are getting installed but not PRESTO.

I think I need your help here. I'm not sure how to create a custom type for dataproc-v1beta. :)

@ocsig
Member

ocsig commented Mar 20, 2020

Ok, I am working on it right now.
I found an issue with your Jinja:
/regions//{{ properties["zone"] }} >> /regions/{{ properties["zone"] }}

Is this script working for you with the GA type (except for the PRESTO installation)?
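
A side note on that URI: subnetworks are regional resources, so the segment after /regions/ is normally a region name (for example us-west1) rather than a zone (us-west1-b). Below is a minimal Python sketch of building such a URI without the double slash. It assumes the usual "<region>-<letter>" zone naming; the helper names and the my-project value are hypothetical and not part of Deployment Manager.

```python
COMPUTE_API = "https://www.googleapis.com/compute/v1"


def region_from_zone(zone: str) -> str:
    """Derive the region from a zone name, e.g. 'us-west1-b' -> 'us-west1'.

    Assumes the usual '<region>-<letter>' zone naming.
    """
    region, sep, suffix = zone.rpartition("-")
    if not sep or not region or not suffix:
        raise ValueError(f"unexpected zone name: {zone!r}")
    return region


def subnetwork_uri(project_id: str, zone: str, subnet: str) -> str:
    """Build a subnetwork URI with single slashes and a region segment."""
    region = region_from_zone(zone)
    return f"{COMPUTE_API}/projects/{project_id}/regions/{region}/subnetworks/{subnet}"


# 'my-project' is a placeholder project ID.
uri = subnetwork_uri("my-project", "us-west1-b", "example-subnet-1")
print(uri)
```

In a Jinja template the same idea would be to pass the region property (already used for the cluster) into the subnetwork path instead of the zone.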

@Priyankasaggu11929
Author

Oh, @ocsig sorry for the typo there. That happened while I was pasting the template here and replacing the actual values with configurable variables.

Yes, the script works properly apart from the PRESTO installation.

@ocsig
Member

ocsig commented Mar 20, 2020

(NOTE: I am still running my test, I may modify this post if it fails.)

Creation of a custom Type documentation

You don't need to modify your template except for the type you are using. I switched a few settings to defaults because it was easier for testing.

Step 1:
Create an options file (nano dataproc-v1beta.type.yaml)
dataproc-v1beta.type.yaml:

options:
    inputMappings:
        - fieldName: Authorization
          location: HEADER
          value: $.concat("Bearer ", $.googleOauth2AccessToken())
          methodMatch: .*
    collectionOverrides:
    - collection: projects.regions.clusters
      options:
        virtualProperties: |
          schema: http://json-schema.org/draft-04/schema#
          type: object
          properties:
            region:
              type: string
          required:
          - region
        inputMappings:
        - methodMatch: ^(create|update|get|patch|delete)$
          location: PATH
          fieldName: region
          value: >
            $.resource.properties.region
        - methodMatch: ^setIamPolicy$
          location: PATH
          fieldName: resource
          value: >
            $.resource.self.name
        - methodMatch: ^(update|get|patch|delete)$
          location: PATH
          fieldName: clusterName
          value: >
            $.resource.properties.clusterName

Step 2:
Create the custom type.

gcloud beta deployment-manager type-providers create dataproc-v1beta --api-options-file dataproc-v1beta.type.yaml --descriptor-url='https://dataproc.googleapis.com/$discovery/rest?version=v1beta2'

Waiting for insert [operation-1584697838999-5a14637c5191b-9deb0532-b1336e26]...done.
Created type_provider [dataproc-v1beta]

From this point, your project has a custom type: my-project/dataproc-v1beta.
You will use it just like a gcp-type: my-project/dataproc-v1beta:projects.regions.clusters
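
To make the shape of that type reference explicit, here is a small Python sketch that splits a reference into the project that owns the type provider, the provider name, and the API collection. The TypeRef and parse_type_ref names are made up for illustration; Deployment Manager does the equivalent parsing on its side.

```python
from typing import NamedTuple


class TypeRef(NamedTuple):
    provider_project: str  # project ID that owns the type provider, or "gcp-types"
    provider: str          # type provider name, e.g. "dataproc-v1beta"
    collection: str        # API collection, e.g. "projects.regions.clusters"


def parse_type_ref(type_ref: str) -> TypeRef:
    """Split '<project>/<provider>:<collection>' into its three parts."""
    prefix, sep, collection = type_ref.partition(":")
    project, slash, provider = prefix.partition("/")
    if not (sep and slash and project and provider and collection):
        raise ValueError(f"not a provider-style type reference: {type_ref!r}")
    return TypeRef(project, provider, collection)


ga = parse_type_ref("gcp-types/dataproc-v1:projects.regions.clusters")
custom = parse_type_ref("my-project/dataproc-v1beta:projects.regions.clusters")
print(ga.provider_project, custom.provider_project)
```

The only differences between the GA reference and the custom one are the first two parts; the collection stays the same, which is why the rest of the template does not need to change.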

dpbeta.jinja (NOTE: change 'my-project' to your own project ID)

{% set clusterName = (env["deployment"] + "-dataproc-cluster") %}

resources:
- name: {{ clusterName }}
  type: my-project/dataproc-v1beta:projects.regions.clusters
  properties:
    region: {{ properties["region"] }}
    projectId: {{ env["project"] }}
    clusterName: {{ clusterName }}
    config:
#      configBucket: example-bucket
      gceClusterConfig:
        zoneUri: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/zones/{{ properties["zone"] }}
        tags: 
        - example-firewall-02
#        subnetworkUri: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/regions/{{ properties["zone"] }}/subnetworks/example-subnet-1
#        internalIpOnly: true
      masterConfig:
        numInstances: 1
        machineTypeUri: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/zones/{{ properties["zone"] }}/machineTypes/n1-standard-2
        diskConfig:
          bootDiskSizeGb: 200
          bootDiskType: pd-ssd
      workerConfig:
        numInstances: 2
        machineTypeUri: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/zones/{{ properties["zone"] }}/machineTypes/n1-standard-2
        diskConfig: 
          bootDiskSizeGb: 200
          bootDiskType: pd-ssd
      softwareConfig:
        imageVersion: 1.4.23-ubuntu18
        optionalComponents: 
        - ANACONDA
        - HIVE_WEBHCAT 
        - JUPYTER
        - PRESTO
        - ZEPPELIN 
        - ZOOKEEPER  

dpbeta.yaml

imports:
- path: dpbeta.jinja

resources:
- name: dpbeta
  type: dpbeta.jinja
  properties:
    region: us-west1
    zone: us-west1-b

Creating the deployment:

gcloud deployment-manager deployments create dptest --config=dpbeta.yaml

@Priyankasaggu11929
Author

Priyankasaggu11929 commented Mar 20, 2020

Thanks for the steps. I'm reproducing the steps and testing now.

@Priyankasaggu11929
Author

@ocsig I followed the steps and I think it worked.

But now I'm stuck at this error:

 message: Required 'deploymentmanager.typeProviders.get' permission for '{{service_account_number}}@cloudservices.gserviceaccount.com
    for resource projects/{{project_name}}/typeProviders/dataproc-v1beta'

I was reading https://cloud.google.com/deployment-manager/docs/access-control but still couldn't figure out how to grant this permission.

@ocsig
Member

ocsig commented Mar 20, 2020

Is your type in the same project you are launching the deployment from? (In other words, do {{service_account_number}} and {{project_name}} belong to the same project? The project number and the project ID identify the same project, even though the two values are different.)

@ocsig
Member

ocsig commented Mar 20, 2020

Make sure you either have the type in every project where you want to use it, OR that every DM service account has roles/deploymentmanager.viewer in the project where the custom type lives, so it can read the type. ("Every SA" means that every project has its own default DM SA. Each one is a project Editor in its own project but has no IAM bindings in other projects, so it needs to be a DM viewer in the project it loads the type from.)
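
For the second option, a rough Python sketch of how the binding could be expressed. It only builds and prints a gcloud projects add-iam-policy-binding command rather than running it. The project ID and project number are placeholders, and the service account format is the one shown in the error message above.

```python
import shlex


def dm_service_account(project_number: str) -> str:
    """Default Deployment Manager service account for a project number."""
    return f"{project_number}@cloudservices.gserviceaccount.com"


def viewer_binding_command(type_project_id: str, deploying_project_number: str) -> list[str]:
    """Build (not run) a gcloud command granting roles/deploymentmanager.viewer.

    type_project_id: project that hosts the custom type provider.
    deploying_project_number: number of the project that runs the deployment.
    """
    member = f"serviceAccount:{dm_service_account(deploying_project_number)}"
    return [
        "gcloud", "projects", "add-iam-policy-binding", type_project_id,
        f"--member={member}",
        "--role=roles/deploymentmanager.viewer",
    ]


# Placeholder values; substitute your own project ID and project number.
cmd = viewer_binding_command("my-project", "123456789012")
print(shlex.join(cmd))
```

When the type and the deployment live in the same project, this binding should not be needed, because the DM service account is already an Editor there.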

@Priyankasaggu11929
Author

I am using my organisation's GCP account, and even after my user account was granted the Editor role, the permission check still fails. I'll update here once it gets solved, or if it doesn't.

Thanks a lot @ocsig for your time and help :)

@Priyankasaggu11929
Author

@ocsig

[UPDATES]

I added every available Deployment Manager role to the required service account.

screenshot_1

I re-checked the project name as well to make sure it is right. I also realised we have only one project, so there is no chance of the custom type provider living in another project. But I keep getting the same error, shown below.

screenshot_2

Nothing seems to resolve the permission error.

@ocsig
Member

ocsig commented Mar 23, 2020

Can you confirm that the redacted project ID is correct?
projects/*******/typeProviders/dataproc-v1beta is your type, where ******* has to be your project ID. (If that is not correct, that would explain everything.)

@Priyankasaggu11929
Copy link
Author

Priyankasaggu11929 commented Mar 23, 2020

Yes, the project ID is correct. I re-checked it.

Also, when I describe the custom type provider through gcloud, the selfLink looks like https://www.googleapis.com/deploymentmanager/v2beta/projects/******/global/typeProviders/dataproc-v1beta.

Is this .../projects/{project-id}/global/typeProviders/... different from .../projects/{project-id}/typeProviders/...?

screenshot_2

@ocsig
Member

ocsig commented Mar 23, 2020

It is pretty hard to debug your setup like this, so please forgive me for the super basic checks. What I am trying to verify is whether everything is happening in the same project, or whether your code points somewhere else at some point.

Let's say the project where you want to use this type has the ID abc123 and the number 123123.

Would you mind going through the following checklist and letting me know if you find any deviation?

  • gcloud config get-value project returns abc123. (This means you are querying the project you want.)
  • gcloud beta deployment-manager type-providers describe dataproc-v1beta | grep typeProviders/dataproc-v1beta returns https://www.googleapis.com/deploymentmanager/v2beta/projects/abc123/global/typeProviders/dataproc-v1beta (especially the projects/abc123/global part). (This means your type is in the right project.)
  • more dpbeta.jinja | grep type returns type: abc123/dataproc-v1beta:projects.regions.clusters. (This means your template is trying to access the custom type in the right project.)
  • In your error message you see projects/abc123/typeProviders/dataproc-v1beta.
  • In your error message you see 123123@cloudservices.gserviceaccount.com.

Please let me know which point(s) in the checklist fail and what value you see instead of your project ID/number. (Is it another ID of yours?)
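
The last two checks can also be done mechanically. Below is a hedged Python sketch that pulls the project and the service account number out of the permission error and compares them with the expected values. The regular expression is based only on the error text quoted earlier in this thread, and the sample message uses a made-up project segment, other-project.

```python
import re

# Matches the "Required ... permission for '<number>@cloudservices...' for
# resource projects/<project>/typeProviders/<provider>" error quoted above.
ERROR_RE = re.compile(
    r"permission for '(?P<number>\d+)@cloudservices\.gserviceaccount\.com"
    r"\s+for resource projects/(?P<project>[^/]+)/typeProviders/(?P<provider>[^'\s]+)"
)


def check_error(message: str, project_id: str, project_number: str) -> list[str]:
    """Return the mismatches between the error text and the expected project."""
    match = ERROR_RE.search(message)
    if match is None:
        return ["message does not look like the typeProviders.get permission error"]
    problems = []
    if match["project"] != project_id:
        problems.append(
            f"type is looked up in project {match['project']!r}, expected {project_id!r}"
        )
    if match["number"] != project_number:
        problems.append(
            f"service account belongs to project number {match['number']!r}, "
            f"expected {project_number!r}"
        )
    return problems


# Sample message with a hypothetical, wrong project segment.
sample = (
    "Required 'deploymentmanager.typeProviders.get' permission for "
    "'123123@cloudservices.gserviceaccount.com\n"
    "    for resource projects/other-project/typeProviders/dataproc-v1beta'"
)
for problem in check_error(sample, "abc123", "123123"):
    print(problem)
```

An empty result means both values in the error match the project you expect, so the cause has to be elsewhere.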

@Priyankasaggu11929
Author

Priyankasaggu11929 commented Mar 23, 2020

OK, here is more clarification.

  1. There is an organisation set up in my GCP console, abc.com (Ancestry: abc.com), with the corresponding ID 123456.
  2. Under that organisation, I have one project named abc (Ancestry: abc.com > abc), with the project ID abc123 and project number 123123.

Now, when I run through the checklist above, I get the following results.

  • gcloud config get-value project returns abc123. (This means you are querying the project you want.)
  • gcloud beta deployment-manager type-providers describe dataproc-v1beta | grep typeProviders/dataproc-v1beta returns https://www.googleapis.com/deploymentmanager/v2beta/projects/abc123/global/typeProviders/dataproc-v1beta (especially the projects/abc123/global part). (This means your type is in the right project.)
  • more dpbeta.jinja | grep type returns type: abc123/dataproc-v1beta:projects.regions.clusters. (This means your template is trying to access the custom type in the right project.)

Output: Here, instead of abc123/dataproc-v1beta:projects.regions.clusters, I get abc/dataproc-v1beta:projects.regions.clusters, i.e. the project name rather than the project ID.

  • In your error message you see projects/abc123/typeProviders/dataproc-v1beta.

Output: Again, I get projects/abc/typeProviders/dataproc-v1beta rather than the one with the project ID abc123.

  • In your error message you see 123123@cloudservices.gserviceaccount.com.

Note: If this still doesn't make sense, may I send you an email with proper screenshots or the actual information, whichever helps you best?

Thank you once again. :)

@Priyankasaggu11929
Author

@ocsig I updated the above comment.

@ocsig
Member

ocsig commented Mar 23, 2020

Great, I believe I have an understanding of the problem.

  • Your project name: abc >> This is a human-readable name that can be changed. It is not an identifier.
  • Your project ID: abc123 >> This is a globally unique ID. It can be specified at creation, has to be unique among ALL GCP projects (even outside your organization), and cannot be changed later.
  • Your project number: This is a numeric-only ID that is auto-generated when your project is created and is used in some places, such as the built-in service account names.
    (Running gcloud projects describe abc123 will display all of this information.)

The issue is that the type you would like to use is NOT abc/dataproc-v1beta:projects.regions.clusters but abc123/dataproc-v1beta:projects.regions.clusters.

Would you mind updating dpbeta.jinja so that the type uses your project ID (and not your project name)?

(And because you do not have listing permission on the project whose project ID is 'abc', you are getting a permission error.)

Let me know if this solves your problem.
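
As a rough guard against this kind of mix-up, here is a Python sketch that checks whether the project part of a type reference at least has the shape of a project ID (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen). The function names are hypothetical, and the check is only a heuristic: a project name that happens to look like an ID would still pass.

```python
import re

# Shape of a project ID: 6-30 characters, lowercase letters, digits and
# hyphens, starting with a letter and not ending with a hyphen.
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def could_be_project_id(value: str) -> bool:
    """True if the value has the shape of a project ID (not proof that it is one)."""
    return PROJECT_ID_RE.fullmatch(value) is not None


def type_ref_warnings(type_ref: str) -> list[str]:
    """Warn when the part before '/' in a type reference cannot be a project ID."""
    project = type_ref.split("/", 1)[0]
    if could_be_project_id(project):
        return []
    return [f"{project!r} does not look like a project ID; is it the project name?"]


print(type_ref_warnings("abc/dataproc-v1beta:projects.regions.clusters"))
print(type_ref_warnings("abc123/dataproc-v1beta:projects.regions.clusters"))
```

With the placeholder values from this thread, the short name abc is flagged while abc123 passes. The reliable check is still gcloud projects describe, as mentioned above.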

@Priyankasaggu11929
Author

@ocsig yes, I tried making that change a while ago.

It gives me the following error:

ERROR: (gcloud.deployment-manager.deployments.create) Error in Operation [operation-1584979015681-5a187af34c710-e90b525d-78a6b53c]: errors:
- code: RESOURCE_ERROR
  location: /deployments/dev-dataproc-67/resources/dev-dataproc-67-dataproc-cluster
  message: '{"ResourceType":"abc123/dataproc-v1beta:projects.regions.clusters","ResourceErrorCode":"401","ResourceErrorMessage":{"code":401,"message":"Request
    had invalid authentication credentials. Expected OAuth 2 access token, login cookie
    or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.","status":"UNAUTHENTICATED","statusMessage":"Unauthorized","requestPath":"https://dataproc.googleapis.com/v1beta2/projects/abc123/regions/us-west1/clusters","httpMethod":"POST"}}'

@ocsig
Member

ocsig commented Mar 23, 2020

The good news is, this is a different error now. Now we know you are using the custom type, because this error actually comes from communicating with the Dataproc API.
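
One way to see this is to decode the JSON inside the RESOURCE_ERROR message. The Python sketch below uses a shortened copy of the message quoted above (the long human-readable text is left out) and extracts the API host, HTTP method and status. The summarize helper is just an illustration.

```python
import json
from urllib.parse import urlparse

# Shortened copy of the JSON payload from the error above.
message = (
    '{"ResourceType":"abc123/dataproc-v1beta:projects.regions.clusters",'
    '"ResourceErrorCode":"401",'
    '"ResourceErrorMessage":{"code":401,"status":"UNAUTHENTICATED",'
    '"requestPath":"https://dataproc.googleapis.com/v1beta2/projects/abc123/regions/us-west1/clusters",'
    '"httpMethod":"POST"}}'
)


def summarize(raw: str) -> dict:
    """Pick out the fields that show which API rejected the request."""
    payload = json.loads(raw)
    detail = payload["ResourceErrorMessage"]
    return {
        "type": payload["ResourceType"],
        "api_host": urlparse(detail["requestPath"]).netloc,
        "method": detail["httpMethod"],
        "status": detail["status"],
    }


print(summarize(message))
```

The host is the Dataproc API and the type is the custom one, so the request got past Deployment Manager's type lookup and was rejected by Dataproc itself.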

Can you verify your type configuration contains the authentication part:

Running 'gcloud beta deployment-manager type-providers describe dataproc-v1beta' should contain this at the end:

[....]
options:
  inputMappings:
  - fieldName: Authorization
    location: HEADER
    methodMatch: .*
    value: $.concat("Bearer ", $.googleOauth2AccessToken())
[.....]

If this is missing, your type did not pick up the config file dataproc-v1beta.type.yaml (see my comment above).

@Priyankasaggu11929
Author

Priyankasaggu11929 commented Mar 23, 2020

Yes, it appears in the output.
screenshot_3

OK, I think I found the problem. Instead of value: $.concat("Bearer ", $.googleOauth2AccessToken()), I had written value: $.concat("Bearer", $.googleOauth2AccessToken()).

It is running now. I will let you know whether it works.

No errors so far.

Now I feel super funny, as all this was because of a single-space typo. :|
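
For anyone hitting the same 401: the space matters because the Authorization header needs the scheme and the token separated. Here is a small Python sketch that shows the difference with a dummy token and scans an options file's text for the wrong prefix. The regular expression and function name are my own; the check is purely textual and does not parse the YAML.

```python
import re

# Finds $.concat("<prefix>", $.googleOauth2AccessToken()) in the options text.
CONCAT_RE = re.compile(
    r'\$\.concat\(\s*"(?P<prefix>[^"]*)"\s*,\s*\$\.googleOauth2AccessToken\(\)\s*\)'
)


def authorization_prefix_ok(options_text: str) -> bool:
    """True if at least one concat is found and every one uses 'Bearer ' (with space)."""
    prefixes = [m["prefix"] for m in CONCAT_RE.finditer(options_text)]
    return bool(prefixes) and all(p == "Bearer " for p in prefixes)


good = 'value: $.concat("Bearer ", $.googleOauth2AccessToken())'
bad = 'value: $.concat("Bearer", $.googleOauth2AccessToken())'

# What the header value looks like in each case, using a dummy token.
token = "TOKEN"
print("Bearer " + token)  # scheme and token are separated by a space
print("Bearer" + token)   # scheme and token run together

print(authorization_prefix_ok(good), authorization_prefix_ok(bad))
```

Running a check like this on dataproc-v1beta.type.yaml before creating the type provider would have caught the typo early.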

@ocsig
Member

ocsig commented Mar 23, 2020

Wow, I knew there had to be a typo, but couldn't spot it.

From this point every property should be properly passed to the Dataproc API. Let me know.

( I almost dropped IT when I was 14 because of a missing ; in my PHP book...)

@Priyankasaggu11929
Author

@ocsig I can't thank you enough for being so patient with me.

It worked properly this time. PRESTO is now installed in the Dataproc cluster.

Do you want me to delete the unnecessary comments above, so that someone else can find the solution more easily?

@ocsig
Member

ocsig commented Mar 23, 2020

I was happy to help. No need to delete the comments; the debugging steps are important as well. Maybe put a short TL;DR at the top of your opening comment.

@ocsig ocsig closed this as completed Mar 23, 2020
@Priyankasaggu11929
Author

Priyankasaggu11929 commented Mar 23, 2020

Thank you once again. :)

I added the link to the solution comment at the top.
