Chaffelson
Contributor

Disable dnf AppStream module for PostgreSQL when CentOS 8 is in use
Set default CM version to 7.4.4 in cloudera.cluster
Ensure that Python2/3 are present on nodes when deploying on a RHEL8-family OS
Update utility VM used for download mirror to function correctly with parcel distribution OS
Add default_cluster tag to cluster.yml playbook CM Agent heartbeat test to avoid obscure failures
Enable selectable distribution support by setting parcel_distro in the definition to one of [el7, el8, bionic], per the dynamic OS options in cloudera.exe.infrastructure.vars
Change default dynamic inventory selection strings from 'centos7' to match the distribution identity strings of el7 etc. - user can update defaults to a different el7 distro etc. in cloudera.exe.infrastructure.vars
Add Ubuntu 18.04 'bionic' as option to dynamic inventory
Determine preferred parcel distribution in cloudera-deploy init
Add uniqueness to generated dynamic inventory VM name to reflect selection of distribution in case multiple clusters are deployed in the same account
Move dynamic inventory OS selection to globals, update appropriate reference docs to reflect change
Add filtering by distro to download mirror support, and ensure that manifest is still always collected
Enforce no_log always when working with Paywall credentials
Increase initial paywall download timeout to 7200s due to present CDN speed issues when deploying on EC2 outside of us-east-1
Modify ansible.builtin.package lock_timeout to only be used on RedHat, as it is not a Debian option
Pass selected parcel distribution to repo analysis during initial deployment when target cluster OS is not yet determined by deployment
Move cloudera.cluster plays which require knowledge of the cluster distribution to run on the cloudera_manager host instead of the Ansible controller so the correct distribution actions are applied
Fix extract_products_from_manifests filter in cloudera.cluster to correctly reference self and process os_distribution value
Fix import ordering in cloudera.cluster filters.py to not break under recent versions of Python3
Add distribution specific tasks for cloudera.cluster.deployment.repometa so it can identify the cluster distribution using the strings recognised by Cloudera Manager deployment
Force refresh of apt package cache on Debian distributions during OS prereqs setup in cloudera.cluster as the package cache in the image is sometimes missing packages
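
For illustration, the distro filtering described above could look roughly like the following. This is a minimal sketch, not the collection's actual filter: the function name `filter_parcels_by_distro`, the manifest layout (a `parcels` list whose `parcelName` entries end in `-<distro>.parcel`), and the sample parcel names are all assumptions made for the example.

```python
from typing import Dict, List


def filter_parcels_by_distro(manifest: Dict, distro: str) -> List[str]:
    """Return the parcel file names in a manifest that target the given distro.

    Assumes each entry in manifest["parcels"] has a "parcelName" that ends
    with "-<distro>.parcel" (hypothetical layout for this sketch).
    """
    suffix = f"-{distro}.parcel"
    return [
        parcel["parcelName"]
        for parcel in manifest.get("parcels", [])
        if parcel.get("parcelName", "").endswith(suffix)
    ]


# Illustrative manifest; product versions are invented for the example.
example_manifest = {
    "parcels": [
        {"parcelName": "CDH-7.1.7-el7.parcel"},
        {"parcelName": "CDH-7.1.7-el8.parcel"},
        {"parcelName": "CDH-7.1.7-bionic.parcel"},
        {"parcelName": "KEYTRUSTEE_SERVER-7.1.7-el7.parcel"},
        {"parcelName": "KEYTRUSTEE_SERVER-7.1.7-el8.parcel"},
    ]
}

if __name__ == "__main__":
    # Only parcels built for the selected distribution are kept.
    print(filter_parcels_by_distro(example_manifest, "bionic"))
```

With a filter along these lines, a product that is only published for el7 and el8 would drop out of the list for a bionic cluster rather than being requested and failing.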

@tmgstevens
Contributor

https://github.com/Chaffelson/cloudera.cluster/blob/debian_fixes/roles/security/tls_signing/tasks/csr_signing_local.yml#L25 - we're going to need to set the executable on that task to /bin/bash as exec 100 > /tmp/ca_server.lock isn't going to run on dash.
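
A rough way to reproduce the point about the shell, assuming /bin/bash exists on the host: the sketch below runs the same kind of multi-digit file descriptor redirection through an explicitly chosen shell. The helper name `write_via_fd_100` and the `LOCK_PATH` environment variable are made up for this example and are not part of the role.

```python
import os
import subprocess
import tempfile


def write_via_fd_100(lock_path: str, shell: str = "/bin/bash") -> subprocess.CompletedProcess:
    """Open lock_path on file descriptor 100 and write a marker through it.

    The shell is chosen explicitly, mirroring the idea of pinning the
    executable on the task instead of relying on the default /bin/sh.
    """
    # The path is passed through the environment to avoid quoting problems.
    script = 'exec 100>"$LOCK_PATH"; echo held >&100'
    return subprocess.run(
        [shell, "-c", script],
        env={**os.environ, "LOCK_PATH": lock_path},
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ca_server.lock")
        result = write_via_fd_100(path)
        print("exit code:", result.returncode)
```

Passing a different `shell` value (for example a path to dash, where installed) is one way to check whether that shell accepts the redirection; shells that only support single-digit descriptors are likely to report an error instead.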

@tmgstevens
Contributor

Also found that in the ca_server role on CentOS 8 you can't install PyOpenSSL using yum; it needs to be installed via pip, so we're going to need separate RHEL7/8 codepaths there.
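
The split being asked for can be sketched as a small decision function. This is only an illustration of the branching: the function name and the returned labels are hypothetical, and Debian-family handling is deliberately left out because it is not described in this comment.

```python
def pyopenssl_install_method(os_family: str, major_version: int) -> str:
    """Pick how PyOpenSSL would be installed for a RedHat-family host.

    Returns "package" where the OS package manager (yum) is expected to
    work, and "pip" where the comment above reports that it does not.
    """
    if os_family != "RedHat":
        # Other OS families are out of scope for this sketch.
        raise NotImplementedError(f"no rule defined for OS family {os_family!r}")
    if major_version >= 8:
        return "pip"
    return "package"


if __name__ == "__main__":
    for version in (7, 8):
        print(version, pyopenssl_install_method("RedHat", version))
```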

Once those changes are in I'm happy to push this

cofin and others added 7 commits November 18, 2021 15:01
- When dynamically creating the cluster template, the OS distribution is not taken into account.
- For instance, Ubuntu tries to load the KEYTRUSTEE_SERVER parcel but fails because it's only available for `el7` and `el8`

Signed-off-by: Cody Fincher <cody.fincher@gmail.com>
Signed-off-by: Cody Fincher <cody.fincher@gmail.com>
WIP
Disable dnf AppStream module for Postgresql when centos 8 in use
Setup system Python2 for CM to use

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>
Set default CM version to 7.4.4 in cloudera.cluster
Derive parcel version of el7 or el8 from inventory target OS family in cloudera.cluster.cloudera_manager.repo
Ensure that Python2/3 are present on nodes when deploying on Rhel8 family OS
Change cloudera-deploy defaults to el7, but allow el8 to be specified and recognised
Update utility VM used for download mirror to function correctly with el7 or el8 OS
Add default_cluster tag to cluster.yml playbook CM Agent heartbeat test to avoid obscure failures

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>
Enable by setting parcel_distro in definition to [el7, el8, or bionic] per dynamic OS options in cloudera.exe.infrastructure.vars
Change default dynamic inventory selection strings from 'centos7' to match the distribution identity strings of el7 etc. - user can update defaults to a different el7 distro etc. in cloudera.exe.infrastructure.vars
Add Ubuntu 18.04 'bionic' as option to dynamic inventory
Determine preferred parcel distribution in cloudera-deploy init
Add uniqueness to generated dynamic inventory VM name to reflect selection of distribution in case multiple clusters are deployed in the same account
Move dynamic inventory OS selection to globals, update appropriate reference docs to reflect change
Add filtering by distro to download mirror support, and ensure that manifest is still always collected
Enforce no_log always when working with Paywall credentials
Increase initial paywall download timeout to 7200s due to present CDN speed issues when deploying on EC2 outside of us-east-1
Modify ansible.builtin.package lock_timeout to only be used on RedHat, as it is not a Debian option
Pass selected parcel distribution to repo analysis during initial deployment when target cluster OS is not yet determined by deployment
Move cloudera.cluster plays which require knowledge of the cluster distribution to run on the cloudera_manager host instead of the Ansible controller so the correct distribution actions are applied
Fix extract_products_from_manifests filter in cloudera.cluster to correctly reference self and process os_distribution value
Fix import ordering in cloudera.cluster filters.py to not break under recent versions of Python3
Add distribution specific tasks for cloudera.cluster.deployment.repometa so it can identify the cluster distribution using the strings recognised by Cloudera Manager deployment
Force refresh of apt package cache on Debian distributions during OS prereqs setup in cloudera.cluster as the package cache in the image is sometimes missing packages

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>
Pin boto3 version <1.18 when using Python2 for s3sync to work
Set GPG to not be checked by default when deploying cm5
Set variants for cm5 and cm6/7 paths for cloudera manager URL

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>
Improve readability of cloudera manager database user tasks
Simplify variable inclusion for Debian OS by standardising for both Ubuntu18 and Ubuntu20
Replace hardcoded Ubuntu1804 version with derived version string where necessary
Improve ca_server setup to recognise python differences between el7, el8 and debian
Set rdbms setup to use psycopg2-binary instead of slightly less reliable python package install in Debian
Move os-specific configurations to top of os setup tasks to allow Python to be fixed first as dependency for other things
Add fix for Ubuntu20.04 to allow root to edit any file on the OS by default, which otherwise breaks cloudera manager database setup
Ensure python2 is symlinked to /usr/bin/python on Redhat where /usr/bin/python is not already symlinked. This fixes Ranger startup issues.
tls_signing now explicitly uses /bin/bash as executable to also work on Ubuntu20
Updated readme to reflect additionally tested OSs

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>
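
As a sketch of how a parcel distribution string such as el7, el8, bionic or focal might be derived from a host's gathered facts: the function name `parcel_distro` is hypothetical and this is not the code from these commits. It assumes the standard Ansible facts `ansible_os_family`, `ansible_distribution_major_version` and `ansible_distribution_release` are available as a plain dictionary.

```python
def parcel_distro(facts: dict) -> str:
    """Map gathered host facts to a parcel distribution string.

    RedHat-family hosts become "el<major version>"; Debian-family hosts
    use their release codename (for example "bionic" or "focal").
    """
    family = facts.get("ansible_os_family")
    if family == "RedHat":
        return "el" + str(facts["ansible_distribution_major_version"])
    if family == "Debian":
        return str(facts["ansible_distribution_release"])
    raise ValueError(f"unsupported OS family: {family!r}")


if __name__ == "__main__":
    centos8 = {"ansible_os_family": "RedHat", "ansible_distribution_major_version": "8"}
    ubuntu18 = {"ansible_os_family": "Debian", "ansible_distribution_release": "bionic"}
    print(parcel_distro(centos8), parcel_distro(ubuntu18))
```

Raising on an unknown family, rather than falling back to a default, keeps an unsupported OS from silently picking up el7 parcels.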
@Chaffelson
Contributor Author

This PR has been updated to also include support for Focal Fossa (Ubuntu 20.04) and handle automatically setting up TLS on deployments 🎉

@tmgstevens
Contributor

This looks good to me.
Dan - please could you share what scenarios you have tested, i.e. RHEL7/8, Ubuntu18/20, TLS on/off, etc?

@Chaffelson
Contributor Author

Tested with:
CentOS 7, CentOS 8, Ubuntu 18.04, Ubuntu 20.04
All with tls=True. Now that you mention it, I probably need to retest without TLS.

@tmgstevens
Contributor

Regression tested with and without TLS against Secure+HA cluster. All looking good.

@tmgstevens tmgstevens merged commit 75101ea into cloudera-labs:devel Nov 23, 2021
tmgstevens pushed a commit that referenced this pull request Nov 24, 2021
* FIX: parcel os distribution
- When dynamically creating the cluster template, the OS distribution is not taken into account.
- For instance, Ubuntu tries to load the KEYTRUSTEE_SERVER parcel but fails because it's only available for `el7` and `el8`

Signed-off-by: Cody Fincher <cody.fincher@gmail.com>

* FIX: add ansible_distribution_release to product parsing.

Signed-off-by: Cody Fincher <cody.fincher@gmail.com>

* Centos 8 support

WIP
Disable dnf AppStream module for Postgresql when centos 8 in use
Setup system Python2 for CM to use

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>

* Add EL8 support to Cloudera Collections

Set default CM version to 7.4.4 in cloudera.cluster
Derive parcel version of el7 or el8 from inventory target OS family in cloudera.cluster.cloudera_manager.repo
Ensure that Python2/3 are present on nodes when deploying on Rhel8 family OS
Change cloudera-deploy defaults to el7, but allow el8 to be specified and recognised
Update utility VM used for download mirror to function correctly with el7 or el8 OS
Add default_cluster tag to cluster.yml playbook CM Agent heartbeat test to avoid obscure failures

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>

* Add selectable distribution support for cloudera.cluster

Enable by setting parcel_distro in definition to [el7, el8, or bionic] per dynamic OS options in cloudera.exe.infrastructure.vars
Change default dynamic inventory selection strings from 'centos7' to match the distribution identity strings of el7 etc. - user can update defaults to a different el7 distro etc. in cloudera.exe.infrastructure.vars
Add Ubuntu 18.04 'bionic' as option to dynamic inventory
Determine preferred parcel distribution in cloudera-deploy init
Add uniqueness to generated dynamic inventory VM name to reflect selection of distribution in case multiple clusters are deployed in the same account
Move dynamic inventory OS selection to globals, update appropriate reference docs to reflect change
Add filtering by distro to download mirror support, and ensure that manifest is still always collected
Enforce no_log always when working with Paywall credentials
Increase initial paywall download timeout to 7200s due to present CDN speed issues when deploying on EC2 outside of us-east-1
Modify ansible.builtin.package lock_timeout to only be used on RedHat, as it is not a Debian option
Pass selected parcel distribution to repo analysis during initial deployment when target cluster OS is not yet determined by deployment
Move cloudera.cluster plays which require knowledge of the cluster distribution to run on the cloudera_manager host instead of the Ansible controller so the correct distribution actions are applied
Fix extract_products_from_manifests filter in cloudera.cluster to correctly reference self and process os_distribution value
Fix import ordering in cloudera.cluster filters.py to not break under recent versions of Python3
Add distribution specific tasks for cloudera.cluster.deployment.repometa so it can identify the cluster distribution using the strings recognised by Cloudera Manager deployment
Force refresh of apt package cache on Debian distributions during OS prereqs setup in cloudera.cluster as the package cache in the image is sometimes missing packages

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>

* Improve for CDH5 and Centos7

Pin boto3 version <1.18 when using Python2 for s3sync to work
Set GPG to not be checked by default when deploying cm5
Set variants for cm5 and cm6/7 paths for cloudera manager URL

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>

* Further multi-os improvements for clusters

Improve readability of cloudera manager database user tasks
Simplify variable inclusion for Debian OS by standardising for both Ubuntu18 and Ubuntu20
Replace hardcoded Ubuntu1804 version with derived version string where necessary
Improve ca_server setup to recognise python differences between el7, el8 and debian
Set rdbms setup to use psycopg2-binary instead of slightly less reliable python package install in Debian
Move os-specific configurations to top of os setup tasks to allow Python to be fixed first as dependency for other things
Add fix for Ubuntu20.04 to allow root to edit any file on the OS by default, which otherwise breaks cloudera manager database setup
Ensure python2 is symlinked to /usr/bin/python on Redhat where /usr/bin/python is not already symlinked. This fixes Ranger startup issues.
tls_signing now explicitly uses /bin/bash as executable to also work on Ubuntu20
Updated readme to reflect additionally tested OSs

Signed-off-by: Daniel Chaffelson <chaffelson@gmail.com>

Co-authored-by: Cody Fincher <cody.fincher@gmail.com>

Fixes #32
WillDyson pushed a commit to WillDyson/cloudera.cluster that referenced this pull request Jul 15, 2022
…bs#35)

Fixes cloudera-labs#32

Signed-off-by: William Dyson <wdyson@cloudera.com>