diff --git a/circle.yml b/circle.yml index 0a984e474..a1163e3e7 100644 --- a/circle.yml +++ b/circle.yml @@ -25,7 +25,6 @@ deployment: branch: master commands: - make upload-snapshot - - eval $(docker run gliderlabs/pagebuilder circleci-cmd) release: branch: release diff --git a/docs/accountmgmt.md b/docs/accountmgmt.md deleted file mode 100644 index 8cc5ca866..000000000 --- a/docs/accountmgmt.md +++ /dev/null @@ -1,24 +0,0 @@ -## Account management - -Cloudbreak defines three distinct roles: - -1. DEPLOYER -2. ACCOUNT_ADMIN -3. ACCOUNT_USER - -### Cloudbreak deployer -This is the `master` role - the user who is created during the deployment process will have this role. - -### Account admin -We have introduced the notion of accounts - and with that comes an administrator role. Upon registration a user will become an account administrator. - -The extra rights associated with the account admin role are: - -* Invite users to join the account -* Share account-wide resources (credentials, blueprints, templates) -* See resources created by account users -* Monitor clusters started by account users -* Management and reporting tool available - -### Account user
An account user is a user who has been invited to join Cloudbreak by an account administrator. Account users' activity will show up in the management and reporting tool for account-wide statistics - accessible by the account administrator. Apart from common account-wide resources, account users can manage their own private resources. diff --git a/docs/api.md b/docs/api.md deleted file mode 100644 index 3b9a5a84c..000000000 --- a/docs/api.md +++ /dev/null @@ -1,5 +0,0 @@ -## API documentation - - Cloudbreak is a RESTful application development platform with the goal of helping developers build solutions for deploying HDP clusters in different environments. Once it is deployed in your favourite servlet container it exposes a REST API that allows you to spin up Hadoop clusters of arbitrary size on different cloud providers. - -The [API documentation](https://cloudbreak-api.sequenceiq.com/api/index.html) is generated from the code using [Swagger](http://swagger.io/). diff --git a/docs/aws-image.md b/docs/aws-image.md deleted file mode 100644 index b0a1b0fb6..000000000 --- a/docs/aws-image.md +++ /dev/null @@ -1,20 +0,0 @@ -# AWS Cloud Images - -We have pre-built cloud images for AWS with the Cloudbreak Deployer pre-installed. You can launch the latest Cloudbreak Deployer image for your region at the [AWS Management Console](https://aws.amazon.com/console/). - -> Alternatively, instead of using the pre-built cloud images for AWS, you can install Cloudbreak Deployer on your own VM. See [install the Cloudbreak Deployer](onprem.md) for more information. - -Make sure you have opened the following ports on your virtual machine: - - * SSH (22) - * Ambari (8080) - * Identity server (8089) - * Cloudbreak GUI (3000) - * User authentication (3001) - -### AWS Image Details - -## Setup Cloudbreak Deployer - -Once you have the Cloudbreak Deployer installed, proceed to [Setup Cloudbreak Deployer](aws.md). diff --git a/docs/aws.md b/docs/aws.md deleted file mode 100644 index 91902df64..000000000 --- a/docs/aws.md +++ /dev/null @@ -1,73 +0,0 @@ -# AWS Setup - -## Setup Cloudbreak Deployer - -If you already have Cloudbreak Deployer either by [using the AWS Cloud Images](aws-image.md) or by [installing the Cloudbreak Deployer](onprem.md) manually on your own VM, -you can start to set up the Cloudbreak application with the deployer.
- -Create and open the `cloudbreak-deployment` directory: - -``` -cd cloudbreak-deployment -``` - -This is the directory for the config files and the supporting binaries that will be downloaded by Cloudbreak deployer. - -### Initialize your Profile - -First initialize cbd by creating a `Profile` file: - -``` -cbd init -``` - -It will create a `Profile` file in the current directory. Please edit the file - one of the required configurations is the `PUBLIC_IP`. -This IP will be used to access the Cloudbreak UI (called Uluwatu). In some cases the `cbd` tool tries to guess it; if it can't, it will give a hint. - -The other required configuration in the `Profile` is the set of AWS keys belonging to the AWS account used by the Cloudbreak application. -In order for Cloudbreak to be able to launch clusters on AWS on your behalf you need to set your AWS keys in the `Profile` file. -We suggest using the keys of an *IAM User* here. The IAM User's policies must be configured to have permission to assume roles (`sts:AssumeRole`) on all (`*`) resources. - -``` -export AWS_ACCESS_KEY_ID=AKIA**************W7SA -export AWS_SECRET_ACCESS_KEY=RWCT4Cs8******************/*skiOkWD -``` - -You can learn more about the concepts used by Cloudbreak with AWS accounts in the [prerequisites chapter](aws_pre_prov.md). - -### Generate your Profile - -You are done with the initialization of Cloudbreak deployer. The last thing you have to do is regenerate the configurations so that your changes take effect. - -``` -rm *.yml -cbd generate -``` - -This command applies the following steps: - -- creates the **docker-compose.yml** file that describes the configuration of all the Docker containers needed for the Cloudbreak deployment. -- creates the **uaa.yml** file that holds the configuration of the identity server used to authenticate users to Cloudbreak. - -### Start Cloudbreak - -To start the Cloudbreak application use the following command. -This will start all the Docker containers and initialize the application. It will take a few minutes until all the services start. - -``` -cbd start -``` - ->The first launch will take more time as it downloads all the Docker images needed by Cloudbreak. - -After the `cbd start` command finishes you can check the logs of the Cloudbreak server with this command: - -``` -cbd logs cloudbreak -``` ->The Cloudbreak server should start within a minute - you should see a line like this: `Started CloudbreakApplication in 36.823 seconds` - -### Next steps - -Once Cloudbreak is up and running you should check out the [Provisioning Prerequisites](aws_pre_prov.md) needed to create AWS -clusters with Cloudbreak. diff --git a/docs/aws_cb_shell.md b/docs/aws_cb_shell.md deleted file mode 100644 index 6fc87d855..000000000 --- a/docs/aws_cb_shell.md +++ /dev/null @@ -1,250 +0,0 @@ -## Interactive mode - -Start the shell with `cbd util cloudbreak-shell`. This will launch the Cloudbreak shell inside a Docker container and you are ready to start using it. - -You have to copy the files you would like to use from the shell into the cbd working directory. For example if your `cbd` working directory is `~/prj/cbd` then copy your blueprint and public ssh key file into this directory. You can refer to these files by their names from the shell. - -### Create a cloud credential - -In order to start using Cloudbreak you will need to have an AWS cloud credential configured.
- ->**Note** that Cloudbreak **does not** store your cloud user details - we work around the concept of [IAM](http://aws -.amazon.com/iam/) - on Amazon (or other cloud providers) you will have to create an IAM role, a policy and associate that with your Cloudbreak account. - -``` -credential create --EC2 --description "description" --name my-aws-credential --roleArn --sshKeyPath -``` - -Alternatively you can upload your public key from an url as well, by using the `—sshKeyUrl` switch. You can check whether the credential was created successfully by using the `credential list` command. You can switch between your cloud credentials - when you’d like to use one and act with that you will have to use: - -``` -credential select --name my-aws-credential -``` - -### Create a template - -A template gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related resources, maintaining and updating them in an orderly and predictable fashion. A template can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack). - -``` -template create --EC2 --name awstemplate --description aws-template --instanceType M3Xlarge --volumeSize 100 --volumeCount 2 -``` -You can check whether the template was created successfully by using the `template list` or `template show` command. - -You can delete your cloud template - when you’d like to delete one you will have to use: -``` -template delete --name awstemplate -``` - -### Create or select a blueprint - -You can define Ambari blueprints with cloudbreak-shell: - -``` -blueprint add --name myblueprint --description myblueprint-description --file -``` - -Other available options: - -`--url` the url of the blueprint - -`--publicInAccount` flags if the network is public in the account - -We ship default Ambari blueprints with Cloudbreak. You can use these blueprints or add yours. To see the available blueprints and use one of them please use: - -``` -blueprint list - -blueprint select --name hdp-small-default -``` - -### Create a network - -A network gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related networking, maintaining and updating them in an orderly and predictable fashion. A network can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack). - -``` -network create --AWS --name awsnetwork --description aws-network --subnet 10.0.0.0/16 -``` - -Other available options: - -`--vpcID` your existing vpc on amazon - -`--internetGatewayID` your amazon internet gateway of the given VPC - -`--publicInAccount` flags if the network is public in the account - -There is a default network with name `default-aws-network`. If we use this for cluster creation, Cloudbreak will create a new VPC with 10.0.0.0/16 subnet. - -You can check whether the network was created successfully by using the `network list` command. Check the network and select it if you are happy with it: - -``` -network show --name awsnetwork - -network select --name awsnetwork -``` - -### Create a security group - -A security group gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related security rules. 
- -``` -securitygroup create --name secgroup_example --description securitygroup-example --rules 0.0.0.0/0:tcp:8080,9090;10.0.33.0/24:tcp:1234,1235 -``` - -You can check whether the security group was created successfully by using the `securitygroup list` command. Check the security group and select it if you are happy with it: - -``` -securitygroup show --name secgroup_example - -securitygroup select --name secgroup_example -``` - -There are two default security groups defined: `all-services-port` and `only-ssh-and-ssl` - -`only-ssh-and-ssl:` all ports are locked down except for SSH and gateway HTTPS (you can't access Hadoop services outside of the VPC) - -* SSH (22) -* HTTPS (443) - -`all-services-port:` all Hadoop services and SSH/gateway HTTPS are accessible by default: - -* SSH (22) -* HTTPS (443) -* Ambari (8080) -* Consul (8500) -* NN (50070) -* RM Web (8088) -* Scheduler (8030RM) -* IPC (8050RM) -* Job history server (19888) -* HBase master (60000) -* HBase master web (60010) -* HBase RS (16020) -* HBase RS info (60030) -* Falcon (15000) -* Storm (8744) -* Hive metastore (9083) -* Hive server (10000) -* Hive server HTTP (10001) -* Accumulo master (9999) -* Accumulo Tserver (9997) -* Atlas (21000) -* KNOX (8443) -* Oozie (11000) -* Spark HS (18080) -* NM Web (8042) -* Zeppelin WebSocket (9996) -* Zeppelin UI (9995) -* Kibana (3080) -* Elasticsearch (9200) - -### Configure instance groups - -You have to configure the instancegroups before the provisioning. An instancegroup is defining a group of your nodes with a specified template. Usually we create instancegroups for the hostgroups defined in the blueprints. - -``` -instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName minviable-aws -``` - -Other available option: - -`--templateId` Id of the template - -### Create a Hadoop cluster -You are almost done - two more command and this will create your Hadoop cluster on your favorite cloud provider. Same as the API, or UI this will use your `credential`, `instancegroups`, `network`, `securitygroup`, and by using CloudFormation will launch a cloud stack -``` -stack create --name my-first-stack --region US_EAST_1 -``` -Once the `stack` is up and running (cloud provisioning is done) it will use your selected `blueprint` and install your custom Hadoop cluster with the selected components and services. -``` -cluster create --description "my first cluster" -``` -You are done - you can check the progress through the Ambari UI. If you log back to Cloudbreak UI you can check the progress over there as well, and learn the IP address of Ambari. - -### Stop/Restart cluster and stack -You have the ability to **stop your existing stack then its cluster** if you want to suspend the work on it. - -Select a stack for example with its name: -``` -stack select --name my-stack -``` -Other available option to define a stack is its `--id` (instead of the `--name`). - -Apply the following commands to stop the previously selected stack: -``` -cluster stop -stack stop -``` ->**Important!** The related cluster should be stopped before you can stop the stack. 
- - -Apply the following command to **restart the previously selected and stopped stack**: -``` -stack start -``` -After the selected stack has restarted, you can **restart the related cluster as well**: -``` -cluster start -``` - -### Upscale/Downscale cluster and stack -You can **upscale your selected stack** if you need more instances to your infrastructure: -``` -stack node --ADD --instanceGroup host_group_slave_1 --adjustment 6 -``` -Other available options: - -`--withClusterUpScale` indicates cluster upscale after stack upscale -or you can upscale the related cluster separately as well: -``` -cluster node --ADD --hostgroup host_group_slave_1 --adjustment 6 -``` - - -Apply the following command to **downscale the previously selected stack**: -``` -stack node --REMOVE --instanceGroup host_group_slave_1 --adjustment -2 -``` -and the related cluster separately: -``` -cluster node --REMOVE --hostgroup host_group_slave_1 --adjustment -2 -``` - -## Silent mode - -With Cloudbreak shell you can execute script files as well. A script file contains cloudbreak shell commands and can be executed with the `script` cloudbreak shell command - -``` -script -``` - -or with the `cbd util cloudbreak-shell-quiet` cbd command: - -``` -cbd util cloudbreak-shell-quiet < example.sh -``` - -## Example - -The following example creates a hadoop cluster with `hdp-small-default` blueprint on M3Xlarge instances with 2X100G attached disks on `default-aws-network` network using `all-services-port` security group. You should copy your ssh public key file into your cbd working directory with name `id_rsa.pub` and change the `` part with your arn role. - -``` -credential create --EC2 --description description --name my-aws-credential --roleArn --sshKeyPath id_rsa.pub -credential select --name my-aws-credential -template create --EC2 --name awstemplate --description aws-template --instanceType M3Xlarge --volumeSize 100 --volumeCount 2 -blueprint select --name hdp-small-default -instancegroup configure --instanceGroup cbgateway --nodecount 1 --templateName awstemplate -instancegroup configure --instanceGroup host_group_master_1 --nodecount 1 --templateName awstemplate -instancegroup configure --instanceGroup host_group_master_2 --nodecount 1 --templateName awstemplate -instancegroup configure --instanceGroup host_group_master_3 --nodecount 1 --templateName awstemplate -instancegroup configure --instanceGroup host_group_client_1 --nodecount 1 --templateName awstemplate -instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName awstemplate -network select --name default-aws-network -securitygroup select --name all-services-port -stack create --name my-first-stack --region US_EAST_1 -cluster create --description "My first cluster" -``` - -## Next steps - -Congrats! Your cluster should now be up and running. To learn more about it we have some [interesting insights](insights.md) about Cloudbreak clusters. diff --git a/docs/aws_cb_ui.md b/docs/aws_cb_ui.md deleted file mode 100644 index 1c6152651..000000000 --- a/docs/aws_cb_ui.md +++ /dev/null @@ -1,196 +0,0 @@ -#Provisioning via Browser - -You can log into the Cloudbreak application at http://PUBLIC_IP:3000. - -The main goal of the Cloudbreak UI is to easily create clusters on your own cloud provider account. -This description details the AWS setup - if you'd like to use a different cloud provider check out its manual. 
- -This document explains the four steps that need to be followed to create Cloudbreak clusters from the UI: - -- connect your AWS account with Cloudbreak -- create some template resources on the UI that describe the infrastructure of your clusters -- create a blueprint that describes the HDP services in your clusters and add some recipes for customization -- launch the cluster itself based on these template resources - -## Setting up AWS credentials - -Cloudbreak works by connecting your AWS account through so called *Credentials*, and then uses these credentials to create resources on your behalf. -The credentials can be configured on the "manage credentials". - -Add a `name` and a `description` for the credential, copy your IAM role's Amazon Resource Name (ARN) to the corresponding field (`IAM Role ARN`) and copy your SSH public key to the `SSH public key` field. -To learn more about how to setup the IAM Role on your AWS account check out the [prerequisites](aws_pre_prov.md). - -The SSH public key must be in OpenSSH format and it's private keypair can be used later to [SSH onto every instance](http://sequenceiq.com/cloudbreak-deployer/1.1.0/insights/#ssh-to-the-host) of every cluster you'll create with this credential. -The SSH username for the EC2 instances is **ec2-user**. - -There is a last option called `Public in account` - it means that all the users belonging to your account will be able to use this credential to create clusters, but cannot delete or modify it. - -![](/images/aws-credential.png) - -## Infrastructure templates - -After your AWS account is linked to Cloudbreak you can start creating templates that describe your clusters' infrastructure: - -- resources -- networks -- security groups - -When you create a template, Cloudbreak *doesn't make any requests* to AWS. -Resources are only created on AWS after the `create cluster` button is pushed. -These templates are saved to Cloudbreak's database and can be reused with multiple clusters to describe the infrastructure. - -**Resources** - -Resources describe the instances of your cluster - the instance type and the attached volumes. -A typical setup is to combine multiple resources in a cluster for the different types of nodes. -For example you may want to attach multiple large disks to the datanodes or have memory optimized instances for Spark nodes. - -There are some additional configuration options here: - -- `Spot price (USD)` is not mandatory, if specified Cloudbreak will request spot price instances (which might take a while or never be fulfilled by Amazon). This option is *not supported* by the default RedHat images. -- `EBS encryption` is supported for all volume types. If this option is checked then all the attached disks [will be encrypted](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSEncryption.html) by Amazon using the AWS KMS master keys. -- If `Public in account` is checked all the users belonging to your account will be able to use this resource to create clusters, but cannot delete or modify it. - -![](/images/aws-resources.png) - -**Networks** - -Your clusters can be created in their own Virtual Private Cloud (VPC) or in one of your already existing VPCs. -Currently Cloudbreak creates a new subnet in both cases, in a later release it may change. -The subnet's IP range must be defined in the `Subnet (CIDR)` field using the general CIDR notation. - -If you don't want to use your already existing VPC, you can use the default network (`default-aws-network`) for all your clusters. 
-It will create a new VPC with a `10.0.0.0/16` subnet every time a cluster is created. - -If you'd like to deploy a cluster to an already existing VPC you'll have to create a new network template where you configure the identifier of your VPC and the internet gateway (IGW) that's attached to the VPC. -In this case you'll have to create a different network template for every one of your clusters because the Subnet CIDR cannot overlap an already existing subnet in the VPC. -For example you can create 3 different clusters with 3 different network templates for the subnets `10.0.0.0/24`, `10.0.1.0/24`, `10.0.2.0/24` but with the same VPC and IGW identifiers. - ->**Important** Please make sure that the subnet you define here doesn't overlap with any of your already deployed -subnets in the VPC because the validation only happens after the cluster creation starts. - -If `Public in account` is checked all the users belonging to your account will be able to use this network template to create clusters, but cannot delete or modify it. - ->**Note** that the VPCs, IGWs and/or subnets are *not created* on AWS after the `Create Network` button is pushed, -only after the cluster provisioning starts with the selected network template. - -![](/images/aws-network.png) - -**Security groups** - -Security group templates are very similar to the security groups on the AWS Console. -They describe the allowed inbound traffic to the instances in the cluster. -Currently only one security group template can be selected for a Cloudbreak cluster and all the instances have a public IP address so all the instances in the cluster will belong to the same security group. -This may change in a later release. - -You can define your own security group by adding all the ports, protocols and CIDR range you'd like to use. 443 needs to be there in every security group otherwise Cloudbreak won't be able to communicate with the provisioned cluster. The rules defined here doesn't need to contain the internal rules, those are automatically added by Cloudbreak to the security group on AWS. - -You can also use the two pre-defined security groups in Cloudbreak: - -`only-ssh-and-ssl:` all ports are locked down except for SSH and gateway HTTPS (you can't access Hadoop services outside of the VPC): - -* SSH (22) -* HTTPS (443) - -`all-services-port:` all Hadoop services and SSH/gateway HTTPS are accessible by default: - -* SSH (22) -* HTTPS (443) -* Ambari (8080) -* Consul (8500) -* NN (50070) -* RM Web (8088) -* Scheduler (8030RM) -* IPC (8050RM) -* Job history server (19888) -* HBase master (60000) -* HBase master web (60010) -* HBase RS (16020) -* HBase RS info (60030) -* Falcon (15000) -* Storm (8744) -* Hive metastore (9083) -* Hive server (10000) -* Hive server HTTP (10001) -* Accumulo master (9999) -* Accumulo Tserver (9997) -* Atlas (21000) -* KNOX (8443) -* Oozie (11000) -* Spark HS (18080) -* NM Web (8042) -* Zeppelin WebSocket (9996) -* Zeppelin UI (9995) -* Kibana (3080) -* Elasticsearch (9200) - -If `Public in account` is checked all the users belonging to your account will be able to use this security group template to create clusters, but cannot delete or modify it. - ->**Note** that the security groups are *not created* on AWS after the `Create Security Group` button is pushed, only -after the cluster provisioning starts with the selected security group template. - -![](/images/ui-secgroup.png) - -## Defining cluster services - -**Blueprints** - -Blueprints are your declarative definition of a Hadoop cluster. 
These are the same blueprints that are [used by Ambari](https://cwiki.apache.org/confluence/display/AMBARI/Blueprints). - -You can use the 3 default blueprints pre-defined in Cloudbreak or you can create your own. -Blueprints can be added from an URL (an example [blueprint](https://github.com/sequenceiq/ambari-rest-client/raw/1.6.0/src/main/resources/blueprints/multi-node-hdfs-yarn)) or the whole JSON can be copied to the `Manual copy` field. - -The hostgroups added in the JSON will be mapped to a set of instances when starting the cluster and the services and components defined in the hostgroup will be installed on the corresponding nodes. -It is not necessary to define all the configuration fields in the blueprints - if a configuration is missing, Ambari will fill that with a default value. -The configurations defined in the blueprint can also be modified later from the Ambari UI. - -If `Public in account` is checked all the users belonging to your account will be able to use this blueprint to create clusters, but cannot delete or modify it. - -![](/images/ui-blueprints.png) - -A blueprint can be exported from a running Ambari cluster that can be reused in Cloudbreak with slight modifications. -There is no automatic way to modify an exported blueprint and make it instantly usable in Cloudbreak, the modifications have to be done manually. -When the blueprint is exported some configurations will have hardcoded for example domain names, or memory configurations that won't be applicable to the Cloudbreak cluster. - -**Cluster customization** - -Sometimes it can be useful to define some custom scripts that run during cluster creation and add some additional functionality. -For example it can be a service you'd like to install but it's not supported by Ambari or some script that automatically downloads some data to the necessary nodes. -The most notable example is Ranger setup: it has a prerequisite of a running database when Ranger Admin is installing. -A PostgreSQL database can be easily started and configured with a recipe before the blueprint installation starts. - -To learn more about these so called *Recipes*, and to check out the Ranger database recipe, take a look at the [Cluster customization](recipes.md) part of the documentation. - - -## Cluster deployment - -After all the templates are configured you can deploy a new HDP cluster. Start by selecting a previously created AWS credential in the header. -Click on `create cluster`, give it a `name`, select a `Region` where the cluster infrastructure will be provisioned and select one of the `Networks` and `Security Groups` created earlier. -After you've selected a `Blueprint` as well you should be able to configure the `Template resources` and the number of nodes for all of the hostgroups in the blueprint. - -If `Public in account` is checked all the users belonging to your account will be able to see the newly created cluster on the UI, but cannot delete or modify it. - -If `Enable security` is checked as well, Cloudbreak will install Key Distribution Center (KDC) and the cluster will be Kerberized. See more about it in the [Kerberos](kerberos.md) section of this documentation. - -After the `create and start cluster` button is pushed Cloudbreak will start to create resources on your AWS account. -Cloudbreak uses *CloudFormation* to create the resources - you can check out the resources created by Cloudbreak on the AWS Console under the CloudFormation page. - ->**Important** Always use Cloudbreak to delete the cluster. 
If that fails for some reason always try to delete the -CloudFormation stack first. -Instances are started in an Auto Scaling Group so they may be restarted if you terminate an instance manually! - -**Advanced options** - -There are some advanced features when deploying a new cluster, these are the following: - -`Availability Zone`: You can restrict the instances to a specific availability zone. It may be useful if you're using reserved instances. - -`Use dedicated instances:` You can use [dedicated instances](https://aws.amazon.com/ec2/purchasing-options/dedicated-instances/) on EC2 - -`Minimum cluster size:` the provisioning strategy in case of the cloud provider cannot allocate all the requested nodes - -`Validate blueprint:` feature to validate the Ambari blueprint. By default is switched on. - -## Next steps - -Congrats! Your cluster should now be up and running. To learn more about it we have some [interesting insights](insights.md) about Cloudbreak clusters. diff --git a/docs/aws_pre_prov.md b/docs/aws_pre_prov.md deleted file mode 100644 index 87d0d9f0c..000000000 --- a/docs/aws_pre_prov.md +++ /dev/null @@ -1,73 +0,0 @@ -# Provisioning Prerequisites - -## IAM role setup - -Cloudbreak works by connecting your AWS account through so called *Credentials*, and then uses these credentials to create resources on your behalf. - ->**Important** Cloudbreak deployment uses two different AWS accounts for two different purposes: - -- The account belonging to the *Cloudbreak webapp* itself that acts as a *third party* that creates resources on the account of the *end-user*. This account is configured at server-deployment time. -- The account belonging to the *end user* who uses the UI or the Shell to create clusters. This account is configured when setting up credentials. - -These two accounts are usually *the same* when the end user is the same who deployed the Cloudbreak server, but it allows Cloudbreak to act as a SaaS project as well if needed. - -Credentials use [IAM Roles](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html) to give access to the third party to act on behalf of the end user without giving full access to your resources. -This IAM Role will be *assumed* later by the deployment account. -This section is about how to setup the IAM role, to see how to setup the *deployment* account check out [this description](aws.md). - -To connect your (*end user*) AWS account with a credential in Cloudbreak you'll have to create an IAM role on your AWS account that is configured to allow the third-party account to access and create resources on your behalf. -The easiest way to do this is with cbd commands (but it can also be done manually from the [AWS Console](https://console.aws.amazon.com)): - -``` -cbd aws generate-role - Generates an AWS IAM role for Cloudbreak provisioning on AWS -cbd aws show-role - Show assumers and policies for an AWS role -cbd aws delete-role - Deletes an AWS IAM role, removes all inline policies -``` - -The `generate-role` command creates a role that is assumable by the Cloudbreak Deployer AWS account and has a broad policy setup. -By default the `generate-role` command creates a role with the name `cbreak-deployer`. 
-If you'd like to create the role with a different name or if you'd like to create multiple roles then the role's name can be changed by adding this line to your `Profile`: - -``` -export AWS_ROLE_NAME=my-cloudbreak-role -``` - -You can check the generated role on your AWS console, under IAM roles: - -![](/images/aws-iam-role.png) - -## Generate a new SSH key - -All the instances created by Cloudbreak are configured to allow key-based SSH, -so you'll need to provide an SSH public key that can be used later to SSH onto the instances in the clusters you'll create with Cloudbreak. -You can use one of your existing keys or you can generate a new one. - -To generate a new SSH keypair: - -``` -ssh-keygen -t rsa -b 4096 -C "your_email@example.com" -# Creates a new ssh key, using the provided email as a label -# Generating public/private rsa key pair. -``` - -``` -# Enter file in which to save the key (/Users/you/.ssh/id_rsa): [Press enter] -You'll be asked to enter a passphrase, but you can leave it empty. - -# Enter passphrase (empty for no passphrase): [Type a passphrase] -# Enter same passphrase again: [Type passphrase again] -``` - -After you enter a passphrase the keypair is generated. The output should look something like below. -``` -# Your identification has been saved in /Users/you/.ssh/id_rsa. -# Your public key has been saved in /Users/you/.ssh/id_rsa.pub. -# The key fingerprint is: -# 01:0f:f4:3b:ca:85:sd:17:sd:7d:sd:68:9d:sd:a2:sd your_email@example.com -``` - -Later you'll need to pass the `.pub` file's contents to Cloudbreak and use the private part to SSH to the instances. - -## Next steps - -After your IAM role is configured and you have an SSH key you can move on to create clusters on the [UI](aws_cb_ui.md) or with the [Shell](aws_cb_shell.md). \ No newline at end of file diff --git a/docs/azure.md b/docs/azure.md deleted file mode 100644 index 48a3a17ba..000000000 --- a/docs/azure.md +++ /dev/null @@ -1,206 +0,0 @@ -# Azure Setup - -## Setup Cloudbreak Deployer - -To install and configure the Cloudbreak Deployer on Azure, start -an [OpenLogic 7.1](https://azure.microsoft.com/en-in/marketplace/partners/openlogic/centosbased71/) VM on Azure. - -Make sure you opened the following ports: - - * SSH (22) - * Ambari (8080) - * Identity server (8089) - * Cloudbreak GUI (3000) - * User authentication (3001) - -Please log in to the machine with SSH or use username and password authentication (the following example shows how to ssh into the machine): - -``` -ssh -i @ -``` - -Assume **root** privileges with this command: - -``` -sudo su -``` - -Configure the correct yum repository on the machine: - -``` -cat > /etc/yum.repos.d/CentOS-Base.repo <<"EOF" -# CentOS-Base.repo -# -# The mirror system uses the connecting IP address of the client and the -# update status of each mirror to pick mirrors that are updated to and -# geographically close to the client. You should use this for CentOS updates -# unless you are manually picking other mirrors. -# -# If the mirrorlist= does not work for you, as a fall back you can try the -# remarked out baseurl= line instead. 
-# -# - -[base] -name=CentOS-$releasever - Base -mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra -#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/ -gpgcheck=1 -gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 - -#released updates -[updates] -name=CentOS-$releasever - Updates -mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates&infra=$infra -#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/ -gpgcheck=1 -gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 - -#additional packages that may be useful -[extras] -name=CentOS-$releasever - Extras -mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras&infra=$infra -#baseurl=http://mirror.centos.org/centos/$releasever/extras/$basearch/ -gpgcheck=1 -gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 - -#additional packages that extend functionality of existing packages -[centosplus] -name=CentOS-$releasever - Plus -mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=centosplus&infra=$infra -#baseurl=http://mirror.centos.org/centos/$releasever/centosplus/$basearch/ -gpgcheck=1 -enabled=0 -gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 -EOF -``` - -Install the correct version of **kernel**, **kernel-tools** and **systemd**: - -``` -yum install -y kernel-3.10.0-229.14.1.el7 kernel-tools-3.10.0-229.14.1.el7 systemd-208-20.el7_1.6 -``` - -To permanently disable **SELinux** set SELINUX=disabled in /etc/selinux/config This ensures that SELinux does not turn itself on after you reboot the machine: - -``` -setenforce 0 && sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config -``` - -You need to install `iptables-services`, otherwise the `iptables save` command will not be available: - -``` -yum -y install iptables-services net-tools unzip -``` - -Please configure your `iptables` on your machine: - -``` -iptables --flush INPUT && \ -iptables --flush FORWARD && \ -service iptables save && \ -sed -i 's/net.ipv4.ip_forward = 0/net.ipv4.ip_forward = 1/g' /etc/sysctl.conf -``` - -Configure a custom Docker repository for installing the correct version of Docker: - -``` -cat > /etc/yum.repos.d/docker.repo <<"EOF" -[dockerrepo] -name=Docker Repository -baseurl=https://yum.dockerproject.org/repo/main/centos/7 -enabled=1 -gpgcheck=1 -gpgkey=https://yum.dockerproject.org/gpg -EOF -``` - -Then you are able to install the Docker service: - -``` -yum install -y docker-engine-1.8.3 -``` - -Configure your installed Docker service: - -``` -cat > /usr/lib/systemd/system/docker.service <<"EOF" -[Unit] -Description=Docker Application Container Engine -Documentation=https://docs.docker.com -After=network.target docker.socket cloud-final.service -Requires=docker.socket -Wants=cloud-final.service - -[Service] -ExecStart=/usr/bin/docker -d -H fd:// -H tcp://0.0.0.0:2376 --selinux-enabled=false --storage-driver=devicemapper --storage-opt=dm.basesize=30G -MountFlags=slave -LimitNOFILE=200000 -LimitNPROC=16384 -LimitCORE=infinity - -[Install] -WantedBy=multi-user.target -EOF -``` - -Remove docker folder and restart Docker service: - -``` -rm -rf /var/lib/docker && systemctl daemon-reload && service docker start && systemctl enable docker.service -``` - -Download **cloudbreak-deployer**: - -``` -curl https://raw.githubusercontent.com/sequenceiq/cloudbreak-deployer/master/install-latest | sh && cbd --version -``` - -### Initialize your Profile - -First initialize cbd by creating a 
`Profile` file: - -``` -cbd init -``` - -It will create a `Profile` file in the current directory. Please edit the file - the only required -configuration is the `PUBLIC_IP`. This IP will be used to access the Cloudbreak UI -(called Uluwatu). In some cases the `cbd` tool tries to guess it; if it can't, it will give a hint. - -### Generate your Profile - -You are done with the configuration of Cloudbreak deployer. The last thing you have to do is regenerate the configurations so that your changes take effect. - -``` -rm *.yml -cbd generate -``` - -This command applies the following steps: - -- creates the **docker-compose.yml** file that describes the configuration of all the Docker containers needed for the Cloudbreak deployment. -- creates the **uaa.yml** file that holds the configuration of the identity server used to authenticate users to Cloudbreak. - -### Start Cloudbreak - -To start the Cloudbreak application use the following command. -This will start all the Docker containers and initialize the application. It will take a few minutes until all the services start. - -``` -cbd start -``` - ->The first launch will take more time as it downloads all the Docker images needed by Cloudbreak. - -After the `cbd start` command finishes you can check the logs of the Cloudbreak server with this command: - -``` -cbd logs cloudbreak -``` ->The Cloudbreak server should start within a minute - you should see a line like this: `Started CloudbreakApplication in 36.823 seconds` - -### Next steps - -Once Cloudbreak is up and running you should check out the [Provisioning Prerequisites](azure_pre_prov.md) needed to create Azure -clusters with Cloudbreak. \ No newline at end of file diff --git a/docs/azure_cb_shell.md b/docs/azure_cb_shell.md deleted file mode 100644 index af711fa63..000000000 --- a/docs/azure_cb_shell.md +++ /dev/null @@ -1,253 +0,0 @@ -## Interactive mode - -Start the shell with `cbd util cloudbreak-shell`. This will launch the Cloudbreak shell inside a Docker container and you are ready to start using it. - -You have to copy the files you would like to use from the shell into the cbd working directory. For example if your `cbd` working directory is `~/prj/cbd` then copy your blueprint and public ssh key file into this directory. You can refer to these files by their names from the shell. - -### Create a cloud credential - -``` -credential create --AZURE --description "credential description" --name myazurecredential --subscriptionId --appId --tenantId --password --sshKeyPath -``` - -> Since version 1.0.4 Cloudbreak supports a plain RSA public key instead of an X509 certificate file. - -Alternatively you can upload your public key from a URL as well, by using the `--sshKeyUrl` switch. You can check whether the credential was created successfully by using the `credential list` command. -You can switch between your cloud credentials - when you’d like to use one and act with it, you will have to use: -``` -credential select --name myazurecredential -``` - -You can delete your cloud credential - when you’d like to delete one you will have to use: -``` -credential delete --name myazurecredential -``` - -You can show your cloud credential - when you’d like to show one you will have to use: -``` -credential show --name myazurecredential -``` - -### Create a template - -A template gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related resources, maintaining and updating them in an orderly and predictable fashion.
A template can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack). - -``` -template create --AZURE --name azuretemplate --description azure-template --instanceType STANDARD_D3 --volumeSize 100 --volumeCount 2 -``` -You can check whether the template was created successfully by using the `template list` or `template show` command. - -You can delete your cloud template - when you’d like to delete one you will have to use: -``` -template delete --name azuretemplate -``` - -### Create or select a blueprint - -You can define Ambari blueprints with cloudbreak-shell: - -``` -blueprint add --name myblueprint --description myblueprint-description --file -``` - -Other available options: - -`--url` the url of the blueprint - -`--publicInAccount` flags if the network is public in the account - -We ship default Ambari blueprints with Cloudbreak. You can use these blueprints or add yours. To see the available blueprints and use one of them please use: - -``` -blueprint list - -blueprint select --name hdp-small-default -``` - -### Create a network - -A network gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related networking, maintaining and updating them in an orderly and predictable fashion. A network can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack). - -``` -network create --AZURE --name azurenetwork --description azure-network --subnet 10.0.0.0/16 --addressPrefix 10.0.0.0/8 -``` - -Other available options: - -`--publicInAccount` flags if the network is public in the account - -There is a default network with name `default-azure-network` with 10.0.0.0/16 subnet and 10.0.0.0/8 addressPrefix. - -You can check whether the network was created successfully by using the `network list` command. Check the network and select it if you are happy with it: - -``` -network show --name azurenetwork - -network select --name azurenetwork -``` - -### Create a security group - -A security group gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related security rules. - -``` -securitygroup create --name secgroup_example --description securitygroup-example --rules 0.0.0.0/0:tcp:8080,9090;10.0.33.0/24:tcp:1234,1235 -``` - -You can check whether the security group was created successfully by using the `securitygroup list` command. 
Check the security group and select it if you are happy with it: - -``` -securitygroup show --name secgroup_example - -securitygroup select --name secgroup_example -``` - -There are two default security groups defined: `all-services-port` and `only-ssh-and-ssl` - -`only-ssh-and-ssl:` all ports are locked down (you can't access Hadoop services outside of the VPC) - -* SSH (22) -* HTTPS (443) - -`all-services-port:` all Hadoop services + SSH/HTTP are accessible by default: - -* SSH (22) -* HTTPS (443) -* Ambari (8080) -* Consul (8500) -* NN (50070) -* RM Web (8088) -* Scheduler (8030RM) -* IPC (8050RM) -* Job history server (19888) -* HBase master (60000) -* HBase master web (60010) -* HBase RS (16020) -* HBase RS info (60030) -* Falcon (15000) -* Storm (8744) -* Hive metastore (9083) -* Hive server (10000) -* Hive server HTTP (10001) -* Accumulo master (9999) -* Accumulo Tserver (9997) -* Atlas (21000) -* KNOX (8443) -* Oozie (11000) -* Spark HS (18080) -* NM Web (8042) -* Zeppelin WebSocket (9996) -* Zeppelin UI (9995) -* Kibana (3080) -* Elasticsearch (9200) - -### Configure instance groups - -You have to configure the instancegroups before the provisioning. An instancegroup is defining a group of your nodes with a specified template. Usually we create instancegroups for the hostgroups defined in the blueprints. - -``` -instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName minviable-azure -``` - -Other available options: - -`--templateId` Id of the template - -### Create a Hadoop cluster -You are almost done - two more command and this will create your Hadoop cluster on your favorite cloud provider. Same as the API, or UI this will use your `credential`, `instancegroups`, `network`, `securitygroup`, and by using Azure ResourceManager will launch a cloud stack -``` -stack create --name my-first-stack --region WEST_US -``` -Once the `stack` is up and running (cloud provisioning is done) it will use your selected `blueprint` and install your custom Hadoop cluster with the selected components and services. -``` -cluster create --description "my first cluster" -``` -You are done - you can check the progress through the Ambari UI. If you log back to Cloudbreak UI you can check the progress over there as well, and learn the IP address of Ambari. - -### Stop/Restart cluster and stack -You have the ability to **stop your existing stack then its cluster** if you want to suspend the work on it. - -Select a stack for example with its name: -``` -stack select --name my-stack -``` -Other available option to define a stack is its `--id` (instead of the `--name`). - -Apply the following commands to stop the previously selected stack: -``` -cluster stop -stack stop -``` ->**Important!** The related cluster should be stopped before you can stop the stack. 
- - -Apply the following command to **restart the previously selected and stopped stack**: -``` -stack start -``` -After the selected stack has restarted, you can **restart the related cluster as well**: -``` -cluster start -``` - -### Upscale/Downscale cluster and stack -You can **upscale your selected stack** if you need more instances to your infrastructure: -``` -stack node --ADD --instanceGroup host_group_slave_1 --adjustment 6 -``` -Other available options: - -`--withClusterUpScale` indicates cluster upscale after stack upscale -or you can upscale the related cluster separately as well: -``` -cluster node --ADD --hostgroup host_group_slave_1 --adjustment 6 -``` - - -Apply the following command to **downscale the previously selected stack**: -``` -stack node --REMOVE --instanceGroup host_group_slave_1 --adjustment -2 -``` -and the related cluster separately: -``` -cluster node --REMOVE --hostgroup host_group_slave_1 --adjustment -2 -``` - -## Silent mode - -With Cloudbreak shell you can execute script files as well. A script file contains cloudbreak shell commands and can be executed with the `script` cloudbreak shell command - -``` -script -``` - -or with the `cbd util cloudbreak-shell-quiet` cbd command: - -``` -cbd util cloudbreak-shell-quiet < example.sh -``` - -## Example - -The following example creates a hadoop cluster with `hdp-small-default` blueprint on STANDARD_D3 instances with 2X100G attached disks on `default-azure-network` network using `all-services-port` security group. You should copy your ssh public key file into your cbd working directory with name `id_rsa.pub` and change the `<...>` parts with your azure credential details. - -``` -credential create --AZURE --description "credential description" --name myazurecredential --subscriptionId --appId --tenantId --password --sshKeyPath id_rsa.pub -credential select --name myazurecredential -template create --AZURE --name azuretemplate --description azure-template --instanceType STANDARD_D3 --volumeSize 100 --volumeCount 2 -blueprint select --name hdp-small-default -instancegroup configure --instanceGroup cbgateway --nodecount 1 --templateName azuretemplate -instancegroup configure --instanceGroup host_group_master_1 --nodecount 1 --templateName azuretemplate -instancegroup configure --instanceGroup host_group_master_2 --nodecount 1 --templateName azuretemplate -instancegroup configure --instanceGroup host_group_master_3 --nodecount 1 --templateName azuretemplate -instancegroup configure --instanceGroup host_group_client_1 --nodecount 1 --templateName azuretemplate -instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName azuretemplate -network select --name default-azure-network -securitygroup select --name all-services-port -stack create --name my-first-stack --region WEST_US -cluster create --description "My first cluster" -``` - -## Next steps - -Congrats! Your cluster should now be up and running. To learn more about it we have some [interesting insights](insights.md) about Cloudbreak clusters. diff --git a/docs/azure_cb_ui.md b/docs/azure_cb_ui.md deleted file mode 100644 index 2c9537137..000000000 --- a/docs/azure_cb_ui.md +++ /dev/null @@ -1,204 +0,0 @@ -#Provisioning via Browser - -You can log into the Cloudbreak application at http://PUBLIC_IP:3000. - -The main goal of the Cloudbreak UI is to easily create clusters on your own cloud provider account. -This description details the AZURE setup - if you'd like to use a different cloud provider check out its manual. 
- -This document explains the four steps that need to be followed to create Cloudbreak clusters from the UI: - -- connect your AZURE account with Cloudbreak -- create some template resources on the UI that describe the infrastructure of your clusters -- create a blueprint that describes the HDP services in your clusters and add some recipes for customization -- launch the cluster itself based on these template resources - -## Setting up Azure credentials - -If you do not have an Azure Resource Manager application you can simply create one with Cloudbreak deployer. Please read the [Provisioning prerequisites](azure_pre_prov.md) for more information. - -`Name:` name of your credential - -`Description:` short description of your linked credential - -`Subscription Id:` your Azure subscription id - see Accounts (`portal.azure.com`> `Browse all`> `Subscription`) - -`Password:` the password that was set up when you created the AD app - -`App Id:` your App Id (`portal.azure.com`> `Browse all`> `Subscription`> `Subscription detail`> `Users`> `Your application`> `Properties`) - -`App Owner Tenant Id:` your Tenant Id (`portal.azure.com`> `Browse all`> `Subscription`> `Subscription detail`> `Users`> `Your application`> `Properties`) - -`SSH public key:` the SSH public key in OpenSSH format whose private key can be used later to [log into the launched instances](http://sequenceiq.com/cloudbreak-deployer/1.1.0/insights/#ssh-to-the-host) - -> Since version 1.0.4 Cloudbreak supports a plain RSA public key instead of an X509 certificate file. - -The SSH username is **cloudbreak**. - -## Infrastructure templates - -After your AZURE account is linked to Cloudbreak you can start creating templates that describe your clusters' infrastructure: - -- resources -- networks -- security groups - -When you create a template, Cloudbreak *doesn't make any requests* to AZURE. -Resources are only created on AZURE after the `create cluster` button is pushed. -These templates are saved to Cloudbreak's database and can be reused with multiple clusters to describe the infrastructure. - -**Manage resources** - -Using manage resources you can create infrastructure templates. Templates describe the infrastructure where the HDP cluster will be provisioned. We support heterogeneous clusters - this means that one cluster can be built by combining different templates. - -`Name:` name of your template - -`Description:` short description of your template - -`Instance type:` the Azure instance type to be used - -`Attached volumes per instance:` the number of disks to be attached - -`Volume size (GB):` the size of the attached disks (in GB) - -`Public in account:` share it with others in the account - -**Manage blueprints** - -Blueprints are your declarative definition of a Hadoop cluster. - -`Name:` name of your blueprint - -`Description:` short description of your blueprint - -`Source URL:` you can add a blueprint by pointing to a URL. As an example you can use this [blueprint](https://raw.githubusercontent.com/sequenceiq/cloudbreak/master/core/src/main/resources/defaults/blueprints/multi-node-hdfs-yarn.bp). - -`Manual copy:` you can copy-paste your blueprint into this text area - -`Public in account:` share it with others in the account - -**Manage networks** - -Manage networks allows you to create or reuse existing networks and configure them.
- -`Name:` name of the network - -`Description:` short description of your network - -`Subnet (CIDR):` a subnet in the VPC with CIDR block - -`Address prefix (CIDR):` the address space that is used for subnets - -`Public in account:` share it with others in the account - -**Security groups** - -They describe the allowed inbound traffic to the instances in the cluster. -Currently only one security group template can be selected for a Cloudbreak cluster and all the instances have a public IP address so all the instances in the cluster will belong to the same security group. -This may change in a later release. - -You can define your own security group by adding all the ports, protocols and CIDR range you'd like to use. 443 needs to be there in every security group otherwise Cloudbreak won't be able to communicate with the provisioned cluster. The rules defined here doesn't need to contain the internal rules, those are automatically added by Cloudbreak to the security group on Azure. - -You can also use the two pre-defined security groups in Cloudbreak: - -`only-ssh-and-ssl:` all ports are locked down except for SSH and gateway HTTPS (you can't access Hadoop services outside of the VPC): - -* SSH (22) -* HTTPS (443) - -`all-services-port:` all Hadoop services and SSH/gateway HTTPS are accessible by default: - -* SSH (22) -* HTTPS (443) -* Ambari (8080) -* Consul (8500) -* NN (50070) -* RM Web (8088) -* Scheduler (8030RM) -* IPC (8050RM) -* Job history server (19888) -* HBase master (60000) -* HBase master web (60010) -* HBase RS (16020) -* HBase RS info (60030) -* Falcon (15000) -* Storm (8744) -* Hive metastore (9083) -* Hive server (10000) -* Hive server HTTP (10001) -* Accumulo master (9999) -* Accumulo Tserver (9997) -* Atlas (21000) -* KNOX (8443) -* Oozie (11000) -* Spark HS (18080) -* NM Web (8042) -* Zeppelin WebSocket (9996) -* Zeppelin UI (9995) -* Kibana (3080) -* Elasticsearch (9200) - -If `Public in account` is checked all the users belonging to your account will be able to use this security group template to create clusters, but cannot delete or modify it. - ->Note that the security groups are *not created* on AZURE after the `Create Security Group` button is pushed, only -after the cluster provisioning starts with the selected security group template. - -## Cluster installation - -This section describes - -**Blueprints** - -Blueprints are your declarative definition of a Hadoop cluster. These are the same blueprints that are [used by Ambari](https://cwiki.apache.org/confluence/display/AMBARI/Blueprints). - -You can use the 3 default blueprints pre-defined in Cloudbreak or you can create your own. -Blueprints can be added from an URL or the whole JSON can be copied to the `Manual copy` field. - -The hostgroups added in the JSON will be mapped to a set of instances when starting the cluster and the services and components defined in the hostgroup will be installed on the corresponding nodes. -It is not necessary to define all the configuration fields in the blueprints - if a configuration is missing, Ambari will fill that with a default value. -The configurations defined in the blueprint can also be modified later from the Ambari UI. - -If `Public in account` is checked all the users belonging to your account will be able to use this blueprint to create clusters, but cannot delete or modify it. - -A blueprint can be exported from a running Ambari cluster that can be reused in Cloudbreak with slight modifications. 
-There is no automatic way to modify an exported blueprint and make it instantly usable in Cloudbreak, the modifications have to be done manually. -When the blueprint is exported some configurations will have for example hardcoded domain names, or memory configurations that won't be applicable to the Cloudbreak cluster. - -**Cluster customization** - -Sometimes it can be useful to define some custom scripts that run during cluster creation and add some additional functionality. -For example it can be a service you'd like to install but it's not supported by Ambari or some script that automatically downloads some data to the necessary nodes. -The most notable example is Ranger setup: it has a prerequisite of a running database when Ranger Admin is installing. -A PostgreSQL database can be easily started and configured with a recipe before the blueprint installation starts. - -To learn more about these so called *Recipes*, and to check out the Ranger database recipe, take a look at the [Cluster customization](recipes.md) part of the documentation. - - -## Cluster deployment - -After all the templates are configured you can deploy a new HDP cluster. Start by selecting a previously created credential in the header. -Click on `create cluster`, give it a `Name`, select a `Region` where the cluster infrastructure will be provisioned and select one of the `Networks` and `Security Groups` created earlier. -After you've selected a `Blueprint` as well you should be able to configure the `Template resources` and the number of nodes for all of the hostgroups in the blueprint. - -If `Public in account` is checked all the users belonging to your account will be able to see the newly created cluster on the UI, but cannot delete or modify it. - -If `Enable security` is checked as well, Cloudbreak will install KDC and the cluster will be Kerberized. See more about it in the [Kerberos](kerberos.md) section of this documentation. - -After the `create and start cluster` button is pushed Cloudbreak will start to create resources on your AZURE account. -Cloudbreak uses *ARM template* to create the resources - you can check out the resources created by Cloudbreak on the [ARM Portal](https://portal.azure.com) on the 'Resource groups' page. - ->**Important** Always use Cloudbreak to delete the cluster, or if that fails for some reason always try to delete -the ARM first. - -**Advanced options** - -There are some advanced features when deploying a new cluster, these are the following: - -`File system:` read more [Deploying a DASH service with Cloudbreak deployer](azure_pre_prov.md) - -`Minimum cluster size:` the provisioning strategy in case of the cloud provider can't allocate all the requested nodes - -`Validate blueprint:` feature to validate or not the Ambari blueprint. By default is switched on. - -## Next steps - -Congrats! Your cluster should now be up and running. To learn more about it we have some [interesting insights](insights.md) about Cloudbreak clusters. diff --git a/docs/azure_pre_prov.md b/docs/azure_pre_prov.md deleted file mode 100644 index 28b4cf7c4..000000000 --- a/docs/azure_pre_prov.md +++ /dev/null @@ -1,120 +0,0 @@ -# Provisioning Prerequisites - -We use the new [Azure ARM](https://azure.microsoft.com/en-us/documentation/articles/resource-group-overview/) in -order to launch clusters. In order to work we need to create an Active Directory application with the configured name and password and adds the permissions that are needed to call the Azure Resource Manager API. 
Cloudbreak deployer automates all this for you. - -## Generate a new SSH key - -All the instances created by Cloudbreak are configured to allow key-based SSH, -so you'll need to provide an SSH public key that can be used later to SSH onto the instances in the clusters you'll create with Cloudbreak. -You can use one of your existing keys or you can generate a new one. - -To generate a new SSH keypair: - -``` -ssh-keygen -t rsa -b 4096 -C "your_email@example.com" -# Creates a new ssh key, using the provided email as a label -# Generating public/private rsa key pair. -``` - -``` -# Enter file in which to save the key (/Users/you/.ssh/id_rsa): [Press enter] -You'll be asked to enter a passphrase, but you can leave it empty. - -# Enter passphrase (empty for no passphrase): [Type a passphrase] -# Enter same passphrase again: [Type passphrase again] -``` - -After you enter a passphrase the keypair is generated. The output should look something like below. -``` -# Your identification has been saved in /Users/you/.ssh/id_rsa. -# Your public key has been saved in /Users/you/.ssh/id_rsa.pub. -# The key fingerprint is: -# 01:0f:f4:3b:ca:85:sd:17:sd:7d:sd:68:9d:sd:a2:sd your_email@example.com -``` - -Later you'll need to pass the `.pub` file's contents to Cloudbreak and use the private part to SSH to the instances. - -## Azure access setup - -If you do not have an Active directory user then you have to configure it before deploying a cluster with Cloudbreak. - -1. Go to `manage.windowsazure.com` > `Active Directory` -![](/images/azure1.png) - -2. You can configure your AD users on `Your active directory` > `Users` menu -![](/images/azure2.png) - -3. Here you can add the new user to AD. Simply click on `Add User` on the bottom of the page -![](/images/azure3.png) - -4. Type the new user name into the box -![](/images/azure4.png) - -5. You will see the new user in the list. You have got a temporary password so you have to change it before you start using the new user. -![](/images/azure5.png) - -6. After you add the user to the AD you need to add your AD user to the `manage.windowsazure.com` > `Settings` > `Administrators` -![](/images/azure6.png) - -7. Here you can add the new user to Administrators. Simply click on `Add` on the bottom of the page -![](/images/azure7.png) - -##Azure application setup with Cloudbreak Deployer - -In order for Cloudbreak to be able to launch clusters on Azure on your behalf you need to set up your **Azure ARM application**. We have automated the Azure configurations in the Cloudbreak Deployer (CBD). After the CBD has installed, simply run the following command: - -``` -cbd azure configure-arm --app_name myapp --app_password password123 --subscription_id 1234-abcd-efgh-1234 -``` -*Options:* - -**--app_name**: Your application name. Default is *app*. - -**--app_password**: Your application password. Default is *password*. - -**--subscription_id**: Your Azure subscription ID. - -**--username**: Your Azure username. - -**--password**: Your Azure password. - -The command first creates an Active Directory application with the configured name and password and adds the permissions that are needed to call the Azure Resource Manager API. -Please use the output of the command when you creating your Azure credential in Cloudbreak. 
-
The output of the command looks something like this:

```
Subscription ID: sdf324-26b3-sdf234-ad10-234dfsdfsd
App ID: 234sdf-c469-sdf234-9062-dsf324
Password: password123
App Owner Tenant ID: sdwerwe1-d98e-dsf12-dsf123-df123232
```

## Filesystem configuration

When starting a cluster with Cloudbreak on Azure, the default filesystem is “Windows Azure Blob Storage with DASH”. Hadoop has built-in support for the [WASB filesystem](https://hadoop.apache.org/docs/current/hadoop-azure/index.html) so it can easily be used as HDFS instead of disks.

### Disks and blob storage

In Azure every data disk attached to a virtual machine [is stored](https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-disks-vhds/) as a virtual hard disk (VHD) in a page blob inside an Azure storage account. Because these are not local disks and every operation has to go through the VHD files, performance is degraded when they are used as HDFS.
When WASB is used as a Hadoop filesystem the files are full-value blobs in a storage account. This means better performance compared to data disks, and the WASB filesystem can be configured very easily, but Azure storage accounts have their own [limitations](https://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/#storage-limits) as well. There is a space limit per storage account (500 TB), but the real bottleneck is the total request rate: it is only 20000 IOPS, and Azure will start to throw errors on I/O operations once that limit is reached.
To bypass those limits Microsoft created a small service called [DASH](https://github.com/MicrosoftDX/Dash). DASH is a service that imitates the Azure Blob Storage API and can be deployed as a Microsoft Azure Cloud Service. Because its API is the same as the standard blob storage API, it can be used *almost* in the same way as the default WASB filesystem from a Hadoop deployment.
DASH works by sharding the storage access across multiple storage accounts. It can be configured to distribute storage account load to at most 15 **scaleout** storage accounts. It needs one more **namespace** storage account where it keeps track of where the data is stored.
When configuring a WASB filesystem with Hadoop, the only required config entries are the ones that describe the access details. To access a storage account, Azure generates an access key that is displayed on the Azure portal or can be queried through the API, while the account name is the name of the storage account itself. A DASH service has a similar account name and key; those can be configured in the configuration file while deploying the cloud service.

![](/diagrams/dash.png)

### Deploying a DASH service with Cloudbreak deployer

We have automated the deployment of a DASH service in cloudbreak-deployer. After cbd is installed, simply run the following command to deploy a DASH cloud service with 5 scaleout storage accounts:
```
cbd azure deploy-dash --accounts 5 --prefix dash --location "West Europe" --instances 3
```

The command first creates the namespace account and the scaleout storage accounts, builds the *.cscfg* configuration file based on the created storage account names and keys, generates an Account Name and an Account Key for the DASH service, and finally deploys the cloud service package file to a new cloud service.
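Once a cluster is running on top of this filesystem, the DASH-backed store is addressed from Hadoop with ordinary `wasb://` URIs. As an illustration only (CONTAINER and ACCOUNT_ENDPOINT below are placeholders; the exact endpoint depends on how the DASH service was deployed and how the cluster was configured):

```
# Illustration: list and copy data on a WASB-backed filesystem with standard HDFS commands.
# CONTAINER and ACCOUNT_ENDPOINT are placeholders for your own deployment; run from a cluster node.
hdfs dfs -ls wasb://CONTAINER@ACCOUNT_ENDPOINT/
hdfs dfs -put /tmp/sample-data.csv wasb://CONTAINER@ACCOUNT_ENDPOINT/user/ambari-qa/
```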
-
The WASB filesystem configured with DASH can be used as a data lake - when multiple clusters are deployed with the same DASH filesystem configuration, the same data can be accessed from all the clusters, but every cluster can have a different service configured as well. In that case, deploy as many DASH services with cbd as you have clusters with Cloudbreak and configure them accordingly.

## Next steps

After these prerequisites are done you can move on to create clusters on the [UI](azure_cb_ui.md) or with the [Shell](azure_cb_shell.md). diff --git a/docs/blueprints.md b/docs/blueprints.md deleted file mode 100644 index dc8323b07..000000000 --- a/docs/blueprints.md +++ /dev/null @@ -1,58 +0,0 @@ - - -

# Blueprints

We provide a list of default Hadoop cluster Blueprints for your convenience, however you can always build and use your own Blueprint.

* hdp-small-default - HDP 2.3 blueprint

This is a complex [Blueprint](https://raw.githubusercontent.com/sequenceiq/cloudbreak/master/core/src/main/resources/defaults/blueprints/hdp-small-default.bp) which allows you to launch a multi-node, fully distributed HDP 2.3 Cluster in the cloud.

It allows you to use the following services: HDFS, YARN, MAPREDUCE2, KNOX, HBASE, HIVE, HCATALOG, WEBHCAT, SLIDER, OOZIE, PIG, SQOOP, METRICS, TEZ, FALCON, ZOOKEEPER.

* hdp-streaming-cluster - HDP 2.3 blueprint

This is a streaming [Blueprint](https://raw.githubusercontent.com/sequenceiq/cloudbreak/master/core/src/main/resources/defaults/blueprints/hdp-streaming-cluster.bp) which allows you to launch a multi-node, fully distributed HDP 2.3 Cluster in the cloud, optimized for streaming jobs.

It allows you to use the following services: HDFS, YARN, MAPREDUCE2, STORM, KNOX, HBASE, HIVE, HCATALOG, WEBHCAT, SLIDER, OOZIE, PIG, SQOOP, METRICS, TEZ, FALCON, ZOOKEEPER.

* hdp-spark-cluster - HDP 2.3 blueprint

This is an analytics [Blueprint](https://raw.githubusercontent.com/sequenceiq/cloudbreak/master/core/src/main/resources/defaults/blueprints/hdp-spark-cluster.bp) which allows you to launch a multi-node, fully distributed HDP 2.3 Cluster in the cloud, optimized for analytic jobs.

It allows you to use the following services: HDFS, YARN, MAPREDUCE2, SPARK, ZEPPELIN, KNOX, HBASE, HIVE, HCATALOG, WEBHCAT, SLIDER, OOZIE, PIG, SQOOP, METRICS, TEZ, FALCON, ZOOKEEPER.

## Components

Ambari supports the concept of stacks and associated services in a stack definition. By leveraging the stack definition, Ambari has a consistent and defined interface to install, manage and monitor a set of services, and provides an extensibility model for new stacks and services to be introduced.

At a high level the supported list of components can be grouped into two main categories, Master and Slave - bundling them together forms a Hadoop Service.
- -| Services | Components | -|:----------- |:------------------------------------------------------------------------| -| HDFS | DATANODE, HDFS_CLIENT, JOURNALNODE, NAMENODE, SECONDARY_NAMENODE, ZKFC | -| YARN | APP_TIMELINE_SERVER, NODEMANAGER, RESOURCEMANAGER, YARN_CLIENT | -| MAPREDUCE2 | HISTORYSERVER, MAPREDUCE2_CLIENT | -| GANGLIA | GANGLIA_MONITOR, GANGLIA_SERVER | -| HBASE | HBASE_CLIENT, HBASE_MASTER, HBASE_REGIONSERVER | -| HIVE | HIVE_CLIENT, HIVE_METASTORE, HIVE_SERVER, MYSQL_SERVER | -| HCATALOG | HCAT | -| WEBHCAT | WEBHCAT_SERVER | -| OOZIE | OOZIE_CLIENT, OOZIE_SERVER | -| PIG | PIG | -| SQOOP | SQOOP | -| STORM | DRPC_SERVER, NIMBUS, STORM_REST_API, STORM_UI_SERVER, SUPERVISOR | -| TEZ | TEZ_CLIENT | -| FALCON | FALCON_CLIENT, FALCON_SERVER | -| ZOOKEEPER | ZOOKEEPER_CLIENT, ZOOKEEPER_SERVER | -| SPARK | SPARK_JOBHISTORYSERVER, SPARK_CLIENT | -| RANGER | RANGER_USERSYNC, RANGER_ADMIN | -| AMBARI_METRICS | AMBARI_METRICS, METRICS_COLLECTOR, METRICS_MONITOR | -| KERBEROS | KERBEROS_CLIENT | -| FLUME | FLUME_HANDLER | -| KAFKA | KAFKA_BROKER | -| KNOX | KNOX_GATEWAY | -| NAGIOS | NAGIOS_SERVER | -| ATLAS | ATLAS | -| CLOUDBREAK | CLOUDBREAK | diff --git a/docs/diagrams/UI-screenshot.png b/docs/diagrams/UI-screenshot.png deleted file mode 100644 index 298fd929f..000000000 Binary files a/docs/diagrams/UI-screenshot.png and /dev/null differ diff --git a/docs/diagrams/ambari-create-cluster.png b/docs/diagrams/ambari-create-cluster.png deleted file mode 100644 index 67d693ef3..000000000 Binary files a/docs/diagrams/ambari-create-cluster.png and /dev/null differ diff --git a/docs/diagrams/ambari-overview.png b/docs/diagrams/ambari-overview.png deleted file mode 100644 index 6bdb783d5..000000000 Binary files a/docs/diagrams/ambari-overview.png and /dev/null differ diff --git a/docs/diagrams/consul.png b/docs/diagrams/consul.png deleted file mode 100644 index 299227a89..000000000 Binary files a/docs/diagrams/consul.png and /dev/null differ diff --git a/docs/diagrams/dash.png b/docs/diagrams/dash.png deleted file mode 100644 index b87c40ec1..000000000 Binary files a/docs/diagrams/dash.png and /dev/null differ diff --git a/docs/diagrams/dashui.png b/docs/diagrams/dashui.png deleted file mode 100644 index cd6d91808..000000000 Binary files a/docs/diagrams/dashui.png and /dev/null differ diff --git a/docs/diagrams/docker.png b/docs/diagrams/docker.png deleted file mode 100644 index aafe0e830..000000000 Binary files a/docs/diagrams/docker.png and /dev/null differ diff --git a/docs/diagrams/seq_diagram_cluster_flow.png b/docs/diagrams/seq_diagram_cluster_flow.png deleted file mode 100644 index aee3e718c..000000000 Binary files a/docs/diagrams/seq_diagram_cluster_flow.png and /dev/null differ diff --git a/docs/diagrams/seq_diagram_provision_flow_1.png b/docs/diagrams/seq_diagram_provision_flow_1.png deleted file mode 100644 index ada3bf50d..000000000 Binary files a/docs/diagrams/seq_diagram_provision_flow_1.png and /dev/null differ diff --git a/docs/diagrams/seq_diagram_provision_flow_2.png b/docs/diagrams/seq_diagram_provision_flow_2.png deleted file mode 100644 index 0c4fd874b..000000000 Binary files a/docs/diagrams/seq_diagram_provision_flow_2.png and /dev/null differ diff --git a/docs/diagrams/seq_diagram_stack_post.png b/docs/diagrams/seq_diagram_stack_post.png deleted file mode 100644 index f84e5b7f1..000000000 Binary files a/docs/diagrams/seq_diagram_stack_post.png and /dev/null differ diff --git a/docs/diagrams/serf-event.png b/docs/diagrams/serf-event.png deleted file mode 100644 
index 0abb665ae..000000000 Binary files a/docs/diagrams/serf-event.png and /dev/null differ diff --git a/docs/diagrams/serf-gossip.png b/docs/diagrams/serf-gossip.png deleted file mode 100644 index 5ec1f2953..000000000 Binary files a/docs/diagrams/serf-gossip.png and /dev/null differ diff --git a/docs/diagrams/stack_state_diag.png b/docs/diagrams/stack_state_diag.png deleted file mode 100644 index 39915f2ae..000000000 Binary files a/docs/diagrams/stack_state_diag.png and /dev/null differ diff --git a/docs/diagrams/swarm.png b/docs/diagrams/swarm.png deleted file mode 100644 index 0dad9eeca..000000000 Binary files a/docs/diagrams/swarm.png and /dev/null differ diff --git a/docs/diagrams/vm.png b/docs/diagrams/vm.png deleted file mode 100644 index a31973ef8..000000000 Binary files a/docs/diagrams/vm.png and /dev/null differ diff --git a/docs/gcp-image.md b/docs/gcp-image.md deleted file mode 100644 index 2c3fd91be..000000000 --- a/docs/gcp-image.md +++ /dev/null @@ -1,31 +0,0 @@ -# Google Cloud Images - -We have pre-built cloud images for GCP with the Cloudbreak Deployer pre-installed. Following the steps will guide you through the provider specific configuration then launch. - -> Alternatively, instead of using the pre-built cloud images, you can install Cloudbreak Deployer on your own VM. See [install the Cloudbreak Deployer](onprem.md) for more information. - -## Configured Image - -You can create the latest Cloudbreak deployer image on the [Google Developers Console](https://console.developers.google.com/) with the help - of the [Google Cloud Shell](https://cloud.google.com/cloud-shell/docs/). - -![](/images/google-cloud-shell-launch.png) - -Images are global resources, so they can be used across zones and projects. - -### GCP Image Details - - -![](/images/google-cloud-shell.png) - -Please make sure you opened the following ports on your virtual machine: - - * SSH (22) - * Ambari (8080) - * Identity server (8089) - * Cloudbreak GUI (3000) - * User authentication (3001) - -## Setup Cloudbreak Deployer - -Once you have the Cloudbreak Deployer installed, proceed to [Setup Cloudbreak Deployer](gcp.md). diff --git a/docs/gcp.md b/docs/gcp.md deleted file mode 100644 index 36ddf69d8..000000000 --- a/docs/gcp.md +++ /dev/null @@ -1,63 +0,0 @@ -# Google Setup - - -## Setup Cloudbreak Deployer - -If you already have Cloudbreak Deployer either by [using the GCP Cloud Images](gcp-image.md) or by [installing the Cloudbreak Deployer](onprem.md) manually on your own VM, -you can start to setup the Cloudbreak Application with the deployer. - -Create and open the `cloudbreak-deployment` directory: - -``` -cd cloudbreak-deployment -``` - -This is the directory of the config files and the supporting binaries that will be downloaded by Cloudbreak deployer. - -### Initialize your Profile - -First initialize cbd by creating a `Profile` file: - -``` -cbd init -``` - -It will create a `Profile` file in the current directory. Please edit the file - the only required -configuration is the `PUBLIC_IP`. This IP will be used to access the Cloudbreak UI -(called Uluwatu). In some cases the `cbd` tool tries to guess it, if can't than will give a hint. - -### Generate your Profile - -You are done with the configuration of Cloudbreak deployer. The last thing you have to do is to regenerate the configurations in order to take effect. 
-

```
rm *.yml
cbd generate
```

This command applies the following steps:

- creates the **docker-compose.yml** file that describes the configuration of all the Docker containers needed for the Cloudbreak deployment.
- creates the **uaa.yml** file that holds the configuration of the identity server used to authenticate users to Cloudbreak.

### Start Cloudbreak

To start the Cloudbreak application use the following command.
This will start all the Docker containers and initialize the application. It will take a few minutes until all the services start.

```
cbd start
```

>Launching it for the first time will take more time as it downloads all the Docker images needed by Cloudbreak.

After the `cbd start` command finishes you can check the logs of the Cloudbreak server with this command:

```
cbd logs cloudbreak
```
>Cloudbreak server should start within a minute - you should see a line like this: `Started CloudbreakApplication in 36.823 seconds`

### Next steps

Once Cloudbreak is up and running you should check out the [Provisioning Prerequisites](gcp_pre_prov.md) needed to create Google Cloud clusters with Cloudbreak. diff --git a/docs/gcp_cb_shell.md b/docs/gcp_cb_shell.md deleted file mode 100644 index eec0f6d54..000000000 --- a/docs/gcp_cb_shell.md +++ /dev/null @@ -1,257 +0,0 @@ -## Interactive mode

Start the shell with `cbd util cloudbreak-shell`. This will launch the Cloudbreak shell inside a Docker container and you are ready to start using it.

You have to copy the files you would like to use from the shell into the cbd working directory. For example if your `cbd` working directory is `~/prj/cbd` then copy your blueprint and public ssh key file into this directory. You can refer to these files with their names from the shell.

### Create a cloud credential

In order to start using Cloudbreak to provision a cluster in Google Cloud you will need to have a GCP credential. If you no longer want Cloudbreak to be able to reach your Google Cloud resources, you have to delete the service account.
```
credential create --GCP --description "short description of your linked credential" --name my-gcp-credential --projectId --serviceAccountId --serviceAccountPrivateKeyPath --sshKeyPath
```

Alternatively you can upload your public key from an url as well, by using the `--sshKeyUrl` switch. You can check whether the credential was created successfully by using the `credential list` command.
You can switch between your cloud credentials - when you’d like to use one you will have to select it:
```
credential select --name my-gcp-credential
```

You can delete your cloud credential - when you’d like to delete one you will have to use:
```
credential delete --name my-gcp-credential
```

You can show your cloud credential - when you’d like to show one you will have to use:
```
credential show --name my-gcp-credential
```

### Create a template

A template gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related resources, maintaining and updating them in an orderly and predictable fashion. A template can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack).
- -``` -template create --GCP --name gcptemplate --description gcp-template --instanceType N1_STANDARD_4 --volumeSize 100 --volumeCount 2 -``` -Other available options: - -`--volumeType` defaults to "HDD", other allowed value: "SSD" - -`--publicInAccount` flags if the template is public in the account - -You can check whether the template was created successfully by using the `template list` or `template show` command. - -You can delete your cloud template - when you’d like to delete one you will have to use: -``` -template delete --name gcptemplate -``` - -### Create or select a blueprint - -You can define Ambari blueprints with cloudbreak-shell: - -``` -blueprint add --name myblueprint --description myblueprint-description --file -``` - -Other available options: - -`--url` the url of the blueprint - -`--publicInAccount` flags if the network is public in the account - -We ship default Ambari blueprints with Cloudbreak. You can use these blueprints or add yours. To see the available blueprints and use one of them please use: - -``` -blueprint list - -blueprint select --name hdp-small-default -``` - -### Create a network - -A network gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related networking, maintaining and updating them in an orderly and predictable fashion. A network can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack). - -``` -network create --GCP --name gcpnetwork --description "gcp network"--subnet 10.0.0.0/16 -``` - -Other available options: - -`--publicInAccount` flags if the network is public in the account - -There is a default network with name `default-gcp-network`. If we use this for cluster creation, Cloudbreak will create the cluster within the 10.0.0.0/16 subnet. - -You can check whether the network was created successfully by using the `network list` command. Check the network and select it if you are happy with it: - -``` -network show --name gcpnetwork - -network select --name gcpnetwork -``` - -### Create a security group - -A security group gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related security rules. - -``` -securitygroup create --name secgroup_example --description securitygroup-example --rules 0.0.0.0/0:tcp:8080,9090;10.0.33.0/24:tcp:1234,1235 -``` - -You can check whether the security group was created successfully by using the `securitygroup list` command. 
Check the security group and select it if you are happy with it: - -``` -securitygroup show --name secgroup_example - -securitygroup select --name secgroup_example -``` - -There are two default security groups defined: `all-services-port` and `only-ssh-and-ssl` - -`only-ssh-and-ssl:` all ports are locked down (you can't access Hadoop services outside of the VPC) - -* SSH (22) -* HTTPS (443) - -`all-services-port:` all Hadoop services + SSH/HTTP are accessible by default: - -* SSH (22) -* HTTPS (443) -* Ambari (8080) -* Consul (8500) -* NN (50070) -* RM Web (8088) -* Scheduler (8030RM) -* IPC (8050RM) -* Job history server (19888) -* HBase master (60000) -* HBase master web (60010) -* HBase RS (16020) -* HBase RS info (60030) -* Falcon (15000) -* Storm (8744) -* Hive metastore (9083) -* Hive server (10000) -* Hive server HTTP (10001) -* Accumulo master (9999) -* Accumulo Tserver (9997) -* Atlas (21000) -* KNOX (8443) -* Oozie (11000) -* Spark HS (18080) -* NM Web (8042) -* Zeppelin WebSocket (9996) -* Zeppelin UI (9995) -* Kibana (3080) -* Elasticsearch (9200) - -### Configure instance groups - -You have to configure the instancegroups before the provisioning. An instancegroup is defining a group of your nodes with a specified template. Usually we create instancegroups for the hostgroups defined in the blueprints. -``` -instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName minviable-gcp -``` - -Other available options: - -`--templateId` Id of the template - -### Create a Hadoop cluster -You are almost done - two more command and this will create your Hadoop cluster on your favorite cloud provider. Same as the API, or UI this will use your `credential`, `instancegroups`, `network`, `securitygroup`, and by using Google Cloud Platform will launch a cloud stack -``` -stack create --name my-first-stack --region US_CENTRAL1_A -``` -Once the `stack` is up and running (cloud provisioning is done) it will use your selected `blueprint` and install your custom Hadoop cluster with the selected components and services. -``` -cluster create --description "my first cluster" -``` -You are done - you can check the progress through the Ambari UI. If you log back to Cloudbreak UI you can check the progress over there as well, and learn the IP address of Ambari. - -### Stop/Restart cluster and stack -You have the ability to **stop your existing stack then its cluster** if you want to suspend the work on it. - -Select a stack for example with its name: -``` -stack select --name my-stack -``` -Other available option to define a stack is its `--id` (instead of the `--name`). - -Apply the following commands to stop the previously selected stack: -``` -cluster stop -stack stop -``` ->**Important!** The related cluster should be stopped before you can stop the stack. 
- - -Apply the following command to **restart the previously selected and stopped stack**: -``` -stack start -``` -After the selected stack has restarted, you can **restart the related cluster as well**: -``` -cluster start -``` - -### Upscale/Downscale cluster and stack -You can **upscale your selected stack** if you need more instances to your infrastructure: -``` -stack node --ADD --instanceGroup host_group_slave_1 --adjustment 6 -``` -Other available options: - -`--withClusterUpScale` indicates cluster upscale after stack upscale -or you can upscale the related cluster separately as well: -``` -cluster node --ADD --hostgroup host_group_slave_1 --adjustment 6 -``` - - -Apply the following command to **downscale the previously selected stack**: -``` -stack node --REMOVE --instanceGroup host_group_slave_1 --adjustment -2 -``` -and the related cluster separately: -``` -cluster node --REMOVE --hostgroup host_group_slave_1 --adjustment -2 -``` - -## Silent mode - -With Cloudbreak shell you can execute script files as well. A script file contains cloudbreak shell commands and can be executed with the `script` cloudbreak shell command - -``` -script -``` - -or with the `cbd util cloudbreak-shell-quiet` cbd command: - -``` -cbd util cloudbreak-shell-quiet < example.sh -``` - -## Example - -The following example creates a hadoop cluster with `hdp-small-default` blueprint on M3Xlarge instances with 2X100G attached disks on `default-gcp-network` network using `all-services-port` security group. You should copy your ssh public key file and your GCP service account generated private key into your cbd working directory with name `id_rsa.pub` and `gcp.p12` and change the `<...>` parts with your gcp credential details. - -``` -credential create --GCP --description "my credential" --name my-gcp-credential --projectId --serviceAccountId --serviceAccountPrivateKeyPath gcp.p12 --sshKeyFile id_rsa.pub -credential select --name my-gcp-credential -template create --GCP --name gcptemplate --description gcp-template --instanceType N1_STANDARD_4 --volumeSize 100 --volumeCount 2 -blueprint select --name hdp-small-default -instancegroup configure --instanceGroup cbgateway --nodecount 1 --templateName gcptemplate -instancegroup configure --instanceGroup host_group_master_1 --nodecount 1 --templateName gcptemplate -instancegroup configure --instanceGroup host_group_master_2 --nodecount 1 --templateName gcptemplate -instancegroup configure --instanceGroup host_group_master_3 --nodecount 1 --templateName gcptemplate -instancegroup configure --instanceGroup host_group_client_1 --nodecount 1 --templateName gcptemplate -instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName gcptemplate -network select --name default-gcp-network -securitygroup select --name all-services-port -stack create --name my-first-stack --region US_CENTRAL1_A -cluster create --description "My first cluster" -``` - -## Next steps - -Congrats! Your cluster should now be up and running. To learn more about it we have some [interesting insights](insights.md) about Cloudbreak clusters. diff --git a/docs/gcp_cb_ui.md b/docs/gcp_cb_ui.md deleted file mode 100644 index 56f2e6bad..000000000 --- a/docs/gcp_cb_ui.md +++ /dev/null @@ -1,200 +0,0 @@ -#Provisioning via Browser - -You can log into the Cloudbreak application at http://PUBLIC_IP:3000. - -The main goal of the Cloudbreak UI is to easily create clusters on your own cloud provider account. 
-
This description details the GCP setup - if you'd like to use a different cloud provider check out its manual.

This document explains the four steps that need to be followed to create Cloudbreak clusters from the UI:

- connect your GCP account with Cloudbreak
- create some template resources on the UI that describe the infrastructure of your clusters
- create a blueprint that describes the HDP services in your clusters and add some recipes for customization
- launch the cluster itself based on these template resources


## Manage cloud credentials

You can now log into the Cloudbreak application at http://PUBLIC_IP:3000. Once logged in go to **Manage credentials**. Using manage credentials will link your cloud account with the Cloudbreak account.

`Name:` name of your credential

`Description:` short description of your linked credential

`Project Id:` your GCP Project id - see Accounts

`Service Account Email Address:` your GCP service account mail address - see Accounts

`Service Account private (p12) key:` your GCP service account generated private key - see Accounts

`SSH public key:` the SSH public key in OpenSSH format whose private keypair can be used to [log into the launched instances](http://sequenceiq.com/cloudbreak-deployer/1.1.0/insights/#ssh-to-the-host) later

`Public in account:` share it with others in the account

The ssh username is **cloudbreak**.

## Infrastructure templates

After your GCP account is linked to Cloudbreak you can start creating templates that describe your clusters' infrastructure:

- resources
- networks
- security groups

When you create a template, Cloudbreak *doesn't make any requests* to GCP.
Resources are only created on GCP after the `Create cluster` button is pushed.
These templates are saved to Cloudbreak's database and can be reused with multiple clusters to describe the infrastructure.

**Manage resources**

Using manage resources you can create infrastructure templates. Templates describe the infrastructure where the HDP cluster will be provisioned. We support heterogeneous clusters - this means that one cluster can be built by combining different templates.

`Name:` name of your template

`Description:` short description of your template

`Instance type:` the Google Cloud instance type to be used - we suggest using at least n1-standard-4 instances

`Volume type:` the options to choose from are SSD and regular magnetic disks

`Attached volumes per instance:` the number of disks to be attached

`Volume size (GB):` the size of the attached disks (in GB)

`Public in account:` share it with others in the account

**Manage blueprints**

Blueprints are your declarative definition of a Hadoop cluster.

`Name:` name of your blueprint

`Description:` short description of your blueprint

`Source URL:` you can add a blueprint by pointing to a URL. As an example you can use this [blueprint](https://raw.githubusercontent.com/sequenceiq/cloudbreak/master/core/src/main/resources/defaults/blueprints/multi-node-hdfs-yarn.bp).

`Manual copy:` you can copy and paste your blueprint into this text area

`Public in account:` share it with others in the account

**Manage networks**

Manage networks allows you to create or reuse existing networks and configure them.
- -`Name:` name of the network - -`Description:` short description of your network - -`Subnet (CIDR):` a subnet in the VPC with CIDR block - -`Public in account:` share it with others in the account - -**Security groups** - -They describe the allowed inbound traffic to the instances in the cluster. -Currently only one security group template can be selected for a Cloudbreak cluster and all the instances have a public IP address so all the instances in the cluster will belong to the same security group. -This may change in a later release. - -You can define your own security group by adding all the ports, protocols and CIDR range you'd like to use. 443 needs to be there in every security group otherwise Cloudbreak won't be able to communicate with the provisioned cluster. The rules defined here doesn't need to contain the internal rules, those are automatically added by Cloudbreak to the security group on GCP. - -You can also use the two pre-defined security groups in Cloudbreak: - -`only-ssh-and-ssl:` all ports are locked down except for SSH and gateway HTTPS (you can't access Hadoop services outside of the VPC): - -* SSH (22) -* HTTPS (443) - -`all-services-port:` all Hadoop services and SSH/gateway HTTPS are accessible by default: - -* SSH (22) -* HTTPS (443) -* Ambari (8080) -* Consul (8500) -* NN (50070) -* RM Web (8088) -* Scheduler (8030RM) -* IPC (8050RM) -* Job history server (19888) -* HBase master (60000) -* HBase master web (60010) -* HBase RS (16020) -* HBase RS info (60030) -* Falcon (15000) -* Storm (8744) -* Hive metastore (9083) -* Hive server (10000) -* Hive server HTTP (10001) -* Accumulo master (9999) -* Accumulo Tserver (9997) -* Atlas (21000) -* KNOX (8443) -* Oozie (11000) -* Spark HS (18080) -* NM Web (8042) -* Zeppelin WebSocket (9996) -* Zeppelin UI (9995) -* Kibana (3080) -* Elasticsearch (9200) - -If `Public in account` is checked all the users belonging to your account will be able to use this security group template to create clusters, but cannot delete or modify it. - ->**Note** that the security groups are *not created* on GCP after the `Create Security Group` button is pushed, only -after the cluster provisioning starts with the selected security group template. - -## Cluster installation - -This section describes - -**Blueprints** - -Blueprints are your declarative definition of a Hadoop cluster. These are the same blueprints that are [used by Ambari](https://cwiki.apache.org/confluence/display/AMBARI/Blueprints). - -You can use the 3 default blueprints pre-defined in Cloudbreak or you can create your own. -Blueprints can be added from an URL or the whole JSON can be copied to the `Manual copy` field. - -The hostgroups added in the JSON will be mapped to a set of instances when starting the cluster and the services and components defined in the hostgroup will be installed on the corresponding nodes. -It is not necessary to define all the configuration fields in the blueprints - if a configuration is missing, Ambari will fill that with a default value. -The configurations defined in the blueprint can also be modified later from the Ambari UI. - -If `Public in account` is checked all the users belonging to your account will be able to use this blueprint to create clusters, but cannot delete or modify it. - -A blueprint can be exported from a running Ambari cluster that can be reused in Cloudbreak with slight modifications. 
-There is no automatic way to modify an exported blueprint and make it instantly usable in Cloudbreak, the modifications have to be done manually. -When the blueprint is exported some configurations will have for example hardcoded domain names, or memory configurations that won't be applicable to the Cloudbreak cluster. - -**Cluster customization** - -Sometimes it can be useful to define some custom scripts that run during cluster creation and add some additional functionality. -For example it can be a service you'd like to install but it's not supported by Ambari or some script that automatically downloads some data to the necessary nodes. -The most notable example is Ranger setup: it has a prerequisite of a running database when Ranger Admin is installing. -A PostgreSQL database can be easily started and configured with a recipe before the blueprint installation starts. - -To learn more about these so called *Recipes*, and to check out the Ranger database recipe, take a look at the [Cluster customization](recipes.md) part of the documentation. - - -## Cluster deployment - -After all the templates are configured you can deploy a new HDP cluster. Start by selecting a previously created credential in the header. -Click on `create cluster`, give it a `Name`, select a `Region` where the cluster infrastructure will be provisioned and select one of the `Networks` and `Security Groups` created earlier. -After you've selected a `Blueprint` as well you should be able to configure the `Template resources` and the number of nodes for all of the hostgroups in the blueprint. - -If `Public in account` is checked all the users belonging to your account will be able to see the newly created cluster on the UI, but cannot delete or modify it. - -If `Enable security` is checked as well, Cloudbreak will install KDC and the cluster will be Kerberized. See more about it in the [Kerberos](kerberos.md) section of this documentation. - -After the `create and start cluster` button is pushed Cloudbreak will start to create resources on your GCP account. - ->**Important** Always use Cloudbreak to delete the cluster, or if that fails for some reason always try to delete -the Google Cloud first. - -**Advanced options** - -There are some advanced features when deploying a new cluster, these are the following: - -`Minimum cluster size:` the provisioning strategy in case of the cloud provider can't allocate all the requested nodes - -`Validate blueprint:` feature to validate or not the Ambari blueprint. By default is switched on. - -## Next steps - -Congrats! Your cluster should now be up and running. To learn more about it we have some [interesting insights](insights.md) about Cloudbreak clusters. diff --git a/docs/gcp_pre_prov.md b/docs/gcp_pre_prov.md deleted file mode 100644 index a55f0c9b9..000000000 --- a/docs/gcp_pre_prov.md +++ /dev/null @@ -1,59 +0,0 @@ -# Provisioning Prerequisites - -## Creating a Google Cloud Service Account - -Follow the [instructions](https://cloud.google.com/storage/docs/authentication#service_accounts) in Google Cloud's documentation to create a `Service account` and `Generate a new P12 key`. 
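If you prefer the command line over the console, something like the following can also create the service account and a P12 key. This is only a sketch and assumes a reasonably recent Google Cloud SDK; the project ID, account name and file names are placeholders:

```
# Sketch: create a service account and generate a P12 key with the gcloud CLI.
# MY_PROJECT and cloudbreak-sa are placeholders; adjust them to your environment.
gcloud config set project MY_PROJECT
gcloud iam service-accounts create cloudbreak-sa --display-name "Cloudbreak service account"
gcloud iam service-accounts keys create cloudbreak-sa.p12 \
    --iam-account cloudbreak-sa@MY_PROJECT.iam.gserviceaccount.com \
    --key-file-type p12
```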
- -Make sure that at API level (**APIs and auth** menu) you have enabled: - -* Google Compute Engine -* Google Compute Engine Instance Group Manager API -* Google Compute Engine Instance Groups API -* BigQuery API -* Google Cloud Deployment Manager API -* Google Cloud DNS API -* Google Cloud SQL -* Google Cloud Storage -* Google Cloud Storage JSON API - ->If you enabled every API then you have to wait about **10 minutes** for the provider. - -When creating GCP credentials in Cloudbreak you will have to provide the email address of the Service Account and the project ID (from Google Developers Console - Projects) where the service account is created. You'll also have to upload the generated P12 file and provide an OpenSSH formatted public key that will be used as an SSH key. - -Once your prerequisites created you can use the [Cloudbreak UI](gcp_cb_ui.md) or use the [Cloudbreak shell](gcp_cb_shell.md). - -## Generate a new SSH key - -All the instances created by Cloudbreak are configured to allow key-based SSH, -so you'll need to provide an SSH public key that can be used later to SSH onto the instances in the clusters you'll create with Cloudbreak. -You can use one of your existing keys or you can generate a new one. - -To generate a new SSH keypair: - -``` -ssh-keygen -t rsa -b 4096 -C "your_email@example.com" -# Creates a new ssh key, using the provided email as a label -# Generating public/private rsa key pair. -``` - -``` -# Enter file in which to save the key (/Users/you/.ssh/id_rsa): [Press enter] -You'll be asked to enter a passphrase, but you can leave it empty. - -# Enter passphrase (empty for no passphrase): [Type a passphrase] -# Enter same passphrase again: [Type passphrase again] -``` - -After you enter a passphrase the keypair is generated. The output should look something like below. -``` -# Your identification has been saved in /Users/you/.ssh/id_rsa. -# Your public key has been saved in /Users/you/.ssh/id_rsa.pub. -# The key fingerprint is: -# 01:0f:f4:3b:ca:85:sd:17:sd:7d:sd:68:9d:sd:a2:sd your_email@example.com -``` - -Later you'll need to pass the `.pub` file's contents to Cloudbreak and use the private part to SSH to the instances. - -## Next steps - -After these prerequisites are done you can move on to create clusters on the [UI](gcp_cb_ui.md) or with the [Shell](gcp_cb_shell.md). 
\ No newline at end of file diff --git a/docs/images/ambari_threshold.png b/docs/images/ambari_threshold.png deleted file mode 100644 index 3a105b449..000000000 Binary files a/docs/images/ambari_threshold.png and /dev/null differ diff --git a/docs/images/aws-create-cluster.png b/docs/images/aws-create-cluster.png deleted file mode 100644 index 85f7186cf..000000000 Binary files a/docs/images/aws-create-cluster.png and /dev/null differ diff --git a/docs/images/aws-credential.png b/docs/images/aws-credential.png deleted file mode 100644 index 699fd507f..000000000 Binary files a/docs/images/aws-credential.png and /dev/null differ diff --git a/docs/images/aws-iam-role.png b/docs/images/aws-iam-role.png deleted file mode 100644 index 7c4ebf04a..000000000 Binary files a/docs/images/aws-iam-role.png and /dev/null differ diff --git a/docs/images/aws-network.png b/docs/images/aws-network.png deleted file mode 100644 index 7de38aa5c..000000000 Binary files a/docs/images/aws-network.png and /dev/null differ diff --git a/docs/images/aws-resources.png b/docs/images/aws-resources.png deleted file mode 100644 index 4e8470d93..000000000 Binary files a/docs/images/aws-resources.png and /dev/null differ diff --git a/docs/images/azure1.png b/docs/images/azure1.png deleted file mode 100644 index c6ab15376..000000000 Binary files a/docs/images/azure1.png and /dev/null differ diff --git a/docs/images/azure2.png b/docs/images/azure2.png deleted file mode 100644 index e1fc3d415..000000000 Binary files a/docs/images/azure2.png and /dev/null differ diff --git a/docs/images/azure3.png b/docs/images/azure3.png deleted file mode 100644 index af78d8449..000000000 Binary files a/docs/images/azure3.png and /dev/null differ diff --git a/docs/images/azure4.png b/docs/images/azure4.png deleted file mode 100644 index e5ac8a214..000000000 Binary files a/docs/images/azure4.png and /dev/null differ diff --git a/docs/images/azure5.png b/docs/images/azure5.png deleted file mode 100644 index 3484e1960..000000000 Binary files a/docs/images/azure5.png and /dev/null differ diff --git a/docs/images/azure6.png b/docs/images/azure6.png deleted file mode 100644 index da0d6c884..000000000 Binary files a/docs/images/azure6.png and /dev/null differ diff --git a/docs/images/azure7.png b/docs/images/azure7.png deleted file mode 100644 index e651f340f..000000000 Binary files a/docs/images/azure7.png and /dev/null differ diff --git a/docs/images/enable_periscope.png b/docs/images/enable_periscope.png deleted file mode 100644 index 57a150e70..000000000 Binary files a/docs/images/enable_periscope.png and /dev/null differ diff --git a/docs/images/google-cloud-shell-launch.png b/docs/images/google-cloud-shell-launch.png deleted file mode 100644 index b49bba374..000000000 Binary files a/docs/images/google-cloud-shell-launch.png and /dev/null differ diff --git a/docs/images/google-cloud-shell.png b/docs/images/google-cloud-shell.png deleted file mode 100644 index cfee61802..000000000 Binary files a/docs/images/google-cloud-shell.png and /dev/null differ diff --git a/docs/images/metric_alert.png b/docs/images/metric_alert.png deleted file mode 100644 index 0ee5ac68e..000000000 Binary files a/docs/images/metric_alert.png and /dev/null differ diff --git a/docs/images/policy.png b/docs/images/policy.png deleted file mode 100644 index 46072f322..000000000 Binary files a/docs/images/policy.png and /dev/null differ diff --git a/docs/images/ranger-hostgroup.png b/docs/images/ranger-hostgroup.png deleted file mode 100644 index 0525bfc52..000000000 Binary files 
a/docs/images/ranger-hostgroup.png and /dev/null differ diff --git a/docs/images/ranger-recipe.png b/docs/images/ranger-recipe.png deleted file mode 100644 index 58837fb4d..000000000 Binary files a/docs/images/ranger-recipe.png and /dev/null differ diff --git a/docs/images/scaling_config.png b/docs/images/scaling_config.png deleted file mode 100644 index a02cc7b57..000000000 Binary files a/docs/images/scaling_config.png and /dev/null differ diff --git a/docs/images/time_alert.png b/docs/images/time_alert.png deleted file mode 100644 index b84a6c14c..000000000 Binary files a/docs/images/time_alert.png and /dev/null differ diff --git a/docs/images/ui-blueprints.png b/docs/images/ui-blueprints.png deleted file mode 100644 index 9c7e98fd7..000000000 Binary files a/docs/images/ui-blueprints.png and /dev/null differ diff --git a/docs/images/ui-secgroup.png b/docs/images/ui-secgroup.png deleted file mode 100644 index b58d800ff..000000000 Binary files a/docs/images/ui-secgroup.png and /dev/null differ diff --git a/docs/kerberos.md b/docs/kerberos.md deleted file mode 100644 index c5488a066..000000000 --- a/docs/kerberos.md +++ /dev/null @@ -1,71 +0,0 @@ -#Kerberos Security - -Cloudbreak can enable Kerberos security on the cluster. When enabled, Cloudbreak will install an MIT KDC into the cluster -and enable Kerberos on the cluster. - -> This feature is currently `TECHNICAL PREVIEW`. - -## Enable Kerberos - -To enable Kerberos in a cluster, when creating your cluster via the UI, do the following: - -1. When in the **Create cluster** wizard, on the **Setup Network and Security** tab, check the **Enable security** option. -2. Fill in the following fields: - -| Field | Description | -|---|---| -| Kerberos master key | The master key to use for the KDC. | -| Kerberos admin | The KDC admin username to use for the KDC. | -| Kerberos password | The KDC admin password to use for the KDC. | - -> The Cloudbreak Kerberos setup does not contain Active Directory support or any other third party user authentication method. If you -want to use custom Hadoop user, you have to create users manually with the same name on all Ambari containers on each node. - -### Testing Kerberos - -To run a job on the cluster, you can use one of the default Hadoop users, like `ambari-qa`, as usual. - -Once kerberos is enabled you need a `ticket` to execute any job on the cluster. Here's an example to get a ticket: -``` -kinit -V -kt /etc/security/keytabs/smokeuser.headless.keytab ambari-qa-sparktest-rec@NODE.DC1.CONSUL -``` -Example job: -```java -export HADOOP_LIBS=/usr/hdp/current/hadoop-mapreduce-client -export JAR_EXAMPLES=$HADOOP_LIBS/hadoop-mapreduce-examples.jar -export JAR_JOBCLIENT=$HADOOP_LIBS/hadoop-mapreduce-client-jobclient.jar - -hadoop jar $JAR_EXAMPLES teragen 10000000 /user/ambari-qa/terasort-input - -hadoop jar $JAR_JOBCLIENT mrbench -baseDir /user/ambari-qa/smallJobsBenchmark -numRuns 5 -maps 10 -reduces 5 -inputLines 10 -inputType ascending -``` - -### Create Hadoop Users - -To create Hadoop users please follow the steps below. 
- - * Log in via SSH to the Cloudbreak gateway node (IP address is the same as the Ambari UI) - -``` -sudo docker exec -i kerberos bash -kadmin -p [admin_user]/[admin_user]@NODE.DC1.CONSUL (type admin password) -addprinc custom-user (type user password twice) -``` - - * Log in via SSH to all other nodes - -``` -sudo docker exec -i $(docker ps | grep ambari-warmup | cut -d" " -f 1) bash -useradd custom-user -``` - - * Log in via SSH to one of the nodes - -``` -sudo docker exec -i $(docker ps | grep ambari-warmup | cut -d" " -f 1) bash -su custom-user -kinit -p custom-user (type user password) -hdfs dfs -mkdir input -hdfs dfs -put /tmp/wait-for-host-number.sh input -yarn jar $(find /usr/hdp -name hadoop-mapreduce-examples.jar) wordcount input output -hdfs dfs -cat output/* diff --git a/docs/openstack-image.md b/docs/openstack-image.md deleted file mode 100644 index 46e5840bd..000000000 --- a/docs/openstack-image.md +++ /dev/null @@ -1,39 +0,0 @@ -# OpenStack Cloud Images - -We have pre-built cloud images for OpenStack with the Cloudbreak Deployer pre-installed. Following the steps will guide you through the provider specific configuration then launch. - -> Alternatively, instead of using the pre-built cloud images, you can install Cloudbreak Deployer on your own VM. See [install the Cloudbreak Deployer](onprem.md) for more information. - -##System Requirements - -Cloudbreak currently only supports the `OpenStack Juno` release. - -##Download the Cloudbreak image - -You can download the latest pre-configured Cloudbreak deployer image for OpenStack with the following script in the -following section. - -Please make sure you opened the following ports on your virtual machine: - - * SSH (22) - * Ambari (8080) - * Identity server (8089) - * Cloudbreak GUI (3000) - * User authentication (3001) - -###OpenStack image details - - -##Import the image into OpenStack - -``` -export OS_IMAGE_NAME="name_in_openstack" -export OS_USERNAME=... -export OS_AUTH_URL="http://.../v2.0" -export OS_TENANT_NAME=... -glance image-create --name "$OS_IMAGE_NAME" --file "$LATEST_IMAGE" --disk-format qcow2 --container-format bare --progress -``` - -## Setup Cloudbreak Deployer - -Once you have the Cloudbreak Deployer installed, proceed to [Setup Cloudbreak Deployer](openstack.md). diff --git a/docs/openstack.md b/docs/openstack.md deleted file mode 100644 index eb8bb2364..000000000 --- a/docs/openstack.md +++ /dev/null @@ -1,69 +0,0 @@ -#OpenStack Setup - -## Setup Cloudbreak Deployer - -If you already have Cloudbreak Deployer either by [using the OpenStack Cloud Images](openstack-image.md) or by [installing the Cloudbreak Deployer](onprem.md) manually on your own VM, -you can start to setup the Cloudbreak Application with the deployer. - -> Cloudbreak currently only supports the `OpenStack Juno` release. - -Create and open the `cloudbreak-deployment` directory: - -``` -cd cloudbreak-deployment -``` - -This is the directory of the config files and the supporting binaries that will be downloaded by Cloudbreak deployer. - -###Initialize your Profile - -First initialize your directory by creating a `Profile` file: - -``` -cbd init -``` - -It will create a `Profile` file in the current directory. Please edit the file - one of the required configurations is the `PUBLIC_IP`. -This IP will be used to access the Cloudbreak UI (called Uluwatu). In some cases the `cbd` tool tries to guess it, if can't than will give a hint. 
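As an illustration, the `Profile` is a plain shell file of exported variables, so the relevant entry might look like the following sketch (the address is just a placeholder for your VM's public IP):

```
# Sketch of the relevant Profile entry; replace the address with your own public IP
export PUBLIC_IP=203.0.113.10
```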
- -The other required configuration in the `Profile` is the name of the Cloudbreak image you uploaded to your OpenStack cloud. - -``` -export CB_OPENSTACK_IMAGE="$OS_IMAGE_NAME" -``` - -###Generate your Profile - -You are done with the configuration of Cloudbreak deployer. The last thing you have to do is to regenerate the configurations in order to take effect. - -``` -rm *.yml -cbd generate -``` - -This command applies the following steps: - -- creates the **docker-compose.yml** file that describes the configuration of all the Docker containers needed for the Cloudbreak deployment. -- creates the **uaa.yml** file that holds the configuration of the identity server used to authenticate users to Cloudbreak. - -###Start Cloudbreak - -To start the Cloudbreak application use the following command. -This will start all the Docker containers and initialize the application. It will take a few minutes until all the services start. - -``` -cbd start -``` - ->Launching it first will take more time as it downloads all the docker images needed by Cloudbreak. - -After the `cbd start` command finishes you can check the logs of the Cloudbreak server with this command: - -``` -cbd logs cloudbreak -``` ->Cloudbreak server should start within a minute - you should see a line like this: `Started CloudbreakApplication in 36.823 seconds` - -###Next steps - -Once Cloudbreak is up and running you should check out the [Provisioning Prerequisites](openstack_pre_prov.md) needed to create OpenStack clusters with Cloudbreak. \ No newline at end of file diff --git a/docs/openstack_cb_shell.md b/docs/openstack_cb_shell.md deleted file mode 100644 index 6c87bcbb6..000000000 --- a/docs/openstack_cb_shell.md +++ /dev/null @@ -1,240 +0,0 @@ -## Interactive mode - -Start the shell with `cbd util cloudbreak-shell`. This will launch the Cloudbreak shell inside a Docker container and you are ready to start using it. - -You have to copy files into the cbd working directory, which you would like to use from shell. For example if your `cbd` working directory is `~/prj/cbd` then copy your blueprint and public ssh key file into this directory. You can refer to these files with their names from the shell. - -### Create a cloud credential - -``` -credential create --OPENSTACK --name my-os-credential --description "credentail description" --userName --password --tenantName --endPoint --sshKeyPath -``` - -Alternatively you can upload your public key from an url as well, by using the `—sshKeyUrl` switch. You can check whether the credential was created successfully by using the `credential list` command. You can switch between your cloud credentials - when you’d like to use one and act with that you will have to use: - -``` -credential select --name my-openstack-credential -``` - -### Create a template - -A template gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related resources, maintaining and updating them in an orderly and predictable fashion. A template can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack). - -``` -template create --OPENSTACK --name ostemplate --description openstack-template --instanceType m1.large --volumeSize 100 --volumeCount 2 -``` -You can check whether the template was created successfully by using the `template list` or `template show` command. 
- -You can delete your cloud template - when you’d like to delete one you will have to use: -``` -template delete --name ostemplate -``` - -### Create or select a blueprint - -You can define Ambari blueprints with cloudbreak-shell: - -``` -blueprint add --name myblueprint --description myblueprint-description --file -``` - -Other available options: - -`--url` the url of the blueprint - -`--publicInAccount` flags if the network is public in the account - -We ship default Ambari blueprints with Cloudbreak. You can use these blueprints or add yours. To see the available blueprints and use one of them please use: - -``` -blueprint list - -blueprint select --name hdp-small-default -``` - -### Create a network - -A network gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related networking, maintaining and updating them in an orderly and predictable fashion. A network can be used repeatedly to create identical copies of the same stack (or to use as a foundation to start a new stack). - -``` -network create --OPENSTACK --name osnetwork --description openstack-network --publicNetID --subnet 10.0.0.0/16 -``` - -Other available options: - -`--publicInAccount` flags if the network is public in the account - -You can check whether the network was created successfully by using the `network list` command. Check the network and select it if you are happy with it: - -``` -network show --name osnetwork - -network select --name osnetwork -``` - -### Create a security group - -A security group gives developers and systems administrators an easy way to create and manage a collection of cloud infrastructure related security rules. - -``` -securitygroup create --name secgroup_example --description securitygroup-example --rules 0.0.0.0/0:tcp:8080,9090;10.0.33.0/24:tcp:1234,1235 -``` - -You can check whether the security group was created successfully by using the `securitygroup list` command. Check the security group and select it if you are happy with it: - -``` -securitygroup show --name secgroup_example - -securitygroup select --name secgroup_example -``` - -There are two default security groups defined: `all-services-port` and `only-ssh-and-ssl` - -`only-ssh-and-ssl:` all ports are locked down (you can't access Hadoop services outside of the Virtual Private Cloud) - -* SSH (22) -* HTTPS (443) - -`all-services-port:` all Hadoop services + SSH/gateway HTTPS are accessible by default: - -* SSH (22) -* HTTPS (443) -* Ambari (8080) -* Consul (8500) -* NN (50070) -* RM Web (8088) -* Scheduler (8030RM) -* IPC (8050RM) -* Job history server (19888) -* HBase master (60000) -* HBase master web (60010) -* HBase RS (16020) -* HBase RS info (60030) -* Falcon (15000) -* Storm (8744) -* Hive metastore (9083) -* Hive server (10000) -* Hive server HTTP (10001) -* Accumulo master (9999) -* Accumulo Tserver (9997) -* Atlas (21000) -* KNOX (8443) -* Oozie (11000) -* Spark HS (18080) -* NM Web (8042) -* Zeppelin WebSocket (9996) -* Zeppelin UI (9995) -* Kibana (3080) -* Elasticsearch (9200) - -### Configure instance groups - -You have to configure the instancegroups before the provisioning. An instancegroup is defining a group of your nodes with a specified template. Usually we create instancegroups for the hostgroups defined in the blueprints. 
- -``` -instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName ostemplate -``` - -Other available options: - -`--templateId` Id of the template - -### Create a Hadoop cluster -You are almost done - two more commands and this will create your Hadoop cluster on your favorite cloud provider. As with the API or UI, this will use your `credential`, `instancegroups`, `network` and `securitygroup`, and will launch a cloud stack using OpenStack Heat -``` -stack create --name my-first-stack --region local -``` -Once the `stack` is up and running (cloud provisioning is done) it will use your selected `blueprint` and install your custom Hadoop cluster with the selected components and services. -``` -cluster create --description "my first cluster" -``` -You are done - you can check the progress through the Ambari UI. If you log back to the Cloudbreak UI you can check the progress over there as well, and learn the IP address of Ambari. - -### Stop/Restart cluster and stack -You have the ability to **stop your existing cluster and then its stack** if you want to suspend the work on it. - -Select a stack for example with its name: -``` -stack select --name my-stack -``` -Another available option to define a stack is its `--id` (instead of the `--name`). - -Apply the following commands to stop the previously selected stack: -``` -cluster stop -stack stop -``` ->**Important!** The related cluster should be stopped before you can stop the stack. - - -Apply the following command to **restart the previously selected and stopped stack**: -``` -stack start -``` -After the selected stack has restarted, you can **restart the related cluster as well**: -``` -cluster start -``` - -### Upscale/Downscale cluster and stack -You can **upscale your selected stack** if you need more instances in your infrastructure: -``` -stack node --ADD --instanceGroup host_group_slave_1 --adjustment 6 -``` -Other available options: - -`--withClusterUpScale` indicates cluster upscale after stack upscale -or you can upscale the related cluster separately as well: -``` -cluster node --ADD --hostgroup host_group_slave_1 --adjustment 6 -``` - - -Apply the following command to **downscale the previously selected stack**: -``` -stack node --REMOVE --instanceGroup host_group_slave_1 --adjustment -2 -``` -and the related cluster separately: -``` -cluster node --REMOVE --hostgroup host_group_slave_1 --adjustment -2 -``` - -## Silent mode - -With the Cloudbreak shell you can execute script files as well. A script file contains cloudbreak shell commands and can be executed with the `script` cloudbreak shell command: - -``` -script -``` - -or with the `cbd util cloudbreak-shell-quiet` cbd command: - -``` -cbd util cloudbreak-shell-quiet < example.sh -``` - -## Example - -The following example creates a Hadoop cluster with the `hdp-small-default` blueprint on `m1.large` instances with 2X100G attached disks on the `osnetwork` network using the `all-services-port` security group. You should copy your ssh public key file into your cbd working directory with the name `id_rsa.pub` and replace the `<...>` parts with your OpenStack credential and network details.
- -``` -credential create --OPENSTACK --name my-os-credential --description "credential description" --userName --password --tenantName --endPoint --sshKeyPath -credential select --name my-os-credential -template create --OPENSTACK --name ostemplate --description openstack-template --instanceType m1.large --volumeSize 100 --volumeCount 2 -blueprint select --name hdp-small-default -instancegroup configure --instanceGroup cbgateway --nodecount 1 --templateName ostemplate -instancegroup configure --instanceGroup host_group_master_1 --nodecount 1 --templateName ostemplate -instancegroup configure --instanceGroup host_group_master_2 --nodecount 1 --templateName ostemplate -instancegroup configure --instanceGroup host_group_master_3 --nodecount 1 --templateName ostemplate -instancegroup configure --instanceGroup host_group_client_1 --nodecount 1 --templateName ostemplate -instancegroup configure --instanceGroup host_group_slave_1 --nodecount 3 --templateName ostemplate -network create --OPENSTACK --name osnetwork --description openstack-network --publicNetID --subnet 10.0.0.0/16 -network select --name osnetwork -securitygroup select --name all-services-port -stack create --name my-first-stack --region local -cluster create --description "My first cluster" -``` - -## Next steps - -Congrats! Your cluster should now be up and running. To learn more about it we have some [interesting insights](insights.md) about Cloudbreak clusters. diff --git a/docs/openstack_cb_ui.md b/docs/openstack_cb_ui.md deleted file mode 100644 index 62071f0d3..000000000 --- a/docs/openstack_cb_ui.md +++ /dev/null @@ -1,206 +0,0 @@ -#Provisioning via Browser - -You can log into the Cloudbreak application at http://PUBLIC_IP:3000. - -The main goal of the Cloudbreak UI is to easily create clusters on your own cloud provider account. -This description details the OpenStack setup - if you'd like to use a different cloud provider check out its manual. - -This document explains the four steps that need to be followed to create Cloudbreak clusters from the UI: - -- connect your OpenStack with Cloudbreak -- create some template resources on the UI that describe the infrastructure of your clusters -- create a blueprint that describes the HDP services in your clusters and add some recipes for customization -- launch the cluster itself based on these template resources - -## Manage cloud credentials - -You can now log into the Cloudbreak application at http://PUBLIC_IP:3000. Once logged in go to **Manage credentials**. Using manage credentials will link your cloud account with the Cloudbreak account. - -`Name:` name of your credential - -`Description:` short description of your linked credential - -`User:` your OpenStack user - -`Password:` your password - -`Tenant Name:` OpenStack tenant name - -`Endpoint:` OpenStack Identity Service (Keystone) endpoint (e.g. http://PUBLIC_IP:5000/v2.0) - -`SSH public key:` the SSH public key in OpenSSH format whose private keypair can be used to [log into the launched instances](http://sequenceiq.com/cloudbreak-deployer/1.1.0/insights/#ssh-to-the-host) later with the **ssh username: centos** - -`Public in account:` share it with others in the account - - -## Infrastructure templates - -After your OpenStack is linked to Cloudbreak you can start creating templates that describe your clusters' infrastructure: - -- resources -- networks -- security groups - -When you create a template, Cloudbreak *doesn't make any requests* to OpenStack.
-Resources are only created on OpenStack after the `create cluster` button is pushed. -These templates are saved to Cloudbreak's database and can be reused with multiple clusters to describe the infrastructure. - -**Manage resources** - -Using manage resources you can create infrastructure templates. Templates describe the infrastructure where the HDP cluster will be provisioned. We support heterogeneous clusters - this means that one cluster can be built by combining different templates. - -`Name:` name of your template - -`Description:` short description of your template - -`Instance type:` the OpenStack instance type to be used - -`Attached volumes per instance:` the number of disks to be attached - -`Volume size (GB):` the size of the attached disks (in GB) - -`Public in account:` share it with others in the account - -**Manage blueprints** - -Blueprints are your declarative definition of a Hadoop cluster. - -`Name:` name of your blueprint - -`Description:` short description of your blueprint - -`Source URL:` you can add a blueprint by pointing to a URL. As an example you can use this [blueprint](https://github.com/sequenceiq/ambari-rest-client/raw/1.6.0/src/main/resources/blueprints/multi-node-hdfs-yarn). - -`Manual copy:` you can copy-paste your blueprint in this text area - -`Public in account:` share it with others in the account - -**Manage networks** - -Manage networks allows you to create or reuse existing networks and configure them. - -`Name:` name of the network - -`Description:` short description of your network - -`Subnet (CIDR):` a subnet with CIDR block under the given `public network` - -`Public network ID:` ID of an OpenStack public network - -`Public in account:` share it with others in the account - -**Security groups** - -They describe the allowed inbound traffic to the instances in the cluster. -Currently only one security group template can be selected for a Cloudbreak cluster and all the instances have a public IP address, so all the instances in the cluster will belong to the same security group. -This may change in a later release. - -You can define your own security group by adding all the ports, protocols and CIDR ranges you'd like to use. Port 443 needs to be there in every security group, otherwise Cloudbreak won't be able to communicate with the provisioned cluster. The rules defined here don't need to contain the internal rules; those are automatically added by Cloudbreak to the security group on OpenStack.
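For example, a minimal custom security group that satisfies the port 443 requirement might contain just the rule below (the open CIDR range and the extra ports are illustrative - adjust them to your environment; the notation follows the shell's `--rules` format of `cidr:protocol:ports`):

```
0.0.0.0/0:tcp:22,443,8080
```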
- -You can also use the two pre-defined security groups in Cloudbreak: - -`only-ssh-and-ssl:` all ports are locked down (you can't access Hadoop services outside of the Virtual Private Cloud) but - -* SSH (22) -* HTTPS (443) - -`all-services-port:` all Hadoop services + SSH/gateway HTTPS are accessible by default: - -* SSH (22) -* HTTPS (443) -* Ambari (8080) -* Consul (8500) -* NN (50070) -* RM Web (8088) -* RM Scheduler (8030) -* RM IPC (8050) -* Job history server (19888) -* HBase master (60000) -* HBase master web (60010) -* HBase RS (16020) -* HBase RS info (60030) -* Falcon (15000) -* Storm (8744) -* Hive metastore (9083) -* Hive server (10000) -* Hive server HTTP (10001) -* Accumulo master (9999) -* Accumulo Tserver (9997) -* Atlas (21000) -* KNOX (8443) -* Oozie (11000) -* Spark HS (18080) -* NM Web (8042) -* Zeppelin WebSocket (9996) -* Zeppelin UI (9995) -* Kibana (3080) -* Elasticsearch (9200) - -If `Public in account` is checked all the users belonging to your account will be able to use this security group template to create clusters, but cannot delete or modify it. - -**Note** that the security groups are *not created* on OpenStack after the `Create Security Group` button is pushed, only after the cluster provisioning starts with the selected security group template. - -## Cluster installation - -This section describes how to define the Hadoop cluster - the Ambari blueprint - that will be installed on the provisioned infrastructure, and how to customize it. - -**Blueprints** - -Blueprints are your declarative definition of a Hadoop cluster. These are the same blueprints that are [used by Ambari](https://cwiki.apache.org/confluence/display/AMBARI/Blueprints). - -You can use the 3 default blueprints pre-defined in Cloudbreak or you can create your own. -Blueprints can be added from a URL or the whole JSON can be copied to the `Manual copy` field. - -The hostgroups added in the JSON will be mapped to a set of instances when starting the cluster and the services and components defined in the hostgroup will be installed on the corresponding nodes. -It is not necessary to define all the configuration fields in the blueprints - if a configuration is missing, Ambari will fill that with a default value. -The configurations defined in the blueprint can also be modified later from the Ambari UI. - -If `Public in account` is checked all the users belonging to your account will be able to use this blueprint to create clusters, but cannot delete or modify it. - -A blueprint can be exported from a running Ambari cluster and reused in Cloudbreak with slight modifications. -There is no automatic way to modify an exported blueprint and make it instantly usable in Cloudbreak; the modifications have to be done manually. -When the blueprint is exported, some configurations will contain, for example, hardcoded domain names or memory configurations that won't be applicable to the Cloudbreak cluster. - -**Cluster customization** - -Sometimes it can be useful to define some custom scripts that run during cluster creation and add some additional functionality. -For example it can be a service you'd like to install that is not supported by Ambari, or some script that automatically downloads some data to the necessary nodes. -The most notable example is Ranger setup: it requires a running database when Ranger Admin is being installed. -A PostgreSQL database can be easily started and configured with a recipe before the blueprint installation starts. - -To learn more about these so-called *Recipes*, and to check out the Ranger database recipe, take a look at the [Cluster customization](recipes.md) part of the documentation.
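As a rough illustration, the core of such a recipe is just a `recipe-pre-install` shell script; a minimal sketch for a CentOS-based image could look like the following (the package name and service commands are assumptions that may differ on other base images - the sample recipe linked from the Recipes documentation should be used as the actual reference):

```
#!/bin/bash
# recipe-pre-install (sketch): make a PostgreSQL server available before
# Ambari starts installing Ranger Admin (package/service names are illustrative)
yum install -y postgresql-server
service postgresql initdb
service postgresql start
```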
- - -## Cluster deployment - -After all the templates are configured you can deploy a new HDP cluster. Start by selecting a previously created credential in the header. -Click on `create cluster`, give it a `Name`, select a `Region` where the cluster infrastructure will be provisioned and select one of the `Networks` and `Security Groups` created earlier. -After you've selected a `Blueprint` as well you should be able to configure the `Template resources` and the number of nodes for all of the hostgroups in the blueprint. - -If `Public in account` is checked all the users belonging to your account will be able to see the newly created cluster on the UI, but cannot delete or modify it. - -If `Enable security` is checked as well, Cloudbreak will install a Key Distribution Center (KDC) and the cluster will be Kerberized. See more about it in the [Kerberos](kerberos.md) section of this documentation. - -After the `create and start cluster` button is pushed Cloudbreak will start to create resources on your OpenStack. - ->**Important** Always use Cloudbreak to delete the cluster. If that fails for some reason, try to delete it via the -OpenStack Dashboard. - -**Advanced options**: - -`Consul server count:` the number of Consul servers; the default is 3. It varies with the cluster size. - -`Platform variant:` Cloudbreak provides two implementations for creating OpenStack clusters - -* `HEAT:` using a Heat template to create the resources -* `NATIVE:` using API calls to create the resources - -`Minimum cluster size:` the provisioning strategy in case the cloud provider can't allocate all the requested nodes - -`Validate blueprint:` whether to validate the Ambari blueprint. It is switched on by default. - -Once you have launched the cluster creation you can track the progress either on the Cloudbreak UI or on your cloud provider's management UI. - -## Next steps - -Congrats! Your cluster should now be up and running. To learn more about it we have some [interesting insights](insights.md) about Cloudbreak clusters. diff --git a/docs/openstack_pre_prov.md b/docs/openstack_pre_prov.md deleted file mode 100644 index f345efbde..000000000 --- a/docs/openstack_pre_prov.md +++ /dev/null @@ -1,37 +0,0 @@ -#Provisioning Prerequisites - -## Generate a new SSH key - -All the instances created by Cloudbreak are configured to allow key-based SSH, -so you'll need to provide an SSH public key that can be used later to SSH onto the instances in the clusters you'll create with Cloudbreak. -You can use one of your existing keys or you can generate a new one. - -To generate a new SSH keypair: - -``` -ssh-keygen -t rsa -b 4096 -C "your_email@example.com" -# Creates a new ssh key, using the provided email as a label -# Generating public/private rsa key pair. -``` - -``` -# Enter file in which to save the key (/Users/you/.ssh/id_rsa): [Press enter] -You'll be asked to enter a passphrase, but you can leave it empty. - -# Enter passphrase (empty for no passphrase): [Type a passphrase] -# Enter same passphrase again: [Type passphrase again] -``` - -After you enter a passphrase the keypair is generated. The output should look something like below. -``` -# Your identification has been saved in /Users/you/.ssh/id_rsa. -# Your public key has been saved in /Users/you/.ssh/id_rsa.pub. -# The key fingerprint is: -# 01:0f:f4:3b:ca:85:sd:17:sd:7d:sd:68:9d:sd:a2:sd your_email@example.com -``` - -Later you'll need to pass the `.pub` file's contents to Cloudbreak and use the private part to SSH to the instances.
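Once a cluster is running, connecting to one of its instances might look like the following (the instance address is a placeholder you'd replace with the IP shown on the Cloudbreak or OpenStack UI; `centos` is the SSH username mentioned for these images):

```
ssh -i ~/.ssh/id_rsa centos@<instance-public-ip>
```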
- -## Next steps - -After these prerequisites are done you can move on to create clusters on the [UI](openstack_cb_ui.md) or with the [Shell](openstack_cb_shell.md). diff --git a/docs/periscope.md b/docs/periscope.md deleted file mode 100644 index 1f6a7f291..000000000 --- a/docs/periscope.md +++ /dev/null @@ -1,80 +0,0 @@ -# Auto-Scaling - -The purpose of `auto-scaling` is to apply SLA scaling policies to a Cloudbreak-managed Hadoop cluster. - -> This feature is currently `TECHNICAL PREVIEW`. - -##How It Works - -The auto-scaling capability is based on [Ambari Metrics](https://cwiki.apache.org/confluence/display/AMBARI/Metrics) - and [Ambari Alerts](https://cwiki.apache.org/confluence/display/AMBARI/Alerts). Based on the Blueprint -used and the running services, Cloudbreak can access all the available metrics from the subsystem and define `alerts` based on this information. - -Besides the default Ambari Metrics, Cloudbreak includes two custom metrics: `Pending YARN containers` and `Pending applications`. These two custom metrics work with the YARN subsystem in order to bring `application` level QoS to the cluster. - -> In order to use the `autoscaling` feature with Cloudbreak you will have to enable it from the UI or shell. - -![](/images/enable_periscope.png) - -####Alerts - -Auto-scaling supports two **Alert** types: `metric` and `time` based. - -**Metric-based Alerts** - -Metric-based alerts use the default (or custom) Ambari metrics. These metrics have a default `Threshold` value configured in Ambari - nevertheless these thresholds can be changed in Ambari. In order to change the default threshold for a metric please go to the Ambari UI and select the `Alerts` tab and the metric. The values can be changed in the `Threshold` section. - -![](/images/ambari_threshold.png) - -Metric alerts have a few configurable fields. - -* `alert name` - name of the alert -* `description` - description of the alert -* `metric - desired state` - the Ambari metrics based on the installed services and their *state* (OK, WARN, CRITICAL), based on the *threshold* value -* `period` - for how many *minutes* the metric state has to be sustained in order for an alert to be triggered - -![](/images/metric_alert.png) - -**Time-based Alerts** - -Time-based alerts are based on `cron` expressions and allow alerts to be triggered based on time. - -Time alerts have a few configurable fields. - -* `alert name` - name of the alert -* `description` - description of the alert -* `time zone` - the time zone -* `cron expression` - the *cron* expression to be used for the alert - -![](/images/time_alert.png) - - -####Scaling Policies -Scaling is the ability to increase or decrease the capacity of the Hadoop cluster or application based on an alert. -When scaling policies are used, the capacity is automatically increased or decreased according to the conditions defined. -Cloudbreak does the heavy lifting: based on the alerts and the scaling policy linked to them, it executes the associated policy. The scaling granularity is at the `hostgroup` level - thus you have the option to scale services or components only, not the whole cluster. - -Scaling policies have a few configurable fields.
- -* `policy name` - name of the scaling policy -* `scaling adjustment` - the number of added or removed nodes based on `node count` (the number of nodes), `percentage` (computed percentage adjustment based on the cluster size) and `exact` (a given exact size of the cluster) -* `host group` - the Ambari hostgroup to be scaled -* `alert` - the triggered alert based on which the scaling policy applies - -![](/images/policy.png) - -####Cluster Scaling Configuration - -An SLA scaling policy can contain multiple alerts. When an alert is triggered a `scaling adjustment` is applied; however, to keep the cluster size within boundaries a `cluster size min.` and `cluster size max.` is attached to the cluster - thus a scaling policy can never over- or undersize a cluster. Also, in order to avoid stressing the cluster we have introduced a `cooldown time` period (minutes) - even though an alert is raised and there is an associated scaling policy, the system will not apply the policy within the configured timeframe. In an SLA scaling policy the triggered rules are applied in order. - -* `cooldown time` - period (minutes) between two scaling events while the cluster is locked from adjustments -* `cluster size min.` - size will never go under the minimum value, despite scaling adjustments -* `cluster size max.` - size will never go above the maximum value, despite scaling adjustments - -![](/images/scaling_config.png) - -**Downscale Scaling Considerations** - -Cloudbreak auto-scaling will try to keep a healthy cluster, thus it does several background checks during `downscale`. - -* We never remove `Application master nodes` from a cluster. In order to make sure that a node running an AM is not removed, Cloudbreak has to be able to access the YARN Resource Manager - when creating a cluster using the `default` secure network template please make sure that the RM's port is open on the node -* In order to keep a healthy HDFS during downscale we always keep the configured `replication` factor and make sure there is enough `space` on HDFS to rebalance data. Also, during downscale, in order to minimize the rebalancing, replication and HDFS storms, we check block locations and compute the least costly operations. diff --git a/docs/recipes.md b/docs/recipes.md deleted file mode 100644 index 9532eda39..000000000 --- a/docs/recipes.md +++ /dev/null @@ -1,482 +0,0 @@ -#Recipes - -With the help of Cloudbreak it is very easy to provision Hadoop clusters in the cloud from an Apache Ambari blueprint. Cloudbreak's built-in provisioning doesn't cover every use case, so we are introducing the concept of recipes. - -Recipes are basically script extensions to a cluster that run on a set of nodes before or after the Ambari cluster installation. With recipes it's quite easy for example to put a JAR file on the Hadoop classpath or run some custom scripts. - -Cloudbreak supports two ways to configure recipes: downloadable and stored recipes. - -##Stored recipes - -As the name suggests, stored recipes are uploaded and stored in Cloudbreak via the web interface or the shell. - -The easiest way to create a custom recipe: - - * create your own pre and/or post scripts - * upload them via the shell or the web interface - -###Add recipe - -On the web interface you can create a new recipe under the "manage recipes" section. Please select the SCRIPT or FILE type plugin and fill in the other required fields.
- -To add a recipe via the shell use the following command: - -``` -recipe store --name [recipe-name] --executionType [ONE_NODE|ALL_NODES] --preInstallScriptFile /path/of/the/pre-install-script --postInstallScriptFile /path/of/the/post-install-script -``` - -This command has optional parameters: - -`--description` "string" description of the recipe - -`--timeout` "integer" timeout of the script execution - -`--publicInAccount` "flag" flags if the recipe is public in the account - -In the background Cloudbreak pushes the recipe to the Consul key/value store during cluster creation. - -**Note** Stored recipes have a size limitation: because they are stored in the Consul key/value store, the base64 encoded content of the scripts must be less than 512kB. - -##Downloadable recipes - -A downloadable recipe should be available over HTTP or HTTPS (optionally with basic authentication), or in any kind of public Git repository. - -This kind of recipe must contain a plugin.toml file, with some basic information about the recipe. Besides this it must contain at least a recipe-pre-install or a recipe-post-install script. - -Content of plugin.toml: - -``` -[plugin] -name = "[recipe-name]" -description = "[description-of-the-recipe]" -version = "1.0" -maintainer_name = "[maintainer-name]" -maintainer_email = "[maintainer-email]" -website_url = "[website-url]" -``` - -Pre- and post-install scripts are regular shell scripts and must be executable. - -To configure a recipe or recipe group in Cloudbreak you have to create a descriptive JSON file and send it to Cloudbreak via the shell. On the web interface you don't need to take care of this file. -``` -{ - "name": "[recipe-name]", - "description": "[description-of-the-recipe]", - "properties": { - "[key]": "[value]" - }, - "plugins": { - "git://github.com/account/recipe.git": "ONE_NODE", - "http://user:password@mydomain.com/my-recipe.tar": "ALL_NODES", - "https://mydomain.com/my-recipe.zip": "ALL_NODES" - } -} -``` - -At this point we need to understand some elements of the JSON above. - -First of all, `properties`. Properties are saved to the Consul key/value store, and they are available from the pre or post script by fetching http://localhost:8500/v1/kv/[key]?raw. The limitation of the value's base64 representation is 512kB. This option is a good choice if you want to write reusable recipes. - -The next one is `plugins`. As mentioned before, we support a few kinds of protocols, and each of them has its own limitations: - - * Git - * the Git repository must be public (or available from the cluster) - * the recipe files must be in the root of the repository - * only the repository's default branch is supported; there is no way to check out a different branch - - * HTTP(S) - * with these protocols you have to bundle your recipe into a tar or zip file - * basic authentication is the only way to protect the recipe from public access - -The last one is the execution type of the recipe. We support two options: - - * ONE_NODE means the recipe will execute on only one node in the hostgroup - * ALL_NODES runs on every single instance in the hostgroup. - -###Add recipe - -On the web interface please select the URL type plugin and fill in the other required fields. - -To add a recipe via the shell use the command(s) below: - -``` -recipe add --file /path/of/the/recipe/json -``` -or -``` -recipe add --url http(s)://mydomain.com/my-recipe.json -``` - -The add command has an optional parameter - -`--publicInAccount` if checked, all the users belonging to your account will be able to use this recipe to create clusters, but cannot delete it.
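As a sketch of what such a script can look like, the following `recipe-post-install` example reads a property stored under the hypothetical key `my.jar.url` from the local Consul key/value store (via the `?raw` endpoint mentioned above) and puts the downloaded JAR on the Hadoop classpath; both the key name and the target directory are illustrative:

```
#!/bin/bash
# recipe-post-install (sketch)

# read a property that Cloudbreak stored in the Consul key/value store
JAR_URL=$(curl -s http://localhost:8500/v1/kv/my.jar.url?raw)

# download the JAR and place it on the Hadoop classpath (directory is illustrative)
curl -s -o /usr/hdp/current/hadoop-client/lib/my-extension.jar "$JAR_URL"
```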
- -## Sample recipe for Ranger - -To be able to install Ranger from a blueprint, a database must be running when Ambari starts to install Ranger Admin. With Cloudbreak a database can be configured and started from a recipe. We've created a sample recipe that can be used to initialize and start a PostgreSQL database that will be able to accept connections from Ranger and store its data. Add the `ONE_NODE` recipe from [this URL](https://github.com/sequenceiq/consul-plugins-ranger-db.git) on the Cloudbreak UI: - -![](/images/ranger-recipe.png) - -And add this recipe to the same hostgroup where Ranger Admin is installed on the 'Choose Blueprint' when creating a new cluster: - -![](/images/ranger-hostgroup.png) - -Ranger installation also has some required properties that must be added to the blueprint. We've created a sample one-node blueprint with the necessary configurations to install Ranger Admin and Ranger Usersync. The configuration values in this blueprint match the sample recipe above - they are set to use a PostgreSQL database on the same host where Ranger Admin is installed. Usersync is configured to use UNIX as the authentication method and it should also be installed on the same host where Ranger Admin is installed. - -``` -{ - "configurations": [ - { - "ranger-site": { - "properties_attributes": {}, - "properties": {} - } - }, - { - "ranger-hdfs-policymgr-ssl": { - "properties_attributes": {}, - "properties": { - "xasecure.policymgr.clientssl.keystore": "/etc/hadoop/conf/ranger-plugin-keystore.jks", - "xasecure.policymgr.clientssl.keystore.credential.file": "jceks://file{{credential_file}}", - "xasecure.policymgr.clientssl.truststore": "/etc/hadoop/conf/ranger-plugin-truststore.jks", - "xasecure.policymgr.clientssl.truststore.credential.file": "jceks://file{{credential_file}}" - } - } - }, - { - "ranger-ugsync-site": { - "properties_attributes": {}, - "properties": { - "ranger.usersync.enabled": "true", - "ranger.usersync.filesource.file": "/tmp/usergroup.txt", - "ranger.usersync.filesource.text.delimiter": ",", - "ranger.usersync.group.memberattributename": "member", - "ranger.usersync.group.nameattribute": "cn", - "ranger.usersync.group.objectclass": "groupofnames", - "ranger.usersync.group.searchbase": "ou=groups,dc=hadoop,dc=apache,dc=org", - "ranger.usersync.group.searchenabled": "false", - "ranger.usersync.group.searchfilter": "empty", - "ranger.usersync.group.searchscope": "sub", - "ranger.usersync.group.usermapsyncenabled": "false", - "ranger.usersync.ldap.bindalias": "testldapalias", - "ranger.usersync.ldap.binddn": "cn=admin,dc=xasecure,dc=net", - "ranger.usersync.ldap.bindkeystore": "-", - "ranger.usersync.ldap.groupname.caseconversion": "lower", - "ranger.usersync.ldap.searchBase": "dc=hadoop,dc=apache,dc=org", - "ranger.usersync.ldap.url": "ldap://localhost:389", - "ranger.usersync.ldap.user.groupnameattribute": "memberof, ismemberof", - "ranger.usersync.ldap.user.nameattribute": "cn", - "ranger.usersync.ldap.user.objectclass": "person", - "ranger.usersync.ldap.user.searchbase": "ou=users,dc=xasecure,dc=net", - "ranger.usersync.ldap.user.searchfilter": "empty", - "ranger.usersync.ldap.user.searchscope": "sub", - "ranger.usersync.ldap.username.caseconversion": "lower", - "ranger.usersync.logdir": "/var/log/ranger/usersync", - "ranger.usersync.pagedresultsenabled": "true", - "ranger.usersync.pagedresultssize": "500", - "ranger.usersync.policymanager.baseURL": "{{ranger_external_url}}", - "ranger.usersync.policymanager.maxrecordsperapicall": "1000", - 
"ranger.usersync.policymanager.mockrun": "false", - "ranger.usersync.port": "5151", - "ranger.usersync.sink.impl.class": "org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder", - "ranger.usersync.sleeptimeinmillisbetweensynccycle": "5", - "ranger.usersync.source.impl.class": "org.apache.ranger.unixusersync.process.UnixUserGroupBuilder", - "ranger.usersync.ssl": "true", - "ranger.usersync.unix.minUserId": "500" - } - } - }, - { - "admin-properties": { - "properties_attributes": {}, - "properties": { - "DB_FLAVOR": "POSTGRES", - "SQL_COMMAND_INVOKER": "psql", - "SQL_CONNECTOR_JAR": "/var/lib/ambari-agent/tmp/postgres-jdbc-driver.jar", - "audit_db_name": "ranger_audit", - "audit_db_user": "rangerlogger", - "db_host": "localhost:5432", - "db_name": "ranger", - "db_root_user": "postgres", - "db_root_password": "admin", - "db_user": "rangeradmin", - "policymgr_external_url": "http://localhost:6080", - "ranger_jdbc_connection_url": "jdbc:postgresql://{db_host}/ranger", - "ranger_jdbc_driver": "org.postgresql.Driver" - } - } - }, - { - "ranger-admin-site": { - "properties_attributes": {}, - "properties": { - "ranger.audit.source.type": "db", - "ranger.authentication.method": "UNIX", - "ranger.credential.provider.path": "/etc/ranger/admin/rangeradmin.jceks", - "ranger.externalurl": "{{ranger_external_url}}", - "ranger.https.attrib.keystore.file": "/etc/ranger/admin/keys/server.jks", - "ranger.jpa.audit.jdbc.credential.alias": "rangeraudit", - "ranger.jpa.audit.jdbc.dialect": "{{jdbc_dialect}}", - "ranger.jpa.audit.jdbc.driver": "{{jdbc_driver}}", - "ranger.jpa.audit.jdbc.url": "{{audit_jdbc_url}}", - "ranger.jpa.audit.jdbc.user": "{{ranger_audit_db_user}}", - "ranger.jpa.jdbc.credential.alias": "rangeradmin", - "ranger.jpa.jdbc.dialect": "{{jdbc_dialect}}", - "ranger.jpa.jdbc.driver": "org.postgresql.Driver", - "ranger.jpa.jdbc.url": "jdbc:postgresql://localhost:5432/ranger", - "ranger.jpa.jdbc.user": "{{ranger_db_user}}", - "ranger.jpa.jdbc.password": "{{ranger_db_password}}", - "ranger.ldap.ad.domain": "localhost", - "ranger.ldap.ad.url": "ldap://ad.xasecure.net:389", - "ranger.ldap.group.roleattribute": "cn", - "ranger.ldap.group.searchbase": "ou=groups,dc=xasecure,dc=net", - "ranger.ldap.group.searchfilter": "(member=uid={0},ou=users,dc=xasecure,dc=net)", - "ranger.ldap.url": "ldap://localhost:389", - "ranger.ldap.user.dnpattern": "uid={0},ou=users,dc=xasecure,dc=net", - "ranger.service.host": "{{ranger_host}}", - "ranger.service.http.enabled": "true", - "ranger.service.http.port": "6080", - "ranger.service.https.attrib.clientAuth": "false", - "ranger.service.https.attrib.keystore.keyalias": "mkey", - "ranger.service.https.attrib.keystore.pass": "ranger", - "ranger.service.https.attrib.ssl.enabled": "false", - "ranger.service.https.port": "6182", - "ranger.unixauth.remote.login.enabled": "true", - "ranger.unixauth.service.hostname": "localhost", - "ranger.unixauth.service.port": "5151" - } - } - }, - { - "ranger-env": { - "properties_attributes": {}, - "properties": { - "admin_username": "admin", - "create_db_dbuser": "true", - "ranger_admin_log_dir": "/var/log/ranger/admin", - "ranger_admin_username": "amb_ranger_admin", - "ranger_admin_password": "amb_ranger_pw", - "ranger_group": "ranger", - "ranger_jdbc_connection_url": "{{ranger_jdbc_connection_url}}", - "ranger_jdbc_driver": "org.postgresql.Driver", - "ranger_pid_dir": "/var/run/ranger", - "ranger_user": "ranger", - "ranger_usersync_log_dir": "/var/log/ranger/usersync", - "xml_configurations_supported": "true" - } - } - }, - { 
- "ranger-yarn-security": { - "properties_attributes": {}, - "properties": { - "ranger.plugin.yarn.policy.cache.dir": "/etc/ranger/{{repo_name}}/policycache", - "ranger.plugin.yarn.policy.pollIntervalMs": "30000", - "ranger.plugin.yarn.policy.rest.ssl.config.file": "/etc/yarn/conf/ranger-policymgr-ssl.xml", - "ranger.plugin.yarn.policy.rest.url": "{{policymgr_mgr_url}}", - "ranger.plugin.yarn.policy.source.impl": "org.apache.ranger.admin.client.RangerAdminRESTClient", - "ranger.plugin.yarn.service.name": "{{repo_name}}" - } - } - }, - { - "ranger-yarn-audit": { - "properties_attributes": {}, - "properties": { - "xasecure.audit.credential.provider.file": "jceks://file{{credential_file}}", - "xasecure.audit.db.async.max.flush.interval.ms": "30000", - "xasecure.audit.db.async.max.queue.size": "10240", - "xasecure.audit.db.batch.size": "100", - "xasecure.audit.db.is.async": "true", - "xasecure.audit.destination.db": "true", - "xasecure.audit.hdfs.async.max.flush.interval.ms": "30000", - "xasecure.audit.hdfs.async.max.queue.size": "1048576", - "xasecure.audit.destination.hdfs.dir": "/ranger/audit/%app-type%/%time:yyyyMMdd%", - "xasecure.audit.hdfs.config.destination.file": "%hostname%-audit.log", - "xasecure.audit.hdfs.config.destination.flush.interval.seconds": "900", - "xasecure.audit.hdfs.config.destination.open.retry.interval.seconds": "60", - "xasecure.audit.hdfs.config.destination.rollover.interval.seconds": "86400", - "xasecure.audit.hdfs.config.encoding": "", - "xasecure.audit.hdfs.config.local.archive.directory": "/var/log/yarn/audit/archive", - "xasecure.audit.hdfs.config.local.archive.max.file.count": "10", - "xasecure.audit.hdfs.config.local.buffer.directory": "/var/log/yarn/audit", - "xasecure.audit.hdfs.config.local.buffer.file": "%time:yyyyMMdd-HHmm.ss%.log", - "xasecure.audit.hdfs.config.local.buffer.file.buffer.size.bytes": "8192", - "xasecure.audit.hdfs.config.local.buffer.flush.interval.seconds": "60", - "xasecure.audit.hdfs.config.local.buffer.rollover.interval.seconds": "600", - "xasecure.audit.hdfs.is.async": "true", - "xasecure.audit.is.enabled": "true", - "xasecure.audit.jpa.javax.persistence.jdbc.driver": "{{jdbc_driver}}", - "xasecure.audit.jpa.javax.persistence.jdbc.url": "{{audit_jdbc_url}}", - "xasecure.audit.jpa.javax.persistence.jdbc.user": "{{xa_audit_db_user}}", - "xasecure.audit.kafka.async.max.flush.interval.ms": "1000", - "xasecure.audit.kafka.async.max.queue.size": "1", - "xasecure.audit.kafka.broker_list": "localhost:9092", - "xasecure.audit.kafka.is.enabled": "false", - "xasecure.audit.kafka.topic_name": "ranger_audits", - "xasecure.audit.log4j.async.max.flush.interval.ms": "30000", - "xasecure.audit.log4j.async.max.queue.size": "10240", - "xasecure.audit.log4j.is.async": "false", - "xasecure.audit.log4j.is.enabled": "false" - } - } - }, - { - "ranger-hdfs-security": { - "properties_attributes": {}, - "properties": { - "ranger.plugin.hdfs.policy.cache.dir": "/etc/ranger/{{repo_name}}/policycache", - "ranger.plugin.hdfs.policy.pollIntervalMs": "30000", - "ranger.plugin.hdfs.policy.rest.ssl.config.file": "/etc/hadoop/conf/ranger-policymgr-ssl.xml", - "ranger.plugin.hdfs.policy.rest.url": "{{policymgr_mgr_url}}", - "ranger.plugin.hdfs.policy.source.impl": "org.apache.ranger.admin.client.RangerAdminRESTClient", - "ranger.plugin.hdfs.service.name": "{{repo_name}}", - "xasecure.add-hadoop-authorization": "true" - } - } - }, - { - "ranger-yarn-plugin-properties": { - "properties_attributes": {}, - "properties": { - "REPOSITORY_CONFIG_USERNAME": "yarn", - 
"common.name.for.certificate": "-", - "hadoop.rpc.protection": "-", - "policy_user": "ambari-qa", - "ranger-yarn-plugin-enabled": "No" - } - } - }, - { - "ranger-hdfs-audit": { - "properties_attributes": {}, - "properties": { - "xasecure.audit.credential.provider.file": "jceks://file{{credential_file}}", - "xasecure.audit.db.async.max.flush.interval.ms": "30000", - "xasecure.audit.db.async.max.queue.size": "10240", - "xasecure.audit.db.batch.size": "100", - "xasecure.audit.db.is.async": "true", - "xasecure.audit.destination.db": "true", - "xasecure.audit.destination.hdfs.dir": "/ranger/audit/%app-type%/%time:yyyyMMdd%", - "xasecure.audit.hdfs.async.max.flush.interval.ms": "30000", - "xasecure.audit.hdfs.async.max.queue.size": "1048576", - "xasecure.audit.hdfs.config.destination.file": "%hostname%-audit.log", - "xasecure.audit.hdfs.config.destination.flush.interval.seconds": "900", - "xasecure.audit.hdfs.config.destination.open.retry.interval.seconds": "60", - "xasecure.audit.hdfs.config.destination.rollover.interval.seconds": "86400", - "xasecure.audit.hdfs.config.encoding": "", - "xasecure.audit.hdfs.config.local.archive.directory": "/var/log/hadoop/audit/archive/%app-type%", - "xasecure.audit.hdfs.config.local.archive.max.file.count": "10", - "xasecure.audit.hdfs.config.local.buffer.directory": "/var/log/hadoop/audit/%app-type%", - "xasecure.audit.hdfs.config.local.buffer.file": "%time:yyyyMMdd-HHmm.ss%.log", - "xasecure.audit.hdfs.config.local.buffer.file.buffer.size.bytes": "8192", - "xasecure.audit.hdfs.config.local.buffer.flush.interval.seconds": "60", - "xasecure.audit.hdfs.config.local.buffer.rollover.interval.seconds": "600", - "xasecure.audit.hdfs.is.async": "true", - "xasecure.audit.is.enabled": "true", - "xasecure.audit.jpa.javax.persistence.jdbc.driver": "{{jdbc_driver}}", - "xasecure.audit.jpa.javax.persistence.jdbc.url": "{{audit_jdbc_url}}", - "xasecure.audit.jpa.javax.persistence.jdbc.user": "{{xa_audit_db_user}}", - "xasecure.audit.kafka.async.max.flush.interval.ms": "1000", - "xasecure.audit.kafka.async.max.queue.size": "1", - "xasecure.audit.kafka.broker_list": "localhost:9092", - "xasecure.audit.kafka.is.enabled": "false", - "xasecure.audit.kafka.topic_name": "ranger_audits", - "xasecure.audit.log4j.async.max.flush.interval.ms": "30000", - "xasecure.audit.log4j.async.max.queue.size": "10240", - "xasecure.audit.log4j.is.async": "false", - "xasecure.audit.log4j.is.enabled": "false" - } - } - }, - { - "ranger-hdfs-plugin-properties": { - "properties_attributes": {}, - "properties": { - "REPOSITORY_CONFIG_USERNAME": "hadoop", - "common.name.for.certificate": "-", - "hadoop.rpc.protection": "-", - "policy_user": "ambari-qa", - "ranger-hdfs-plugin-enabled": "No" - } - } - }, - { - "usersync-properties": { - "properties_attributes": {}, - "properties": {} - } - } - ], - "host_groups": [ - { - "components": [ - { - "name": "NODEMANAGER" - }, - { - "name": "YARN_CLIENT" - }, - { - "name": "HDFS_CLIENT" - }, - { - "name": "HISTORYSERVER" - }, - { - "name": "METRICS_MONITOR" - }, - { - "name": "NAMENODE" - }, - { - "name": "ZOOKEEPER_CLIENT" - }, - { - "name": "RANGER_ADMIN" - }, - { - "name": "SECONDARY_NAMENODE" - }, - { - "name": "MAPREDUCE2_CLIENT" - }, - { - "name": "ZOOKEEPER_SERVER" - }, - { - "name": "AMBARI_SERVER" - }, - { - "name": "DATANODE" - }, - { - "name": "RANGER_USERSYNC" - }, - { - "name": "APP_TIMELINE_SERVER" - }, - { - "name": "METRICS_COLLECTOR" - }, - { - "name": "RESOURCEMANAGER" - } - ], - "configurations": [], - "name": "host_group_1", - "cardinality": 
"1" - } - ], - "Blueprints": { - "stack_name": "HDP", - "stack_version": "2.3", - "blueprint_name": "ranger-psql-onenode-sample" - } -} -``` - -**Notes** - -- Ranger plugins cannot be enabled by default in a blueprint due to some Ambari restrictions, so properties like `ranger-hdfs-plugin-enabled` must be set to *No* and the plugins must be enabled from the Ambari UI with the checkboxes and by restarting the necessary services. -- If using the UNIX user sync, it may be necessary in some cases to restart the Ranger Usersync Services after the blueprint installation finished if the UNIX users cannot be seen on the Ranger Admin UI. diff --git a/docs/shell.md b/docs/shell.md deleted file mode 100644 index fce3dee5d..000000000 --- a/docs/shell.md +++ /dev/null @@ -1,73 +0,0 @@ -# Cloudbreak Shell - -The goal with the CLI was to provide an interactive command line tool which supports: - -* all functionality available through the REST API or Cloudbreak web UI -* makes possible complete automation of management task via **scripts** -* context aware command availability -* tab completion -* required/optional parameter support -* **hint** command to guide you on the usual path - -## Install and start Cloudbreak shell - -You have a few options to give it a try: - -- [use Cloudreak deployer - **recommended**](#deployer) -- [use our prepared docker image](#dockerimage) -- [build it from source](#fromsource) - - -### Starting cloudbreak shell using cloudbreak deployer - -Start the shell with `cbd util cloudbreak-shell`. This will launch the Cloudbreak shell inside a Docker container and you are ready to start using it. - - -### Starting Cloudbreak shell with our prepared docker image - -You can find the docker image and its documentation [here](https://github.com/sequenceiq/docker-cb-shell). - - -### Build from source - -If want to use the code or extend it with new commands follow the steps below. You will need: -- jdk 1.7 - -``` -git clone https://github.com/sequenceiq/cloudbreak-shell.git -cd cloudbreak-shell -./gradlew clean build -``` - -> **Note** -> In case you use the hosted version of Cloudbreak you should use the `latest-release.sh` to get the right version of the CLI. - -**Start Cloudbreak-shell from the built source** - -``` -Usage: - java -jar cloudbreak-shell-0.5-SNAPSHOT.jar : Starts Cloudbreak Shell in interactive mode. - java -jar cloudbreak-shell-0.5-SNAPSHOT.jar --cmdfile= : Cloudbreak executes commands read from the file. - -Options: - --cloudbreak.address= Address of the Cloudbreak Server [default: https://cloudbreak-api.sequenceiq.com]. - --identity.address= Address of the SequenceIQ identity server [default: https://identity.sequenceiq.com]. - --sequenceiq.user= Username of the SequenceIQ user [default: user@sequenceiq.com]. - --sequenceiq.password= Password of the SequenceIQ user [default: password]. - -Note: - You should specify at least your username and password. -``` -Once you are connected you can start to create a cluster. If you are lost and need guidance through the process you can use `hint`. You can always use TAB for completion. - -> **Note** -> All commands are `context aware` - they are available only when it makes sense - this way you are never confused and guided by the system on the right path. 
- -**Provider specific documentations** - -- [AWS](aws_cb_shell.md) -- [Azure](azure_cb_shell.md) -- [GCP](gcp_cb_shell.md) -- [OpenStack](openstack_cb_shell.md) - -or you can find more detailed documentation about Cloudbreak-shell in its [GitHub repository](https://github.com/sequenceiq/cloudbreak-shell). diff --git a/docs/spi.md b/docs/spi.md deleted file mode 100644 index eb557f264..000000000 --- a/docs/spi.md +++ /dev/null @@ -1,89 +0,0 @@ -#Service Provider Interface (SPI) - -Cloudbreak already supports multiple cloud platforms and provides an easy way to integrate a new provider through [Cloudbreak's Service Provider Interface (SPI)](https://github.com/sequenceiq/cloudbreak/tree/master/cloud-api), which is a plugin mechanism to enable a seamless integration of any cloud provider. - -The SPI plugin mechanism has been used to integrate all existing providers into Cloudbreak, therefore if a new provider is integrated it immediately becomes a first class citizen in Cloudbreak. - - * [cloud-aws](https://github.com/sequenceiq/cloudbreak/tree/master/cloud-aws) module integrates Amazon Web Services - * [cloud-gcp](https://github.com/sequenceiq/cloudbreak/tree/master/cloud-gcp) module integrates Google Cloud Platform - * [cloud-arm](https://github.com/sequenceiq/cloudbreak/tree/master/cloud-arm) module integrates Microsoft Azure - * [cloud-openstack](https://github.com/sequenceiq/cloudbreak/tree/master/cloud-openstack) module integrates OpenStack - -The SPI interface is event based, scales well and is decoupled from Cloudbreak. The core of Cloudbreak communicates with providers through an [EventBus](http://projectreactor.io/), but the complexity of event handling is hidden from the provider implementation. - -> Use of the SPI is currently `TECHNICAL PREVIEW`. - -##Resource management - -There are two kinds of deployment/resource management methods supported by cloud providers: - -* template based deployments -* individual resource based deployments - -Cloudbreak's SPI supports both ways of resource management. It provides well-defined interfaces, abstract classes and helper classes (such as scheduling and polling of resources) to aid the integration and to avoid boilerplate code in the cloud provider's module. - -##Template based deployments - -Providers with template based deployments like [AWS CloudFormation](https://aws.amazon.com/cloudformation/), [Azure ARM](https://azure.microsoft.com/en-us/documentation/articles/resource-group-overview/#) or [OpenStack Heat](https://wiki.openstack.org/wiki/Heat) have the ability to create and manage a collection of related cloud resources, provisioning and updating them in an orderly and predictable fashion. This means that Cloudbreak needs a reference to the template itself and every change in the infrastructure (e.g. creating a new instance or deleting one) is managed through this templating mechanism. - -If the provider has templating support then the provider's [gradle](http://gradle.org/) module shall depend on the [cloud-api](https://github.com/sequenceiq/cloudbreak/tree/master/cloud-api) module.
- -``` -apply plugin: 'java' - -sourceCompatibility = 1.7 - -repositories { - mavenCentral() -} - -jar { - baseName = 'cloud-new-provider' -} - -dependencies { - - compile project(':cloud-api') - -} -``` - -The entry point of the provider is the [CloudConnector](https://github.com/sequenceiq/cloudbreak/blob/master/cloud-api/src/main/java/com/sequenceiq/cloudbreak/cloud/CloudConnector.java) interface and every interface that needs to be implemented is reachable through this interface. - -##Individual resource based deployments - -For providers like GCP that do not support a suitable templating mechanism, or for customisable providers like OpenStack where the Heat orchestration (templating) component is optional, the individual resources need to be handled separately. This means that resources like networks, disks and compute instances need to be created and managed with an ordered sequence of API calls, and Cloudbreak shall provide a solution to manage the collection of related cloud resources together. - -If the provider has no templating support then the provider's [gradle](http://gradle.org/) module shall depend on the [cloud-template](https://github.com/sequenceiq/cloudbreak/tree/master/cloud-template) module, which includes a Cloudbreak-defined abstract template. This template is a set of abstract and utility classes to support provisioning and updating related resources in an orderly and predictable manner through ordered sequences of cloud API calls. - -``` -apply plugin: 'java' - -sourceCompatibility = 1.7 - -repositories { - mavenCentral() -} - -jar { - baseName = 'cloud-new-provider' -} - -dependencies { - - compile project(':cloud-template') - -} -``` - -##Variants - -OpenStack is very modular and allows installing different components for e.g. volume storage or networking (e.g. Nova networking or Neutron), and some components like Heat may not be installed at all. - -Cloudbreak's SPI reflects this flexibility using so-called variants. This means that if some part of a cloud provider (typically OpenStack) uses a different component, you don't need to re-implement the complete stack - you just use a different variant and re-implement the part that differs. - -The reference implementation for this feature can be found in the [cloud-openstack](https://github.com/sequenceiq/cloudbreak/tree/master/cloud-openstack) module, which supports HEAT and NATIVE variants. The HEAT variant utilizes Heat templating to launch a stack, while the NATIVE variant starts the cluster by using a sequence of API calls without Heat to achieve the same result; both of them use the same authentication and credential management. - -##Development - -In order to set up a development environment please take a look at the [Local Development Setup](https://github.com/sequenceiq/cloudbreak/blob/master/docs/dev/development.md) documentation.