CBF2 - Community Build Framework 2.0
It's not community only; You don't actually build anything; But still rocks!
The goal of this project is to quickly spin up a working Pentaho server in Docker containers. It also provides script utilities to get the client tools.
- A system with docker
- A decent shell; either Linux or Mac should work out of the box, Cygwin should as well
For docker, please follow the instructions for your specific operating system. Docker has clients for the main operating systems.
How to use
There are a few utilities here:
- getBinariesFromBox.sh - Connects to box and builds the main images for the servers (requires access to box. Later I'll do something that doesn't require that)
- cbf2.sh - What you need to use to build the images
- getClients.sh - A utility to get the client tools
- startClients.sh - A utility to start the client tools
The software directory
This is the main starting point. If you're a Pentaho employee you can use the getBinariesFromBox.sh script, but the rest of the world can still use this by manually putting the files here.
You should put the official software files under the software/v.v.v directory. It's very important that you follow this three-number representation.
This works for both CE and EE. This actually works better for EE, since you can also put the patches there and they will be processed.
For EE, you should use the official -dist.zip artifacts. For CE, use the normal .zip file.
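As a rough illustration of the naming rule above, here is a minimal sketch that checks a directory name against the three-number form before deciding where an artifact goes. The helper names `is_release_dir` and `target_dir` are hypothetical and not part of CBF2, and the check only covers released versions, not QAT-style build directories.

```shell
#!/bin/bash
# Hypothetical helper (not part of CBF2): succeed only if the name is in
# the three-number v.v.v form used for released versions.
is_release_dir() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Hypothetical helper: print the directory an artifact should be copied to.
target_dir() {
  if is_release_dir "$1"; then
    echo "software/$1"
  else
    echo "error: '$1' is not in v.v.v form" >&2
    return 1
  fi
}

target_dir "6.0.1"   # prints software/6.0.1
```

With this in place, something like `cp some-artifact.zip "$(target_dir 6.0.1)"/` would drop a file in the expected location once the directory exists.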
The licenses directory
For EE, just place the *.lic license files on the licenses subdirectory. They will be installed on the images for EE builds.
Name the version directory X.X.X, and inside drop the server, plugins and patches
Drop the build artifacts directly in that directory
    software/
    ├── 5.2.1
    │   ├── SP201502-5.2.zip
    │   ├── biserver-ee-220.127.116.11-148-dist.zip
    │   ├── paz-plugin-ee-18.104.22.168-148-dist.zip
    │   ├── pdd-plugin-ee-22.214.171.124-148-dist.zip
    │   └── pir-plugin-ee-126.96.36.199-148-dist.zip
    ├── 5.4.0
    │   └── biserver-ce-188.8.131.52-128.zip
    ├── 5.4.1
    │   ├── SP201603-5.4.zip
    │   └── biserver-ee-184.108.40.206-169-dist.zip
    ├── 6.0.1
    │   ├── SP201601-6.0.zip
    │   ├── SP201602-6.0.zip
    │   ├── SP201603-6.0.zip
    │   ├── biserver-ce-220.127.116.11-386.zip
    │   ├── biserver-ee-18.104.22.168-386-dist.zip
    │   ├── paz-plugin-ee-22.214.171.124-386-dist.zip
    │   ├── pdd-plugin-ee-126.96.36.199-386-dist.zip
    │   └── pir-plugin-ee-188.8.131.52-386-dist.zip
    ├── 6.1-QAT-153
    │   ├── biserver-ee-6.1-qat-153-dist.zip
    │   ├── biserver-merged-ce-6.1-qat-153.zip
    │   ├── paz-plugin-ee-6.1-qat-153-dist.zip
    │   ├── pdd-plugin-ee-6.1-qat-153-dist.zip
    │   └── pir-plugin-ee-6.1-qat-153-dist.zip
    ├── 7.0-QAT-76
    │   ├── biserver-merged-ee-7.0-QAT-76-dist.zip
    │   ├── pdd-plugin-ee-7.0-QAT-76-dist.zip
    │   └── pir-plugin-ee-7.0-QAT-76-dist.zip
    └── README.txt
CBF2: The main thing
CBF1 was an Ant script but CBF2 is a bash script. So yeah, you want cbf2.sh. If you are on Windows... well, not sure I actually care, but you should be able to just use Cygwin.
Here's what you'll see when you run ./cbf2.sh:
    --------------------------------------------------------------
    --------------------------------------------------------------
    ------ CBF2 - Community Build Framework 2 -------
    ------ Version: 0.9 -------
    ------ Author: Pedro Alves (email@example.com) -------
    --------------------------------------------------------------
    --------------------------------------------------------------

    Core Images available:
    ----------------------

     baserver-ee-184.108.40.206-169
     baserver-ee-220.127.116.11-386
     baserver-merged-ce-6.1-qat-153
     baserver-merged-ee-18.104.22.168-192

    Core containers available:
    --------------------------

     (Stopped): baserver-ee-22.214.171.124-169-debug

    Project images available:
    -------------------------

     pdu-project-nasa-samples-baserver-ee-126.96.36.199-169
     pdu-project-nasa-samples-baserver-merged-ee-188.8.131.52-192

    Project containers available:
    -----------------------------

     (Running): pdu-project-nasa-samples-baserver-ee-184.108.40.206-169-debug
     (Stopped): pdu-project-nasa-samples-baserver-merged-ee-220.127.116.11-192-debug

    > Select an entry number, [A] to add new image or [C] to create new project:
There are 4 main concepts here:
- Core images
- Core containers
- Project images
- Project containers
These should be straightforward to understand if you're familiar with Docker, but in a nutshell there are two fundamental concepts: images and containers. An image is an inert, immutable file. A container is an instance of an image, and it's the container that runs and allows us to access the Pentaho platform.
Accessing the platform
When we run the container, it exposes a few ports, most importantly 8080. So in order to see Pentaho running all we need to do is to access the machine where docker is running. This part may vary depending on the operating system; On a Mac, and using docker-machine, there's a separate VM running the things, so I'm able to access the platform by using the following URL:
These are the core images - a clean install out of one of the available artifacts that are provided on the software directory. So the first thing we should do is add a core image. The option [A] allows us to select which image to add from an official distribution archive.
When we select this option, we are prompted to choose the version we want to build:
    > Select an entry number, [A] to add new image or [C] to create new project: A

    Servers found on the software dir:

     : biserver-ee-18.104.22.168-148-dist.zip
     : biserver-ce-22.214.171.124-128.zip
     : biserver-ee-126.96.36.199-169-dist.zip
     : biserver-ce-188.8.131.52-386.zip
     : biserver-ee-184.108.40.206-386-dist.zip
     : biserver-ee-6.1-qat-153-dist.zip
     : biserver-merged-ce-6.1-qat-153.zip
     : biserver-merged-ee-7.0-QAT-76-dist.zip
CBF2 knows how to handle EE dist files: you'll be presented with the EULA, patches will be automatically processed and licenses will be installed.
Once an image is built, selecting that core image's number gives you the option to launch a new container or delete the image:
    > Select an entry number, [A] to add new image or [C] to create new project: 0

    You selected the image baserver-ee-220.127.116.11-386

    > What do you want to do? (L)aunch a new container or (D)elete the image? [L]:
You can launch a container from a core image. This will allow us to explore a completely clean version of the image you selected. This is useful for some tests, but I'd say the big value would come out of the project images. Here are the options available over containers:
    > Select an entry number, [A] to add new image or [C] to create new project: 3

    You selected the container baserver-merged-ce-6.1-qat-153-debug

    The container is running; Possible operations:

     S: Stop it
     R: Restart it
     A: Attach to it
     L: See the Logs

    What do you want to do? [A]:

Briefly, here's what the options mean - even though they should be relatively straightforward:
- Stop it: Stops the container. When the container is stopped you'll be able to delete the container or start it again
- Restart it: Guess what? It restarts it. Surprising, hein? :)
- Attach to it: Attaches to the docker container. You'll then have a bash shell and you'll be able to play with the server
- See the Logs: Gets the logs from the server
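For readers who think in plain Docker terms, the options above can be sketched as a mapping to standard `docker` subcommands. This mapping is an assumption for illustration and is not taken from cbf2.sh; the function name `docker_command_for` is hypothetical, and it only prints the command instead of running it.

```shell
#!/bin/bash
# Sketch only: print the docker command that roughly corresponds to each
# menu option. The mapping is an assumption, not read from cbf2.sh.
docker_command_for() {
  option="$1"
  container="$2"
  case "$option" in
    S) echo "docker stop $container" ;;
    R) echo "docker restart $container" ;;
    A) echo "docker exec -it $container bash" ;;
    L) echo "docker logs -f $container" ;;
    *) echo "unknown option: $option" >&2; return 1 ;;
  esac
}

docker_command_for L baserver-merged-ce-6.1-qat-153-debug
```

Printing the command keeps the sketch safe to run on a machine without Docker; on a real host you could run the printed line by hand.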
CBF2 allows you to run multiple containers at the same time. If some exposed port is already in use in the host by some service, CBF2 will look for a new free port and use it.
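The fallback to a free port can be pictured with a small sketch. It is not CBF2's actual logic: the helper `next_free_port` is hypothetical, and the list of busy ports is passed in as arguments so the example stays self-contained, whereas a real check would have to ask the host which ports are in use.

```shell
#!/bin/bash
# Hypothetical helper: print the first port >= $1 that does not appear in
# the list of busy ports given as the remaining arguments.
next_free_port() {
  candidate="$1"
  shift
  while :; do
    in_use=0
    for busy in "$@"; do
      if [ "$busy" -eq "$candidate" ]; then
        in_use=1
        break
      fi
    done
    if [ "$in_use" -eq 0 ]; then
      break
    fi
    candidate=$((candidate + 1))
  done
  echo "$candidate"
}

next_free_port 8080 8080 8081   # prints 8082
```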
The list of default exposed ports is defined in the setPorts.sh file.
To add, change or remove default exposed ports globally, do the following:
- Edit the setPorts.sh file
- Make the changes in the PORTS list, at the top. Each line represents a port to be exposed, composed of a unique name and the default port used by the service inside the container.
To expose additional ports per project, do the following:
- Create/edit the file setPorts.sh in the cbf2/projects/<projectName>/config folder
- Define in the setPorts.sh file the list of the new ports to be exposed
Use the following sample for the cbf2/projects/<projectName>/config/setPorts.sh:
    #!/bin/bash

    # Mapping ports for the project
    PROJ_PORTS=(
      "mysqlPort:3306"
      "nodeWsPort:13536"
    )
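To make the "name:port" format concrete, here is a minimal sketch that splits each entry of the sample array and prints a matching `docker run` publish flag. How CBF2 itself consumes PROJ_PORTS is not shown in this document, so the function `list_port_flags` and the choice of publishing on the same host port number are assumptions.

```shell
#!/bin/bash
# Same array as the sample setPorts.sh above.
PROJ_PORTS=(
  "mysqlPort:3306"
  "nodeWsPort:13536"
)

# Hypothetical helper: split each "name:port" entry and print a publish
# flag. Assumption: the host port equals the container port.
list_port_flags() {
  local entry name port
  for entry in "${PROJ_PORTS[@]}"; do
    name="${entry%%:*}"   # text before the colon
    port="${entry##*:}"   # text after the colon
    printf '%s -> -p %s:%s\n' "$name" "$port" "$port"
  done
}

list_port_flags
```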
Mounting docker volumes
CBF2 allows you to mount Docker volumes as well.
To configure new Docker volumes, do the following:
- Create in the host the folder(s) to be mounted inside the container
- Create/edit the file dockerVolumes.sh in the cbf2/projects/<projectName>/config folder
- Define in the dockerVolumes.sh file the volumes to be mounted
Use the following sample to mount two folders:
    #!/bin/bash

    # Docker volumes mapping
    # "host_folder:container_folder"
    VOLUMES=(
      "/tmp/volumes/folder1:/folder1"
      "/tmp/volumes/folder2:/folder2"
    )
NOTE: To deal with folder permission issues, read the Docker manual.
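The first step in the list (creating the host folders) can be scripted from the same array. The sketch below is an illustration only: `prepare_volumes` is a hypothetical helper, a scratch directory stands in for /tmp/volumes, and the printed `-v` flags are the standard Docker bind-mount syntax, not output from CBF2.

```shell
#!/bin/bash
# Assumption: a scratch directory stands in for /tmp/volumes.
base="$(mktemp -d)"
VOLUMES=(
  "$base/folder1:/folder1"
  "$base/folder2:/folder2"
)

# Hypothetical helper: create each host folder and print the matching
# docker -v flag for the "host_folder:container_folder" entry.
prepare_volumes() {
  local entry host
  for entry in "${VOLUMES[@]}"; do
    host="${entry%%:*}"   # host side of the mapping
    mkdir -p "$host"
    echo "-v $entry"
  done
}

prepare_volumes
```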
Definition and structure
A project is built on top of a core image. Instead of being a clean install it's meant to replicate a real project's environment. As a best practice, it should also have a well defined structure that can be stored on a VCS repository.
Projects should be cloned / checked out into the projects directory. I recommend every project to be versioned in a different git or svn repository. Here's the structure that I have:

    pedro@orion:~/tex/pentaho/cbf2 (master *) $ tree -l ./projects/
    ./projects/
    └── project-nasa-samples -> ../../project-nasa-samples/
        ├── _dockerfiles
        └── solution
            └── public
                ├── Mars_Photo_Project
                │   ├── Mars_Photo_Project.cda
                │   ├── Mars_Photo_Project.cdfde
                │   ├── Mars_Photo_Project.wcdf
                │   ├── css
                │   │   └── styles.css
                │   ├── img
                │   │   └── nasaicon.png
                │   └── js
                │       └── functions.js
                ├── exportManifest.xml
                └── ktr
                    ├── NASA\ API\ KEY.txt
                    ├── curiosity.ktr
                    ├── getPages.ktr
                    └── mars.ktr
All the solution files are going to be automatically imported, including metadata for datasources creation.
The directory _dockerfiles is a special one; you can override the default Dockerfile that's used to build a project image (the file in dockerfiles/buildProject/Dockerfile) by dropping a project-specific Dockerfile in that directory, using the default one as an example. Note that you should not change the FROM line, as it will be dynamically replaced. This is what you want for project-level configurations, like installing / restoring a specific database, putting an Apache server in front, or any fine-tuned configuration.
The first thing that we need to do is to create a project. Doing that is very simple: we select one of the projects in our projects directory and a core image to install it against. This separation aims at really simplifying upgrades / tests / etc.
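To picture what "dynamically replaced" could mean for the FROM line, here is a minimal sketch using `sed`. CBF2's real mechanism is not shown in this document, so treat this as an assumption; `set_base_image` is a hypothetical helper, and it expects an image name without `|` or `&` characters.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the FROM line of a Dockerfile read from
# stdin so that it points at the chosen core image.
set_base_image() {
  sed -E "s|^FROM[[:space:]].*|FROM $1|"
}

# Demo on a two-line Dockerfile passed through stdin.
printf 'FROM placeholder\nRUN echo "project setup"\n' \
  | set_base_image baserver-merged-ce-6.1-qat-153
```

Only the FROM line changes; every other instruction in the project-specific Dockerfile passes through untouched, which is why the rest of the file is the right place for project-level setup.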
    > Select an entry number, [A] to add new image or [C] to create new project: C

    Choose a project to build an image for:

     project-nasa-samples

    > Choose project: 0

    Select the image to use for the project

     baserver-ee-18.104.22.168-386
     baserver-merged-ce-6.1-qat-153
     baserver-merged-ee-22.214.171.124-192

    > Choose image: 2
Once we have the project image created, we have access to the same options we had for the core images, which is basically launching a container or deleting the image.
As with the images, project containers work much like core containers, but with two extra options available:
- Export the solution: Exports the solution to our project folder
- Import the solution: Imports the solution from our project folder into the running containers. This would be equivalent to rebuilding the image
Note that by design CBF2 only exports the folders in public that are already part of the project. You'll need to manually create the directory if you add a top level one.
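Creating the missing top-level directory is a one-liner. The sketch below wraps it in a hypothetical helper, `add_public_folder`, and runs it against a scratch directory; the folder name `New_Dashboards` is made up for the example, and the solution/public layout follows the project tree shown earlier.

```shell
#!/bin/bash
# Hypothetical helper: create a new top-level folder under a project's
# solution/public directory so that a later export can pick it up.
add_public_folder() {
  project_dir="$1"
  folder="$2"
  mkdir -p "$project_dir/solution/public/$folder"
}

# Demo against a scratch directory instead of a real project checkout.
demo="$(mktemp -d)"
add_public_folder "$demo/project-nasa-samples" "New_Dashboards"
ls "$demo/project-nasa-samples/solution/public"
```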
The client tools
This also provides two utilities to handle the client tools; One of them, the getClients.sh, is probably something you can't use since it's for internal pentaho people only.
The other one, startClients.sh, may be more useful; It requires the client tools to be downloaded into a dir called clients/ with a certain structure:
    pedro@orion:~/tex/pentaho/cbf2 (master *) $ tree -L 4 clients/
    clients/
    ├── pad-ce
    │   └── 126.96.36.199
    ├── pdi-ce
    │   ├── 6.1-QAT
    │   │   └── 156
    │   │       └── data-integration
    │   ├── 188.8.131.52
    │   │   └── 192
    │   │       └── data-integration
    │   └── 7.0-QAT
    │       └── 57
    │           └── data-integration
    ├── pdi-ee-client
    │   └── 184.108.40.206
    │       └── 192
    │           ├── data-integration
    │           ├── jdbc-distribution
    │           └── license-installer
    ├── pme-ce
    │   └── 220.127.116.11
    │       └── 182
    │           └── metadata-editor
    ├── prd-ce
    │   └── 18.104.22.168
    │       └── 182
    │           └── report-designer
    └── psw-ce
        └── 22.214.171.124
If you use this, then the startClients.sh simplifies launching them; Note that, unlike the platform, this will run on the local machine, not on a docker VM:
    pedro@orion:~/tex/pentaho/cbf2 (master *) $ ./startClients.sh

    Clients found:
    --------------

     pdi-ce: 6.1-QAT-156
     pdi-ce: 126.96.36.199-192
     pdi-ce: 7.0-QAT-57
     pdi-ee-client: 188.8.131.52-192
     pme-ce: 184.108.40.206-182
     prd-ce: 220.127.116.11-182

    Select a client:
Variables and configurations
Some environment variables let you fine-tune the behavior of CBF2:
- CBF2_DOCKER_NETWORK: Specifies a different docker network to connect to
- CBF2_BINDING_INTERFACE: Specifies to which interface the binding will occur (by default it's 0.0.0.0)
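As a small illustration of how such variables are typically read in bash, the sketch below falls back to a default when a variable is unset. The 0.0.0.0 default comes from the list above; the function `show_settings` and the "(not set)" placeholder for the network are assumptions, since this document does not say what CBF2 uses when CBF2_DOCKER_NETWORK is missing.

```shell
#!/bin/bash
# Hypothetical helper: print the effective settings, applying a fallback
# when a variable is unset or empty.
show_settings() {
  network="${CBF2_DOCKER_NETWORK:-(not set)}"
  interface="${CBF2_BINDING_INTERFACE:-0.0.0.0}"
  echo "network=$network interface=$interface"
}

show_settings
```

In practice you would set the variable for a single run, for example `CBF2_BINDING_INTERFACE=127.0.0.1 ./cbf2.sh`, or export it in your shell profile.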
Taking it further
This is, first and foremost, a developer's tool and methodology. I'll make no considerations or recommendations in regards to using these containers in a production environment or not because I have simply no idea how that works as we're mostly agnostic on those methods.
Pentaho's stance is clearly explained here:
As deployments increase in complexity and our clients rapidly add new software components and expand software footprints, we have seen a definitive shift away from traditional installation methods to more automated/scriptable deployment approaches. At Pentaho, our goal is to ensure our clients continue to enjoy flexibility to adapt our technology to their environments and individual standards. Throughout 2015, Pentaho worked with customers who use various deployment technologies in development, test, and production environments. We have seen that the range of technologies used for scripted software deployment can vary as widely as the internal IT standards of our clients. In short, we have not found critical mass in any single deployment pattern. To support our clients in their adoption of these technologies, Pentaho takes the perspective that our clients should continue to be autonomous in their selection and implementation of automated deployment and configuration management. Pentaho will provide documented best practices, based on our experience and knowledge of our product, to assist our clients in understanding the scriptable and configurable options within our product, along with our deployment best practices. Due to the diversity of technology options, Pentaho customer support will remain focused on the behavior of the Pentaho software and will provide expertise on the Pentaho products to help customers troubleshoot individual scripts or containers.
Have fun. Tips and suggestions to pedro.alves at webdetails.pt