platys-trial-workspace

Gitpod for experimenting with Platys - a tool for generating and provisioning Modern Data Platforms based on Docker and Docker Compose from Trivadis. Its main use is for small-scale Data Lab projects, Proof-of-Concepts (PoC) or Proof-of-value (PoV) projects as well as trainings.

The user of platys can choose which services to use from a list of supported services and generate a fully working docker-compose.yml file including all necessary configuration files.

Open in Gitpod

Installation of platys

Platys is installed when the Gitpod workspace is first started, following Installing Platys. The actual commands are in the file .gitpod.yml.

The steps are:

  • download the tarball to /tmp
  • extract and move to /usr/local/bin
  • make root the owner
  • remove the archive
  • verify installation with platys version
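The steps above could look roughly like the following sketch. It only prints the commands so that they can be reviewed before anything runs. The URL argument is a placeholder (take the real download link from Installing Platys), and the idea that the archive unpacks to a single binary named platys is an assumption, not something this README confirms. The authoritative commands remain the ones in .gitpod.yml.

```shell
# Print the five installation steps as shell commands, one per line.
# Assumptions (not from this README): the archive is a gzipped tarball that
# unpacks to a single binary named "platys"; the URL is supplied by the caller.
platys_install_plan() {
  url="$1"
  archive="/tmp/platys.tar.gz"
  printf 'curl -fsSL -o %s "%s"\n' "$archive" "$url"
  printf 'tar -xzf %s -C /tmp && sudo mv /tmp/platys /usr/local/bin/\n' "$archive"
  printf 'sudo chown root:root /usr/local/bin/platys\n'
  printf 'rm %s\n' "$archive"
  printf 'platys version\n'
}
```

Calling platys_install_plan "$PLATYS_URL" (with PLATYS_URL being a hypothetical variable that holds the download link) prints the plan; piping that output to sh would execute it.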

Note: only when you see the output from this last command - something like:

Platys - Trivadis Platform in a Box - v 2.4.3
https://github.com/trivadispf/platys
Copyright (c) 2018-2020, Trivadis AG 

is the environment ready for the next steps.

First steps with platys

These first steps are taken from Getting started with Platys.

To create and run a simple first platform, let's first create a target directory.

mkdir platys-demo-platform
cd platys-demo-platform

Now let's use platys to initialise the new and now current directory to use the Modern Data Analytics Platform Stack.

We specify the platform stack name trivadis/platys-modern-data-platform as well as the stack version 1.16.0 (the current version of this platform stack at the time of writing). With the -n option we give the platform a meaningful name. It is used as the name of the Docker network, so it must be unique if you run multiple platforms on the same machine.

platys init -n demo-platform --stack trivadis/platys-modern-data-platform --stack-version 1.16.0 --structure flat

Running this command takes several tens of seconds. If no config.yml file exists yet, it generates one that lists all the services that can be configured for the platform. Next, edit this config file to specify the services to include in demo-platform, for example Apache Kafka, MySQL, Superset or RabbitMQ, and to set additional configuration settings (see the documentation of the configuration settings).

A very simple first platform composition could include the services PostgreSQL (an open source database) and NocoDB (a no-code database platform that allows teams to collaborate and build applications with the ease of a familiar and intuitive spreadsheet interface). This requires us to enable two services. Open the config.yml file with the next command (or simply from the file navigator in VS Code).

gp open config.yml

Locate the line with POSTGRESQL_enable: false. Change false to true in these two lines:

      # ===== PostgreSQL ========
      #
      POSTGRESQL_enable: false
      POSTGRESQL_volume_map_data: false

Also locate the lines

      NOCODB_enable: false
      NOCODB_volume_map_data: false

and change false to true. By default, NocoDB is configured to work with the default PostgreSQL database (demodb) and user (demo) that platys will create through the docker-compose.yml file.
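The same edit can be scripted instead of done by hand. The following is a minimal sketch, assuming the flags appear in config.yml exactly as in the excerpts above (key, colon, then false); enable_flags is a hypothetical helper name, not part of platys.

```shell
# Flip "<key>: false" to "<key>: true" for each given key in a YAML file.
# Only lines where the key is followed directly by "false" are changed.
enable_flags() {
  file="$1"
  shift
  for key in "$@"; do
    sed "s/^\([[:space:]]*${key}:[[:space:]]*\)false/\1true/" "$file" > "$file.tmp" \
      && mv "$file.tmp" "$file"
  done
}

# Example call (keys taken from the config.yml excerpts above):
# enable_flags config.yml POSTGRESQL_enable POSTGRESQL_volume_map_data \
#   NOCODB_enable NOCODB_volume_map_data
```

The helper writes to a temporary file and moves it into place rather than relying on sed -i, whose syntax differs between GNU and BSD sed.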

To generate the docker-compose.yml based on your updated version of the config.yml file:

platys gen -c ${PWD}/config.yml

The file docker-compose.yml is generated in the current directory. Running this file with Docker Compose pulls, starts and configures all configured services, provided they fit on your machine.

Before running Docker Compose, first export these environment variables:

  • DOCKER_HOST_IP - the IP address of the network interface of the Docker Host
  • PUBLIC_IP - the IP address of the public network interface of the Docker Host (different from DOCKER_HOST_IP in a public cloud environment)

export DOCKER_HOST_IP=127.0.0.1

(I am not entirely sure about the PUBLIC_IP)
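One possible way to deal with PUBLIC_IP on a purely local machine is sketched below. It rests on an assumption that is not confirmed here: when there is no separate public interface, PUBLIC_IP can simply take the same value as DOCKER_HOST_IP. The function name export_platform_ips is hypothetical.

```shell
# Export the two variables listed above for a purely local setup.
# Assumption: with no separate public interface, PUBLIC_IP can take the same
# value as DOCKER_HOST_IP; an already set PUBLIC_IP is left untouched.
export_platform_ips() {
  DOCKER_HOST_IP="${1:-127.0.0.1}"
  PUBLIC_IP="${PUBLIC_IP:-$DOCKER_HOST_IP}"
  export DOCKER_HOST_IP PUBLIC_IP
}
```

Calling export_platform_ips without arguments uses 127.0.0.1; in a cloud environment you would set PUBLIC_IP yourself before calling it.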

I have made two additions to docker-compose.yml.

A healthcheck for postgresql:

    environment:
      - POSTGRES_PASSWORD=abc123!
      - POSTGRES_USER=postgres
      - POSTGRES_DB=postgres
      - POSTGRES_MULTIPLE_DATABASES=demodb
      - POSTGRES_MULTIPLE_USERS=demo
      - POSTGRES_MULTIPLE_PASSWORDS=abc123!
      - PGDATA=/var/lib/postgresql/data/pgdata
      - DB_SCHEMA=demo
    healthcheck: 
      interval: 10s
      retries: 10
      test: "pg_isready -U \"$$POSTGRES_USER\" -d \"$$POSTGRES_DB\""
      timeout: 2s

and dependency for nocodb:

  nocodb:
    depends_on: 
      postgresql:
        condition: service_healthy
    image: nocodb/nocodb:latest

Now start the platform with Docker Compose:

docker-compose up -d

The images for the selected services will be pulled (downloaded) and subsequently the services are started. At port 80 you will find an overview of the installed services and their relevant port details.
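Because nocodb waits for the postgresql healthcheck, it can help to watch that health status after starting the stack. The sketch below polls docker inspect for it. It assumes the container is named postgresql like the service, which should be checked against the generated docker-compose.yml; wait_healthy is a hypothetical helper name.

```shell
# Poll a container's health status until it reports "healthy".
# Arguments: container name, number of attempts (default 30),
# pause between attempts in seconds (default 2).
wait_healthy() {
  container="$1"
  tries="${2:-30}"
  pause="${3:-2}"
  while [ "$tries" -gt 0 ]; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "$pause"
  done
  return 1
}
```

For example, wait_healthy postgresql && echo "database is ready" would print the message only once the healthcheck has passed.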

Check the docker-compose.yml file to see which port has been mapped to nocodb.

In my case:

    ports:
      - 28276:8080
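Instead of reading the file, the mapping can also be looked up with docker-compose port, which prints the host binding of a container port. The following is a small sketch, assuming the service is named nocodb and listens on 8080 as in the excerpt above; both function names are hypothetical.

```shell
# Reduce a binding such as "0.0.0.0:28276" to the host port "28276".
host_port_of() {
  printf '%s\n' "${1##*:}"
}

# Ask Docker Compose for the host binding of the nocodb container port.
# Assumption: service name "nocodb" and container port 8080, as in the
# ports excerpt above. Run this in the directory with docker-compose.yml.
nocodb_host_port() {
  host_port_of "$(docker-compose port nocodb 8080)"
}
```

The parsing is kept in its own function so it also copes with IPv6-style bindings, where only the part after the last colon is the port.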

On port 28276 I can open the NocoDB web UI. After creating an account, I can enter the tool. Under Team & Settings I can create a new data source that connects as the demo user to the demodb database and the demo schema. Subsequently I can add tables in the project under this data source and "pick up" tables that already exist in the database.

I can define new tables, enter data and create a view on that data. I can even share that view (a web page) with anyone on the public internet.

One way to look inside the PostgreSQL database is with the CLI pgcli. You can install it with:

pip install -U pgcli

and then run it and connect as the demo user to demodb with:

pgcli 'postgres://demo:abc123!@localhost:5432/demodb'

List tables:

\dt

Create a table through DDL in this CLI and the table becomes available for NocoDB to use in the application. Tables can be created from either end (NocoDB UI and directly in the database) and the same applies to data.

Cookbook with Trino and Kafka

See the original cookbook by Guido Schmutz.

Resources

  • Introduction to Platys
  • Getting started with Platys
  • Platys Modern Data Platform - Overview of supported Services
  • Installing Platys
  • Cookbooks for trying out aspects of the modern data platform
