Add cache-service module; replace minio create bucket bootstrap scripts with container entrypoint
jefflester committed Dec 4, 2023
1 parent d2b0b54 commit 056d590
Showing 21 changed files with 296 additions and 59 deletions.
13 changes: 11 additions & 2 deletions release-notes/2.0.8.md
@@ -10,15 +10,24 @@

 ## CLI Changes and Additions

-- N/A
+- Add shell environment support for the following environment variables:
+  - `LIB_PATH`
+  - `STARBURST_VER`
+  - `TEXT_EDITOR`
+  - `LIC_PATH`

 ## Library Changes and Additions

 - Set minimum Starburst version to 402-e and the default version to 423-e.6
 - Add specific image tags to all images (#68)
 - Update Starburst image to account for lost binaries in 427-e release (#67)
 - Add module testing framework (#25)
+- Add `cache-service` module

 ## Other

-- Add repository wiki and make the readme way smaller
+- Add repository wiki for simpler user onboarding
+- Add a new HMS image based off Apache's 3.1.3 release for better compatibility
+  with Starburst features
+  ([commit](https://github.com/jefflester/hive-metastore/commit/2fe933196b20ab85997a6e0d1e3276e48dbea36e),
+  [image](https://hub.docker.com/repository/docker/jefflester/hive-metastore/general))
2 changes: 1 addition & 1 deletion src/lib/minitrino.env
@@ -3,7 +3,7 @@ COMPOSE_PROJECT_NAME=minitrino
 CURL_VER=8.4.0
 DB2_VER=11.5.8.0
 ELASTICSEARCH_VER=8.11.0
-HARBOR_HMS_VER=3.1.3-e.3
+HMS_VER=3.1.3
 ICEBERG_REST_VER=0.5.0
 MINIO_MC_VER=RELEASE.2023-10-14T01-57-03Z
 MINIO_VER=RELEASE.2023-11-01T18-37-25Z
22 changes: 22 additions & 0 deletions src/lib/modules/admin/cache-service/cache-service.yml
@@ -0,0 +1,22 @@
version: '3.8'
services:

  trino:
    environment:
      MINITRINO_BOOTSTRAP: bootstrap-trino.sh
    volumes:
      - ./modules/admin/cache-service/resources/trino/rules.json:/etc/starburst/rules.json
      - ./modules/admin/cache-service/resources/trino/cache.properties:/etc/starburst/cache.properties
      - ./modules/admin/cache-service/resources/trino/cache_svc.properties:/etc/starburst/catalog/cache_svc.properties
      - ./modules/admin/cache-service/resources/trino/hive_mv_tsr.properties:/etc/starburst/catalog/hive_mv_tsr.properties
    ports:
      - 8180:8180

  cache-svc-backend:
    image: postgres:${POSTGRES_SEP_CACHE_SVC_VER}
    container_name: cache-svc-backend
    env_file:
      - ./modules/admin/cache-service/resources/postgres/cache-svc.env
    labels:
      - com.starburst.tests=minitrino
      - com.starburst.tests.module.cache-service=admin-cache-service
6 changes: 6 additions & 0 deletions src/lib/modules/admin/cache-service/metadata.json
@@ -0,0 +1,6 @@
{
  "description": "Cache service module",
  "incompatibleModules": [],
  "dependentModules": ["hive", "postgres", "insights"],
  "enterprise": true
}
31 changes: 31 additions & 0 deletions src/lib/modules/admin/cache-service/readme.md
@@ -0,0 +1,31 @@
# Cache Service Module

This module configures [Starburst's cache
service](https://docs.starburst.io/latest/admin/cache-service.html) feature
along with basic config for [table scan
redirections](https://docs.starburst.io/latest/admin/cache-service.html#enable-table-scan-redirections)
and [materialized
views](https://docs.starburst.io/latest/connector/starburst-hive.html#materialized-views).

The module launches with the `postgres`, `hive`, and `insights` modules.
Additional catalogs, `cache_svc` and `hive_mv_tsr`, are also configured.
`cache_svc` exposes the backend database for the cache service for querying in
Starburst, and `hive_mv_tsr` is a clone of the `hive` catalog but with
materialized views and `hive.security=allow-all` enabled.

For troubleshooting, the bootstrap script enables debug logging for
`com.starburstdata.cache` as well as JMX dump tables for the MBeans associated
with the cache service. The JMX dump tables can be queried in the `jmx.history`
schema.
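
As a minimal sketch of querying one of these dump tables, the snippet below builds a query against `jmx.history` for a single MBean and runs it only where `trino-cli` is on the path. The helper name `jmx_history_query`, the default `LIMIT`, and the choice of MBean are illustrative assumptions and are not part of the module; the table name follows the Trino JMX connector convention of using the lowercase object name as a quoted identifier.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the module): print a query against the
# jmx.history table for one MBean. The MBean name is double-quoted in SQL
# because it contains ':' and '='.
jmx_history_query() {
  local mbean="$1"
  local limit="${2:-10}"
  printf 'SELECT * FROM jmx.history."%s" LIMIT %s\n' "${mbean}" "${limit}"
}

QUERY="$(jmx_history_query 'com.starburstdata.cache:name=cleanupservice')"
echo "${QUERY}"

# Only attempt to run the query where trino-cli is available, for example
# inside the Trino container.
if command -v trino-cli >/dev/null 2>&1; then
  trino-cli --user admin --output-format TSV_HEADER --execute "${QUERY}"
fi
```

The same helper can be pointed at any of the MBeans listed under `jmx.dump-tables` in the bootstrap script.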

## Table Scan Redirections (TSRs)

The `rules.json` file configures two tables for TSRs: `postgres.public.customer`
and `postgres.public.orders`. Additional tables can be specified for TSRs by
updating the `rules.json` file. The container logs will display the various
cache service operations as they occur.
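
For illustration, an additional rule could be drafted with a small shell helper like the one below. This is a sketch only: `tsr_rule` is a hypothetical function, `postgres.public.lineitem` is a hypothetical table that the module does not create, and the field names and default intervals are copied from the module's `rules.json`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one rule object in the shape used by the
# module's rules.json (catalogName, schemaName, tableName, refreshInterval,
# gracePeriod). Defaults mirror the values the module ships with.
tsr_rule() {
  local catalog="$1"
  local schema="$2"
  local table="$3"
  local refresh="${4:-90s}"
  local grace="${5:-5m}"
  printf '{"catalogName": "%s", "schemaName": "%s", "tableName": "%s", "refreshInterval": "%s", "gracePeriod": "%s"}\n' \
    "${catalog}" "${schema}" "${table}" "${refresh}" "${grace}"
}

# Example: a rule for a hypothetical postgres.public.lineitem table.
tsr_rule postgres public lineitem
```

The printed object would then be added to the `rules` array in `rules.json`.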

## Materialized Views (MVs)

An example MV is created in `hive_mv_tsr.mvs.example`. Any number of MVs can be
added to this catalog, and MVs can pull data from any data source.
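
A rough sketch of adding another MV follows. It assumes the same property values the bootstrap script uses for the example MV; the helper `mv_statement` and the view name `customer_names` are hypothetical, and the statement is only executed where `trino-cli` is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a CREATE MATERIALIZED VIEW statement for the
# hive_mv_tsr.mvs schema, reusing the property values from the module's
# bootstrap script (max_import_duration, refresh_interval, grace_period).
mv_statement() {
  local name="$1"
  local select_sql="$2"
  printf "CREATE OR REPLACE MATERIALIZED VIEW hive_mv_tsr.mvs.%s\n" "${name}"
  printf "WITH (\n"
  printf "  max_import_duration = '1m',\n"
  printf "  refresh_interval = '5m',\n"
  printf "  grace_period = '10m'\n"
  printf ")\nAS\n%s\n" "${select_sql}"
}

# Example: an MV over the TPC-H customer table (view name is hypothetical).
QUERY="$(mv_statement customer_names 'SELECT custkey, name FROM tpch.tiny.customer')"
echo "${QUERY}"

# Run the statement only where trino-cli is available.
if command -v trino-cli >/dev/null 2>&1; then
  trino-cli --user admin --output-format TSV_HEADER --execute "${QUERY}"
fi
```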
@@ -0,0 +1,67 @@
#!/usr/bin/env bash

set -euxo pipefail

COUNTER=0
while [ "${COUNTER}" -lt 30 ]
do
  set +e
  RESPONSE=$(curl -s -X GET -H 'Accept: application/json' -H 'X-Trino-User: admin' 'trino:8080/v1/info/')
  echo "${RESPONSE}" | grep -q '"starting":false'
  if [ $? -eq 0 ]; then
    echo "Trino health check passed."
    sleep 5
    break
  fi
  COUNTER=$((COUNTER+1))
  sleep 1
done

if [ "${COUNTER}" -eq 30 ]
then
  echo "Trino health check failed."
  exit 1
fi

set -e
echo "com.starburstdata.cache=DEBUG" >> /etc/starburst/log.properties

echo -e "jmx.dump-tables=com.starburstdata.cache.resource:name=cacheresource,\\
com.starburstdata.cache.resource:name=materializedviewsresource,\\
com.starburstdata.cache.resource:name=redirectionsresource,\\
com.starburstdata.cache:name=cleanupservice,\\
com.starburstdata.cache:name=tableimportservice
jmx.dump-period=10s
jmx.max-entries=86400" >> /etc/starburst/catalog/jmx.properties

echo "Creating Postgres tables..."
trino-cli --user admin --output-format TSV_HEADER \
--execute "CREATE TABLE IF NOT EXISTS postgres.public.customer AS SELECT * FROM tpch.tiny.customer"

trino-cli --user admin --output-format TSV_HEADER \
--execute "CREATE TABLE IF NOT EXISTS postgres.public.orders AS SELECT * FROM tpch.tiny.orders"

echo "Creating Hive cache schema (for table scan redirections)..."
trino-cli --user admin --output-format TSV_HEADER \
--execute "CREATE SCHEMA IF NOT EXISTS hive_mv_tsr.cache WITH (LOCATION = 's3a://sample-bucket/cache/')"

echo "Creating materialized view schemas..."
trino-cli --user admin --output-format TSV_HEADER \
--execute "CREATE SCHEMA IF NOT EXISTS hive_mv_tsr.mv_storage WITH (LOCATION = 's3a://sample-bucket/mv/mv_storage/')"
trino-cli --user admin --output-format TSV_HEADER \
--execute "CREATE SCHEMA IF NOT EXISTS hive_mv_tsr.mvs WITH (LOCATION = 's3a://sample-bucket/mv/mvs/')"

echo "Creating materialized views..."
QUERY="CREATE OR REPLACE MATERIALIZED VIEW hive_mv_tsr.mvs.example
WITH (
partitioned_by = ARRAY['orderdate'],
max_import_duration = '1m',
refresh_interval = '5m',
grace_period = '10m'
)
AS
SELECT orderkey, orderdate FROM tpch.tiny.orders
UNION ALL
SELECT orderkey, orderdate FROM tpch.tiny.orders"

trino-cli --user admin --output-format TSV_HEADER --execute "${QUERY}"
@@ -0,0 +1,3 @@
POSTGRES_USER=admin
POSTGRES_PASSWORD=trinoRocks15
POSTGRES_DB=cachesvc
@@ -0,0 +1,6 @@
service-database.user=admin
service-database.password=trinoRocks15
service-database.jdbc-url=jdbc:postgresql://cache-svc-backend:5432/cachesvc
starburst.user=cachesvc
starburst.jdbc-url=jdbc:trino://trino:8080
rules.file=/etc/starburst/rules.json
@@ -0,0 +1,4 @@
connector.name=postgresql
connection-url=jdbc:postgresql://cache-svc-backend:5432/cachesvc
connection-user=admin
connection-password=trinoRocks15
@@ -0,0 +1,17 @@
connector.name=hive
hive.metastore.uri=thrift://metastore-hive:9083
hive.s3.endpoint=http://minio:9000
hive.s3.aws-access-key=access-key
hive.s3.aws-secret-key=secret-key
hive.non-managed-table-writes-enabled=true
hive.s3.path-style-access=true

# Cache service specific
hive.security=allow-all
hive.max-partitions-per-writers=1000
cache-service.uri=http://trino:8180

# MVs
materialized-views.enabled=true
materialized-views.namespace=mv_namespace
materialized-views.storage-schema=mv_storage
25 changes: 25 additions & 0 deletions src/lib/modules/admin/cache-service/resources/trino/rules.json
@@ -0,0 +1,25 @@
{
  "defaultCacheCatalog": "hive_mv_tsr",
  "defaultCacheSchema": "cache",
  "defaultMaxImportDuration": "1m",
  "rules": [
    {
      "catalogName": "postgres",
      "schemaName": "public",
      "tableName": "customer",
      "refreshInterval": "90s",
      "gracePeriod": "5m"
    },
    {
      "catalogName": "postgres",
      "schemaName": "public",
      "tableName": "orders",
      "refreshInterval": "90s",
      "gracePeriod": "5m",
      "columns": [
        "orderkey",
        "totalprice"
      ]
    }
  ]
}
@@ -1,3 +1,3 @@
 sepadmins:admin,admin-1,admin-2
 metadata-users:metadata,metadata-1,metadata-2
-platform-users:platform,platform-1,platform-2
+platform-users:platform,platform-1,platform-2,cachesvc
12 changes: 8 additions & 4 deletions src/lib/modules/catalog/delta-lake/delta-lake.yml
@@ -22,7 +22,7 @@ services:
       - com.starburst.tests.module.delta-lake=catalog-delta-lake

   metastore-delta-lake:
-    image: naushadh/hive-metastore:latest
+    image: jefflester/hive-metastore:${HMS_VER}
     container_name: metastore-delta-lake
     depends_on:
       postgres-delta-lake:
@@ -54,11 +54,15 @@ services:
   create-minio-delta-lake-buckets:
     image: minio/mc:${MINIO_MC_VER}
     container_name: create-minio-delta-lake-buckets
-    environment:
-      MINITRINO_BOOTSTRAP: bootstrap-create-buckets.sh
     entrypoint: >
       /bin/sh -c "
-      tail -f /dev/null;
+      curl -fsSL https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh > /tmp/wait-for-it.sh && \
+      chmod +x /tmp/wait-for-it.sh && \
+      echo 'Waiting for MinIO to come up...' && \
+      /tmp/wait-for-it.sh minio-delta-lake:9000 --strict --timeout=60 -- echo 'MinIO service is up.' && \
+      /usr/bin/mc alias set minio http://minio-delta-lake:9000 access-key secret-key && \
+      /usr/bin/mc mb minio/sample-bucket/wh/ && \
+      tail -f /dev/null
       "
     labels:
       - com.starburst.tests=minitrino

This file was deleted.

14 changes: 9 additions & 5 deletions src/lib/modules/catalog/hive/hive.yml
@@ -22,7 +22,7 @@ services:
       - com.starburst.tests.module.hive=catalog-hive

   metastore-hive:
-    image: naushadh/hive-metastore:latest
+    image: jefflester/hive-metastore:${HMS_VER}
     container_name: metastore-hive
     depends_on:
       postgres-hive:
@@ -54,11 +54,15 @@ services:
   create-minio-buckets:
     image: minio/mc:${MINIO_MC_VER}
     container_name: create-minio-buckets
-    environment:
-      MINITRINO_BOOTSTRAP: bootstrap-create-buckets.sh
-    entrypoint: >
+    entrypoint: |
       /bin/sh -c "
-      tail -f /dev/null;
+      curl -fsSL https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh > /tmp/wait-for-it.sh && \
+      chmod +x /tmp/wait-for-it.sh && \
+      echo 'Waiting for MinIO to come up...' && \
+      /tmp/wait-for-it.sh minio:9000 --strict --timeout=60 -- echo 'MinIO service is up.' && \
+      /usr/bin/mc alias set minio http://minio:9000 access-key secret-key && \
+      /usr/bin/mc mb minio/sample-bucket/wh/ && \
+      tail -f /dev/null
       "
     labels:
       - com.starburst.tests=minitrino

This file was deleted.

10 changes: 7 additions & 3 deletions src/lib/modules/catalog/iceberg/iceberg.yml
@@ -56,11 +56,15 @@ services:
   create-minio-iceberg-buckets:
     image: minio/mc:${MINIO_MC_VER}
     container_name: create-minio-iceberg-buckets
-    environment:
-      MINITRINO_BOOTSTRAP: bootstrap-create-buckets.sh
     entrypoint: >
       /bin/sh -c "
-      tail -f /dev/null;
+      curl -fsSL https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh > /tmp/wait-for-it.sh && \
+      chmod +x /tmp/wait-for-it.sh && \
+      echo 'Waiting for MinIO to come up...' && \
+      /tmp/wait-for-it.sh minio-iceberg:9000 --strict --timeout=60 -- echo 'MinIO service is up.' && \
+      /usr/bin/mc alias set minio http://minio-iceberg:9000 access-key secret-key && \
+      /usr/bin/mc mb minio/sample-bucket/wh/ && \
+      tail -f /dev/null
       "
     labels:
       - com.starburst.tests=minitrino

This file was deleted.

@@ -0,0 +1,12 @@
# cachesvc, example.com
dn: uid=cachesvc,dc=example,dc=com
changetype: add
uid: cachesvc
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: top
cn: cachesvc
sn: cachesvc
mail: cachesvc@example.com
userPassword: trinoRocks15
@@ -5,4 +5,5 @@ set -euxo pipefail
 echo "Setting up password file..."
 htpasswd -cbB -C 10 /etc/starburst/password.db alice trinoRocks15
 htpasswd -bB -C 10 /etc/starburst/password.db bob trinoRocks15
-htpasswd -bB -C 10 /etc/starburst/password.db admin trinoRocks15
+htpasswd -bB -C 10 /etc/starburst/password.db admin trinoRocks15
+htpasswd -bB -C 10 /etc/starburst/password.db cachesvc trinoRocks15