Description
ServiceManager is a nice utility that effectively makes Pinot a single process. It currently cannot address a common dev setup problem: installing schemas.
Instead, you have to do it with a command like the one below, or with curl, after the process is up.
For example:
bin/pinot-admin.sh AddTable \
-schemaFile /path/to/transcript-schema.json \
-tableConfigFile /path/to/transcript-table-realtime.json \
-exec
Doing this after startup has several problems:
1. Anything that depends on Pinot now needs to depend on the outcome of this job.
   - This is tricky in docker-compose, for example, as you can't depend on a job status, only on service health.
2. Even running the commands can be tricky, and can lead to an unhealthy Pinot if not done properly.
   - You cannot rely on /health as you normally would, because the bootstrap isn't completely done.
   - This can leak traffic and show errors that could be avoided.
3. This prevents creating a layer that already has the schema configured.
I think 1 and 2 can be solved by allowing ServiceManager to bootstrap a directory full of schema and table config pairs, such as:
backendEntityView-schemaFile.json
backendEntityView-tableConfigFile.json
rawServiceView-schemaFile.json
rawServiceView-tableConfigFile.json
rawTraceView-schemaFile.json
rawTraceView-tableConfigFile.json
serviceCallView-schemaFile.json
serviceCallView-tableConfigFile.json
spanEventView-schemaFile.json
spanEventView-tableConfigFile.json
Alternatively, this could be driven by config with the same effect.
In this case, for example, /health would not pass until the tables are applied, and the listeners would not be up until then.
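To make the naming convention concrete, here is a minimal sketch of how such a directory could be walked, pairing each `<name>-schemaFile.json` with its `<name>-tableConfigFile.json` and applying it with the AddTable command shown earlier. The `bootstrap_tables` function and the `PINOT_ADMIN` and `DRY_RUN` variables are hypothetical names used only for illustration. This is not how ServiceManager works today, and because it runs as an external script it does not by itself fix the health and ordering problems described above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Pinot: applies every
# <name>-schemaFile.json / <name>-tableConfigFile.json pair in a directory.

# Assumed location of the admin script; override via the environment.
PINOT_ADMIN="${PINOT_ADMIN:-bin/pinot-admin.sh}"

bootstrap_tables() {
  dir="$1"
  for schema in "$dir"/*-schemaFile.json; do
    # An unmatched glob stays a literal pattern; skip it.
    [ -e "$schema" ] || continue

    # Derive the matching table config name from the schema file name.
    table_config="${schema%-schemaFile.json}-tableConfigFile.json"
    if [ ! -e "$table_config" ]; then
      echo "missing table config for $schema" >&2
      return 1
    fi

    # With DRY_RUN set, print the command instead of running it.
    ${DRY_RUN:+echo} "$PINOT_ADMIN" AddTable \
      -schemaFile "$schema" \
      -tableConfigFile "$table_config" \
      -exec || return 1
  done
}
```

Running `DRY_RUN=1 bootstrap_tables /path/to/tables` prints the commands that would be executed, one per pair, and the function stops with a non-zero status at the first schema that has no matching table config.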
Stretch Goal -- no ZK dep
The last mile, e.g. persisting the result as a docker layer, does not seem possible at the moment unless ServiceManager is also used for embedded ZK. This is because table configs, table ideal states, and broker ideal states are written into ZooKeeper. In other words, you cannot simply save off a layer with Pinot data directories and then bring it up later with a fresh ZK (IIUC).
However, if we could do this, it would ultimately be better, because it leaves no chance of error when starting the process later. It would allow natural service dependencies with no external orchestration needed.