Ingestion spec templating is rather undocumented #5883

@lgo

While trying to set up ingestion, I came across the -propertyFile and -values arguments in LaunchDataIngestionJobCommand. I didn't see anything in the docs about templating the job spec, and only later found a single reference to it in https://github.com/pinot-contrib/pinot-docs/blob/eb9a8a07687bfe78b022ba0825123fd43e316795/operators/cli.md.

It would be helpful to document this, and to answer questions such as:

  • What the format of the propertyFile is.
  • What the format for templated variables in the job spec file should be.

My particular use case, where this would be great, is a setup where ingestion jobs (via Spark or Hadoop) are distributed only as a single JAR (compiled with its dependencies, rather than with separately distributed external JARs), and hooking up external file dependencies is a pain. For this, I'd ideally like to (1) bundle a basic configuration file as a resource in the JAR or as a separate distribution, and (2) provide any overrides at run time via parameters (e.g. from a scheduler application).

(Separately, it looks like using JAR resources for arguments like that isn't supported 🤔 I have to look a bit more into whether or not that's necessary. Is that normally a sane thing for Java applications to support?)
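
To make the use case concrete, here is a minimal sketch of the workflow I have in mind: read a job spec template bundled as a classpath resource, then fill it in with run-time overrides. This is not Pinot's implementation. The class name `JobSpecTemplateSketch`, the `${name}` placeholder syntax, and the `key=value` override format are all assumptions made for illustration; the real formats are exactly what this issue asks to have documented.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pinot.
public class JobSpecTemplateSketch {
    // Assumed placeholder syntax: ${name}
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([A-Za-z0-9_.]+)\\}");

    /** Reads a classpath resource (e.g. one bundled in the fat JAR) as UTF-8 text. */
    static String readResource(String name) throws IOException {
        try (InputStream in =
                 JobSpecTemplateSketch.class.getClassLoader().getResourceAsStream(name)) {
            if (in == null) {
                throw new IOException("resource not found: " + name);
            }
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Parses "key=value" strings (assumed override format); later entries win. */
    static Map<String, String> parseOverrides(List<String> pairs) {
        Map<String, String> values = new HashMap<>();
        for (String pair : pairs) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value, got: " + pair);
            }
            // Split on the first '=' only, so values may themselves contain '='.
            values.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return values;
    }

    /** Replaces every ${name} in the template; fails on a placeholder with no value. */
    static String render(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("no value for " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A scheduler could then call `render(readResource("jobSpec.yaml"), parseOverrides(args))`, where `jobSpec.yaml` is a hypothetical resource name, and hand the rendered text to the ingestion job.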
