Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common-fs2: fix env var substitution for JSON files #753

Closed
marcin-j opened this issue Mar 2, 2023 · 18 comments
Closed

common-fs2: fix env var substitution for JSON files #753

marcin-j opened this issue Mar 2, 2023 · 18 comments

Comments

@marcin-j
Copy link

marcin-j commented Mar 2, 2023

Project: Stream Enrich

Version: 3.7.0

Expected behavior:

Since version 3.7.0 using environment variables substitutions should not trigger an error seen below. The config should be parsed.

Actual behavior:

[pool-1-thread-1] ERROR com.snowplowanalytics.snowplow.enrich.common.fs2.Run - CLI arguments valid but some of the configuration is not correct. Error: Cannot parse file /home/snowplow/resolver/resolver.conf: ParsingFailure: need to Config#resolve(), see the API docs for Config#resolve(); substitution not resolved: ConfigConcatenation("https://"${domain}"/iglu-foo/")

It happens with resolver.hocon as well as all the other enrichments configs.

Steps to reproduce:

  1. Prepare directory with configs /test
    /test/resolver/resolver.json
{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "foo",
        "priority": 0,
        "vendorPrefixes": ["com.foo"],
        "connection": {
          "http": {
            "uri": "https://"${domain}"/iglu-foo/"
          }
        }
      }
    ]
  }
}

/test/config.hocon

{
  "input": {
    "streamName": "collector-payloads"
  }

  "output": {
    "good": {
      "streamName": "enriched"
    }

    "bad": {
      "streamName": "bad"
    }
  }
}

/test/enrichments/ just empty dir
2. Run enricher with

docker run \
  -it --rm \
  -v /test:/snowplow \
  -e domain="foo" \
  snowplow/snowplow-enrich-kinesis:3.7.0 \
  --enrichments /snowplow/enrichments \
  --iglu-config /snowplow/resolver/resolver.hocon \
  --config /snowplow/config.hocon


@benjben
Copy link
Contributor

benjben commented Mar 13, 2023

Hi @marcin-j ,

If I'm not mistaken, substitution works with full values.

Could you try with

"http": {
  "uri": ${uri}
}

and

-e uri="https://foo/iglu-foo/"

please?

@marcin-j
Copy link
Author

Hi @benjben ,

It won't work this way either. I'm getting exact same error. According to HOCON docs it shouldn't matter wheter you use substitution with or without a string value.

[pool-1-thread-1] ERROR com.snowplowanalytics.snowplow.enrich.common.fs2.Run - CLI arguments valid but some of the configuration is not correct. Error: Cannot parse file /snowplow/resolver/resolver.hocon: ParsingFailure: need to Config#resolve(), see the API docs for Config#resolve(); substitution not resolved: ConfigReference(${uri})

I think it could be realted to ConfigFactory.parseString(jsonStr) calls. It should explicitly call resolve method ConfigFactory.parseString(jsonStr).resolve(). I'm not really good with Scala so I might be mistaken.

@benjben benjben changed the title Stream Enrich fails while trying to parse .hocon configs with substitutions enrich-kinesis fails while trying to parse .hocon configs with substitutions Mar 13, 2023
@benjben
Copy link
Contributor

benjben commented Mar 13, 2023

Hey @marcin-j ,

It won't work this way either.

Just to double-check, you tried exactly like this "uri": ${uri} without the quotes around ${uri} right ? From the docs:

Substitutions are not parsed inside quoted strings.

@marcin-j
Copy link
Author

marcin-j commented Mar 14, 2023

@benjben yes

the resolver.hocon looks like:

{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-1",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "foo",
        "priority": 0,
        "vendorPrefixes": ["com.foo"],
        "connection": {
          "http": {
            "uri": ${uri}
          }
        }
      }
    ]
  }
}

command:

docker run \
  -it --rm \
  -v $PWD:/snowplow \
  -e uri="https://foo/iglu-foo/" \
  snowplow/snowplow-enrich-kinesis:3.7.0 \
  --enrichments /snowplow/enrichments \
  --iglu-config /snowplow/resolver/resolver.hocon \
  --config /snowplow/config.hocon

@benjben
Copy link
Contributor

benjben commented Mar 16, 2023

Hi @marcin-j ,

Indeed there is a bug, thanks for spotting it.

You were already using substitution right in the first place, sorry about that.

We'll release a fix shortly, I'll keep you posted.

@benjben benjben changed the title enrich-kinesis fails while trying to parse .hocon configs with substitutions common-fs2: fix env var substitution for JSON files Mar 16, 2023
@benjben
Copy link
Contributor

benjben commented Mar 16, 2023

Hey @marcin-j ,

We've released 3.7.1-rc1 with the fix, in case you want to start using it already.

@marcin-j
Copy link
Author

Hi @benjben,

Thanks for the update! Ive tested the new rc image, works great. Thanks!

@piankris
Copy link

Hey, we have just updated from snowplow/snowplow-enrich-kinesis:3.6.0.2 to snowplow/snowplow-enrich-kinesis:3.7.1 and we're suddenly getting the error message:

ERROR com.snowplowanalytics.snowplow.enrich.common.fs2.Run - CLI arguments valid but some of the configuration is not correct. Error: Cannot parse file /snowplow/resolver.hocon: ParsingFailure: need to Config#resolve(), see the API docs for Config#resolve(); substitution not resolved: ConfigReference(${SP_SCHEMA_URI})

The resolver.hocon file:

{
  "schema": "iglu:com.snowplowanalytics.iglu/resolver-config/jsonschema/1-0-0",
  "data": {
    "cacheSize": 500,
    "repositories": [
      {
        "name": "S3-schemas-registry",
        "priority": 0,
        "vendorPrefixes": ["com.oneapp"],
        "connection": {
          "http": {
            "uri": ${SP_SCHEMA_URI}
          }
        }
      }
    ]
  }
}

We call the enrichment app in Docker like this:

FROM snowplow/snowplow-enrich-kinesis:3.7.1
COPY ./src/config.hocon /snowplow/config.hocon
COPY ./src/resolver.hocon /snowplow/resolver.hocon
COPY ./src/enrichments /snowplow/enrichments

ARG SP_SCHEMA_URI
...

ENV SP_SCHEMA_URI $SP_SCHEMA_URI
...

USER root

ENTRYPOINT ["/usr/bin/env"]

CMD /home/snowplow/bin/snowplow-enrich-kinesis --config /snowplow/config.hocon \
    --iglu-config /snowplow/resolver.hocon \
    --enrichments /snowplow/enrichments \

@marcin-j
Copy link
Author

Same here. Seems like there is regression, cause 3.7.1-rc1 doesn't resolve the Config file anymore

@benjben
Copy link
Contributor

benjben commented Apr 28, 2023

It was fixed on develop branch but was not released yet. It will be shortly, in 3.8.0.

@marcin-j hadn't you said that it was working with 3.7.1-rc1 ?

@marcin-j
Copy link
Author

Hi @benjben, yes it was working. But Ive just built the image yesterday and it doesn't anymore. Im not sure what's wrong but I'm seeing exact same error about Config#resolve()

@benjben
Copy link
Contributor

benjben commented Apr 28, 2023

Hi @marcin-j , you build the image from develop branch ?

@marcin-j
Copy link
Author

@benjben
Copy link
Contributor

benjben commented Apr 28, 2023

I'm not sure to follow, have you actually built the image or have you run 3.7.1-rc1 from Docker Hub ?

@marcin-j
Copy link
Author

Oh sorry, we're actually using Docker Hub image to build our own image with some additions on top (like baking in config.hocon into the image etc.). That's why I said "built", but we base that image on your Docker Hub version:

FROM snowplow/snowplow-enrich-kinesis:3.7.1-rc1

@benjben
Copy link
Contributor

benjben commented Apr 28, 2023

I see. May I ask why you build your own image instead of using the original one directly ?

I've published 3.8.0-rc8 that should have the fix. Could you try and see if it works with it please ?

@marcin-j
Copy link
Author

marcin-j commented Apr 28, 2023

Of course, there are couple of reasons. We deploy Snowplow services via AWS ECS on FARGATE and we want to have additional layer of "caching" for situations like this. If anything changes on your Docker Hub, we still have our copy of image in our ECR repo and our prod version is referencing those ECR copies. Secondly, we add some additional layers to those images: enrichemnts directory, resolver config, enricher config. It's just easier to keep those in one place and call it via CMD (we actually have similar setup to this one).

Version 3.8.0-rc8 works well, thanks for the super fast response @benjben :)

@piankris
Copy link

Thank you for following on this and checking so promptly!
The problem is that as far as I understood, the images with the suffix -rc* are developing images and some bugs that they fixed are then not incorporated into next main releases.

I also got the impression that in our case, resolver.hocon was parsed as JSON and of course, it could not interpret the syntax for passing environment variables. I saw that there're two ways of parsing config files (for json and hocon) but somehow our resolver config wasn't recognized as a hocon.

@spenes spenes closed this as completed in 4dcb543 May 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants