-
Notifications
You must be signed in to change notification settings - Fork 279
Fix issue with schema validation for config with env variables #2200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue with schema validation for config with env variables #2200
Conversation
/azp run |
/azp run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow why we need new environment variable parsing logic as opposed to using pre-existing code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is for testing DW object serialization. Any reason why config env_var tests are placed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, With the name of the file, it was not very evident that it will be used only for DW
|
||
/// <summary> | ||
/// This test checks that the ReplaceEnvVarsInJson method throws a JsonException when given an invalid JSON string. | ||
/// The test asserts that the exception's message matches the expected error message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test asserts that the exception's message matches the expected error message.
nit: While it's good to note that you're checking the emitted error message, your description would be even more beneficial to the reader if it described that you're validating that the error message blames an environment variable replacement failure versus some other JSON parsing issue.
[TestMethod] | ||
public void TestEnvVariableReplacementWithInvalidJson() | ||
{ | ||
// Create an invalid JSON string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: note which part is invalid. (I'm assuming lack of closing quotation marks after the ENV_STRING.)
src/Service.Tests/Unittests/SerializationDeserializationTests.cs
Outdated
Show resolved
Hide resolved
catch (JsonException ex) | ||
{ | ||
Assert.AreEqual("Failed to replace environment variables in given JSON. " | ||
+ "Environment variable 'MISSING_ENV_VAR' is not set.", ex.Message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the following scenarios result in this error message? I suppose "unset environment variable" would apply to both of these:
- MISSING_ENV_VAR exists but doesn't have a value -> null
- MISSING_ENV_VAR doesn't exist -> null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, actually not. environment files are read as test. environment values are read as string,i.e. "null", and then added to the json as a null value. So, if a user wants to set a null value. it will be assigned as null and no error should be thorwn. there is a test for the same.
/// </summary> | ||
/// <param name="inputJson">The JSON string to process.</param> | ||
/// <exception cref="JsonException">Thrown when the input is null or when an error occurs while processing the JSON.</exception> | ||
public static string? ReplaceEnvVarsInJson(string? inputJson) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this function when StringJsonConverterFactory()
already exists which has helpers to parse out envvironment variable placeholders? that class also uses regex to detect instances of @env()
why not something like:
TryParseConfig(
json: json,
config: out config,
connectionString: _connectionString,
replaceEnvVar: true)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something about dab validate
that requires a completely different code path + new code?
if so , please add to pr description and code comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please elaborate in PR description. this doesn't really tell us much about "why" it can't be used.
There were certain methods which was doing replacement, but it was written in a way to align with the RuntimeConfig Object. So, it couldn't be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we need the original conversion mechanism to stay since it will be utilized for the workflow outside of dab validate.
Not clear why original conversion would need to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions on the approach
// The config file may contain some environment variables that need to be replaced before validation. | ||
try | ||
{ | ||
jsonData = Utf8JsonReaderExtensions.ReplaceEnvVarsInJson(jsonData); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have the runtimeConfig
deserialized from the configFilePath
.
Wouldnt that mean we already had the jsonData with the environment variables replaced?
Or does the runtimeConfig object doesnt have the environment variable values?
Instead of reading from the file path, cant we get the string from which the runtimeConfig
object was formed - to do the schema validation on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no we cant. schemaValidator require a json string. i tried converting the deserialized runtimeConfig to string and then passing it to the schema validator but it was not forming a valid json string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you provide an example? invalid json string because of extra escape characters? is 'invalid json string' indicative of a different bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no by invalid i mean.
if i deserialize the below json
{
"database-type": "mssql"
}
and then serialize it again
{
"database-type" : {
"DatabaseType": 2
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems like another bug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just how enums are serialized i believe, don't think it's a bug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just how enums are serialized i believe, don't think it's a bug
I think we would need to have custom (de)serialization that handles the case of these enums so that they go back to the intended format. Deserialization/Serialization ought to be reversable so this does seem like a bug that just hasnt arisen yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using string jsonData = runtimeConfig.ToJson()
instead of all the custom parsing code and then provided jsonData
to:
await jsonConfigSchemaValidator.ValidateJsonConfigWithSchemaAsync(jsonSchema, jsonData!);
works great and doesn't present any of the issues mentioned earlier in this thread. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do the tests added in this PR pass with your proposed changes? If yes, we may not need the custom parsing code mentioned here.
Co-authored-by: Sean Leonard <sean.leonard@microsoft.com>
break; | ||
case JsonValueKind.String: | ||
string? value = element.GetString(); | ||
if (value is not null && value.StartsWith("@env('") && value.EndsWith("')")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this assuming that the start of the string is the beginning of the env('<match>')
?
What happens if the env('<match>')
happens in the middle of the string?
For example, if the person had
{
"database-type": "env('db_type')env('db_lang')"
}
where their environment variable are db_type=ms, and db_lang=sql
will this replace env('db_lang')
with sql
?
Seems like this would skip all of the replacements except the first and only replace the first when the string starts with the env variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at the StringJsonConverterFactory
there is logic that handles all of our edge cases with regex match and replace. If you can't use that class explicitly, perhaps you can factor out the logic that is being used for parsing the env variables to be shared?
|
||
break; | ||
default: | ||
// Write the JSON element to the writer as is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if the env('') doesnt start the string we end up here, but then we don't do any replacements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions about how and when we are matching the environment variables, and if we can share some code.
## Why make this change? - Closes #2200, an issue with `dab validate -c "path_to_config.json"` where environment variable `@env('ENV_VAR_NAME')` references are unresolved during config schema validation and schema validation always fails. The issue specifically calls out an error when an environment variable is used for the property `database-type` can't resolve an environment variable and properly set an enum value (mssql, pgsql, mysql, etc) for the property. ## What is this change? - This change corrects how `RuntimeConfigValidator::ValidateConfigSchema(...)` resolves environment variables in the runtime config. The usage of `string jsonData = runtimeConfig.ToJson();` converts the runtime config object (which already has environment variables resolved) into valid json which can then be validated against `dab.draft.schema.json`. ## How was this tested? - [x] Integration Tests: added a test which exercises the code path that validates the dab runtime config against the dab json schema. By ensuring that no schema validation errors occur due to unresolved environment variables. ## Sample Request(s) Add the file `launchSettings.json` to `src/Cli/Properties` with the following contents: ```json { "profiles": { "start": { "commandName": "Project", "commandLineArgs": "start -c src\\Cli\\dab-config.json", "environmentVariables": { "DATABASE_TYPE": "mssql", "DATABASE_NAME": "<your DB name", "CONN_STRING": "<CONN_STRING>", "AUTH_METHOD": "Simulator", "PUB_ENT_SRC": "publishers" }, "httpPort": 5002 }, "validate": { "commandName": "Project", "commandLineArgs": "validate -c src\\Cli\\dab-config.json", "environmentVariables": { "DATABASE_TYPE": "mssql", "DATABASE_NAME": "<your DB name>", "CONN_STRING": "<CONN_STRING>", "AUTH_METHOD": "Simulator", "PUB_ENT_SRC": "publishers" }, "httpPort": 5002 } } } ``` and modify your dab-config.json to have the following contents (assumes that you've deployed the DAB test project's .sql script: ```json { "$schema": "https://github.com/Azure/data-api-builder/releases/download/v0.13.0-rc/dab.draft.schema.json", "data-source": { "database-type": "@env('DATABASE_TYPE')", "connection-string": "@env('CONN_STRING')" }, "runtime": { "rest": { "enabled": true, "path": "/api" }, "graphql": { "allow-introspection": true, "enabled": true, "path": "/graphql" }, "host": { "mode": "development", "cors": { "origins": [ "http://localhost:5000" ], "allow-credentials": false }, "authentication": { "provider": "@env('AUTH_METHOD')" } } }, "entities": { "Query": { "source": "books", "permissions": [ { "role": "anonymous", "actions": [ "read" ] } ], "graphql": { "type": { "singular": "ProductCatalogue", "plural": "ProductCatalogue" } } }, "Product": { "source": "@env('PUB_ENT_SRC')", "permissions": [ { "role": "authenticated", "actions": [ "read", "delete", "update" ] } ] } } } ```
## Why make this change? - Closes #2200, an issue with `dab validate -c "path_to_config.json"` where environment variable `@env('ENV_VAR_NAME')` references are unresolved during config schema validation and schema validation always fails. The issue specifically calls out an error when an environment variable is used for the property `database-type` can't resolve an environment variable and properly set an enum value (mssql, pgsql, mysql, etc) for the property. ## What is this change? - This change corrects how `RuntimeConfigValidator::ValidateConfigSchema(...)` resolves environment variables in the runtime config. The usage of `string jsonData = runtimeConfig.ToJson();` converts the runtime config object (which already has environment variables resolved) into valid json which can then be validated against `dab.draft.schema.json`. ## How was this tested? - [x] Integration Tests: added a test which exercises the code path that validates the dab runtime config against the dab json schema. By ensuring that no schema validation errors occur due to unresolved environment variables. ## Sample Request(s) Add the file `launchSettings.json` to `src/Cli/Properties` with the following contents: ```json { "profiles": { "start": { "commandName": "Project", "commandLineArgs": "start -c src\\Cli\\dab-config.json", "environmentVariables": { "DATABASE_TYPE": "mssql", "DATABASE_NAME": "<your DB name", "CONN_STRING": "<CONN_STRING>", "AUTH_METHOD": "Simulator", "PUB_ENT_SRC": "publishers" }, "httpPort": 5002 }, "validate": { "commandName": "Project", "commandLineArgs": "validate -c src\\Cli\\dab-config.json", "environmentVariables": { "DATABASE_TYPE": "mssql", "DATABASE_NAME": "<your DB name>", "CONN_STRING": "<CONN_STRING>", "AUTH_METHOD": "Simulator", "PUB_ENT_SRC": "publishers" }, "httpPort": 5002 } } } ``` and modify your dab-config.json to have the following contents (assumes that you've deployed the DAB test project's .sql script: ```json { "$schema": "https://github.com/Azure/data-api-builder/releases/download/v0.13.0-rc/dab.draft.schema.json", "data-source": { "database-type": "@env('DATABASE_TYPE')", "connection-string": "@env('CONN_STRING')" }, "runtime": { "rest": { "enabled": true, "path": "/api" }, "graphql": { "allow-introspection": true, "enabled": true, "path": "/graphql" }, "host": { "mode": "development", "cors": { "origins": [ "http://localhost:5000" ], "allow-credentials": false }, "authentication": { "provider": "@env('AUTH_METHOD')" } } }, "entities": { "Query": { "source": "books", "permissions": [ { "role": "anonymous", "actions": [ "read" ] } ], "graphql": { "type": { "singular": "ProductCatalogue", "plural": "ProductCatalogue" } } }, "Product": { "source": "@env('PUB_ENT_SRC')", "permissions": [ { "role": "authenticated", "actions": [ "read", "delete", "update" ] } ] } } } ```
Why make this change?
What is this change?
NOTE:
no by invalid i mean.
if i deserialize the below json
and then deserialize it
dab validate
.How was this tested?
Sample Request(s)
Before update
After Update