Skip to content

Conversation

abhishekkumams
Copy link
Contributor

@abhishekkumams abhishekkumams commented May 2, 2024

Why make this change?

What is this change?

  • Added a method that takes in Json as string, and returns the new Json with replaced environment variables as string.

NOTE:

  • There were certain methods which was doing replacement, but it was written in a way to align with the RuntimeConfig Object. So, it couldn't be used. for example, different converterFactory classes were calling this StringJsonConvertorFactory and then converting it to required type based on the definition in the schema.
  • Also, trying to deserialize it and then serialize it again changes the formatting. for example:
    no by invalid i mean.
    if i deserialize the below json
{
"database-type": "mssql"
}

and then deserialize it

{
"database-type" : {
"DatabaseType": 2
}
}
  • Also, we need the original conversion mechanism to stay since it will be utilized for the workflow outside of dab validate.
  • Since, the new method takes string as input, it can be used in the future for the hosting scenario as the config file is provided as string.

How was this tested?

  • Unit Tests

Sample Request(s)

"data-source": {
      "database-type": "@env('DATABASE_TYPE')",
      "options": {
        "database": "@env('DATABASE_NAME')",
        "schema": "@env('GRAPHQL_SCHEMA_PATH')"
      },
      "connection-string": "@env('CONN_STRING')"
    }

Before update

image

After Update

image

@abhishekkumams
Copy link
Contributor Author

/azp run

@abhishekkumams
Copy link
Contributor Author

/azp run

Copy link
Contributor

@seantleonard seantleonard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow why we need new environment variable parsing logic as opposed to using pre-existing code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is for testing DW object serialization. Any reason why config env_var tests are placed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, With the name of the file, it was not very evident that it will be used only for DW


/// <summary>
/// This test checks that the ReplaceEnvVarsInJson method throws a JsonException when given an invalid JSON string.
/// The test asserts that the exception's message matches the expected error message.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test asserts that the exception's message matches the expected error message.

nit: While it's good to note that you're checking the emitted error message, your description would be even more beneficial to the reader if it described that you're validating that the error message blames an environment variable replacement failure versus some other JSON parsing issue.

[TestMethod]
public void TestEnvVariableReplacementWithInvalidJson()
{
// Create an invalid JSON string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: note which part is invalid. (I'm assuming lack of closing quotation marks after the ENV_STRING.)

catch (JsonException ex)
{
Assert.AreEqual("Failed to replace environment variables in given JSON. "
+ "Environment variable 'MISSING_ENV_VAR' is not set.", ex.Message);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the following scenarios result in this error message? I suppose "unset environment variable" would apply to both of these:

  • MISSING_ENV_VAR exists but doesn't have a value -> null
  • MISSING_ENV_VAR doesn't exist -> null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, actually not. environment files are read as test. environment values are read as string,i.e. "null", and then added to the json as a null value. So, if a user wants to set a null value. it will be assigned as null and no error should be thorwn. there is a test for the same.

/// </summary>
/// <param name="inputJson">The JSON string to process.</param>
/// <exception cref="JsonException">Thrown when the input is null or when an error occurs while processing the JSON.</exception>
public static string? ReplaceEnvVarsInJson(string? inputJson)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this function when StringJsonConverterFactory() already exists which has helpers to parse out envvironment variable placeholders? that class also uses regex to detect instances of @env()

why not something like:

TryParseConfig(
                json: json,
                config: out config,
                connectionString: _connectionString,
                replaceEnvVar: true)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there something about dab validate that requires a completely different code path + new code?
if so , please add to pr description and code comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please elaborate in PR description. this doesn't really tell us much about "why" it can't be used.

There were certain methods which was doing replacement, but it was written in a way to align with the RuntimeConfig Object. So, it couldn't be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we need the original conversion mechanism to stay since it will be utilized for the workflow outside of dab validate.

Not clear why original conversion would need to change.

Copy link
Collaborator

@Aniruddh25 Aniruddh25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions on the approach

// The config file may contain some environment variables that need to be replaced before validation.
try
{
jsonData = Utf8JsonReaderExtensions.ReplaceEnvVarsInJson(jsonData);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have the runtimeConfig deserialized from the configFilePath.

Wouldnt that mean we already had the jsonData with the environment variables replaced?
Or does the runtimeConfig object doesnt have the environment variable values?
Instead of reading from the file path, cant we get the string from which the runtimeConfig object was formed - to do the schema validation on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we cant. schemaValidator require a json string. i tried converting the deserialized runtimeConfig to string and then passing it to the schema validator but it was not forming a valid json string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you provide an example? invalid json string because of extra escape characters? is 'invalid json string' indicative of a different bug?

Copy link
Contributor Author

@abhishekkumams abhishekkumams May 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no by invalid i mean.
if i deserialize the below json

{
"database-type": "mssql"
}

and then serialize it again

{
"database-type" : {
"DatabaseType": 2
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that seems like another bug

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just how enums are serialized i believe, don't think it's a bug

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just how enums are serialized i believe, don't think it's a bug

I think we would need to have custom (de)serialization that handles the case of these enums so that they go back to the intended format. Deserialization/Serialization ought to be reversable so this does seem like a bug that just hasnt arisen yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using string jsonData = runtimeConfig.ToJson() instead of all the custom parsing code and then provided jsonData to:
await jsonConfigSchemaValidator.ValidateJsonConfigWithSchemaAsync(jsonSchema, jsonData!);

works great and doesn't present any of the issues mentioned earlier in this thread. Am I missing something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the tests added in this PR pass with your proposed changes? If yes, we may not need the custom parsing code mentioned here.

@Aniruddh25 Aniruddh25 added this to the 1.1rc milestone May 3, 2024
break;
case JsonValueKind.String:
string? value = element.GetString();
if (value is not null && value.StartsWith("@env('") && value.EndsWith("')"))
Copy link
Contributor

@aaronburtle aaronburtle May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this assuming that the start of the string is the beginning of the env('<match>')?

What happens if the env('<match>') happens in the middle of the string?

For example, if the person had

{
"database-type": "env('db_type')env('db_lang')"
}

where their environment variable are db_type=ms, and db_lang=sql

will this replace env('db_lang') with sql?

Seems like this would skip all of the replacements except the first and only replace the first when the string starts with the env variable.

Copy link
Contributor

@aaronburtle aaronburtle May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the StringJsonConverterFactory there is logic that handles all of our edge cases with regex match and replace. If you can't use that class explicitly, perhaps you can factor out the logic that is being used for parsing the env variables to be shared?


break;
default:
// Write the JSON element to the writer as is
Copy link
Contributor

@aaronburtle aaronburtle May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the env('') doesnt start the string we end up here, but then we don't do any replacements?

Copy link
Contributor

@aaronburtle aaronburtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions about how and when we are matching the environment variables, and if we can share some code.

@seantleonard seantleonard modified the milestones: 1.1rc, 1.1 Patch 01 May 11, 2024
seantleonard added a commit that referenced this pull request May 14, 2024
## Why make this change?

- Closes #2200, an issue with `dab validate -c "path_to_config.json"`
where environment variable `@env('ENV_VAR_NAME')` references are
unresolved during config schema validation and schema validation always
fails. The issue specifically calls out an error when an environment
variable is used for the property `database-type` can't resolve an
environment variable and properly set an enum value (mssql, pgsql,
mysql, etc) for the property.

## What is this change?

- This change corrects how
`RuntimeConfigValidator::ValidateConfigSchema(...)` resolves environment
variables in the runtime config. The usage of `string jsonData =
runtimeConfig.ToJson();` converts the runtime config object (which
already has environment variables resolved) into valid json which can
then be validated against `dab.draft.schema.json`.

## How was this tested?

- [x] Integration Tests: added a test which exercises the code path that
validates the dab runtime config against the dab json schema. By
ensuring that no schema validation errors occur due to unresolved
environment variables.

## Sample Request(s)

Add the file `launchSettings.json` to `src/Cli/Properties` with the
following contents:
```json
{
  "profiles": {
    "start": {
      "commandName": "Project",
      "commandLineArgs": "start -c src\\Cli\\dab-config.json",
      "environmentVariables": {
        "DATABASE_TYPE": "mssql",
        "DATABASE_NAME": "<your DB name",
        "CONN_STRING": "<CONN_STRING>",
        "AUTH_METHOD": "Simulator",
        "PUB_ENT_SRC": "publishers"
      },
      "httpPort": 5002
    },
    "validate": {
      "commandName": "Project",
      "commandLineArgs": "validate -c src\\Cli\\dab-config.json",
      "environmentVariables": {
        "DATABASE_TYPE": "mssql",
        "DATABASE_NAME": "<your DB name>",
        "CONN_STRING": "<CONN_STRING>",
        "AUTH_METHOD": "Simulator",
        "PUB_ENT_SRC": "publishers"
      },
      "httpPort": 5002
    }
  }
}
```

and modify your dab-config.json to have the following contents (assumes
that you've deployed the DAB test project's .sql script:

```json
{
  "$schema": "https://github.com/Azure/data-api-builder/releases/download/v0.13.0-rc/dab.draft.schema.json",
  "data-source": {
    "database-type": "@env('DATABASE_TYPE')",
    "connection-string": "@env('CONN_STRING')"
  },
  "runtime": {
    "rest": {
      "enabled": true,
      "path": "/api"
    },
    "graphql": {
      "allow-introspection": true,
      "enabled": true,
      "path": "/graphql"
    },
    "host": {
      "mode": "development",
      "cors": {
        "origins": [
          "http://localhost:5000"
        ],
        "allow-credentials": false
      },
      "authentication": {
        "provider": "@env('AUTH_METHOD')"
      }
    }
  },
  "entities": {
    "Query": {
      "source": "books",
      "permissions": [
        {
          "role": "anonymous",
          "actions": [
            "read"
          ]
        }
      ],
      "graphql": {
        "type": {
          "singular": "ProductCatalogue",
          "plural": "ProductCatalogue"
        }
      }
    },
    "Product": {
      "source": "@env('PUB_ENT_SRC')",
      "permissions": [
        {
          "role": "authenticated",
          "actions": [
            "read",
            "delete",
            "update"
          ]
        }
      ]
    }
  }
}
```
ayush3797 pushed a commit that referenced this pull request May 14, 2024
## Why make this change?

- Closes #2200, an issue with `dab validate -c "path_to_config.json"`
where environment variable `@env('ENV_VAR_NAME')` references are
unresolved during config schema validation and schema validation always
fails. The issue specifically calls out an error when an environment
variable is used for the property `database-type` can't resolve an
environment variable and properly set an enum value (mssql, pgsql,
mysql, etc) for the property.

## What is this change?

- This change corrects how
`RuntimeConfigValidator::ValidateConfigSchema(...)` resolves environment
variables in the runtime config. The usage of `string jsonData =
runtimeConfig.ToJson();` converts the runtime config object (which
already has environment variables resolved) into valid json which can
then be validated against `dab.draft.schema.json`.

## How was this tested?

- [x] Integration Tests: added a test which exercises the code path that
validates the dab runtime config against the dab json schema. By
ensuring that no schema validation errors occur due to unresolved
environment variables.

## Sample Request(s)

Add the file `launchSettings.json` to `src/Cli/Properties` with the
following contents:
```json
{
  "profiles": {
    "start": {
      "commandName": "Project",
      "commandLineArgs": "start -c src\\Cli\\dab-config.json",
      "environmentVariables": {
        "DATABASE_TYPE": "mssql",
        "DATABASE_NAME": "<your DB name",
        "CONN_STRING": "<CONN_STRING>",
        "AUTH_METHOD": "Simulator",
        "PUB_ENT_SRC": "publishers"
      },
      "httpPort": 5002
    },
    "validate": {
      "commandName": "Project",
      "commandLineArgs": "validate -c src\\Cli\\dab-config.json",
      "environmentVariables": {
        "DATABASE_TYPE": "mssql",
        "DATABASE_NAME": "<your DB name>",
        "CONN_STRING": "<CONN_STRING>",
        "AUTH_METHOD": "Simulator",
        "PUB_ENT_SRC": "publishers"
      },
      "httpPort": 5002
    }
  }
}
```

and modify your dab-config.json to have the following contents (assumes
that you've deployed the DAB test project's .sql script:

```json
{
  "$schema": "https://github.com/Azure/data-api-builder/releases/download/v0.13.0-rc/dab.draft.schema.json",
  "data-source": {
    "database-type": "@env('DATABASE_TYPE')",
    "connection-string": "@env('CONN_STRING')"
  },
  "runtime": {
    "rest": {
      "enabled": true,
      "path": "/api"
    },
    "graphql": {
      "allow-introspection": true,
      "enabled": true,
      "path": "/graphql"
    },
    "host": {
      "mode": "development",
      "cors": {
        "origins": [
          "http://localhost:5000"
        ],
        "allow-credentials": false
      },
      "authentication": {
        "provider": "@env('AUTH_METHOD')"
      }
    }
  },
  "entities": {
    "Query": {
      "source": "books",
      "permissions": [
        {
          "role": "anonymous",
          "actions": [
            "read"
          ]
        }
      ],
      "graphql": {
        "type": {
          "singular": "ProductCatalogue",
          "plural": "ProductCatalogue"
        }
      }
    },
    "Product": {
      "source": "@env('PUB_ENT_SRC')",
      "permissions": [
        {
          "role": "authenticated",
          "actions": [
            "read",
            "delete",
            "update"
          ]
        }
      ]
    }
  }
}
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Schema Validation is not considering environment variables
5 participants