Skip to content
Closed
114 changes: 113 additions & 1 deletion src/Config/Converters/Utf8JsonReaderExtensions.cs
Original file line number Diff line number Diff line change
@@ -1,11 +1,12 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

using System.Text;
using System.Text.Json;

namespace Azure.DataApiBuilder.Config.Converters;

static internal class Utf8JsonReaderExtensions
static public class Utf8JsonReaderExtensions
{
/// <summary>
/// Reads a string from the <c>Utf8JsonReader</c> by using the deserialize method rather than GetString.
Expand Down Expand Up @@ -41,4 +42,115 @@ static internal class Utf8JsonReaderExtensions

return JsonSerializer.Deserialize<string>(ref reader, options);
}

/// <summary>
/// Replaces placeholders for environment variables in the given JSON string and returns the new JSON string.
/// </summary>
/// <param name="inputJson">The JSON string to process.</param>
/// <exception cref="JsonException">Thrown when the input is null or when an error occurs while processing the JSON.</exception>
public static string? ReplaceEnvVarsInJson(string? inputJson)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this function when StringJsonConverterFactory() already exists which has helpers to parse out envvironment variable placeholders? that class also uses regex to detect instances of @env()

why not something like:

TryParseConfig(
                json: json,
                config: out config,
                connectionString: _connectionString,
                replaceEnvVar: true)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there something about dab validate that requires a completely different code path + new code?
if so , please add to pr description and code comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please elaborate in PR description. this doesn't really tell us much about "why" it can't be used.

There were certain methods which was doing replacement, but it was written in a way to align with the RuntimeConfig Object. So, it couldn't be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we need the original conversion mechanism to stay since it will be utilized for the workflow outside of dab validate.

Not clear why original conversion would need to change.

{
try
{
if (inputJson is null)
{
throw new JsonException("Input JSON string is null.");
}

// Load the JSON string
using JsonDocument doc = JsonDocument.Parse(inputJson);
using MemoryStream stream = new();
using Utf8JsonWriter writer = new(stream, new JsonWriterOptions { Indented = true });

// Replace environment variable placeholders
ReplaceEnvVarsAndWrite(doc.RootElement, writer);
writer.Flush();

// return the final JSON as a string
return Encoding.UTF8.GetString(stream.ToArray());
}
catch (Exception e)
{
throw new JsonException("Failed to replace environment variables in given JSON. " + e.Message);
}
}

/// <summary>
/// Replaces placeholders for environment variables in a JSON element and writes the result to a JSON writer.
/// </summary>
/// <param name="element">The JSON element to process.</param>
/// <param name="writer">The JSON writer to write the result to.</param>
private static void ReplaceEnvVarsAndWrite(JsonElement element, Utf8JsonWriter writer)
{
switch (element.ValueKind)
{
case JsonValueKind.Object:
writer.WriteStartObject();
foreach (JsonProperty property in element.EnumerateObject())
{
writer.WritePropertyName(property.Name);
// Recursively process each property of the object
ReplaceEnvVarsAndWrite(property.Value, writer);
}

writer.WriteEndObject();
break;
case JsonValueKind.Array:
writer.WriteStartArray();
foreach (JsonElement item in element.EnumerateArray())
{
// Recursively process each item of the array
ReplaceEnvVarsAndWrite(item, writer);
}

writer.WriteEndArray();
break;
case JsonValueKind.String:
string? value = element.GetString();
if (value is not null && value.StartsWith("@env('") && value.EndsWith("')"))
Copy link
Contributor

@aaronburtle aaronburtle May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this assuming that the start of the string is the beginning of the env('<match>')?

What happens if the env('<match>') happens in the middle of the string?

For example, if the person had

{
"database-type": "env('db_type')env('db_lang')"
}

where their environment variable are db_type=ms, and db_lang=sql

will this replace env('db_lang') with sql?

Seems like this would skip all of the replacements except the first and only replace the first when the string starts with the env variable.

Copy link
Contributor

@aaronburtle aaronburtle May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the StringJsonConverterFactory there is logic that handles all of our edge cases with regex match and replace. If you can't use that class explicitly, perhaps you can factor out the logic that is being used for parsing the env variables to be shared?

{
string envVar = value[6..^2];

// Check if the environment variable is set
if (!Environment.GetEnvironmentVariables().Contains(envVar))
{
throw new ArgumentException($"Environment variable '{envVar}' is not set.");
}

string? envValue = Environment.GetEnvironmentVariable(envVar);

// Write the value of the environment variable to the JSON writer
if (envValue == "null")
{
writer.WriteNullValue();
}
else if (bool.TryParse(envValue, out bool boolValue))
{
writer.WriteBooleanValue(boolValue);
}
else if (int.TryParse(envValue, out int intValue))
{
writer.WriteNumberValue(intValue);
}
else if (double.TryParse(envValue, out double doubleValue))
{
writer.WriteNumberValue(doubleValue);
}
else
{
writer.WriteStringValue(envValue);
}
}
else
{
writer.WriteStringValue(value);
}

break;
default:
// Write the JSON element to the writer as is
Copy link
Contributor

@aaronburtle aaronburtle May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the env('') doesnt start the string we end up here, but then we don't do any replacements?

element.WriteTo(writer);
break;
}
}
}
18 changes: 16 additions & 2 deletions src/Core/Configurations/RuntimeConfigValidator.cs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,9 @@

using System.IO.Abstractions;
using System.Net;
using System.Text.Json;
using System.Text.RegularExpressions;
using Azure.DataApiBuilder.Config.Converters;
using Azure.DataApiBuilder.Config.DatabasePrimitives;
using Azure.DataApiBuilder.Config.ObjectModel;
using Azure.DataApiBuilder.Core.AuthenticationHelpers;
Expand Down Expand Up @@ -190,7 +192,19 @@ public async Task<bool> TryValidateConfig(
/// </summary>
public async Task<JsonSchemaValidationResult> ValidateConfigSchema(RuntimeConfig runtimeConfig, string configFilePath, ILoggerFactory loggerFactory)
{
string jsonData = _fileSystem.File.ReadAllText(configFilePath);
string? jsonData = _fileSystem.File.ReadAllText(configFilePath);

// The config file may contain some environment variables that need to be replaced before validation.
try
{
jsonData = Utf8JsonReaderExtensions.ReplaceEnvVarsInJson(jsonData);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have the runtimeConfig deserialized from the configFilePath.

Wouldnt that mean we already had the jsonData with the environment variables replaced?
Or does the runtimeConfig object doesnt have the environment variable values?
Instead of reading from the file path, cant we get the string from which the runtimeConfig object was formed - to do the schema validation on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we cant. schemaValidator require a json string. i tried converting the deserialized runtimeConfig to string and then passing it to the schema validator but it was not forming a valid json string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you provide an example? invalid json string because of extra escape characters? is 'invalid json string' indicative of a different bug?

Copy link
Contributor Author

@abhishekkumams abhishekkumams May 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no by invalid i mean.
if i deserialize the below json

{
"database-type": "mssql"
}

and then serialize it again

{
"database-type" : {
"DatabaseType": 2
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that seems like another bug

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just how enums are serialized i believe, don't think it's a bug

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just how enums are serialized i believe, don't think it's a bug

I think we would need to have custom (de)serialization that handles the case of these enums so that they go back to the intended format. Deserialization/Serialization ought to be reversable so this does seem like a bug that just hasnt arisen yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using string jsonData = runtimeConfig.ToJson() instead of all the custom parsing code and then provided jsonData to:
await jsonConfigSchemaValidator.ValidateJsonConfigWithSchemaAsync(jsonSchema, jsonData!);

works great and doesn't present any of the issues mentioned earlier in this thread. Am I missing something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the tests added in this PR pass with your proposed changes? If yes, we may not need the custom parsing code mentioned here.

}
catch (JsonException e)
{
_logger.LogError(e.Message);
return new JsonSchemaValidationResult(isValid: false, errors: null);
}

ILogger<JsonConfigSchemaValidator> jsonConfigValidatorLogger = loggerFactory.CreateLogger<JsonConfigSchemaValidator>();
JsonConfigSchemaValidator jsonConfigSchemaValidator = new(jsonConfigValidatorLogger, _fileSystem);

Expand All @@ -202,7 +216,7 @@ public async Task<JsonSchemaValidationResult> ValidateConfigSchema(RuntimeConfig
return new JsonSchemaValidationResult(isValid: false, errors: null);
}

return await jsonConfigSchemaValidator.ValidateJsonConfigWithSchemaAsync(jsonSchema, jsonData);
return await jsonConfigSchemaValidator.ValidateJsonConfigWithSchemaAsync(jsonSchema, jsonData!);
}

/// <summary>
Expand Down
116 changes: 116 additions & 0 deletions src/Service.Tests/Unittests/SerializationDeserializationTests.cs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is for testing DW object serialization. Any reason why config env_var tests are placed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, With the name of the file, it was not very evident that it will be used only for DW

Original file line number Diff line number Diff line change
Expand Up @@ -8,10 +8,12 @@
using System.Reflection;
using System.Text.Json;
using System.Text.Json.Serialization;
using Azure.DataApiBuilder.Config.Converters;
using Azure.DataApiBuilder.Config.DatabasePrimitives;
using Azure.DataApiBuilder.Config.ObjectModel;
using Azure.DataApiBuilder.Core.Services.MetadataProviders.Converters;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Newtonsoft.Json.Linq;

namespace Azure.DataApiBuilder.Service.Tests.Unittests
{
Expand Down Expand Up @@ -280,6 +282,120 @@ public void TestDictionaryDatabaseObjectSerializationDeserialization()
VerifySourceDefinitionSerializationDeserialization(deserializedDatabaseTable.TableDefinition, _databaseTable.TableDefinition, "FirstName");
}

/// <summary>
/// This test sets environment variables of different data types,
/// creates an input JSON string with placeholders for these environment variables,
/// and then calls TryGetEnvVarsReplacedJson to replace the placeholders with the actual environment variable values.
/// The output JSON is then compared with the expected JSON to verify that the placeholders were replaced correctly.
/// </summary>
[TestMethod]
public void TestEnvVariableReplacementWithDifferentDataTypes()
{
// Set environment variables
Environment.SetEnvironmentVariable("ENV_STRING", "Hello");
Environment.SetEnvironmentVariable("ENV_NUMBER", "123");
Environment.SetEnvironmentVariable("ENV_FLOAT", "123.45");
Environment.SetEnvironmentVariable("ENV_BOOLEAN", "true");
Environment.SetEnvironmentVariable("ENV_NULL", "null");

// Create input JSON with environment variable placeholders
string inputJson = @"
{
""stringKey"": ""@env('ENV_STRING')"",
""numberKey"": ""@env('ENV_NUMBER')"",
""floatKey"": ""@env('ENV_FLOAT')"",
""booleanKey"": ""@env('ENV_BOOLEAN')"",
""nullKey"": ""@env('ENV_NULL')"",
""arrayKey"": [""@env('ENV_STRING')"", ""@env('ENV_NUMBER')"", ""@env('ENV_FLOAT')"", ""@env('ENV_BOOLEAN')"", ""@env('ENV_NULL')""],
""objectKey"": {
""stringKey"": ""@env('ENV_STRING')"",
""numberKey"": ""@env('ENV_NUMBER')"",
""floatKey"": ""@env('ENV_FLOAT')"",
""booleanKey"": ""@env('ENV_BOOLEAN')"",
""nullKey"": ""@env('ENV_NULL')""
}
}";

// Call the method under test
string outputJson = null;
try
{
outputJson = Utf8JsonReaderExtensions.ReplaceEnvVarsInJson(inputJson);
}
catch (JsonException e)
{
Assert.Fail("Unexpected Failure. " + e.Message);
}

// Create expected JSON
string expectedJson = @"
{
""stringKey"": ""Hello"",
""numberKey"": 123,
""floatKey"": 123.45,
""booleanKey"": true,
""nullKey"": null,
""arrayKey"": [""Hello"", 123, 123.45, true, null],
""objectKey"": {
""stringKey"": ""Hello"",
""numberKey"": 123,
""floatKey"": 123.45,
""booleanKey"": true,
""nullKey"": null
}
}";

// Compare the output JSON with the expected JSON
Assert.IsTrue(JToken.DeepEquals(JToken.Parse(expectedJson), JToken.Parse(outputJson)));
}

/// <summary>
/// This test checks that the ReplaceEnvVarsInJson method throws a JsonException when given an invalid JSON string.
/// The test asserts that the exception's message matches the expected error message.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test asserts that the exception's message matches the expected error message.

nit: While it's good to note that you're checking the emitted error message, your description would be even more beneficial to the reader if it described that you're validating that the error message blames an environment variable replacement failure versus some other JSON parsing issue.

/// </summary>
[TestMethod]
public void TestEnvVariableReplacementWithInvalidJson()
{
// Create an invalid JSON string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: note which part is invalid. (I'm assuming lack of closing quotation marks after the ENV_STRING.)

string inputJson = @"{ ""stringKey"": ""@env('ENV_STRING')', }";

// Call the method under test and expect a JsonException
try
{
Utf8JsonReaderExtensions.ReplaceEnvVarsInJson(inputJson);
Assert.Fail("Expected a JsonException to be thrown");
}
catch (JsonException ex)
{
Assert.AreEqual("Failed to replace environment variables in given JSON. "
+ "Expected end of string, but instead reached end of data. LineNumber: 0 | BytePositionInLine: 38.", ex.Message);
}
}

/// <summary>
/// This test validates that the `ReplaceEnvVarsInJson` method throws a JsonException when
/// it encounters a reference to an unset environment variable in the provided JSON string.
/// The test asserts that the exception's message matches the expected error message.
/// </summary>
[TestMethod]
public void TestEnvVariableReplacementWithMissingEnvVar()
{
// Create a JSON string with a placeholder for the missing environment variable
string inputJson = @"{ ""stringKey"": ""@env('MISSING_ENV_VAR')"" }";

// Call the method under test and expect a JsonException
try
{
Utf8JsonReaderExtensions.ReplaceEnvVarsInJson(inputJson);
Assert.Fail("Expected a JsonException to be thrown");
}
catch (JsonException ex)
{
Assert.AreEqual("Failed to replace environment variables in given JSON. "
+ "Environment variable 'MISSING_ENV_VAR' is not set.", ex.Message);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the following scenarios result in this error message? I suppose "unset environment variable" would apply to both of these:

  • MISSING_ENV_VAR exists but doesn't have a value -> null
  • MISSING_ENV_VAR doesn't exist -> null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, actually not. environment files are read as test. environment values are read as string,i.e. "null", and then added to the json as a null value. So, if a user wants to set a null value. it will be assigned as null and no error should be thorwn. there is a test for the same.

}
}

private void InitializeObjects()
{
_options = new()
Expand Down