-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: enum name consisting of symbols lead to conflicting enum values #1261
Conversation
…nums values Signed-off-by: Vinayak Kukreja <vinakuk@amazon.com>
src/type-generator.ts
Outdated
const lowerCaseValue = value?.toString().toLowerCase(); | ||
if (lowerCaseValue && !processedValues.has(lowerCaseValue)) { | ||
processedValues.add(lowerCaseValue); | ||
if (value && isAllowlistedCharacter(value.toString())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really a character you're checking for, but a complete num value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like your lookup table. But is this going to deal with characters anywhere they occur? Would we want to, or is that not necessary?
How about we do something in a loop to replace all occurrences of groups of symbols with words? (This might be overkill, I'm not sure about the domain)
const memberName = snakeCase(rewriteNamedSymbols(`${value}`).replace(/[^a-z0-9]/gi, '_')).split('_').filter(x => x).join('_').toUpperCase();
function rewriteNamedSymbols(value: string) {
const ret = new Array<string>();
while (value.length > 0) {
const [prefixName, prefixLen] = longestPrefixMatch(value, NAMED_SYMBOLS);
if (prefixName) {
ret.push(`_${prefixName}_`);
value = value.slice(prefixLen);
} else {
ret.push(value.charAt(0));
value = value.slice(1);
}
}
if (ret[0] === '_') { ret.unshift('VALUE'); }
if (ret[ret.length - 1] === '_') { ret.pop(); }
return ret.join('');
}
function longestPrefixMatch(x: string, lookupTable: Record<string, string>) {
let ret: string | undefined;
let longest: number = 0;
for (const [name, value] of Object.entries(lookupTable)) {
if (x.startsWith(value) && value.length > longest) {
ret = name;
longest = value.length;
}
}
return [ret, longest];
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea, taking a look.
For context, we import Custom Resource Definitions(these are json schemas) in cdk8s and there is an issue with importing an enum in Prometheus CRD: https://github.com/prometheus-operator/kube-prometheus/blob/373e2b415bd0ddbcb049dea54db80c2a78b76945/manifests/setup/0alertmanagerConfigCustomResourceDefinition.yaml#L67-L71
It does not have these symbols occurring between other words but it makes sense to add this. My thought process here was to just add an allowlist so that we can add to that list if we agree with providing an exception for the user.
src/character-allowlist.ts
Outdated
@@ -0,0 +1,58 @@ | |||
const allowListCharacters = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be in SHOUTY_CASE
since it's a constant.
And "allow list" is very generic. How about NAMED_SYMBOLS
?
Signed-off-by: Vinayak Kukreja <vinakuk@amazon.com>
VALUE_0_PERIOD = 0.9, | ||
/** 0.95 */ | ||
VALUE_0_95 = 0.95, | ||
VALUE_0_PERIOD_95 = 0.95, | ||
/** 0.99 */ | ||
VALUE_0_99 = 0.99, | ||
VALUE_0_PERIOD_99 = 0.99, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Callout
: Looks like this would be a breaking change and would change enum names in certain scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good catch, that's probably not great.
Can we make an exception for a period inside numbers?
Signed-off-by: Vinayak Kukreja <vinakuk@amazon.com>
/** Foo-Bar */ | ||
FOO_BAR = \\"Foo-Bar\\", | ||
/** Bar_Foo */ | ||
BAR_FO = \\"Bar_Foo\\", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not look correct. Taking a look.
/** Bar_Foo */
BAR_FO = \\"Bar_Foo\\",
Signed-off-by: Vinayak Kukreja <vinakuk@amazon.com>
src/type-generator.ts
Outdated
|
||
function isExemptedSymbolPattern(input: string): boolean { | ||
// Decimal numbers | ||
if (/^\d*\.\d+$/.test(input)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code does something else than it looks like it's doing, and that's dangerous (I'm not sure if it's a bug, but at least it's confusing).
isExemptedSymbolPattern
will only ever be called on the start of a string, because we keep constantly slicing off the start of the string that we have already replaced. So the "cursor" is focused on the .
right now, and we've already gotten rid of everything in front of the cursor.
That means the \d*
will never match anything, because it has already been removed.
That's fine, it apparently works because the tests are passing, but we're not testing for what this regex looks to be indicating we are: we're not testing for a .
in between numbers: we're only testing for a .
followed by numbers.
I suppose the difference should show up if we were going to translate an enum value like 1.
, which in the old system would be replaced by VALUE_1_
and in the new one by VALUE_1_PERIOD
?
So -- is that a problem? I'm not sure, I'll leave that to you. But if you're going to leave the current code, please update the regex so it doesn't look like it's matching digits at the start anymore.
If you want to be able to test full context, we're going to have to change the code a little:
Instead of slicing off the start of the string, every time we match something, instead we'll keep a counter ofs
and we'll iterate over the string without mutating it:
let ofs = 0;
while (ofs < value.length) {
const [prefixName, prefixLen] = longestPrefixMatch(value, ofs, NAMED_SYMBOLS);
// ...
ofs += prefixLen;
}
value.startsWith(x)
now needs to be replaced with something like hasSubStringAt(value, x, ofs)
.
Finally, for the regex, we need to start searching at a specific index, instead of at the start. For that we use the flag y
(sticky), which makes a regex match at a particular index, and not anywhere else. Plus, we acn use lookbehind assertions (?<=...)
to say that something must be true for the string before the match, without actually being part of the match itself (or its index).
// RegExp API is a bit wonky in JavaScript so let's make a helper
function testRegexAt(regex: string, haystack: string, index: number) {
const re = new RegExp(regex, 'y');
re.lastIndex = index;
return re.test(haystack);
}
function isExemptPattern(value: string, ofs: number) {
const exemptPatterns = [
// Note: use lookbehind to test for things before the current position
'(?<=\d)\.\d',
// ...more patterns here...?
];
return exemptPatterns.some((p) => testRegexAt(p, value, ofs));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, right, I am dumb 😮💨
Thanks Rico, this was really helpful. Taking a look. 🙏
Signed-off-by: Vinayak Kukreja <vinakuk@amazon.com>
Signed-off-by: Vinayak Kukreja <vinakuk@amazon.com>
/** .9 */ | ||
VALUE_PERIOD_9 = \\".9\\", | ||
/** 9 */ | ||
VALUE_9 = \\"9\\", | ||
/** 9. */ | ||
VALUE_9_PERIOD = \\"9.\\", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Callout
:
This is a breaking change but I believe this is needed since it was not working and would give the same value.
Similar to these are,
Foo.
.Foo
For instance, if I run this with mainline, I see:
| index.ts:39:3 - error TS2300: Duplicate identifier 'FOO'.
| 39 FOO = ".Foo",
| ~~~
| index.ts:43:3 - error TS2300: Duplicate identifier 'FOO'.
| 43 FOO = "Foo.",
| ~~~
| index.ts:47:3 - error TS2300: Duplicate identifier 'VALUE_9'.
| 47 VALUE_9 = ".9",
| ~~~~~~~
| index.ts:49:3 - error TS2300: Duplicate identifier 'VALUE_9'.
| 49 VALUE_9 = "9",
| ~~~~~~~
| index.ts:51:3 - error TS2300: Duplicate identifier 'VALUE_9'.
| 51 VALUE_9 = "9.",
Signed-off-by: Vinayak Kukreja <vinakuk@amazon.com>
FOO_HYPHEN_BAR = \\"Foo-Bar\\", | ||
/** Foo_Bar */ | ||
FOO_UNDERSCORE_BAR = \\"Foo_Bar\\", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to prior, these are breaking and would not work with our current code. Both would give FOO_BAR
.
Characters like,
!=, ==, =~, !~
etc would result inVALUE_
enum values since we are not able to code generate these characters to something valid. This PR is adding an allowlist with some known characters to provide valid enum values if such a case occurs.Fixes cdk8s-team/cdk8s-cli#578