Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(entity-generator): allow arbitrary class and prop names as identifiers #5359

Merged
merged 1 commit into from
Apr 9, 2024

Conversation

boenrobot
Copy link
Collaborator

@boenrobot boenrobot commented Mar 19, 2024

AbstractNamingStrategy is updated to handle class conflicts, and inherently,
any naming strategy that inherits from it.

The "word boundary" character in entity and property names is now correctly replaced,
even in cases where multiple sequential separators are present.

Any table that starts with a character that is invalid as a JS start character (not just digits)
gets prefixed with "E".
Characters that are not valid in any position as a class identifier (or the character "$")
get turned into code points, prefixed with $, making them valid (though inconvenient to type)
identifiers and file names.

In AbstractNamingStrategy.columnNameToProperty(),
prop names that would be in conflict with PopulatePath values get prefixed with "$".

Prop names that are not valid identifiers get string quoted.
This is a SourceFile/EntitySchemaSourceFile feature, not a naming strategy feature,
so it will get applied regardless of naming strategy.

The MySqlSchemaHelper doesn't special case index expression based on column name.
Lack of column name or presence of an "expression" column with "where" is still considered.

The PostgreSqlSchemaHelper properly escapes all identifiers when looking them up.
Due to the relative complexity of PostgreSQL's expressions vs column names,
index properties are still not picked up for odd identifiers,
though an index expression is generated.

Also made the imports of types tied to the "entity" prop option, rather than the prop declaration.
This ensures proper imports when using mapToPk in extension hooks.

Also moved the schema de-duplication code to happen before bi-directional and many-to-many
detection, to ensure less work and error prone-ness for the other stages in updating references.

Also fixed quoting of strings in generated columns.

@boenrobot
Copy link
Collaborator Author

boenrobot commented Mar 19, 2024

This started as yet another subset of #5273 (and I know it will have conflicts with it, which I'll resolve later...), but it evolved as I realized there's no test for the thing I noticed there (the inability to correctly reference "odd" identifiers like that, due to the amendment being outside of the naming strategy), and during the making of the test, I realized there's also plenty of other cases where the naming strategy is insufficient, and eventually reached this.

There is one known edge case not covered here... Identifiers containing a dot (.). This would require a bit more significant refactoring to make MirkoORM handle this better, so... a story for a different PR to specifically handle "dot in identifier vs dot to separate schema and table name" in some way... I think it's best to introduce dedicated schema fields in the FK, and output those too in separate prop options where not the current schema... but again, that is a more significant refactor.

NOTE: The change in the schema generator is a consequence of the quote normalization happening in loadInformationSchema.
It's probably worth to also normalize the charset of strings there too, or in higher layers, though I would still need to test what MySQL defaults to... I suspect charset of the connection that defined the column, which means that to "properly" detect schema changes, the schema generator needs to lookup its connection's charset, and only strip that one out from the generated column definitions it is comparing.

Copy link

codecov bot commented Mar 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.77%. Comparing base (acd6590) to head (e7c91b4).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5359   +/-   ##
=======================================
  Coverage   99.77%   99.77%           
=======================================
  Files         260      260           
  Lines       17868    17886   +18     
  Branches     3780     3793   +13     
=======================================
+ Hits        17827    17845   +18     
  Misses         41       41           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@boenrobot boenrobot marked this pull request as ready for review March 19, 2024 11:12
@boenrobot boenrobot marked this pull request as draft March 19, 2024 13:06
@boenrobot boenrobot force-pushed the UnidentifiableDeclaredObjects branch 4 times, most recently from d30e7e7 to 5ab1991 Compare March 19, 2024 17:00
packages/knex/src/schema/SchemaComparator.ts Dismissed Show dismissed Hide dismissed
@boenrobot boenrobot marked this pull request as ready for review March 19, 2024 17:10
@boenrobot boenrobot force-pushed the UnidentifiableDeclaredObjects branch from 5ab1991 to 86c4556 Compare March 20, 2024 15:54
@B4nan
Copy link
Member

B4nan commented Mar 22, 2024

Prop names that conflict with the prototype of Reference (which all loaded entities inherit from)

This is not true. Why did you think so? Maybe we need to improve the docs somewhere.

Reference is an optional wrapper class, nowadays represented by the Ref type.

https://mikro-orm.io/docs/guide/type-safety#reference-wrapper

@boenrobot
Copy link
Collaborator Author

Prop names that conflict with the prototype of Reference (which all loaded entities inherit from)

This is not true. Why did you think so? Maybe we need to improve the docs somewhere.

Reference is an optional wrapper class, nowadays represented by the Ref type.

https://mikro-orm.io/docs/guide/type-safety#reference-wrapper

Ok, "inherit" was the wrong word... "wrap"... point is that you have a point at which you have accessible both the properties and extra identifiers like "$", and "load".

So... when you have a wrapped object, be it because of Ref, or because it is the direct result of find()... Is there a way to read or write a prop with an arbitrary name? If there is, it is not documented.

@B4nan
Copy link
Member

B4nan commented Mar 23, 2024

point is that you have a point at which you have accessible both the properties and extra identifiers like "$", and "load".

Nope, that's not how it works, the wrapped entity doesn't have anything in common with the Reference class.

So... when you have a wrapped object, be it because of Ref, or because it is the direct result of find()... Is there a way to read or write a prop with an arbitrary name? If there is, it is not documented.

I don't understand your question. It's a simple wrapper class, if you want to access anything else than the primary key, you need to unwrap with one of the helper methods.

@boenrobot boenrobot force-pushed the UnidentifiableDeclaredObjects branch 5 times, most recently from b7fcb7b to 0b02ee2 Compare March 25, 2024 17:03
Copy link
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the escaping is apparently not working correctly

'my@odd.column'!: string;

@Property({ fieldName: 'column with apostrophe in it\\'s name', length: 45 })
'columnWithApostropheInIt\\'sName'!: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong escaping

Copy link
Collaborator Author

@boenrobot boenrobot Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean the double slash... this is an artifact of the fact that jest snapshots aren't the literal file, but a template string inside a js file.

If you check the actual file produced, you'll see it has only one slash inside it.

That said, I should probably make it create the files, as in the mysql test, so... I'll change that part. But the escaping here is correct.

EDIT: OK, changed to output the file, and check for a file presence and remove... same as MySQL test.

'columnWithApostropheInIt\\'sName'!: string;

@Property({ fieldName: 'column with backtick in it\`\`s name', length: 45 })
'columnWithBacktickInIt\`\`sName'!: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary escaping of `

Copy link
Collaborator Author

@boenrobot boenrobot Mar 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think This is the same "problem" here too...

The snapshot is inside a template string... that's just how jest's .snap files work...

So it escapes the `, in order to stay into the template string, while in the actual contents of the file, you won't see the ` being escaped.

@boenrobot boenrobot force-pushed the UnidentifiableDeclaredObjects branch from 0b02ee2 to bee6e12 Compare March 27, 2024 09:51
@boenrobot boenrobot requested a review from B4nan April 1, 2024 10:05
@boenrobot boenrobot force-pushed the UnidentifiableDeclaredObjects branch from bee6e12 to 8c333e0 Compare April 2, 2024 12:57
@boenrobot
Copy link
Collaborator Author

boenrobot commented Apr 2, 2024

I've added a fix for the first problem described in #5400, meaning a custom naming strategy would not be needed for that part. Also amended the tests to feature a new table and a column that follow those patterns.

And like I already said about the backslash and backtick escapes... That this is just because the snapshots are template strings. The raw files don't have the extra slashes. So those are correct.

@boenrobot
Copy link
Collaborator Author

As mentioned in #5418, I've now added disambiguation for column names that could end up conflicting with "$infer" or "*", which yes, are technically valid identifiers, even if no one in their right mind would use those actual names for an actual column.

…fiers

AbstractNamingStrategy is updated to handle class conflicts, and inherently,
any naming strategy that inherits from it.

The "word boundary" character in entity and property names is now correctly replaced,
even in cases where multiple sequential separators are present.

Any table that starts with a character that is invalid as a JS start character (not just digits)
gets prefixed with "E".
Characters that are not valid in any position as a class identifier (or the character "$")
get turned into code points, prefixed with $, making them valid (though inconvenient to type)
identifiers and file names.

In AbstractNamingStrategy.columnNameToProperty(),
prop names that would be in conflict with PopulatePath values get prefixed with "$".

Prop names that are not valid identifiers get string quoted.
This is a SourceFile/EntitySchemaSourceFile feature, not a naming strategy feature,
so it will get applied regardless of naming strategy.

The MySqlSchemaHelper doesn't special case index expression based on column name.
Lack of column name or presence of an "expression" column with "where" is still considered.

The PostgreSqlSchemaHelper properly escapes all identifiers when looking them up.
Due to the relative complexity of PostgreSQL's expressions vs column names,
index properties are still not picked up for odd identifiers,
though an index expression is generated.

Also made the imports of types tied to the "entity" prop option, rather than the prop declaration.
This ensures proper imports when using mapToPk in extension hooks.

Also moved the schema de-duplication code to happen before bi-directional and many-to-many
detection, to ensure less work and error prone-ness for the other stages in updating references.

Also fixed quoting of strings in generated columns.
@boenrobot boenrobot force-pushed the UnidentifiableDeclaredObjects branch from 401ff9d to e7c91b4 Compare April 9, 2024 07:45
Copy link
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this wont backfire, the changes are quite a magic to me :]

@B4nan B4nan merged commit b0c0236 into master Apr 9, 2024
11 checks passed
@B4nan B4nan deleted the UnidentifiableDeclaredObjects branch April 9, 2024 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants