Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(entity-generator): allow generating scalar properties for FKs #4892

Merged
merged 7 commits into from
Nov 17, 2023

Conversation

boenrobot
Copy link
Collaborator

When a pivot table's primary key is referenced by a foreign key constraint, render an entity for the pivot table.

NOTE: I am using Windows, and the test suite seems to not pass on it. I'm also not sure how to add a new test for this. But I did test it with my code base, where it does end up generating the correct entities and references to them.

Copy link

codecov bot commented Nov 3, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (aa2e889) 99.43% compared to head (e23cea7) 99.44%.
Report is 7 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #4892    +/-   ##
========================================
  Coverage   99.43%   99.44%            
========================================
  Files         222      222            
  Lines       15797    16012   +215     
  Branches     3763     3836    +73     
========================================
+ Hits        15708    15923   +215     
  Misses         86       86            
  Partials        3        3            

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: I am using Windows, and the test suite seems to not pass on it.

You should use WSL, I believe that should do it.

I'm also not sure how to add a new test for this.

I don't want to accept any code changes I don't understand without seeing the why, so a test case is needed. I looked at the change you did and no idea why you would need to make things conditional based on that. Sounds like you have a wrong entity definition rather than its something to fix in the ORM, but I might be wrong - hard to guess without the failing test case first.

What do you not understand about creating one? Check this folder, it's full of examples for bug reproductions:

https://github.com/mikro-orm/mikro-orm/tree/master/tests/issues

If you struggle with the tests, it can be as well a separate repository, all I need is to understand what you do, I can help with making a test case out of it.


Also, you should drop all the unrelated code formatting changes (e.g. I can see you changed the order of imports for no reason).

@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 3, 2023

I have an existing DB with a schema, and I am trying to generate entities out of it.

Here's a minimal MySQL schema that showcases the issue (an extremely simplified version of my full schema):

-- MySQL Workbench Forward Engineering

SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION';

-- -----------------------------------------------------
-- Schema pivot_ref_examples
-- -----------------------------------------------------

-- -----------------------------------------------------
-- Schema pivot_ref_examples
-- -----------------------------------------------------
CREATE SCHEMA IF NOT EXISTS `pivot_ref_examples` DEFAULT CHARACTER SET utf8 ;
USE `pivot_ref_examples` ;

-- -----------------------------------------------------
-- Table `sender`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `sender` (
  `sender_id` SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(255) NOT NULL,
  PRIMARY KEY (`sender_id`),
  UNIQUE INDEX `name_UNIQUE` (`name` ASC) VISIBLE)
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `emails`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `emails` (
  `email_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `email` VARCHAR(255) NOT NULL,
  PRIMARY KEY (`email_id`),
  UNIQUE INDEX `email_UNIQUE` (`email` ASC) VISIBLE)
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `sender_emails`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `sender_emails` (
  `sender_id` SMALLINT UNSIGNED NOT NULL,
  `email_id` INT UNSIGNED NOT NULL,
  PRIMARY KEY (`sender_id`, `email_id`),
  INDEX `fk_sender_emails_emails1_idx` (`email_id` ASC) VISIBLE,
  CONSTRAINT `fk_sender_emails_sender`
    FOREIGN KEY (`sender_id`)
    REFERENCES `sender` (`sender_id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE,
  CONSTRAINT `fk_sender_emails_emails1`
    FOREIGN KEY (`email_id`)
    REFERENCES `emails` (`email_id`)
    ON DELETE RESTRICT
    ON UPDATE CASCADE)
ENGINE = InnoDB;


-- -----------------------------------------------------
-- Table `email_messages_log`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `email_messages_log` (
  `log_id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `sender_id` SMALLINT UNSIGNED NOT NULL,
  `sender_email_id` INT UNSIGNED NOT NULL,
  `recepient_email_id` INT UNSIGNED NOT NULL,
  PRIMARY KEY (`log_id`),
  INDEX `fk_email_messages_log_sender_emails1_idx` (`sender_id` ASC, `sender_email_id` ASC) VISIBLE,
  INDEX `fk_email_messages_log_emails1_idx` (`recepient_email_id` ASC) VISIBLE,
  CONSTRAINT `fk_email_messages_log_sender_emails1`
    FOREIGN KEY (`sender_id` , `sender_email_id`)
    REFERENCES `sender_emails` (`sender_id` , `email_id`)
    ON DELETE RESTRICT
    ON UPDATE CASCADE,
  CONSTRAINT `fk_email_messages_log_emails1`
    FOREIGN KEY (`recepient_email_id`)
    REFERENCES `emails` (`email_id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;


SET SQL_MODE=@OLD_SQL_MODE;
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;

In email_messages_log, the compound FK fk_email_messages_log_sender_emails1 is a reference to sender_emails. The table sender_emails is a pivot table.

Why am I even doing this? Because I want to enforce on a DB schema level that emails can only be sent by senders if an email is theirs. Only once they are verified by the application would there be an entry in sender_emails. Sure, I could have separate FKs in email_messages_log for the sender_id and sender_email_id columns, but that means I can add an entry about an email that was used for sending, and is not owned by the sender. Or I could use a trigger, sure, but I'd rather not if I can avoid it.

In the full schema, I have some more guards to ensure time flexibility and accounting for logging of failed sends, but that's not additions needed to showcase the issue.

Anyhow, the generated entity for email_messages_log has a reference to sender_emails. But because sender_emails is a pivot table, no real file is generated, leaving the entity for email_messages_log (and any other entity that might reference it) unusable.

Sounds like you have a wrong entity definition

I am using a schema first approach, not an entity first approach. And there are no pre-existing entity definitions at the time of the entity generation. The issue is with the generated entity definition being wrong.

What do you not understand about creating one?

Where to put SQL schema definition and where to put expected entity generator output for one thing... all tests that I managed to find assume a pre-existing set of entity definitions, and the issue here is specifically with the entity generation.

You should use WSL, I believe that should do it.

Even on WSL(2), there are a lot of failures, though most failures are timeout related, and it takes ages to get to that point, whereas without WSL, I immediately get the failures, and most errors are related to being unable to load the config (and a few path normalization assertions that I patched, but have not included in this PR, since there's bigger issues). Docker is running on Windows with its WSL2 engine.

I'm guessing WSL's issue is being unable to reach the container for some reason that I don't yet understand. I don't know why Windows is trying to find a config in the place where it's trying to find it - there isn't a config there indeed.

Also, you should drop all the unrelated code formatting changes (e.g. I can see you changed the order of imports for no reason).

Fixed. Sorry about that. During my attempts to fix the issue, I must've added an import, which I later removed, but my IDE is configured to auto format when a new import is modified.

@B4nan
Copy link
Member

B4nan commented Nov 3, 2023

Thanks for the details, now I see what you are trying to fix, that's a good first step. The generated entity is indeed wrong (even if the pivot entity would be generated, there should not be the senderId column at all, only the relation one is supposed to be there), but the fix you made still does not feel correct (or better say it's still very confusing). I will have a look later today and try to come up with something cleaner and more general, as you are trying to fix only the edge case you face, while the issue feels rather a bit wider.

I am using a schema first approach

MikroORM is a code-first framework, entity generator is a way to bootstrap your project from existing schema, not a schema-first approach per se. Over time it's getting better and better, so in a way you can use it this way too, but it never was the supposed usage. I've created #3484 some time ago with some ideas of what should be done to make this a proper schema-first approach, but it has low priority, again because MikroORM is a code-first framework.

@B4nan
Copy link
Member

B4nan commented Nov 3, 2023

Isn't the right fix here to generate the pivot entity as a real file? It feels like that's a more sane approach, in the end, your schema references it from other places.

(fwiw, you can have a pivot entity defined as a regular entity nowadays, see https://mikro-orm.io/docs/collections#custom-pivot-table-entity)

When a pivot table's primary key is referenced by a foreign key constraint, render an entity for the pivot table.

Oh so that is what you were trying to do in the PR, I see. So we are aligned on that, good :]


Here is a test case for this:

import { MikroORM } from '@mikro-orm/mysql';
import { EntityGenerator } from '@mikro-orm/entity-generator';

test('4892', async () => {
  const orm = await MikroORM.init({
    dbName: 'pivot_ref_examples',
    port: 3308,
    discovery: { warnWhenNoEntities: false },
    extensions: [EntityGenerator],
    multipleStatements: true,
  });
  await orm.schema.ensureDatabase();
  await orm.schema.execute(`
CREATE TABLE IF NOT EXISTS \`sender\` (
  \`sender_id\` SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
  \`name\` VARCHAR(255) NOT NULL,
  PRIMARY KEY (\`sender_id\`),
  UNIQUE INDEX \`name_UNIQUE\` (\`name\` ASC) VISIBLE)
ENGINE = InnoDB;

CREATE TABLE IF NOT EXISTS \`emails\` (
  \`email_id\` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  \`email\` VARCHAR(255) NOT NULL,
  PRIMARY KEY (\`email_id\`),
  UNIQUE INDEX \`email_UNIQUE\` (\`email\` ASC) VISIBLE)
ENGINE = InnoDB;

CREATE TABLE IF NOT EXISTS \`sender_emails\` (
  \`sender_id\` SMALLINT UNSIGNED NOT NULL,
  \`email_id\` INT UNSIGNED NOT NULL,
  PRIMARY KEY (\`sender_id\`, \`email_id\`),
  INDEX \`fk_sender_emails_emails1_idx\` (\`email_id\` ASC) VISIBLE,
  CONSTRAINT \`fk_sender_emails_sender\`
    FOREIGN KEY (\`sender_id\`)
    REFERENCES \`sender\` (\`sender_id\`)
    ON DELETE CASCADE
    ON UPDATE CASCADE,
  CONSTRAINT \`fk_sender_emails_emails1\`
    FOREIGN KEY (\`email_id\`)
    REFERENCES \`emails\` (\`email_id\`)
    ON DELETE RESTRICT
    ON UPDATE CASCADE)
ENGINE = InnoDB;

CREATE TABLE IF NOT EXISTS \`email_messages_log\` (
  \`log_id\` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  \`sender_id\` SMALLINT UNSIGNED NOT NULL,
  \`sender_email_id\` INT UNSIGNED NOT NULL,
  \`recepient_email_id\` INT UNSIGNED NOT NULL,
  PRIMARY KEY (\`log_id\`),
  INDEX \`fk_email_messages_log_sender_emails1_idx\` (\`sender_id\` ASC, \`sender_email_id\` ASC) VISIBLE,
  INDEX \`fk_email_messages_log_emails1_idx\` (\`recepient_email_id\` ASC) VISIBLE,
  CONSTRAINT \`fk_email_messages_log_sender_emails1\`
    FOREIGN KEY (\`sender_id\` , \`sender_email_id\`)
    REFERENCES \`sender_emails\` (\`sender_id\` , \`email_id\`)
    ON DELETE RESTRICT
    ON UPDATE CASCADE,
  CONSTRAINT \`fk_email_messages_log_emails1\`
    FOREIGN KEY (\`recepient_email_id\`)
    REFERENCES \`emails\` (\`email_id\`)
    ON DELETE CASCADE
    ON UPDATE CASCADE)
ENGINE = InnoDB;
  `);
  const dump = await orm.entityGenerator.generate();
  expect(dump).toMatchSnapshot('mysql-entity-dump');

  await orm.schema.dropDatabase();
  await orm.close(true);
});

@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 3, 2023

Thank you for the pointer to pivotEntity. I've modified the PR to also include that in the ManyToMany relations when the pivot table would be outputted anyway.

Here is a test case for this:

Thanks. I have now added it, along with the generated file in the snapshots folder, which... if I'm understanding correctly is the expected output when it doesn't exist, but otherwise, if it mismatches, the test would be considered a failure?

I've created #3484 some time ago with some ideas of what should be done to make this a proper schema-first approach,

IMHO, the presence of a robust entity generator already enables a schema first approach, as one can always regenerate their entities, even on migrations.

Sure, you may still need to refactor your app after regeneration, to handle all the new scenarios that your latest schema changes entail, but at least TS will, ideally, warn you, and you can fix all problems before deploying.

(even if the pivot entity would be generated, there should not be the senderId column at all, only the relation one is supposed to be there)

Hm... I could try to solve that too, but I feel like that may be me biting more than I can chew :D

@boenrobot boenrobot force-pushed the referencedPivotGeneration branch 6 times, most recently from b008fdf to 18b0e17 Compare November 7, 2023 09:11
@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 8, 2023

OK, my first commit made the test run without regressions and produce a file that at least compiles...

I've now added a second commit to address the broader issue #4898. Now that it is fixed, the output of my original case is also more sensible, and I have updated that expected output.

EDIT: Looking at the test results of the full test suite now, there are some changes in some other tests. The most frequent one is related to the prop name for composite FKs - the name of the property is now based on the referenced table, unless the same table is referenced multiple times, or the referenced table happens to be the same name as a column in the table - in those cases, a name based on the FK is picked. Previously, the name of the first column in the FK would be used, but it is this very approach that seems to be causing the other problems.

Also, fieldNames is included on all composite FKs, which is the other big change affecting old tests.

IMO, those changes alone warrant a fix in the tests' expected result.

@boenrobot boenrobot force-pushed the referencedPivotGeneration branch 5 times, most recently from ec35f0d to 688cccf Compare November 8, 2023 15:23
@B4nan
Copy link
Member

B4nan commented Nov 8, 2023

I would prefer to keep things as they were for existing tests (so not printing the field names if they patch the naming strategy, etc).

@boenrobot boenrobot force-pushed the referencedPivotGeneration branch 4 times, most recently from b47e811 to 1a31fc6 Compare November 9, 2023 08:32
@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 9, 2023

Done. I managed to make the name selection a bit smarter... Basically, it will check to see if it is safe to use a column name as the name for a FK prop (detailed comments included for what counts as "safe"), fallback to trying to use the referenced entity, and if that too is unsafe, just use the constraint name.

Also fixed the field names thing, and even the order (FKs named after columns are outputted first, in order of column declaration, as before; other FKs are outputted after that).

Since all test cases fall along the lines of "safe", there are now no changes in the old test cases.

EDIT: Also added more test cases, to trigger less safe entity generation. I must admit it was a bit difficult to think of a semi-practical, even if very contrived example, that goes all the way to having to use the constraint name 😄 . The last is absolutely not something that anyone would ever do intentionally... but it is technically a working schema that can exist by mistake.

@boenrobot boenrobot force-pushed the referencedPivotGeneration branch 2 times, most recently from 5b97622 to 49e8e66 Compare November 9, 2023 16:33
Copy link
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall good job, left some comments, mostly about code style and complexity. looking at the snapshot, there are few things that i'd like to see polished, mainly the senderId vs sender property name, relation properties shouldnt have the Id suffix (defined in naming strategy, this is technically dynamic and needs to work just based on comparing the value to what NS would produce) - this is handled propertly for the regular relations so i am sure you can fine in the code how its done and replicate it to your changes

import { SenderEmails } from './SenderEmails';

@Entity()
@Index({ name: 'fk_email_messages_log_sender_emails1_idx', properties: ['senderId', 'sender'] })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will need to deal with that sender vs senderId too, this index definition is wrong because of it

Copy link
Collaborator Author

@boenrobot boenrobot Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this whole property idea somewhat flawed?

Like, if I have two composite FKs that are both covering one column, and I have an index on that one column in addition to the index I'd normally use for the sake of the FK.

e.g.

  • columns a,b,c (all nullable, or all non-nullable; the problem exists either way)
  • fk_x over a and b (with an according index)... renders as prop a
  • fk_y over b and c (with an according index)... renders as prop c
  • index b_idx over b

For the indexes used by the FKs, sure, the one prop for the FK is the one to be listed, but what about the index for the column itself?

Prop b would not be rendered, because it is swallowed by the two FKs. So this leaves b_idx to either under specify props, i.e. nothing... or over specify props - specify props a and c, because b is part of them both.

Would the correct behavior of the generator here be to include all FK props with this column if there is no field for the column? Or maybe always include all props in which the column participates in?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand why are you talking about nullability (nor why the code does things conditionally based on it), or about indexes. This is only about entity properties and database columns, I don't see how indexes and nullability have any meaning in this problem.

Like, if I have two composite FKs that are both covering one column, and I have an index on that one column in addition to the index I'd normally use for the sake of the FK.

One thing to say out loud, this use case was never really supported, IMO this is a very specific setup most people simply don't need. Also worth saying this is a code-first framework, the entity generator is not a schema-first approach, it's a way to bootstrap projects and it is supposed to be used only once (and the output is subject to manual changes).

Would the correct behavior of the generator here be to include all FK props with this column if there is no field for the column?

No idea, provide a code example and we can talk, this is too abstract for me to say anything :]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand why are you talking about nullability (nor why the code does things conditionally based on it),

At least in MySQL and MariaDB (idk about Postgresql, but I imagine there too), if you have a composite FK, where some or all of the columns are nullable, the FK is not enforced at all if one or more of the columns are set to NULL. As soon as all columns are not null, the entire FK is enforced.

Therefore, it is ok for the ORM to treat a FK prop's value as undefined if one of the FKs' columns is NULL. Any non-nullable columns however must be set somehow for the row to still be valid.

Hm... now that I'm spelling this out, I'm thinking

      if (specificColumnNames.length === currentFk.columnNames.length) {
        // All columns involved with this FK are only covered by this one FK.
        // It is safe to attach this FK to the first nullable column,
        // or if the FK has all non-nullable columns, attach it to the first column.
        const columnName = nullableColumnsInFk.at(0) ?? currentFk.columnNames[0];
        fksOnColumnProps.set(columnName, currentFk);
        continue;
      }

may not actually be safe in this form... If you have multiple nullable columns in a FK, even if this is the only FK over those columns, it still means you can set only some of those columns to something other than NULL, and keep the others NULL. That's a valid DB row, without the FK being enforced, and due to the above fragment, that mixed state is not representable. I'll have to adjust this too I guess. In the case of multiple nullable columns, I guess the FK prop would have to be named after the referenced table if possible, and all columns that are part of the FK would need to be rendered as props without the FK attached.

So the above index problem is really only a problem with all non-nullable columns...

  • columns a,b,c (all nullable, or all non-nullable; the problem exists either way)
  • fk_x over a and b (with an according index)... renders as prop a
  • fk_y over b and c (with an according index)... renders as prop c
  • index b_idx over b

One thing to say out loud, this use case was never really supported

I understand... that is what I'm trying to help with. Help to support this case :)

IMO this is a very specific setup most people simply don't need.

In my experience, this is a chicken-and-the-egg problem. People do need this... BUT they never do it in the DB layer because their ORM of choice doesn't support it, and their ORM of choice doesn't support it because their users don't use it, because their DB designs are already made with that or competing ORM's limited capabilities in mind. The same kind of constraints that I'm trying to enforce on a DB level are more often enforced on the application level due to these limitations, but they are frequently needed, especially as the requirements of the application crystalize.

Also worth saying this is a code-first framework, the entity generator is not a schema-first approach, it's a way to bootstrap projects and it is supposed to be used only once (and the output is subject to manual changes).

Still... if it's a valid SQL schema, the generator should be able to produce usable code out of it... even if that code is later manually refined or requires special care to be used correctly.

I will admit I do intend to have a schema-first approach, where I'll regenerate the entities with every migration (or every major MikroORM upgrade), but I don't expect the generator to cater to this. I expect that I'll always have to post-process the output from the entity generator... one ts-morph, eslint and prettier pass at minimum, in this order. And it's OK for that to be out of scope for the generator. I expect my app to potentially fail to compile with every entity regeneration, even if I didn't change the schema, but especially if I changed the schema... and that too is OK.

No idea, provide a code example and we can talk, this is too abstract for me to say anything :]

The "overlap_fk_example" in the new tests in one such example. "nullable_fk_example" is another. I can come up with a few more similar examples.

I guess the correct answer for the index depends on what MikroORM uses the index meta data for... I can't seem to find any uses beyond just "keeping a record of it" and validating if said records are correct upon meta data discovery... If the intent is to f.e. in the future add "force index" when all involved props are requested, then I guess it would be ok to feature all properties in which "b" is part of. If the intent is for MikroORM to be able to generate the SQL that targets the correct columns during a migration generation, changing "properties" to be "columns" may be more appropriate... Or maybe add both in the "index" declaration, and use either "properties" or "columns" based on context? As far as the generator is concerned, it would generate both the list of columns, and the list of properties in which any one of those columns is involved.

Copy link
Member

@B4nan B4nan Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in MySQL and MariaDB (idk about Postgresql, but I imagine there too), if you have a composite FK, where some or all of the columns are nullable, the FK is not enforced at all if one or more of the columns are set to NULL. As soon as all columns are not null, the entire FK is enforced.

What I don't understand is how is all of that relevant to entity/schema generators. The FK is in the schema and as such should be generated into a relation property in the entity generator, its not relevant that some of the targeting columns are nullable or not. What you say (I think) is that it has a runtime effect, what I say is that I don't see how it's connected to what the entity generator generates.

I understand... that is what I'm trying to help with. Help to support this case :)

Yeah sure, and I am more than happy to accept PRs for that, thanks!

Still... if it's a valid SQL schema, the generator should be able to produce usable code out of it... even if that code is later manually refined or requires special care to be used correctly.

Sure, again, I am fine with any improvements if it was previously failing completely with a type error or similar.

I guess the correct answer for the index depends on what MikroORM uses the index meta data for...

Only for the schema generator, there is no runtime effect on the ORM level. Unique constraints do have some effect (namely for em.upsert(), I don't recall any other place), but not a regular index.


One more thing to mention, when it comes to complex indexes, the indexes should be generated as index expressions, no need to deal with the column<->property mapping at all for those - recent connected issue about that is #4911.

And maybe one last note - I am fine merging this as long as it does not break the existing tests (which it does not right now), it's overall an improvement even if it produces some things wrongly (better than type errors).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I don't understand is how is all of that relevant to entity/schema generators. The FK is in the schema and as such should be generated into a relation property in the entity generator, its not relevant that some of the targeting columns are nullable or not. What you say (I think) is that it has a runtime effect, what I say is that I don't see how it's connected to what the entity generator generates.

For each table, the generator needs to generate a class that can represent any possible state of any row in that table.

It would be easy to just generate props for every column named after the column name, then also generate separate props for every FK, named after the FK constraint... But this is not what the generator is currently doing, so it won't be backwards compatible... It's also not exactly ergonomic, so trying to be smarter about the names of props and combining FK props into column props where it's safe to do so is indeed a welcome addition to DX.

One more thing to mention, when it comes to complex indexes, the indexes should be generated as index expressions, no need to deal with the column<->property mapping at all for those - recent connected issue about that is #4911.

OK... that is a valid solution... I can detect if all columns in the index are covered by a single prop each (rather than a prop with an attached composite FK), and if it's not safe to just list all props, fallback to an expression. Sounds perfectly reasonable to me. If things change in the future, the generator could potentially be made smarter to further support those index features.

And maybe one last note - I am fine merging this as long as it does not break the existing tests (which it does not right now), it's overall an improvement even if it produces some things wrongly (better than type errors).

Let me fix some of your other nit picks first (I kind of already did fix most, just haven't pushed yet...). I'll see if I can squeeze in this index fix too.

packages/entity-generator/src/EntityGenerator.ts Outdated Show resolved Hide resolved
packages/entity-generator/src/EntityGenerator.ts Outdated Show resolved Hide resolved
packages/core/src/typings.ts Outdated Show resolved Hide resolved
packages/entity-generator/src/EntityGenerator.ts Outdated Show resolved Hide resolved
tests/issues/GH4898.test.ts Outdated Show resolved Hide resolved
tests/issues/GH4898.test.ts Outdated Show resolved Hide resolved
tests/issues/GH4898.test.ts Outdated Show resolved Hide resolved
logId!: number;

@ManyToOne({ entity: () => SenderEmails, fieldNames: ['sender_id', 'sender_email_id'], updateRule: 'cascade' })
senderId!: SenderEmails;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this called senderId and not sender?

Comment on lines 72 to 75
@ManyToOne({ entity: () => Sender, fieldName: 'sender_id', updateRule: 'cascade', deleteRule: 'cascade', primary: true })
senderId!: Sender;

@ManyToOne({ entity: () => Emails, fieldName: 'email_id', updateRule: 'cascade', primary: true, index: 'fk_sender_emails_emails1_idx' })
emailId!: Emails;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, this should adhere to naming strategy instead of using explciit fieldName

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is still not addressed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And with regards to the senderId vs sender thing... I think my test cases are hitting cases where the code doesn't treat that part as removable... Specifically, I have email_messages_log.sender_id reference email_senders.sender_id... My naming convention - to just use the same column name everywhere, unless the same table is referenced multiple times from the same table, in which case, use a meaningful prefix - is not covered by the regex. The regex would have worked if I had email_messages_log.email_sender_id reference email_senders.sender_id, as it would then replace the _sender_id in email_sender_id with an empty string, resulting in the prop being called email.

If this is really bothering you too much, I can edit the example schema to use "id" in the origin table, and "sender_id" in the table that references the origin table, thus triggering the removal.

But the naming algorithm works as designed for its earlier use cases. There's no way for the generator to figure out that "_id" is the only removable part in the above combination.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the naming algorithm works as designed for its earlier use cases. There's no way for the generator to figure out that "_id" is the only removable part in the above combination.

I don't follow, it's a relation, the naming strategy is where we append _id suffix to relations. It worked fine till now (and it still works for all the other tests), so please don't tell me things are not designed for it, this is the very same case as for other entities, it is not about properties being removable or not - as mentioned in the other thread, I'd like to have this configurable and consistent, the flag I'd like to see would force generate the scalars here is well, and you are defining the relations with names of what should be scalars.

If this is really bothering you too much, I can edit the example schema to use "id" in the origin table, and "sender_id" in the table that references the origin table, thus triggering the removal.

I don't want you to adjust the tests so they pass, I want to address this problem, as it wasn't there before.

@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 13, 2023

OK, I've addressed all issues pointed out in the review before.

Being able to generate a unique index with an expression instead of properties required a bit of extra changes in the types.

With regards to composite FKs with multiple nullable columns that I mentioned, that one requires a small change in the existing tests - new optional properties are added for each nullable column in the composite FK (which in the existing tests is all the cases of "favoriteCar"). The FK itself is still named the same where possible (which in the case of the existing tests is "everywhere"), so if anyone relied on the generator for a schema first approach, they will likely not have a BC break, but they'll get the extra ability to handle rows where the composite FK is not enforced due to it being partially filled.

And with regards to the senderId vs sender thing... I think my test cases are hitting cases where the code doesn't treat that part as removable... Specifically, I have email_messages_log.sender_id reference email_senders.sender_id... My naming convention - to just use the same column name everywhere, unless the same table is referenced multiple times from the same table, in which case, use a meaningful prefix - is not covered by the regex. The regex would have worked if I had email_messages_log.email_sender_id reference email_senders.sender_id, as it would then replace the _sender_id in email_sender_id with an empty string, resulting in the prop being called email.

Honestly, I think the whole naming strategy thing needs somewhat of an overhaul... specifically, it should let the user guide all property and class names during generation, after being fed the whole meta and current suggested name. Then the generator can act accordingly based on the returned suggested name. Instead, right now, a lot of decisions are outside of the naming strategy's control, like this stripping of the suffix of the referenced column, or the "E" in front of tables that start with a number. And even when the naming strategy class is consulted, it doesn't get enough context to make any non-trivial name changes... But all of that can wait for another time... I for one am not bothered by it enough... yet.

@boenrobot boenrobot force-pushed the referencedPivotGeneration branch 6 times, most recently from b32ab9d to 61ce20e Compare November 15, 2023 12:35
@boenrobot
Copy link
Collaborator Author

ok, ready... I reorganized the tests in GH4898.test.ts a bit to be less repetitive and added an extra set of ones to cover the index picking that codecov/patch was complaining is missing coverage... everything is now green, finally 😃

Copy link
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will review this later this evening, left just two quick comments now.

also, what issues is this addressing, is it just #4898 or is #4911 also covered?

packages/core/src/metadata/EntitySchema.ts Outdated Show resolved Hide resolved
docs/docs/entity-generator.md Outdated Show resolved Hide resolved
@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 15, 2023

will review this later this evening, left just two quick comments now.

Addressed both of those now.

also, what issues is this addressing, is it just #4898 or is #4911 also covered?

I don't have a test for #4911... as in, idk if there's something Postgresql specific that I'm missing there... But I think it is also effectively covered. The bottom line is if one or more props don't exist, an expression is generated for the index.


It seems to me #4911 is about functional indexes... I haven't dealt with them specifically. The schema helper is given the index, and it's up to it to produce the expression. By the looks of it, no driver's schema helper supports functional indexes yet, so those would still be broken. And on a related note, spatial and full-text indexes would also be broken for similar reasons.

I do hope to eventually use functional indexes myself, so I might end up contributing that too (in a separate PR)... although it's not as pressing for me as it is for @maxsommer , so if he PRs changes to SchemaHelper to support it, all the better😄

Added option to the entity generator called "scalarPropertiesForRelations".
Used to control how scalar properties for composite relations are generated.
Default value of "never" preserved BC and forces all FKs to be fulfilled,
value "always" renders all scalar properties of FKs always,
suitable mostly for applications that intend to sometimes disable FK checks
and value "smart" renders scalar properties of composite FKs only if
the FK can potentially be violated due to only some columns being set.

Adjusted the naming algorithm to fallback to removing the "reference column name"
when removing the full reference column name does not result in a change.
This would remove the "_id" suffix in naming conventions where
both sides of the relation are using the same column name,
and that name happens to end in "_id"
(or whatever a custom naming strategy would dictate that the reference column name is).

Fixed generation of indexes annotations for plain columns.
@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 15, 2023

I feel like I will regret asking this now, as you may agree, but...

Have you considered renaming the @Index property "expression" to "createStatement" or something of that nature? Keep "expression" for later use by functional indexes... I realize that may be a BC break, but v6 is around the corner, so might as well...

…referencedPivotGeneration

# Conflicts:
#	packages/knex/src/schema/DatabaseTable.ts
@@ -53,6 +53,10 @@ By default, the `EntityGenerator` generates only owning sides of relations (e.g.
- `esmImport` to use esm style import for imported entities e.x. when `esmImport=true`, generated entities include `import Author from './Author.js'`
- `skipTables` to ignore some database tables (accepts array of table names)
- `skipColumns` to ignore some database tables columns (accepts an object, keys are table names with schema prefix if available, values are arrays of column names)
- `scalarPropertiesForRelations` to control how scalar column properties are generated for foreign key relations. Possible values:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so i took a look at how this works (having the scalar properties side by side with relation ones), and we need to add persist: false to the scalar props, otherwise they will trigger validation if you dont provide value for both when creating an entity (there might be more problems, this is the first one i faced). with persist: false, they wont be considered for change tracking, validation, or schema diffing. they will be still hydrated as usual.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed... I think... I haven't added tests to see if the generated entities work ok, but I changed it to add persist: false if a scalar column is rendered without a FK attached and yet a FK featuring it exists.

…re needed

Also ensured that FK props get to have a safe name
in cases where the naming strategy would otherwise name both the same.
Copy link
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left few final comments

Adjusted prop order, such that scalar props follow the FK prop named after them.

Scalar primary columns are not marked as such if they are not to be persisted.

Added handling (and a test for) an edge case where the FK constraint name could
in theory match the name of a column in the same table,
while also not being the only reference to that table.
In such cases, "fk_" is added as prefix, potentially multiple times,
until a non-existent column is hit.

Added tests for more setting combinations in GH4898.test.ts.
Specifically, for bidirectionalRelations and identifiedReferences.
esmImports is intentionally left out due to its smaller impact (not worth the time).
@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 17, 2023

All done. Still no tests for whether the generated entities are usable, but they are generated OK. However, I also added a test case for an edge case I realize exists at least in theory - FK constraint name shared with a column name. I add "fk_" prefix to the suggested name, until a non-existent column is hit (which is probably going to be the first iteration, but just in case, it's in a loop...).

Copy link
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all right, good job, lets move this forward!

@B4nan B4nan changed the title fix(entity-generator): pivot table referencing feat(entity-generator): allow generating scalar properties for FKs Nov 17, 2023
@B4nan B4nan merged commit abad6ca into mikro-orm:master Nov 17, 2023
11 checks passed
@B4nan
Copy link
Member

B4nan commented Nov 17, 2023

The tests are pretty crazy (the one for #4898), would be good to simplify the schema or think of a way to make them faster. Ideally 2-3 tables, with as few properties as possible, that way it could be reasonably fast on its own. Or split them into test per schema, so they can run in parallel (we would ideally do both).

(I've spend a lot of time making the tests "fast" again - it used to be 5x slower - and this adds ~15% to the run time :])

@boenrobot
Copy link
Collaborator Author

boenrobot commented Nov 17, 2023

ok... I'll make a new PR in which I will split that file into one file per schema, and add it to the entity generator folder... I kind of wanted to do that initially, but wanted to smooth out the behaviors first 😅

I'm not so sure about simplifying the example schemas though, as the edge cases do require a few tables to be present... Although I guess I can make the examples less realistic.

@B4nan
Copy link
Member

B4nan commented Nov 17, 2023

A few tables are ok, let's say up to 5, but that test is using 10 tables (in 6 different schemas). And agreed, also wanted to move the test to the right folder :]

Btw be sure to pull latest changes, I found few small bugs in your changes and fixed them as part of 64a39f8.

boenrobot added a commit to boenrobot/mikro-orm that referenced this pull request Nov 17, 2023
boenrobot added a commit to boenrobot/mikro-orm that referenced this pull request Nov 17, 2023
boenrobot added a commit to boenrobot/mikro-orm that referenced this pull request Nov 20, 2023
Also moved mikro-orm#4892 into the entity generator folder.

Moved schema creation in each file into a beforeAll(),
rather than at the start of every test.
@boenrobot boenrobot deleted the referencedPivotGeneration branch February 10, 2024 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants