Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Represent stored routines and provide type safe way to execute them #5253

Open
boenrobot opened this issue Feb 17, 2024 · 4 comments
Open

Represent stored routines and provide type safe way to execute them #5253

boenrobot opened this issue Feb 17, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@boenrobot
Copy link
Collaborator

boenrobot commented Feb 17, 2024

Is your feature request related to a problem? Please describe.

In some cases, it may be appropriate to define functionality in stored procedures or functions, particularly in otherwise chatty series of queries that don't need additional input by the app, beyond the initial one. In PostgreSQL, there's also the fact that stored functions are what's needed to be attached to triggers.

Describe the solution you'd like

I'm thinking something along the lines of

@Routine<AddRecord>({
  type: 'procedure',// required. type: 'procedure' | 'function'
  name: 'add_record',//optional. Respects naming strategy as for entities when omitted
  comment: 'Adds a record, and returns the newly added record',//optional
  security: 'invoker',//optional
  definer: 'root',//optional

  deterministic: true,//optional. Explicitly sets the (NOT) DETERMINISTIC flag if provided

//optional. Sets the data access (for MySQL, that's CONTAINS SQL, NO SQL, READS SQL DATA, or MODIFIES SQL DATA).
  dataAccess: 'MODIFIES SQL DATA',

// Required for functions, and defines the returned type.
// For procedures, if present, defines the type of result set
// (i.e. output is always an array of this type, even if guaranteed to be a single record by the query).
// Multiple result sets can be defined as an tuple where each member is the type of that result set.
// Whenever an entity type is specified, lazy scalars always get resolved if returned by the procedure,
// and references are not loaded.
// Returned entities themselves load/refresh the identity map and result caches,
// just like a regular find() call with an empty populate.
// Users wishing to have procedures return certain shapes without an associated entity
// can define virtual entities for use here.
//
// Maybe instead of a function or an array tuple, also allow the definition of an object,
// that will contain a function that will be called with all result sets, input object and entity manager as input,
// and returns either an array where each member the return type for the respective result set,
// (which should be automatically collapsed if it's a single result set for single member arrays)
// or anything else to represent the entire output of the call.
// This would enable the procedure to return a variable and mixed set of result sets,
// which this object's function can then deal with.
  returns: () => RecordEntity,

//optional. Default is driver specific. Mostly useful in PostgreSQL, where one can define custom references to C libs
  language: 'SQL',

// Optional for routines in language other than SQL.
// can be a string that is the procedure body,
// or a function that takes the name of the inputs of the class itself, similarly to generated column decorator,
// and the entity manager that is creating the procedure (not the caller).
// This is mostly to enable the use of entity aware query builder.
// Returns the body of the procedure to be created.
  body:  (params, em) => `BEGIN
SET ${params.hash} = SHA1(CONCAT(${params.name}, ${params.age}, 'secret salt'));
INSERT INTO record_entity SET hash = ${params.hash}, name = ${params.name}, age = ${params.age};
SELECT * FROM record_entity WHERE hash = ${params.hash};
END`
});
export class AddRecord{
  @Property({columnType: 'tinyint'});
  age: number;

  @Property({columnType: 'varchar(255)'});
  name: string;

  // Refs denote an INOUT param.
  // Adding "persist: false" turns the param into an OUT param.
  @Property({columnType: 'char(40)', ref: true, persist: false});
  hash: Ref<string>;
}

and then when calling

// prepare holder for out param.
const hash = new ScalarReference<string>();

// Type of newRecords inferred from returns, potentially void if the procedure does not have a returns defined.
// In this example, it's "RecordEntity[]", that happens to contain just one member.
// If returns option was [() => RecordEntity, 'string'],
// the inferred return type would have been [RecordEntity[], string[]].
// If the returns option was an object to hydrate rows, the inferred return type would be an array of its return type,
// whatever that may be.
const newRecords = await orm.em.callProcedure(
  AddRecord,
  /*type hints from class.*/
  {name: 'Jon Snow', years: 30, hash}
);

/* resolves to a set of SQL statements like

SET @`mikro_orm_ref0` = NULL; -- Set to initial value of reference, or NULL if not provided.
CALL add_record('John Snow', 30, @`mikro_orm_ref0`); -- Hydrates returned result sets according to decorator
SELECT @`add_record__hash`; -- Only if ref is provided in callProcedure. Hydrates into the references provided.
SET @`mikro_orm_ref0` = NULL; -- Always set to NULL in SQL, to prevent data leaks in other statements.

*/

// do something else the results in newRecords, including f.e. populating the relations in the entities.

// Now that we called the procedure, this will not be undefined.
// In this particular example, we could've gotten the same value out of newRecords instead,
// but for the sake of example...
const loadedHash = hash.unwrap();

And of course, it goes without saying that eventually, the schema generator and entity generator should be able to handle those definitions, so that these can be versioned along with the rest of the DB.

Describe alternatives you've considered
The way to use those right now is to define them manually, perhaps in separately maintained and versioned SQL files, and use raw queries to manually map the results back. Extremely inconvenient, which means most people would rather never use them, even if appropriate.

Additional context
Related to #5053

EDIT: Param decorator removed, per feedback. Replaced with persist: false for OUT params, while refs by default would be INOUT params.

@boenrobot boenrobot added the enhancement New feature or request label Feb 17, 2024
@B4nan
Copy link
Member

B4nan commented Feb 17, 2024

I have no idea what's this supposed to be doing, but feel free to PR it 😄 Just keep in mind the schema comparator needs to be able to detect if something is already in the schema (or is this runtime only?). The new decorator feels also pretty weird to me, I would rather have some new options instead, or we need a more specific name.

One note - Ref needs to stay as type, it cant be a class.

@boenrobot
Copy link
Collaborator Author

boenrobot commented Feb 17, 2024

One note - Ref needs to stay as type, it cant be a class.

ok. Edited the above example, before I forget.

I have no idea what's this supposed to be doing,

The decorators are declaring a stored function/procedure (in the example, a procedure). Each class member represents an input param. And the other options are part of the class decorator.

https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html

And the second code block shows how you'd use the declared class in the application code.

Just keep in mind the schema comparator needs to be able to detect if something is already in the schema (or is this runtime only?).

No. Not only runtime. Also migration generation time. The schema comparator can look up information_schema in the case of MySQL and whatever is the equivalent in PostgreSQL... The same schema looked up during entity generation, which I'm guessing (I haven't looked deep into it yet...) is also used in schema comparator. I haven't examined PostgreSQL in detail, but I know for sure MySQL has all the needed info in its information_schema, including even the SQL source of routines, so the comparator should be able to detect if the declaration matches what is already in DB, and generate a drop + create if there are any differences detected, including (most notably) differences in the body of the stored routine.

The new decorator feels also pretty weird to me, I would rather have some new options instead, or we need a more specific name.

Stored routines behave very differently from entities, so I think overloading the @Entity decorator would only introduce hard to diagnose bugs. But functions and procedures are similar enough that separate decorators for each is overkill... so I thought a more generic name like @Routine is a good middle ground.

Or do you mean the @Param decorator? Hm... well... for that one, I suppose some of the existing options can be given new semantics when used in this context. Maybe persist: false, eager: false to be equivalent to an out param, while persist: false, eager: true would be an "INOUT" param?

but feel free to PR it 😄

Great. Yes, I do plan to PR it, but I wanted to get your opinion on the proposed design first.


EDIT: Wait... Ref is already required anyway for out and inout params, so I guess ref: true, persist: true = out param, and ref: true, persist: false = inout param? Or maybe some other option to indicate inout param, but I'm not sure which one. Maybe trackChanges: false for out params, while trackChanges: true would be inout? I mean, stored routines will not be part of the change set anyway (as the ORM can't reliably detect mutations performed by the routine, beyond those returned), so may as well repurpose trackChanges for that.

@B4nan
Copy link
Member

B4nan commented Feb 17, 2024

Thanks, will take a look during next week and try to get acquainted with this to have some opinion :]

Stored routines behave very differently from entities, so I think overloading the @entity decorator would only introduce hard to diagnose bugs. But functions and procedures are similar enough that separate decorators for each is overkill... so I thought a more generic name like @routine is a good middle ground.

This is more about what the actual decorators should be doing. If you want to modify the metadata object, we dont need a new one, @Entity decorator is here for this exact purpose. If it would be adding different kind of metadata, using it would be indeed wrong. Also, what I didnt like was the @Param decorator, that is the most generic name you can pick, used for a very specific thing most people won't need. I don't mind @Routine at all.

I don't understand what you mean by in/out params, so let's first agree on the meaning of that (and the actual implementation), then we can discuss the decorators. Feel free to use those proposed names in the initial PR and we can find better ones (or maybe not) later during reviews.

(and if this is all explained in the links you provided, please allow me a few days, I don't have time to read through them this weekend)

@boenrobot
Copy link
Collaborator Author

boenrobot commented Feb 17, 2024

and if this is all explained in the links you provided, please allow me a few days, I don't have time to read through them this weekend

Sure thing 👍

It is explained in the docs indeed. Here's a more full collection of links for your reading pleasure when you have more time:

https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html
https://dev.mysql.com/doc/refman/8.0/en/call.html
https://dev.mysql.com/doc/refman/8.0/en/information-schema-routines-table.html

https://www.postgresql.org/docs/current/sql-createfunction.html
https://www.postgresql.org/docs/current/sql-call.html
https://www.postgresql.org/docs/current/infoschema-routines.html

https://github.com/WiseLibs/better-sqlite3/blob/master/docs/api.md#functionname-options-function---this
https://www.sqlite.org/pragma.html#pragma_function_list

(SQLite doesn't support stored procedures, and node-sqlite3 doesn't allow user defined SQL functions either, only binary extensions)

I don't understand what you mean by in/out params

TL;DR

IN param - akin to an argument passed by value in most languages (think string/number/bigint/boolean in JS). You get a copy of what is outside.
INOUT param - akin to an argument passed by reference in most languages (think objects in JS). Routine can read what is given, and modify it, and the caller will be able to read the modified value afterwards.
OUT param - kind of like argument passed by reference, except the procedure is not allowed to read what was supplied. It can only replace the value with whatever value it wants (as long as it's of the declared type). In fact, the variable used to hold the value doesn't need to exist prior to the routine call.

@boenrobot boenrobot changed the title Represent stored routes and provide type safe way to execute them Represent stored routines and provide type safe way to execute them Feb 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants