Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: The implementation of the findFirst method in Phalcon\Mvc\Model class seems to have a bug." #16502

Open
whooperlove opened this issue Jan 9, 2024 · 24 comments
Labels
bug A bug report status: unverified Unverified

Comments

@whooperlove
Copy link

Questions? Forum: https://phalcon.io/forum or Discord: https://phalcon.io/discord

Describe the bug
I encountered an error when passing a search query with the string 'notif_code' as an argument to the findFirst method of the Phalcon\Mvc\Model object. The error message is as follows:

"Column 'if_code' doesn't belong to any of the selected models (1), when preparing: SELECT [Notif].* FROM [Notif] WHERE notif_code = 5 AND account_id = :account_id: AND other_account_id = :other_account_id: LIMIT :APL0:"

Since the actual column 'if_code' does not exist in the table, it seems that within the implementation of the findFirst method, there might be a bug where the string 'notif_code' is modified, possibly removing the 'not' keyword for processing in SQL. As a result, 'notif_code' may have been transformed into 'if_code,' leading to the passing of a nonexistent column name 'if_code' in the generated low-level SQL statement.

To Reproduce
If you add a column containing the word 'not' to any table and specify it as a search condition in the findFirst method, you will notice an attempt to access the column name with the 'not' removed. For example, the column name 'notification' would be recognized as 'ification' within the findFirst method.

Provide minimal script to reproduce the issue

class Notif extends \Phalcon\Mvc\Model
{
    /**
     *
     * @var integer
     */
    public $id;

    /**
     *
     * @var integer
     */
    public $notif_code;
}

$notif =  Notif::findFirst(['notif_code = 0']); // this produces an error.
if  ($notif) doSomething();
else doSomethingElse();

//$notif2 = Notif::findFirstByNotifCode(0); // Magic method version doesn't produces an error.

Expected behavior
The error, 'Column 'if_code' doesn't belong to any of the selected models,' should not occur.

$notif = Notif::findFirst(['notif_code = 0']);
if  ($notif) doSomething();
else doSomethingElse();

Details

  • Phalcon version: 5.5.0
  • PHP Version: 8.0.30
  • Operating System: Rocky Linux release 9.3 (Blue Onyx)
  • Installation type: Compiling from source
  • Zephir version (if any): 0.18.0
  • Server: Nginx 1.20.1
  • Other related info (Database, table schema): postgresql
    CREATE TABLE notif (
    id bigint DEFAULT nextval('notif_id_seq'::regclass) NOT NULL ,
    notif_code integer NOT NULL ,
    CONSTRAINT pk_notif PRIMARY KEY ( id )
    );
@whooperlove whooperlove added bug A bug report status: unverified Unverified labels Jan 9, 2024
@niden
Copy link
Sponsor Member

niden commented Jan 9, 2024

not is a reserved word for Zephir and PHQL. As such, the parser does not know if this is part of your filed name or an actual word used in the framework.

Change your findFirst like this:

$notif =  Notif::findFirst(['[notif_code] = 0']);

Enclosing the field in square brackets will tell the parser that this is the name of a field and not part of the statement.

@whooperlove
Copy link
Author

Thanks for your feedback. Normally, keywords are separated by spaces. For instance, in cases like 'notice' where 'not' and 'ice' are together without spaces, there shouldn't be an issue. Also, in Phalcon 4, there wasn't such a problem as it is now.

@s-ohnishi
Copy link

Even though it is a reserved word, it is very troublesome if it is determined by partial match.
I also think it's an incorrect implementation.

@niden
Copy link
Sponsor Member

niden commented Jan 25, 2024

I don't disagree. PHQL was written 10 years ago and at the time the implementation had difficulties distinguishing between keywords and names of variables/fields. The solution was the square brackets.

These limitations are the reason we (slowly) started introducing the Data Mapper pattern for the database. Still a lot of work left though.

@s-ohnishi
Copy link

I feel very sorry for everyone working on this, but I would like you to agree that it is incomplete, no matter how many years ago it is.
And, "now" we need to make it known that all column names need to be escaped with [ and ], and I think the reason why this issue was raised is because there wasn't enough awareness.

@whooperlove
Copy link
Author

You mentioned that the problem has been present since the implementation of PHQL 10 years ago, but I find two peculiar points. First, when accessing notif_code in the format of a magic method, such as 'Notif::findFirstByNotifCode(0),' no issues seem to arise. Perhaps it internally handles the square brackets [] differently? Second, as mentioned in the previous comment, there were no problems with the same code in Phalcon version 4. My project was created before this issue emerged, and it only occurred after applying Phalcon version 5, making me aware of the need for reserved word escaping.

@niden
Copy link
Sponsor Member

niden commented Jan 25, 2024

The reason these bugs arise or if you like difference in behavior, is the changes in PHP and their internal engine. We all had to adjust to new functionality and adjusted our methods. This one appeared with PHP 8 it seems.

Note that you don't need to enclose all your field names with square brackets. The ones prefixed with not are the most common ones (note, notes, notifications etc.)

I don't know C so I cannot dive in and diagnose and potentially fix this. If anyone does and can help by all means, we welcome such help. In the meantime the brackets are the workaround or "feature" :)

@s-ohnishi
Copy link

Note that you don't need to enclose all your field names with square brackets.

No, as a workaround today, isn't it sure to take measures without omission?
I haven't tried it, but wouldn't it target all words that follow normal SQL reserved words such as or, and, select, delete?

Also, I don't think there will be a problem if the column name that seems to be a problem this time is passed directly by PDO, so maybe this is also a problem with Zephir?

@niden
Copy link
Sponsor Member

niden commented Jan 25, 2024

The problem is in the generated code that builds the extension. Whether that is Zephir or the pure C code that handles the PHQL queries I do not know.

I have not seen any submissions for fields starting with delete say deletedTime, only the not ones.

@s-ohnishi
Copy link

I took a look at scanner.c, but to be honest, I don't really understand it.
Why is the processing different for "AS" and "NOT", and why is there a difference between "X" and "x" even though I thought it was case insensitive...

@whooperlove
Copy link
Author

@s-ohnishi I don't know about phalcon internals. but I think you don't have to read scanner.c file. It seems scanner.c file was auto-generated from scanner.re file in the same directory. scanner.re file is readable code.

@whooperlove
Copy link
Author

@s-ohnishi I just found that scanner.c was auto-generated from scanner.re file by using re2c program. The following text is about re2c.

=====
RE2C is a lexer generator designed to process regular expressions and generate C code for recognizing and handling tokens. However, it does not provide an explicit feature for optimizing regular expressions. Instead, the main goal of RE2C is to generate simple and efficient C code.

Key features of RE2C include:

Performance: The code generated by RE2C is highly efficient and fast, providing high-performance recognition of regular expressions.

Flexibility: RE2C supports various regular expression syntaxes and can be used in multiple programming languages.

Ease of use: Defining lexer code using a simple regular expression syntax and a few special commands is straightforward.

While RE2C does not offer explicit options for optimizing regular expressions, it is designed to automatically generate optimized code. If specific optimization for a particular regular expression is needed, developers typically modify or fine-tune the RE2C code directly.

@s-ohnishi
Copy link

@whooperlove Thank you very much.
I understand that re2c converts regular expression rules into concrete code.

I think that case sensitivity is specified by the option --case-insensitive, but why is it that some of the generated code is not so?
I'm not sure if scanner.re is wrong or if re2c is having the problem.

And my doubts increased.
Why is there a difference between BETWEEN and NOT BETWEEN?
Even though PHQL_T_NOTIN is defined for NOT IN, why is it not written in scanner.re?
I think NOT EXISTS is used in a similar way, but why is it completely ignored?
If NOT IN and NOT EXISTS do not need to be distinguished as tokens, shouldn't NOT BETWEEN be the same?

@whooperlove
Copy link
Author

@s-ohnishi for your information, it seems that scanner.re and scanner.c are unrelated to the problem. (I modified scanner.re by removing the 'NOT' block and regenerated the scanner.c file by running 're2c -o scanner.c scanner.re'. After building and installing the new phalcon.so, when I run the test code that previously resulted in the tif_code error, the same error occurs with the new phalcon.so.)

@s-ohnishi
Copy link

@whooperlove That's an interesting result. It's a shame that I can't experiment with it at hand.
For example, even if I delete everything such as NOT BETWEEN and NOT, there will be no problem as long as there is no NOT in the SQL, so I would like to try various things.

the same error occurs with the new phalcon.so.

Maybe I'm thinking the wrong way.
Even though the token separation is working fine, something may be happening elsewhere.
If that's the case, the mysterious parts such as scanner.re and scanner.h are probably irrelevant.

regenerated the scanner.c file by running 're2c -o scanner.c scanner.re'

I'm also interested in how much the generated scanner.c has changed.

@s-ohnishi
Copy link

@whooperlove I tried it too.

I created a debian:bookworm container with Docker and compiled it from the v5.6.0 source.

I tested simple-mvc with a little modification.

When I add a notice column to the Products table and execute $prod = Products::find('notice=1');,

Column 'ice' doesn't belong to any of the selected models (1), when preparing: SELECT [Products].* FROM [Products] WHERE notice=1

was displayed.

After that, I modified ext/phalcon/mvc/model/query/scanner.re as follows.

Removed NOT BETWEEN clause.
Although this is not the case, the regular expression for hexadecimal integers is strange, so I changed it to the following.

HINTEGER = [0][x][0-9A-Fa-f]+;
HINTEGER {
     token->value = estrndup(q, YYCURSOR - q);
     token->len = YYCURSOR - q;
     token->opcode = PHQL_T_HINTEGER;
     q = YYCURSOR;
     return 0;
}

(In [x0-9A-Fa-f]+, 0xxxx is also considered a hexadecimal integer, right? However, I am not sure if this description is appropriate.)
After that, I generated scanner.c with re2c, compiled it again, and tried it, and it ran without any problems.

I don't know why there was no change in your test, but I think it's because the NOT BETWEEN clause is getting in the way, or because re2c is up to date (the one I used was re2c3.2).

@s-ohnishi
Copy link

@niden Could you please try removing the NOT BETWEEN clause from scanner.re (by changing the HINTEGER regular expression if possible)?

@whooperlove
Copy link
Author

I think that, due to our lack of knowledge in solving this issue, it might be better to include a warning message in the official documentation, specifying certain reserved keywords including 'not'. This warning would advise using square brackets for escaping when accessing columns with names containing the reserved keywords. for example, at https://docs.phalcon.io/latest/db-models/#findfirst, "Please escape column name with square brackets if it contains reserved keywords like 'not', ..."

@s-ohnishi
Copy link

@niden @whooperlove
I agree that users use [ ] as a precaution, but I disagree that the system cannot function without [ ].
This is clearly different from the original behavior of SQL.
And there is no reason why NOT BETWEEN should be treated specially as a token, so scanner.re should be modified.
(HINTEGER should also be fixed)

@aircodepl
Copy link

aircodepl commented Mar 6, 2024

Just run into this issue and wish to bump its importance. +1 to fixing it asap.

Migrating from Phalcon 3 to Phalcon 5 with queries like this:

'conditions' => 'publisherId = :publisherId: AND noteId = :noteId:',

And hilarious errors:

Column 'eId' doesn't belong to any of the selected models (1)

What? At least I laughed a bit when found out what's going on. But newcomers will be lost.

@s-ohnishi
Copy link

At least I laughed a bit when found out what's going on.

I really think so.
In my test, I was able to solve the problem by modifying scanner.re, so I would like someone to verify this.

@jturbide
Copy link
Sponsor Contributor

jturbide commented Mar 8, 2024

+1 Anyone who has used Phalcon for an extended period has likely encountered this issue at least once. It might be beneficial to include an example of this specific problem in the documentation or resolve the issue and incorporate this scenario into the unit tests.

@s-ohnishi
Copy link

It looks like the developers are focusing more on v6 than v5.
Since I can't prepare a v5 development environment, I can't issue a specific PR for files under ext/ or .zep files.
So I know I have to wait until they have enough time.

@s-ohnishi
Copy link

I've presented a proposal that I think will solve the problem, and I hope the developers will start working on it soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A bug report status: unverified Unverified
Projects
None yet
Development

No branches or pull requests

5 participants