Skip to content

RE_INSERT_VALUES matches values starts with % and ends with s #571

@griiid

Description

@griiid

Describe the bug

The regex pattern RE_INSERT_VALUES defined and used in cursors.py is wrong.

RE_INSERT_VALUES = re.compile(
    "".join(
        [
            r"\s*((?:INSERT|REPLACE)\b.+\bVALUES?\s*)",
            r"(\(\s*(?:%s|%\(.+\)s)\s*(?:,\s*(?:%s|%\(.+\)s)\s*)*\))",
            r"(\s*(?:ON DUPLICATE.*)?);?\s*\Z",
        ]
    ),
    re.IGNORECASE | re.DOTALL,
)

%\(.+\)s in the value part: r"(\(\s*(?:%s|%\(.+\)s)\s*(?:,\s*(?:%s|%\(.+\)s)\s*)*\))" allows values starts with % and ends with s.

Theses wrong query text (which you don't want?) will be matched:

INSERT INTO xxx(a, b, c) VALUES (%(xxx)s, NOW(), %(ooo)s)
INSERT INTO xxx(a, b, c) VALUES (%(xxx)s, abc124*(!@*$!@)$&)&$)s)

I think what you want is all %s in values, or all %(xxx)s-liked in values.
However, I don't know why you only want to allow these patterns, since this string-formatting will also works: "NOW(), %(a)s, %(b)s, NOW + INTERVAL 1 MINUTE" % {"a": 100, "b": 3}

Why cant the values part in RE_INSERT_VALUES become: r"(\(.+\))" ?
Developers using function executemany should check whether the args and the query text are match or not.
With r"(\(.+\))" values part, users can use SQL functions, numbers, strings, ... etc in query text, which helps developers a lot.


One more thing is: users don't know this regex rule until they meet some issue, you should explain how to use executemany in the document, something like:

`executemany` only support `INSERT INTO` or `REPLACE` statments
The format of VALUES in query text should be all %s or all named string-formatting, for example: `(%s, %s, %s)` or `(%(a)s, %(b)s, %(c)s)`, .
If your query text is not matches the rules above, it will use `execute` function with looping args instead.

To Reproduce

Code

con = MySQLdb.connect(...)
cur = con.cursor()
query_text_1 = """INSERT INTO xxx(a, b, c) VALUES (%(xxx)s, NOW(), %(b)s)"""
query_text_2 = """INSERT INTO xxx(a, b, c) VALUES (NOW(), %(xxx)s, %(b)s)"""
args = {"a": 1, "b": 2}
cur.executemany(con, query_text_1, args) # works
cur.executemany(con, query_text_2, args) # NOT works

Environment

MySQL Server

MySQL 5.7

MySQL Client

  • OS (e.g. Windows 10, Ubuntu 20.04): macOS 12.5.1

  • Python (e.g. Homebrew Python 3.7.5): python install with pyenv: 3.7, 3.8, 3.10

  • Connector/C (e.g. Homebrew mysql-client 8.0.18): mysql-client

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions