Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dump of a single table in sqlite3 library #91602

Open
WolfgangFahl opened this issue Apr 16, 2022 · 19 comments
Open

dump of a single table in sqlite3 library #91602

WolfgangFahl opened this issue Apr 16, 2022 · 19 comments
Labels
stdlib Python modules in the Lib dir topic-sqlite3 type-feature A feature request or enhancement

Comments

@WolfgangFahl
Copy link

WolfgangFahl commented Apr 16, 2022

Feature or enhancement

I would like to dump only one table but by the looks of it, there is no parameter for this.
see https://stackoverflow.com/questions/6677540/how-do-i-dump-a-single-sqlite3-table-in-python

Pitch
Copying tables is much easier when a dump per table is available.

Previous discussion
https://stackoverflow.com/a/6677833/1497139 already suggested a code change in stackoverflow in 2011. I think it just never made it as a feature request here.

Linked PRs

@WolfgangFahl WolfgangFahl added the type-feature A feature request or enhancement label Apr 16, 2022
@erlend-aasland erlend-aasland added the stdlib Python modules in the Lib dir label Apr 16, 2022
@AlexWaygood AlexWaygood added the 3.11 only security fixes label Apr 16, 2022
@akulakov
Copy link
Contributor

Note there are alternative ways to dump table data provided here: https://stackoverflow.com/questions/75675/how-do-i-dump-the-data-of-some-sqlite3-tables

@erlend-aasland erlend-aasland added topic-sqlite3 and removed 3.11 only security fixes labels May 16, 2022
@kalyanr23
Copy link

If no one is already working on this, I would like to pick this up

@erlend-aasland
Copy link
Contributor

Let's find out if we should add it or not first 😉 I'm not totally convinced it is really needed.

@rawwar

This comment was marked as off-topic.

@AA-Turner
Copy link
Member

@rawwar have you seen https://github.com/orgs/python/projects/23? There is one "TODO Bug" without a PR, and a fair few uncategorised issues. @erlend-aasland is the point man for sqlite though -- I just review or triage!

A

@rawwar

This comment was marked as off-topic.

@erlend-aasland
Copy link
Contributor

FTR, the SQLite CLI accepts an "OBJECTS" param:

sqlite> .help .dump
.dump ?OBJECTS?          Render database content as SQL
   Options:
     --data-only            Output only INSERT statements
     --newlines             Allow unescaped newline characters in output
     --nosys                Omit system tables (ex: "sqlite_stat1")
     --preserve-rowids      Include ROWID values in the output
   OBJECTS is a LIKE pattern for tables, indexes, triggers or views to dump
   Additional LIKE patterns can be given in subsequent arguments
sqlite> CREATE TABLE t(t); 
sqlite> CREATE TABLE r(r);
sqlite> .dump r
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE r(r);
COMMIT;

@felixxm
Copy link
Contributor

felixxm commented Jan 22, 2024

iterdump() is documented as a similar to the .dump command in the sqlite3 shell, so it sounds reasonable to add support for object pattern names. We could add a new kwarg object_name_filter for this purpose.

I'm happy to implement this if we all agree it's valuable.

@erlend-aasland
Copy link
Contributor

iterdump() is documented as a similar to the .dump command in the sqlite3 shell, so it sounds reasonable to add support for object pattern names. We could add a new kwarg object_name_filter for this purpose.

Yeah; I revisited this issue a couple of weeks ago, and I agree that the current behaviour breaks with the promised behaviour in the docs. Regarding the feature itself, I think the keyword have a nicer name; object_name_filter sounds too convoluted to me. How about just filter?

I'm happy to implement this if we all agree it's valuable.

Great!

@felixxm
Copy link
Contributor

felixxm commented Jan 22, 2024

Yeah; I revisited this issue a couple of weeks ago, and I agree that the current behaviour breaks with the promised behaviour in the docs. Regarding the feature itself, I think the keyword have a nicer name; object_name_filter sounds too convoluted to me. How about just filter?

I'd try to avoid using builtin names as filter, maybe filter_name 🤔

@erlend-aasland
Copy link
Contributor

I'd try to avoid using builtin names as filter, maybe filter_name 🤔

I don't think we need to avoid builtin names as parameter names; modules such as logging, shutils, tarfile, and zipapp have functions/methods with a param named filter 😉

@erlend-aasland
Copy link
Contributor

Anyway, sorry 'bout the premature bikeshedding, @felixxm; there's no need to carve the name in stone now :) Go ahead with the implementation.

@felixxm
Copy link
Contributor

felixxm commented Jan 23, 2024

Anyway, sorry 'bout the premature bikeshedding, @felixxm; there's no need to carve the name in stone now :) Go ahead with the implementation.

WIP, I need to dust off some long unused C-skills 😄 but I'm really glad that I have an excuse to do it 💪

@felixxm
Copy link
Contributor

felixxm commented Jan 23, 2024

@erlend-aasland Can you help me a bit 🙏 ? I made the following changes to the _sqlite/clinic/connection.c.h which are probably OK

diff --git a/Modules/_sqlite/clinic/connection.c.h b/Modules/_sqlite/clinic/connection.c.h
--- a/Modules/_sqlite/clinic/connection.c.h
+++ b/Modules/_sqlite/clinic/connection.c.h
@@ -1204,21 +1204,79 @@ pysqlite_connection_interrupt(pysqlite_Connection *self, PyObject *Py_UNUSED(ign
 }
 
 PyDoc_STRVAR(pysqlite_connection_iterdump__doc__,
-"iterdump($self, /)\n"
+"iterdump($self, /, *, filter=None)\n"
 "--\n"
 "\n"
-"Returns iterator to the dump of the database in an SQL text format.");
+"Returns iterator to the dump of the database in an SQL text format.\n"
+"\n"
+"  filter\n"
+"    An optional LIKE pattern for database objects to dump");
 
 #define PYSQLITE_CONNECTION_ITERDUMP_METHODDEF    \
-    {"iterdump", (PyCFunction)pysqlite_connection_iterdump, METH_NOARGS, pysqlite_connection_iterdump__doc__},
+    {"iterdump", (PyCFunction)pysqlite_connection_iterdump, METH_FASTCALL|METH_KEYWORDS, pysqlite_connection_iterdump__doc__},
 
 static PyObject *
-pysqlite_connection_iterdump_impl(pysqlite_Connection *self);
+pysqlite_connection_iterdump_impl(pysqlite_Connection *self, const char *filter);
 
 static PyObject *
-pysqlite_connection_iterdump(pysqlite_Connection *self, PyObject *Py_UNUSED(ignored))
+pysqlite_connection_iterdump(pysqlite_Connection *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)
 {
-    return pysqlite_connection_iterdump_impl(self);
+    PyObject *return_value = NULL;
+    #if defined(Py_BUILD_CORE) && !defined(Py_BUILD_CORE_MODULE)
+
+    #define NUM_KEYWORDS 1
+    static struct {
+        PyGC_Head _this_is_not_used;
+        PyObject_VAR_HEAD
+        PyObject *ob_item[NUM_KEYWORDS];
+    } _kwtuple = {
+        .ob_base = PyVarObject_HEAD_INIT(&PyTuple_Type, NUM_KEYWORDS)
+        .ob_item = { &_Py_ID(name), },
+    };
+    #undef NUM_KEYWORDS
+    #define KWTUPLE (&_kwtuple.ob_base.ob_base)
+
+    #else  // !Py_BUILD_CORE
+    #  define KWTUPLE NULL
+    #endif  // !Py_BUILD_CORE
+
+    static const char * const _keywords[] = {"filter", NULL};
+    static _PyArg_Parser _parser = {
+        .keywords = _keywords,
+        .fname = "iterdump",
+        .kwtuple = KWTUPLE,
+    };
+    #undef KWTUPLE
+    PyObject *argsbuf[1];
+    Py_ssize_t noptargs = nargs + (kwnames ? PyTuple_GET_SIZE(kwnames) : 0) - 0;
+    const char *filter = "filter";
+
+    args = _PyArg_UnpackKeywords(args, nargs, NULL, kwnames, &_parser, 0, 0, 0, argsbuf);
+    if (!args) {
+        goto exit;
+    }
+    if (!noptargs) {
+        goto skip_optional_kwonly;
+    }
+    if (!PyUnicode_Check(args[0])) {
+        _PyArg_BadArgument("iterdump", "argument 'filter'", "str", args[0]);
+        goto exit;
+    }
+    Py_ssize_t filter_length;
+    filter = PyUnicode_AsUTF8AndSize(args[0], &filter_length);
+    if (filter == NULL) {
+        goto exit;
+    }
+    if (strlen(filter) != (size_t)filter_length) {
+        PyErr_SetString(PyExc_ValueError, "embedded null character");
+        goto exit;
+    }
+
+skip_optional_kwonly:
+    return_value = pysqlite_connection_iterdump_impl(self, filter);
+
+exit:
+    return return_value;
 }
 
 PyDoc_STRVAR(pysqlite_connection_backup__doc__,

but, I'm puzzled how to reflect them in Modules/_sqlite/connection.c. I've tried to use PyObject_Call() instead of PyObject_CallOneArg:

diff --git a/Modules/_sqlite/connection.c b/Modules/_sqlite/connection.c
index 422d8c2946..2d5a7a10ee 100644
--- a/Modules/_sqlite/connection.c
+++ b/Modules/_sqlite/connection.c
@@ -1998,8 +1998,9 @@ pysqlite_connection_iterdump_impl(pysqlite_Connection *self, const char *filter)
         }
         return NULL;
     }
-
-    PyObject *retval = PyObject_CallFunction(iterdump, (PyObject *)self, filter);
+    PyObject *py_filter = NULL;
+    py_filter = PyUnicode_FromString(filter);
+    PyObject *retval = PyObject_Call(iterdump, (PyObject *)self, py_filter);
     Py_DECREF(iterdump);
     return retval;
 }

but no luck 😞 :

In file included from ./Modules/_sqlite/connection.c:136:
./Modules/_sqlite/clinic/connection.c.h:1216:18: warning: cast between incompatible function types from ‘PyObject * (*)(pysqlite_Connection *, PyObject * const*, Py_ssize_t,  PyObject *)’ {aka ‘struct _object * (*)(pysqlite_Connection *, struct _object * const*, long int,  struct _object *)’} to ‘PyObject * (*)(PyObject *, PyObject *)’ {aka ‘struct _object * (*)(struct _object *, struct _object *)’} [-Wcast-function-type]
 1216 |     {"iterdump", (PyCFunction)pysqlite_connection_iterdump, METH_FASTCALL|METH_KEYWORDS, pysqlite_connection_iterdump__doc__},
      |                  ^
./Modules/_sqlite/connection.c:2580:5: note: in expansion of macro ‘PYSQLITE_CONNECTION_ITERDUMP_METHODDEF’
 2580 |     PYSQLITE_CONNECTION_ITERDUMP_METHODDEF
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@erlend-aasland
Copy link
Contributor

@felixxm, can you create a draft PR? It's easier to make suggestions and comment on a PR :)

@felixxm
Copy link
Contributor

felixxm commented Jan 23, 2024

Done, #114501.

@erlend-aasland
Copy link
Contributor

PR #114501 implements this feature similar to how the SQLite CLI .dump command works. We do have time to adjust this feature before the feature freeze in May, but still I'd like to give a little thought to the API we're now introducing, before landing the PR.

The SQLite CLI .dump command is implemented in shell.c.in1 (a template C file in the SQLite repo). Here are some excerpts from the code:

As you can see from the links, there are some differences in that multiple filters are allowed in the SQLite CLI, and the SQL query itself is slightly different. Perhaps that does not matter :)

Should we expand the feature to allow multiple LIKE patterns? Should we use a similar SQL query as the SQLite CLI? I suggest we try out the feature now in the early alphas, and then we can amend or enhance this functionality in due time before May.

Footnotes

  1. If you're using the amalgamated sources, you'll see it simply as shell.c.

erlend-aasland added a commit that referenced this issue Feb 6, 2024
)

Add optional 'filter' parameter to iterdump() that allows a "LIKE"
pattern for filtering database objects to dump.

Co-authored-by: Erlend E. Aasland <erlend@python.org>
@erlend-aasland
Copy link
Contributor

I merged #114501. There is time to tweak the API before the feature freeze, if needed.

fsc-eriker pushed a commit to fsc-eriker/cpython that referenced this issue Feb 14, 2024
…python#114501)

Add optional 'filter' parameter to iterdump() that allows a "LIKE"
pattern for filtering database objects to dump.

Co-authored-by: Erlend E. Aasland <erlend@python.org>
@felixxm
Copy link
Contributor

felixxm commented Feb 21, 2024

I think we can move this in the sqlite3 project board to the Done or at least In Progress card.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stdlib Python modules in the Lib dir topic-sqlite3 type-feature A feature request or enhancement
Projects
Status: In Progress
Development

No branches or pull requests

8 participants