fixes for StringDType #22

ngoldbaum · 2023-01-11T17:40:07Z

This brings over some functionality from asciidtype. I also include some asciidtype changes here to keep stringdtype and asciidtype more uniform.

- Makes StringScalar a str subclass. This makes it possible to use the functions in np.char to manipulate StringDType data.
  - Adds partition and rpartition wrappers to make sure the results of those functions have uniform types.
- Adds a missing incref to common_instance. Also simplifies it a bit, since StringDType isn't parametric.
- StringDType can now be created with an optional size argument, which is ignored. This makes the API more uniform with numpy's other string dtypes. In particular, this way of creating a string dtype is now used in np.char, so this also allows StringDType to work with functions in np.char. Currently we just ignore the size, but in principle we could use it as a size hint? I'm not sure if that's actually useful.
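
To illustrate the partition/rpartition point, here is a minimal pure-Python sketch. `UniformStr` is a hypothetical stand-in for a str-subclass scalar, not the real StringScalar, and the wrappers in this PR may be implemented differently. The idea is only that the inherited methods are re-wrapped so every element of the returned tuple has the subclass type, whether or not the separator is found.

```python
class UniformStr(str):
    """Hypothetical stand-in for a str-subclass scalar (assumption, not StringScalar)."""

    def partition(self, sep):
        # Call the base-class method, then re-wrap each piece so all three
        # elements of the result share one type.
        parts = str.partition(self, sep)
        return tuple(UniformStr(p) for p in parts)

    def rpartition(self, sep):
        # Same treatment for the right-hand variant.
        parts = str.rpartition(self, sep)
        return tuple(UniformStr(p) for p in parts)


found = UniformStr("key=value").partition("=")
missing = UniformStr("no separator").partition("=")
rmissing = UniformStr("no separator").rpartition("=")
```

Because the subclass still behaves like str, the results compare equal to ordinary string tuples; only the element types are normalized.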

seberg · 2023-01-11T19:44:40Z

Honestly, now that I see arbitrary-sized strings, passing a length seems weird. For most functions, the long-term solution would be a ufunc replacement. For others, I suppose we have no good story; we could hide something away like arr.dtype.array_funcs.char.mod.

Anyway, don't want to derail trials here, just seems like something better is needed if we really want full support for something like np.char.

seberg · 2023-01-11T19:49:30Z

stringdtype/stringdtype/src/dtype.c

+{
+    PyObject *ret_bytes = NULL;
+    PyTypeObject *scalar_type = Py_TYPE(scalar);
+    // FIXME: handle bytes too


Maybe numpy bytes only? Normal bytes shouldn't be necessary, Python 3 doesn't do it. NumPy bytes are just weird because they serve the dual purpose of an ascii/latin1 string.

ngoldbaum · 2023-01-11T19:54:04Z

Agreed that passing a size is weird. I'd like to come back to that; I agree that it might be better to just expose all the functionality in np.char as ufuncs. That just seemed like a bigger project than I wanted to take on right now.

peytondmurray · 2023-01-12T17:42:49Z

stringdtype/stringdtype/src/dtype.c

-                "common_instance called on unequal StringDType instances");
-        return NULL;
-    }
+    Py_INCREF(dtype1);


peytondmurray · 2023-01-12T17:52:52Z

stringdtype/stringdtype/src/dtype.c

+
+    long size = 0;
+
+    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|l:ASCIIDType", kwargs_strs,


Suggested change

-    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|l:ASCIIDType", kwargs_strs,
+    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|l:StringDType", kwargs_strs,

good catch!
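
For context on why the name matters: in a `PyArg_ParseTupleAndKeywords` format string, `|` marks the arguments after it as optional, `l` converts one argument to a C long, and the text after `:` is the function name used in error messages. A rough Python-level analogue of that behavior is sketched below. `SketchDType` is a hypothetical stand-in, not the real StringDType, and the exact error text is an assumption.

```python
class SketchDType:
    """Hypothetical pure-Python stand-in (assumption, not the real StringDType)."""

    def __init__(self, size=0):
        # Analogue of the "|l" format: a single optional integer argument.
        if not isinstance(size, int):
            # Analogue of the ":StringDType" suffix: the name shown in the
            # error message should be this type's own name.
            raise TypeError(
                f"SketchDType() size must be an int, got {type(size).__name__}"
            )
        # The size is accepted for API uniformity and then discarded.

    def __eq__(self, other):
        # Not parametric: every instance is interchangeable.
        return isinstance(other, SketchDType)

    def __hash__(self):
        return hash(SketchDType)


default_dtype = SketchDType()
sized_dtype = SketchDType(10)
```

Since the size is ignored, `default_dtype` and `sized_dtype` compare equal, which matches the description of the dtype as non-parametric.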

peytondmurray

Looks good with a minor change.

seberg reviewed Jan 11, 2023

peytondmurray reviewed Jan 12, 2023

peytondmurray approved these changes Jan 12, 2023

fixes for StringDType

865c973

ngoldbaum force-pushed the string-dtype-fixes branch from ab6ef04 to 865c973 on January 12, 2023 17:55

ngoldbaum merged commit 4648ca5 into numpy:main Jan 12, 2023
