-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements for pyre2 #11
Conversation
Doesn't this break compatibility with programs that access |
With the current state of the PR, these are not exposed to the python module mostly because I wasn't sure what's the desired API/ABI that the re2 module should expose. However, that info is already saved on the diff --git a/_re2.cc b/_re2.cc
index 8e3d17d..673a3ba 100644
--- a/_re2.cc
+++ b/_re2.cc
@@ -39,6 +39,8 @@ using std::nothrow;
using re2::RE2;
using re2::StringPiece;
+#include <structmember.h>
+
typedef struct _RegexpObject2 {
PyObject_HEAD
@@ -59,6 +61,41 @@ typedef struct _MatchObject2 {
StringPiece* groups;
} MatchObject2;
+// getters for MatchObject2
+static PyObject* match_pos_get(MatchObject2* self)
+{
+ return PyLong_FromSsize_t(self->pos);
+}
+
+static PyObject* match_endpos_get(MatchObject2* self)
+{
+ return PyLong_FromSsize_t(self->endpos);
+}
+
+static PyObject* match_re_get(MatchObject2* self)
+{
+ if (self->re != NULL)
+ Py_INCREF(self->string);
+ return self->re;
+ Py_RETURN_NONE;
+}
+
+static PyObject* match_string_get(MatchObject2* self)
+{
+ if (self->string != NULL)
+ Py_INCREF(self->string);
+ return self->string;
+ Py_RETURN_NONE;
+}
+
+static PyGetSetDef match_getset[] = {
+ {"pos", (getter)match_pos_get, (setter)NULL},
+ {"endpos", (getter)match_endpos_get, (setter)NULL},
+ {"re", (getter)match_re_get, (setter)NULL},
+ {"string", (getter)match_string_get, (setter)NULL},
+ {NULL}
+};
+
typedef struct _RegexpSetObject2 {
PyObject_HEAD
// True iff re2_set_obj has been compiled.
@@ -245,7 +282,7 @@ static PyTypeObject Match_Type2 = {
0, /*tp_iternext*/
match_methods, /*tp_methods*/
0, /*tp_members*/
- 0, /*tp_getset*/
+ match_getset, /*tp_getset*/
};
static PyTypeObject RegexpSet_Type2 = {
@@ -475,13 +512,20 @@ create_match(PyObject* re, PyObject* string,
StringPiece* groups)
{
MatchObject2* match = PyObject_New(MatchObject2, &Match_Type2);
+
if (match == NULL) {
delete[] groups;
return NULL;
}
+ Py_INCREF(groups);
match->groups = groups;
+ Py_INCREF(re);
match->re = re;
+ Py_INCREF(string);
match->string = string;
+ match->pos = pos;
+ match->endpos = endpos;
+
return (PyObject*)match;
}
That gives us: In [1]: import re2
In [2]: r = re2.compile('\w+')
In [3]: m = r.match('foo bar')
In [4]: m.re
Out[4]: <_re2.RE2_Regexp at 0x7f096eb58618>
In [5]: m.string
Out[5]: 'foo bar'
In [6]: m.pos, m.endpos
Out[6]: (0L, 7L) |
Cool. Can you update the PR with the full change? Unless @sunshowers has objections, I'm not going to worry about the performance of lazy access since these seem like they'd be rarely accessed. |
Sounds good. In that case, I'll merge the proposed patch to the PR, add some unit tests and do some more thorough testing to make sure I haven't messed up with any object referencing and then push the update. |
Any thoughts on this @sunshowers, @dreiss? |
|
||
typedef struct _RegexpObject2 { | ||
PyObject_HEAD | ||
// __dict__. Simpler than implementing getattr and possibly faster. | ||
PyObject* attr_dict; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this part of the change necessary or beneficial? Regexp objects should be constructed very rarely in performance-sensitive programs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I got that. The dict stores groups
/groupindex
and pattern
fields. I didn't change the way re2 objects are constructed/destroyed.
However, based on your feedback I'll re-implement these fields for RegexpObject2
(probably using getter/setter to avoid the performance penalties)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see what you mean. In general execution, you create the regex once and then just match against it! That's true but I think we can get some performance improvements out of this so, why not :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the default choice should be to leave working code as-is. Let's try to keep this PR scoped to changes that have measurable benefits. If you think changing RegexpObject2 is useful, let's handle that separately.
_re2.cc
Outdated
PyObject* re; | ||
PyObject* string; | ||
Py_ssize_t pos, endpos; /* current target slice */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put these on separate lines.
_re2.cc
Outdated
|
||
static PyObject* match_re_get(MatchObject2* self) | ||
{ | ||
if (self->re != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would this be null?
_re2.cc
Outdated
|
||
static PyObject* match_string_get(MatchObject2* self) | ||
{ | ||
if (self->string != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would this be null?
Py_RETURN_NONE; | ||
} | ||
|
||
static PyGetSetDef match_getset[] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting looks weird.
_re2.cc
Outdated
@@ -65,6 +61,43 @@ typedef struct _MatchObject2 { | |||
StringPiece* groups; | |||
} MatchObject2; | |||
|
|||
// getters for MatchObject2 | |||
static PyObject* match_pos_get(MatchObject2* self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the proper getter signature here and just cast to MatchObject2 internally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silly question but when you say "proper getter signature", what do you mean?
I had a look at other python extensions and the python documentation before implementing the getters and they both suggest using this kind of signatures. Eg: https://docs.python.org/2.7/extending/newtypes.html
_re2.cc
Outdated
@@ -216,14 +249,6 @@ static PyTypeObject Regexp_Type2 = { | |||
regexp_methods, /*tp_methods*/ | |||
0, /*tp_members*/ | |||
0, /*tp_getset*/ | |||
0, /*tp_base*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default all omitted entries are 0 and since we're no longer setting the attr_dict I thought i'd trim down the list to be consistent with the other typedef in the source. I can zero the fields out and leave it as is though if that's preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd leave it. Explicit is better than implicit.
_re2.cc
Outdated
@@ -259,15 +284,7 @@ static PyTypeObject Match_Type2 = { | |||
0, /*tp_iternext*/ | |||
match_methods, /*tp_methods*/ | |||
0, /*tp_members*/ | |||
0, /*tp_getset*/ | |||
0, /*tp_base*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, we no longer have the attr_dict so I trimmed the PyType object after the last member we manually set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you leave the zeroes? I think explicit is better than implicit in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
// to groupindex. | ||
regexp->attr_dict = Py_BuildValue("{sisNsO}", | ||
"groups", regexp->re2_obj->NumberOfCapturingGroups(), | ||
"groupindex", groupindex, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that groupindex no longer works? In case it wasn't clear before, I don't think this change should be breaking any functionality that currently works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will address that.
I will do some checks to see if the attr_dict is performant enough otherwise I'll just extend the RegexObject2
definition similiar to MatrchObject2
Updated the PR and unit tests to cover the re attributes as well. |
@dreiss have you had any time to review my latest changes? |
I have nothing to do with this project, but I tested your PR: pyre2 1.0.6:
with this PR applied:
I tested with a pyenv-built Python 3.7.0 on an amd64 Debian 9.5 machine.
I see the same with a pyenv-built Python 3.7.0 on an amd64 Ubuntu 16.04.5 machine.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you able to repro that crash?
_re2.cc
Outdated
typedef struct _RegexpSetObject2 { | ||
PyObject_HEAD | ||
// True iff re2_set_obj has been compiled. | ||
bool compiled; | ||
RE2::Set* re2_set_obj; | ||
} RegexpSetObject2; | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deleted the extra lines in order to be consistent with spacing but I'll revert all style changes (including the tab -> space change)
_re2.cc
Outdated
@@ -96,7 +119,6 @@ static PyObject* regexp_set_add(RegexpSetObject2* self, PyObject* pattern); | |||
static PyObject* regexp_set_compile(RegexpSetObject2* self); | |||
static PyObject* regexp_set_match(RegexpSetObject2* self, PyObject* text); | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
_re2.cc
Outdated
@@ -170,9 +192,9 @@ static int | |||
_no_setattr(PyObject* obj, PyObject* name, PyObject* v) { | |||
(void)name; | |||
(void)v; | |||
PyErr_Format(PyExc_AttributeError, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as bove
_re2.cc
Outdated
@@ -259,15 +284,7 @@ static PyTypeObject Match_Type2 = { | |||
0, /*tp_iternext*/ | |||
match_methods, /*tp_methods*/ | |||
0, /*tp_members*/ | |||
0, /*tp_getset*/ | |||
0, /*tp_base*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you leave the zeroes? I think explicit is better than implicit in this case.
|
||
static PyObject* regexp_pattern_get(RegexpObject2* self) | ||
{ | ||
return self->pattern; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't this incref?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, you are correct this should increase the reference count. Not sure why/how I missed it but I'll fix it!
I tried various python versions (3.5, 3.6 and 3.7) and the segfault only happens in 3.7. I'll take a look into this and look at what might have changed in python3.7 to cause this behavior. |
I think We can easily work around that by nulling the field after On the bright side, while testing on python-3.7, I noticed the following: Python 3.7.0 (default, Oct 12 2018, 06:58:16)
[...]
In [1]: import re
In [2]: import re2
In [3]: r2 = re2.compile('\w+')
In [4]: r = re.compile('\w+')
In [5]: %timeit r.match('foo bar')
319 ns ± 1.59 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [6]: %timeit r2.match('foo bar')
199 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each) |
I haven't had the chance to look into the new way memory allocation/reference handling works in Python 3.7 yet (life and work got in the way) so I've updated the PR with the fix and the suggested changes (removed style fixes, readded the empty struct members for PyTypeObjects etc). I'll try to look into the python object initialization at another time but I don't think that should hold up this set of changes. @ivan can you check to see if you get any issues with this latest version? |
3555007
to
1f42a38
Compare
I also checked with python core, their suggestion was: if we need the field to be NULL, mark it as NULL explicitly which is what I did. Otherwise, we could use @dreiss do you mind taking another look ? |
_re2.cc
Outdated
// consistent across Py3 and Py2. | ||
PyObject* index = PyLong_FromLong(it->second); | ||
if (index == NULL) { | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to immediately return NULL (after decref) here instead of storing an incomplete groupindex in self.
_re2.cc
Outdated
break; | ||
} | ||
} | ||
Py_INCREF(groupindex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Shouldn't we transfer the reference that we got from PyDict_New?
As part of a quick test I did to measure the perfomance of pyre2, I noticed thathat we spent a lot of time allocating objects and that we could improve thethe performance simply by altering some of the C code to limit unnecessary object allocations. The main culprit was Py_BuildValue but it seems like the changes had a positive effect overall in terms of performance. As part of that I did: * Deleted code that populated attributes of the __dict__ struct * Deleted code that initialized the struct itself
Addressed latest comments. |
Changelog: * Performance improvements facebook#11 * Bugfixes facebook#12
As part of a quick test I did to measure the performance of pyre2, I noticed
hat we spent a lot of time allocating objects and that we could improve
the performance simply by altering some of the C code to limit unnecessary
object allocations. The main culprit was
Py_BuildValue
but it seems like thechanges had a positive effect overall in terms of performance.
As part of that I did:
This is more of a request for comments about these changes, just to get some initial feedback based on the changes because I might be missing something obvious but as far as I can see nothing has broken and we have performance improvements ranging from 5% to 50% depending on the use case.
PS: Also here's pprof execution graphs for
timeit()
test before and after the changes: