Problem with Series.str.match #2074

jseabold · 2012-10-15T20:21:49Z

Not sure yet what's going on here. It doesn't appear to be a Unicode issue.

import re
import pandas

df = pandas.DataFrame([[u'A Confe\xdfion of the most auncient and true christen catholike olde belefe accordyng to the ordre of the .XII. Articles of our co[m]mon crede, set furthe in Englishe to the glory of almightye God, and to the confirmacion of Christes people in Christes catholike olde faith.']], columns=["title"])

df.title.str.match(".*[A|a]lmight")
# returns:
# 0    ()
# Name: title

re.match(".*[A|a]lmight", df.title.ix[0])
# returns a match object, which is what I expected from str.match:
# <_sre.SRE_Match at 0x4fa8100>

gerigk · 2012-10-15T20:28:15Z

match only searches at the beginning of the string. I guess you are expecting the behavior of re.search?

From the docs for re.match(pattern, string, flags=0) (http://docs.python.org/library/re.html#re.match):

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject (http://docs.python.org/library/re.html#re.MatchObject) instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.
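
For reference, here is a minimal sketch of the difference using only the standard library. The shortened text and the variable names are made up for illustration, and the pattern uses [Aa] rather than [A|a], since a | inside a character class is a literal character.

```python
import re

text = "to the glory of almightye God"

# re.match only tries the pattern at position 0, so a bare pattern fails here.
at_start = re.match(r"[Aa]lmight", text)

# re.search scans forward and finds the first occurrence anywhere in the string.
anywhere = re.search(r"[Aa]lmight", text)

# A leading ".*" lets re.match consume the prefix, as in the original report.
with_prefix = re.match(r".*[Aa]lmight", text)

print(at_start)              # None
print(anywhere.group(0))     # almight
print(with_prefix.group(0))  # to the glory of almight
```

So the leading .* in the report does make re.match succeed, and the anchoring alone would not explain the empty result.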

wesm · 2012-10-15T20:32:27Z

The problem is that you have no groups in the regular expression:

In [16]: df.title.str.match("(.*[A|a]lmight)").ix[0]
Out[16]: (u'A Confe\xdfion of the most auncient and true christen catholike olde belefe accordyng to the ordre of the .XII. Articles of our co[m]mon crede, set furthe in Englishe to the glory of almight',)

A better behavior when match.groups() is empty would be to fall back to match.group(0), if it exists.
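
A rough sketch of that fallback in plain Python, assuming nothing about how pandas implements it. The helper name groups_or_whole_match and the shortened text are hypothetical.

```python
import re


def groups_or_whole_match(pattern, string):
    """Hypothetical helper, not pandas code: captured groups, else the whole match."""
    m = re.match(pattern, string)
    if m is None:
        return None
    # groups() is an empty tuple when the pattern has no capturing groups,
    # so fall back to group(0), the full matched text.
    return m.groups() or (m.group(0),)


text = "to the glory of almightye God"

# Without capturing groups, the match succeeds but groups() is empty.
no_groups = re.match(r".*[Aa]lmight", text).groups()
print(no_groups)  # ()

print(groups_or_whole_match(r".*[Aa]lmight", text))
# ('to the glory of almight',)
print(groups_or_whole_match(r"(.*)([Aa]lmight)", text))
# ('to the glory of ', 'almight')
```

The empty tuple in the first print is the same () that shows up in the report.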

jseabold · 2012-10-15T20:33:28Z

I showed the expected behavior of re.match in the code example. That's why I stuck the .* at the beginning. What am I missing?

wesm · 2012-10-15T20:35:04Z

Well, if you look at the implementation of str.match, you'll see that it unpacks the matched groups from the SRE_Match object. That is the issue.

jseabold · 2012-10-15T20:36:41Z

Ah, I expected it to return something I can evaluate to True/False to make an index. Will adjust expectations.

wesm · 2012-10-15T20:38:35Z

Very open to API improvements here... I put all those functions together over the course of about a day or so.

jseabold · 2012-10-15T20:42:11Z

It's (sort of) clear in the documentation that it finds groups, but right now I find myself just wanting to know whether something matches rather than pulling out the groups. I can see use cases for both, though.
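
To make the "does it match" use case concrete, here is a small standard-library sketch that builds a True/False mask. The sample titles and variable names are invented for illustration. In pandas itself, Series.str.contains answers a similar yes/no question and returns booleans.

```python
import re

# Made-up sample data standing in for the title column.
titles = [
    "to the glory of almightye God",
    "Articles of our common crede",
    "Almighty and everlasting",
]

pattern = re.compile(r".*[Aa]lmight")

# One boolean per title: True when the pattern matches from the start.
mask = [pattern.match(t) is not None for t in titles]

# Use the mask to keep only the matching titles.
selected = [t for t, keep in zip(titles, mask) if keep]

print(mask)      # [True, False, True]
print(selected)  # ['to the glory of almightye God', 'Almighty and everlasting']
```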

cpcloud · 2013-07-25T22:31:43Z

You could have a matches method.

jreback · 2013-09-22T15:40:42Z

@jseabold this new method might work for you: #4685

hayd closed this as completed May 29, 2014
