argmax? #2970

Closed
jseabold opened this Issue Mar 5, 2013 · 10 comments

Contributor

jseabold commented Mar 5, 2013

Before I make a PR, is there an existing argmax anywhere? I often find myself wanting an easier way of asking for the argmax in a series, and I don't see a way to do it easily. From experience, this doesn't mean it doesn't exist.

Contributor

jreback commented Mar 5, 2013

Series.idxmax

Contributor

jseabold commented Mar 5, 2013

Thought so. Odd name. Thanks.

jseabold closed this Mar 5, 2013

mtpain commented Mar 5, 2013

An idea for a possibly more memorable name: findmax, a la

array_max_idx = matplotlib.mlab.find(array == max(array))
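
A minimal sketch of the lookup this one-liner is aiming for, written with plain NumPy. It assumes a 1-D numeric array; `np.flatnonzero` is a stand-in chosen for this illustration rather than something proposed in the thread, and `values` is a hypothetical name.

```python
import numpy as np

# Hypothetical sample data with a repeated maximum.
values = np.array([10, 30, 20, 30])

# Every position where the maximum occurs.
all_max_positions = np.flatnonzero(values == values.max())

# Only the first position of the maximum.
first_max_position = int(np.argmax(values))
```

The two results differ only when the maximum is repeated: `np.argmax` reports the first occurrence, while the boolean-mask form reports all of them.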


Contributor

lodagro commented Mar 5, 2013

The names were originally argmin and argmax, but were changed to idxmin and idxmax to avoid overloading the ndarray functions (see also #286).

Contributor

jseabold commented Mar 5, 2013

That makes sense. I didn't even see argmax/argmin before, though how I missed these is a mystery because I swear I looked...

Contributor

jreback commented Mar 5, 2013

The original argument for not using the name argmax is kind of bogus; there are many methods that override the numpy methods on purpose (e.g. to handle NaNs): where, abs, astype, reshape, repeat, unique, cumsum/cumprod, diff, clip, sort, etc.
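
A small sketch of what "override on purpose to handle NaNs" can look like in practice, using `cumsum` from the list above. It assumes a recent pandas (for `Series.to_numpy`); the sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 2.0])

# The Series method skips NaN by default (skipna=True) and keeps accumulating.
pandas_cumsum = s.cumsum()

# The plain NumPy function on the raw values propagates NaN from that point on.
numpy_cumsum = np.cumsum(s.to_numpy())
```

Here the pandas result still ends in 3.0, while the NumPy result is NaN from the missing value onward.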

mtpain commented Mar 5, 2013

argmin and argmax are pretty standard imho, esp. in optimization, where finding the argmin (argmax) of a function is the problem itself (a random search turned up http://ttic.uchicago.edu/~karthik/convex.pdf). I think argmin and argmax are much more appropriate names. I also think the name is intuitive: they are the argument (i.e. to a function, or an index to an array) that maximizes that thing.


Contributor

lodagro commented Mar 6, 2013

@jreback idxmin/idxmax return the label, whereas argmin/argmax return the location. I think this was the main reason why Wes did not want to override. Since no override was done, both exist, at least on Series; the same names idxmin/idxmax are kept on DataFrame.

In [33]: s = pd.Series([10, 30, 20], list('abc'))

In [34]: s
Out[34]:
a    10
b    30
c    20

In [35]: s.idxmax()
Out[35]: 'b'

In [36]: s.argmax()
Out[36]: 1
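
The label/location distinction in the session above can also be bridged explicitly. This is a sketch assuming a recent pandas and a unique index; `Index.get_loc` and `Series.to_numpy` are used here for illustration and are not part of the original session.

```python
import pandas as pd

s = pd.Series([10, 30, 20], index=list("abc"))

# Index label of the maximum.
label = s.idxmax()

# Label -> integer position (returns a plain int when the index is unique).
position = s.index.get_loc(label)

# Position computed directly on the raw values, then mapped back to a label.
position_direct = int(s.to_numpy().argmax())
label_from_position = s.index[position_direct]
```

With a default integer index the label and the position coincide, which is why the two methods can look interchangeable.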

Contributor

jreback commented Mar 6, 2013

I was using an integer index, so you are correct :)

In any event, I believe np.argmin of a series of datetime64[ns] with NaT is wrong (in 0.11-dev); it worked in 0.10.1 because shifted datetime64[ns] was object dtype.

So I think we SHOULD add support for argmax/argmin directly.
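
For comparison, a sketch of how the pandas-level methods treat NaT in a datetime64[ns] Series. It assumes a recent pandas and does not reproduce the 0.11-dev behaviour described above; the dates are made up for illustration.

```python
import pandas as pd

# A datetime64[ns] Series with a missing value (NaT) in position 1.
s = pd.Series(pd.to_datetime(["2013-03-05", None, "2013-03-01"]))

# idxmin/idxmax skip missing values by default (skipna=True),
# so the NaT entry is ignored when locating the extremes.
earliest_label = s.idxmin()
latest_label = s.idxmax()
```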

Contributor

lodagro commented Mar 6, 2013

created #2982 for datetime64[ns]
