Tweak DataFrame formatting to always use same # of digits #395

Closed
wesm opened this issue Nov 21, 2011 · 3 comments

Comments

@wesm
Member

commented Nov 21, 2011

No description provided.

@lodagro

Contributor

commented Jan 8, 2012

Now that formatting always uses the same number of digits, one can ask what precision actually means.
One gets more or less precision depending on the column's content.

In [1]: import numpy as np

In [2]: import pandas

In [3]: df = pandas.DataFrame({'A': [-1, 1e-2, 123456789, np.pi, np.sqrt(2)]})

In [4]: df
Out[4]: 
   A        
0 -1.0000000
1  0.0100000
2  1.235e+08
3  3.1415927
4  1.4142136

By the way, precision seems to be one digit off target when using the default float formatter, but it is OK when using the eng one.

In [5]: df = pandas.DataFrame({'A': [np.pi, np.sqrt(2)]})

In [6]: df
Out[6]: 
   A    
0  3.142
1  1.414

In [7]: pandas.core.common._precision
Out[7]: 4

In [8]: pandas.set_printoptions(precision=3)

In [9]: pandas.core.common._precision
Out[9]: 3

In [10]: df
Out[10]: 
   A   
0  3.14
1  1.41

In [11]: pandas.set_eng_float_format(use_eng_prefix=False, precision=3)

In [12]: df
Out[12]: 
   A        
0  3.142E+00
1  1.414E+00

In [13]: df = pandas.DataFrame({'A': [-1, 1e-2, 123456789, np.pi, np.sqrt(2)]})

In [14]: df
Out[14]: 
   A          
0 -1.000E+00  
1  10.000E-03 
2  123.457E+06
3  3.142E+00  
4  1.414E+00  

In [15]: pandas.core.common._precision
Out[15]: 3

@adamklein

Contributor

commented Jan 8, 2012

I took precision to mean the number of significant digits, in which
case it looks to me like the default float formatter shows the right
number of digits here, and the eng one does not. It would be good to
resolve this ambiguity.
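
The two readings of "precision" can be compared without pandas, using Python's built-in format specifications. This is only a sketch of the distinction, not of how pandas formats floats; the helper names `significant` and `decimals` are made up for illustration.

```python
import math


def significant(x, digits):
    """Format x with `digits` significant digits ('g' presentation type)."""
    return format(x, ".{}g".format(digits))


def decimals(x, places):
    """Format x with `places` digits after the decimal point ('f' type)."""
    return format(x, ".{}f".format(places))


for value in (math.pi, math.sqrt(2)):
    # Same number passed in, different meaning: 4 significant digits
    # versus 4 digits to the right of the decimal point.
    print(significant(value, 4), decimals(value, 4))
```

With 4 read as significant digits, pi prints as 3.142, which matches Out[6] above; read as decimal places it would print as 3.1416.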

@lodagro

Contributor

commented Jan 8, 2012

OK, so it's a matter of definition.

I think the definitions are:
precision = the effective number of decimal digits
accuracy = the effective number of those digits that appear to the right of the decimal point
But I often see precision used when accuracy is meant.

The eng float formatter uses its precision argument to indicate the number of digits that appear to the right of the decimal point. So for pandas it may be better to call that argument accuracy instead of precision for the eng formatter (as matplotlib does).

What pandas means by precision is the minimum number of digits, since the actual precision (as defined above) depends on the column content; see my example in the previous comment.
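
To make the last two points concrete, here is a minimal sketch of an engineering-notation formatter whose argument counts digits to the right of the decimal point. It is not the pandas implementation; `eng_format`, `shown_digits`, and the parameter name `accuracy` are hypothetical and chosen only to follow the definitions above.

```python
import math


def eng_format(x, accuracy=3):
    """Format x with an exponent that is a multiple of 3 and `accuracy`
    digits after the decimal point (hypothetical helper, not pandas code).

    A mantissa that rounds up to 1000 is not renormalized in this sketch.
    """
    if x == 0:
        exp3 = 0
    else:
        # Largest multiple of 3 not exceeding the decimal exponent of |x|.
        exp3 = int(math.floor(math.log10(abs(x)) / 3)) * 3
    mantissa = x / 10.0 ** exp3
    sign = "-" if exp3 < 0 else "+"
    mantissa_text = format(mantissa, ".{}f".format(accuracy))
    return "{}E{}{:02d}".format(mantissa_text, sign, abs(exp3))


def shown_digits(text):
    """Count the digits displayed in the mantissa part of `text`."""
    return sum(ch.isdigit() for ch in text.split("E")[0])


for value in (-1, 1e-2, 123456789, math.pi, math.sqrt(2)):
    text = eng_format(value, accuracy=3)
    print(text, shown_digits(text))
```

With a fixed number of decimal places, the number of digits shown varies from 4 to 6 across these values, so the argument sets the accuracy, while the precision in the sense defined above depends on the column content.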
