better quantities support #2494

Closed
twmr opened this issue Dec 11, 2012 · 7 comments

Comments


twmr commented Dec 11, 2012

I want to use the quantities package and pandas to process scientific data. However, pandas strips the unit(s) of the data stored in numpy arrays if I create a dataframe out of them:

import numpy as np
import pandas as pd
import quantities as pq

a = np.random.rand(10)*pq.s
b = np.random.rand(10)*pq.A
df = pd.DataFrame({'current':b, 't':a}, columns=['t','current'])
In [1]: df
Out[1]: 
         t   current
0  0.663397  0.435423
1  0.038498  0.101763
2  0.960983  0.091785
3  0.262863  0.364734
4  0.154440  0.274169
5  0.953129  0.052678
6  0.389961  0.272535
7  0.961604  0.559451
8  0.747192  0.438268
9  0.789207  0.568685

What do you think: should pandas support writing the unit of each column into the corresponding column name when the column is a quantities array, or should that be part of the quantities package?

In [1]: df
Out[1]: 
      t [s]  current [A]
0  0.663397     0.435423
1  0.038498     0.101763
2  0.960983     0.091785
....
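
One way to get close to the display proposed above without any change to pandas is to strip the units before building the frame and append them to the column names. This is only a minimal sketch, assuming the caller supplies each column as a (values, unit string) pair; the helper name `frame_with_unit_labels` is hypothetical and is not part of pandas or quantities.

```python
import numpy as np
import pandas as pd


def frame_with_unit_labels(columns):
    """Build a DataFrame whose column names carry the unit, e.g. 't [s]'.

    `columns` maps a column name to a (values, unit) pair.
    Hypothetical helper, for illustration only.
    """
    data = {
        f"{name} [{unit}]": np.asarray(values, dtype=float)
        for name, (values, unit) in columns.items()
    }
    return pd.DataFrame(data)


df = frame_with_unit_labels({
    "t": ([0.1, 0.2, 0.3], "s"),
    "current": ([1.0, 2.0, 3.0], "A"),
})
print(df)
```

The columns stay plain floats, so vectorized operations keep working, but the unit is only a label: nothing checks or propagates it in arithmetic.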
@changhiskhan
Contributor

related #1071

@changhiskhan
Contributor

ideal solution would be #2485


ghost commented Dec 13, 2012

note that pandas does allow arbitrary objects as data, so this works:

In [1]: import numpy as np
   ...: import pandas as pd
   ...: import quantities as pq

In [2]: a = [x for x in np.random.rand(10)*pq.s]
   ...: b = [x for x in np.random.rand(10)*pq.A]
   ...: df = pd.DataFrame({'current':b, 't':a}, columns=['t','current'])

In [3]: df
Out[3]: 
                   t           current
0   0.947453713439 s  0.862354858891 A
1  0.0715388489745 s  0.289714978433 A
2   0.355878981889 s  0.104722774751 A
3  0.0348524553738 s  0.583777806824 A
4  0.0138462074388 s  0.696935444594 A
5  0.0618976659102 s  0.517191317629 A
6   0.280860457598 s  0.811183797669 A
7     0.7259184522 s  0.270850455923 A
8   0.148336783722 s  0.203341988353 A
9   0.619437945726 s  0.895585882586 A

Which means this works, and is the most wonderful thing I've seen
all week:

In [4]: df**2
Out[4]: 
                        t               current
0      0.89766853911 s**2   0.743655902652 A**2
1   0.00511780691259 s**2  0.0839347687284 A**2
2      0.12664984975 s**2  0.0109668595515 A**2
3   0.00121469364558 s**2    0.34079652774 A**2
4  0.000191717460437 s**2   0.485719013931 A**2
5   0.00383132104514 s**2   0.267486859031 A**2
6    0.0788825966422 s**2     0.6580191536 A**2
7     0.526957599245 s**2  0.0733599694737 A**2
8    0.0220038014049 s**2  0.0413479642275 A**2
9     0.383703368605 s**2   0.802074073088 A**2

cgs ye shall torment me no more.

not familiar with quantities internals, so this may have degraded performance
since the new type is probably calculated per datum rather than per vector operation.
still. very cool.
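
The performance concern can be illustrated without quantities: a column of arbitrary Python objects is stored with `object` dtype, and arithmetic on it calls the operator on each element in turn. A minimal sketch, using `fractions.Fraction` from the standard library as a stand-in for a unit-carrying scalar (an assumption for illustration only; quantities itself is not used here):

```python
from fractions import Fraction

import pandas as pd

# A column of arbitrary Python objects is stored with object dtype.
obj = pd.Series([Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)])
# The same values converted to plain floats use a native float64 array.
flt = obj.astype(float)

# Squaring the object column calls Fraction.__pow__ once per element,
# so each result is again a Fraction and the dtype stays object.
obj_sq = obj ** 2
# Squaring the float column is a single vectorized operation.
flt_sq = flt ** 2

print(obj.dtype, flt.dtype)
print(obj_sq.tolist())
```

How large the slowdown is for real quantities objects is not measured here; the sketch only shows why per-element dispatch is expected.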


ghost commented Dec 13, 2012

I'll try and create a PR with a new keyword for the dataframe constructor which would
have the constructor do the conversion from X-array to array of X's.
Might enable other packages that subclass from numpy to work as well.

edit: on second thought, not generally useful.

@ghost ghost self-assigned this Dec 13, 2012

mikofski commented Dec 5, 2015

+1, and/or Pint support as well. I will contribute as I can

Contributor

jreback commented Dec 5, 2015

closing in favor of #10349

@jreback jreback closed this as completed Dec 5, 2015

5igno commented Apr 13, 2021

Just as a comment, the approach suggested here leaves room for quantities whose units are inconsistent from one row to the next. The unit of measurement should describe the content of each column, e.g. as column metadata.
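
One way to sketch the column-metadata idea is to keep plain float columns and record a single unit per column in `DataFrame.attrs`, which pandas documents as experimental. The `units` key and the layout of the dictionary are assumptions made for this example, not a pandas convention:

```python
import pandas as pd

df = pd.DataFrame({"t": [0.1, 0.2, 0.3], "current": [1.0, 2.0, 3.0]})

# One unit per column, stored as frame-level metadata (hypothetical layout).
df.attrs["units"] = {"t": "s", "current": "A"}

# Every row of a column shares the same unit by construction.
labels = [f"{col} [{df.attrs['units'][col]}]" for col in df.columns]
print(labels)
```

With this layout a column cannot mix units, but the metadata is passive: pandas does not check or update it when columns are combined arithmetically.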
