Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with unyt comparison methods via numpy 1.25.0 #431

Open
CalCraven opened this issue Jun 26, 2023 · 9 comments
Open

Issue with unyt comparison methods via numpy 1.25.0 #431

CalCraven opened this issue Jun 26, 2023 · 9 comments

Comments

@CalCraven
Copy link

  • unyt version:
    unyt 2.9.2 pyhd8ed1ab_1 conda-forge 101kB
  • Python version:
    3.10.0
  • Operating System:
    MacOS

Description

Comparing unyt arrays to strings has been deprecated by 1.25.0 release of numpy. According to the patch notes, universal functions for numpy arrays can be overwritten by a subclass, causing differences in comparison. This comes into play by raising an error if you try to compare a unyt_array object to a null string, which works with numpy 1.24.2 and below as far as we've tested.

What I Did

import numpy as np
print(np.__version__)
import unyt as u
x = 1.0 * u.kJ
print(assert x != "")

Outputs

1.25.0
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  
  self = unyt_quantity(1.0, 'kJ'), ufunc = <ufunc 'not_equal'>
  method = '__call__', inputs = (unyt_quantity(1.0, 'kJ'), ''), kwargs = {}
  func = <method-wrapper '__call__' of numpy.ufunc object at 0x7f43cbce0b40>
  out = None, out_func = None, new_dtype = dtype('float32')
  
                      new_dtype = np.dtype("f" + str(inp1.dtype.itemsize))
                      conv = new_dtype.type(conv)
                      if offset is not None:
                          raise InvalidUnitOperation(
                              "Quantities with units of Fahrenheit or Celsius "
                              "cannot by multiplied, divided, subtracted or "
                              "added with data that has different units."
                          )
  >                   inp1 = np.asarray(inp1, dtype=new_dtype) * conv
  E                   ValueError: could not convert string to float: ''

Working code

import numpy as np
print(np.__version__)
import unyt as u
x = 1.0 * u.kJ
print(assert x != "")

Outputs

1.24.2
True
@neutrinoceros
Copy link
Member

I don't understand the purpose of comparing a unyt_array to a string. What context does this pop into ?

I'm also not sure this is something we can or should fix, because it was never documented as reliable or tested.

@CalCraven
Copy link
Author

Yes that's a fair point. We have a package that has used it previously to compare many attributes of an object, some of which are defaulted to a null string, and only write out ones that have been modified. It's easy enough for us to change this comparison to something a little more rigorous, just thought it should be brought to everyone attention in case anyone else has the same bug coming up.

@ngoldbaum
Copy link
Member

I think it would probably be ok to just universally return False for equality comparisons with types that aren’t instances of unyt_array. Seems sensible to me.

@neutrinoceros
Copy link
Member

I think it would probably be ok to just universally return False for equality comparisons with types that aren’t instances of unyt_array.

this would break any existing comparison between an unyt_array (a) and a np.ndarray (b), which currently works the same as comparing a.ndview with b.

@ngoldbaum
Copy link
Member

Of course we’d need to treat ndarrays differently, I meant any random python type like string or other unrelated object.

@neutrinoceros
Copy link
Member

Makes sense to me. How about comparing scalar arrays (unyt_quantity) with stuff like int or float ?

@ngoldbaum
Copy link
Member

Probably just needs a call to np.asanyarray in __eq__ to handle all array-likes, then anything that ends up with a string or object dtype gets rejected.

@neutrinoceros
Copy link
Member

strings may need extra care too

>>> import numpy as np
>>> np.asanyarray("hello")
array('hello', dtype='<U5')

@ngoldbaum
Copy link
Member

Right, that's why I said,

then anything that ends up with a string or object dtype gets rejected

I guess also datetimes too. U5 is a string dtype.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants