ENH: added vector operators: divergence, curl and laplacian #6727

Closed
wants to merge 8 commits
252 changes: 252 additions & 0 deletions numpy/lib/function_base.py
@@ -1320,6 +1320,258 @@ def gradient(f, *varargs, **kwargs):
else:
return outvals

def divergence(v, *varargs, **kwargs):
Member:

Some more general comments -- I did not make up my mind about everything, so some of it is more for discussion:

  • I know varargs is copied from gradient, but I am frankly not a fan; I would prefer varargs to be a single sequence argument.
  • There seems to be quite a lot of duplication with gradient; maybe we can just call gradient (possibly adding some arguments to gradient, such as axis=None to pick which derivatives to calculate; that would also in some sense allow the stacked-matrix kind of logic -- see the sketch below).
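
For reference, a minimal sketch of the axis keyword being discussed (it landed in np.gradient in NumPy 1.11); selecting a single axis returns one directional derivative, which is the building block divergence and curl need:

import numpy as np

f = np.arange(12, dtype=float).reshape(3, 4)
dfdx = np.gradient(f, axis=0)  # derivative along the first axis only
dfdy = np.gradient(f, axis=1)  # derivative along the second axis only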

Author:

Hello seberg,
You are right, it makes more sense to call gradient as needed. I thought of this, but for some reason I also thought that each function should be "stand alone" (which I didn't do in laplace!). It will certainly be easier to understand (for me too!) this way. I will work on keeping what is new and substituting the appropriate calls to gradient for the rest. Yes, all the varargs are uncomfortable to me too; this way (calling gradient), that "freedom" will be removed when calling divergence, curl and laplacian, since it does not make much sense for these functions.
Thanks for your suggestions

"""
Return the divergence of an N-dimensional vector field of N
components, each of dimension N.

The divergence is computed using second order accurate central
differences in the interior and either first differences or second
order accurate one-sided (forward or backward) differences at the
boundaries. The returned divergence hence has the same shape as the
input arrays.

Member:

Should have .. versionadded:: 1.11.0 here. Grep for examples.

Parameters
----------
v : list of numpy arrays
Contributor:

Wouldn't it make more sense to make this an array-like? Here, one can choose which dimension to be that of the axes; I think the first one is fine. In that case, one could just do

v = np.asanyarray(v)

and it would even cover the case of having a list of numpy arrays. Another advantage would be that the tests for the sizes to be the same could be omitted, since that would be done implicitly already.
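
As a quick illustration (assuming equal-shaped components), the conversion stacks the components along the first axis:

import numpy as np

a = np.zeros((4, 4))
b = np.ones((4, 4))
v = np.asanyarray([a, b])
print(v.shape)  # (2, 4, 4): the first axis indexes the components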

Author:

Hi,
Thanks for your suggestion. By reviewing the size tests, I have realized that they are incomplete right now: they only check that every component has the same number of dimensions, but not the same size. A quick example:

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6, 7], [8, 9, 10], [11, 12, 13]])
c = [a, b]
d = np.asanyarray(c)
np.shape(d[0])
(2, 2)
np.shape(d[1])
(3, 3)

This should fail the tests for size. But the tests right now would pass, since:
len(d[0].shape)
2
len(d[1].shape)
2

I'm going to fix this and will implement your suggestion as well.
Thank you

    Each of these arrays is the N-dimensional projection of the vector
    field on the corresponding axis.

varargs : scalar or list of scalars, optional
    N scalars specifying the sample distances for each dimension,
    i.e. `dx`, `dy`, `dz`, ... Default distance: 1.
    A single scalar specifies the sample distance for all dimensions.
    If `axis` is given, the number of varargs must equal the number of axes.
edge_order : {1, 2}, optional
Member:

This does not actually seem to be implemented / make a difference currently. That also points to an unfortunately huge chunk of work, but it can wait for a while, it just needs patience: we need a lot of tests after hashing out what we want.

    Gradient is calculated using N\ :sup:`th` order accurate differences
    at the boundaries. Default: 1.


Returns
-------
divergence : numpy array
    The output corresponds to the divergence of the input vector field.
    This means that the output array has the form
    dAx/dx + dAy/dy + dAz/dz + ... for an input vector field
    A = (Ax, Ay, Az, ...) up to N dimensions.

Examples
--------
This example shows that the computed divergence of the 2-D field
(0.5*x**2, -y*x), whose analytical divergence is 0 everywhere, is an
array of zeros:

>>> import numpy as np
>>> X, Y = np.mgrid[0:2000, 0:2000]
>>> a1 = 0.5*X**2
>>> a2 = -Y*X
>>> c = [a1, a2]
>>> d = np.divergence(c)
>>> print(d)
[[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
...,
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
"""
N = [0]*len(v)
Contributor:

My sense is that these pre-calculation shape checks are not necessary: you can just rely on the addition in outvals failing if the shapes do not match. But if you really feel it is necessary, all that would be required is

if len(set(np.shape(item) for item in v)) > 1:
    raise ...

for i, v_c in enumerate(v):
    v_c = np.asanyarray(v_c)
    N[i] = len(v_c.shape)  # Get number of dimensions from every component
if False in [N[0] == N[i] for i in range(len(v))]:
Member:

use if not all(...) instead of if False in [...]
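
Applied to the check above, that would read:

if not all(N[0] == N[i] for i in range(len(v))):
    raise ValueError("Not all components of input"
                     " have the same number of dimensions")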

    # Check if all components have the same number of dimensions
    raise ValueError("Not all components of input"
                     " have the same number of dimensions")
else:
    if False in [np.shape(v[0]) == np.shape(v[i]) for i in range(len(v))]:
        raise ValueError("Not all components of input are the same size")
    else:
        N = N[0]
        # If all vector field components are the same shape,
        # N becomes the number of dimensions

axes = tuple(range(N))
Contributor:

Is this a mistake? Given this, why is there a need to normalize the axes below?

Author:

Yes, you are right, I borrowed some code from np.gradient and forgot to remove the normalization below. I will do it now, thanks.

Contributor:

This seems overly long as well. Doesn't the below cover it?

outvals = gradient(v[0], *varargs, axis=0, edge_order=0)
for axis in range(1, len(v)):
    outvals += gradient(v[axis], *varargs, axis=axis, edge_order=0)
return outvals
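
For what it's worth, a self-contained version of this suggestion that runs against released NumPy (whose gradient only accepts edge_order 1 or 2, so the edge_order=0 used elsewhere in this PR is omitted and the default is used):

import numpy as np

def divergence(v, *varargs):
    # Sum of the directional derivatives dv_i/dx_i, one per axis.
    out = np.gradient(v[0], *varargs, axis=0)
    for axis in range(1, len(v)):
        out += np.gradient(v[axis], *varargs, axis=axis)
    return out

X, Y = np.mgrid[0:50, 0:50].astype(float)
d = divergence([0.5*X**2, -Y*X])
# d is 0 in the interior; the boundary rows and columns deviate because
# the one-sided boundary differences are less accurate there.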

Contributor:

p.s. Is there any reason not just to have edge_order as a keyword argument and pass it on?
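
A sketch of that pass-through, mirroring the Python 2 compatible keyword handling gradient itself uses (a suggestion, not part of this diff):

def divergence(v, *varargs, **kwargs):
    edge_order = kwargs.pop('edge_order', 1)
    if kwargs:
        raise TypeError('"{}" are not valid keyword arguments.'.format(
                        '", "'.join(kwargs)))
    outvals = gradient(v[0], *varargs, axis=0, edge_order=edge_order)
    for axis in range(1, len(v)):
        outvals += gradient(v[axis], *varargs, axis=axis,
                            edge_order=edge_order)
    return outvals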

Member:

Yes, I was thinking along similar lines.

A trick you could use to avoid repeating the first line outside the loop is to make a dummy object to add in-place:

class ZeroArray(object):
    def __iadd__(self, other):
        return other

The first time you use ZeroArray, it's replaced by the object you add in-place:

>>> x = ZeroArray()
>>> x += 10
>>> x
10
>>> type(x)
int

Then you could write:

outvals = ZeroArray()
for axis in range(len(v)):
    outvals += ...

Contributor:

Neat. Though I fear it fails the Zen of Python...

Actually, I think python's sum is smart enough to do this intrinsically, i.e., one would just do

outvals = sum(gradient(v[axis], *varargs, axis=axis, edge_order=0)
              for axis in range(len(v)))

This definitely beats everything else in readability, but, doing a quick test, is quite a bit slower than either of our approaches (which both are faster than first making a zero-filled array for large sizes)
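
For anyone who wants to reproduce the comparison, a rough harness along these lines (absolute numbers vary with machine and array size; released NumPy's default edge_order is used):

import timeit

setup = ("import numpy as np; "
         "v = [np.random.rand(1000, 1000) for _ in range(2)]")
t_sum = timeit.timeit(
    "sum(np.gradient(v[a], axis=a) for a in range(2))",
    setup=setup, number=20)
t_loop = timeit.timeit(
    "out = np.gradient(v[0], axis=0)\n"
    "for a in range(1, 2):\n"
    "    out += np.gradient(v[a], axis=a)",
    setup=setup, number=20)
print(t_sum, t_loop)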

Member:

Indeed. The problem with sum is that it doesn't do in-place arithmetic on the result, e.g.,

class A(object):
    def __iadd__(self, other):
        print('iadd with %r' % other)
        return self

    def __add__(self, other):
        print('add with %r' % other)
        return self

    def __radd__(self, other):
        print('radd with %r' % other)
        return self

sum([A()] * 4)
# radd with 0
# add with <__main__.A object at 0x107590748>
# add with <__main__.A object at 0x107590748>
# add with <__main__.A object at 0x107590748>

Maybe defining inplace_sum would be cleaner?

def inplace_sum(items, start=0):
    total = start
    for item in items:
        total += item
    return total

Member:

Actually, in this case we want the result to be added in-place on the first object to avoid an unnecessary memory allocation. So really, we need something like this:

def inplace_sum(items):
    it = iter(items)
    total = next(it)
    for item in it:
        total += item
    return total
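
A quick check of the in-place behavior of this version; note that it modifies the first item, which is fine when the items are freshly computed gradients but would surprise a caller passing a list it still needs:

import numpy as np

arrays = [np.ones(3) for _ in range(4)]
total = inplace_sum(arrays)   # inplace_sum as defined just above
print(total is arrays[0])     # True: no extra allocation was made
print(total)                  # arrays[0] now holds the elementwise sum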

Contributor:

Yes, that makes sense -- I can see how for a generic sum you cannot rely on in-place addition doing what you'd like. Your solution also makes sense, and is similarly fast to our other options. Still, we are probably getting a bit off-topic here -- I think for the divergence I would just stick with doing it explicitly, as in my original suggestion.

outvals = np.zeros(np.shape(v_c))  # Initialize output

for i, ax in enumerate(axes):
    outvals += np.gradient(v[ax], *varargs, axis=ax, edge_order=0)

return outvals


def curl(v, *varargs, **kwargs):
"""
Return the curl of a 3-D vector field.

The curl is computed using second order accurate central differences
in the interior and either first differences or second order accurate
one-sided (forward or backward) differences at the boundaries. Each
component of the returned curl hence has the same shape as the input
arrays.

Parameters
----------
v : list of numpy arrays
    Each of these arrays is the N-dimensional projection
    of the vector field on the corresponding axis.

varargs : scalar or list of scalars, optional
    N scalars specifying the sample distances for each dimension,
    i.e. `dx`, `dy`, `dz`, ... Default distance: 1.
    A single scalar specifies the sample distance for all dimensions.
    If `axis` is given, the number of varargs must equal the number of axes.
edge_order : {1, 2}, optional
    Gradient is calculated using N\ :sup:`th` order accurate differences
    at the boundaries. Default: 1.

axis : None or int or tuple of ints, optional
    Gradient is calculated only along the given axis or axes.
    The default (axis = None) is to calculate the gradient for
    all the axes of the input array.
    axis may be negative, in which case it counts from the last to
    the first axis.

Returns
-------
curl : list of numpy arrays
    The output corresponds to the curl of the input 3-D vector field.
    This means that the output has the form
    (dAz/dy - dAy/dz, dAx/dz - dAz/dx, dAy/dx - dAx/dy) for an input
    vector field A = (Ax, Ay, Az).

Examples
--------
The following example uses matplotlib to show that the curl of the
vector field (x*y, 0, 0) is nonzero only in the third component of
the resulting vector field:

>>> import matplotlib.pylab as plt
>>> import numpy as np
>>> X, Y, Z = np.mgrid[0:200, 0:200, 0:200]
>>> a0 = X*Y
>>> a1 = np.zeros(np.shape(X))
>>> a2 = np.zeros(np.shape(X))
>>> a = [a0, a1, a2]
>>> cx, cy, cz = np.curl(a)
>>> plt.imshow(cx[:, :, 1])
>>> plt.colorbar()
>>> plt.show()
>>> plt.imshow(cy[:, :, 1])
>>> plt.colorbar()
>>> plt.show()
>>> plt.imshow(cz[:, :, 1])
>>> plt.colorbar()
>>> plt.show()
"""
N = [0]*len(v)
Contributor:

See comment on divergence above.

Contributor:

P.S. Here, of course, you do need to check that len(v) == 3.

Author:

Do you mean len(v) == 3 in divergence? In principle I was intending to create a "generalized" divergence, for any number of dimensions, following the example of np.gradient, that can be applied to N-dimensional arrays, and the "return vector" would have N components. Please, correct me if I am wrong. However, for the case of curl, three dimensions are needed, not more nor fewer.

Contributor:

No, for divergence I felt you should either omit all sanity checks, since if the shapes did not match the calculation of the divergence would fail anyway, or just do the very simple one I gave, since that covers what you had anyway.
My note was just to say that for curl, in addition to the (possible) check that shapes are identical, you do need to check that v has length 3 (as you say too).
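
For concreteness, a minimal sketch of that length check at the top of curl:

if len(v) != 3:
    raise ValueError("The curl requires a vector field with exactly "
                     "three components")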

for i, v_c in enumerate(v):
    v_c = np.asanyarray(v_c)
    N[i] = len(v_c.shape)
    # Extract the number of dimensions from every component v_c
if False in [N[i] == 3 for i in range(len(v))]:
    # Check that every component has three dimensions
    raise ValueError("Not all components of input vector field"
                     " have three dimensions")
else:
    if False in [np.shape(v[0]) == np.shape(v[i]) for i in range(len(v))]:
        raise ValueError("Not all components of input are the same size")
    else:
        N = N[0]
        # If all vector field components are the same shape,
        # N becomes the number of dimensions

outvals = [np.zeros(np.shape(v_c))] * 3  # Initialize output vector field
Contributor:

This can also be shortened substantially

# Calculate dAz/ dy - dAy / dz, dAx / dz - dAz / dx, dAy / dx - dAx / dy.
outvals = [gradient(v[(axis + 2) % 3], *varargs, axis=(axis + 1) % 3, edge_order=0) -
           gradient(v[(axis + 1) % 3], *varargs, axis=(axis + 2) % 3, edge_order=0)
           for axis in range(3)]
return outvals

(Note that I don't call np.gradient since the function gradient is defined locally anyway)
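
As a numerical sanity check of the modular-index formula (run against released NumPy, so with the default edge_order rather than the edge_order=0 used in this PR): for A = (x*y, 0, 0) the analytical curl is (0, 0, -x).

import numpy as np

X, Y, Z = np.mgrid[0:6, 0:6, 0:6].astype(float)
v = [X*Y, np.zeros_like(X), np.zeros_like(X)]
curl = [np.gradient(v[(axis + 2) % 3], axis=(axis + 1) % 3) -
        np.gradient(v[(axis + 1) % 3], axis=(axis + 2) % 3)
        for axis in range(3)]
# The field is linear in each coordinate, so even the one-sided
# boundary differences are exact here.
assert np.allclose(curl[0], 0)
assert np.allclose(curl[1], 0)
assert np.allclose(curl[2], -X)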

Author:

Hello everyone,
Thank you for all your suggestions, they will definitely improve the functions. I will start implementing them; I guess it will take me a couple of days to review everything and make some profiles to filter the most efficient ones. By the way, in my last commit I force edge_order to 0 when calling gradient, because if it is 1 something goes wrong at the edges and the examples don't return arrays of zeros (at the edges there are 0.5 and -0.5). With edge_order = 0 this does not happen.
Thanks again


lst_axes = [1, 2, 2, 0, 0, 1]  # ordered list of axes of the derivatives
lst_args = [2, 1, 0, 2, 1, 0]  # ordered list of arguments for the derivatives

for i in range(6):
    # select the appropriate axis and argument for each of the 6
    # required derivatives
    ax = lst_axes[i]
    arg = lst_args[i]

    out = np.gradient(v[arg], *varargs, axis=ax, edge_order=0)

    if i == 0:
        out0 = out
    elif i == 1:
        out1 = out
    elif i == 2:
        out2 = out
    elif i == 3:
        out3 = out
    elif i == 4:
        out4 = out
    else:
        out5 = out

outvals[0] = out0 - out1
outvals[1] = out2 - out3
outvals[2] = out4 - out5
return outvals


def laplace(f):
"""
Return the laplacian of input.

Computes the laplacian of an N-dimensional numpy array (scalar
field), or of a list of N N-dimensional numpy arrays (vector
field), as the composition of the numpy functions gradient and
divergence. In the case of an input vector field, the result
is the vector laplacian: a list with the laplacian of each
component is returned.
For more information on the computations, type help(gradient) or
help(divergence).

Parameters
----------
f : numpy array or list of numpy arrays
    Input scalar field (single array) or vector field (list of arrays).

Returns
-------
l : array_like
    Laplacian of the input. If the input is a scalar field f
    (single numpy array), the output is (d^2(f)/dx_0^2 +
    d^2(f)/dx_1^2 + d^2(f)/dx_2^2 + ...), where d^2(f)/dx_i^2
    refers to the second derivative of f with respect to x_i.
    If the input is a vector field, the output is a list of
    numpy arrays, each of them being the laplacian operator
    applied to the corresponding input vector field component f_i.

See Also
--------
gradient, divergence

Examples
--------
This example returns an array that grows linearly along the X axis
after applying the laplacian function to the array X**3+Y**2+Z**2:

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> X, Y, Z = np.mgrid[0:200, 0:200, 0:200]
>>> f = X**3 + Y**2 + Z**2
>>> d = np.laplace(f)
>>> plt.imshow(d[:, :, 1])
>>> plt.colorbar()
>>> plt.show()
"""
if type(f) == np.ndarray:
    # laplace takes no varargs, so gradient uses the default spacing
    l = np.divergence(np.gradient(f, edge_order=0))
elif type(f) == list:
    if False in [np.shape(f[0]) == np.shape(f[i]) for i in range(len(f))]:
        raise TypeError("All components of the input vector field "
                        "must be the same shape")
    else:
        l = []
        for i in range(len(f)):
            laplace_comp = np.divergence(np.gradient(f[i], edge_order=0))
            l.append(laplace_comp)
else:
    raise TypeError("Please, enter a numpy array or a list of"
                    " numpy arrays of the same shape")
return l
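
As a quick numerical check of the example's claim, composing np.gradient twice in place of this PR's divergence helper (and with released NumPy's default edge_order): the laplacian of f = x**3 + y**2 + z**2 is 6*x + 4, which central differences reproduce exactly away from the boundaries.

import numpy as np

X, Y, Z = np.mgrid[0:20, 0:20, 0:20].astype(float)
f = X**3 + Y**2 + Z**2
lap = sum(np.gradient(np.gradient(f, axis=i), axis=i) for i in range(3))
# Two cells in from every boundary both passes use central differences,
# so the result matches 6*X + 4 exactly there.
assert np.allclose(lap[2:-2, 2:-2, 2:-2], (6*X + 4)[2:-2, 2:-2, 2:-2])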


def diff(a, n=1, axis=-1):
"""