ENH: added vector operators: divergence, curl and laplacian #6727

Closed · wants to merge 8 commits

Conversation

@ghost commented Nov 25, 2015

Hello, yesterday I tried to upload this change but I messed up (sorry, my first time on GitHub). I think I got it right now. In this commit I have included three new functions in numpy/lib/function_base.py, namely divergence, curl and laplacian. I have been missing these functions when using numpy myself.

Thank you,

Antonio
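
(A rough sketch of what such operators look like when built on np.gradient — my illustration of the idea, not the code from this commit; uniform grid spacing h is an assumption:)

```python
import numpy as np

def divergence(v, h=1.0):
    """div(v) = sum_i d(v_i)/dx_i, where v is a sequence of N components,
    each an N-dimensional array (uniform grid spacing h assumed)."""
    return sum(np.gradient(v[ax], h, axis=ax) for ax in range(len(v)))

def curl(v, h=1.0):
    """curl(v) for a 3-component field sampled on a 3-D grid,
    with axes (0, 1, 2) taken as (x, y, z)."""
    vx, vy, vz = v
    return [np.gradient(vz, h, axis=1) - np.gradient(vy, h, axis=2),
            np.gradient(vx, h, axis=2) - np.gradient(vz, h, axis=0),
            np.gradient(vy, h, axis=0) - np.gradient(vx, h, axis=1)]
```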

@ghost (Author) commented Nov 25, 2015

I am trying to notify the mailing list about this, as suggested in the workflow, but I can't figure out how to send a message to a specific list. Is it done through Gmane? Any help on this would be greatly appreciated.
Thanks,
Antonio

@rkern (Member) commented Nov 25, 2015

@jaimefrio (Member) commented:

Your formatting seems to be way off, please:

  • use 4 space indentation throughout,
  • break long lines at 79 characters, and
  • follow the example of any other docstring for the proper vertical and horizontal spacing there.

@ghost (Author) commented Nov 26, 2015

Ahh, I'm sorry for that, I didn't realize... I think everything is ok now.
Thanks for the comment!

```
Return the laplacian of an N-dimensional array.

Computes the laplacian of an N-dimensional numpy array, as a
concatenation of the numpy functions gradient and divergence.
```
@argriffing (Contributor) commented on the diff:

Instead of 'concatenation', do you mean 'composition' here? Or this could be jargon that is unfamiliar to me; for example, the composition of functions might be written as the concatenation of the symbols representing them. And what is the meaning of the _s in laplace_s?

@ghost (Author) commented Nov 27, 2015

Hello argriffing,
Yes, 'composition' is probably more appropriate here; I just mean applying gradient to the input argument, and then applying divergence to the result of that gradient.
As for the _s, it stands for the "scalar laplacian", since the laplacian operator also exists for vector fields. I was planning to implement that as well. In principle it is just applying the divergence first and then the gradient, but I want to make sure first; it is not so straightforward for vector fields (https://en.wikipedia.org/wiki/Vector_Laplacian), though I'm pretty sure that in Cartesian coordinates it works out that way. So I was thinking about creating another function for this case, called laplace_v, but of course the names can be changed to more appropriate ones. For example, I know that Matlab calls the laplacian operator "del2".
Thanks,
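
(For reference, the Wikipedia page linked above gives the Cartesian simplification: the vector Laplacian is the scalar laplacian applied to each component. A minimal sketch of both, assuming uniform spacing h — my illustration, not the committed code:)

```python
import numpy as np

def laplace_s(f, h=1.0):
    """Scalar laplacian as divergence(gradient(f)): differentiate f along
    each axis, then differentiate each resulting component along its own
    axis and sum. Assumes f.ndim >= 2 and uniform grid spacing h."""
    grad = np.gradient(np.asanyarray(f), h)        # one component per axis
    return sum(np.gradient(grad[ax], h, axis=ax)   # d(grad_ax)/dx_ax
               for ax in range(len(grad)))

def laplace_v(v, h=1.0):
    """Vector Laplacian in Cartesian coordinates: the scalar laplacian
    applied componentwise."""
    return [laplace_s(v_c, h) for v_c in v]
```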


```
Parameters
----------
v : list of numpy arrays
```
A Contributor commented on the diff:

Wouldn't it make more sense to make this an array-like? Here, one can choose which dimension holds the components; I think the first one is fine. In that case, one could just do

```python
v = np.asanyarray(v)
```

and it would even cover the case of having a list of numpy arrays. Another advantage is that the explicit same-size tests could be omitted, since that check would already happen implicitly.

@ghost (Author) replied:

Hi,
Thanks for your suggestion. Reviewing the size tests, I have realized that they are incomplete right now: they only check that every component has the same number of dimensions, not the same shape. A quick example:

```python
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6, 7], [8, 9, 10], [11, 12, 13]])
>>> c = [a, b]
>>> d = np.asanyarray(c)
>>> np.shape(d[0])
(2, 2)
>>> np.shape(d[1])
(3, 3)
```

This should fail the size tests, but the current tests would pass, since:

```python
>>> len(d[0].shape)
2
>>> len(d[1].shape)
2
```

I'm going to fix this and will implement your suggestion as well.
Thank you
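
(A minimal sketch of the stricter check; the helper name and error message are placeholders, not code from the PR:)

```python
import numpy as np

def _check_components(v):
    """Coerce each component to an array and require identical shapes,
    not merely identical numbers of dimensions."""
    v = [np.asanyarray(v_c) for v_c in v]
    if any(v_c.shape != v[0].shape for v_c in v[1:]):
        raise ValueError("all components of v must have the same shape")
    return v
```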

@shoyer (Member) commented Nov 30, 2015

One thing that I would find helpful is to include the definition of the relevant operators (written using partial derivatives) in docstrings for each function.
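
(For reference, the standard Cartesian definitions the docstrings would state — divergence and laplacian for n dimensions, curl for the 3-D case:)

```latex
\nabla \cdot \mathbf{v} = \sum_{i=1}^{n} \frac{\partial v_i}{\partial x_i},
\qquad
\nabla^2 f = \sum_{i=1}^{n} \frac{\partial^2 f}{\partial x_i^2},
\qquad
\nabla \times \mathbf{v} =
\left(
  \frac{\partial v_z}{\partial y} - \frac{\partial v_y}{\partial z},\;
  \frac{\partial v_x}{\partial z} - \frac{\partial v_z}{\partial x},\;
  \frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y}
\right).
```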

@ghost (Author) commented Nov 30, 2015

Following some suggestions, I have eliminated the duplicated code by calling gradient with the appropriate axis parameter every time. Now everything looks cleaner and easier to understand. The definitions (using derivative notation) are included in the docstrings, in the Returns section.

```python
for i, v_c in enumerate(v):
    v_c = np.asanyarray(v_c)
    N[i] = len(v_c.shape)  # Get number of dimensions from every component
if False in [N[0] == N[i] for i in range(len(v))]:
```
A Member commented on the diff:

use `if not all(...)` instead of `if False in [...]`
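
(A minimal before/after of that suggestion; the raise is a placeholder for whatever the original branch actually does:)

```python
# Before: builds the whole list, then scans it for a False entry.
if False in [N[0] == N[i] for i in range(len(v))]:
    raise ValueError("components differ in number of dimensions")

# After: reads naturally and short-circuits on the first mismatch.
if not all(N[0] == N[i] for i in range(len(v))):
    raise ValueError("components differ in number of dimensions")
```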

@ghost (Author) commented Dec 2, 2015

I have incorporated your suggestions. Everything looks much better and easier now!

@ghost (Author) commented Dec 7, 2015

Hi again,
Sorry for asking something so simple; this is my first contribution on GitHub. If everyone agrees with the last version, what is the next step? Do I have to write to the mailing list again?


```python
N=len(v)
axes = tuple(range(N))
out = inplace_sum([gradient(v[ax],*varargs,axis=ax,edge_order=0)
```
@mhvk (Contributor) commented on the diff:

You'd have to pass the argument as an iterator here, otherwise the list with all the arrays still gets created; I think all you need to do is remove the square brackets.

While you are at it, also follow PEP8 more closely: in particular, spaces around `=` (line 1388), spaces after `,`, and the `for ax in axes` should align with the first character after `inplace_sum(`. (If you use emacs, these kinds of checks can be done on the fly: http://docs.astropy.org/en/latest/development/codeguide_emacs.html)

@ghost (Author) replied:

Hello mhvk,
Thank you for your comment. I'm not sure I understand what you mean. Do you propose not creating a list (in the example, c) to pass to the function, but instead using something like a generator, in order to save memory? In that case, I think something inside the function would have to change: for example, if a list is not passed, N = len(v) would no longer make sense, and some more changes would be necessary.
Thanks

@mhvk (Contributor) replied:

No, I just meant passing a generator rather than a list to inplace_sum, not changing the whole function. Here, on line 1390, you are creating a list implicitly by using the `[` and `]` at the ends; these should be removed.
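
(A runnable sketch of the difference; inplace_sum is reproduced from this PR, and the field v is made up for illustration:)

```python
import numpy as np

def inplace_sum(items):
    """Accumulate an iterable of arrays into the first one (from this PR)."""
    it = iter(items)
    total = next(it)
    for item in it:
        total += item
    return total

v = [np.random.rand(40, 40, 40) for _ in range(3)]  # hypothetical 3-D field

# List comprehension: all three gradient arrays exist at the same time.
out = inplace_sum([np.gradient(v[ax], axis=ax) for ax in range(3)])

# Generator expression (the brackets removed): each gradient is produced
# lazily and folded into the running total, so only one intermediate
# gradient array is alive at any moment.
out = inplace_sum(np.gradient(v[ax], axis=ax) for ax in range(3))
```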

@mhvk (Contributor) commented Dec 7, 2015

@antlarac - I added a few small further comments. One more: all code should have tests that check that it works as expected (and, more importantly, continues to work as expected when someone else starts fiddling with the code). Tests for this module seem to be in numpy/lib/tests/test_function_base.py, so that would be the logical place to add some more.

@ghost (Author) commented Dec 8, 2015

Thank you for your answers, I will work on the tests now

```
@@ -1320,6 +1320,201 @@ def gradient(f, *varargs, **kwargs):
    else:
        return outvals

def inplace_sum(items):
```
A Member commented on the diff:

If this is a private function, it should start with an underscore.

@charris (Member) commented Dec 8, 2015

The new functions need to be added to the `__all__` list, and a note should be added to doc/release/1.11.0-notes.rst. At some point the commits also need squashing.
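
(A sketch of the first part of that request; the `+=` form is illustrative, as the real file defines `__all__` as one literal list:)

```python
# numpy/lib/function_base.py: export the new names so they become part of
# the public API picked up by `from numpy import *` and the reference docs.
__all__ += ['divergence', 'curl', 'laplacian']
```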

@ghost (Author) commented Dec 9, 2015

Thank you for all the comments on formatting, I am fixing everything :)

@ghost (Author) commented Dec 20, 2015

Hello again,
I finally added some tests and fixed some other things. I also added descriptions of the new functions in doc/release/1.11.0-notes.rst.
Can anyone think of any other way to improve this contribution?
Thank you :)

@seberg (Member) commented Jan 4, 2016

I am sorry, it seems this is difficult to get moving :(.

Anyway, as I mentioned before... I am going to ask you or someone else to convince me that this is actually a reasonable way to calculate the laplacian numerically, or to scrap the whole function. I don't doubt that it is in principle correct, but I am very dubious about it. For one thing, it must have a rather large stencil (in my world, laplacians are numerically calculated with stencils), but is the boundary really quite correct for that stencil, etc.? Note for example:

```python
In [119]: r = np.sin(np.linspace(0, np.pi, 500))[None].repeat(500, axis=0)

In [120]: np.laplace(r).sum()
Out[120]: -6.2959224292199103

In [121]: scipy.ndimage.laplace(r).sum()
Out[121]: 5.0480886831794081e-13
```

And because I am dubious about the laplace, I think I need some reassurance that the other functions are numerically reasonable as well. I don't really doubt it, but if you have other software that calculates these values, did you compare the results?

There is at least one code problem that might be a bit more serious (it is older than your changes, really), though probably a simple fix: edge_order=0 is used, which is not even supported and should raise an error. Since it is used, what does it mean?

Last point: I am still a bit unsure about the signature. Ignoring gradient, which I am sure is ancient, might not a signature such as this (with or without axis) be convenient?

```python
def divergence(arr, axis=-1, diffs=None):
    """
    Parameters
    ----------
    arr : ndarray
        Data for which to calculate the divergence.
    axis : int
        Axis signifying the vector dimension.
    """
```

Of course, in some sense this is less efficient if you happen to have your data in list form, but it seems to me the more common form in numpy, considering linalg. Just thinking out loud, but before the end I would like to poll the mailing list about the API again, unless more people chip in here.

@ghost (Author) commented Jan 5, 2016

ok, thanks for your comment, let me check it.

@ghost (Author) commented Jan 10, 2016

I'm sorry this is taking me so long; these days I can't dedicate much time...

One alternative for the laplacian would be to apply gradient twice in each direction and sum the results. I am concerned about what happens at the edges, but that point is always trickier anyway...

Also, I have found the source code for Octave's del2 function. Using the proposed np.laplace on several arbitrary test functions, the results are qualitatively equal, but there is a quantitative difference: for some reason, the del2 function in Octave and Matlab is 1/4 times the actual laplacian, and the Octave results are actually slightly smaller even than that, compared to the proposed np.laplace. I have also found some problems with 1D functions; I will look deeper into that.

I will check the first option (applying gradient twice), as well as trying to implement a Python version of Octave's del2 function. If anyone thinks one of these approaches is not reasonable for numpy, please let me know. For the sake of curiosity, here is the Octave del2 function:

Thanks for your help

```matlab
function D = del2 (M, varargin)

  if (nargin < 1)
    print_usage ();
  endif

  nd = ndims (M);
  sz = size (M);
  dx = cell (1, nd);
  if (nargin == 2 || nargin == 1)
    if (nargin == 1)
      h = 1;
    else
      h = varargin{1};
    endif
    for i = 1 : nd
      if (isscalar (h))
        dx{i} = h * ones (sz (i), 1);
      else
        if (length (h) == sz (i))
          dx{i} = diff (h)(:);
        else
          error ("del2: dimensionality mismatch in %d-th spacing vector", i);
        endif
      endif
    endfor
  elseif (nargin - 1 == nd)
    ## Reverse dx{1} and dx{2} as the X-dim is the 2nd dim of the ND array
    tmp = varargin{1};
    varargin{1} = varargin{2};
    varargin{2} = tmp;

    for i = 1 : nd
      if (isscalar (varargin{i}))
        dx{i} = varargin{i} * ones (sz (i), 1);
      else
        if (length (varargin{i}) == sz (i))
          dx{i} = diff (varargin{i})(:);
        else
          error ("del2: dimensionality mismatch in %d-th spacing vector", i);
        endif
      endif
    endfor
  else
    print_usage ();
  endif

  idx = cell (1, nd);
  for i = 1: nd
    idx{i} = ":";
  endfor

  D = zeros (sz);
  for i = 1: nd
    if (sz(i) >= 3)
      DD = zeros (sz);
      idx1 = idx2 = idx3 = idx;

      ## interior points
      idx1{i} = 1 : sz(i) - 2;
      idx2{i} = 2 : sz(i) - 1;
      idx3{i} = 3 : sz(i);
      szi = sz;
      szi (i) = 1;

      h1 = repmat (shiftdim (dx{i}(1 : sz(i) - 2), 1 - i), szi);
      h2 = repmat (shiftdim (dx{i}(2 : sz(i) - 1), 1 - i), szi);
      DD(idx2{:}) = ((M(idx1{:}) - M(idx2{:})) ./ h1 + ...
                     (M(idx3{:}) - M(idx2{:})) ./ h2) ./ (h1 + h2);

      ## left and right boundary
      if (sz(i) == 3)
        DD(idx1{:}) = DD(idx3{:}) = DD(idx2{:});
      else
        idx1{i} = 1;
        idx2{i} = 2;
        idx3{i} = 3;
        DD(idx1{:}) = (dx{i}(1) + dx{i}(2)) / dx{i}(2) * DD (idx2{:}) - ...
            dx{i}(1) / dx{i}(2) * DD (idx3{:});

        idx1{i} = sz(i);
        idx2{i} = sz(i) - 1;
        idx3{i} = sz(i) - 2;
        DD(idx1{:}) =  (dx{i}(sz(i) - 1) + dx{i}(sz(i) - 2)) / ...
            dx{i}(sz(i) - 2) * DD (idx2{:}) - ...
            dx{i}(sz(i) - 1) / dx{i}(sz(i) - 2) * DD (idx3{:});
      endif

      D += DD;
    endif
  endfor

  D = D ./ nd;
endfunction
```

EDIT (seberg): Add ticks for code to make it a bit more readable

@charris (Member) commented Jan 12, 2016

@antlarac When you get back to this, could you also squash the commits? You can use `git rebase -i HEAD~n` for that, where n is the number of commits you want to work on.
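
(For example, to squash this PR's 8 commits into one — the branch name is a placeholder:)

```sh
git rebase -i HEAD~8          # mark every commit after the first as "squash"
git push --force-with-lease origin my-feature-branch   # update the PR branch
```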

@ghost (Author) commented Jan 14, 2016

Ok, I didn't know about this, thank you for the tip :)

@ghost (Author) commented Jan 21, 2016

OK, I know this is taking too long, but I think I'll finish soon. I've been figuring out how del2 from Matlab/Octave works: it definitely uses a 5-point stencil for interior points, and the edge points are linearly extrapolated from the two nearest neighbors of each edge. So it would certainly give different results (especially at the edges) than my initial proposal of taking the divergence of the gradient. Thanks seberg, you were completely right on this point!!

I have also been checking the Matlab versions of divergence and curl, and those use gradient along every dimension as needed, so I would say that the divergence and curl proposed in this pull request are reliable.
I hope to finish the laplacian (maybe it is better to call it del2, as in Matlab/Octave?) in a few days.

I'm still not 100% sure why del2 is actually 1/4 of the value of the actual laplacian. It may have to do with the (h1 + h2) term in the second-derivative calculation (DD), but that would give a factor of 1/2, I think (at least for regular grids). I will keep working on this, but of course any suggestion is welcome :)
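
(A side note on that factor, derived from the del2 source quoted above, assuming a regular grid with spacing h: the interior formula yields half the second derivative per axis, and the final `D = D ./ nd` then divides by the number of dimensions, giving the laplacian over 2·nd — exactly 1/4 in 2D:)

```latex
\frac{(M_{i-1}-M_i)/h + (M_{i+1}-M_i)/h}{h+h}
   = \frac{M_{i-1}-2M_i+M_{i+1}}{2h^2}
   \approx \frac{1}{2}\frac{\partial^2 M}{\partial x^2},
\qquad
\texttt{del2}(M) = \frac{1}{nd}\sum_{i=1}^{nd}\frac{1}{2}
   \frac{\partial^2 M}{\partial x_i^2}
   = \frac{\nabla^2 M}{2\,nd}.
```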

```python
total = next(it)
for item in it:
    total += item
return total
```
A Member commented on the diff:

I think this can be written as `np.sum(arr, out=arr[0])`. Seems like a premature optimization to me, though.

@seberg (Member) commented May 2, 2019

Since the original author has deleted their GitHub account, and I think the PR still needed quite a lot of work and probably some discussion, I will open an issue requesting that these functions be added to np.linalg, pointing to this PR, since there is likely a good start and useful discussion here.

8 participants