Array mods #113

JimNrao · 2015-06-04T21:02:17Z

Modified the Array class by adding a new method, reformOrResize. This allows an array with a large amount of memory allocated to change shape without forcing the memory to be reallocated.
Added an ElementType typedef to allow template users of Array to determine the element type; for example:

template <typename T>
void f (T & x){
    typename T::ElementType tmp = x (0);
    // etc.
}

that is allocated but not currently needed. An example would be if adding a Matrix of data onto a Cube. 2) Added "typedef T ElementType;" so that templates using an array can have access to the element type. (STL data structures usually do this).

tammojan · 2015-06-05T18:52:12Z

To get the github cross-references right: this would fix #111, so whenever this gets merged, also #111 can be closed.

gervandiepen · 2015-06-11T10:15:25Z

I have a few comments. They have to be addressed in the code.

The ElementType typedef should be removed. Standard STL is to use value_type for such a typedef and that is already present in Array.h.
I do not understand why the implementation of the reform function has changed. baseReform already checks if the length matches. The error message suggests that reform is a nonStrict reform, which it is not.
Why not inline the capacity function?
In Array.h the comment for function capacity needs to be placed before the function declaration, otherwise doxygen does not pick it up.
In Array.h the comments of reformOrResize contain quite some typos. It should not use matrix, but array. In the middle resize instead of copy is used. I got the impression that resizePercentage is applied to the current shape, but it is to the new shape.
reformOrResize checks if the array is contiguous. Note that even if contiguous, the array can be a view on a larger array. It does not check if other Array objects are referencing the same storage. It might be better to check that no more references exist to the array's storage.
A serious issue: it does not check that the dimensionality does not change, hence it is possible that a Vector gets resized to a 2-dim array which must not be possible.
I would like to see the code not dependent on the template, being implemented in ArrayBase to avoid bloat. Certainly the first part checking the shape, could be put in ArrayBase.
The test 'shape() == newShape' will throw an exception if their lengths mismatch. Better to use the isEqual function.
Data is not copied if there is sufficient space and copyDataIfNeeded is true. Maybe easier to remove argument copyDataIfNeeded.

tammojan · 2015-06-11T11:06:28Z

Reopening the pull request, so that improvements can be made in the existing pull request.

…yMods

JimNrao · 2015-06-15T17:24:42Z

Response to Ger's comments above:

I have a few comments. They have to be addressed in the code.

1. The ElementType typedef should be removed. Standard STL is to use value_type for such a typedef and that is already present in Array.h.

--> Removed this; I missed the existing typedef (lots of typedefs in the iterator definitions).
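
To illustrate the value_type point, here is a minimal sketch (not code from this pull request) of a generic function that obtains the element type through the STL-style value_type typedef. std::vector stands in for Array, and sumElements is a hypothetical name chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the elements of any container that exposes the STL-style
// value_type typedef together with size() and operator[].
// std::vector is only a stand-in for an Array-like class here.
template <typename Container>
typename Container::value_type sumElements(const Container& c) {
    typedef typename Container::value_type Element;
    Element total = Element();          // value-initialized (zero for arithmetic types)
    for (std::size_t i = 0; i < c.size(); ++i) {
        total += c[i];
    }
    return total;
}
```

Because the function only relies on value_type, it should work unchanged with any container that follows the STL convention, which is the argument for not adding a second typedef with a different name.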

2. I do not understand why the implementation of the reform function has changed. baseReform already checks if the length matches. The error message suggests that reform is a nonStrict reform, which it is not.

--> Reverted the implementation back since this class no longer needs a new implementation.  
    (Manually reverted it but only got the signature changed).

3. Why not inline the capacity function?

--> Done.

4. In Array.h the comment for function capacity needs to be placed before the function declaration, otherwise doxygen does not pick it up.

--> Done

5. In Array.h the comments of reformOrResize contain quite some typos. It should not use matrix, but array. In the middle resize instead of copy is used. I got the impression that resizePercentage is applied to the current shape, but it is to the new shape.

--> Fixed a couple of typos.

6. reformOrResize checks if the array is contiguous. Note that even if contiguous, the array can be a view on a larger array. It does not check if other Array objects are referencing the same storage. It might be better to check that no more references exist to the array's storage.
A serious issue: it does not check that the dimensionality does not change, hence it is possible that a Vector gets resized to a 2-dim array which must not be possible.
I would like to see the code not dependent on the template, being implemented in ArrayBase to avoid bloat. Certainly the first part checking the shape, could be put in ArrayBase.

--> Added two checks to throw an exception if the array is shared or if an attempt is made
    to change the dimensionality.

--> About 1/2 of the checks require access to data_p; it seems more readable to keep all of the
    validation checks together rather than split them between Array and ArrayBase.

7. The test 'shape() == newShape' will throw an exception if their lengths mismatch. Better to use the isEqual function.

--> Changed to isEqual.  (Rather odd that operator== doesn't simply return false if the shapes 
    are different)
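
The difference between the two comparison styles can be sketched as follows. This is an illustration only, not the IPosition implementation: Shape, strictEqual, and tolerantEqual are hypothetical names, and a std::vector stands in for IPosition.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

typedef std::vector<int> Shape;   // hypothetical stand-in for IPosition

// Strict comparison: a length mismatch is treated as an error,
// mirroring the operator== behaviour described in the thread.
bool strictEqual(const Shape& a, const Shape& b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("shape lengths differ");
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) return false;
    }
    return true;
}

// Tolerant comparison: a length mismatch simply means "not equal".
bool tolerantEqual(const Shape& a, const Shape& b) {
    return a.size() == b.size() && strictEqual(a, b);
}
```

A check such as "is the new shape the same as the current one" wants the tolerant form, since a caller passing a shape of a different dimensionality should get a clear answer (or a dedicated error) rather than an exception from the comparison itself.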

8. Data is not copied if there is sufficient space and copyDataIfNeeded is true. Maybe easier to remove argument copyDataIfNeeded.

--> If there is sufficient space, the data is left in place.  The most likely use cases will
    increase or decrease the last dimension, so leaving the data in place is appropriate.
    If a different change is made (e.g., a 2x3 going to a 3x2) it's really hard to say what
    a copy ought to do.  I added a line in the comments to warn the user about this case.

gervandiepen · 2015-06-16T06:56:57Z

Hi Jim,

Thanks for the changes.
I agree it is odd that IPosition::operator== throws an exception if the
lengths mismatch. Brian Glendenning wrote that code a long time ago (in
1992) and I always hesitated to change it. Maybe we have to bite the bullet
one time.

I still think copyDataIfNeeded should be removed. The argument promises to
copy the array elements to their new places when reshaping from e.g. [2,4]
to [3,4]. However, that won't be done if there is enough storage. Note: in
such a case element [1,1] is not the same as before the reformOrResize. It
makes it unpredictable for the caller if a copy will be made or not, so
better to never do it. Do you need that argument?
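
The effect on element positions can be checked with a little index arithmetic. The sketch below assumes column-major storage (first axis varies fastest); linearOffset is a hypothetical helper written for this illustration, not a casacore function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major offset of `index` within an array of the given `shape`:
// offset = i0 + i1*n0 + i2*n0*n1 + ...
std::size_t linearOffset(const std::vector<std::size_t>& shape,
                         const std::vector<std::size_t>& index) {
    assert(shape.size() == index.size());
    std::size_t offset = 0;
    std::size_t stride = 1;
    for (std::size_t axis = 0; axis < shape.size(); ++axis) {
        offset += index[axis] * stride;
        stride *= shape[axis];
    }
    return offset;
}
```

Under that assumption, element [1,1] sits at offset 3 in a [2,4] array but at offset 4 in a [3,4] array, so reshaping in place without moving data changes what [1,1] refers to. When only the last axis changes, the strides of all existing elements are unaffected and their offsets stay the same.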

I do not agree data_p is used a lot. In fact, about everything could be
done in ArrayBase because resize is a virtual function. It only requires
data_p.nrefs() to be passed to that base function. In fact, only the call
to setEndIter() needs to be in Array.tcc.
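
A rough sketch of that split is shown below. It is only an illustration under simplifying assumptions: ShapeHolder, TypedArray, reformOrResizeImpl, and resizeStorage are hypothetical names, a std::vector stands in for the reference-counted storage, and the reference count is passed in as a plain number.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

typedef std::vector<std::size_t> Shape;   // hypothetical stand-in for IPosition

inline std::size_t product(const Shape& s) {
    std::size_t n = 1;
    for (std::size_t i = 0; i < s.size(); ++i) n *= s[i];
    return n;
}

// Non-template base: all checks that do not depend on the element type.
class ShapeHolder {
public:
    virtual ~ShapeHolder() {}
    const Shape& shape() const { return shape_; }

protected:
    // Returns true if the request exceeded `capacity`, false if the
    // existing storage could be reused. The typed subclass passes in
    // its reference count and capacity.
    bool reformOrResizeImpl(const Shape& newShape, std::size_t nReferences,
                            std::size_t capacity) {
        if (nReferences != 1)
            throw std::runtime_error("storage is shared with another array");
        if (newShape.size() != shape_.size())
            throw std::runtime_error("dimensionality may not change");
        const std::size_t needed = product(newShape);
        const bool exceedsCapacity = needed > capacity;
        resizeStorage(needed);            // virtual: the only type-specific step
        shape_ = newShape;
        return exceedsCapacity;
    }
    virtual void resizeStorage(std::size_t nElements) = 0;
    Shape shape_;
};

template <typename T>
class TypedArray : public ShapeHolder {
public:
    explicit TypedArray(const Shape& shape) : data_(product(shape)) { shape_ = shape; }
    std::size_t capacity() const { return data_.capacity(); }
    bool reformOrResize(const Shape& newShape) {
        // This sketch has no sharing, so the reference count is always 1.
        return reformOrResizeImpl(newShape, 1, capacity());
    }
private:
    void resizeStorage(std::size_t nElements) override { data_.resize(nElements); }
    std::vector<T> data_;
};
```

Only the thin reformOrResize wrapper and resizeStorage are instantiated per element type; the validation logic is compiled once in the base class.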

Now that there is a test on nrefs()==1, I think the test on isContiguous makes
little sense. As I already commented, even if contiguous the Array could be
a view on a part of the original, already deleted, Array object. I think it
does not matter for reformOrResize if the view is contiguous or not. What
do you think?

Ger


JimNrao · 2015-06-17T17:34:41Z

Maybe I should change the name to copyIfResizing. My use case is when
a shape of [2,3,4] goes to [2,3,5] (i.e., adding another row on a
vis cube); the user need not know whether the operation will require resizing
or reforming, but they do want the data in element (1,1,1) to be the same
in the resulting array. I pass the argument to the Array::resize method
if resizing is required.

I can see where a user who radically changes shape might be confused
into thinking that this parameter could be used to preserve the data
(e.g., going from [2,3,4] to [4, 2, 3]) even when resizing is not
required. The original Array::reform method just lets the data remain
at the same memory location.

Maybe there needs to be two different methods with slightly different
semantics? One method only allows altering the last dimension while
preserving the data already in place. The second method lets the user
radically reshape the array but provides no guarantees about existing
data. The underlying code might be the same (e.g. the protected
ArrayBase::reformOrResize method would be the same in both cases) with
two thin methods providing the public API. Maybe one would be called
extendArray (preserves data) and the other could be called
reformOrReshape (no data guarantees).


Jim Jacobs

phone: 835-7235
office: 301

…Base with needed template-specific info passed in as parameters.

and extend. The first method promised nothing about the data contained in the array while the second method always preserves it.

JimNrao · 2015-06-18T21:05:44Z

Created a new method, extend, which only allows changing the last dimension. The new method always preserves the relevant portion of the data. The old method now offers no guarantee about the underlying data. This allows removing the "copyDataIfNeeded" parameter and provides a clearer set of semantics for each call. The same ArrayBase method does the bulk of the work for both methods, so the change only improves the quality of the Array API.
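
The data-preserving behaviour can be sketched as follows, assuming column-major storage in a flat buffer. This is not the code from the pull request: extendLastAxis is a hypothetical free function, and std::vector stands in for the array storage.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Changes only the last axis of a column-major array held in `data`.
// Elements that exist in both the old and the new shape keep their
// offsets, so resizing the flat buffer preserves them.
template <typename T>
void extendLastAxis(std::vector<T>& data, std::vector<std::size_t>& shape,
                    std::size_t newLastLength) {
    if (shape.empty()) {
        throw std::invalid_argument("array has no axes");
    }
    std::size_t sliceSize = 1;   // elements per step along the last axis
    for (std::size_t axis = 0; axis + 1 < shape.size(); ++axis) {
        sliceSize *= shape[axis];
    }
    data.resize(sliceSize * newLastLength);   // leading elements are kept
    shape.back() = newLastLength;
}
```

Because the last axis has the largest stride, growing or shrinking it only appends or drops whole slices at the end of the buffer. That is why this case can promise data preservation while a general reshape cannot.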

gervandiepen · 2015-06-22T06:12:52Z

I see your use case. But adding an element to another axis than the last
one, is just as valid and will have different semantics. We could say that
if copy=True and another axis than the last one changes, a resize with copy
is always done (because that is cheaper and easier than making a temp copy
in another array; when not using a temp array, you have to figure out how
to copy which can be quite difficult).

But having a function extendArray also sounds very good to me. Do you
really need the more general reformOrResize? If not, we might as well drop
it. Although you can argue it might be useful to be able to reuse an
existing array for totally different purposes, the limitation that the
dimensionality cannot change undermines it.
Would extendArray also allow you to make the last axis shorter? If so, the
function name is incorrect.


gervandiepen · 2015-06-23T09:51:14Z

casa/Arrays/ArrayBase.cc

+            // Perform an exact resize
+
+            resize (newShape, copyDataIfNeeded);
+	    resetEnd = False;


I think setEndIter needs to be done by the caller when the array is resized.

JimNrao · 2015-06-23T18:04:03Z

Actually I use both aspects of the functionality, although the "extend"
one is the most important. In both use-cases, the data always retains
the same number of dimensions. In the more frequent case, I have to
increase or decrease the last dimension of the cube (or matrix or
vector); this one can occur in the tightest of the read loops (we
normally process a set of visibilities from the same spectral window
over a modest time interval). At the next level up, we potentially
change spectral windows which can change the size of all three axes
(correlation, channel, row). This one occurs less frequently and is not
as important to performance, I think. My suspicion is that anyone using
this method would normally be interested in reusing a block of storage.

I think splitting it into one call that reforms the array to allow storage
reuse (i.e., any data in the array prior to the call should be
considered effectively lost) and one that extends the array in the last
dimension while guaranteeing data preservation would be a
reasonable set.

Maybe the "extend" method could be called "adjustLastAxis". "Extend"
makes sense if you allow for "negative extension" but when looking at an
API I would probably scan past "extend" looking for something else
(e.g., "shrink"), etc.
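
The storage-reuse pattern described above might look roughly like this. It is a simplified sketch rather than the actual visibility-buffer code: CubeBuffer and setShape are hypothetical names, and std::vector stands in for the Cube storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical visibility-cube buffer with axes (correlation, channel, row).
struct CubeBuffer {
    std::vector<float> data;
    std::size_t nCorr = 0, nChan = 0, nRow = 0;

    // Sets a new shape. Returns true if the request exceeded the current
    // capacity (so the vector had to grow), false if storage was reused.
    bool setShape(std::size_t corr, std::size_t chan, std::size_t row) {
        const std::size_t needed = corr * chan * row;
        const bool grew = needed > data.capacity();
        data.resize(needed);   // no reallocation when needed <= capacity
        nCorr = corr;
        nChan = chan;
        nRow = row;
        return grew;
    }
};
```

If the buffer is sized once for the largest expected cube, both the frequent case (a different number of rows) and the rarer case (a different spectral window changing all three axes) can reuse the same block of memory inside the read loop.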


gervandiepen · 2015-06-24T05:59:01Z

I sympathize with changing extend to a name that better covers the
functionality. adjustLastAxis seems fine.
I now realize my remark about setEndIter was incorrect, because resize is
already doing setEndIter. Maybe better to add that as a comment to the code.
I'll merge the pull request tomorrow morning once you've changed the name
extend.
Ger

On Tue, Jun 23, 2015 at 8:04 PM, Jim notifications@github.com wrote:

Actually I use both aspects of the functionality, although the "extend"
one is the most important. In both use-cases, the data always retains
the same number of dimensions. In the more frequent case, I have to
increase or decrease the last dimension of the cube (or matrix or
vector); this one can occur in the tightest of the read loops (we
normally process a set of visibilities from the same spectral window
over a modest time interval). At the next level up, we potentially
change spectral windows which can change the size of all three axes
(correlation, channel, row). This one occurs less frequently and is not
as important to performance, I think. My suspicion is that anyone using
this method would normally be interested in reusing a block of storage.

I think splitting it into two calls would be a reasonable set: one that
reforms the array to allow storage reuse (i.e., any data in the array
prior to the call should be considered effectively lost) and one that
extends the array in the last dimension and guarantees data
preservation.

Maybe the "extend" method could be called "adjustLastAxis". "Extend"
makes sense if you allow for "negative extension" but when looking at an
API I would probably scan past "extend" looking for something else
(e.g., "shrink"), etc.
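
The split described above can be sketched with a toy class. This is only an illustration under stated assumptions: `ToyArray`, its `int` element type, and the simplified signatures are hypothetical and are not the casacore `Array` API. Both methods keep the number of dimensions fixed and return true when new storage had to be allocated.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-in for an array with spare capacity; not the casacore Array class.
class ToyArray {
public:
    explicit ToyArray(const std::vector<std::size_t>& shape)
        : shape_(shape), storage_(count(shape), 0) {}

    // Allocated elements, which may exceed the elements in the current shape.
    std::size_t capacity() const { return storage_.size(); }
    const std::vector<std::size_t>& shape() const { return shape_; }
    int& at(std::size_t offset) { return storage_.at(offset); }

    // Any shape of the same rank; existing values must be treated as lost.
    // Returns true if new storage had to be allocated.
    bool reformOrResize(const std::vector<std::size_t>& newShape) {
        requireSameRank(newShape);
        const bool grow = count(newShape) > capacity();
        if (grow) storage_.assign(count(newShape), 0);
        shape_ = newShape;
        return grow;
    }

    // Only the last axis may change; existing values keep their offsets.
    bool adjustLastAxis(const std::vector<std::size_t>& newShape) {
        requireSameRank(newShape);
        for (std::size_t i = 0; i + 1 < shape_.size(); ++i) {
            if (shape_[i] != newShape[i])
                throw std::invalid_argument("only the last axis may change");
        }
        const bool grow = count(newShape) > capacity();
        if (grow) storage_.resize(count(newShape), 0);  // keeps old elements
        shape_ = newShape;
        return grow;
    }

private:
    static std::size_t count(const std::vector<std::size_t>& s) {
        std::size_t n = 1;
        for (std::size_t extent : s) n *= extent;
        return n;
    }
    void requireSameRank(const std::vector<std::size_t>& s) const {
        if (s.size() != shape_.size())
            throw std::invalid_argument("dimensionality may not change");
    }

    std::vector<std::size_t> shape_;
    std::vector<int> storage_;
};
```

Shrinking the last axis and growing it back never touches the allocator in this sketch, which is the read-loop case described above.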

On 06/22/2015 12:12 AM, Ger van Diepen wrote:

I see your use case. But adding an element to an axis other than the
last one is just as valid and will have different semantics. We could
say that if copy=True and an axis other than the last one changes, a
resize with copy is always done (because that is cheaper and easier
than making a temp copy in another array; when not using a temp array,
you have to figure out how to copy, which can be quite difficult).

But having a function extendArray also sounds very good to me. Do you
really need the more general reformOrResize? If not, we can just as
well drop it. Although you can argue it might be useful to be able to
reuse an existing array for totally different purposes, the limitation
that the dimensionality cannot change undermines it.
Would extendArray also allow you to make the last axis shorter? If so,
the function name is incorrect.

On Wed, Jun 17, 2015 at 7:34 PM, Jim notifications@github.com wrote:

Maybe I should change the name to copyIfResizing. My use case is when
a shape of [2,3,4] goes to [2,3,5] (i.e., adding another row on a vis
cube); the user need not know whether the operation will require
resizing or reforming, but they do want the data in element (1,1,1) to
be the same in the resulting array. I pass the argument to the
Array::resize method if resizing is required.

I can see where a user who radically changes shape might be confused
into thinking that this parameter could be used to preserve the data
(e.g., going from [2,3,4] to [4,2,3]) even when resizing is not
required. The original Array::reform method just lets the data remain
at the same memory location.
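
A small sketch of why the two shape changes above behave differently. It assumes contiguous storage in which the first axis varies fastest (the Fortran-style order used by casacore arrays); `offsetOf` is a hypothetical helper written for this illustration, not a casacore function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: linear offset of an index in a contiguous array
// whose first axis varies fastest.
std::size_t offsetOf(const std::vector<std::size_t>& shape,
                     const std::vector<std::size_t>& index) {
    std::size_t offset = 0;
    std::size_t stride = 1;
    for (std::size_t axis = 0; axis < shape.size(); ++axis) {
        offset += index[axis] * stride;
        stride *= shape[axis];  // stride of the next axis
    }
    return offset;
}
```

With this layout, element (1,1,1) sits at offset 9 for both [2,3,4] and [2,3,5], so data left in place is still found at the same index. For [4,2,3] the same index maps to offset 13, so leaving the data in place no longer preserves which value each index refers to.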

Maybe there need to be two different methods with slightly different
semantics? One method only allows altering the last dimension while
preserving the data already in place. The second method lets the user
radically reshape the array but provides no guarantees about existing
data. The underlying code might be the same (e.g., the protected
ArrayBase::reformOrResize method would be the same in both cases) with
two thin methods providing the public API. Maybe one would be called
extendArray (preserves data) and the other could be called
reformOrReshape (no data guarantees).

On 06/16/2015 12:56 AM, Ger van Diepen wrote:

Hi Jim,

Thanks for the changes.
I agree it is odd that IPosition::operator== throws an exception if the
lengths mismatch. Brian Glendenning wrote that code a long time ago (in
1992) and I always hesitated to change it. Maybe we have to bite the
bullet one day.
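
The difference between the two comparisons can be sketched as follows. `ToyShape` is a hypothetical type that only mimics the behaviour described in this thread (a throwing `operator==` and a length-tolerant `isEqual`); it is not the IPosition implementation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy shape type; it only mimics the behaviour discussed here.
struct ToyShape {
    std::vector<std::size_t> extents;

    // Length-tolerant comparison: shapes of different rank are simply unequal.
    bool isEqual(const ToyShape& other) const {
        return extents == other.extents;
    }
};

// Strict comparison: a rank mismatch is reported as an error, not as "false".
bool operator==(const ToyShape& a, const ToyShape& b) {
    if (a.extents.size() != b.extents.size())
        throw std::invalid_argument("shapes have different lengths");
    return a.extents == b.extents;
}
```

A caller that may see shapes of different rank, as a reshaping function can, needs the tolerant form or has to check the lengths first.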

I still think copyDataIfNeeded should be removed. The argument promises
to copy the array elements to their new places when reshaping from,
e.g., [2,4] to [3,4]. However, that won't be done if there is enough
storage. Note: in such a case element [1,1] is not the same as before
the reformOrResize. It makes it unpredictable for the caller whether a
copy will be made or not, so better to never do it. Do you need that
argument?

I do not agree data_p is used a lot. In fact, just about everything
could be done in ArrayBase because resize is a virtual function. It
only requires data_p.nrefs() to be passed to that base function. In
fact, only the call to setEndIter() needs to be in Array.tcc.
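
One way to picture this suggestion is a non-template base class that makes the decision once, with the element-type-specific work behind a virtual call. The sketch below is an assumption-laden illustration: `ToyArrayBase`, `ToyTypedArray`, and `reformOrResizeImpl` are hypothetical names, and the reference count is passed in as a plain number rather than taken from real shared storage.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical non-template base: the decision logic is compiled only once.
class ToyArrayBase {
public:
    virtual ~ToyArrayBase() = default;
    const std::vector<std::size_t>& shape() const { return shape_; }

protected:
    explicit ToyArrayBase(std::vector<std::size_t> shape)
        : shape_(std::move(shape)) {}

    // Element-type-specific reallocation, supplied by the derived template.
    virtual void resize(const std::vector<std::size_t>& newShape) = 0;

    static std::size_t count(const std::vector<std::size_t>& s) {
        std::size_t n = 1;
        for (std::size_t extent : s) n *= extent;
        return n;
    }

    // Returns true if resize() was called. nrefs and capacity are passed in
    // because only the derived class knows about its storage.
    bool reformOrResizeImpl(const std::vector<std::size_t>& newShape,
                            std::size_t nrefs, std::size_t capacity) {
        if (newShape.size() != shape_.size())
            throw std::invalid_argument("dimensionality may not change");
        if (nrefs != 1)
            throw std::runtime_error("storage is shared");
        if (count(newShape) > capacity) {
            resize(newShape);
            return true;
        }
        shape_ = newShape;  // enough room: only the shape changes
        return false;
    }

    std::vector<std::size_t> shape_;
};

template <typename T>
class ToyTypedArray : public ToyArrayBase {
public:
    explicit ToyTypedArray(const std::vector<std::size_t>& shape)
        : ToyArrayBase(shape), data_(count(shape)) {}

    bool reformOrResize(const std::vector<std::size_t>& newShape) {
        // A real implementation would pass its storage reference count here;
        // this toy owns its data exclusively, so the count is always 1.
        return reformOrResizeImpl(newShape, 1, data_.size());
    }

protected:
    void resize(const std::vector<std::size_t>& newShape) override {
        data_.assign(count(newShape), T());
        shape_ = newShape;
    }

private:
    std::vector<T> data_;
};
```

Only the thin `reformOrResize` wrapper and `resize` are instantiated per element type in this arrangement.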

Now that there is a test on nrefs()==1, I think the test on
isContiguous makes little sense. As I already commented, even if
contiguous the Array could be a view on a part of the original, already
deleted, Array object. I think it does not matter for reformOrResize
whether the view is contiguous or not. What do you think?

Ger
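
The reference-count test discussed in this message can be illustrated with a toy handle. The use of `std::shared_ptr` and the names `ToyHandle` and `canReuseInPlace` are assumptions made for the sketch; casacore has its own storage classes.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy handle with reference-counted storage, for illustration only.
struct ToyHandle {
    std::shared_ptr<std::vector<int>> storage;
    std::size_t length;  // elements this handle currently exposes

    // In-place reuse is only safe when no other handle sees the same
    // storage and the requested length fits in what is already allocated.
    bool canReuseInPlace(std::size_t newLength) const {
        return storage.use_count() == 1 && newLength <= storage->size();
    }
};
```

As soon as a second handle refers to the same storage, reshaping through the first one would silently change what the second one sees, which is why the count is checked before anything else.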

JimNrao · 2015-06-24T16:04:42Z

The name change seems like a good idea.

I was having the ArrayBase method return a bool to indicate to the
caller (in Array) that it needed to call setEndIter since that wasn't
available to ArrayBase.

gervandiepen · 2015-06-25T05:52:41Z

Finalized issue #111

juliantaylor · 2015-06-25T14:49:20Z

You shouldn't merge PRs with failing Travis tests; casacore now fails to build:

/home/jtaylor/eso/casa/casacore/casa/Arrays/test/tArray.cc:770:20: error: ‘class casa::Array<int>’ has no member named ‘extend’
  bool resized = a1.extend (newShape);

JimNrao · 2015-06-25T16:44:54Z

Fixed the error in tArray.cc. It probably needs to be reopened and remerged after Travis passes on it?

tammojan · 2015-06-25T16:46:48Z

@JimNrao Could you just fix it on the master branch?

juliantaylor · 2015-06-25T16:47:50Z

I don't think you can add to an already merged PR, please make a new one with the fix.

tammojan · 2015-06-25T16:52:49Z

A new PR is indeed better, I'll merge it once it passes travis.

gervandiepen · 2015-06-25T16:56:32Z

I agree. I hadn't paid enough attention to it. A good lesson.

JimNrao added 2 commits June 4, 2015 13:20

Merge branch 'master' into ArrayMods

265d1d9

gervandiepen closed this Jun 11, 2015

tammojan reopened this Jun 11, 2015

JimNrao added 2 commits June 15, 2015 10:56

Merge branch 'master' of ssh://github.com/casacore/casacore into ArrayMods

d009643

Modified Array.{h,tcc} to address Ger's comments on the pull request.

098eb84

JimNrao added 2 commits June 17, 2015 13:25

Moved 95% of the implementation of Array<T>::reformOrReshape to ArrayBase with needed template-specific info passed in as parameters.

c24188b

Reworked API so that there are now two public methods: reformOrResize and extend. The first method promised nothing about the data contained in the array while the second method always preserves it.

46b1be7

tammojan mentioned this pull request Jun 22, 2015

Release casacore 2.0.2? #125

Closed

tammojan added this to the 2.0.2 milestone Jun 23, 2015

gervandiepen reviewed Jun 23, 2015
View reviewed changes

JimNrao added 2 commits June 24, 2015 11:17

Merge branch 'master' of ssh://github.com/casacore/casacore into ArrayMods

4a7f369

Changed name of "Array<T>::extend" method to "Array<T>::adjustLastAxis".

7add7a2

gervandiepen added a commit that referenced this pull request Jun 25, 2015

Merge pull request #113 from JimNrao/ArrayMods

26cb696

Issue #111: Added the capacity feature making it possible to resize without acquiring memory.

gervandiepen merged commit 26cb696 into casacore:master Jun 25, 2015

JimNrao assigned gijzelaerr Jun 25, 2015

JimNrao mentioned this pull request Jun 25, 2015

Made change in tArray that corresponds to the method name change. #137

Merged

tammojan mentioned this pull request Jun 27, 2015

Allowing in-place array reshaping #111

Closed
