Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't initialize CoordinateSequence in constructor #764

Open
eyal0 opened this issue Dec 8, 2022 · 2 comments · May be fixed by #765
Open

Don't initialize CoordinateSequence in constructor #764

eyal0 opened this issue Dec 8, 2022 · 2 comments · May be fixed by #765

Comments

@eyal0
Copy link
Contributor

eyal0 commented Dec 8, 2022

tl;dr

No need to initialize the CoordinateSequence when we construct a new one because we never used to do that.

The new code (d463bcb)

CoordinateSequence::CoordinateSequence(std::size_t sz, std::size_t dim) :
m_vect(sz * std::max(static_cast<std::uint8_t>(dim), static_cast<std::uint8_t>(3))),
m_stride(std::max(static_cast<std::uint8_t>(dim), static_cast<std::uint8_t>(3))),
m_hasdim(dim > 0),
m_hasz(dim >= 3),
m_hasm(dim == 4)
{
if (dim == 1 || dim > 4) {
throw util::IllegalArgumentException("Declared dimension must be 2, 3, or 4");
}
initialize();
}

In the code above, we see that if the CoordinateSequence is generated by calling with two arguments, sz and dim, then initialize is always called. It calls fillVector which calls std::fill and loops through the size of the new vector to initialize all elements to 0.

I think that this is not necessary.

The old code (60edf0e)

Looking at commit 60edf0e we see that creating the CoordinateSequence directly like this is instead of calling getGeometrySequenceFactory()->create(sz, dim).

60edf0e#diff-95f3875df536135fe955deecfaeab0af5f6af52008d0fd5a56a3ed1b6638aca1L2427-R2429

getCoordinateSequenceFactory is a method of the GeometryFactory class:

const CoordinateSequenceFactory* getCoordinateSequenceFactory() const
{
return coordinateListFactory;
};

It returns a CoordinateSequenceFactory, which has a create method. It's here in the old code:

/** \brief
* Creates a CoordinateSequence of the specified size and dimension.
*
* For this to be useful, the CoordinateSequence implementation must
* be mutable.
*
* @param size the number of coordinates in the sequence
* @param dimension the dimension of the coordinates in the sequence
* (0=unknown, 2, or 3 - ignored if not user specifiable)
*/
virtual std::unique_ptr<CoordinateSequence> create(std::size_t size,
std::size_t dimension = 0) const = 0;

We see that it is a pure virtual function so it must be implemented by inheritors of this class.

There are only two classes that inherit from this class, the DefaultCoordinateSequenceFactory and the CoordinateArraySequenceFactory:

class GEOS_DLL CoordinateArraySequenceFactory: public CoordinateSequenceFactory {

class GEOS_DLL DefaultCoordinateSequenceFactory : public CoordinateSequenceFactory {

Let's look at them one-by-one

CoordinateArraySequenceFactory

Here's the implementation of create(size_t, size_t):

/** @see CoordinateSequenceFactory::create(std::size_t, int) */
std::unique_ptr<CoordinateSequence> create(std::size_t size, std::size_t dimension) const override
{
return std::unique_ptr<CoordinateSequence>(
new CoordinateArraySequence(size, dimension));
};

It's creating a CoordinateArraySequence using the constructor and returning a unique_ptr. Here's that constuctor:

CoordinateArraySequence::CoordinateArraySequence(std::size_t n,
std::size_t dimension_in):
vect(n),
dimension(dimension_in)
{
}

It is resizing the vector to make it the right size but not initializing anything.

DefaultCoordinateSequenceFactory

Here's the implementation of create(size_t, size_t):

std::unique_ptr <CoordinateSequence> create(std::size_t size, std::size_t dims = 0) const final override {
switch(size) {
case 5: return detail::make_unique<FixedSizeCoordinateSequence<5>>(dims);
case 4: return detail::make_unique<FixedSizeCoordinateSequence<4>>(dims);
case 3: return detail::make_unique<FixedSizeCoordinateSequence<3>>(dims);
case 2: return detail::make_unique<FixedSizeCoordinateSequence<2>>(dims);
case 1: return detail::make_unique<FixedSizeCoordinateSequence<1>>(dims);
default:
return detail::make_unique<CoordinateArraySequence>(size, dims);
}
}

It's making a FixedSizeCoordinateSequence,templated on the number of dimensions. It's like a fixed size list, like a tuple! Let's look at that constuctor:

explicit FixedSizeCoordinateSequence(std::size_t dimension_in = 0) : dimension(dimension_in) {}

We see that it is just coping the dimension. It is not initializing anything.

Conclusion

For all cases of create(size_t, size_t), the code before 60edf0e did not initialize the coordinates. The new code could also skip the initialization to maintain the same behavior. This would also improve performance because we'd eliminate the slow call to std::fill.

@eyal0 eyal0 linked a pull request Dec 8, 2022 that will close this issue
@dbaston
Copy link
Member

dbaston commented Dec 8, 2022

the code before 60edf0e did not initialize the coordinates

We didn't initialize them explicitly, but they were still initialized. If I create a std::vector<Coordinate>(5) or std::array<Coordinate, 5>, the five elements are initialized using the default constructor for Coordinate.

That said, it should be possible to skip the initialization in many places. At one point I added an initialize flag to the the constructor with a dim parameter. But I think it's better to move in the direction of eliminating this constructor entirely, because the dim flag is ambiguous (does a value of 3 indicate XYZ or XYM?)

@eyal0
Copy link
Contributor Author

eyal0 commented Dec 8, 2022

Ah, you are right! That explains why I had to initialize the first one in the PR.

If you can eliminate the usage of dim==0 (unknown) then you can do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants