Don't initialize CoordinateSequence in constructor #764

eyal0 · 2022-12-08T04:50:01Z

tl;dr

No need to initialize the CoordinateSequence when we construct a new one because we never used to do that.

The new code (`d463bcb`)

Lines 68 to 79 in d463bcb

    
           CoordinateSequence::CoordinateSequence(std::size_t sz, std::size_t dim) : 
        
               m_vect(sz * std::max(static_cast<std::uint8_t>(dim), static_cast<std::uint8_t>(3))), 
        
               m_stride(std::max(static_cast<std::uint8_t>(dim), static_cast<std::uint8_t>(3))), 
        
               m_hasdim(dim > 0), 
        
               m_hasz(dim >= 3), 
        
               m_hasm(dim == 4) 
        
           { 
        
               if (dim == 1 || dim > 4) { 
        
                   throw util::IllegalArgumentException("Declared dimension must be 2, 3, or 4"); 
        
               } 
        
               initialize(); 
        
           }

In the code above, we see that if the CoordinateSequence is generated by calling with two arguments, sz and dim, then initialize is always called. It calls fillVector which calls std::fill and loops through the size of the new vector to initialize all elements to 0.

I think that this is not necessary.

The old code (`60edf0e`)

Looking at commit 60edf0e we see that creating the CoordinateSequence directly like this is instead of calling getGeometrySequenceFactory()->create(sz, dim).

60edf0e#diff-95f3875df536135fe955deecfaeab0af5f6af52008d0fd5a56a3ed1b6638aca1L2427-R2429

getCoordinateSequenceFactory is a method of the GeometryFactory class:

geos/include/geos/geom/GeometryFactory.h

Lines 446 to 449 in 0e8d436

    
           const CoordinateSequenceFactory* getCoordinateSequenceFactory() const 
        
           { 
        
               return coordinateListFactory; 
        
           };

It returns a CoordinateSequenceFactory, which has a create method. It's here in the old code:

geos/include/geos/geom/CoordinateSequenceFactory.h

Lines 87 to 98 in 0e8d436

    
               /** \brief 
        
                * Creates a CoordinateSequence of the specified size and dimension. 
        
                * 
        
                * For this to be useful, the CoordinateSequence implementation must 
        
                * be mutable. 
        
                * 
        
                * @param size the number of coordinates in the sequence 
        
                * @param dimension the dimension of the coordinates in the sequence 
        
                * 	(0=unknown, 2, or 3 - ignored if not user specifiable) 
        
                */ 
        
               virtual std::unique_ptr<CoordinateSequence> create(std::size_t size, 
        
                                                                  std::size_t dimension = 0) const = 0;

We see that it is a pure virtual function so it must be implemented by inheritors of this class.

There are only two classes that inherit from this class, the DefaultCoordinateSequenceFactory and the CoordinateArraySequenceFactory:

geos/include/geos/geom/CoordinateArraySequenceFactory.h

Line 42 in 0e8d436

    
           class GEOS_DLL CoordinateArraySequenceFactory: public CoordinateSequenceFactory {

geos/include/geos/geom/DefaultCoordinateSequenceFactory.h

Line 24 in 0e8d436

    
           class GEOS_DLL DefaultCoordinateSequenceFactory : public CoordinateSequenceFactory {

Let's look at them one-by-one

`CoordinateArraySequenceFactory`

Here's the implementation of create(size_t, size_t):

geos/include/geos/geom/CoordinateArraySequenceFactory.h

Lines 69 to 74 in 0e8d436

    
           /** @see CoordinateSequenceFactory::create(std::size_t, int) */ 
        
           std::unique_ptr<CoordinateSequence> create(std::size_t size, std::size_t dimension) const override 
        
           { 
        
               return std::unique_ptr<CoordinateSequence>( 
        
                   new CoordinateArraySequence(size, dimension)); 
        
           };

It's creating a CoordinateArraySequence using the constructor and returning a unique_ptr. Here's that constuctor:

geos/src/geom/CoordinateArraySequence.cpp

Lines 37 to 42 in 0e8d436

    
           CoordinateArraySequence::CoordinateArraySequence(std::size_t n, 
        
                   std::size_t dimension_in): 
        
               vect(n), 
        
               dimension(dimension_in) 
        
           { 
        
           }

It is resizing the vector to make it the right size but not initializing anything.

`DefaultCoordinateSequenceFactory`

Here's the implementation of create(size_t, size_t):

geos/include/geos/geom/DefaultCoordinateSequenceFactory.h

Lines 43 to 53 in 0e8d436

    
           std::unique_ptr <CoordinateSequence> create(std::size_t size, std::size_t dims = 0) const final override { 
        
               switch(size) { 
        
                   case 5: return detail::make_unique<FixedSizeCoordinateSequence<5>>(dims); 
        
                   case 4: return detail::make_unique<FixedSizeCoordinateSequence<4>>(dims); 
        
                   case 3: return detail::make_unique<FixedSizeCoordinateSequence<3>>(dims); 
        
                   case 2: return detail::make_unique<FixedSizeCoordinateSequence<2>>(dims); 
        
                   case 1: return detail::make_unique<FixedSizeCoordinateSequence<1>>(dims); 
        
                   default: 
        
                       return detail::make_unique<CoordinateArraySequence>(size, dims); 
        
               } 
        
           }

It's making a FixedSizeCoordinateSequence,templated on the number of dimensions. It's like a fixed size list, like a tuple! Let's look at that constuctor:

geos/include/geos/geom/FixedSizeCoordinateSequence.h

Line 35 in 0e8d436

    
           explicit FixedSizeCoordinateSequence(std::size_t dimension_in = 0) : dimension(dimension_in) {}

We see that it is just coping the dimension. It is not initializing anything.

Conclusion

For all cases of create(size_t, size_t), the code before 60edf0e did not initialize the coordinates. The new code could also skip the initialization to maintain the same behavior. This would also improve performance because we'd eliminate the slow call to std::fill.

The text was updated successfully, but these errors were encountered:

dbaston · 2022-12-08T13:11:24Z

the code before 60edf0e did not initialize the coordinates

We didn't initialize them explicitly, but they were still initialized. If I create a std::vector<Coordinate>(5) or std::array<Coordinate, 5>, the five elements are initialized using the default constructor for Coordinate.

That said, it should be possible to skip the initialization in many places. At one point I added an initialize flag to the the constructor with a dim parameter. But I think it's better to move in the direction of eliminating this constructor entirely, because the dim flag is ambiguous (does a value of 3 indicate XYZ or XYM?)

eyal0 · 2022-12-08T14:21:20Z

Ah, you are right! That explains why I had to initialize the first one in the PR.

If you can eliminate the usage of dim==0 (unknown) then you can do it.

eyal0 linked a pull request Dec 8, 2022 that will close this issue

Don't initialize CoordinateSequence in constructor #765

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't initialize CoordinateSequence in constructor #764

Don't initialize CoordinateSequence in constructor #764

eyal0 commented Dec 8, 2022

dbaston commented Dec 8, 2022

eyal0 commented Dec 8, 2022

Don't initialize CoordinateSequence in constructor #764

Don't initialize CoordinateSequence in constructor #764

Comments

eyal0 commented Dec 8, 2022

tl;dr

The new code (d463bcb)

The old code (60edf0e)

CoordinateArraySequenceFactory

DefaultCoordinateSequenceFactory

Conclusion

dbaston commented Dec 8, 2022

eyal0 commented Dec 8, 2022

The new code (`d463bcb`)

The old code (`60edf0e`)

`CoordinateArraySequenceFactory`

`DefaultCoordinateSequenceFactory`