New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use std::vector instead of DynArray(on going) #3832

Open
wants to merge 1 commit into
base: develop
from

Conversation

Projects
None yet
3 participants
@MikeLing
Contributor

MikeLing commented Jun 4, 2017

So this PR is a part of DynArray refactoring(#3824), the reason why we don't use std::vector here is because [1] and [2].

[1] https://stackoverflow.com/a/17794965
[2] https://stackoverflow.com/a/8399470

@MikeLing MikeLing changed the title from use std::deque instead of DynamicArray(on going) to use std::deque instead of DynArray(on going) Jun 4, 2017

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Jun 4, 2017

Contributor

@lisitsyn could you give me some feedback for this pr? why so many error comes out but not directly relate to DynArray? Thanks a lot!

Contributor

MikeLing commented Jun 4, 2017

@lisitsyn could you give me some feedback for this pr? why so many error comes out but not directly relate to DynArray? Thanks a lot!

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Jun 5, 2017

Contributor
Contributor

MikeLing commented Jun 5, 2017

@MikeLing

This comment has been minimized.

Show comment
Hide comment
@MikeLing

MikeLing Jun 7, 2017

Contributor

sorry @vigsterkr , I failed to solve the error in SplittingStrategy.standard test. I had update the pr and wise you could give me some guide on that issue and pr itself. Maybe it's just a little error and i somehow miss it. But I'm too tired to move on today. I will keep working on it until it's done. :)

BTW, learn a lot of things from this pr and I wise I could use them in the std::vector refactoring. Sorry, I went into an wrong direction and waste a bunch of time. And, I waste a lot of time in building and rebuilding :(

Contributor

MikeLing commented Jun 7, 2017

sorry @vigsterkr , I failed to solve the error in SplittingStrategy.standard test. I had update the pr and wise you could give me some guide on that issue and pr itself. Maybe it's just a little error and i somehow miss it. But I'm too tired to move on today. I will keep working on it until it's done. :)

BTW, learn a lot of things from this pr and I wise I could use them in the std::vector refactoring. Sorry, I went into an wrong direction and waste a bunch of time. And, I waste a lot of time in building and rebuilding :(

@vigsterkr

unfortunately i have a bad news.... probably deque will not work here (i.e. we need to go back to vector and handle CDynamicArray as a special case). see my comments...

Show outdated Hide outdated src/shogun/evaluation/StratifiedCrossValidationSplitting.cpp
Show outdated Hide outdated src/shogun/lib/DynamicArray.h
Show outdated Hide outdated src/shogun/lib/DynamicArray.h
Show outdated Hide outdated src/shogun/lib/DynamicArray.h
Show outdated Hide outdated src/shogun/lib/DynamicArray.h
"free_array",
"whether array must be freed",
MS_NOT_AVAILABLE);
SG_ADD(

This comment has been minimized.

@vigsterkr

vigsterkr Jun 8, 2017

Member

by removing

m_parameters->add_vector(&m_array.array, &m_array.current_num_elements, "array", "Memory for dynamic array.");

basically it means that the serialization of this object will not be correct... as the deque is not going to be serialized.
it would be easy to add a vector element to m_parameters if we would use std::vector as that is a continuous memory block... unfortunately it's not the case of deque. but we still need a way to be able to serialize deque's content :S

@lisitsyn btw how is this gonna be handled with tags? m_parameters->add_vector

@vigsterkr

vigsterkr Jun 8, 2017

Member

by removing

m_parameters->add_vector(&m_array.array, &m_array.current_num_elements, "array", "Memory for dynamic array.");

basically it means that the serialization of this object will not be correct... as the deque is not going to be serialized.
it would be easy to add a vector element to m_parameters if we would use std::vector as that is a continuous memory block... unfortunately it's not the case of deque. but we still need a way to be able to serialize deque's content :S

@lisitsyn btw how is this gonna be handled with tags? m_parameters->add_vector

This comment has been minimized.

@vigsterkr

vigsterkr Jun 8, 2017

Member

this is why some of the python modular tests fails

@vigsterkr

vigsterkr Jun 8, 2017

Member

this is why some of the python modular tests fails

Show outdated Hide outdated src/shogun/lib/DynamicArray.h
@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Jun 8, 2017

Member

or of course there's an option to actually not return the array itself but only an iterator... and refactor some parts of shogun to use iterator to copy CDynamicArray... this of course still does not solve our issue with serialization.

Member

vigsterkr commented Jun 8, 2017

or of course there's an option to actually not return the array itself but only an iterator... and refactor some parts of shogun to use iterator to copy CDynamicArray... this of course still does not solve our issue with serialization.

private:
/** register parameters */
virtual void init()
{

This comment has been minimized.

@MikeLing

MikeLing Jun 10, 2017

Contributor

@vigsterkr Hi, could you tell me something about set_generic<>()?
I got bunch of error like


/Users/mikeling/shogun/build/src/shogun/base/class_list.cpp:815:110: error: use
      of class template 'CDynamicArray' requires template arguments
  ...g) { return g == PT_NOT_GENERIC? new CDynamicArray(): NULL; }
                                          ^
/Users/mikeling/shogun/src/shogun/base/DynArray.h:34:33: note: template is
      declared here
        template<class U> friend class CDynamicArray;
        ~~~~~~~~~~~~~~~~~              ^
/Users/mikeling/shogun/build/src/shogun/base/class_list.cpp:815:131: error: 
      expected ':'
  ...g) { return g == PT_NOT_GENERIC? new CDynamicArray(): NULL; }
                                                               ^
                                                               : 
/Users/mikeling/shogun/build/src/shogun/base/class_list.cpp:815:104: note: to
      match this '?'
  ...g) { return g == PT_NOT_GENERIC? new CDynamicArray(): NULL; }
                                    ^
/Users/mikeling/shogun/build/src/shogun/base/class_list.cpp:815:131: error: 
      expected expression
  ...g) { return g == PT_NOT_GENERIC? new CDynamicArray(): NULL; }
@MikeLing

MikeLing Jun 10, 2017

Contributor

@vigsterkr Hi, could you tell me something about set_generic<>()?
I got bunch of error like


/Users/mikeling/shogun/build/src/shogun/base/class_list.cpp:815:110: error: use
      of class template 'CDynamicArray' requires template arguments
  ...g) { return g == PT_NOT_GENERIC? new CDynamicArray(): NULL; }
                                          ^
/Users/mikeling/shogun/src/shogun/base/DynArray.h:34:33: note: template is
      declared here
        template<class U> friend class CDynamicArray;
        ~~~~~~~~~~~~~~~~~              ^
/Users/mikeling/shogun/build/src/shogun/base/class_list.cpp:815:131: error: 
      expected ':'
  ...g) { return g == PT_NOT_GENERIC? new CDynamicArray(): NULL; }
                                                               ^
                                                               : 
/Users/mikeling/shogun/build/src/shogun/base/class_list.cpp:815:104: note: to
      match this '?'
  ...g) { return g == PT_NOT_GENERIC? new CDynamicArray(): NULL; }
                                    ^
/Users/mikeling/shogun/build/src/shogun/base/class_list.cpp:815:131: error: 
      expected expression
  ...g) { return g == PT_NOT_GENERIC? new CDynamicArray(): NULL; }
@vigsterkr

please rename the PR as it's now std::vector again ;)

with those fixes there are still unit tests failing:

	  1 - unit-DynamicObjectArray (SEGFAULT)
	  8 - unit-SGObject (OTHER_FAULT)
	 12 - unit-GaussianProcessClassification (SEGFAULT)
	 73 - unit-LineReaderTest (Failed)
	 81 - unit-CommUlongStringKernel (SEGFAULT)
	222 - unit-LogPlusOne (SEGFAULT)
	223 - unit-MultipleProcessors (SEGFAULT)
	226 - unit-RescaleFeatures (SEGFAULT)
	265 - unit-SerializationAscii (OTHER_FAULT)
	266 - unit-SerializationHDF5 (OTHER_FAULT)
	267 - unit-SerializationJSON (OTHER_FAULT)
	268 - unit-SerializationXML (OTHER_FAULT)
	343 - libshogun-evaluation_cross_validation_multiclass_mkl (OTHER_FAULT)

which means there are still(!) bugs in this refactor... i'm getting a bit worried here... as this is simple coding problem nothing else

Show outdated Hide outdated src/shogun/lib/DynamicArray.h
Show outdated Hide outdated src/shogun/lib/DynamicArray.h
set_array(bool* p_array, int32_t p_num_elements, int32_t p_array_size)
{
if (m_array != NULL && free_array)
SG_FREE(m_array);

This comment has been minimized.

@vigsterkr

vigsterkr Jun 11, 2017

Member

this would be totally broken if anybody ever would take use_sg_mallocs seriously, as what if use_sg_mallocs = false, SG_FREE would try to free a none owned memory region... but yeah please just remove the whole use_sg_mallocs logic

@vigsterkr

vigsterkr Jun 11, 2017

Member

this would be totally broken if anybody ever would take use_sg_mallocs seriously, as what if use_sg_mallocs = false, SG_FREE would try to free a none owned memory region... but yeah please just remove the whole use_sg_mallocs logic

This comment has been minimized.

@MikeLing

MikeLing Jun 14, 2017

Contributor

I had remove all the use_sg_mallocs logic in this push :)

@MikeLing

MikeLing Jun 14, 2017

Contributor

I had remove all the use_sg_mallocs logic in this push :)

}
/** shuffles the array with external random state */
inline void shuffle(CRandom* rand)

This comment has been minimized.

@vigsterkr

vigsterkr Jun 11, 2017

Member

we do not want to have different API for shuffle... this is a very very bad approach!
use the unified one below...

@vigsterkr

vigsterkr Jun 11, 2017

Member

we do not want to have different API for shuffle... this is a very very bad approach!
use the unified one below...

This comment has been minimized.

@MikeLing

MikeLing Jun 14, 2017

Contributor

mmm, but after look more into it, I found maybe use shuffle(sg_rand) in

is a better idea. Sorry for remove it before

@MikeLing

MikeLing Jun 14, 2017

Contributor

mmm, but after look more into it, I found maybe use shuffle(sg_rand) in

is a better idea. Sorry for remove it before

This comment has been minimized.

@vigsterkr

vigsterkr Jun 14, 2017

Member

ok... anyhow we can refactor this in your next task as that one will heavily touch the random of shogun itself. but we'll discuss that later once DynArray removal is fully done.

@vigsterkr

vigsterkr Jun 14, 2017

Member

ok... anyhow we can refactor this in your next task as that one will heavily touch the random of shogun itself. but we'll discuss that later once DynArray removal is fully done.

Show outdated Hide outdated src/shogun/lib/DynamicArray.h
Show outdated Hide outdated src/shogun/lib/DynamicArray.h
Show outdated Hide outdated src/shogun/lib/DynamicArray.h
Show outdated Hide outdated src/shogun/lib/DynamicArray.h
@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Jun 11, 2017

Member

based on gdb of GaussianProcessClassification.apply_preprocessor_and_binary the problem is of course is in shogun::CDynamicArray<bool>::set_element; it seems it goes into an infinite recursive callstack as there are more than 20000+ set_element calls on the stack.... please try to copy-paste code without ERRORS

Member

vigsterkr commented Jun 11, 2017

based on gdb of GaussianProcessClassification.apply_preprocessor_and_binary the problem is of course is in shogun::CDynamicArray<bool>::set_element; it seems it goes into an infinite recursive callstack as there are more than 20000+ set_element calls on the stack.... please try to copy-paste code without ERRORS

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Jun 11, 2017

Member

@MikeLing one more thing. please dont even push any commits until you dont have ctest running successfully locally...

Member

vigsterkr commented Jun 11, 2017

@MikeLing one more thing. please dont even push any commits until you dont have ctest running successfully locally...

@vigsterkr

This comment has been minimized.

Show comment
Hide comment
@vigsterkr

vigsterkr Jun 11, 2017

Member

@MikeLing would be great to add unit-tests in the other PR to cover these errors, so that next time if there's a refactor the problem is easily caught

Member

vigsterkr commented Jun 11, 2017

@MikeLing would be great to add unit-tests in the other PR to cover these errors, so that next time if there's a refactor the problem is easily caught

@vigsterkr vigsterkr changed the title from use std::deque instead of DynArray(on going) to use std::vector instead of DynArray(on going) Jun 14, 2017

@@ -621,21 +774,13 @@ template <class T> class CDynamicArray :public CSGObject
virtual void init()
{
set_generic<T>();
T* head = m_array.data();
m_parameters->add_vector(
&head, &num_elements, "array", "Memory for dynamic array.");

This comment has been minimized.

@MikeLing

MikeLing Jun 15, 2017

Contributor

Does this really broken the serialization? I really have no idea, will look into it after I address the another DynArray refactor pr

@MikeLing

MikeLing Jun 15, 2017

Contributor

Does this really broken the serialization? I really have no idea, will look into it after I address the another DynArray refactor pr

This comment has been minimized.

@vigsterkr

vigsterkr Jun 15, 2017

Member

i would use m_array.size() instead of num_elements for the size of vector, as you actually wanna have the same vector deserialized, right? meaning if you have m_array.size() = 15 but num_elements = 5, you still wanna have the same sized m_array deserialized...so
m_parameters->add_vector(&head, &(m_array.size()), ....

@vigsterkr

vigsterkr Jun 15, 2017

Member

i would use m_array.size() instead of num_elements for the size of vector, as you actually wanna have the same vector deserialized, right? meaning if you have m_array.size() = 15 but num_elements = 5, you still wanna have the same sized m_array deserialized...so
m_parameters->add_vector(&head, &(m_array.size()), ....

This comment has been minimized.

@MikeLing

MikeLing Jun 15, 2017

Contributor

OK! Thank you! I will try it before next push. I'm working on the DynamicObjectArray refactor, so I can test if all these implementation on DynamicArray can be implemented on DynamicObjectArray also. Also can helps me polish this pr at the same time

@MikeLing

MikeLing Jun 15, 2017

Contributor

OK! Thank you! I will try it before next push. I'm working on the DynamicObjectArray refactor, so I can test if all these implementation on DynamicArray can be implemented on DynamicObjectArray also. Also can helps me polish this pr at the same time

This comment has been minimized.

@MikeLing

MikeLing Jun 15, 2017

Contributor

Hi @vigsterkr and @lisitsyn , what's the relationship between these parameters and SGObject::clone()? All these parameters will be clone into generate a new Object? And serialization ?

@MikeLing

MikeLing Jun 15, 2017

Contributor

Hi @vigsterkr and @lisitsyn , what's the relationship between these parameters and SGObject::clone()? All these parameters will be clone into generate a new Object? And serialization ?

This comment has been minimized.

@karlnapf

karlnapf Jun 15, 2017

Member

Yes, everything that is registered will be part of serialization and clone (and equals)

@karlnapf

karlnapf Jun 15, 2017

Member

Yes, everything that is registered will be part of serialization and clone (and equals)

This comment has been minimized.

@MikeLing

MikeLing Jun 16, 2017

Contributor

mmm, m_parameters->add_vector(&head, &(m_array.size()), .... doesn't work, and it broken the clone progress of DynamicArray. Here is how I implement it.

		virtual CSGObject* clone()
		{
			CDynamicArray * cloned = (CDynamicArray*) CSGObject::clone();
			// Since the array vector is registered with
			// current_num_elements as size (see parameter
			// registration) the cloned version has less memory
			// allocated than known to dynarray. We fix this here.
			// cloned->num_elements = cloned->m_array.size();
			return cloned;
		}

	private:

		/** register parameters */
		virtual void init()
		{
			set_generic<T>();
			T* head = m_array.data();
                        // & doesn't accept a rvalue and it seems like auto will make 'size' unrecognisable for add_vector()
			int32_t size = m_array.size();
			m_parameters->add_vector(
			    &head, &size, "array", "Memory for dynamic array.");

			SG_ADD(
			    &free_array, "free_array", "whether array must be freed",
			    MS_NOT_AVAILABLE);
			SG_ADD(&dim1_size, "dim1_size", "Dimension 1", MS_NOT_AVAILABLE);
			SG_ADD(&dim2_size, "dim2_size", "Dimension 2", MS_NOT_AVAILABLE);
			SG_ADD(&dim3_size, "dim3_size", "Dimension 3", MS_NOT_AVAILABLE);
		}

Progress stop in here

[ RUN ] DynamicObjectArray.array_SpectrumMismatchRBFKernel_clone_equals
shogun-unit-test(91199,0x7fffe91903c0) malloc: *** error for object 0x10637f125: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6

And here is some lldb debug message https://pastebin.mozilla.org/9024822

@MikeLing

MikeLing Jun 16, 2017

Contributor

mmm, m_parameters->add_vector(&head, &(m_array.size()), .... doesn't work, and it broken the clone progress of DynamicArray. Here is how I implement it.

		virtual CSGObject* clone()
		{
			CDynamicArray * cloned = (CDynamicArray*) CSGObject::clone();
			// Since the array vector is registered with
			// current_num_elements as size (see parameter
			// registration) the cloned version has less memory
			// allocated than known to dynarray. We fix this here.
			// cloned->num_elements = cloned->m_array.size();
			return cloned;
		}

	private:

		/** register parameters */
		virtual void init()
		{
			set_generic<T>();
			T* head = m_array.data();
                        // & doesn't accept a rvalue and it seems like auto will make 'size' unrecognisable for add_vector()
			int32_t size = m_array.size();
			m_parameters->add_vector(
			    &head, &size, "array", "Memory for dynamic array.");

			SG_ADD(
			    &free_array, "free_array", "whether array must be freed",
			    MS_NOT_AVAILABLE);
			SG_ADD(&dim1_size, "dim1_size", "Dimension 1", MS_NOT_AVAILABLE);
			SG_ADD(&dim2_size, "dim2_size", "Dimension 2", MS_NOT_AVAILABLE);
			SG_ADD(&dim3_size, "dim3_size", "Dimension 3", MS_NOT_AVAILABLE);
		}

Progress stop in here

[ RUN ] DynamicObjectArray.array_SpectrumMismatchRBFKernel_clone_equals
shogun-unit-test(91199,0x7fffe91903c0) malloc: *** error for object 0x10637f125: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6

And here is some lldb debug message https://pastebin.mozilla.org/9024822

@@ -611,7 +764,7 @@ template <class T> class CDynamicArray :public CSGObject
// current_num_elements as size (see parameter
// registration) the cloned version has less memory
// allocated than known to dynarray. We fix this here.
cloned->m_array.num_elements = cloned->m_array.current_num_elements;
// cloned->num_elements = cloned->m_array.size();

This comment has been minimized.

@MikeLing

MikeLing Jun 15, 2017

Contributor

This one the another place may broken serialization. But it will broken SGObject if not comment it. Maybe we need a better way to register parameter.

@MikeLing

MikeLing Jun 15, 2017

Contributor

This one the another place may broken serialization. But it will broken SGObject if not comment it. Maybe we need a better way to register parameter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment