EFF Reduce the size of shared objects of the C-extensions generated by Cython #27767

Open
jjerphan opened this issue Nov 12, 2023 · 8 comments
@jjerphan
Member

jjerphan commented Nov 12, 2023

Context

scikit-learn relies on C extensions, written in Cython, for the performance-critical parts of its implementation.

Each C extension is built from one or several Cython translation units (a .pyx file with an optional .pxd companion file).

In scikit-learn, each C extension is built from a single Cython translation unit, which is transpiled to a C or C++ translation unit and then compiled to a shared object file.

The resulting C or C++ translation unit contains the code translated from Cython, plus a large preamble and epilogue of macros, functions, structs, and global variables such as virtual tables, the Python module definition, etc.

For instance, while sklearn/utils/_heap.pyx consists of fewer than 100 lines for a single function, the generated sklearn/utils/_heap.c file has more than 3500 lines, most of them being the preamble and epilogue injected by Cython:

Content of the generated sklearn/utils/_heap.c

▾ macros
   -CYTHON_ABI
   -CYTHON_ASSUME_SAFE_MACROS
   -CYTHON_ASSUME_SAFE_MACROS
   -CYTHON_ASSUME_SAFE_MACROS
   -CYTHON_ASSUME_SAFE_MACROS
   -CYTHON_AVOID_BORROWED_REFS
   -CYTHON_AVOID_BORROWED_REFS
   -CYTHON_AVOID_BORROWED_REFS
   -CYTHON_AVOID_BORROWED_REFS
   -CYTHON_COMPILING_IN_CPYTHON
   -CYTHON_COMPILING_IN_CPYTHON
   -CYTHON_COMPILING_IN_CPYTHON
   -CYTHON_COMPILING_IN_CPYTHON
   -CYTHON_COMPILING_IN_NOGIL
   -CYTHON_COMPILING_IN_NOGIL
   -CYTHON_COMPILING_IN_NOGIL
   -CYTHON_COMPILING_IN_NOGIL
   -CYTHON_COMPILING_IN_PYPY
   -CYTHON_COMPILING_IN_PYPY
   -CYTHON_COMPILING_IN_PYPY
   -CYTHON_COMPILING_IN_PYPY
   -CYTHON_COMPILING_IN_PYSTON
   -CYTHON_COMPILING_IN_PYSTON
   -CYTHON_COMPILING_IN_PYSTON
   -CYTHON_COMPILING_IN_PYSTON
   -CYTHON_FALLTHROUGH
   -CYTHON_FALLTHROUGH
   -CYTHON_FALLTHROUGH
   -CYTHON_FALLTHROUGH
   -CYTHON_FALLTHROUGH
   -CYTHON_FALLTHROUGH
   -CYTHON_FAST_PYCALL
   -CYTHON_FAST_PYCALL
   -CYTHON_FAST_PYCALL
   -CYTHON_FAST_PYCALL
   -CYTHON_FAST_PYCCALL
   -CYTHON_FAST_THREAD_STATE
   -CYTHON_FAST_THREAD_STATE
   -CYTHON_FAST_THREAD_STATE
   -CYTHON_FAST_THREAD_STATE
   -CYTHON_FAST_THREAD_STATE
   -CYTHON_FORMAT_SSIZE_T
   -CYTHON_FUTURE_DIVISION
   -CYTHON_HEX_VERSION
   -CYTHON_INLINE
   -CYTHON_INLINE
   -CYTHON_INLINE
   -CYTHON_INLINE
   -CYTHON_INLINE
   -CYTHON_MAYBE_UNUSED_VAR(x)
   -CYTHON_NCP_UNUSED
   -CYTHON_NCP_UNUSED
   -CYTHON_PEP393_ENABLED
   -CYTHON_PEP393_ENABLED
   -CYTHON_PEP489_MULTI_PHASE_INIT
   -CYTHON_PEP489_MULTI_PHASE_INIT
   -CYTHON_PEP489_MULTI_PHASE_INIT
   -CYTHON_PEP489_MULTI_PHASE_INIT
   -CYTHON_PEP489_MULTI_PHASE_INIT
   -CYTHON_REFNANNY
   -CYTHON_RESTRICT
   -CYTHON_RESTRICT
   -CYTHON_RESTRICT
   -CYTHON_RESTRICT
   -CYTHON_SMALL_CODE
   -CYTHON_SMALL_CODE
   -CYTHON_SMALL_CODE
   -CYTHON_UNPACK_METHODS
   -CYTHON_UNPACK_METHODS
   -CYTHON_UNPACK_METHODS
   -CYTHON_UNPACK_METHODS
   -CYTHON_UNUSED
   -CYTHON_UNUSED
   -CYTHON_UNUSED
   -CYTHON_UNUSED
   -CYTHON_UPDATE_DESCRIPTOR_DOC
   -CYTHON_UPDATE_DESCRIPTOR_DOC
   -CYTHON_UPDATE_DESCRIPTOR_DOC
   -CYTHON_USE_ASYNC_SLOTS
   -CYTHON_USE_ASYNC_SLOTS
   -CYTHON_USE_ASYNC_SLOTS
   -CYTHON_USE_ASYNC_SLOTS
   -CYTHON_USE_ASYNC_SLOTS
   -CYTHON_USE_ASYNC_SLOTS
   -CYTHON_USE_DICT_VERSIONS
   -CYTHON_USE_DICT_VERSIONS
   -CYTHON_USE_DICT_VERSIONS
   -CYTHON_USE_DICT_VERSIONS
   -CYTHON_USE_EXC_INFO_STACK
   -CYTHON_USE_EXC_INFO_STACK
   -CYTHON_USE_EXC_INFO_STACK
   -CYTHON_USE_EXC_INFO_STACK
   -CYTHON_USE_EXC_INFO_STACK
   -CYTHON_USE_PYLIST_INTERNALS
   -CYTHON_USE_PYLIST_INTERNALS
   -CYTHON_USE_PYLIST_INTERNALS
   -CYTHON_USE_PYLIST_INTERNALS
   -CYTHON_USE_PYLONG_INTERNALS
   -CYTHON_USE_PYLONG_INTERNALS
   -CYTHON_USE_PYLONG_INTERNALS
   -CYTHON_USE_PYLONG_INTERNALS
   -CYTHON_USE_PYLONG_INTERNALS
   -CYTHON_USE_PYTYPE_LOOKUP
   -CYTHON_USE_PYTYPE_LOOKUP
   -CYTHON_USE_PYTYPE_LOOKUP
   -CYTHON_USE_PYTYPE_LOOKUP
   -CYTHON_USE_PYTYPE_LOOKUP
   -CYTHON_USE_TP_FINALIZE
   -CYTHON_USE_TP_FINALIZE
   -CYTHON_USE_TP_FINALIZE
   -CYTHON_USE_TP_FINALIZE
   -CYTHON_USE_TYPE_SLOTS
   -CYTHON_USE_TYPE_SLOTS
   -CYTHON_USE_TYPE_SLOTS
   -CYTHON_USE_TYPE_SLOTS
   -CYTHON_USE_UNICODE_INTERNALS
   -CYTHON_USE_UNICODE_INTERNALS
   -CYTHON_USE_UNICODE_INTERNALS
   -CYTHON_USE_UNICODE_INTERNALS
   -CYTHON_USE_UNICODE_WRITER
   -CYTHON_USE_UNICODE_WRITER
   -CYTHON_USE_UNICODE_WRITER
   -CYTHON_USE_UNICODE_WRITER
   -CYTHON_USE_UNICODE_WRITER
   -CYTHON_WITHOUT_ASSERTIONS
   -DL_EXPORT(t)
   -DL_IMPORT(t)
   -HAVE_LONG_LONG
   -METH_FASTCALL
   -METH_STACKLESS
   -PY_LONG_LONG
   -PY_SSIZE_T_CLEAN
   -PyBaseString_Type
   -PyBoolObject
   -PyByteArray_Check(obj)
   -PyIntObject
   -PyInt_AS_LONG
   -PyInt_AsLong
   -PyInt_AsSsize_t
   -PyInt_AsUnsignedLongLongMask
   -PyInt_AsUnsignedLongMask
   -PyInt_Check(op)
   -PyInt_CheckExact(op)
   -PyInt_FromLong
   -PyInt_FromSize_t
   -PyInt_FromSsize_t
   -PyInt_FromString
   -PyInt_FromUnicode
   -PyInt_Type
   -PyMem_RawFree(p)
   -PyMem_RawMalloc(n)
   -PyMem_RawRealloc(p,n)
   -PyNumber_Int
   -PyObject_ASCII(o)
   -PyObject_Format(obj,fmt)
   -PyObject_Free(p)
   -PyObject_Malloc(s)
   -PyObject_Realloc(p)
   -PyObject_Unicode
   -PySet_CheckExact(obj)
   -PyStringObject
   -PyString_Check
   -PyString_CheckExact
   -PyString_Type
   -PyUnicode_1BYTE_KIND
   -PyUnicode_2BYTE_KIND
   -PyUnicode_4BYTE_KIND
   -PyUnicode_Contains(u,s)
   -PyUnicode_InternFromString(s)
   -Py_BUILD_CORE
   -Py_HUGE_VAL
   -Py_TPFLAGS_CHECKTYPES
   -Py_TPFLAGS_HAVE_FINALIZE
   -Py_TPFLAGS_HAVE_INDEX
   -Py_TPFLAGS_HAVE_NEWBUFFER
   -Py_tss_NEEDS_INIT
   -_USE_MATH_DEFINES
   -__PYX_BUILD_PY_SSIZE_T
   -__PYX_COMMA
   -__PYX_DEFAULT_STRING_ENCODING
   -__PYX_DEFAULT_STRING_ENCODING_IS_ASCII
   -__PYX_DEFAULT_STRING_ENCODING_IS_DEFAULT
   -__PYX_DEFAULT_STRING_ENCODING_IS_UTF8
   -__PYX_DICT_VERSION_INIT
   -__PYX_ERR(f_index,lineno,Ln_error)
   -__PYX_EXTERN_C
   -__PYX_EXTERN_C
   -__PYX_GET_DICT_VERSION(dict)
   -__PYX_GET_DICT_VERSION(dict)
   -__PYX_HAVE_API__sklearn__utils___heap
   -__PYX_HAVE__sklearn__utils___heap
   -__PYX_MARK_ERR_POS(f_index,lineno)
   -__PYX_NAN()
   -__PYX_PY_DICT_LOOKUP_IF_MODIFIED(VAR,DICT,LOOKUP)
   -__PYX_PY_DICT_LOOKUP_IF_MODIFIED(VAR,DICT,LOOKUP)
   -__PYX_UPDATE_DICT_CACHE(dict,value,cache_var,version_var)
   -__PYX_UPDATE_DICT_CACHE(dict,value,cache_var,version_var)
   -__PYX_VERIFY_RETURN_INT(target_type,func_type,func_value)
   -__PYX_VERIFY_RETURN_INT_EXC(target_type,func_type,func_value)
   -__PYX__VERIFY_RETURN_INT(target_type,func_type,func_value,exc)
   -__Pyx_BUILTIN_MODULE_NAME
   -__Pyx_BUILTIN_MODULE_NAME
   -__Pyx_CLEAR(r)
   -__Pyx_CLineForTraceback(tstate,c_line)
   -__Pyx_DECREF(r)
   -__Pyx_DECREF(r)
   -__Pyx_DECREF_SET(r,v)
   -__Pyx_DefaultClassType
   -__Pyx_DefaultClassType
   -__Pyx_DefaultClassType
   -__Pyx_ErrFetch(type,value,tb)
   -__Pyx_ErrFetch(type,value,tb)
   -__Pyx_ErrFetchInState(tstate,type,value,tb)
   -__Pyx_ErrFetchWithState(type,value,tb)
   -__Pyx_ErrFetchWithState(type,value,tb)
   -__Pyx_ErrRestore(type,value,tb)
   -__Pyx_ErrRestore(type,value,tb)
   -__Pyx_ErrRestoreInState(tstate,type,value,tb)
   -__Pyx_ErrRestoreWithState(type,value,tb)
   -__Pyx_ErrRestoreWithState(type,value,tb)
   -__Pyx_GIVEREF(r)
   -__Pyx_GIVEREF(r)
   -__Pyx_GOTREF(r)
   -__Pyx_GOTREF(r)
   -__Pyx_HAS_GCC_DIAGNOSTIC
   -__Pyx_INCREF(r)
   -__Pyx_INCREF(r)
   -__Pyx_MODULE_NAME
   -__Pyx_NewRef(obj)
   -__Pyx_Owned_Py_None(b)
   -__Pyx_PyAsyncMethodsStruct
   -__Pyx_PyBaseString_Check(obj)
   -__Pyx_PyBaseString_Check(obj)
   -__Pyx_PyBaseString_CheckExact(obj)
   -__Pyx_PyBaseString_CheckExact(obj)
   -__Pyx_PyByteArray_FromCString(s)
   -__Pyx_PyByteArray_FromString(s)
   -__Pyx_PyByteArray_FromStringAndSize(s,l)
   -__Pyx_PyBytes_AsSString(s)
   -__Pyx_PyBytes_AsString(s)
   -__Pyx_PyBytes_AsUString(s)
   -__Pyx_PyBytes_AsWritableSString(s)
   -__Pyx_PyBytes_AsWritableString(s)
   -__Pyx_PyBytes_AsWritableUString(s)
   -__Pyx_PyBytes_FromCString(s)
   -__Pyx_PyBytes_FromString
   -__Pyx_PyBytes_FromStringAndSize
   -__Pyx_PyCFunctionFast
   -__Pyx_PyCFunctionFastWithKeywords
   -__Pyx_PyCode_HasFreeVars(co)
   -__Pyx_PyCode_HasFreeVars(co)
   -__Pyx_PyCode_New(a,k,l,s,f,code,c,n,v,fv,cell,fn,name,fline,lnos)
   -__Pyx_PyCode_New(a,k,l,s,f,code,c,n,v,fv,cell,fn,name,fline,lnos)
   -__Pyx_PyDict_GetItemStr(dict,name)
   -__Pyx_PyDict_GetItemStr(dict,name)
   -__Pyx_PyDict_NewPresized(n)
   -__Pyx_PyDict_NewPresized(n)
   -__Pyx_PyErr_Clear()
   -__Pyx_PyErr_Clear()
   -__Pyx_PyErr_GivenExceptionMatches(err,type)
   -__Pyx_PyErr_GivenExceptionMatches2(err,type1,type2)
   -__Pyx_PyErr_Occurred()
   -__Pyx_PyErr_Occurred()
   -__Pyx_PyErr_SetNone(exc)
   -__Pyx_PyErr_SetNone(exc)
   -__Pyx_PyErr_SetNone(exc)
   -__Pyx_PyException_Check(obj)
   -__Pyx_PyFastCFunction_Check(func)
   -__Pyx_PyFastCFunction_Check(func)
   -__Pyx_PyFrame_SetLineNumber(frame,lineno)
   -__Pyx_PyFrame_SetLineNumber(frame,lineno)
   -__Pyx_PyInt_AsHash_t
   -__Pyx_PyInt_AsHash_t
   -__Pyx_PyInt_FromHash_t
   -__Pyx_PyInt_FromHash_t
   -__Pyx_PyMODINIT_FUNC
   -__Pyx_PyMODINIT_FUNC
   -__Pyx_PyMODINIT_FUNC
   -__Pyx_PyMODINIT_FUNC
   -__Pyx_PyMODINIT_FUNC
   -__Pyx_PyMethod_New(func,self,klass)
   -__Pyx_PyMethod_New(func,self,klass)
   -__Pyx_PyNumber_Divide(x,y)
   -__Pyx_PyNumber_Divide(x,y)
   -__Pyx_PyNumber_Float(x)
   -__Pyx_PyNumber_InPlaceDivide(x,y)
   -__Pyx_PyNumber_InPlaceDivide(x,y)
   -__Pyx_PyNumber_Int(x)
   -__Pyx_PyNumber_Int(x)
   -__Pyx_PyObject_AsSString(s)
   -__Pyx_PyObject_AsUString(s)
   -__Pyx_PyObject_AsWritableSString(s)
   -__Pyx_PyObject_AsWritableString(s)
   -__Pyx_PyObject_AsWritableUString(s)
   -__Pyx_PyObject_FromCString(s)
   -__Pyx_PyObject_FromString
   -__Pyx_PyObject_FromStringAndSize
   -__Pyx_PyObject_GC_IsFinalized(o)
   -__Pyx_PyObject_GC_IsFinalized(o)
   -__Pyx_PyObject_GetAttrStr(o,n)
   -__Pyx_PySequence_SIZE(seq)
   -__Pyx_PySequence_SIZE(seq)
   -__Pyx_PySequence_Tuple(obj)
   -__Pyx_PyStr_FromCString(s)
   -__Pyx_PyStr_FromString
   -__Pyx_PyStr_FromString
   -__Pyx_PyStr_FromStringAndSize
   -__Pyx_PyStr_FromStringAndSize
   -__Pyx_PyString_Format(a,b)
   -__Pyx_PyString_Format(a,b)
   -__Pyx_PyString_FormatSafe(a,b)
   -__Pyx_PyThreadState_Current
   -__Pyx_PyThreadState_Current
   -__Pyx_PyThreadState_Current
   -__Pyx_PyThreadState_Current
   -__Pyx_PyThreadState_assign
   -__Pyx_PyThreadState_assign
   -__Pyx_PyThreadState_declare
   -__Pyx_PyThreadState_declare
   -__Pyx_PyType_AsAsync(obj)
   -__Pyx_PyType_AsAsync(obj)
   -__Pyx_PyType_AsAsync(obj)
   -__Pyx_PyUnicode_AsUnicode
   -__Pyx_PyUnicode_Concat(a,b)
   -__Pyx_PyUnicode_Concat(a,b)
   -__Pyx_PyUnicode_ConcatSafe(a,b)
   -__Pyx_PyUnicode_ConcatSafe(a,b)
   -__Pyx_PyUnicode_DATA(u)
   -__Pyx_PyUnicode_DATA(u)
   -__Pyx_PyUnicode_FormatSafe(a,b)
   -__Pyx_PyUnicode_FromCString(s)
   -__Pyx_PyUnicode_FromStringAndSize(c_str,size)
   -__Pyx_PyUnicode_FromStringAndSize(c_str,size)
   -__Pyx_PyUnicode_FromUnicode(u)
   -__Pyx_PyUnicode_FromUnicodeAndLength
   -__Pyx_PyUnicode_GET_LENGTH(u)
   -__Pyx_PyUnicode_GET_LENGTH(u)
   -__Pyx_PyUnicode_IS_TRUE(u)
   -__Pyx_PyUnicode_IS_TRUE(u)
   -__Pyx_PyUnicode_IS_TRUE(u)
   -__Pyx_PyUnicode_IS_TRUE(u)
   -__Pyx_PyUnicode_KIND(u)
   -__Pyx_PyUnicode_KIND(u)
   -__Pyx_PyUnicode_MAX_CHAR_VALUE(u)
   -__Pyx_PyUnicode_MAX_CHAR_VALUE(u)
   -__Pyx_PyUnicode_READ(k,d,i)
   -__Pyx_PyUnicode_READ(k,d,i)
   -__Pyx_PyUnicode_READY(op)
   -__Pyx_PyUnicode_READY(op)
   -__Pyx_PyUnicode_READY(op)
   -__Pyx_PyUnicode_READ_CHAR(u,i)
   -__Pyx_PyUnicode_READ_CHAR(u,i)
   -__Pyx_PyUnicode_WRITE(k,d,i,ch)
   -__Pyx_PyUnicode_WRITE(k,d,i,ch)
   -__Pyx_RefNannyDeclarations
   -__Pyx_RefNannyDeclarations
   -__Pyx_RefNannyFinishContext()
   -__Pyx_RefNannyFinishContext()
   -__Pyx_RefNannySetupContext(name,acquire_gil)
   -__Pyx_RefNannySetupContext(name,acquire_gil)
   -__Pyx_RefNannySetupContext(name,acquire_gil)
   -__Pyx_SET_REFCNT(obj,refcnt)
   -__Pyx_SET_REFCNT(obj,refcnt)
   -__Pyx_SET_SIZE(obj,size)
   -__Pyx_SET_SIZE(obj,size)
   -__Pyx_TypeCheck(obj,type)
   -__Pyx_TypeCheck(obj,type)
   -__Pyx_XCLEAR(r)
   -__Pyx_XDECREF(r)
   -__Pyx_XDECREF(r)
   -__Pyx_XDECREF_SET(r,v)
   -__Pyx_XGIVEREF(r)
   -__Pyx_XGIVEREF(r)
   -__Pyx_XGOTREF(r)
   -__Pyx_XGOTREF(r)
   -__Pyx_XINCREF(r)
   -__Pyx_XINCREF(r)
   -__Pyx_fits_Py_ssize_t(v,type,is_signed)
   -__Pyx_long_cast(x)
   -__Pyx_sst_abs(value)
   -__Pyx_sst_abs(value)
   -__Pyx_sst_abs(value)
   -__Pyx_sst_abs(value)
   -__Pyx_sst_abs(value)
   -__Pyx_sst_abs(value)
   -__Pyx_sst_abs(value)
   -__Pyx_truncl
   -__Pyx_truncl
   -__Pyx_uchar_cast(c)
   -__Pyx_void_to_None(void_result)
   -__cdecl
   -__fastcall
   -__has_attribute(x)
   -__has_cpp_attribute(x)
   -__pyx_PyFloat_AsDouble(x)
   -__pyx_PyFloat_AsDouble(x)
   -__pyx_PyFloat_AsFloat(x)
   -__stdcall
   -likely(x)
   -likely(x)
   -offsetof(type,member)
   -unlikely(x)
   -unlikely(x)

▾ prototypes
   -__Pyx_AddTraceback(const char * funcname,int c_line,int py_line,const char * filename)
   -__Pyx_CLineForTraceback(PyThreadState * tstate,int c_line)
   -__Pyx_ErrFetchInState(PyThreadState * tstate,PyObject ** type,PyObject ** value,PyObject ** tb)
   -__Pyx_ErrRestoreInState(PyThreadState * tstate,PyObject * type,PyObject * value,PyObject * tb)
   -__Pyx_ExportFunction(const char * name,void (* f)(void),const char * sig)
   -__Pyx_InitStrings(__Pyx_StringTabEntry * t)
   -__Pyx_IsSubtype(PyTypeObject * a,PyTypeObject * b)
   -__Pyx_PyBool_FromLong(long b)
   -__Pyx_PyErr_GivenExceptionMatches(PyObject * err,PyObject * type)
   -__Pyx_PyErr_GivenExceptionMatches2(PyObject * err,PyObject * type1,PyObject * type2)
   -__Pyx_PyIndex_AsHash_t(PyObject *)
   -__Pyx_PyIndex_AsSsize_t(PyObject *)
   -__Pyx_PyInt_As_int(PyObject *)
   -__Pyx_PyInt_As_long(PyObject *)
   -__Pyx_PyInt_FromSize_t(size_t)
   -__Pyx_PyInt_From_long(long value)
   -__Pyx_PyNumber_IntOrLong(PyObject * x)
   -__Pyx_PyObject_AsString(PyObject *)
   -__Pyx_PyObject_AsStringAndSize(PyObject *,Py_ssize_t * length)
   -__Pyx_PyObject_GetAttrStr(PyObject * obj,PyObject * attr_name)
   -__Pyx_PyObject_IsTrue(PyObject *)
   -__Pyx_PyObject_IsTrueAndDecref(PyObject *)
   -__Pyx_PyUnicode_FromString(const char *)
   -__Pyx_RefNannyImportAPI(const char * modname)
   -__Pyx_check_binary_version(void)
   -__Pyx_get_object_dict_version(PyObject * obj)
   -__Pyx_get_tp_dict_version(PyObject * obj)
   -__Pyx_modinit_function_export_code(void)
   -__Pyx_modinit_function_import_code(void)
   -__Pyx_modinit_global_init_code(void)
   -__Pyx_modinit_type_import_code(void)
   -__Pyx_modinit_type_init_code(void)
   -__Pyx_modinit_variable_export_code(void)
   -__Pyx_modinit_variable_import_code(void)
   -__Pyx_object_dict_version_matches(PyObject * obj,PY_UINT64_T tp_dict_version,PY_UINT64_T obj_dict_version)
   -__pyx_bisect_code_objects(__Pyx_CodeObjectCacheEntry * entries,int count,int code_line)
   -__pyx_find_code_object(int code_line)
   -__pyx_fuse_0__pyx_f_7sklearn_5utils_5_heap_heap_push(float *,__pyx_t_7sklearn_5utils_9_typedefs_intp_t *,__pyx_t_7sklearn_5utils_9_typedefs_intp_t,float,__pyx_t_7sklearn_5utils_9_typedefs_intp_t)
   -__pyx_fuse_1__pyx_f_7sklearn_5utils_5_heap_heap_push(double *,__pyx_t_7sklearn_5utils_9_typedefs_intp_t *,__pyx_t_7sklearn_5utils_9_typedefs_intp_t,double,__pyx_t_7sklearn_5utils_9_typedefs_intp_t)
   -__pyx_insert_code_object(int code_line,PyCodeObject * code_object)
   -__pyx_pymod_create(PyObject * spec,PyModuleDef * def)
   -__pyx_pymod_exec__heap(PyObject * module)
   -init_heap(void)

▾-__anonf7ac09720103 : enum
    [enumerators]
   +__pyx_check_sizeof_voidp

▾ typedefs
   -Py_hash_t
   -Py_tss_t
   -__Pyx_CodeObjectCacheEntry
   -__Pyx_PyAsyncMethodsStruct
   -__Pyx_PyCFunctionFast
   -__Pyx_PyCFunctionFastWithKeywords
   -__Pyx_RefNannyAPIStruct
   -__Pyx_StringTabEntry
   -__pyx_t_7sklearn_5utils_9_typedefs_float32_t
   -__pyx_t_7sklearn_5utils_9_typedefs_float64_t
   -__pyx_t_7sklearn_5utils_9_typedefs_int32_t
   -__pyx_t_7sklearn_5utils_9_typedefs_int64_t
   -__pyx_t_7sklearn_5utils_9_typedefs_intp_t
   -__pyx_t_7sklearn_5utils_9_typedefs_uint32_t
   -__pyx_t_7sklearn_5utils_9_typedefs_uint64_t
   -__pyx_t_7sklearn_5utils_9_typedefs_uint8_t
   -uint32_t
   -uint32_t
   -uint8_t
   -uint8_t

▾-__Pyx_CodeObjectCache : struct
    [members]
   +count
   +entries
   +max_count

▾-__anonf7ac09720208 : struct
    [members]
   +am_aiter
   +am_anext
   +am_await

▾-__anonf7ac09720308 : struct
    [members]
   +encoding
   +intern
   +is_str
   +is_unicode
   +n
   +p
   +s

▾-__anonf7ac09720408 : struct
    [members]
   +DECREF
   +FinishContext
   +GIVEREF
   +GOTREF
   +INCREF
   +SetupContext

▾-__anonf7ac09720508 : struct
    [members]
   +code_line
   +code_object

 -__anonf7ac0972060a : union

▾ variables
   -__PYX_DEFAULT_STRING_ENCODING
   -__Pyx_RefNanny
   -__Pyx_sys_getdefaultencoding_not_ascii
   -__pyx_b
   -__pyx_cfilenm
   -__pyx_clineno
   -__pyx_code_cache
   -__pyx_cython_runtime
   -__pyx_d
   -__pyx_empty_bytes
   -__pyx_empty_tuple
   -__pyx_empty_unicode
   -__pyx_f
   -__pyx_filename
   -__pyx_k_cline_in_traceback
   -__pyx_k_main
   -__pyx_k_name
   -__pyx_k_test
   -__pyx_lineno
   -__pyx_m
   -__pyx_methods
    __pyx_module_is_main_sklearn__utils___heap
   -__pyx_moduledef
   -__pyx_moduledef_slots
   -__pyx_n_s_cline_in_traceback
   -__pyx_n_s_main
   -__pyx_n_s_name
   -__pyx_n_s_test
   -__pyx_string_tab

▾ functions
    CYTHON_MAYBE_UNUSED_VAR(const T &)
   -PyThread_tss_alloc(void)
   -PyThread_tss_create(Py_tss_t * key)
   -PyThread_tss_delete(Py_tss_t * key)
   -PyThread_tss_free(Py_tss_t * key)
   -PyThread_tss_get(Py_tss_t * key)
   -PyThread_tss_is_created(Py_tss_t * key)
   -PyThread_tss_set(Py_tss_t * key,void * value)
   -__PYX_NAN()
   -__Pyx_AddTraceback(const char * funcname,int c_line,int py_line,const char * filename)
   -__Pyx_CLineForTraceback(CYTHON_UNUSED PyThreadState * tstate,int c_line)
   -__Pyx_CreateCodeObjectForTraceback(const char * funcname,int c_line,int py_line,const char * filename)
   -__Pyx_ErrFetchInState(PyThreadState * tstate,PyObject ** type,PyObject ** value,PyObject ** tb)
   -__Pyx_ErrRestoreInState(PyThreadState * tstate,PyObject * type,PyObject * value,PyObject * tb)
  ▾-__Pyx_ExportFunction(const char * name,void (* f)(void),const char * sig)
   -__Pyx_InBases(PyTypeObject * a,PyTypeObject * b)
   -__Pyx_InitCachedBuiltins(void)
   -__Pyx_InitCachedConstants(void)
   -__Pyx_InitGlobals(void)
   -__Pyx_InitStrings(__Pyx_StringTabEntry * t)
   -__Pyx_IsSubtype(PyTypeObject * a,PyTypeObject * b)
   -__Pyx_PyBool_FromLong(long b)
   -__Pyx_PyCode_New(int a,int k,int l,int s,int f,PyObject * code,PyObject * c,PyObject * n,PyObject * v,PyObject * fv,PyObject * cell,PyObject * fn,PyObject * name,int fline,PyObject * lnos)
   -__Pyx_PyErr_GivenExceptionMatches(PyObject * err,PyObject * exc_type)
   -__Pyx_PyErr_GivenExceptionMatches2(PyObject * err,PyObject * exc_type1,PyObject * exc_type2)
   -__Pyx_PyErr_GivenExceptionMatchesTuple(PyObject * exc_type,PyObject * tuple)
   -__Pyx_PyIndex_AsHash_t(PyObject * o)
   -__Pyx_PyIndex_AsSsize_t(PyObject * b)
   -__Pyx_PyInt_As_int(PyObject * x)
   -__Pyx_PyInt_As_long(PyObject * x)
   -__Pyx_PyInt_FromSize_t(size_t ival)
   -__Pyx_PyInt_From_long(long value)
   -__Pyx_PyNumber_IntOrLong(PyObject * x)
   -__Pyx_PyNumber_IntOrLongWrongResultType(PyObject * result,const char * type_name)
   -__Pyx_PyObject_AsString(PyObject * o)
   -__Pyx_PyObject_AsStringAndSize(PyObject * o,Py_ssize_t * length)
   -__Pyx_PyObject_GetAttrStr(PyObject * obj,PyObject * attr_name)
   -__Pyx_PyObject_IsTrue(PyObject * x)
   -__Pyx_PyObject_IsTrueAndDecref(PyObject * x)
   -__Pyx_PyUnicode_AsStringAndSize(PyObject * o,Py_ssize_t * length)
   -__Pyx_PyUnicode_AsStringAndSize(PyObject * o,Py_ssize_t * length)
   -__Pyx_PyUnicode_FromString(const char * c_str)
   -__Pyx_Py_UNICODE_strlen(const Py_UNICODE * u)
   -__Pyx_RefNannyImportAPI(const char * modname)
   -__Pyx_check_binary_version(void)
   -__Pyx_get_object_dict_version(PyObject * obj)
   -__Pyx_get_tp_dict_version(PyObject * obj)
   -__Pyx_init_sys_getdefaultencoding_params(void)
   -__Pyx_init_sys_getdefaultencoding_params(void)
   -__Pyx_inner_PyErr_GivenExceptionMatches2(PyObject * err,PyObject * exc_type1,PyObject * exc_type2)
   -__Pyx_inner_PyErr_GivenExceptionMatches2(PyObject * err,PyObject * exc_type1,PyObject * exc_type2)
   -__Pyx_is_valid_index(Py_ssize_t i,Py_ssize_t limit)
   -__Pyx_modinit_function_export_code(void)
   -__Pyx_modinit_function_import_code(void)
   -__Pyx_modinit_global_init_code(void)
   -__Pyx_modinit_type_import_code(void)
   -__Pyx_modinit_type_init_code(void)
   -__Pyx_modinit_variable_export_code(void)
   -__Pyx_modinit_variable_import_code(void)
   -__Pyx_object_dict_version_matches(PyObject * obj,PY_UINT64_T tp_dict_version,PY_UINT64_T obj_dict_version)
   -__Pyx_pretend_to_initialize(void * ptr)
   -__pyx_bisect_code_objects(__Pyx_CodeObjectCacheEntry * entries,int count,int code_line)
   -__pyx_find_code_object(int code_line)
   -__pyx_fuse_0__pyx_f_7sklearn_5utils_5_heap_heap_push(float * __pyx_v_values,__pyx_t_7sklearn_5utils_9_typedefs_intp_t * __pyx_v_indices,__pyx_t_7sklearn_5utils_9_typedefs_intp_t __pyx_v_size,float __pyx_v_val,__pyx_t_7sklearn_5utils_9_typedefs_intp_t __pyx_v_val_idx)
   -__pyx_fuse_1__pyx_f_7sklearn_5utils_5_heap_heap_push(double * __pyx_v_values,__pyx_t_7sklearn_5utils_9_typedefs_intp_t * __pyx_v_indices,__pyx_t_7sklearn_5utils_9_typedefs_intp_t __pyx_v_size,double __pyx_v_val,__pyx_t_7sklearn_5utils_9_typedefs_intp_t __pyx_v_val_idx)
   -__pyx_insert_code_object(int code_line,PyCodeObject * code_object)
    init_heap(void)

Problem

Currently, the uncompressed size of scikit-learn is around 48.8MB, 20MB of which are shared object files. As reported by @rth in pyodide/pyodide#4289, although shared object files are optimized quite heavily for Emscripten, they still account for most of the size of scikit-learn on this stack.

Extensions' shared object sizes on Linux
find . -name \*.so -exec du -h {} \; | sort -h --reverse
2,1M	./sklearn/_loss/_loss.cpython-312-x86_64-linux-gnu.so
712K	./sklearn/utils/sparsefuncs_fast.cpython-312-x86_64-linux-gnu.so
672K	./sklearn/neighbors/_kd_tree.cpython-312-x86_64-linux-gnu.so
672K	./sklearn/neighbors/_ball_tree.cpython-312-x86_64-linux-gnu.so
624K	./sklearn/tree/_tree.cpython-312-x86_64-linux-gnu.so
608K	./sklearn/metrics/_dist_metrics.cpython-312-x86_64-linux-gnu.so
520K	./sklearn/datasets/_svmlight_format_fast.cpython-312-x86_64-linux-gnu.so
504K	./sklearn/preprocessing/_target_encoder_fast.cpython-312-x86_64-linux-gnu.so
476K	./sklearn/svm/_libsvm.cpython-312-x86_64-linux-gnu.so
448K	./sklearn/svm/_libsvm_sparse.cpython-312-x86_64-linux-gnu.so
448K	./sklearn/metrics/_pairwise_distances_reduction/_middle_term_computer.cpython-312-x86_64-linux-gnu.so
440K	./sklearn/metrics/_pairwise_distances_reduction/_datasets_pair.cpython-312-x86_64-linux-gnu.so
440K	./sklearn/linear_model/_cd_fast.cpython-312-x86_64-linux-gnu.so
404K	./sklearn/cluster/_k_means_elkan.cpython-312-x86_64-linux-gnu.so
396K	./sklearn/utils/_cython_blas.cpython-312-x86_64-linux-gnu.so
396K	./sklearn/cluster/_k_means_common.cpython-312-x86_64-linux-gnu.so
384K	./sklearn/preprocessing/_csr_polynomial_expansion.cpython-312-x86_64-linux-gnu.so
352K	./sklearn/metrics/_pairwise_distances_reduction/_radius_neighbors.cpython-312-x86_64-linux-gnu.so
340K	./sklearn/tree/_splitter.cpython-312-x86_64-linux-gnu.so
340K	./sklearn/cluster/_hdbscan/_tree.cpython-312-x86_64-linux-gnu.so
324K	./sklearn/metrics/_pairwise_distances_reduction/_argkmin.cpython-312-x86_64-linux-gnu.so
324K	./sklearn/linear_model/_sgd_fast.cpython-312-x86_64-linux-gnu.so
312K	./sklearn/tree/_criterion.cpython-312-x86_64-linux-gnu.so
312K	./sklearn/cluster/_k_means_lloyd.cpython-312-x86_64-linux-gnu.so
308K	./sklearn/ensemble/_hist_gradient_boosting/splitting.cpython-312-x86_64-linux-gnu.so
308K	./sklearn/cluster/_hdbscan/_reachability.cpython-312-x86_64-linux-gnu.so
304K	./sklearn/cluster/_hierarchical_fast.cpython-312-x86_64-linux-gnu.so
292K	./sklearn/metrics/_pairwise_distances_reduction/_base.cpython-312-x86_64-linux-gnu.so
280K	./sklearn/linear_model/_sag_fast.cpython-312-x86_64-linux-gnu.so
276K	./sklearn/ensemble/_hist_gradient_boosting/histogram.cpython-312-x86_64-linux-gnu.so
272K	./sklearn/svm/_liblinear.cpython-312-x86_64-linux-gnu.so
268K	./sklearn/utils/_seq_dataset.cpython-312-x86_64-linux-gnu.so
264K	./sklearn/neighbors/_quad_tree.cpython-312-x86_64-linux-gnu.so
260K	./sklearn/metrics/_pairwise_distances_reduction/_radius_neighbors_classmode.cpython-312-x86_64-linux-gnu.so
256K	./sklearn/_isotonic.cpython-312-x86_64-linux-gnu.so
252K	./sklearn/cluster/_k_means_minibatch.cpython-312-x86_64-linux-gnu.so
248K	./sklearn/utils/_fast_dict.cpython-312-x86_64-linux-gnu.so
248K	./sklearn/tree/_utils.cpython-312-x86_64-linux-gnu.so
248K	./sklearn/metrics/_pairwise_fast.cpython-312-x86_64-linux-gnu.so
244K	./sklearn/decomposition/_online_lda_fast.cpython-312-x86_64-linux-gnu.so
240K	./sklearn/metrics/_pairwise_distances_reduction/_argkmin_classmode.cpython-312-x86_64-linux-gnu.so
232K	./sklearn/utils/_typedefs.cpython-312-x86_64-linux-gnu.so
232K	./sklearn/utils/_isfinite.cpython-312-x86_64-linux-gnu.so
220K	./sklearn/utils/arrayfuncs.cpython-312-x86_64-linux-gnu.so
220K	./sklearn/ensemble/_gradient_boosting.cpython-312-x86_64-linux-gnu.so
216K	./sklearn/ensemble/_hist_gradient_boosting/utils.cpython-312-x86_64-linux-gnu.so
212K	./sklearn/utils/_random.cpython-312-x86_64-linux-gnu.so
212K	./sklearn/utils/murmurhash.cpython-312-x86_64-linux-gnu.so
212K	./sklearn/ensemble/_hist_gradient_boosting/_predictor.cpython-312-x86_64-linux-gnu.so
212K	./sklearn/decomposition/_cdnmf_fast.cpython-312-x86_64-linux-gnu.so
212K	./sklearn/cluster/_hdbscan/_linkage.cpython-312-x86_64-linux-gnu.so
204K	./sklearn/metrics/cluster/_expected_mutual_info_fast.cpython-312-x86_64-linux-gnu.so
204K	./sklearn/manifold/_barnes_hut_tsne.cpython-312-x86_64-linux-gnu.so
192K	./sklearn/utils/_weight_vector.cpython-312-x86_64-linux-gnu.so
184K	./sklearn/manifold/_utils.cpython-312-x86_64-linux-gnu.so
184K	./sklearn/ensemble/_hist_gradient_boosting/_gradient_boosting.cpython-312-x86_64-linux-gnu.so
184K	./sklearn/cluster/_dbscan_inner.cpython-312-x86_64-linux-gnu.so
180K	./sklearn/ensemble/_hist_gradient_boosting/_bitset.cpython-312-x86_64-linux-gnu.so
180K	./sklearn/ensemble/_hist_gradient_boosting/_binning.cpython-312-x86_64-linux-gnu.so
128K	./sklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so
112K	./sklearn/ensemble/_hist_gradient_boosting/common.cpython-312-x86_64-linux-gnu.so
80K	    ./sklearn/feature_extraction/_hashing_fast.cpython-312-x86_64-linux-gnu.so
48K	    ./sklearn/utils/_openmp_helpers.cpython-312-x86_64-linux-gnu.so
32K	    ./sklearn/svm/_newrand.cpython-312-x86_64-linux-gnu.so
28K	    ./sklearn/utils/_sorting.cpython-312-x86_64-linux-gnu.so
28K	    ./sklearn/neighbors/_partition_nodes.cpython-312-x86_64-linux-gnu.so
24K	    ./sklearn/utils/_heap.cpython-312-x86_64-linux-gnu.so
24K	    ./sklearn/__check_build/_check_build.cpython-312-x86_64-linux-gnu.so
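
For platforms without find and du, a similar measurement can be sketched in Python. This is only an illustration and not part of scikit-learn: the function names `shared_object_sizes` and `total_size` are made up for this example, and the sketch reports file sizes in bytes (`st_size`) rather than disk usage, so the numbers can differ slightly from what du prints.

```python
from pathlib import Path


def shared_object_sizes(root, pattern="*.so"):
    """Return (size_in_bytes, path) pairs for matching files under root, largest first."""
    sizes = [(p.stat().st_size, p) for p in Path(root).rglob(pattern) if p.is_file()]
    # Sort on the size only, so that paths are never compared with each other.
    return sorted(sizes, key=lambda item: item[0], reverse=True)


def total_size(sizes):
    """Sum the sizes returned by shared_object_sizes."""
    return sum(size for size, _ in sizes)


if __name__ == "__main__":
    sizes = shared_object_sizes(".")
    for size, path in sizes:
        print(f"{size / 1024:8.0f}K\t{path}")
    print(f"total: {total_size(sizes) / 1024 ** 2:.1f}M")
```

Running it from the root of a built checkout lists the extensions from largest to smallest and prints their combined size.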

Possible solutions

Strip all symbols and optimize for size

This can be done by adding -Wl,--strip-all to extra_link_args and -Os -g0 to extra_compile_args.
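
As a rough sketch of what this could look like in a setuptools-based build (not the actual scikit-learn build code): the helper name `size_optimized_build_args` is hypothetical, and restricting the flags to Linux is an assumption made here because `-Wl,--strip-all` targets GNU-style linkers.

```python
import sys


def size_optimized_build_args(platform=sys.platform):
    """Return Extension keyword arguments that favor small shared objects.

    The flags are the ones quoted above. Assumption: they are only applied on
    Linux, where GCC/Clang and a GNU-compatible linker are expected; other
    toolchains may reject them.
    """
    if not platform.startswith("linux"):
        return {"extra_compile_args": [], "extra_link_args": []}
    return {
        "extra_compile_args": ["-Os", "-g0"],  # optimize for size, no debug info
        "extra_link_args": ["-Wl,--strip-all"],  # ask the linker to strip all symbols
    }


# Hypothetical usage in a setup.py (module name and source path are examples):
#
#     from setuptools import Extension
#     ext = Extension(
#         "sklearn.utils._heap",
#         sources=["sklearn/utils/_heap.pyx"],
#         **size_optimized_build_args(),
#     )
```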

In practice, this can significantly shrink shared objects (up to nearly 50% size reduction):

Extensions' shared object sizes on Linux after stripping all symbols and optimizing for size
find . -name \*.so -exec du -h {} \; | sort -h --reverse
1,2M	sklearn/_loss/_loss.cpython-312-x86_64-linux-gnu.so
560K	sklearn/tree/_tree.cpython-312-x86_64-linux-gnu.so
468K	sklearn/neighbors/_kd_tree.cpython-312-x86_64-linux-gnu.so
468K	sklearn/neighbors/_ball_tree.cpython-312-x86_64-linux-gnu.so
460K	sklearn/utils/sparsefuncs_fast.cpython-312-x86_64-linux-gnu.so
460K	sklearn/metrics/_dist_metrics.cpython-312-x86_64-linux-gnu.so
380K	sklearn/datasets/_svmlight_format_fast.cpython-312-x86_64-linux-gnu.so
376K	sklearn/svm/_libsvm.cpython-312-x86_64-linux-gnu.so
352K	sklearn/svm/_libsvm_sparse.cpython-312-x86_64-linux-gnu.so
336K	sklearn/tree/_splitter.cpython-312-x86_64-linux-gnu.so
320K	sklearn/preprocessing/_target_encoder_fast.cpython-312-x86_64-linux-gnu.so
320K	sklearn/metrics/_pairwise_distances_reduction/_middle_term_computer.cpython-312-x86_64-linux-gnu.so
316K	sklearn/metrics/_pairwise_distances_reduction/_datasets_pair.cpython-312-x86_64-linux-gnu.so
312K	sklearn/utils/_cython_blas.cpython-312-x86_64-linux-gnu.so
308K	sklearn/linear_model/_cd_fast.cpython-312-x86_64-linux-gnu.so
300K	sklearn/tree/_criterion.cpython-312-x86_64-linux-gnu.so
280K	sklearn/preprocessing/_csr_polynomial_expansion.cpython-312-x86_64-linux-gnu.so
280K	sklearn/metrics/_pairwise_distances_reduction/_radius_neighbors.cpython-312-x86_64-linux-gnu.so
280K	sklearn/cluster/_k_means_elkan.cpython-312-x86_64-linux-gnu.so
276K	sklearn/cluster/_k_means_common.cpython-312-x86_64-linux-gnu.so
252K	sklearn/linear_model/_sgd_fast.cpython-312-x86_64-linux-gnu.so
248K	sklearn/cluster/_hdbscan/_tree.cpython-312-x86_64-linux-gnu.so
244K	sklearn/metrics/_pairwise_distances_reduction/_argkmin.cpython-312-x86_64-linux-gnu.so
240K	sklearn/tree/_utils.cpython-312-x86_64-linux-gnu.so
240K	sklearn/ensemble/_hist_gradient_boosting/splitting.cpython-312-x86_64-linux-gnu.so
236K	sklearn/cluster/_hierarchical_fast.cpython-312-x86_64-linux-gnu.so
236K	sklearn/cluster/_hdbscan/_reachability.cpython-312-x86_64-linux-gnu.so
232K	sklearn/svm/_liblinear.cpython-312-x86_64-linux-gnu.so
228K	sklearn/cluster/_k_means_lloyd.cpython-312-x86_64-linux-gnu.so
224K	sklearn/metrics/_pairwise_distances_reduction/_base.cpython-312-x86_64-linux-gnu.so
212K	sklearn/neighbors/_quad_tree.cpython-312-x86_64-linux-gnu.so
208K	sklearn/linear_model/_sag_fast.cpython-312-x86_64-linux-gnu.so
204K	sklearn/ensemble/_hist_gradient_boosting/histogram.cpython-312-x86_64-linux-gnu.so
200K	sklearn/utils/_seq_dataset.cpython-312-x86_64-linux-gnu.so
196K	sklearn/utils/_fast_dict.cpython-312-x86_64-linux-gnu.so
196K	sklearn/metrics/_pairwise_distances_reduction/_radius_neighbors_classmode.cpython-312-x86_64-linux-gnu.so
196K	sklearn/_isotonic.cpython-312-x86_64-linux-gnu.so
196K	sklearn/decomposition/_online_lda_fast.cpython-312-x86_64-linux-gnu.so
196K	sklearn/cluster/_k_means_minibatch.cpython-312-x86_64-linux-gnu.so
192K	sklearn/utils/_isfinite.cpython-312-x86_64-linux-gnu.so
192K	sklearn/metrics/_pairwise_fast.cpython-312-x86_64-linux-gnu.so
192K	sklearn/metrics/_pairwise_distances_reduction/_argkmin_classmode.cpython-312-x86_64-linux-gnu.so
184K	sklearn/utils/_typedefs.cpython-312-x86_64-linux-gnu.so
180K	sklearn/utils/arrayfuncs.cpython-312-x86_64-linux-gnu.so
180K	sklearn/ensemble/_hist_gradient_boosting/utils.cpython-312-x86_64-linux-gnu.so
180K	sklearn/ensemble/_gradient_boosting.cpython-312-x86_64-linux-gnu.so
172K	sklearn/utils/_random.cpython-312-x86_64-linux-gnu.so
168K	sklearn/utils/murmurhash.cpython-312-x86_64-linux-gnu.so
168K	sklearn/ensemble/_hist_gradient_boosting/_predictor.cpython-312-x86_64-linux-gnu.so
168K	sklearn/decomposition/_cdnmf_fast.cpython-312-x86_64-linux-gnu.so
168K	sklearn/cluster/_hdbscan/_linkage.cpython-312-x86_64-linux-gnu.so
160K	sklearn/metrics/cluster/_expected_mutual_info_fast.cpython-312-x86_64-linux-gnu.so
160K	sklearn/manifold/_barnes_hut_tsne.cpython-312-x86_64-linux-gnu.so
152K	sklearn/utils/_weight_vector.cpython-312-x86_64-linux-gnu.so
152K	sklearn/manifold/_utils.cpython-312-x86_64-linux-gnu.so
152K	sklearn/ensemble/_hist_gradient_boosting/_gradient_boosting.cpython-312-x86_64-linux-gnu.so
152K	sklearn/cluster/_dbscan_inner.cpython-312-x86_64-linux-gnu.so
148K	sklearn/ensemble/_hist_gradient_boosting/_bitset.cpython-312-x86_64-linux-gnu.so
144K	sklearn/ensemble/_hist_gradient_boosting/_binning.cpython-312-x86_64-linux-gnu.so
112K	sklearn/utils/_vector_sentinel.cpython-312-x86_64-linux-gnu.so
96K	sklearn/ensemble/_hist_gradient_boosting/common.cpython-312-x86_64-linux-gnu.so
68K	sklearn/feature_extraction/_hashing_fast.cpython-312-x86_64-linux-gnu.so
44K	sklearn/utils/_openmp_helpers.cpython-312-x86_64-linux-gnu.so
32K	sklearn/svm/_newrand.cpython-312-x86_64-linux-gnu.so
28K	sklearn/utils/_sorting.cpython-312-x86_64-linux-gnu.so
28K	sklearn/neighbors/_partition_nodes.cpython-312-x86_64-linux-gnu.so
24K	sklearn/utils/_heap.cpython-312-x86_64-linux-gnu.so
24K	sklearn/__check_build/_check_build.cpython-312-x86_64-linux-gnu.so
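
A listing like the one above can be reproduced with a short script, which makes it easier to compare sizes before and after a change to the build. This is only a minimal sketch: the function names are hypothetical, and it reports apparent file sizes rather than disk blocks, so the numbers may differ slightly from block-based tools such as `du`.

```python
from pathlib import Path


def shared_object_sizes(root):
    """Return (size_in_bytes, relative_path) pairs for every .so under root, largest first."""
    root = Path(root)
    sizes = [
        (path.stat().st_size, path.relative_to(root).as_posix())
        for path in root.rglob("*.so")
        if path.is_file()
    ]
    return sorted(sizes, reverse=True)


def format_report(sizes):
    """Format sizes in KiB, one line per file, followed by a total line."""
    lines = [f"{size / 1024:.0f}K\t{name}" for size, name in sizes]
    total = sum(size for size, _ in sizes)
    lines.append(f"{total / 1024:.0f}K\ttotal")
    return "\n".join(lines)
```

Calling `print(format_report(shared_object_sizes(".")))` from the root of a built source tree would list every extension module, largest first.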

Group several translation units within C extensions (and use interprocedural optimization)

This would let shared objects reuse symbols that are currently duplicated, and would enable optimizations across several translation units (such as function inlining).

@rth
Member

rth commented Nov 13, 2023

Thanks for investigating @jjerphan !

Strip all symbols

In Pyodide I think we are already stripping most of them, which is why they are 2x smaller there. Outside of the browser, I'm not sure to what extent stripping debug information is a good idea. If there is a segfault, it's easier to investigate with debug information.

@jjerphan
Member Author

By default, debug symbols are not included; SKLEARN_BUILD_ENABLE_DEBUG_SYMBOLS must be set for them to be present:

os.environ.get("SKLEARN_BUILD_ENABLE_DEBUG_SYMBOLS", "0") != "0"
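
To illustrate how such a check could drive the choice between keeping and stripping symbols, here is a minimal sketch. It is not scikit-learn's actual build configuration: the helper names are hypothetical, and the flags (`-g`, `-g0`, `-Wl,--strip-all`) are assumptions that only apply to GCC/Clang-style toolchains with a GNU-compatible linker.

```python
import os


def debug_symbols_enabled(environ=os.environ):
    # Same test as the expression quoted above: any value other than "0" enables symbols.
    return environ.get("SKLEARN_BUILD_ENABLE_DEBUG_SYMBOLS", "0") != "0"


def symbol_flags(environ=os.environ):
    """Hypothetical helper: pick compile and link flags depending on the variable."""
    if debug_symbols_enabled(environ):
        # Keep debug information for development and CI debugging.
        return {"compile": ["-g"], "link": []}
    # Assumed flags: emit no debug info and ask the linker to strip all symbols.
    return {"compile": ["-g0"], "link": ["-Wl,--strip-all"]}
```

With this shape, stripping is the default and developers opt back in by exporting the variable before building.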

@thomasjpfan, @ogrisel, @jeremiedbb, @lorentzenchr, @Micky774: Do you think we should strip all symbols and optimize for size? Or do you think it is worth keeping symbols unchanged?

@Micky774
Contributor

I think it would be reasonable to strip symbols, since I imagine the vast majority of our user base doesn't actually use them. Folks can always build from source if needed. It's mainly helpful on CI to avoid non-descriptive ??? errors at the Cython level.

@thomasjpfan
Member

I am +1 on stripping the symbols by default. If we need the symbols for development, then we set the environment variable to enable them.

@jjerphan
Member Author

jjerphan commented Dec 18, 2023

I am trying to see whether there are other options to reduce the size of the native extensions' shared objects.

Do you see anything else?

@rth
Member

rth commented Dec 19, 2023

Thanks for the suggestions!

For the WASM use case, I think we are already stripping symbols in Pyodide, which is why the .so files are 2x smaller than on, say, x86_64:

$ pyodide auditwheel exports sklearn/utils/_random.cpython-311-wasm32-emscripten.so
sklearn/utils/_random.cpython-311-wasm32-emscripten.so:
      FUNC	__wasm_call_ctors
      FUNC	__wasm_apply_data_relocs
      FUNC	PyInit__random
    GLOBAL	__pyx_module_is_main_sklearn__utils___random
$  ls -lh sklearn/utils/_random.cpython-311-wasm32-emscripten.so
-rw-------@ 1 rth  staff   121K Sep 25 22:40 sklearn/utils/_random.cpython-311-wasm32-emscripten.so

However it's still rather large with likely some duplicate objects between .so (aside from the exported symbols).

@jjerphan
Member Author

However it's still rather large with likely some duplicate objects between .so (aside from the exported symbols).

Interprocedural optimizations (such as link-time optimization) might help reduce the size of shared objects, since they generally remove duplicated objects across the translation units of each shared object.

Ideally, objects should not be duplicated across shared objects either. Resolving cython/cython#2356 seems relevant in this regard, but I do not know of other mitigations.
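
One way to gauge how much is duplicated across shared objects is to compare the symbols each one defines. The sketch below assumes the text output of `nm --defined-only` has been captured for each file (in the default `<address> <type> <name>` layout) and that the files have not been stripped, since stripped files list no such symbols. The function names are hypothetical, and matching by name alone is only a rough proxy for duplicated code.

```python
from collections import defaultdict


def defined_symbols(nm_output):
    """Parse `nm --defined-only` text output and return the set of symbol names."""
    symbols = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Expected layout: "<address> <type> <name>"; skip anything else.
        if len(parts) == 3:
            symbols.add(parts[2])
    return symbols


def symbols_shared_across(nm_outputs):
    """Map each symbol defined in more than one file to the sorted files defining it."""
    owners = defaultdict(set)
    for filename, text in nm_outputs.items():
        for symbol in defined_symbols(text):
            owners[symbol].add(filename)
    return {s: sorted(files) for s, files in owners.items() if len(files) > 1}
```

Passing a dictionary that maps each `.so` path to its captured `nm` output returns the symbol names that appear in several files, which gives a starting list of candidates for sharing.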

I am afraid I do not have time to have a look at this issue right now. I'll try to see if I can explore solutions soon.

