Alvaro-Kothe
Contributor


This patch fixes the memory leak by decrementing the reference count of every object obtained from PyObject_GetAttrString, which returns a new reference that the caller owns.

I used this script to reproduce the memory leak:

import pandas as pd

for _ in range(10_000):
    df = pd.DataFrame({'col1': [12.34]}, 
                      index=pd.date_range('1/1/2019', '10/1/2019', freq="D", tz="UTC"))
    result = df.reset_index().to_json()

and pinpointed the problem with valgrind, which reported the following:

==59316== 87,674,752 bytes in 2,739,836 blocks are definitely lost in loss record 32,561 of 32,561
==59316==    at 0x4842B26: malloc (vg_replace_malloc.c:446)
==59316==    by 0x49AC853: UnknownInlinedFun (obmalloc.c:62)
==59316==    by 0x49AC853: UnknownInlinedFun (obmalloc.c:982)
==59316==    by 0x49AC853: UnknownInlinedFun (obmalloc.c:2238)
==59316==    by 0x49AC853: UnknownInlinedFun (obmalloc.c:1400)
==59316==    by 0x49AC853: UnknownInlinedFun (longobject.c:209)
==59316==    by 0x49AC853: PyLong_FromLong (longobject.c:305)
==59316==    by 0x43559F5C: __pyx_getprop_6pandas_5_libs_6tslibs_10timestamps_10_Timestamp_year (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/tslibs/timestamps.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x49E765C: _PyObject_GenericGetAttrWithDict (object.c:1665)
==59316==    by 0x49B8B4A: UnknownInlinedFun (object.c:1751)
==59316==    by 0x49B8B4A: PyObject_GetAttr (object.c:1261)
==59316==    by 0x49F6741: PyObject_GetAttrString (object.c:1131)
==59316==    by 0x42F9C027: convert_pydatetime_to_datetimestruct (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/pandas_datetime.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x42F9C2C1: PyDateTimeToEpoch (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/pandas_datetime.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x52ACA92D: Object_beginTypeContext (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/json.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x52ACCB9C: encode (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/json.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x52ACCE41: encode (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/json.cpython-313-x86_64-linux-gnu.so)
==59316==    by 0x52ACCE41: encode (in /home/alvaro/projects/oss/pandas/build/cp313/pandas/_libs/json.cpython-313-x86_64-linux-gnu.so)
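
The trace above points at objects created through PyObject_GetAttrString and never released. The ownership rule behind the leak can be sketched from Python with ctypes. This is only an illustration, not the pandas C code: the Holder class and the attr_refcounts helper are hypothetical names, and the return type is deliberately declared as a raw pointer so that ctypes does not release the reference on our behalf.

```python
import ctypes
import sys

# Declare the C API signatures. Using c_void_p as the return type is a
# deliberate choice for this sketch: ctypes then hands back a raw address
# and does NOT take ownership of the new reference.
get_attr = ctypes.pythonapi.PyObject_GetAttrString
get_attr.argtypes = [ctypes.py_object, ctypes.c_char_p]
get_attr.restype = ctypes.c_void_p

dec_ref = ctypes.pythonapi.Py_DecRef
dec_ref.argtypes = [ctypes.c_void_p]
dec_ref.restype = None


class Holder:
    """Hypothetical stand-in for an object whose attribute is read from C."""


def attr_refcounts(release: bool) -> tuple[int, int]:
    """Return the attribute's refcount before and after one C-level lookup."""
    holder = Holder()
    holder.value = object()

    before = sys.getrefcount(holder.value)
    ptr = get_attr(holder, b"value")  # new strong reference, owned by the caller
    if release:
        dec_ref(ptr)  # the step that was missing in the leaking code path
    after = sys.getrefcount(holder.value)
    return before, after


leaky_before, leaky_after = attr_refcounts(release=False)
fixed_before, fixed_after = attr_refcounts(release=True)
print("without release:", leaky_after - leaky_before)
print("with release:   ", fixed_after - fixed_before)
```

Without the release step the count stays one higher after each lookup, so the object can never be freed. In the reproduction script this happens for each attribute read on each timestamp, which is consistent with the millions of lost blocks valgrind reports.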

@mroeschke mroeschke added this to the 3.0 milestone Aug 28, 2025
@mroeschke mroeschke added Performance Memory or execution speed performance IO JSON read_json, to_json, json_normalize labels Aug 28, 2025
@mroeschke mroeschke merged commit 0e21777 into pandas-dev:main Aug 28, 2025
46 checks passed
@mroeschke
Member

Thanks @Alvaro-Kothe

Successfully merging this pull request may close these issues.

BUG: memory leak in to_json when converting DateTime values