PERF: to_json very slow with lines=True #14408

joshowen opened this Issue Oct 12, 2016 · 1 comment



joshowen commented Oct 12, 2016

A small, complete example of the issue

import numpy as np
from pandas import DataFrame

N = 100000
C = 5

In [6]: df = DataFrame(dict([('float{0}'.format(i), np.random.randn(N)) for i in range(C)]))

In [7]: df.to_json('foo.json',orient='records',lines=True)

In [8]: %timeit df.to_json('foo.json',orient='records',lines=True)
1 loop, best of 3: 3.66 s per loop

In [9]: %timeit df.to_json('foo.json',orient='records')
10 loops, best of 3: 98.8 ms per loop

As discussed in #14391
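
For readers unfamiliar with the two output shapes being timed, here is a minimal sketch using only the standard library json module, not pandas' own serializer. With orient='records' the result is a single JSON array; adding lines=True yields one JSON object per line. The helper names to_records_json and to_json_lines are hypothetical and exist only for this illustration, and the sample values are made up.

```python
import json


def to_records_json(records):
    """Serialize a list of dicts as one JSON array (the orient='records' shape)."""
    return json.dumps(records, separators=(",", ":"))


def to_json_lines(records):
    """Serialize a list of dicts as one JSON object per line (the lines=True shape)."""
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records)


# Two illustrative rows with the same column naming as the example above.
records = [
    {"float0": 0.5, "float1": -1.25},
    {"float0": 2.0, "float1": 3.5},
]

array_text = to_records_json(records)
lines_text = to_json_lines(records)

print(array_text)  # one array holding every record
print(lines_text)  # one record per line, no enclosing brackets
```

Both shapes carry the same records, so the gap in the timings above (3.66 s versus 98.8 ms, roughly 37x) suggests the extra cost lies in how the lines are produced rather than in the amount of data written.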




jreback commented Oct 12, 2016

jreback added a commit to jreback/pandas that referenced this issue Oct 15, 2016

jreback closed this in 7cad3f1 Oct 15, 2016

tworec pushed a commit to RTBHOUSE/pandas that referenced this issue Oct 21, 2016

jorisvandenbossche added a commit that referenced this issue Nov 1, 2016
