PERF: to_json very slow with lines=True #14408

Closed
joshowen opened this Issue Oct 12, 2016 · 1 comment

joshowen commented Oct 12, 2016

A small, complete example of the issue

import numpy as np
from pandas import DataFrame

N = 100000
C = 5

In [6]: df = DataFrame(dict([('float{0}'.format(i), np.random.randn(N)) for i in range(C)]))

In [7]: df.to_json('foo.json',orient='records',lines=True)

In [8]: %timeit df.to_json('foo.json',orient='records',lines=True)
1 loop, best of 3: 3.66 s per loop

In [9]: %timeit df.to_json('foo.json',orient='records')
10 loops, best of 3: 98.8 ms per loop

As discussed in #14391
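
For readers unfamiliar with the two output shapes being timed, here is a minimal standard-library sketch of the difference between a single JSON array of records and line-delimited JSON (one object per line). It is not how pandas implements `lines=True`; the helper name `records_to_json_lines` and the sample values are hypothetical and used only for illustration.

```python
import json


def records_to_json_lines(records):
    """Serialize an iterable of dicts as line-delimited JSON.

    Hypothetical helper for illustration only; not part of pandas.
    Each record becomes one JSON object on its own line.
    """
    return "\n".join(json.dumps(rec) for rec in records)


# Two made-up rows shaped like the float columns in the example above.
records = [
    {"float0": 0.5, "float1": -1.25},
    {"float0": 2.0, "float1": 3.5},
]

# Shape comparable to orient='records': one JSON array holding every row.
as_array = json.dumps(records)

# Shape comparable to orient='records', lines=True: one JSON object per line.
as_lines = records_to_json_lines(records)

print(as_array)
print(as_lines)
```

Both forms carry the same data, so a large timing gap between them points at how the line-delimited text is produced rather than at the serialization of the values themselves.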

jreback commented Oct 12, 2016

jreback added a commit to jreback/pandas that referenced this issue Oct 15, 2016

@jreback jreback closed this in 7cad3f1 Oct 15, 2016

tworec pushed a commit to RTBHOUSE/pandas that referenced this issue Oct 21, 2016

jorisvandenbossche added a commit that referenced this issue Nov 1, 2016
