
frame values access is slow with large frames with different dtypes #3629


Description

@cpcloud

In addition, there is a memory leak: each repeated evaluation of big.values consumes roughly the amount of RAM already taken up by the array.

from pandas import DataFrame, concat
from numpy.random import randint, randn
from string import ascii_letters as letters

ncols = 16
nrows = int(1e7)  # must be an int; randn/randint reject float sizes
letters = list(letters)

# 16 float64 columns
df = DataFrame(randn(nrows, ncols), columns=letters[:ncols])
# one integer column, making the frame mixed-dtype
col = DataFrame(randint(1, size=nrows), columns=[letters[ncols]])
big = concat([df, col], axis=1)
big.values  # very slow and leaky
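
For a mixed-dtype frame, .values has to build a new ndarray of a common dtype on every access, which accounts for the slowness; the leak shows up if you watch resident memory across repeated accesses. A minimal sketch for observing that growth, assuming psutil is installed (psutil and the 5-iteration loop are not part of the original report):

import os
import psutil  # assumed dependency, used only to read resident memory

proc = psutil.Process(os.getpid())
for i in range(5):
    _ = big.values  # each access rebuilds the combined array from the mixed-dtype blocks
    rss_mb = proc.memory_info().rss / 2 ** 20
    print("iteration %d: RSS = %.0f MB" % (i, rss_mb))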

Metadata


Labels: Dtype Conversions (unexpected or buggy dtype conversions), Performance (memory or execution speed performance)
