Improving binout reading performance significantly #83

@marcodesiderio

Hello! I'm not sure whether the current implementation is written this way for a specific reason, but on my local copy of lasso I was able to significantly speed up reading binout outputs with a small change to the as_df() method in binout.py:

Change from:
for i, j in enumerate(ids):
    df[str(j)] = data.T[i]

to:

df = pd.DataFrame(data=data, index=time_pdi, columns=[str(j) for j in ids])

The improvement is at least 3x on a small binout (~80 MB) and 30x or more on large databases (multiple GB). As an example, a colleague of mine started reading data before a 1 h meeting, and it was still loading after the meeting. I changed the code and read the data in 7 minutes with the modification above, while the old code was not even halfway done. This is especially useful for elout and swforc, for instance, when there are many elements.
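
For anyone who wants to check that the two variants produce the same frame, here is a minimal, self-contained sketch. It does not use lasso itself. The helper names (build_df_per_column, build_df_at_once) and the synthetic data are hypothetical, and it assumes that data has shape (number of time steps, number of ids) and that time_pdi is a pandas Index of time values, which is what the snippets above suggest but which the rest of as_df() would need to confirm.

```python
import numpy as np
import pandas as pd


def build_df_per_column(data, time, ids):
    """Column-by-column construction, mirroring the loop quoted above."""
    df = pd.DataFrame(index=pd.Index(time, name="time"))
    for i, j in enumerate(ids):
        # Each assignment inserts one new column into the existing frame.
        df[str(j)] = data.T[i]
    return df


def build_df_at_once(data, time, ids):
    """Single-call construction, as in the proposed change."""
    return pd.DataFrame(
        data=data,
        index=pd.Index(time, name="time"),
        columns=[str(j) for j in ids],
    )


# Synthetic stand-in for binout data (assumption: rows are time steps,
# columns are element ids).
rng = np.random.default_rng(0)
time = np.linspace(0.0, 1.0, 50)
ids = np.arange(1001, 1021)
data = rng.random((time.size, ids.size))

slow = build_df_per_column(data, time, ids)
fast = build_df_at_once(data, time, ids)

# Compare values and column labels of the two frames.
same = np.array_equal(slow.to_numpy(), fast.to_numpy()) and (
    list(slow.columns) == list(fast.columns)
)
```

The likely reason for the speedup is that the loop inserts columns one at a time, while the single constructor call hands pandas the whole 2D array in one go. Wrapping each helper in timeit with a larger number of ids should make the difference visible on your own machine.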
