Improve performance of list-like objects insertion into DataFrames #6398

Closed
AndreyPavlenko opened this issue Jul 19, 2023 · 0 comments · Fixed by #6412 or #6476
Labels
new feature/request 💬: Requests and pull requests for new features
P2: Minor bugs or low-priority feature requests

Comments

@AndreyPavlenko
Collaborator

Inserting a list-like object:

df["A"] = range(1000000)

is much slower than inserting the same object wrapped in a Modin object:

df["A"] = pd.Series(range(1000000))

In the first case, performance could be improved significantly if the Modin object were created automatically.
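
A rough way to compare the two insertion paths is sketched below. The helper names `time_insertion` and `compare` are hypothetical, and the pre-existing column "B" is an assumption, since the issue does not show how `df` was created. The dataframe module is passed in as a parameter so the same code can be pointed at `modin.pandas` or plain `pandas`. Only the assignment call itself is timed; a backend that defers work may do part of it later, so the numbers should be read as indicative only.

```python
import time


def time_insertion(pd, n_rows, wrap):
    """Return the seconds spent assigning range(n_rows) as column "A".

    `pd` is a pandas-compatible module (for example modin.pandas).
    With wrap=True the range is wrapped into pd.Series before the assignment.
    """
    # Assumed starting point: a frame that already holds one column of n_rows.
    df = pd.DataFrame({"B": range(n_rows)})
    values = range(n_rows)

    start = time.perf_counter()
    df["A"] = pd.Series(values) if wrap else values
    return time.perf_counter() - start


def compare(pd, n_rows=1_000_000):
    """Time both variants from the issue and return them in a dict."""
    return {
        "raw range": time_insertion(pd, n_rows, wrap=False),
        "wrapped in Series": time_insertion(pd, n_rows, wrap=True),
    }
```

With Modin installed, this could be run as `import modin.pandas as pd; print(compare(pd))`.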

@AndreyPavlenko AndreyPavlenko added the new feature/request 💬 Requests and pull requests for new features label Jul 19, 2023
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jul 19, 2023
…ertion into DataFrames

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
@AndreyPavlenko AndreyPavlenko added dependencies 🔗 Issues related to dependencies and removed dependencies 🔗 Issues related to dependencies labels Jul 19, 2023
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jul 20, 2023
…ertion into DataFrames

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jul 24, 2023
…sertion into DataFrames

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jul 25, 2023
…sertion into DataFrames

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jul 26, 2023
…sertion into DataFrames

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jul 27, 2023
…sertion into DataFrames

Wrap list-like object into a single-column frame before the insertion.
In case of the HDK backend: if the partition contains either pandas DataFrame or
pyarrow Table, insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
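
The idea in this commit message (wrap the list-like object into a single-column frame, then join the frames by row id) can be illustrated with plain pandas. This is only a sketch of the concept, not the Modin or HDK implementation: the function name `insert_by_row_position` is hypothetical, and a positional index stands in for the backend's rowid.

```python
import pandas as pd


def insert_by_row_position(df, name, values):
    """Return a new frame equal to df plus `values` as column `name`."""
    # Step 1: wrap the list-like object into a single-column frame.
    column = pd.DataFrame({name: list(values)})
    if len(column) != len(df):
        raise ValueError("length of values does not match the number of rows")

    # Step 2: join the two frames on row position (the stand-in for rowid).
    # Both sides carry a 0..n-1 index here, so join() aligns them row by row.
    joined = df.reset_index(drop=True).join(column)

    # Restore the original row labels on the result.
    joined.index = df.index
    return joined
```

For example, `insert_by_row_position(pd.DataFrame({"B": [10, 20, 30]}), "A", range(3))` would return a frame with columns "B" and "A".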
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jul 28, 2023
…sertion into DataFrames

Wrap list-like object into a single-column frame before the insertion.
In case of the HDK backend: if the partition contains either pandas DataFrame or
pyarrow Table, insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jul 29, 2023
…sertion into DataFrames

Wrap list-like object into a single-column frame before the insertion.
In case of the HDK backend: if the partition contains either pandas DataFrame or
pyarrow Table, insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Aug 9, 2023
…sertion into DataFrames

Wrap list-like object into a single-column frame before the insertion.
In case of the HDK backend: if the partition contains either pandas DataFrame or
pyarrow Table, insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Aug 9, 2023
…ts insertion into DataFrames

Wrap a list-like object into a single-column query compiler before the insertion.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Aug 9, 2023
…sertion into DataFrames

Wrap a list-like object into a single-column query compiler before the insertion.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Aug 10, 2023
…sertion into DataFrames

Wrap a list-like object into a single-column query compiler before the insertion.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
@mvashishtha mvashishtha added the P2 Minor bugs or low-priority feature requests label Aug 17, 2023
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Aug 22, 2023
…sertion into DataFrames

Wrap list-like object into a single-column frame before the insertion.
In case of the HDK backend: if the partition contains either pandas DataFrame or
pyarrow Table, insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
dchigarev pushed a commit that referenced this issue Aug 24, 2023
…DataFrames (#6476)

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Sep 25, 2023
…sertion into DataFrames

Wrap list-like object into a single-column frame before the insertion.
In case of the HDK backend: if the partition contains either pandas DataFrame or
pyarrow Table, insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Sep 25, 2023
…sertion into DataFrames

If the partition contains either pandas DataFrame or pyarrow Table,
insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Sep 27, 2023
…sertion into DataFrames

If the partition contains either pandas DataFrame or pyarrow Table,
insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jan 22, 2024
…sertion into DataFrames

If the partition contains either pandas DataFrame or pyarrow Table,
insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jan 22, 2024
…sertion into HDK DataFrames

If the partition contains either pandas DataFrame or pyarrow Table,
insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
AndreyPavlenko added a commit to AndreyPavlenko/modin that referenced this issue Jan 26, 2024
…sertion into HDK DataFrames

If the partition contains either pandas DataFrame or pyarrow Table,
insert the object directly into the frame/table, otherwise create
a single-column frame and join the frames by rowid.

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
anmyachev added a commit that referenced this issue Jan 26, 2024
…HDK DataFrames (#6412)

Signed-off-by: Andrey Pavlenko <andrey.a.pavlenko@gmail.com>
Co-authored-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
Co-authored-by: Anatoly Myachev <anatoly.myachev@intel.com>
2 participants