polars hangup using multi-process #7535

Closed
2 tasks done
yanxiu0614 opened this issue Mar 14, 2023 · 4 comments
Labels
bug Something isn't working rust Related to Rust Polars

Comments

@yanxiu0614

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

When I use polars with a process pool to speed things up, the whole program hangs and the submitted work never completes. After debugging, I found that the following code reproduces the blocking. It is a minimal reproduction and probably makes no sense beyond that. If I remove read_csv from the read function, or use pandas read_csv instead, everything works as expected.

Reproducible example

import polars as pl
import numpy as np
from concurrent.futures import ProcessPoolExecutor


def read():
    df = pl.DataFrame({"value": np.random.standard_normal(700)}).write_csv("test.csv")
    _df = pl.read_csv("test.csv")
    return _df


def minmax(series):
    minval = series.min()
    return (series[-1] - minval) / (series.max() - minval)


def rolling(_df):
    print("enter")
    _df.with_columns(_df["value"].rolling_apply(minmax, 50).alias("score"))
    print("exit")


if __name__ == '__main__':
    df = pl.DataFrame({"value": np.random.standard_normal(700)})
    pool = ProcessPoolExecutor(1)
    for i in read():
        pool.submit(rolling, df)
    pool.shutdown()

Expected behavior

The program prints "enter" and then "exit".

Installed versions

OS: Linux x64 (Debian bullseye or Ubuntu jammy)
Python: 3.9.2 or 3.10.6
polars: 0.16.12
@yanxiu0614 yanxiu0614 added bug Something isn't working rust Related to Rust Polars labels Mar 14, 2023
@cmdlineluser
Contributor

https://pola-rs.github.io/polars-book/user-guide/howcani/multiprocessing.html#combining-polars-with-pythons-multiprocessing

For the locking issue, you'd need to use the spawn or forkserver start method instead of the default fork.

e.g.

ProcessPoolExecutor(..., mp_context=multiprocessing.get_context("spawn"))
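
A minimal sketch of that suggestion, using only the standard library so it runs without polars installed. The names `minmax_last` and `make_executor` are hypothetical, and the pure-Python worker is only a stand-in for the polars call in the reproducer. The underlying assumption is that a forked worker inherits the parent's lock state without the threads that would release those locks, whereas a spawned worker starts from a fresh interpreter.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def minmax_last(values):
    """Scale the last value into [0, 1] using the window's min and max.

    Hypothetical pure-Python stand-in for the polars work done in the worker.
    """
    lo, hi = min(values), max(values)
    return (values[-1] - lo) / (hi - lo)


def make_executor(max_workers=1, start_method="spawn"):
    """Build a process pool whose workers are started fresh rather than forked."""
    ctx = multiprocessing.get_context(start_method)
    return ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx)


if __name__ == "__main__":
    # The guard matters with "spawn": each worker re-imports this module,
    # and unguarded pool creation would run again inside the worker.
    with make_executor() as pool:
        future = pool.submit(minmax_last, [3.0, 1.0, 2.0])
        print(future.result())  # (2.0 - 1.0) / (3.0 - 1.0) = 0.5
```

With spawn, the submitted function and its arguments must be picklable and importable from the main module, which is why the worker is defined at module level.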

@yanxiu0614
Author

I confirm that I need to go through all the docs again, thanks for your help.

@tocab

tocab commented Jun 2, 2023

The page linked by @cmdlineluser no longer exists. Is there a new resource for using multiprocessing with polars?

@alicanb

alicanb commented Jun 27, 2023
