python takes long time when return big data #91127

Closed
HumberMe mannequin opened this issue Mar 10, 2022 · 5 comments
Labels
3.8 (only security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage)

Comments

@HumberMe
Mannequin

HumberMe mannequin commented Mar 10, 2022

BPO 46971
Nosy @mdickinson, @HumberMe
Files
  • python_performance_issue.py: reproducer
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-03-10.11:28:55.847>
    created_at = <Date 2022-03-10.09:40:49.007>
    labels = ['interpreter-core', 'invalid', '3.8', 'performance']
    title = 'python takes long time when return big data'
    updated_at = <Date 2022-03-11.17:26:56.933>
    user = 'https://github.com/HumberMe'

    bugs.python.org fields:

    activity = <Date 2022-03-11.17:26:56.933>
    actor = 'mark.dickinson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-03-10.11:28:55.847>
    closer = 'mark.dickinson'
    components = ['Interpreter Core']
    creation = <Date 2022-03-10.09:40:49.007>
    creator = 'HumberMe'
    dependencies = []
    files = ['50665']
    hgrepos = []
    issue_num = 46971
    keywords = []
    message_count = 5.0
    messages = ['414836', '414842', '414883', '414885', '414923']
    nosy_count = 2.0
    nosy_names = ['mark.dickinson', 'HumberMe']
    pr_nums = []
    priority = 'normal'
    resolution = 'not a bug'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue46971'
    versions = ['Python 3.8']

    @HumberMe HumberMe mannequin added the labels 3.8 (only security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage) on Mar 10, 2022
    @HumberMe
    Mannequin Author

    HumberMe mannequin commented Mar 10, 2022

    It takes a long time when Python returns big data.
    Generally, when a function returns something, it takes less than 1e-5 seconds,
    but when the result is big, like np.random.rand(2048,3,224,224), the time cost increases to 0.1-0.2 seconds.

    @mdickinson
    Member

    This is expected. Your timing measures the time for garbage collection of the large arrays in addition to the time for the result to be returned.

    In the line result = myfunc(), the name result gets rebound to the value of myfunc(). That means that result is unbound from whatever it was previously bound to, and the old value then gets garbage collected.
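The rebinding behaviour described above can be observed directly. The following is only a minimal sketch, assuming CPython's reference counting; `Big`, `events`, and this version of `myfunc` are hypothetical stand-ins and are not taken from the attached reproducer.

```python
import weakref


class Big:
    """Hypothetical stand-in for a large array."""

    def __init__(self, label):
        self.label = label


events = []


def myfunc(label):
    obj = Big(label)
    # Log a message at the moment this particular object is destroyed.
    weakref.finalize(obj, events.append, f"freed {label}")
    return obj


result = myfunc("first")
events.append("before rebinding")
result = myfunc("second")  # "first" loses its last reference on this line
events.append("after rebinding")

print(events)
# ['before rebinding', 'freed first', 'after rebinding']
```

The "freed first" entry lands between the two markers, so the old object is destroyed as part of the assignment statement itself.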

    You can test this by adding a "del result" line as the last line inside the "for" loop block.
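A rough way to try the suggested `del result` test is sketched below. It does not reproduce the attached script: `myfunc`, `measure`, and the parameters `n`, `loops`, and `delete_old` are assumptions made for illustration, and a large list of floats stands in for the NumPy array so that the sketch needs only the standard library. Absolute timings will vary by machine.

```python
import time


def myfunc(n):
    # Stand-in for the large array in the report: a big list of floats.
    data = [float(i) for i in range(n)]
    t_done = time.perf_counter()  # timestamp taken just before returning
    return data, t_done


def measure(n, loops, delete_old):
    """Per iteration, time between myfunc finishing and the caller resuming."""
    gaps = []
    for _ in range(loops):
        # Rebinding `result` drops the previous list here, if one still exists.
        result, t_done = myfunc(n)
        gaps.append(time.perf_counter() - t_done)
        if delete_old:
            # Free the list now, outside the timed region.
            del result
    return gaps


if __name__ == "__main__":
    for delete_old in (False, True):
        gaps = measure(n=1_000_000, loops=5, delete_old=delete_old)
        print(f"delete_old={delete_old}: max gap {max(gaps):.6f} s")
```

If the explanation above holds, the measured gap with `delete_old=True` should stay small, because the cost of freeing the old value moves to the `del` line rather than disappearing.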

    @HumberMe
    Mannequin Author

    HumberMe mannequin commented Mar 11, 2022

    Thanks for the explanation. By the way, why does it cost so much time to del
    a large array?

    @HumberMe
    Mannequin Author

    HumberMe mannequin commented Mar 11, 2022

    I am currently processing large data, and the time spent by del is unacceptable. Is there any way to process del in parallel?

    @mdickinson
    Member

    > why it costs lots of time when del a large array?

    That's probably a question for the NumPy folks, or possibly for Stack Overflow or some other question-and-answer resource. It'll depend on how NumPy arrays are de-allocated.

    > Is there any way to process del in parallel?

    Seems unlikely, given GIL constraints.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022