"zip()" very slowly for this #85756

Closed
email0ya mannequin opened this issue Aug 19, 2020 · 6 comments
Labels
3.9 (only security fixes), docs (Documentation in the Doc dir), performance (Performance or resource usage)

Comments

@email0ya
Mannequin

email0ya mannequin commented Aug 19, 2020

BPO 41590
Nosy @stevendaprano, @vedgar
Files
  • nested_lists.py: 8 tests of speed, conclusion in end
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-08-19.17:33:13.506>
    labels = ['docs', '3.9', 'performance']
    title = '"zip()" very slowly for this'
    updated_at = <Date 2020-08-19.23:50:54.576>
    user = 'https://bugs.python.org/email0ya'

    bugs.python.org fields:

    activity = <Date 2020-08-19.23:50:54.576>
    actor = 'steven.daprano'
    assignee = 'docs@python'
    closed = False
    closed_date = None
    closer = None
    components = ['Documentation']
    creation = <Date 2020-08-19.17:33:13.506>
    creator = 'email0.ya'
    dependencies = []
    files = ['49405']
    hgrepos = []
    issue_num = 41590
    keywords = []
    message_count = 6.0
    messages = ['375662', '375666', '375672', '375673', '375674', '375679']
    nosy_count = 4.0
    nosy_names = ['steven.daprano', 'docs@python', 'veky', 'email0.ya']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue41590'
    versions = ['Python 3.9']

    @email0ya
    Mannequin Author

    email0ya mannequin commented Aug 19, 2020

    https://docs.python.org/3.9/tutorial/datastructures.html
    Nested List Comprehensions

    @email0ya email0ya mannequin added the 3.9 only security fixes label Aug 19, 2020
    @email0ya email0ya mannequin assigned docspython Aug 19, 2020
    @email0ya email0ya mannequin added docs Documentation in the Doc dir performance Performance or resource usage 3.9 only security fixes labels Aug 19, 2020
    @stevendaprano
    Member

    What are you actually reporting? What part of the documentation do you think should be changed, why should it be changed, and what should it be changed to?

    It is normal for different algorithms to perform with different speed. I'm not sure what the purpose of the "nested_lists" file is. You have discovered that different programs perform differently. Okay. What do you want us to do?

    One last comment: using time() as you did is unreliable, especially for small code snippets that are very fast. The best way to time small snippets of Python code is to use the timeit module.
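As a rough illustration of the timeit advice above, here is a minimal sketch. It does not reproduce the attached nested_lists.py; the matrix is the 3x4 example from the tutorial section under discussion, and the helper names (`transpose_comprehension`, `transpose_zip`, `best_time`) are made up for this example. Absolute timings will differ from machine to machine.

```python
import timeit

# The 3x4 matrix used in the tutorial's "Nested List Comprehensions" section.
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]


def transpose_comprehension(m):
    """Transpose with a nested list comprehension (rows are lists)."""
    return [[row[i] for row in m] for i in range(len(m[0]))]


def transpose_zip(m):
    """Transpose with zip(); the resulting rows are tuples."""
    return list(zip(*m))


def best_time(func, number=10_000, repeat=5):
    """Return the best per-call time in seconds over several repeats."""
    timer = timeit.Timer(lambda: func(matrix))
    # Taking the minimum reduces noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for func in (transpose_comprehension, transpose_zip):
        print(f"{func.__name__}: {best_time(func):.3e} s per call")
```

Taking the minimum of several repeats, rather than a mean of single `time()` readings, is the approach the timeit documentation suggests for short snippets.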

    @email0ya
    Mannequin Author

    email0ya mannequin commented Aug 19, 2020

    The link above points to part of the tutorial. The example that begins with a for statement is good, and I am not saying that it needs to be replaced. The text there reads: "A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses".


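    def transpose1_3_1(a: list) -> list:
        """Transpose matrix version 1.3.1"""
        # Length of the longest row. len(max(a)) would compare the rows
        # lexicographically and could pick a shorter row.
        max_j = max(map(len, a), default=0)
        return [
            [row[j] if j < len(row) else 0
             for row in a]
            for j in range(max_j)]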


    Here the list comprehension consists of brackets containing an expression followed by an if clause, then an else, with the for clause at the end.
    Clearly the quoted description is inaccurate (i.e. it can be corrected).
    I think it would be a good idea to add a similar example with an if statement at the beginning.
    At the end, the section says: "In the real world, you should prefer built-in functions to complex flow statements. The zip() function would do a great job for this use case".
    It is good that the function is mentioned here, and I agree with the first sentence. I disagree that the zip() function would do a great job for this use case (transposition).
    I understand that the examples in the tutorial only demonstrate what is possible and should be simple. But they cannot be used to solve the problem when the values of the variables change slightly, even though this is easy to achieve. The zip() function not only performed the transpose more slowly than even the first function, it also cannot perform it if the argument values change slightly. It only reproduces what works in this particular case.
    It is good when algorithms work efficiently, and it is not okay to use slow ones. It would be nice if programmers reading the tutorial went on to write fast algorithms.
    I already know about timeit, thanks. I calculated the arithmetic mean over many runs, and the results were stable.
    Best wishes.
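For readers following the ragged-rows point: nobody in this thread proposes it, but the standard library's `itertools.zip_longest` pads shorter rows with a fill value, which gives the same result as a hand-written padded transpose. A minimal sketch, where the name `transpose_padded` and the sample data are made up for illustration:

```python
from itertools import zip_longest


def transpose_padded(rows, fill=0):
    """Transpose rows of unequal length, padding short rows with `fill`."""
    return [list(column) for column in zip_longest(*rows, fillvalue=fill)]


ragged = [[1, 2, 3], [4], [5, 6]]

# Plain zip() stops at the shortest row, so only one column survives.
print(list(zip(*ragged)))        # [(1, 4, 5)]

# zip_longest() keeps going and fills the gaps.
print(transpose_padded(ragged))  # [[1, 4, 5], [2, 0, 6], [3, 0, 0]]
```

Whether padding is the right behaviour for a given "matrix" is a separate question; the sketch only shows that the built-in tools cover this case too.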

    @vedgar
    Mannequin

    vedgar mannequin commented Aug 19, 2020

    First, what you wrote with if...else is _not_ an "if clause". It's a conditional expression. You'll recognize them easily if you remember that an "if clause" in a comprehension never has an "else", while a conditional expression always does.

    Second, speed is just one of many, many criteria by which we evaluate programs. If only speed matters to you, you probably shouldn't be writing in Python at all. There are much faster languages.
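The distinction made in the first point can be seen in a small sketch (the sample data and variable names are invented for illustration): an if clause filters items, while a conditional expression chooses the value to emit for each item.

```python
numbers = [3, -1, 4, -1, 5]

# "if clause": comes after the for clause and filters items out. No else.
kept = [n for n in numbers if n > 0]

# Conditional expression: it is the expression part, so every item
# produces a value. It always has an else.
replaced = [n if n > 0 else 0 for n in numbers]

# Both can appear in the same comprehension.
both = [n if n < 5 else 0 for n in numbers if n > 0]

print(kept)      # [3, 4, 5]
print(replaced)  # [3, 0, 4, 0, 5]
print(both)      # [3, 4, 0]
```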

    @vedgar
    Mannequin

    vedgar mannequin commented Aug 19, 2020

    Third, where did you get the idea that transposing a "matrix" consisting of lists of varying length is a sensible operation? Your reference (Wikipedia) says nothing about that case. I've never seen [[],[1]] called "a matrix".

    @stevendaprano
    Member

    If you know about timeit, why aren't you using it?

    In any case, I have just run your nested_lists.py file, and on my
    computer, the last version with zip is the fastest version:

    0 transpose1_0(lT) Time = 1.0627508163452149e-05
    7 zip(*lT) Time = 1.5511512756347657e-06

    1.55e-6 is smaller than 1.06e-5:

    py> 1.55e-6 < 1.06e-5
    True
    

    Smaller times are faster, and the first version transpose1_0 takes
    nearly seven times longer to run than the last version with zip.

    As for the other comments, the purpose of the tutorial is to teach the
    language in the simplest way possible. It is aimed at beginners, not
    intermediate or expert programmers. The purpose of this section is to
    demonstrate the basic features of list comprehensions, not to overwhelm
    the beginner with complicated details.

    Making the example more complicated so that it is a tiny bit faster
    would not be a good tradeoff for the tutorial, but in fact all the more
    complicated versions are slower, not faster.

    You are also confused about the structure of comprehensions. The
    comprehension consists of

    expression-part for-part (optional for- and if-parts)
    

    There is no else-part in the comprehension, and the for-part always
    comes before any if-part. Your code:

    row[j] if j < len(row) else 0
    

    is not an "if statement" as you call it, neither is it part of the
    comprehension syntax. It is the expression-part of the comprehension,
    containing a ternary if-expression. It is not part of the structure of
    the comprehension.
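One way to check this structure, offered as a sketch rather than anything from the thread, is to parse the comprehension with the standard `ast` module and look at where the if...else ends up. The variable names `comp` and `filtered` are made up for this example.

```python
import ast

# The comprehension from the report: if...else before the for clause.
source = "[row[j] if j < len(row) else 0 for row in a]"
comp = ast.parse(source, mode="eval").body

print(type(comp).__name__)      # ListComp
print(type(comp.elt).__name__)  # IfExp: the expression part
print(len(comp.generators))     # 1: the single for clause
print(comp.generators[0].ifs)   # []: no if clause at all

# For contrast, a real if clause comes after the for clause.
filtered = ast.parse("[row[j] for row in a if j < len(row)]", mode="eval").body
print(type(filtered.elt).__name__)     # Subscript
print(len(filtered.generators[0].ifs)) # 1
```

The parser files the if...else under `elt` (the expression), and the `ifs` list of the for clause stays empty.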

    See the full language specification, in particular the grammar:

    https://docs.python.org/3.9/reference/grammar.html

    We could re-write the if-expression like this:

    (j < len(row)) and row[j] or 0
    

    That doesn't mean that comprehensions consist of an "and" part followed
    by an "or" part followed by a for-part. The "and" and "or" are just part
    of the expression part of the comprehension.
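A small sketch of the two spellings side by side (the function names and sample rows are invented for illustration). For numeric rows with a fallback of 0 they agree; as a general caveat, the and/or form falls through to the fallback whenever `row[j]` is falsy, so the two are not interchangeable for arbitrary data.

```python
def with_conditional(row, j):
    """Conditional expression form."""
    return row[j] if j < len(row) else 0


def with_and_or(row, j):
    """and/or form of the same expression."""
    return (j < len(row)) and row[j] or 0


numbers = [7, 0]
for j in range(3):
    # Same result for numbers, because a falsy 0 falls back to 0 anyway.
    print(j, with_conditional(numbers, j), with_and_or(numbers, j))

strings = ["", "x"]
print(repr(with_conditional(strings, 0)))  # ''
print(repr(with_and_or(strings, 0)))       # 0: the empty string is falsy
```

Either way, both are ordinary expressions sitting in the expression part of the comprehension.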

    Adding this level of technical detail and complexity to an introductory
    tutorial aimed at beginners would not be a good idea.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel closed this as not planned Aug 9, 2022