Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collections.counter examples are misleading #76951

Closed
TonyFlury mannequin opened this issue Feb 5, 2018 · 5 comments
Closed

collections.counter examples are misleading #76951

TonyFlury mannequin opened this issue Feb 5, 2018 · 5 comments
Assignees
Labels
docs Documentation in the Doc dir

Comments

@TonyFlury
Copy link
Mannequin

TonyFlury mannequin commented Feb 5, 2018

BPO 32770
Nosy @rhettinger, @TonyFlury, @csabella

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = 'https://github.com/rhettinger'
closed_at = <Date 2018-02-05.04:36:16.659>
created_at = <Date 2018-02-05.00:34:34.719>
labels = ['invalid', 'docs']
title = 'collections.counter examples are misleading'
updated_at = <Date 2018-02-06.07:20:23.196>
user = 'https://github.com/TonyFlury'

bugs.python.org fields:

activity = <Date 2018-02-06.07:20:23.196>
actor = 'anthony-flury'
assignee = 'rhettinger'
closed = True
closed_date = <Date 2018-02-05.04:36:16.659>
closer = 'rhettinger'
components = ['Documentation']
creation = <Date 2018-02-05.00:34:34.719>
creator = 'anthony-flury'
dependencies = []
files = []
hgrepos = []
issue_num = 32770
keywords = []
message_count = 5.0
messages = ['311630', '311643', '311666', '311702', '311710']
nosy_count = 4.0
nosy_names = ['rhettinger', 'docs@python', 'anthony-flury', 'cheryl.sabella']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue32770'
versions = ['Python 2.7', 'Python 3.4', 'Python 3.5', 'Python 3.6']

@TonyFlury
Copy link
Mannequin Author

TonyFlury mannequin commented Feb 5, 2018

The first example given for collections.Counter is misleading - the documentation ideally should show the 'best' (one and only one) way to do something and the example is this :

>>> # Tally occurrences of words in a list
>>> cnt = Counter()
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
...     cnt[word] += 1
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})

clearly this could simply be :

>>> # Tally occurrences of words in a list
>>> cnt = Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})

(i.e. the iteration through the array is unneeded in this example).

The 2nd example is better in showing the 'entry-level' use of the Counter class.

There possibly does need to be a simple example of when you might manually increment the Counter class - but I don't think that the examples given illustrate that in a useful way; and I personally haven't come across a use-case for manually incrementing the Counter class entires that couldn't be accomplished with a comprehension or generator expression passed directly to the Counter constructor.

@TonyFlury TonyFlury mannequin assigned docspython Feb 5, 2018
@TonyFlury TonyFlury mannequin added the docs Documentation in the Doc dir label Feb 5, 2018
@rhettinger
Copy link
Contributor

Thanks for the suggestion. I respectfully disagree. The "core" functionality of Counter is the ability to write c['x'] += 1 without risking a KeyError. The add-on capability is to process an entire iterable all at once. This is analogous to the list() builtin- where the core ability is to write s.append(e) and there is a convenience of calling list(iterable).

Another reason the first example goes first because it is simple. It shows counting in isolation with no other distractions (an in-vitro example).

The second example is in a more complex environment incorporating file access and regular expressions (an in-vivo example).

FWIW, there are plenty of examples of using the += style. Here's one I use in my Python courses:

'Scan a log file from a NASA server'
    import collections, re, pprint

    visited = collections.Counter()
    with open('notes/nasa_19950801.log') as f:
        for line in f:
            mo = re.search(r'GET\s+(\S+)\s+200', line)
            if mo is not None:
                url = mo.group(1)
                visited[url] += 1

    pprint.pprint(visited.most_common(20))

I've had good luck with people understanding the docs as-is, so I'm going to decline the suggestion. I do appreciate you taking the time to share your thoughts.

@TonyFlury
Copy link
Mannequin Author

TonyFlury mannequin commented Feb 5, 2018

Raymond,
I completely understand your comment but I do disagree.

My view would be that the documentation of the stdlib should document the entry level use cases.
The first example given uses nothing special from the Counter class - you could implement exactly the same with a defaultdict(int) - the only difference would be that output will read defaultdict(<type 'int'>,{'blue': 3, 'red': 2, 'green': 1}).

I think the examples in the documentation should at least demonstrate something important on the class being documented - and the first example doesn't.

I am very tempted to re-open - but I wont - no benefit in bouncing the status as we discuss this.

@csabella
Copy link
Contributor

csabella commented Feb 6, 2018

You know, I'm not sure if I had ever seen that example before. When you click Counter at the top of the page, it goes right to the class definition, which is past the example.

Having said that, I really like the example. Until now, I didn't realize what Raymond said above about Counters (that the core ability is to write c['x'] += 1 without a KeyError). So, thanks to this report, I learned that today!

One thing that did surprise me in the example is that I expected the repr to be in insertion order in 3.7. The class description says 'It is an unordered collection where elements are stored as dictionary keys' and I was wondering if that was still true since dicts now have a guaranteed order. I tried it on the example, which still printed Counter({'blue': 3, 'red': 2, 'green': 1})! Of course it makes sense after looking at the code because it calls most_common in the repr, but I hadn't realized that before. So, two things learned about Counter today. :-)

Anyway, writing this here to ask about the wording regarding 'unordered collection'.

Thanks!

@TonyFlury
Copy link
Mannequin Author

TonyFlury mannequin commented Feb 6, 2018

Cheryl :
When you iterate around a counter instance it does return keys in the order they are first encountered/inserted - so I agree with you that it is an ordered collection from Python 3.7 onwards (although the iteration and the repr are ordered in different orders.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir
Projects
None yet
Development

No branches or pull requests

2 participants