Python has several built-in modules that we haven't fully explored yet.

The collections module is built into Python and implements specialized container data types that are alternatives to Python's general-purpose built-in containers.

Let's imagine you have a list that contains a few unique values, each of which is repeated several times.

Now, let's imagine I wanted you to get a count of each unique item in this list.
For example, how many ones are there? How many twos? How many threes?

In [8]:
from collections import Counter

mylist = [1,2,3,1,2,1,1,1,2,3,2,2,3,2,1,1,2,]

Counter can do this for us automatically with a single call, which makes it very useful in these sorts of situations.
Simply pass mylist into Counter and you get back a specialized Counter object that has counted the instances of each unique item in the list.

In [9]:
Counter(mylist)

Counter({1: 7, 2: 7, 3: 3})

Notice this looks very similar to a dictionary. Counter is in fact a dictionary subclass that helps count hashable objects.

Inside it, elements are stored as dictionary keys and their counts are stored as the values.
So it is a specialized dictionary where the key is always the object and the value is always its count.
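Since a Counter is a dictionary subclass, we can index it like one. Here is a quick sketch reusing the list from above; the variable name `counts` is just one I picked for illustration.

```python
from collections import Counter

mylist = [1, 2, 3, 1, 2, 1, 1, 1, 2, 3, 2, 2, 3, 2, 1, 1, 2]
counts = Counter(mylist)

# Counter is a subclass of dict, so dictionary-style access works.
print(isinstance(counts, dict))  # True
print(counts[1])                 # 7
print(counts[3])                 # 3

# Unlike a plain dict, asking for an element that was never seen
# returns 0 instead of raising a KeyError.
print(counts[99])                # 0
```

That zero for unseen elements is handy when you only want to check how often something appeared.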

In [10]:
Counter('dhsjshhsjahdshdhasshdjhs')

Counter({'d': 4, 'h': 8, 's': 7, 'j': 3, 'a': 2})

This also works with strings: I can call Counter and pass in a string, and it counts the individual characters.

Let's create a sentence and ask how many times each word shows up in it.

In [11]:
sentence = 'BTS, is a South Korean boy band that formed in 2010 and debuted in 2013 under Big Hit Entertainment.'
Counter(sentence.lower().split())

Counter({'bts,': 1,
         'is': 1,
         'a': 1,
         'south': 1,
         'korean': 1,
         'boy': 1,
         'band': 1,
         'that': 1,
         'formed': 1,
         'in': 2,
         '2010': 1,
         'and': 1,
         'debuted': 1,
         '2013': 1,
         'under': 1,
         'big': 1,
         'hit': 1,
         'entertainment.': 1})
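Notice that splitting on whitespace leaves punctuation attached, so we get keys like 'bts,' and 'entertainment.'. One possible way to clean that up, as a rough sketch, is to strip punctuation from the ends of each word before counting. The names `words` and `word_counts` are mine, not part of the original example.

```python
import string
from collections import Counter

sentence = 'BTS, is a South Korean boy band that formed in 2010 and debuted in 2013 under Big Hit Entertainment.'

# Strip leading and trailing punctuation from each lowercased word.
words = [word.strip(string.punctuation) for word in sentence.lower().split()]
word_counts = Counter(words)

print(word_counts['bts'])            # 1
print(word_counts['in'])             # 2
print(word_counts['entertainment'])  # 1
```

This only trims punctuation at the edges of a word, so something like an apostrophe in the middle of a word is left alone.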

In [13]:
letters = 'aaabbbbbbcccccccccccccddddddddddddddddddddddddddddddddddaaaaaaaa'
c = Counter(letters)

In [14]:
c

Counter({'a': 11, 'b': 6, 'c': 13, 'd': 34})

In [15]:
c.most_common()

[('d', 34), ('c', 13), ('a', 11), ('b', 6)]

So most_common() returns the elements and their counts as a list of tuples, ordered from most to least common. It says 'd' is the most common with 34 instances, 'c' is second most common with 13, and 'a' is third most common with 11. For a longer Counter it would keep going.

Let's say I wanted only the two most common.

In [16]:
c.most_common(2)

[('d', 34), ('c', 13)]

In [17]:
list(c)

['a', 'b', 'c', 'd']

I can also pass the Counter into list() and get a list of all the keys, that is, all the unique values. There are lots of different things we can do with a Counter object.
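To give a feel for a few of those other things, here is a small sketch. I'm building a Counter with the same counts as the letters example, just written more compactly, and the name `total` is only for illustration.

```python
from collections import Counter

# Same counts as the letters example: 11 a's, 6 b's, 13 c's, 34 d's.
c = Counter('a' * 11 + 'b' * 6 + 'c' * 13 + 'd' * 34)

# Total number of items that were counted.
total = sum(c.values())
print(total)            # 64

# Key/count pairs, just like dict.items().
print(list(c.items()))  # [('a', 11), ('b', 6), ('c', 13), ('d', 34)]

# update() adds to the existing counts instead of replacing them.
c.update('aab')
print(c['a'], c['b'])   # 13 7
```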

In [18]:
from collections import defaultdict

d = {'a': 10}

In [19]:
d['a']

10

In a normal Python dictionary, let's imagine I ask for the wrong key. Notice that 'wrong' is definitely not a key in this dictionary yet.
If I try that, it's going to complain with a KeyError.

In [20]:
d['wrong']

KeyError: 'wrong'

A defaultdict will assign a default value in any instance where a KeyError would otherwise have occurred.
So if you ask for a key that isn't present in a defaultdict, it will create that key and assign it the default value.

The way we do this is by creating the dictionary, which we'll call d again, with defaultdict instead of the usual curly braces.

We also have to choose what the default value will be, so that instead of raising a KeyError, it assigns our default value.
defaultdict expects a callable that produces the default, and we can supply one with a lambda expression.
So we'll simply write lambda, a colon, and then whatever you want the default value to be.

For instance, let's say I want the default value to be 20. The dictionary still behaves like a normal dictionary under normal circumstances.

In [21]:
d = defaultdict(lambda: 20)

In [22]:
d['correct']

20
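To make that behavior a bit more concrete, here is a short sketch. It shows that existing keys work as usual and that a missing key is stored once you ask for it. The second half, with the hypothetical `groups` dictionary, is a common pattern I'm adding for illustration: using list as the default so you can append without checking for the key first.

```python
from collections import defaultdict

d = defaultdict(lambda: 20)
d['a'] = 10

print(d['a'])        # 10 (existing keys behave as usual)
print(d['missing'])  # 20 (the default is returned)
print(dict(d))       # {'a': 10, 'missing': 20} (and the new key is stored)

# Grouping words by first letter without checking if the key exists yet.
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)

print(dict(groups))  # {'a': ['apple', 'avocado'], 'b': ['banana']}
```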

The last specialized container object that I want to show you from the collections module is the namedtuple.

Similar to the way a defaultdict improves on a standard dictionary by replacing the KeyError with a default value, the namedtuple expands on a normal tuple object by having named indices.

In [23]:
mytuple = (10,20,30)
mytuple[0]

10

So the namedtuple has not just a numeric index for each value, but also a named index for that value.
Instead of just asking for position zero, we could ask for the value by name.

In [24]:
from collections import namedtuple
Dog = namedtuple('Dog', ['age', 'breed', 'name'])

In [26]:
Dog # And now I can create instances of a dog object.

__main__.Dog

In [27]:
sammy = Dog(age = 5, breed = 'husky', name = 'sam')

In [28]:
type(sammy) # reports back that it's type dog. It doesn't report back that it's a named tuple.

__main__.Dog

In [33]:
sammy.age

5

In [31]:
sammy.name

'sam'

In [32]:
sammy.breed

'husky'

Notice that if I ask for sammy at index zero, I get back 5, which is the same thing as asking for sammy.age.
So you can imagine that for very large tuples, or tuples where you can't quite remember which value is at which index, it might be useful to be able to access values both by index position, such as zero, and by name as if it were an attribute, such as age.
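Here is a short sketch of both access styles side by side, using the same Dog definition. The unpacking at the end is something I'm adding to show that a namedtuple still behaves like an ordinary tuple.

```python
from collections import namedtuple

Dog = namedtuple('Dog', ['age', 'breed', 'name'])
sammy = Dog(age=5, breed='husky', name='sam')

# Index access and attribute access return the same value.
print(sammy[0])               # 5
print(sammy.age)              # 5
print(sammy[0] == sammy.age)  # True

# It is still a regular tuple underneath, so unpacking works.
print(isinstance(sammy, tuple))  # True
age, breed, name = sammy
print(breed)                     # husky
```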