# 2.5 Collections module

The collections module provides a number of useful objects for data handling. This part briefly introduces some of these features.

**Example: Counting Things**

Let's say you want to tabulate the total shares of each stock.

In [2]:
portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15)
]

There are two IBM entries and two GOOG entries in this list. The shares need to be combined together somehow.

**Counters**

Solution: Use a `Counter`. It is a dictionary subclass in which elements are stored as keys and their counts are stored as values.

In [7]:
from collections import Counter
total_shares = Counter()
total_price = Counter()

for name,shares,price in portfolio:
    total_shares[name] += shares
    total_price[name] += price

In [9]:
total_shares['IBM']

150

In [8]:
total_price['IBM']

136.32999999999998
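The `+=` in the loop works because a `Counter` returns 0 for a missing key instead of raising `KeyError`. A minimal sketch comparing it with a plain dictionary, reusing the `portfolio` list from above (the names `plain` and `counted` are only illustrative):

```python
from collections import Counter

portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15)
]

# With a plain dict, the first occurrence of each key needs a default.
plain = {}
for name, shares, price in portfolio:
    plain[name] = plain.get(name, 0) + shares

# With a Counter, a missing key already counts as 0.
counted = Counter()
for name, shares, price in portfolio:
    counted[name] += shares

print(plain == dict(counted))   # True
print(counted['MSFT'])          # 0, and no KeyError
print('MSFT' in counted)        # False: reading a missing key does not add it
```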

**Example: One-Many Mappings**

Problem: You want to map a key to multiple values.

In [10]:
portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15)
]

As in the previous example, IBM appears twice. This time, instead of combining the entries, the key IBM should map to both of its (shares, price) tuples.

Solution: Use a defaultdict.

In [11]:
from collections import defaultdict
holdings = defaultdict(list) # Missing keys start out as an empty list
for name, shares, price in portfolio:
    holdings[name].append((shares,price))


In [15]:
holdings['GOOG']

[(100, 490.1), (75, 572.45)]

The defaultdict ensures that the first time you access a missing key, you get a default value (here, an empty list) instead of a `KeyError`.
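A small sketch of that behaviour, assuming a list is the default you want. The name `plain` and the key `'MSFT'` are only illustrative. Note that merely looking up a missing key in a defaultdict creates the entry.

```python
from collections import defaultdict

holdings = defaultdict(list)
holdings['IBM'].append((50, 91.1))     # 'IBM' is missing, so list() supplies []
holdings['IBM'].append((100, 45.23))   # 'IBM' exists now, so the same list is reused
print(holdings['IBM'])                 # [(50, 91.1), (100, 45.23)]

# Reading a missing key also inserts it with the default value.
print('MSFT' in holdings)              # False
print(holdings['MSFT'])                # []
print('MSFT' in holdings)              # True

# A plain dict can do the same job with setdefault().
plain = {}
plain.setdefault('IBM', []).append((50, 91.1))
print(plain)                           # {'IBM': [(50, 91.1)]}
```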

**Example: Keeping a History**

Problem: We want a history of the last N things. Solution: Use a deque.

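A deque is a list-like container with fast appends and pops on either end. `deque(iterable, maxlen)` returns a new deque initialized left-to-right (using `append()`) with data from the iterable. When `maxlen` is set and the deque is full, appending a new item discards one item from the opposite end.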

In [None]:
from collections import deque

history = deque(maxlen=N)
with open(filename) as f:
    for line in f:
        history.append(line)
        ...
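
The snippet above leaves `N` and `filename` undefined. A runnable sketch of the same idea, assuming a history of 3 items and an in-memory list of lines in place of a file (both are illustrative choices):

```python
from collections import deque

lines = ['line 1', 'line 2', 'line 3', 'line 4', 'line 5']

history = deque(maxlen=3)      # Keep only the last 3 items
for line in lines:
    history.append(line)       # Once full, the oldest item is dropped from the left

print(history)                 # deque(['line 3', 'line 4', 'line 5'], maxlen=3)
print(list(history))           # ['line 3', 'line 4', 'line 5']
```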

## Exercises

The collections module might be one of the most useful library modules for dealing with special purpose kinds of data handling problems such as tabulating and indexing.

In this exercise, we’ll look at a few simple examples. Start by running your report.py program so that you have the portfolio of stocks loaded in the interactive mode.

In [19]:
import os
os.chdir(r"C:\Users\Fadinda Shafira\Documents\KALBE\Python\practical-python\Work")

In [20]:
import csv

def read_portfolio(filename):

    '''
    Read a stock portfolio file into a list of dictionaries with keys :
    name, shares, and price
    '''

    portfolio = []

    with open(filename, 'rt') as f:
        rows = csv.reader(f)
        headers = next(rows)  # Skip the header row
        for row in rows:
            stock = {
                'name' : row[0], 
                'shares' : int(row[1]), 
                'price' : float(row[2])}
            portfolio.append(stock)
    return portfolio

def read_prices(filename):
    
    '''
    Read a CSV file of price data into a dict mapping names to prices
    '''
    prices = {}

    with open(filename, 'rt') as f:
        rows = csv.reader(f)
        for row in rows:   
            try:
                prices[row[0]] = float(row[1])
            except IndexError:
                pass
    return prices

def make_report(portfolio, prices):
    """
    Takes a list of stocks and dictionary of prices as input
    returns a list of tuples containing the rows of table (name, shares, price, change)
    """

    rows = []

    for stock in portfolio:
        current_price = prices[stock['name']] # Current price is from prices.csv file, choose the price of stock name in portfolio.csv
        change = current_price - stock['price'] # Calculate change from current_price - price in portfolio.csv
        summary = (stock['name'], stock['shares'], current_price, change)
        rows.append(summary)
        
    return rows

# Read data files and create the report data        

portfolio = read_portfolio('Data/portfolio.csv')
prices = read_prices('Data/prices.csv')

# Generate the report data
report = make_report(portfolio,prices)

# Output the report data
headers = ('Name', 'Shares', 'Price', 'Change')
print(' '.join(f'{h:>10s}' for h in headers))
print(' '.join(['-' * 10] * len(headers)))
for name, shares, price, change in report:
    print(f'{name:>10s} {shares:>10d} {price:>10.2f} {change:>10.2f}')


      Name     Shares      Price     Change
---------- ---------- ---------- ----------
        AA        100       9.22     -22.98
       IBM         50     106.28      15.18
       CAT        150      35.46     -47.98
      MSFT        200      20.89     -30.34
        GE         95      13.48     -26.89
      MSFT         50      20.89     -44.21
       IBM        100     106.28      35.84


In [25]:
portfolio = read_portfolio('Data/portfolio.csv')
from collections import Counter
holdings = Counter()
for s in portfolio:
    holdings[s['name']] += s['shares'] # For each 'name', add up its 'shares'

In [26]:
holdings

Counter({'AA': 100, 'IBM': 150, 'CAT': 150, 'MSFT': 250, 'GE': 95})

Carefully observe how the multiple entries for MSFT and IBM in portfolio get combined into a single entry here.

You can use a Counter just like a dictionary to retrieve individual values:

In [23]:
holdings['IBM']

150

If you want to rank the values, do this:

In [29]:
# Get three most held stocks
holdings.most_common(3)

[('MSFT', 250), ('IBM', 150), ('CAT', 150)]
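
`most_common()` returns `(name, count)` pairs sorted from highest to lowest count, and with no argument it returns every entry. A brief sketch using the totals shown above, typed in by hand so that it runs without the data files:

```python
from collections import Counter

holdings = Counter({'AA': 100, 'IBM': 150, 'CAT': 150, 'MSFT': 250, 'GE': 95})

ranked = holdings.most_common()      # All entries, highest count first
print(ranked[0])                     # ('MSFT', 250)
print(ranked[-1])                    # ('GE', 95), the smallest holding
print(sum(holdings.values()))        # 745 shares in total
```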

Let’s grab another portfolio of stocks and make a new Counter:

In [33]:
portfolio2 = read_portfolio('Data/portfolio2.csv')
holdings2 = Counter()
for s in portfolio2:
    holdings2[s['name']] += s['shares']

In [34]:
holdings2

Counter({'AA': 50, 'HPQ': 250, 'MSFT': 25, 'GE': 125})

Finally, let’s combine all of the holdings with one simple operation:

In [36]:
combined = holdings + holdings2
combined

Counter({'AA': 150,
         'IBM': 150,
         'CAT': 150,
         'MSFT': 275,
         'GE': 220,
         'HPQ': 250})
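
Addition is not the only operation counters support. A short sketch, again with the totals above typed in by hand: subtraction with `-` keeps only positive results, while the `subtract()` method updates a counter in place and keeps zero and negative counts. The names `diff` and `net` are only illustrative.

```python
from collections import Counter

holdings = Counter({'AA': 100, 'IBM': 150, 'CAT': 150, 'MSFT': 250, 'GE': 95})
holdings2 = Counter({'AA': 50, 'HPQ': 250, 'MSFT': 25, 'GE': 125})

# The - operator drops entries whose result is zero or negative.
diff = holdings - holdings2
print(diff['AA'])        # 50
print('GE' in diff)      # False, because 95 - 125 is negative
print('HPQ' in diff)     # False

# subtract() modifies a counter in place and keeps negative counts.
net = Counter(holdings)
net.subtract(holdings2)
print(net['GE'])         # -30
print(net['HPQ'])        # -250
```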

This is only a small taste of what counters provide. However, if you ever find yourself needing to tabulate values, you should consider using one.

**Commentary: collections module**

The collections module is one of the most useful library modules in all of Python. In fact, we could do an extended tutorial on just that. However, doing so now would also be a distraction. For now, put collections on your list of bedtime reading for later.