# 2.1 Datatypes and Data Structures

This section introduces data structures in the form of tuples and dictionaries.

**Primitive Datatypes**

Python has a few primitive types of data:

- Integers
- Floating point numbers
- Strings (text)
  
We learned about these in the introduction.

**None type**

In [1]:
email_address = None

None is often used as a placeholder for an optional or missing value. It evaluates as False in conditionals.

In [2]:
if email_address:
    send_email(email_address, msg)
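
An empty string or zero is also treated as false, so a plain truthiness test can't tell a missing value from an empty one. Below is a minimal sketch of the more explicit `is None` check. The `describe` helper, its return strings, and the sample address are hypothetical, made up for illustration.

```python
def describe(email_address):
    """Classify an optional value (hypothetical helper, not part of the course code)."""
    if email_address is None:         # no value was supplied at all
        return 'missing'
    if not email_address:             # a value was supplied, but it is empty
        return 'empty'
    return 'present'

print(describe(None))                 # missing
print(describe(''))                   # empty
print(describe('user@example.com'))   # present
```

Whether the distinction matters depends on the program. When "no value" and "empty value" should be handled the same way, the shorter `if email_address:` test is enough.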

**Data Structures** 

Real programs have more complex data. For example, information about a stock holding:

100 shares of GOOG at $490.10

This is an "object" with three parts:

- Name or symbol of the stock ("GOOG", a string)
- Number of shares (100, an integer)
- Price (490.10, a float)

**Tuples**

A tuple is a collection of values grouped together.

Example:

In [5]:
s = ('GOOG', 100, 490.1)

The parentheses are sometimes omitted in the syntax.

In [8]:
s = 'GOOG', 100, 490.1
type(s)

tuple

Special cases (0-tuple, 1-tuple).

In [10]:
t = ()            # An empty tuple
w = ('GOOG', )    # A 1-item tuple
type(t)

tuple

Tuples are often used to represent simple records or structures. Typically, a tuple is a single object made up of multiple parts.

A good analogy: A tuple is like a single row in a database table.

Tuple contents are ordered (like an array).

In [13]:
s = ('GOOG', 100, 490.1)
name = s[0]                 # 'GOOG'
shares = s[1]               # 100
price = s[2]                # 490.1

print(name)
print(shares)
print(price)

GOOG
100
490.1


However, the contents can't be modified.

In [14]:
s[1] = 75

TypeError: 'tuple' object does not support item assignment

You can, however, make a new tuple based on a current tuple.

In [15]:
s = (s[0], 75, s[2])
s

('GOOG', 75, 490.1)

**Tuple Packing**

Tuples are more about packing related items together into a single entity.

In [16]:
s = ('GOOG', 100, 490.1)

The tuple is then easy to pass around to other parts of a program as a single object.

**Tuple Unpacking**

To use the tuple elsewhere, you can unpack its parts into variables.

In [18]:
# Unpack the tuple into variables
name, shares, price = s
print('Cost', shares * price)

Cost 49010.0


The number of variables on the left must match the tuple structure.

In [20]:
name, shares = s     # ERROR

ValueError: too many values to unpack (expected 2)
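
As a small sketch of that rule, the mismatch can be caught as a `ValueError` and compared with an unpacking that uses the right number of names. The `message` variable is an illustrative name, and the exact error text can differ between Python versions.

```python
s = ('GOOG', 100, 490.1)

try:
    name, shares = s                 # two names for three values
except ValueError as err:
    message = str(err)               # keep the error text for inspection
    print('Unpacking failed:', message)

name, shares, price = s              # three names for three values
print(name, shares, price)
```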

**Tuples vs. Lists**

Tuples look like read-only lists. However, tuples are most often used for a single item consisting of multiple parts. 

Lists are usually a collection of distinct items, usually all of the same type.

In [21]:
# Tuples

record = ('GOOG', 100, 490.1)   # A tuple representing a record in a portfolio

# Lists

symbols = ['GOOG', 'AAPL', 'IBM']   # A list representing three stock symbols
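
The two structures are often combined: a list holds many records, and each record is a tuple. The sketch below assumes a small in-memory portfolio. The AAPL and IBM share counts and prices are invented sample values, not data from the course files.

```python
# A list of records; each record is a (name, shares, price) tuple.
portfolio = [
    ('GOOG', 100, 490.1),
    ('AAPL', 50, 122.34),    # invented sample values
    ('IBM', 75, 91.5),       # invented sample values
]

total_shares = 0
for name, shares, price in portfolio:   # unpack each record as it is visited
    total_shares += shares
    print(name, shares, price)

print('Total shares:', total_shares)    # Total shares: 225
```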

**Dictionaries**

A dictionary is a mapping of keys to values. It's also sometimes called a hash table or associative array. The keys serve as indices for accessing values.

In [22]:
s = {
    'name': 'GOOG',
    'shares': 100,
    'price': 490.1
}

**Common operations**

To get values from a dictionary, use the key names.

In [24]:
print(s['name'], s['shares'])  # keys act like indices

GOOG 100


In [25]:
s['price']

490.1

To add or modify values, assign using the key names.

In [27]:
s['shares'] = 75
s['date'] = '6/6/2007'
print(s)

{'name': 'GOOG', 'shares': 75, 'price': 490.1, 'date': '6/6/2007'}


To delete a value, use the del statement.

In [28]:
del s['date']
s

{'name': 'GOOG', 'shares': 75, 'price': 490.1}
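
Once a key has been deleted, looking it up with `s['date']` raises a `KeyError`. The sketch below shows two common ways to guard against that: the `in` operator and the `get()` method. The `'unknown'` fallback is an arbitrary value chosen for illustration.

```python
s = {'name': 'GOOG', 'shares': 75, 'price': 490.1}

print('date' in s)                 # False: the key is not present
print(s.get('date'))               # None: get() returns None instead of raising
print(s.get('date', 'unknown'))    # unknown: an explicit fallback value
print(s.get('shares', 0))          # 75: the fallback is ignored when the key exists
```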

**Why dictionaries?**

Dictionaries are useful when there are many different values and those values might be modified or manipulated. Dictionaries make your code more readable.

In [None]:
s['price']
# vs
s[2]

## Exercises

In the last few exercises, you wrote a program that read a datafile Data/portfolio.csv. Using the csv module, it is easy to read the file row-by-row.

In [30]:
import csv
import os

os.chdir(r"C:\Users\Fadinda Shafira\Documents\KALBE\Python\practical-python\Work")
f = open('Data/portfolio.csv')  # forward slashes also work on Windows and avoid the '\p' escape
rows = csv.reader(f) # read csv
next(rows)

['name', 'shares', 'price']

In [31]:
row = next(rows)
row

['AA', '100', '32.20']

Although reading the file is easy, you often want to do more with the data than read it. For instance, perhaps you want to store it and start performing some calculations on it. Unfortunately, a raw "row" of data doesn’t give you enough to work with. For example, even a simple math calculation doesn’t work:

In [32]:
row = ['AA', '100', '32.20']
cost = row[1] * row[2]

TypeError: can't multiply sequence by non-int of type 'str'

To do more, you typically want to interpret the raw data in some way and turn it into a more useful kind of object so that you can work with it later. Two simple options are tuples or dictionaries.

**Exercise 2.1: Tuples**

At the interactive prompt, create the following tuple that represents the above row, but with the numeric columns converted to proper numbers:

In [33]:
row = ['AA', '100', '32.20']

t = (row[0], int(row[1]), float(row[2])) # convert the list items into a tuple with numeric types
t

('AA', 100, 32.2)

Using this, you can now calculate the total cost by multiplying the shares and the price:

In [34]:
cost = t[1] * t[2]
cost

3220.0000000000005

Is math broken in Python? What’s the deal with the answer of 3220.0000000000005?

This is an artifact of the floating point hardware on your computer, which represents numbers in base 2 and therefore can't represent most base-10 decimals exactly. Even simple calculations involving base-10 decimals introduce small errors. This is normal, although perhaps a bit surprising if you haven’t seen it before.

This happens in all programming languages that use floating point numbers, but it often gets hidden when printing. For example:

In [35]:
print(f'{cost:0.2f}')

3220.00
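
The same effect shows up in a smaller calculation. The sketch below also shows two standard-library ways to work around it: `math.isclose()` for comparisons and `decimal.Decimal` for exact base-10 arithmetic. Neither is used elsewhere in this section, so treat them as optional extras rather than the course's approach.

```python
import math
from decimal import Decimal

total = 0.1 + 0.2
print(total == 0.3)                  # False: the binary approximations differ slightly
print(math.isclose(total, 0.3))      # True: compares within a small tolerance

# Decimal stores base-10 digits exactly when it is built from a string.
exact_cost = Decimal('32.20') * 100
print(exact_cost)                    # 3220.00
```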


Tuples are read-only. Verify this by trying to change the number of shares to 75.

In [36]:
t[1] = 75

TypeError: 'tuple' object does not support item assignment

Although you can’t change tuple contents, you can always create a completely new tuple that replaces the old one.

In [37]:
t = (t[0], 75, t[2])
t

('AA', 75, 32.2)

Whenever you reassign an existing variable name like this, the old value is discarded. Although the above assignment might look like you are modifying the tuple, you are actually creating a new tuple and throwing the old one away.

Tuples are often used to pack and unpack values into variables. Try the following:

In [38]:
# unpacking the tuple

name, shares, price = t
name

'AA'

In [39]:
shares

75

In [40]:
price

32.2

Take the above variables and pack them back into a tuple:

In [43]:
# pack the values back into a tuple

t = (name, 2*shares, price)
t

('AA', 150, 32.2)

**Exercise 2.2: Dictionaries as a data structure**

An alternative to a tuple is to create a dictionary instead.

In [44]:
d = {
    'name' : row[0],
    'shares' : int(row[1]),
    'price'  : float(row[2])
}

In [45]:
d

{'name': 'AA', 'shares': 100, 'price': 32.2}

Calculate the total cost of this holding:

In [46]:
cost = d['shares'] * d['price']
cost

3220.0000000000005

Compare this example with the same calculation involving tuples above. Change the number of shares to 75.

In [47]:
d['shares'] = 75
d

{'name': 'AA', 'shares': 75, 'price': 32.2}

In [48]:
cost = d['shares'] * d['price']
cost

2415.0

Unlike tuples, dictionaries can be freely modified. Add some attributes:

In [50]:
d['date'] = (6,11,2007)
d['account'] = 12345
d

{'name': 'AA',
 'shares': 75,
 'price': 32.2,
 'date': (6, 11, 2007),
 'account': 12345}

**Exercise 2.3: Some additional dictionary operations**

If you turn a dictionary into a list, you’ll get all of its keys:

In [51]:
list(d)

['name', 'shares', 'price', 'date', 'account']

Similarly, if you use the for statement to iterate on a dictionary, you will get the keys:

In [52]:
for k in d:
    print('k =', k) # print each key in the dictionary

k = name
k = shares
k = price
k = date
k = account


Try this variant that performs a lookup at the same time:

In [55]:
for k in d:
    print('k =', d[k]) # print each value by looking it up with its key


k = AA
k = 75
k = 32.2
k = (6, 11, 2007)
k = 12345


You can also obtain all of the keys using the keys() method:

In [56]:
# Accessing only dictionary keys

keys = d.keys()
keys

dict_keys(['name', 'shares', 'price', 'date', 'account'])

keys() is a bit unusual in that it returns a special dict_keys object.

This is an overlay on the original dictionary that always gives you the current keys—even if the dictionary changes. For example, try this:

In [58]:
del d['account']
keys

dict_keys(['name', 'shares', 'price', 'date'])

Carefully notice that the 'account' disappeared from keys even though you didn’t call d.keys() again.
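
When a fixed copy of the keys is needed instead, converting to a list takes a snapshot. The sketch below contrasts the two behaviours on a fresh dictionary. The names `live_keys` and `snapshot` are illustrative.

```python
d = {'name': 'AA', 'shares': 75, 'price': 32.2}

live_keys = d.keys()       # a view that tracks the dictionary
snapshot = list(d)         # an independent copy of the keys at this moment

d['date'] = (6, 11, 2007)  # change the dictionary afterwards

print(list(live_keys))     # ['name', 'shares', 'price', 'date']
print(snapshot)            # ['name', 'shares', 'price']
```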

A more elegant way to work with keys and values together is to use the items() method. This gives you (key, value) tuples:

In [60]:
# Accessing dictionary keys and values

items = d.items() # returns a view of (key, value) tuples
items

dict_items([('name', 'AA'), ('shares', 75), ('price', 32.2), ('date', (6, 11, 2007))])

In [61]:
for k, v in d.items(): # for each key and value in dict_items
    print(k, '=', v)

name = AA
shares = 75
price = 32.2
date = (6, 11, 2007)


If you have tuples such as items, you can create a dictionary using the dict() function. Try it:

In [62]:
items

dict_items([('name', 'AA'), ('shares', 75), ('price', 32.2), ('date', (6, 11, 2007))])

In [63]:
d = dict(items)
d

{'name': 'AA', 'shares': 75, 'price': 32.2, 'date': (6, 11, 2007)}
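
The same idea can turn a raw CSV row into a dictionary. The sketch below assumes the header and row values shown earlier in the exercises and pairs them up with the built-in `zip()` function, which this section does not otherwise cover. The names `pairs` and `record` are illustrative.

```python
headers = ['name', 'shares', 'price']
row = ['AA', '100', '32.20']

pairs = list(zip(headers, row))    # [('name', 'AA'), ('shares', '100'), ('price', '32.20')]
record = dict(pairs)               # build a dictionary from (key, value) tuples

# The values are still strings, so convert the numeric columns.
record['shares'] = int(record['shares'])
record['price'] = float(record['price'])

print(record)                      # {'name': 'AA', 'shares': 100, 'price': 32.2}
```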