# Week 2 - Python Data Types and Structures

Python has a wide set of built-in data types and data structures to enable a rich programming environment.

In general, data types can be split into three categories:
<ul>
<li> Immutable non-collections - basic types</li>
<li> Immutable collections</li>
<li> Mutable collections</li>
</ul>

## 'Basic' Types

In [2]:
import sys
import math

Basic types include: bool, int, float, complex, Decimal and str (a string is really an immutable sequence of characters). Python 3 has no separate long type.

<b>Int</b> - whole-number (non-floating-point) numerical variables. In Python 3 an int has arbitrary precision, so its size is limited by available memory rather than by the platform word size. Since Python 3.0, the separate int and long types of Python 2 have been unified into a single int type. What is the default value?

<b>Bool</b> - boolean variables holding True or False. What is the default value?

<b>Float</b> - floating point numeric variables.

<b>Decimal</b> - strictly speaking not a 'built-in' (you need to import the decimal module). Allows for 'exact' decimal floating point representation with controllable precision.

<b>Complex</b> - complex numbers, created with complex(real, imag).

<b>String</b> - built-in class (str) that holds an immutable sequence of characters, with a very rich set of built-in methods.
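As a quick check of the "default value" questions above, the sketch below calls each basic type with no arguments. The dictionary name `defaults` is only illustrative.

```python
# Calling a basic type with no arguments returns its "zero" value.
defaults = {
    "int": int(),
    "bool": bool(),
    "float": float(),
    "complex": complex(),
    "str": str(),
}

for name, value in defaults.items():
    print("{0:8} -> {1!r}".format(name, value))
```

Each of these default values is also "falsy", which is why `if not value:` treats 0, 0.0, 0j, False and '' alike.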


<i>Further Reading</i>:

https://docs.python.org/3.1/whatsnew/3.0.html#integers

https://docs.python.org/2/library/numbers.html

## Some Examples

In [3]:
i = int()
i=3
i = int(3)

print (type(i))   # Note that it is a class

<class 'int'>


In [4]:
# because int is a class, we can access certain properties and methods:
print ("Bit length: {0}".format(i.bit_length()))

print ("Numerator: {0} and Denominator: {1}".format(i.numerator, i.denominator))

Bit length: 2
Numerator: 3 and Denominator: 1


In [4]:
sys.float_info

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)

In [5]:
sys.maxsize

9223372036854775807
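Note that `sys.maxsize` is not the largest possible int; it is the largest value a container index can take on this platform. A minimal sketch (the variable names `big` and `huge` are illustrative) showing that ints simply keep growing:

```python
import sys

big = sys.maxsize + 1      # no overflow: the int just uses more memory
huge = 2 ** 200            # far beyond any 64-bit range

print(type(big), big > sys.maxsize)
print(huge.bit_length())   # number of bits needed to represent 2**200
```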

In [5]:
math.pow(10,308)

1e+308

In [6]:
math.pow(10,309)  # what is going to happen?

OverflowError: math range error

In [7]:
math.pow(10,-4440) # what is going to happen?

0.0
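The two cells above show that `math.pow` raises on overflow but quietly returns 0.0 on underflow. Plain float arithmetic behaves a little differently, as this small sketch suggests (the names `overflowed`, `underflowed` and `pow_raised` are illustrative):

```python
import math

overflowed = 1e308 * 10          # float multiplication overflows to infinity
print(overflowed, math.isinf(overflowed))

pow_raised = False
try:
    math.pow(10, 309)            # math.pow raises instead of returning inf
except OverflowError:
    pow_raised = True
print("math.pow raised OverflowError:", pow_raised)

underflowed = 1e-308 / 1e100     # too small to represent, becomes 0.0
print(underflowed)
```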

In [8]:
from decimal import Decimal, getcontext

a = 1.1
b = 2.2

print (a+b)

3.3000000000000003
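Because of this rounding error, floats are usually compared with a tolerance rather than with `==`. One common option is `math.isclose`, sketched here with the same values:

```python
import math

a = 1.1
b = 2.2

print(a + b == 3.3)               # exact comparison fails
print(math.isclose(a + b, 3.3))   # comparison within a relative tolerance
```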


In [10]:
d1 = Decimal(1.1)

d2 = Decimal(2.2)

print (d1+d2)

3.300000000000000266453525910


In [11]:
getcontext().prec = 16   # we can control the precision of Decimal arithmetic results
print (d1+d2)

3.300000000000000
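The extra digits above come from building each Decimal out of a float, which already carries the binary rounding error. A sketch of the usual alternative, constructing from strings (the names `from_float` and `from_text` are illustrative):

```python
from decimal import Decimal

from_float = Decimal(1.1) + Decimal(2.2)      # inherits the float's binary error
from_text = Decimal("1.1") + Decimal("2.2")   # exact decimal values

print(from_float)
print(from_text)
print(from_text == Decimal("3.3"))
```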


In [9]:
s = "strings"

In [14]:
s.count('s')

2

In [15]:
s.find('g')

5

In [16]:
s.index('H')

ValueError: substring not found
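`find` and `index` differ only in how they report a missing substring. A short sketch of the difference, plus the `in` operator for a plain membership test:

```python
s = "strings"

position = s.find("H")         # find returns -1 when nothing matches
print(position)

try:
    s.index("H")               # index raises ValueError instead
    index_raised = False
except ValueError:
    index_raised = True
print(index_raised)

print("H" in s)                # membership test, no position needed
```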

In [10]:
s1 = 'ABC'
s2 = 'ABC'

In [11]:
s1 == s2

True

In [18]:
s1 is s2

True

In [12]:
id(s1)

3022375335392

In [13]:
id(s2)

3022375335392
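The identical ids above are a result of the interpreter reusing (interning) equal string literals, which is an implementation detail. The sketch below builds an equal string at run time; whether `is` returns True there is not guaranteed, so value comparisons should use `==`.

```python
s1 = "ABC"
s2 = "".join(["A", "B", "C"])   # same characters, built at run time

print(s1 == s2)   # value comparison: always True here
print(s1 is s2)   # identity comparison: implementation dependent, do not rely on it
```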

In [19]:
s2=s1

In [22]:
s1='ABC'
s2='ABC'

## Most Useful String Functions To Remember

In [23]:
## Join is a bit weird: the string it is called on is inserted between the items of its argument

s1.join(s2)

'AABCBABCC'

In [23]:
# example how to interlace a character with each character of a word
#'character'.join('word')
'_'.join('coo')

'c_o_o'
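In practice `join` is most often called with a list of strings rather than a single word. A small sketch (the names `words`, `joined` and `csv_line` are illustrative):

```python
words = ["each", "split", "by", "itself"]
joined = ";".join(words)                 # separator goes between the items
print(joined)

# join only accepts strings, so convert other types first
numbers = [1, 2, 3]
csv_line = ",".join(str(n) for n in numbers)
print(csv_line)
```

`split` with the same separator reverses the operation: `joined.split(";")` gives back the original list.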

In [24]:
#Concatenate with a +

"left_part"+" right_part"

'left_part right_part'

In [25]:
## Split

"each;split;by;itself".split(";")

['each', 'split', 'by', 'itself']

In [26]:
## Upper and Lower

"upper".upper()

'UPPER'

In [27]:
## Find

"VERY LONG WORD".find("WORD")

10

In [28]:
e = "[1+2+3]"
eval(e)                      # careful: eval executes arbitrary code, never use it on untrusted input

[6]
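When the string only needs to hold a literal (a list, dict, number and so on), `ast.literal_eval` from the standard library is a safer option, because it refuses anything that is not a literal. A minimal sketch:

```python
import ast

values = ast.literal_eval("[1, 2, 3]")   # parses literals only
print(values, sum(values))

rejected = False
try:
    ast.literal_eval("__import__('os').getcwd()")   # a function call, not a literal
except ValueError:
    rejected = True
print("call rejected:", rejected)
```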

In [None]:
# s1.len() or len(s1) - which is correct?
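A quick way to settle the question is to try both. The sketch below checks for a `len` attribute without triggering an error:

```python
s1 = "ABC"

print(len(s1))              # the built-in function works on any sized object
print(hasattr(s1, "len"))   # str has no .len() method
print(s1.__len__())         # len() calls this special method behind the scenes
```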

## Built-in Collections

Built-in collections include: lists, tuples, sets, dictionaries and frozensets.

### Lists - Very Useful - Mutable

In [29]:
l = list()
l = []
l = [1,'a', 4.567]
print (id(l))

l.append('two')
print(id(l))

2889549710600
2889549710600
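Because a list is modified in place (same id before and after `append`), every name bound to it sees the change. A sketch of the difference between an alias and a copy (the names `original`, `alias` and `shallow` are illustrative):

```python
original = [1, 'a', 4.567]
alias = original             # second name for the same list object
shallow = original.copy()    # new list with the same items

alias.append('two')

print(original)              # changed through the alias
print(shallow)               # unaffected
print(alias is original, shallow is original)
```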


### Tuples - are collections - Immutable

In [30]:
t = tuple()
t = ()
t = (1,2)
t = (1,2,'3')   # tuples may mix types (heterogeneous)

In [31]:
print (type(t))

<class 'tuple'>


In [32]:
t = 1,2,3  # Legal or not? 

In [25]:
t = (1)    # Legal or not?
print (type(t))

<class 'int'>


In [26]:
t = (1,)  # Legal or not?
print (type(t))

<class 'tuple'>


In [27]:
t2 = (1, [3,4,5])

In [28]:
print (id(t2))

3022405995976


In [29]:
t2[1].append(6)  

print (t2)      # We have modified the immutable!

(1, [3, 4, 5, 6])


In [30]:
print(id(t2))   # not really: the tuple is the same object, only the list inside it changed. Be careful!

3022405995976
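What the tuple actually protects is its own slots, not the objects stored in them. The sketch below shows that rebinding a slot fails, and that a tuple holding a list cannot be hashed (so it cannot be a dict key or set member). The flag names are illustrative.

```python
t2 = (1, [3, 4, 5])

rebind_failed = False
try:
    t2[1] = [9]              # replacing a slot is not allowed
except TypeError:
    rebind_failed = True

hashable = True
try:
    hash(t2)                 # fails because the inner list is mutable
except TypeError:
    hashable = False

print(rebind_failed, hashable)
print(isinstance(hash((1, (3, 4, 5))), int))   # all-immutable tuple hashes fine
```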


### Dicts - Also Very Useful and Very Fast - Mutable

In [31]:
d = dict()
d = {}
d = {'k1':1, 'k2':2}
print (id(d))

3022416391912


In [33]:
d['k3']=12

print (id(d))

3022416391912


In [34]:
# accessing elements
print (d.get('k3'))
print (d['k3'])

12
12


In [35]:
d['k4']   # results in an error because there is no such key

KeyError: 'k4'

In [42]:
# so what to do?
# one option is to use setdefault (note: it also inserts the key with that default)
d.setdefault('k4', 'not available') 

# another option is to use defaultdict from collections
from collections import defaultdict

dd = defaultdict(int)   # default_factory is int, so a missing key gets int() == 0
dd['k1']=5
print (dd['k1'])
print (dd['k2'])

5
0
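The options above differ in whether they modify the dictionary. A short sketch comparing `get` with a default, `setdefault`, and `defaultdict` (the key names are illustrative):

```python
from collections import defaultdict

d = {'k1': 1, 'k2': 2}

value_get = d.get('k4', 'not available')    # returns the default, does not insert
inserted_by_get = 'k4' in d

value_set = d.setdefault('k4', 'not available')   # returns the default and inserts it
inserted_by_setdefault = 'k4' in d

dd = defaultdict(int)
dd['missing']                               # merely reading a key inserts it

print(value_get, inserted_by_get)
print(value_set, inserted_by_setdefault)
print(dict(dd))
```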


### Sets - Super Fast - Mutable

In [43]:
s = set([1,2,3])
s = {1,2,3}

In [45]:
print (id(s))

2889549916424


In [44]:
s.add(25)
s.add(25)

print (s)
print (id(s))

{1, 2, 3, 25}
2889549916424
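Besides discarding duplicates, sets support the usual set algebra and fast membership tests. A minimal sketch (the set contents are illustrative; `sorted` is used only to get a stable print order):

```python
a = {1, 2, 3, 25}
b = {3, 25, 40}

union = a | b          # items in either set
common = a & b         # items in both sets
only_a = a - b         # items in a but not in b

print(sorted(union))
print(sorted(common))
print(sorted(only_a))
print(25 in a)         # membership test
```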


In [36]:
# WATCH OUT!
set = {5,6,7}

In [43]:
another_set = set([8,9,0])   # Why error? The name set now refers to our set object, not the built-in type. Fix with `del set` or `from builtins import set`
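The sketch below reproduces the problem inside a function, so the shadowing stays local and does not affect the rest of the session. The function name `shadowing_demo` is hypothetical; `builtins.set` is the original type.

```python
import builtins

def shadowing_demo():
    # Inside this function the name `set` is rebound to a set object.
    set = {5, 6, 7}
    try:
        set([8, 9, 0])               # calls the object, not the type
    except TypeError as exc:
        message = str(exc)
    # The original type is still reachable through the builtins module.
    recovered = builtins.set([8, 9, 0])
    return message, recovered

message, recovered = shadowing_demo()
print(message)
print(sorted(recovered))
```

The same applies to other built-in names such as `list`, `dict`, `str` and `id`, so it is safest not to use them as variable names.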

In [44]:
fs = frozenset([6,7,8, 8])      # frozensets are immutable - fs.add does not exist
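Being immutable makes a frozenset hashable, so unlike a regular set it can serve as a dictionary key. A small sketch (the names `labels` and `extended` are illustrative):

```python
fs = frozenset([6, 7, 8, 8])

print(len(fs))                 # duplicates are dropped, as with set
print(hasattr(fs, "add"))      # no mutating methods

labels = {fs: "frozen"}        # usable as a dict key
print(labels[frozenset([8, 7, 6])])

extended = fs | {9}            # operators return a new frozenset
print(sorted(extended), sorted(fs))
```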

### What Else Is There?

In [45]:
import builtins

dir(builtins)  # check methods out at https://docs.python.org/2/library/functions.html

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecode

### Homework

Write a function that accepts a string and returns a dictionary whose keys are the characters of the string and whose values are the number of times each character appears in the string (exclude spaces). Preferably, implement a brute-force approach.

In [65]:
## Looping through a string...

s = "this is a sentence"
for c in s.replace(' ', ''):
    print(c)

t
h
i
s
i
s
a
s
e
n
t
e
n
c
e
