# 3 Built-In Data Structures, Functions and Files

## 3.1 Data Structures and Sequences

### 1-Tuple
A tuple is a fixed-length, immutable sequence of Python objects which, once assigned, cannot be changed. The easiest way to create one is with a comma-separated sequence of values wrapped in parentheses:


In [234]:
tup = (4,5,6)
tup

(4, 5, 6)

In [235]:
tup = 4,5,6
tup

(4, 5, 6)

In [236]:
tuple([2,3,4,5,6])


(2, 3, 4, 5, 6)

In [237]:
tup = tuple("String")
tup

('S', 't', 'r', 'i', 'n', 'g')

In [238]:
tup[0]

'S'

In [239]:
nested_tup = (4,5,6), (7, 8)
nested_tup

((4, 5, 6), (7, 8))

In [240]:
print(nested_tup[0])
print(nested_tup[1])


(4, 5, 6)
(7, 8)


In [241]:
tup = tuple(["foo", [1,2], True])
# tup[2] = False  -> raises TypeError because tuples are immutable
tup[1].append(3)  # a mutable object stored inside the tuple can still be modified in place
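
The cell above produces no output, so here is a minimal sketch that makes both behaviors visible. The names `demo` and `error_message` are illustrative and not part of the original notebook.

```python
demo = tuple(["foo", [1, 2], True])

# Reassigning a slot of an immutable tuple raises TypeError.
try:
    demo[2] = False
except TypeError as exc:
    error_message = str(exc)

# The list stored at index 1 is itself mutable, so it can grow in place.
demo[1].append(3)

print(error_message)
print(demo)  # ('foo', [1, 2, 3], True)
```

The tuple still refers to the same three objects; only the contents of the inner list changed.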

In [242]:
type((4, None, 'foo') + (6, 0 ) + ("bar", ))

tuple

In [243]:
("foo", "bar") * 4

('foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar')

#### Unpacking tuples

In [244]:
tup = (4,5,6)
a, b, c = tup

b

5

In [245]:
tup = (4,5, (6,7))
a, b, (c, d) = tup
d

7

In [246]:
# Swap two values using a temporary variable
a, b = 3, 4
print(a , b)
tmp = a
a = b
b = tmp
print(a, b)

3 4
4 3


In [247]:
# In Python, the same swap can be written with tuple unpacking, without a temporary variable
a, b = 1,2
print(a, b)
b, a = a, b
print(a,b)

1 2
2 1


In [248]:
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
for a, b, c in seq:
    print(f"a={a}, b={b}, c={c}")


a=1, b=2, c=3
a=4, b=5, c=6
a=7, b=8, c=9


In [249]:
values = 1, 2, 3, 4, 5
a, b, *rest = values
print(a)
print(b)
rest

1
2


[3, 4, 5]

This ``rest`` bit is sometimes something you want to discard; there is nothing special about the ``rest`` name. As a matter of convention, many Python programmers will use the underscore (_) for unwanted variables:

In [250]:
a, b, *_ = values
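
As a small sketch of the same idea, the starred name does not have to come last. The names `first`, `middle`, `last`, and `head` below are illustrative.

```python
values = 1, 2, 3, 4, 5

# The starred target can sit anywhere in the list of names;
# it always collects the leftover items into a list.
first, *middle, last = values
print(first, middle, last)  # 1 [2, 3, 4] 5

# Using `_` signals that the collected items are intentionally ignored.
head, *_ = values
print(head)  # 1
```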

#### Tuple methods
Since the size and contents of a tuple cannot be modified, it is very light on instance methods. A particularly useful one (also available on lists) is ``count``, which counts the number of occurrences of a value:

In [251]:
a = (1,2,2,2,3,4,2)
a.count(2)

4

In [252]:
a.index(2) # Returns the index of the first occurrence of 2

1

### 2-List
In contrast with tuples, lists are variable length and mutable: their contents can be modified in place. You can define them using square brackets ``[]`` or using the ``list`` type function:



In [253]:
a_list = [2,3,7,None]
tup = ("foo", "bar", "baz")
b_list = list(tup)
b_list

['foo', 'bar', 'baz']

In [254]:
b_list[1] = "peekaboo"
b_list

['foo', 'peekaboo', 'baz']

In [255]:
gen = range(10)
gen

range(0, 10)

In [256]:
list(gen)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [257]:
b_list.append("dwarf")
b_list

['foo', 'peekaboo', 'baz', 'dwarf']

In [258]:
b_list.insert(1, "red")
b_list

['foo', 'red', 'peekaboo', 'baz', 'dwarf']

In [259]:
b_list.pop(2)
b_list

['foo', 'red', 'baz', 'dwarf']

In [260]:
b_list.append("foo")
b_list

['foo', 'red', 'baz', 'dwarf', 'foo']

In [261]:
b_list.remove("foo")
b_list

['red', 'baz', 'dwarf', 'foo']

In [262]:
"dwarf" in b_list

True

In [263]:
"dwarf" not in b_list

False

In [264]:
[4, None, "foo"] + [7,8, (2,3)]

[4, None, 'foo', 7, 8, (2, 3)]

In [265]:
x = [4, None, "foo"]
x.extend([7,8,(2,3)])
x

[4, None, 'foo', 7, 8, (2, 3)]

In [266]:
a = [7, 2, 5, 1, 3]
a.sort()
print(a)

[1, 2, 3, 5, 7]


In [267]:
b = ["saw", "small", "He", "foxes", "six"]
b.sort(key=len)
b 


['He', 'saw', 'six', 'small', 'foxes']

#### Slicing

In [268]:
seq = [7,2,3,7,5,6,0,1]

seq[1:5]


[2, 3, 7, 5]

In [269]:
seq[3:5] = [6, 3]
seq

[7, 2, 3, 6, 3, 6, 0, 1]

In [270]:
print(seq[:5])
print(seq[3:])

[7, 2, 3, 6, 3]
[6, 3, 6, 0, 1]


In [271]:
print(seq[-4:])
seq[-6:-2]

[3, 6, 0, 1]


[3, 6, 3, 6]

In [272]:
seq[::2]

[7, 3, 3, 0]

In [273]:
seq[::-1]

[1, 0, 6, 3, 6, 3, 2, 7]
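
The slicing cells above can be summarized in a short sketch. It assumes the original, unmodified `seq`; the names `chunk` and `resized` are illustrative.

```python
seq = [7, 2, 3, 7, 5, 6, 0, 1]

# A slice seq[start:stop] includes start and excludes stop,
# so it contains stop - start elements.
start, stop = 1, 5
chunk = seq[start:stop]
print(chunk)       # [2, 3, 7, 5]
print(len(chunk))  # 4

# Slice assignment may change the list length: two elements are replaced by one.
resized = list(seq)
resized[3:5] = [6]
print(resized)     # [7, 2, 3, 6, 6, 0, 1]

# A negative step walks backwards; out-of-range bounds are clipped, not errors.
print(seq[::-2])   # [1, 6, 7, 2]
print(seq[5:100])  # [6, 0, 1]
```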

### 3-Dictionary
The dictionary or ``dict`` may be the most important built-in Python data structure. In other programming languages, dictionaries are sometimes called hash maps or associative arrays. A dictionary stores a collection of key-value pairs, where key and value are Python objects. Each key is associated with a value so that a value can be conveniently retrieved, inserted, modified, or deleted given a particular key. One approach for creating a dictionary is to use curly braces {} and colons to separate keys and values:

In [274]:
empty_dict = {}
d1 = {"a": "some value", "b": [1,2,3,4]}
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

In [275]:
d1[7] = "an integer"
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

In [276]:
d1["b"]

[1, 2, 3, 4]

In [277]:
"b" in d1

True

In [278]:
d1[5] = "some value"
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer', 5: 'some value'}

In [279]:
d1["dummy"] = "another value"

In [280]:
d1

{'a': 'some value',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 5: 'some value',
 'dummy': 'another value'}

In [281]:
del d1[5]

In [282]:
d1

{'a': 'some value',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 'dummy': 'another value'}

In [283]:
ret = d1.pop("dummy")
ret

'another value'

In [284]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

In [285]:
print(list(d1.keys()))
print(list(d1.values()))


['a', 'b', 7]
['some value', [1, 2, 3, 4], 'an integer']


In [286]:
list(d1.items())

[('a', 'some value'), ('b', [1, 2, 3, 4]), (7, 'an integer')]

In [287]:
d1.update({"b": "foo", "c": 12})
d1

{'a': 'some value', 'b': 'foo', 7: 'an integer', 'c': 12}

In [288]:
mapping = {}
for key, value in zip(d1.keys(), d1.values()):
    mapping[key] = value
mapping

{'a': 'some value', 'b': 'foo', 7: 'an integer', 'c': 12}

In [289]:
tuples = zip(range(5), reversed(range(5)))
tuples
mapping = dict(tuple(tuples))
mapping

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

In [290]:
tuple(tuples)

()
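
The empty tuple above appears because ``zip`` returns a one-shot iterator: once it has been consumed, nothing is left. A minimal sketch, with illustrative names `pairs`, `first_pass`, and `second_pass`:

```python
pairs = zip(range(3), "abc")

first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # the iterator is already exhausted

print(first_pass)   # [(0, 'a'), (1, 'b'), (2, 'c')]
print(second_pass)  # []

# To reuse the pairs, materialize them once and keep the list.
saved = list(zip(range(3), "abc"))
print(dict(saved))  # {0: 'a', 1: 'b', 2: 'c'}
print(len(saved))   # 3, the list is still intact
```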

In [291]:
words = ["apple", "bat", "bar", "atom", "book"]
by_letter = {}

for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter] = [word]
    else:
        by_letter[letter].append(word)

by_letter


{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

In [292]:
by_letter = {}
for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

In [293]:
from collections import defaultdict
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
dict(by_letter)

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

In [294]:
hash("kemal")

2610904449933465179
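
Dictionary keys must be hashable, which is what ``hash`` checks. The integer shown above for a string will generally differ between Python sessions, because string hashing is randomized by default. The helper `is_hashable` below is a hypothetical name used only for this sketch.

```python
def is_hashable(obj):
    """Return True if obj can be used as a dictionary key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable("string"))        # True
print(is_hashable((1, 2, (2, 3))))  # True
print(is_hashable((1, 2, [2, 3])))  # False: the tuple contains a list

# A list cannot be a key, but converting it to a tuple works.
d = {}
d[tuple([1, 2, 3])] = 5
print(d)  # {(1, 2, 3): 5}
```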

### Sets
A set is an unordered collection of unique elements. A set can be created in two ways: via the set function or via a set literal with curly braces:

In [295]:
set([2,2,2,1,3,3])


{1, 2, 3}

In [296]:
a = {1,2,3,4,5}
b = {3,4,5,6,7,8}

#### Union

In [297]:
print(a.union(b))
print(a | b)


{1, 2, 3, 4, 5, 6, 7, 8}
{1, 2, 3, 4, 5, 6, 7, 8}


#### Intersection

In [298]:
print(a.intersection(b))
a & b


{3, 4, 5}


{3, 4, 5}

In [299]:
#a.add(x)

In [300]:
a.clear()
a

set()

In [301]:
#a.add(x)

In [302]:
#a.remove(x)

In [303]:
a = {1,2,3,4,5}
c = a.copy()
c |= b
print(c)
d = a.copy()
d &= b
print(d)

{1, 2, 3, 4, 5, 6, 7, 8}
{3, 4, 5}
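
A few more set operations, sketched with the same `a` and `b`. The results are wrapped in ``sorted`` only to make the printed order predictable; the name `s` is illustrative.

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

# Difference: elements in a that are not in b.
print(sorted(a - b))  # [1, 2]

# Symmetric difference: elements in exactly one of the two sets.
print(sorted(a ^ b))  # [1, 2, 6, 7, 8]

# Subset test.
print({1, 2, 3}.issubset(a))  # True

# add and remove modify a set in place.
s = a.copy()
s.add(10)
s.remove(1)
print(sorted(s))  # [2, 3, 4, 5, 10]
```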


### Built-in Sequence Functions

In [304]:
index = 0
for value in a:
    index+=1

In [305]:
for index, value in enumerate(a):
    print(index ,value)

0 1
1 2
2 3
3 4
4 5


In [306]:
a = [7,1,2,6,0,3,2]
print(sorted(a))
a

[0, 1, 2, 2, 3, 6, 7]


[7, 1, 2, 6, 0, 3, 2]

In [307]:
sorted("horse race")

[' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']

In [308]:
seq1 = ["foo", "bar", "baz"]

seq2 = ["one", "two", "three"]

zipped = zip(seq1, seq2)

list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

In [309]:
seq3 = [False, True]
list(zip(seq1,seq2,seq3))


[('foo', 'one', False), ('bar', 'two', True)]

In [310]:
for index, (a, b) in enumerate(zip(seq1, seq2)):
    print(f"{index} : {a}, {b}")

0 : foo, one
1 : bar, two
2 : baz, three


In [311]:
list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In [312]:
strings = ["a", "as", "bat", "car", "dove", "python"]
[x.upper() for x in strings if len(x) > 2]


['BAT', 'CAR', 'DOVE', 'PYTHON']

In [313]:
list(map(len, strings))

[1, 2, 3, 3, 4, 6]

In [314]:
set(map(len, strings))

{1, 2, 3, 4, 6}

In [315]:
loc_mapping = {value: index for index, value in enumerate(strings)}
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

In [316]:
all_data = [["John", "Emily", "Michael", "Mary", "Steven"], ["Maria", "Juan", "Javier", "Natalia", "Pilar"]]
new_list = []
for names in all_data:
    enough_as = [name for name in names if name.count("a") >= 2]
    new_list.extend(enough_as)
print(new_list)

result = [name for names in all_data for name in names if name.count("a") >=2]
print(result)

new_list = []  # reset so this loop does not append to the earlier result
for names in all_data:
    for name in names:
        if name.count("a") >= 2:
            new_list.append(name)
print(new_list)

['Maria', 'Natalia']
['Maria', 'Natalia']
['Maria', 'Natalia']


In [317]:
some_tuples = [(1,2,3), (4,5,6), (7,8,9)]
flattened = [x for tup in some_tuples for x in tup]
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]
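
The order of the ``for`` clauses in a nested comprehension matches the order of the equivalent nested loops. A short sketch of that equivalence, plus a comprehension inside a comprehension that keeps the nesting; the names `flattened_loop` and `nested` are illustrative.

```python
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# Equivalent nested loops: the outer loop is written first, as in the comprehension.
flattened_loop = []
for tup in some_tuples:
    for x in tup:
        flattened_loop.append(x)
print(flattened_loop)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# A comprehension inside a comprehension builds a list of lists instead.
nested = [[x for x in tup] for tup in some_tuples]
print(nested)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```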

## Functions

In [318]:
def my_function(x, y):
    return x + y

In [319]:
print(my_function(1,2))

3


In [320]:
def function_without_return(x):
    print(x)
result = function_without_return(10)

10


In [321]:
print(result) # Prints None because the function has no return statement

None


In [322]:
def my_function2(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x+y)


In [323]:
my_function2(5, 6) # z defaults to 1.5, which is greater than 1, so the result is 1.5 * (5 + 6)

16.5

### Namespaces, Scope, Local Functions

In [326]:
def func():
    a = []
    for i in range(5):
        a.append(i)
# The list `a` inside func is local and is discarded when func returns.
# Evaluating `a` here would look up a global `a` instead; if none exists, it raises NameError.

In [327]:
a = []
def func():
    for i in range(5):
        a.append(i)
# Here `a` is global. func mutates it in place, so no `global` statement is needed.
func()
func()
a

[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
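
Mutating a global object and rebinding a global name follow different rules. The sketch below shows the difference; all names in it (`counter`, `items`, and the three functions) are illustrative.

```python
counter = 0
items = []

def mutate_global_list():
    # Calling a method on the global list needs no declaration.
    items.append("x")

def rebind_without_global():
    # Assignment creates a new local name; the global counter is untouched.
    counter = 100
    return counter

def rebind_with_global():
    # `global` is required to rebind the module-level name.
    global counter
    counter += 1

mutate_global_list()
local_value = rebind_without_global()
rebind_with_global()

print(items, local_value, counter)  # ['x'] 100 1
```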

In [329]:
def f():
    a, b, c = 5, 6, 7
    return a, b, c
tup = f()
print(type(tup), tup)


<class 'tuple'> (5, 6, 7)


In [332]:
def g():
    tup = 10, 20, 30
    key = "a", "b", "c"
    return dict(zip(key, tup))
g()

{'a': 10, 'b': 20, 'c': 30}

### Functions are objects
Since Python functions are objects, many constructs can be easily expressed that are difficult to do in other languages. Suppose we were doing some data cleaning and needed to apply a bunch of transformations to the following list of strings:

In [333]:
states = ["   Alabama ", "Georgia!", "Georgia", "georgia", "FlOrIda", "south   carolina##", "West virginia?"]
#Cleaning data
import re

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub("[!?#]", "", value)
        value = value.title()
        result.append(value)
    return result

clean_strings(states)



['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South   Carolina',
 'West Virginia']

In [336]:
# Alternative approach: move the punctuation removal into its own function
states = ["   Alabama ", "Georgia!", "Georgia", "georgia", "FlOrIda", "south   carolina##", "West virginia?"]

def remove_punctuation(value):
    return re.sub("[!#?]", "", value)

def clean_strings2(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = remove_punctuation(value)  # strings are immutable, so keep the returned value
        value = value.title()
        result.append(value)
    return result
clean_strings2(states)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South   Carolina',
 'West Virginia']
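
Because functions are objects, the cleaning steps can also be stored in a list and applied in a loop. This is a sketch of that pattern, not the notebook's own code; `clean_ops` and `apply_ops` are hypothetical names, and the input list is shortened.

```python
import re

def remove_punctuation(value):
    return re.sub("[!#?]", "", value)

# Built-in string methods and user-defined functions can sit in the same list.
clean_ops = [str.strip, remove_punctuation, str.title]

def apply_ops(strings, ops):
    """Apply each function in ops, in order, to every string."""
    result = []
    for value in strings:
        for func in ops:
            value = func(value)
        result.append(value)
    return result

sample = ["   Alabama ", "Georgia!", "south   carolina##"]
print(apply_ops(sample, clean_ops))  # ['Alabama', 'Georgia', 'South   Carolina']
```

Adding or reordering a cleaning step then only means editing the list, not the loop.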

## Anonymous (Lambda) Functions

In [344]:
def short_function(x):
    return x * 2
equiv_anon = lambda x: x * 2


In [345]:
def apply_to_list(some_list, f):
    return [f(x) for x in some_list]
ints = [4,0,1,5,6]
apply_to_list(ints, lambda x : x * 2)

[8, 0, 2, 10, 12]

In [351]:
strings = ["foo", "card", "bar", "aaaa", "abab"]
strings.sort(key=lambda x : len(set(x)))
strings


['aaaa', 'foo', 'abab', 'bar', 'card']

## Generators
A generator is a convenient way, similar to writing a normal function, to construct a new iterable object. Whereas normal functions execute and return a single result at a time, generators can return a sequence of multiple values by pausing and resuming execution each time the generator is used. To create a generator, use the ``yield`` keyword instead of ``return`` in a function:

In [358]:
some_dict = {"a": 1, "b": 2, "c": 3}
print(iter(some_dict))
dict_iterator = iter(some_dict)
print(list(dict_iterator))
for key in some_dict:
    print(key)

<dict_keyiterator object at 0x1106edc60>
['a', 'b', 'c']
a
b
c


In [359]:
def squares(n = 10):
    print(f"Generating squares from 1 to {n**2}")
    for i in range(1, n + 1):
        yield i ** 2


In [360]:
gen = squares(n = 10)
gen

<generator object squares at 0x1078599a0>

In [362]:
list(gen)

Generating squares from 1 to 100


[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
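
The "Generating squares" message above only appeared once ``list`` started consuming the generator. A minimal sketch of that lazy, pause-and-resume behavior using ``next``; `countdown` and `gen_demo` are illustrative names.

```python
def countdown(n):
    print(f"Starting countdown from {n}")
    while n > 0:
        yield n  # execution pauses here until the next value is requested
        n -= 1

gen_demo = countdown(3)     # nothing is printed yet: the body has not started
first = next(gen_demo)      # runs up to the first yield and prints the message
remaining = list(gen_demo)  # resumes after the yield and collects the rest

print(first)      # 3
print(remaining)  # [2, 1]

# An exhausted generator yields nothing more; next() accepts a default for that case.
print(next(gen_demo, "done"))  # done
```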

In [367]:
gen2 = (x ** 2 for x in range(10))
list(gen2)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [371]:
def _make_gen():
    for x in range(100):
        yield x ** 2
gen3 = _make_gen()

In [373]:
gen3

<generator object _make_gen at 0x1104b4040>

In [374]:
sum(x ** 2 for x in range(100))


328350

In [375]:
dict((i, i**2) for i in range(5))

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

In [376]:
import itertools

def first_letter(x):
    return x[0]
names = ["Alan", "Adam", "Wes", "Will", "Albert","Steven"]
for letter, group in itertools.groupby(names, first_letter):
    print(letter, list(group))  # group is an iterator, so materialize it to print

A ['Alan', 'Adam']
W ['Wes', 'Will']
A ['Albert']
S ['Steven']
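
In the output above, "A" appears twice because ``itertools.groupby`` only groups consecutive items with the same key. A sketch of one common workaround, sorting by the same key first; the name `grouped` is illustrative.

```python
import itertools

def first_letter(x):
    return x[0]

names = ["Alan", "Adam", "Wes", "Will", "Albert", "Steven"]

# Sorting by the grouping key makes equal keys adjacent.
# sorted() is stable, so names keep their original order within each group.
ordered = sorted(names, key=first_letter)

grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(ordered, key=first_letter)
}
print(grouped)
# {'A': ['Alan', 'Adam', 'Albert'], 'S': ['Steven'], 'W': ['Wes', 'Will']}
```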


In [377]:
float("1.2345")


1.2345

In [378]:
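# float("something") raises ValueError, so the conversion is wrapped in try/except
def attempt_float(x):
    try:
        return float(x)
    except ValueError:  # catch only the failed-conversion error, not every exception
        return x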
attempt_float("1231.123")

1231.123

In [379]:
attempt_float("aksjfkaj")

'aksjfkaj'

In [None]:
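path = "tmp.txt"  # hypothetical output file name
f = open(path, mode="w")

try:
    write_to_file(f)  # write_to_file is assumed to be defined elsewhere
except Exception:
    pass  # any error raised while writing is silently ignored
else:  # runs only if the try block raised no exception
    print("Success")
finally:  # always runs, whether or not an exception was raised
    f.close()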
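
To see which branches run in each case, here is a self-contained sketch that records the order of execution. The function `write_report` and the file name are hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

def write_report(path, text):
    """Write text to path and return the names of the branches that ran."""
    steps = []
    f = open(path, mode="w", encoding="utf-8")
    try:
        f.write(text)  # raises TypeError if text is not a str
    except TypeError:
        steps.append("except")
    else:
        steps.append("else")
    finally:
        f.close()
        steps.append("finally")
    return steps

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "report.txt")
    ok_steps = write_report(target, "hello")
    bad_steps = write_report(target, 123)

print(ok_steps)   # ['else', 'finally']
print(bad_steps)  # ['except', 'finally']
```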

## Files and Input Output Operations

In [402]:

path = "examples/segismundo.txt"

f = open(path, encoding="utf-8")

In [None]:
for line in f:
    print(line)

In [407]:
lines = [x.rstrip() for x in open(path, encoding="utf-8")]


In [410]:
valid_line = [valid for valid in lines if len(valid) > 0]
valid_line

['Qué es lindo que es caminar,',
 'bien tomados de la mano,',
 'pore barrio, por la plaza,',
 '`qué sé yo?, por todos lagos.',
 'Bu, şu anda gönderilmiş  olan plazanın bankında,',
 "manodaki tomados'un bir yansımasıdır  .",
 'Bu, dünyanın iki domatesinin bir yansımasıdır',
 ';',
 'en nuestros ojos, volando,',
 'dos pájaros flejados.',
 'İşte bu yüzden adamların',
 'arasında bir tur atıldı;',
 '¡Qué lindo, andar por la vida',
 'de la mano bien tomados!',
 'Ne güzeldir',
 'el ele tutuşup',
 'mahallede, meydanda,',
 'Ne bileyim her yerde yürümek.',
 'Oturduğumuz meydandaki banktan,',
 'el ele tutuşmuş ağaçlara bakmak ne güzel .',
 'El ele tutuşmuş gökyüzüne bakmak ne güzel ;',
 'gözümüzde uçan,',
 'yansıyan iki kuş.',
 'El ele tutuşup yürümek ne güzeldir ; el ele tutuşup',
 'hayatta yürümek ne güzeldir !']

When you use ``open`` to create file objects, it is recommended to close the file when you are finished with it. Closing the file releases its resources back to the operating system:



In [None]:
f.close()  # release the file handle opened earlier

In [412]:
with open(path, encoding="utf-8") as f:
    lines = [x.rstrip() for x in f]
lines

['Qué es lindo que es caminar,',
 'bien tomados de la mano,',
 'pore barrio, por la plaza,',
 '`qué sé yo?, por todos lagos.',
 '',
 'Bu, şu anda gönderilmiş  olan plazanın bankında,',
 "manodaki tomados'un bir yansımasıdır  .",
 '',
 '',
 '',
 'Bu, dünyanın iki domatesinin bir yansımasıdır',
 ';',
 'en nuestros ojos, volando,',
 'dos pájaros flejados.',
 '',
 'İşte bu yüzden adamların',
 'arasında bir tur atıldı;',
 '¡Qué lindo, andar por la vida',
 'de la mano bien tomados!',
 '',
 'Ne güzeldir',
 'el ele tutuşup',
 'mahallede, meydanda,',
 'Ne bileyim her yerde yürümek.',
 '',
 'Oturduğumuz meydandaki banktan,',
 'el ele tutuşmuş ağaçlara bakmak ne güzel .',
 '',
 '',
 '',
 '',
 'El ele tutuşmuş gökyüzüne bakmak ne güzel ;',
 'gözümüzde uçan,',
 'yansıyan iki kuş.',
 '',
 '',
 'El ele tutuşup yürümek ne güzeldir ; el ele tutuşup',
 'hayatta yürümek ne güzeldir !']
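
The ``with`` statement closes the file automatically when the block exits, even if an error occurs inside it. A self-contained sketch that writes a small file and reads it back; the file name and contents are made up for illustration, and a temporary directory keeps the example from touching `examples/segismundo.txt`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.txt")

    # Write three lines, one of them empty.
    with open(sample_path, mode="w", encoding="utf-8") as out_file:
        out_file.writelines(["first line\n", "\n", "second line\n"])

    # Read the lines back, stripping the trailing newline from each.
    with open(sample_path, encoding="utf-8") as in_file:
        sample_lines = [line.rstrip() for line in in_file]

# Both file objects are already closed here, without calling close().
print(in_file.closed)  # True
print(sample_lines)    # ['first line', '', 'second line']

# Drop the empty lines, as in the valid_line cell above.
non_empty = [line for line in sample_lines if line]
print(non_empty)       # ['first line', 'second line']
```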