
# Intro to Python Data Structures
Strings, Lists, Tuples, Sets, Dicts  
Created in Python 3.7  
© Sani Kamal, 2019.

## Sequences: String, List, Tuple
****
[Documentation](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range)  
**indexing** - access any item in the sequence using its index.  
Indexing starts with 0 for the first element.

In [1]:
# string
x = 'horse'
print(x[3])

# list
x = ['mango', 'apple', 'banana']
print(x[1])

# tuple
x = ('sani', 'rashmi', 'priya', 'neelam')
print(x[0])

s
apple
sani


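**slicing** - slice out substrings, sublists, subtuples using indexes.  
`[start : stop : step]` - `start` is included, `stop` is excluded (one past the last index taken), and `step` defaults to 1.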

In [3]:
x = 'tensorflow'
print(x[1:4])
print(x[1:6:2])
print(x[3:])
print(x[:5])
print(x[-1])
print(x[-3:])
print(x[:-2])

ens
esr
sorflow
tenso
w
low
tensorfl
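
A small sketch of how the slice bounds behave: the second index is one past the last item taken, so `x[a:b]` has `b - a` items, and a negative step walks backwards. The variable name `part` is illustrative only.

```python
x = 'tensorflow'

# The second bound is exclusive: x[1:4] takes indexes 1, 2 and 3.
part = x[1:4]
print(part, len(part))        # ens 3

# Splitting at any index gives two pieces that rebuild the original.
print(x[:6] + x[6:] == x)     # True

# A negative step walks backwards; [::-1] reverses the sequence.
print(x[::-1])                # wolfrosnet

# Out-of-range slice bounds are clipped instead of raising IndexError.
print(x[5:100])               # rflow
```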


**adding / concatenating** - combine 2 sequences of the same type by using `+`

In [4]:
# string
x = 'tensorflow' + 'google'
print(x)

# list
y = ['abc', 'def'] + ['xyz']
print(y)

# tuple
z = ('Mina', 'Priya', 'Rose') + ('Neelam',)
print(z)

tensorflowgoogle
['abc', 'def', 'xyz']
('Mina', 'Priya', 'Rose', 'Neelam')


**multiplying** - repeat a sequence by multiplying it by an integer with `*`

In [5]:
# string
x = 'finder' * 3
print(x)

# list
y = [18, 51] * 3
print(y)

# tuple
z = (21, 14) * 3
print(z)

finderfinderfinder
[18, 51, 18, 51, 18, 51]
(21, 14, 21, 14, 21, 14)


**checking membership** - test whether an item is or is not in a sequence.

In [6]:
# string
x = 'bug'
print('u' in x)

# list
y = ['pig', 'cow', 'horse']
print('cow' not in y)

# tuple
z = ('sani', 'john', 'priya', 'mala')
print('sani' in z)

True
False
True


**iterating** - iterating through the items in a sequence

In [7]:
# item
x = [7, 8, 3, 4, 9]
for item in x:
    print(item)
    
# index & item
y = [7, 8, 3]
for index, item in enumerate(y):
    print(index, item)

7
8
3
4
9
0 7
1 8
2 3


**number of items** - count the number of items in a sequence

In [8]:
# string
x = 'string'
print(len(x))

# list
y = ['pig', 'cow', 'horse']
print(len(y))

# tuple
z = ('sani', 'rose', 'jesmine', 'nitul')
print(len(z))

6
3
4


**minimum** - find the smallest item in a sequence.  
Strings compare lexicographically and numbers compare numerically, but the two cannot be mixed in one sequence.

In [10]:
# string
x = 'finder'
print(min(x))

# list
y = ['mango', 'apple', 'banana']
print(min(y))

# tuple
z = ('sani', 'rose', 'jesmine', 'nitul')
print(min(z))

d
apple
jesmine


**maximum** - find the largest item in a sequence.  
Strings compare lexicographically and numbers compare numerically, but the two cannot be mixed in one sequence.

In [11]:
# string
x = 'string'
print(max(x))

# list
y = ['mango', 'apple', 'banana']
print(max(y))

# tuple
z = ('sani', 'rose', 'jesmine', 'nitul')
print(max(z))

t
mango
sani
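
To illustrate the notes on `min` and `max`, here is a small sketch. String comparison goes by Unicode code point, so uppercase letters sort before lowercase ones, and a sequence that mixes strings and numbers raises `TypeError`. The sample values are made up for the illustration.

```python
words = ['mango', 'Banana', 'apple']

# Uppercase letters have smaller code points than lowercase ones.
print(min(words))                   # Banana
print(max(words))                   # mango

# Like sorted(), min() and max() take a key function.
print(min(words, key=str.lower))    # apple

# Strings and numbers cannot be compared with each other.
mixed = [5, 7, 'bug']
try:
    min(mixed)
except TypeError as err:
    print('TypeError:', err)
```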


**sum** - find the sum of items in a sequence.  
Entire sequence must be numeric.

In [12]:
# list containing a string -> error
# x = [5, 7, 'bug']
# print(sum(x))    # generates an error

# list
y = [2, 5, 8, 12, 89, 88]
print(sum(y))
print(sum(y[-2:]))

# tuple
z = (50, 4, 7, 19, 78, 45)
print(sum(z))

204
177
203


**sorting** - `sorted()` returns a new list of the items in sorted order, whatever the type of the input sequence.  
Does not change the original sequence.

In [13]:
# string
x = 'tensorflow'
print(sorted(x))

# list
y = ['mango', 'apple', 'banana']
print(sorted(y))

# tuple
z = ('sani', 'rose', 'jesmine', 'nitul')
print(sorted(z))

['e', 'f', 'l', 'n', 'o', 'o', 'r', 's', 't', 'w']
['apple', 'banana', 'mango']
['jesmine', 'nitul', 'rose', 'sani']


**sorting** - sort by second letter  
Pass a lambda function as the `key` parameter to return the second character of each item.  
(*key* is the defined parameter name; *k* is an arbitrary variable name.)

In [14]:
z = ('sani', 'rose', 'jesmine', 'nitul')
print(sorted(z, key=lambda k: k[1]))

['sani', 'jesmine', 'nitul', 'rose']


**count(item)** - returns the number of times an item occurs

In [15]:
# string
x = 'orange'
print(x.count('p'))

# list
y = ['mango', 'apple', 'banana']
print(y.count('mango'))

# tuple
z = ('sani', 'rose', 'jesmine', 'nitul')
print(z.count('sani'))

0
1
1


**index(item)** - returns the index of the first occurrence of an item. Raises `ValueError` if the item is not present.

In [16]:
# string
x = 'apple'
print(x.index('p'))

# list
y = ['mango', 'apple', 'banana']
print(y.index('apple'))

# tuple
z = ('sani', 'rose', 'jesmine', 'nitul')
print(z.index('nitul'))

1
1
3


**unpacking** - unpack the n items of a sequence into n variables

In [17]:
x = ['mango', 'apple', 'banana']
a, b, c = x
print(a, b, c)

mango apple banana
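
Unpacking needs exactly as many names as there are items. A minimal sketch of what happens when the counts differ, and of swapping two names, which relies on the same mechanism. The names used here are illustrative.

```python
x = ['mango', 'apple', 'banana']

# Too few names on the left raises ValueError.
try:
    a, b = x
except ValueError as err:
    print('ValueError:', err)

# Unpacking works for any sequence type, including tuples and strings.
first, second = ('sani', 'rose')
p, q, r = 'abc'
print(first, second, p, q, r)    # sani rose a b c

# Swapping packs a tuple on the right and unpacks it on the left.
first, second = second, first
print(first, second)             # rose sani
```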


## Lists  
****
- General purpose
- Most widely used data structure 
- Grow and shrink in size as needed
- Sequence type
- Sortable  

**constructors** - creating a new list

In [18]:
x = list()
y = ['a', 25, 'dog','cat', 8.43]
tuple1 = (10, 20)
z = list(tuple1)

# list comprehension
a = [m for m in range(18)]
print(a)
b = [i**2 for i in range(20) if i>4]
print(b)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
[25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361]


**delete** - delete a list or an item in a list

In [19]:
x = [5, 3, 8, 6, 81]
del x[1]
print(x)
del x    # list x no longer exists

[5, 8, 6, 81]


**append** - append an item to a list

In [20]:
x = [5, 3, 8, 6]
x.append(7)
print(x)

[5, 3, 8, 6, 7]


**extend** - append each item of a sequence to a list

In [21]:
x = [5, 3, 8, 6]
y = [12, 13]
x.extend(y)
print(x)

[5, 3, 8, 6, 12, 13]


**insert** - insert an item at a given index

In [22]:
x = [5, 3, 8, 6]
x.insert(1, 7)
print(x)
x.insert(1, ['a', 'm'])
print(x)

[5, 7, 3, 8, 6]
[5, ['a', 'm'], 7, 3, 8, 6]


**pop** - removes the last item from the list and returns it

In [23]:
x = [5, 3, 8, 6]
x.pop()    # pop off the 6
print(x)
print(x.pop())

[5, 3, 8]
8


**remove** - remove first instance of an item

In [24]:
x = [5, 3, 8, 6, 3]
x.remove(3)
print(x)

[5, 8, 6, 3]


**reverse** - reverse the order of the list. It works in place (it does not sort), meaning it changes the original list.

In [25]:
x = [5, 3, 8, 6]
x.reverse()
print(x)

[6, 8, 3, 5]


**sort** - sort the list in place.  
Note:  
sorted(x) returns a new sorted list without changing the original list x.  
x.sort() puts the items of x in sorted order (sorts in place).

In [26]:
x = [5, 3, 8, 6]
x.sort()
print(x)

[3, 5, 6, 8]
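
The difference described in the note can be seen directly. In this sketch, `sorted(x)` leaves `x` alone and hands back a new list, while `x.sort()` reorders `x` itself and returns `None`, so assigning its result is a common mistake. The names `y`, `result` and `t` are illustrative.

```python
x = [5, 3, 8, 6]

# sorted() builds a new list; x is unchanged.
y = sorted(x)
print(x, y)                  # [5, 3, 8, 6] [3, 5, 6, 8]

# list.sort() sorts in place and returns None.
result = x.sort()
print(x, result)             # [3, 5, 6, 8] None

# sorted() accepts any iterable, but .sort() exists only on lists.
t = (5, 3, 8, 6)
print(sorted(t))             # [3, 5, 6, 8]
print(hasattr(t, 'sort'))    # False
```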


**reverse sort** - sort items descending.  
Pass the *reverse=True* parameter to the sort method.

In [27]:
x = [5, 3, 8, 6]
x.sort(reverse=True)
print(x)

[8, 6, 5, 3]


## Tuples
****
- Immutable (can’t add, remove, or change items)
- Useful for fixed data
- Typically a little faster to create and lighter on memory than lists
- Sequence type  
  
**constructors** - creating new tuples.

In [28]:
x = ()
x = (1, 2, 3)
x = 1, 2, 3
x = 2,    # the comma tells Python it's a tuple
print(x, type(x))

list1 = [2, 4, 6]
x = tuple(list1)
print(x, type(x))

(2,) <class 'tuple'>
(2, 4, 6) <class 'tuple'>


**tuples are immutable**, but member objects may be mutable.

In [29]:
x = (1, 2, 3)
# del x[1]        # fails with TypeError
# x[1] = 8        # fails with TypeError
print(x)

y = ([1, 2], 3)   # a tuple where the first item is a list
del y[0][1]       # delete the 2
print(y)          # the list within the tuple is mutable

y += (4,)         # builds a new tuple and rebinds y
print(y)

(1, 2, 3)
([1], 3)
([1], 3, 4)
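
The two commented-out lines in the cell above fail with `TypeError`. A small sketch that catches the error instead, and that uses `is` to show that `+=` on a tuple builds a new object rather than changing the old one. The name `original` is illustrative.

```python
x = (1, 2, 3)

# Item assignment is not supported on tuples.
try:
    x[1] = 8
except TypeError as err:
    print('TypeError:', err)

# Item deletion fails the same way.
try:
    del x[1]
except TypeError as err:
    print('TypeError:', err)

# += creates a new tuple and rebinds the name; the old tuple is untouched.
original = x
x += (4,)
print(x, original)       # (1, 2, 3, 4) (1, 2, 3)
print(x is original)     # False
```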


## Sets
****
- Store non-duplicate items  
- Very fast membership tests compared to lists  
- Mathematical set operations (union, intersection)  
- Sets are unordered  
  
**constructors** - creating new sets

In [30]:
x = {3, 5, 3, 5}
print(x)

y = set()
print(y)

list1 = [2, 3, 4]
z = set(list1)
print(z)

{3, 5}
set()
{2, 3, 4}


**set operations**

In [31]:
x = {3, 8, 5}
print(x)
x.add(7)
print(x)

x.remove(3)
print(x)

# get length of set x
print(len(x))

# check membership in x
print(5 in x)

# pop an arbitrary item from set x
print(x.pop(), x)

# delete all items from set x
x.clear()
print(x)

{8, 3, 5}
{8, 3, 5, 7}
{8, 5, 7}
3
True
8 {5, 7}
set()
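
To give a rough feel for the speed difference between set and list membership tests, here is a timing sketch using the standard-library `timeit` module. The collection size and repeat count are arbitrary choices, and the absolute timings depend on the machine; only the relative gap matters.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1          # worst case for the list: the last item

# A list is scanned item by item; a set uses a hash lookup.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f'list: {list_time:.5f} s')
print(f'set:  {set_time:.5f} s')
print('set faster:', set_time < list_time)
```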


**Mathematical set operations**  
intersection (AND): set1 & set2  
union (OR): set1 | set2  
symmetric difference (XOR): set1 ^ set2  
difference (in set1 but not set2): set1 - set2  
subset (set2 contains set1): set1 <= set2  
superset (set1 contains set2): set1 >= set2

In [32]:
s1 = {1, 2, 3}
s2 = {3, 4, 5}
print(s1 & s2)
print(s1 | s2)
print(s1 ^ s2)
print(s1 - s2)
print(s1 <= s2)
print(s1 >= s2)

{3}
{1, 2, 3, 4, 5}
{1, 2, 4, 5}
{1, 2}
False
False
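
Both comparisons above print `False` because neither set contains the other. A short sketch with sets chosen (for illustration only) so that the subset and superset tests succeed, alongside the equivalent named methods:

```python
small = {1, 2}
big = {1, 2, 3}

print(small <= big)    # True: every item of small is in big
print(big >= small)    # True: big contains all of small
print(small < big)     # True: proper subset (subset and not equal)

# The operators have method equivalents.
print(small.issubset(big), big.issuperset(small))
print(small.union(big) == small | big)
print(small.intersection(big) == small & big)

# The methods accept any iterable; the operators need sets on both sides.
combined = small.union([7, 8])
print(combined == {1, 2, 7, 8})
```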


## Dictionaries (dict)
****
- Key/Value pairs
- Associative array, like Java HashMap
- Dicts preserve insertion order (guaranteed since Python 3.7)  

**constructors** - creating new dictionaries

In [34]:
x = {'mutton':25.3, 'beef':33.8, 'chicken':22.7}
print(x)
x = dict([('mutton', 25.3),('beef', 33.8),('chicken', 22.7)])
print(x)
x = dict(mutton=25.3, beef=33.8, chicken=22.7)
print(x)

{'mutton': 25.3, 'beef': 33.8, 'chicken': 22.7}
{'mutton': 25.3, 'beef': 33.8, 'chicken': 22.7}
{'mutton': 25.3, 'beef': 33.8, 'chicken': 22.7}


**dict operations**

In [35]:
x['shrimp'] = 38.2    # add or update
print(x)

# delete an item
del x['shrimp']
print(x)

# get length of dict x
print(len(x))

# delete all items from dict x
x.clear()
print(x)

# delete dict x
del x

{'mutton': 25.3, 'beef': 33.8, 'chicken': 22.7, 'shrimp': 38.2}
{'mutton': 25.3, 'beef': 33.8, 'chicken': 22.7}
3
{}


**accessing keys and values in a dict**  

In [37]:
y = {'mutton':25.3, 'beef':33.8, 'chicken':22.7}
print(y.keys())
print(y.values())
print(y.items())      # key-value pairs

# check membership in the keys (`in` on a dict only looks at keys, not values)
print('beef' in y)

# check membership in the values
print('clams' in y.values())

dict_keys(['mutton', 'beef', 'chicken'])
dict_values([25.3, 33.8, 22.7])
dict_items([('mutton', 25.3), ('beef', 33.8), ('chicken', 22.7)])
True
False


**iterating a dict** - note, items come back in insertion order (Python 3.7+).

In [38]:
for key in y:
    print(key, y[key])
    
for k, v in y.items():
    print(k, v)

mutton 25.3
beef 33.8
chicken 22.7
mutton 25.3
beef 33.8
chicken 22.7
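
The output above follows the order in which the keys were written. As of Python 3.7 the language guarantees that dicts preserve insertion order. A small sketch of what that means in practice, using a hypothetical `prices` dict with made-up values:

```python
prices = {'mutton': 25.3, 'beef': 33.8}
prices['chicken'] = 22.7      # new keys go to the end
prices['mutton'] = 26.0       # updating a value keeps the key's position

print(list(prices))           # ['mutton', 'beef', 'chicken']

# Deleting and re-adding a key moves it to the end.
del prices['beef']
prices['beef'] = 33.8
print(list(prices))           # ['mutton', 'chicken', 'beef']

# Iterate in sorted key order when insertion order is not what you want.
for key in sorted(prices):
    print(key, prices[key])
```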
