# Note: Anaconda is a free Python distribution for data science. Miniconda is a minimal distribution only containing Python and the conda package manager.

# <font color=blue><b>Lesson 1: data structures</b></font>

# Data structures in Python: used for storing and manipulating data in a program

Data structures are one of the foundations of any programming language. It would be pointless to dive into programming in a language without understanding the basics, including its data structures.

- Primitive data structures: Integer, Float, String
- Container data structures: List, Tuple, Dictionary, Set

- Integer, Float, String, Tuple are immutable - cannot change once assigned
- List, Dictionary, Set are mutable - can change after assignment

Strings, lists, and tuples are also called sequences.
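As a minimal sketch of the distinction above (the variable names are illustrative and not part of the lesson), the following shows that a list can be changed in place, a string cannot, and that all three sequence types support indexing and slicing.

```python
# Illustrative names only; none of these come from the lesson itself.
numbers = [1, 2, 3]        # list: mutable sequence
numbers[0] = 10            # in-place change is allowed

text = "abc"               # string: immutable sequence
try:
    text[0] = "x"          # item assignment is not allowed
    string_changed = True
except TypeError:
    string_changed = False

point = (4, 5, 6)          # tuple: immutable sequence

# Sequences share indexing and slicing.
first_items = (numbers[0], text[0], point[0])
tails = (numbers[1:], text[1:], point[1:])
print(numbers, string_changed, first_items, tails)
```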

A thorough understanding of data structures before undertaking a programming project pays off. In practice, you deal with data structures every time you code.


## <font color=blue>1.1: Integer, Float, String</font>

In [1]:
#Immutable - cannot change once assigned
int1 = 15
float1 = 10.3
string1 = 'this is my string'

## <font color=blue>1.2: Lists</font>

In [7]:

"""
Lists (mutable, i.e. can change after assignment and definition). Similar to arrays in other languages.
A list is an ordered collection and is perhaps the most commonly used data structure in Python.
"""

list1 = [1, 2, 3] #homogeneous
list2 = ["string", 0.1, True] #heterogeneous
list3 = [ list1, list2, [] ] #list of lists
list_length = len(list3) # equals 3
list_sum = sum(list1) # equals 6
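A few everyday list operations, sketched with a hypothetical `scores` list that does not appear in the lesson: appending, extending, negative indexing, slicing, membership, and popping.

```python
# Hypothetical example list; the names are not from the lesson.
scores = [70, 85, 90]
scores.append(60)            # add one element at the end
scores.extend([95, 88])      # add several elements
last_score = scores[-1]      # negative indexes count from the end
first_two = scores[:2]       # slicing returns a new list
has_90 = 90 in scores        # membership test (linear scan)
removed = scores.pop()       # remove and return the last element
print(scores, last_score, first_two, has_90, removed)
```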

## <font color=blue>1.3: Tuples</font>

In [4]:
"""
Tuples (immutable i.e. cannot change after assignment and definition)
Tuples are similar to lists apart from the fact that they are immutable
"""

list1 = [1, 2, 7]
tuple1 = (1, 2, 6, 3)
tuple1 = 1, 2, 6, 3 #same as above

# list is mutable
list1[2] = 9 # list1 is now [1, 2, 9]

# tuple is immutable
tuple1[1] = 3 # TypeError: 'tuple' object does not support item assignment


TypeError: 'tuple' object does not support item assignment
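Immutability is also what makes tuples useful in practice. The sketch below (with a hypothetical `min_max` helper and `grid` dictionary) shows two common uses: returning several values from a function, and using a tuple as a dictionary key, which a list cannot be.

```python
def min_max(values):
    """Return the smallest and largest value as a tuple (hypothetical helper)."""
    return min(values), max(values)

low, high = min_max([3, 9, 1])   # unpack the returned tuple

# Tuples are hashable when their items are, so they can be dictionary keys.
# Hypothetical grid coordinates.
grid = {(0, 0): "origin", (1, 2): "target"}
label = grid[(1, 2)]

# A list is mutable and unhashable, so it cannot be used as a key.
try:
    bad = {[0, 0]: "origin"}
    list_key_ok = True
except TypeError:
    list_key_ok = False
print(low, high, label, list_key_ok)
```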

## <font color=blue>1.4: Dictionaries</font>

In [4]:
"""
Dictionaries (mutable)
Another fundamental data structure is a dictionary, which associates values with keys and
allows you to quickly retrieve the value corresponding to a given key:
"""

dict1 = {} # Pythonic
dict2 = dict() # less Pythonic
user_grades = { "user1" : 80, "user2" : 95 } # dictionary literal
#You can look up the value for a key using square brackets:
user1_grade = user_grades["user1"] # equals 80
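Beyond square-bracket lookup, a few other dictionary operations come up constantly. This is a small sketch that redefines `user_grades` so it runs on its own; the key `"user3"` is made up for illustration.

```python
# Redefined here so the block runs on its own.
user_grades = {"user1": 80, "user2": 95}

# Square brackets raise KeyError for a missing key; get() returns a default.
missing = user_grades.get("user3")             # None
missing_or_zero = user_grades.get("user3", 0)  # 0
has_user1 = "user1" in user_grades             # membership tests check keys

user_grades["user3"] = 72                      # add a new key
user_grades["user1"] = 85                      # overwrite an existing value

for user, grade in user_grades.items():        # iterate over key/value pairs
    print(user, grade)
```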

## <font color=blue>1.5: Sets</font>

In [6]:
#Sets (mutable, fast operations - e.g. membership test)
# A set is an unordered collection of distinct elements. It does not support indexing.
set1 = set()
set1.add(1) # set1 is now { 1 }
set1.add(2) # set1 is now { 1, 2 }
set1.add(2) # set1 is still { 1, 2 }
x = len(set1) # equals 2
y = 2 in set1 # equals True - membership test
z = 3 in set1 # equals False - membership test


stopwords_list = ["a","an","at"] + ["hundreds_of_other_words"] + ["yet", "you"]
"zip" in stopwords_list # False, but have to check every element
stopwords_set = set(stopwords_list)
"zip" in stopwords_set # False, and fast to check (hash lookup)

#another good use-case of set is to find the distinct items in a collection:
item_list = [1, 2, 3, 1, 2, 3]
num_items = len(item_list) # 6
item_set = set(item_list) # {1, 2, 3}
num_distinct_items = len(item_set) # 3
distinct_item_list = list(item_set) # [1, 2, 3]
#We’ll use sets much less frequently than dicts and lists.
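Sets also support the usual set algebra. A short sketch with made-up data (`python_users` and `sql_users` are hypothetical):

```python
# Hypothetical example data.
python_users = {"ann", "bob", "cho"}
sql_users = {"bob", "cho", "dev"}

both = python_users & sql_users         # intersection
either = python_users | sql_users       # union
python_only = python_users - sql_users  # difference

# Sets are unordered, so sort when a stable order is needed for display.
print(sorted(both), sorted(either), sorted(python_only))
```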

# <font color=blue><b>Lesson 2: lambda, counter, unpack lists, sort, enumerate</b></font>

In [9]:
#lambda is an anonymous short function. It can take multiple parameters, but its body is a single expression (not a statement).

#Don't assign lambdas to variables as on the next line; use a def instead, or call the lambda immediately as in the example after it.
full_name = lambda first, last: f'Full name: {first.upper()} {last.title()}'
full_name('richard', 'orama') #'Full name: RICHARD Orama'

#pass parameters immediately at the end - notice parenthesis on both
(lambda first, last: f'Full name: {first.upper()} {last.title()}')('richard', 'orama') #'Full name: RICHARD Orama'
(lambda x, y: x + y)(2, 3) #5


5
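For comparison, here is a sketch of the `def` form that is usually preferred over assigning a lambda to a name, plus the place where lambdas tend to fit best: as a short throwaway argument such as a sort key. The `people` data is invented for illustration.

```python
def full_name(first, last):
    """Named equivalent of the full_name lambda."""
    return f'Full name: {first.upper()} {last.title()}'

# Lambdas fit best as short throwaway arguments, e.g. a sort key.
# Hypothetical (name, age) pairs.
people = [('richard', 41), ('ann', 35), ('james', 29)]
by_age = sorted(people, key=lambda person: person[1])

print(full_name('richard', 'orama'))
print(by_age)
```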

In [14]:
#Counter takes an iterable (e.g. a list, tuple, or string) or a mapping and returns a dict-like object holding the count of each element.

from collections import Counter
list_names = ['Patrick', 'Richard', 'Richard', 'James', 'Martin', 'Peter', 'Richard', 'James', 'Richard', 'Martin']
counts = Counter(list_names)
print(counts) #Counter({'Richard': 4, 'James': 2, 'Martin': 2, 'Patrick': 1, 'Peter': 1})
#print(dir(counts)) #see the list of available class methods/attributes
print(counts.most_common(2)) #[('Richard', 4), ('James', 2)]
print("The occurrence of Richard is: " , counts['Richard']) #4

aa=[1,2,3,4,4,4]
Counter(aa) #Counter({1: 1, 2: 1, 3: 1, 4: 3})
Counter(aa)[4] #3

Counter({'Richard': 4, 'James': 2, 'Martin': 2, 'Patrick': 1, 'Peter': 1})
[('Richard', 4), ('James', 2)]
The occurrence of Richard is:  4


3
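Two further `Counter` behaviours are worth knowing, sketched here on a made-up string: a missing key returns 0 instead of raising `KeyError`, and `update()` adds new observations to the existing counts.

```python
from collections import Counter

letters = Counter("banana")       # counts characters: b=1, a=3, n=2
missing = letters["z"]            # missing keys count as 0, no KeyError

letters.update("band")            # add more observations to the same counter
top_letter, top_count = letters.most_common(1)[0]
print(letters["a"], missing, top_letter, top_count)
```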

In [18]:
#unpack (explode) list
x, y = [1, 2] # x is 1, y is 2
_, y = [1, 2] # y == 2, the first element is ignored because of _
a, b, *c = [1, 2, 3, 4, 5, 6]
print(a, b, c) #1 2 [3, 4, 5, 6], note that c is now [3, 4, 5, 6]
a, b, *c, d = [1, 2, 3, 4, 5, 6]
print(a, b, c) #1 2 [3, 4, 5], note that c is now [3, 4, 5]
a, b, *_, d = [1, 2, 3, 4, 5, 6] #ignore [3, 4, 5]



1 2 [3, 4, 5, 6]
1 2 [3, 4, 5]
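Unpacking shows up in a few other places as well. The sketch below uses a hypothetical `add3` function and `pairs` list to show swapping two variables, passing a list as positional arguments with `*`, and unpacking inside a loop.

```python
a, b = 1, 2
a, b = b, a                       # swap without a temporary variable

def add3(x, y, z):
    """Hypothetical three-argument function."""
    return x + y + z

args = [1, 2, 3]
total = add3(*args)               # * unpacks the list into positional arguments

# Unpacking also works while iterating; hypothetical (name, grade) pairs.
pairs = [("user1", 80), ("user2", 95)]
names = [name for name, _ in pairs]
print(a, b, total, names)
```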


In [29]:
#sort list using sorted()

ids = ['id1', 'id2', 'id30', 'id3', 'id22', 'id100'] 
sorted_ids1 = sorted(ids) # Lexicographic sort #['id1', 'id100', 'id2', 'id22', 'id3', 'id30']
sorted_ids2 = sorted(ids, key=lambda x: int(x[2:])) # Integer sort #['id1', 'id2', 'id3', 'id22', 'id30', 'id100']

# sort the list by absolute value from largest to smallest
x = sorted([-4,1,-2,3], key=abs, reverse=True) # is [-4,3,-2,1]

# sort the words and counts from highest count to lowest
words= ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'aa', 'bb']
word_counts = Counter(words)
#note below: Python 3 removed tuple parameter unpacking, so "key=lambda (word, count): count" is a syntax error; index into the (word, count) tuple instead
word_counts_sorted = sorted(word_counts.items(), key=lambda w: w[1], reverse=True) #[('aa', 3), ('bb', 3), ('cc', 2), ('dd', 2), ('ee', 2), ('ff', 2), ('gg', 1)]
print(word_counts_sorted)

#A Counter instance has a most_common method that is frequently useful:
# print the 5 most common words and their counts
for word, count in word_counts.most_common(5):
    print(word, count)

[('aa', 3), ('bb', 3), ('cc', 2), ('dd', 2), ('ee', 2), ('ff', 2), ('gg', 1)]
aa 3
bb 3
cc 2
dd 2
ee 2
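When counts tie, the order of the tied words depends on their original order. One way to make the ordering explicit is a tuple key, sketched here on a small hypothetical word list: negating the count sorts it descending while the word itself breaks ties alphabetically.

```python
from collections import Counter

words = ['bb', 'aa', 'cc', 'bb', 'aa', 'dd']   # hypothetical input
word_counts = Counter(words)

# Negating the count sorts it descending while the word sorts ascending.
ranked = sorted(word_counts.items(), key=lambda item: (-item[1], item[0]))
print(ranked)
```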


In [34]:
#sort list using sort() - inplace

x = [4,1,2,3]
y = sorted(x) # is [1,2,3,4], x is unchanged
x.sort() #x is changed

# sort the list by absolute value from largest to smallest
x.sort(key=abs, reverse=True) #inplace, returns None, so explicitly print x below
x

[4, 3, 2, 1]
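A common pitfall is assigning the result of `sort()`. This short sketch contrasts the two calls side by side:

```python
x = [4, 1, 2, 3]

result = x.sort()             # sorts x in place; the return value is None
y = sorted(x, reverse=True)   # builds a new list; x keeps its current order
print(result, x, y)
```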

In [41]:
#enumerate
#used for iterating over a list when you need both the index and the value. It produces tuples (index, element):
words = ['aa', 'bb', 'cc']
for i, word in enumerate(words, start=1): #can optionally add start argument, otherwise index starts from 0
    print(i, word)
    
#if you want to use only index:
x=0
for i, _ in enumerate(words):
    x += i
print(x)

    

1 aa
2 bb
3 cc
3
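To make the `(index, element)` tuples visible, the sketch below materializes them, compares them with the index-based loop that `enumerate` replaces, and builds a word-to-position lookup (the `position` dictionary is an illustrative addition).

```python
words = ['aa', 'bb', 'cc']

# enumerate() yields (index, element) tuples.
pairs = list(enumerate(words))

# Index-based alternative that enumerate() replaces.
pairs_by_index = [(i, words[i]) for i in range(len(words))]

# Build a word -> position lookup in one pass, counting from 1.
position = {word: i for i, word in enumerate(words, start=1)}
print(pairs, pairs == pairs_by_index, position)
```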


# <font color=blue><b>Lesson 4: using slice to rotate, reverse list (all amazing stunts)</b></font>

In [19]:
#rotate left - amazing
d=3
a=[1,2,3,4,5, 6, 7]
b = a[d:] + a[:d]
b #[4, 5, 6, 7, 1, 2, 3]


[4, 5, 6, 7, 1, 2, 3]
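The slice trick can be wrapped in a small function. This is a sketch, with `rotate_left` as a hypothetical helper name: taking `d` modulo the length handles shifts larger than the list, and a negative `d` rotates right. The standard library's `collections.deque.rotate` is shown as an alternative.

```python
from collections import deque

def rotate_left(items, d):
    """Return a new list rotated left by d positions (hypothetical helper)."""
    if not items:
        return []
    d %= len(items)              # handles d larger than the list length
    return items[d:] + items[:d]

a = [1, 2, 3, 4, 5, 6, 7]
left = rotate_left(a, 3)         # [4, 5, 6, 7, 1, 2, 3]
wrapped = rotate_left(a, 10)     # same as rotating by 3
right = rotate_left(a, -2)       # negative d rotates right

dq = deque(a)
dq.rotate(-3)                    # deque.rotate: negative n rotates left, in place
print(left, wrapped, right, list(dq))
```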

In [20]:
#Reverse a list (using step)
a=[1,2,3,4,5,6]
b=a[::-1] #All (start to stop), but step -1 - [6, 5, 4, 3, 2, 1]
b=a[::2] #All (start to stop), but step 2 - [1, 3, 5]
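#b=a.reverse() would set b to None: list.reverse() reverses a in place and returns None
b=reversed(a) #reversed() returns an iterator, not a list
b
[print(i, end='') for i in b] #prints 654321; the comprehension itself evaluates to a list of None (print's return value)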

654321

[None, None, None, None, None, None]
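The three ways of reversing behave differently, which the following sketch makes explicit: the slice returns a new list, `reversed()` returns a one-shot iterator, and `list.reverse()` works in place and returns `None`.

```python
a = [1, 2, 3, 4, 5, 6]

copy_reversed = a[::-1]            # new list, a is unchanged
from_iterator = list(reversed(a))  # reversed() gives an iterator; list() consumes it

it = reversed(a)
first_pass = list(it)
second_pass = list(it)             # an iterator is exhausted after one pass

in_place = a.copy()
returned = in_place.reverse()      # reverses in place and returns None
print(copy_reversed, second_pass, returned, in_place)
```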

In [45]:
#a nice exercise
#prime numbers from 2 to 199 (range(2, 200) excludes 200)

numbers = range(2, 200)
for num in numbers:
    if all([num%i != 0 for i in range(2, num)]):
        print(num, end=', ')

#list comprehension way
primes = [num for num in numbers if all([num%i != 0 for i in range(2, num)])]
print('\n\n', primes)


2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 

 [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199]
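The cell above tests every divisor below `num`. A possible refinement, sketched here with a hypothetical `is_prime` helper, stops at the integer square root, since any composite number has a divisor no larger than that. It assumes Python 3.8 or later for `math.isqrt`, and restricts the output to the 100 to 200 range.

```python
from math import isqrt  # available from Python 3.8

def is_prime(n):
    """Trial division up to the integer square root of n (hypothetical helper)."""
    if n < 2:
        return False
    return all(n % i != 0 for i in range(2, isqrt(n) + 1))

primes_100_200 = [n for n in range(100, 201) if is_prime(n)]
print(primes_100_200)
```

Passing a generator expression to `all()` instead of a list lets it stop at the first divisor found.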
