## MSBD 5001 - Foundations of Data Analytics
## Tutorial 4
### More on Python Basics

### Python Data Structures
<url>https://docs.python.org/3/tutorial/datastructures.html</url>

This section introduces:
- [List](#list)
- [Tuple](#tuple)
- [Set](#set)
- [Dictionary](#dict)

<a name = "list"></a>
#### Lists
- A list is defined by writing comma-separated values inside square brackets.
- Lists might contain items of different types, but usually the items all have the same type.

In [1]:
squares_list = [0, 1, 4, 9, 16, 25]
print(squares_list)

[0, 1, 4, 9, 16, 25]


In [3]:
print(squares_list[0]) # Indexing returns the items
print(squares_list[-2]) # Return 2nd last element in the list

0
16


In [4]:
nested_list = [[1, 2, 3], [4, 5, 6]] # Nested list: lists in the list 
print(nested_list)
print(nested_list[0])

[[1, 2, 3], [4, 5, 6]]
[1, 2, 3]


In [9]:
squares_list[2] = 36
print(squares_list)

[0, 1, 36, 9, 16, 25]


Slicing: accessing a sublist

In [12]:
num_list = [3, 2, 16, 8, 30, 22]
print ("Slicing examples:")
print (num_list[2:4]) # Return a new list from index 2 to 4 (exclusive)
print (num_list[2:]) # Return a new list from index 2 to the end
print (num_list[:2]) # Return from the start to index 2 (exclusive)
print (num_list[:]) # Return the whole list
print (num_list[:-1]) # Slice indices can be negative

print ("Slicing with step:")
print (num_list[::2]) # Return every 2nd item
print (num_list[::-1]) # Return a reversed list

print ("More examples:")
num_list[2:4] = [0, 0] # Assign a new sublist to a slice
print (num_list) # Prints "[3, 2, 0, 0, 30, 22]"

Slicing examples:
[16, 8]
[16, 8, 30, 22]
[3, 2]
[3, 2, 16, 8, 30, 22]
[3, 2, 16, 8, 30]
Slicing with step:
[3, 16, 30]
[22, 30, 8, 16, 2, 3]
More examples:
[3, 2, 0, 0, 30, 22]
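
The slicing comments above say that a slice returns a new list. The sketch below illustrates what that means in practice by contrasting a full slice with plain assignment. The names `original`, `shallow_copy`, and `alias` are made up for this illustration.

```python
original = [3, 2, 16, 8, 30, 22]

shallow_copy = original[:]  # A full slice builds a new list object
shallow_copy[0] = 99        # Changes only the copy

alias = original            # Plain assignment: both names refer to the same list
alias[1] = -1               # Changes the list that original also refers to

print(original)      # [3, -1, 16, 8, 30, 22]
print(shallow_copy)  # [99, 2, 16, 8, 30, 22]
```

Note that the copy is shallow: for a nested list, the inner lists would still be shared between the original and the copy.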


Some Methods and Operations of Lists

In [31]:
list1 = [9, 5, 3]
list1.append(10) # Append 10 to the end of the list
print (list1)
print ("Number of items: ", len(list1)) # Get the number of items
print (list1 * 2) # Repeat the list twice (returns a new list)
list1.sort() # Sort list1
print (list1)

list2 = [1, 12, 4]
print (list1 + list2) # Concatenate list1 and list2

[9, 5, 3, 10]
Number of items:  4
[9, 5, 3, 10, 9, 5, 3, 10]
[3, 5, 9, 10]
[3, 5, 9, 10, 1, 12, 4]
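
Since `sort()` changes the list in place, it may help to compare it with the built-in `sorted()`, which returns a new list instead. This is a small sketch; the variable names are chosen only for illustration.

```python
scores = [9, 5, 3, 10]

ascending = sorted(scores)                 # New sorted list; scores is unchanged
descending = sorted(scores, reverse=True)  # New list in descending order
print(scores)      # [9, 5, 3, 10]
print(ascending)   # [3, 5, 9, 10]
print(descending)  # [10, 9, 5, 3]

result = scores.sort()  # Sorts scores in place and returns None
print(result)      # None
print(scores)      # [3, 5, 9, 10]
```

A common mistake is writing `scores = scores.sort()`, which replaces the list with `None`.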


#### List Comprehensions
List comprehensions provide a concise way to create lists. 

In [18]:
nums = [0, 1, 2, 3]
squares = []
for x in nums:
    squares.append(x**2)
print (squares)

[0, 1, 4, 9]


- A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses.
- The result will be a new list.

In [19]:
squares = [x**2 for x in nums]
print (squares)

[0, 1, 4, 9]


In [21]:
squares = [x**2 for x in num_list if x % 2] # Keep only odd items; num_list is [3, 2, 0, 0, 30, 22] at this point
print (squares)

[9]
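
As noted above, a comprehension may contain more than one `for` clause, optionally combined with `if` clauses. The following sketch shows two such cases; the names `pairs`, `matrix`, and `flat` are illustrative only.

```python
# Two for clauses plus an if clause: all (x, y) pairs where x differs from y
pairs = [(x, y) for x in [1, 2, 3] for y in [3, 1, 4] if x != y]
print(pairs)  # [(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

# Two for clauses can also flatten a nested list
matrix = [[1, 2, 3], [4, 5, 6]]
flat = [n for row in matrix for n in row]
print(flat)   # [1, 2, 3, 4, 5, 6]
```

The `for` clauses are read left to right, in the same order as the equivalent nested loops.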


#### Strings
- Strings can be defined using single (`'`), double (`"`), or triple (`'''` or `"""`) quotes.
- Strings enclosed in triple quotes (`'''` or `"""`) can span multiple lines.
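
A short sketch of the three quoting styles follows. The variable names are hypothetical and are not used in the rest of this tutorial.

```python
single = 'Hello'
double = "It's fine"  # Double quotes allow an apostrophe without escaping
multi = """Line one
Line two"""           # Triple quotes keep the line break inside the string

print(double)
print(multi)
print(len(multi.splitlines()))  # 2
```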

In [22]:
greeting = 'Hello'
print (greeting)

Hello


In [23]:
print (greeting[1]) # Return character at index 1
print (len(greeting)) # Print length of string
print (greeting + 'World') # String concatenation

e
5
HelloWorld


In [25]:
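# Raw strings are defined by prefixing the literal with r; backslashes are not treated as escapes
# The name raw_str avoids shadowing the built-in str type
raw_str = r'\n is a newline character by default'
print (raw_str)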

\n is a newline character by default


In [26]:
greeting[1] = 'x'

TypeError: 'str' object does not support item assignment
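
Because strings are immutable, the usual approach is to build a new string rather than change one in place. One possible way, sketched here with slicing and concatenation (the name `changed` is illustrative):

```python
greeting = 'Hello'

# Combine the part before index 1, the new character, and the part after index 1
changed = greeting[:1] + 'x' + greeting[2:]

print(changed)   # Hxllo
print(greeting)  # Hello (the original string is unchanged)
```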

There are many useful string methods. 

In [27]:
print (greeting.upper()) # Return a copy of the string with all characters converted to uppercase
print (greeting.center(20)) # Return a centered copy of the string, padded with spaces to width 20
print (greeting.rjust(20)) # Return a right-justified copy of the string, padded with spaces to width 20
print (greeting.replace('ll', 'r')) # Return a copy of the string with every occurrence of one substring replaced by another
print (' hello       '.strip()) # Return a copy of the string with leading and trailing whitespace removed

HELLO
       Hello        
               Hello
Hero
hello


<a name = "tuple"></a>
#### Tuples
- A tuple consists of a number of values separated by commas.
- Tuples are immutable, and Python prints them surrounded by parentheses.
- Tuples are typically slightly faster to create and use less memory than lists with the same contents.

In [37]:
tuple_example = 0, 1, 4, 9, 16, 25 
print (tuple_example) # Output is enclosed in parentheses

# Parentheses are necessary when a tuple is part of a larger expression, such as a nested tuple
tuple_example2 = (0, 2, 4), (1, 3, 5)
print (tuple_example2)

(0, 1, 4, 9, 16, 25)
((0, 2, 4), (1, 3, 5))


In [35]:
print (tuple_example[2]) # Indexing returns the items
tuple_example[2] = 6 # Tuples are immutable and hence this is an error

4


TypeError: 'tuple' object does not support item assignment
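
Although a tuple cannot be modified, its values can be unpacked into variables, and a new tuple can be built from slices of an existing one. The sketch below assumes illustrative names (`point`, `squares`, `updated`) that do not appear elsewhere in this tutorial.

```python
point = (3, 4)
x, y = point           # Unpack the tuple into two variables
x, y = y, x            # Swap values using tuple packing and unpacking
print(x, y)            # 4 3

first, *rest = (0, 1, 4, 9)  # Starred unpacking collects the remaining items in a list
print(first, rest)     # 0 [1, 4, 9]

squares = (0, 1, 4, 9, 16, 25)
updated = squares[:2] + (6,) + squares[3:]  # New tuple; squares itself is unchanged
print(updated)         # (0, 1, 6, 9, 16, 25)
```

The trailing comma in `(6,)` is what makes it a one-element tuple; `(6)` would just be the integer 6.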

<a name = "set"></a>
#### Sets
- A set is an unordered collection of distinct elements.
- Sets do not record element position or order of insertion.
- Sets do not support indexing, slicing, or other sequence-like behaviour.

In [38]:
fruits = {'Orange', 'Apple'}
print (fruits)

{'Apple', 'Orange'}


In [39]:
print('Apple' in fruits) # Check if an element is in a set
print('Banana' in fruits) 
fruits.add('Banana') # Add an element to a set
print('Banana' in fruits)
print (len(fruits)) # Get the number of elements in a set
fruits.add('Apple') # Adding an element that is already in the set does nothing
print (fruits)
fruits.remove('Apple') # Remove an element from the set
print (fruits)

True
False
True
3
{'Apple', 'Banana', 'Orange'}
{'Banana', 'Orange'}


In [43]:
items = set() # Create an empty set ({} would create an empty dictionary)
items.add("Coke")
items.add("Potato chips")
print (items)

{'Coke', 'Potato chips'}


In [46]:
set1 = set("banana") # A set of unique characters
set2 = set("apple") 
print (set1)
print (set2)
print ("Set operations:")
print (set1 - set2) # characters in set1 but not set2
print (set1 | set2) # characters in set1 or set2
print (set1 & set2) # characters in both set1 and set2
print (set1 ^ set2) # characters in set1 or set2 but not both

{'n', 'a', 'b'}
{'e', 'l', 'p', 'a'}
Set operations:
{'n', 'b'}
{'e', 'n', 'b', 'l', 'p', 'a'}
{'a'}
{'e', 'n', 'l', 'p', 'b'}
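
Because a set keeps only distinct elements, a common use is removing duplicates from a list. The following is a minimal sketch with made-up data. Since sets are unordered, the results are sorted before printing so that the output is predictable.

```python
visits = ['Apple', 'Orange', 'Apple', 'Banana', 'Orange']

unique = set(visits)   # Duplicates are dropped
print(len(unique))     # 3
print(sorted(unique))  # ['Apple', 'Banana', 'Orange']

# If the order of first appearance matters, dict.fromkeys is one alternative
ordered_unique = list(dict.fromkeys(visits))
print(ordered_unique)  # ['Apple', 'Orange', 'Banana']
```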


<a name = "dict"></a>
#### Dictionaries
- A dictionary is a collection of key: value pairs, with the requirement that the keys are unique (within one dictionary). Since Python 3.7, dictionaries preserve insertion order.
- A pair of braces creates an empty dictionary: {}.

In [49]:
extensions = {'CSE': 1234, 'DSCT': 7444, 'ECE': 7036}
print (extensions)

{'CSE': 1234, 'DSCT': 7444, 'ECE': 7036}


In [52]:
extensions['CSE'] = 7000 # Modify the value for the key 'CSE'
print (extensions['CSE']) # Get the value of the key 'CSE'
extensions['MAE'] = 8654 # Add a key-value pair
print (extensions)

7000
{'CSE': 7000, 'DSCT': 7444, 'ECE': 7036, 'MAE': 8654}


In [57]:
print (extensions.keys()) # Get the keys only
print (extensions.values()) # Get the values only
print (extensions.items()) # Get the key-value pairs

dict_keys(['CSE', 'DSCT', 'ECE', 'MAE'])
dict_values([7000, 7444, 7036, 8654])
dict_items([('CSE', 7000), ('DSCT', 7444), ('ECE', 7036), ('MAE', 8654)])


In [58]:
for k, v in extensions.items():
    print (k, v)

CSE 7000
DSCT 7444
ECE 7036
MAE 8654
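
A few other dictionary operations are often useful alongside those above: membership tests, safe lookups with `get`, deleting a key, and dictionary comprehensions. This sketch uses a smaller, hypothetical `extensions` dictionary so that it runs on its own.

```python
extensions = {'CSE': 7000, 'DSCT': 7444}

print('CSE' in extensions)         # True (membership is tested against the keys)
print(extensions.get('PHYS'))      # None (get does not raise KeyError for a missing key)
print(extensions.get('PHYS', 0))   # 0 (a default value can be supplied)

del extensions['DSCT']             # Remove a key-value pair
print(extensions)                  # {'CSE': 7000}

# Dictionary comprehension, analogous to a list comprehension
squares_by_number = {n: n**2 for n in range(4)}
print(squares_by_number)           # {0: 0, 1: 1, 2: 4, 3: 9}
```

In contrast, indexing with a missing key, as in `extensions['PHYS']`, raises a `KeyError`.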
