
# Data Structure

Data Structures are containers that can include different data types.

A data type is just a type that classifies data. This can include primitive (basic) data types like integers, booleans, and strings, as well as data structures, such as lists.

Data structures are containers that organize and group data types together in different ways. For example, some of the elements that a list can contain are integers, strings, and even other lists!

## List

A list is one of the most common and basic data structures in Python.
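
As a minimal sketch of the idea above, the following creates a list that mixes several data types and reads elements back by position. The variable name `mixed` is only an illustration and does not appear elsewhere in this notebook.

```python
# A list can hold values of different types, including other lists.
mixed = [1, 'two', 3.0, [4, 5]]

print(len(mixed))    # number of elements: 4
print(mixed[0])      # first element: 1
print(mixed[-1])     # last element: [4, 5]
print(mixed[3][0])   # first element of the nested list: 4
```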

In [None]:
# how to use the membership operators 'in' and 'not in'
print('Dobby' in 'Dobby Only Wants Harry Potter To Be Safe.')
print('Ron' not in 'Ron Weasley')
print(2 in ['a', 1, 4.0, 2])

True
False
True


### Mutability 

Whether the values of an object can be changed after it has been created.

In [None]:
# Is a string object mutable?

name = 'Harry Potter'
try:
  name[:6] = 'James'
except TypeError:
  print('A String Object is Not Mutable.')

A String Object is Not Mutable.


In [None]:
# Is a list object mutable?

a = ['a', 1, 1.0, '#']
a[2:4] = ['hello','@']
a

['a', 1, 'hello', '@']

In [None]:
# comparison between immutable strings and mutable list

# immutable string object
name = 'George'
student = name # student now refers to the same string, 'George'
name = 'William' # rebinding name to a new string does not affect student
print(student) 

# mutable list object
l1 = [1,3,4]
l2 = l1
l1[0] = 2  # this change affects l2 as well
l2

George


[2, 3, 4]

**A List Object Is Mutable!**
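
Because two names can refer to the same list, a change made through one name is visible through the other. If an independent list is wanted, one option is to copy it first. The sketch below assumes a flat list; `alias` and `snapshot` are illustrative names, not part of the notebook.

```python
original = [1, 3, 4]
alias = original             # same list object, two names
snapshot = original.copy()   # new list object with the same elements

original[0] = 2              # modify the list in place

print(alias)     # [2, 3, 4]  (follows the change)
print(snapshot)  # [1, 3, 4]  (unaffected)
```

Note that `copy()` makes a shallow copy: nested lists inside the list would still be shared.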

### Order

Whether the elements of an object keep a defined position, so that you can access them by index.

**Both strings and lists are ordered.** 
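
A short sketch of what "ordered" means in practice: both types support indexing and slicing by position. The sample values `word` and `houses` are made up for this example.

```python
word = 'Hogwarts'
houses = ['Gryffindor', 'Hufflepuff', 'Ravenclaw', 'Slytherin']

print(word[0])      # H
print(word[-1])     # s
print(word[:3])     # Hog
print(houses[1])    # Hufflepuff
print(houses[1:3])  # ['Hufflepuff', 'Ravenclaw']
```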

### Useful functions for List

In [None]:
names = ['Amy', 'Carol', 'Jenifer', 'Daniel']

# find the greatest element
# for a list of strings, the maximum is the element that comes last alphabetically
max(names)

'Jenifer'

In [None]:
'Daniel' > 'Carol'

True

In [None]:
# max function won't work on incomparable types
try:
  max([1, 'Amy'])
except TypeError:
  print('You can\'t compare incomparable types.')

You can't compare incomparable types.


In [None]:
# sorted() returns a copy of a list in order from the smallest to the largest
sorted(names)

['Amy', 'Carol', 'Daniel', 'Jenifer']

In [None]:
# sorted() leaves the original list unchanged
names

['Amy', 'Carol', 'Jenifer', 'Daniel']

In [None]:
# sort a list in order from the largest to the smallest
sorted(names, reverse = True)

['Jenifer', 'Daniel', 'Carol', 'Amy']

In [None]:
# use join method 
'\n'.join(names)

'Amy\nCarol\nJenifer\nDaniel'

In [None]:
print('\n'.join(names))

Amy
Carol
Jenifer
Daniel


In [None]:
# if the commas between the elements of a list are missing,
# Python concatenates the adjacent string literals into one string,
# so the list holds a single element 'abc' and the separator is never used
'$'.join(['a''b''c'])

'abc'

In [None]:
# in case you try to join anything other than strings
try:
  '-'.join(['a', 1, 3.5])
except TypeError:
  print('it won\'t work!')

it won't work!


## Tuple

A tuple is another useful container. Parentheses are optional when making tuples, and programmers frequently omit them if parentheses don't clarify the code.

* Immutable (no adding, removing, or sorting elements in place)
* Ordered
* Most Common Usage: 
> * when you have two or more closely related values, e.g. latitude and longitude
> * when you assign multiple variables in a compact way (tuple unpacking)
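
The two usages listed above can be sketched together: a function returns two related values as one tuple, and the caller unpacks them into separate variables. The helper `min_and_max` is a hypothetical function written for this illustration.

```python
def min_and_max(values):
    """Return the smallest and largest value as a tuple (hypothetical helper)."""
    return min(values), max(values)

# tuple unpacking: one assignment fills both variables
lowest, highest = min_and_max([30, 20, 50])
print(lowest, highest)  # 20 50

# unpacking also allows swapping two variables without a temporary one
x, y = 1, 2
x, y = y, x
print(x, y)  # 2 1
```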


In [None]:
('a', 1)

('a', 1)

In [None]:
# it's ordered
location = (13.4125, 103.866667)
print('latitude:', location[0])
print('longitude:', location[1])

latitude: 13.4125
longitude: 103.866667


In [None]:
# immutable
tuple_a = 1, 2
tuple_c = tuple_a # tuple_c now refers to the tuple (1, 2)
tuple_a = 3, 4 # rebinding tuple_a doesn't affect tuple_c
tuple_c

(1, 2)

In [None]:
try:
  tuple_c[1] = 4
except TypeError:
  print('Tuple is immutable. It cannot be modified after it is defined.')

Tuple is immutable. It cannot be modified after it is defined.


In [None]:
dimensions = 30, 20, 50
# tuple unpacking
length, width, height = dimensions

print('The dimensions are {} x {} x {}'.format(length, width, height))

The dimensions are 30 x 20 x 50


In [None]:
# Parentheses are optional when making tuples
tuple_a = 1, 2
tuple_b = (1, 2)

print(tuple_a == tuple_b)
print(tuple_a[1])

True
2


## Sets

* unique elements (duplicates are removed)
* mutable (add, pop)
* unordered (no access by index; iteration order is not guaranteed)
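
As a small sketch of the first property, converting a list to a set drops the duplicates. The list `visitors` is invented sample data.

```python
visitors = ['amy', 'carol', 'amy', 'daniel', 'carol']
unique_visitors = set(visitors)   # duplicates are removed

print(len(visitors))              # 5
print(len(unique_visitors))       # 3
# sorted() is used only to print the elements in a predictable order
print(sorted(unique_visitors))    # ['amy', 'carol', 'daniel']
```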

In [None]:
set_a = {1,2,3,4,5}
set_b = set_a
set_a = {'a', 'b', 'c'}
set_b

{1, 2, 3, 4, 5}

In [None]:
# add element
set_a.add('d')

In [None]:
# set.pop() removes and returns an arbitrary element: sets are unordered, so there is no "last" element.
# list.pop(), by contrast, removes and returns the last element
set_1 = {'ace', 'ventura', 'jim'}
set_1.pop()

'ace'

In [None]:
print('carrey' in set_1)

set_1.add('carrey')

False


In [None]:
{1,2,'A'}

{1, 2, 'A'}

Operations such as **union, intersection, and difference** are easy to perform with sets and are much faster than the equivalent operations on other containers.
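
A minimal sketch of these three operations, using the operator forms `|`, `&`, and `-`. The sets `fruits` and `red_things` are made-up sample data; `sorted()` is applied only so the printed order is predictable.

```python
fruits = {'apple', 'banana', 'cherry'}
red_things = {'cherry', 'apple', 'brick'}

# union: elements in either set
print(sorted(fruits | red_things))   # ['apple', 'banana', 'brick', 'cherry']
# intersection: elements in both sets
print(sorted(fruits & red_things))   # ['apple', 'cherry']
# difference: elements in fruits but not in red_things
print(sorted(fruits - red_things))   # ['banana']

# the method forms give the same results
print(fruits.union(red_things) == (fruits | red_things))  # True
```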

## Dictionaries

* Mutable
* Not sortable in place (there is no `sort` method; `sorted()` returns a list of the keys)
* Unordered in the sense used above: no access by position (since Python 3.7, iteration follows insertion order)
* Store mappings of unique keys to values
* Dictionary keys must be immutable, that is, of a type that cannot be modified, such as str, tuple, int, or float. A list cannot be a key.
* In Python, built-in immutable objects (such as integers, booleans, strings, and tuples that contain only immutable elements) are hashable: their hash value does not change during their lifetime. Dictionaries use this hash value to look up keys, and sets use it to track unique values. This is why Python requires us to use immutable data types for the keys in a dictionary.
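
The key requirement can be checked directly. In this sketch a tuple is accepted as a key, while a list raises a `TypeError` because it is not hashable. The dictionary `grid` is an illustrative name; the exact error text depends on the Python version, so it is printed rather than assumed.

```python
grid = {}

# a tuple is immutable and hashable, so it can be a key
grid[(0, 1)] = 'tuple key works'

# a list is mutable and not hashable, so it cannot be a key
try:
    grid[[0, 1]] = 'list key'
except TypeError as error:
    print('TypeError:', error)

print(grid)  # {(0, 1): 'tuple key works'}
```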

In [None]:
# keys don't all have to be the same data type
# (named mixed_dict so the built-in dict is not shadowed)
mixed_dict = {'a':[1,2,4], 1:['a', '#'], (2,4):['h','i']}
mixed_dict['a'][0]

1

In [None]:
mixed_dict[(2,4)]

['h', 'i']

A tuple can be a key!

In [None]:
country_code = {'UK':44, 'US': 1, 'South Korea': 82}
'US' in country_code

True

In [None]:
# get the value mapped to 'US'
print(country_code.get('US'))
print(country_code['US'])

1
1


In [None]:
# when there is no such key
print(country_code.get('USD'))

None


In [None]:
try:
  country_code['USD']
except KeyError:
  print('A square bracket lookup raises a KeyError when the key does not exist.')

A square bracket lookup raises a KeyError when the key does not exist.


If you expect lookups to sometimes fail, *get* might be a better tool than normal square bracket lookups because errors can crash your program.
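
`get` also accepts an optional second argument that is returned in place of `None` when the key is missing. The sketch below repeats the `country_code` dictionary so it runs on its own; the fallback value `'unknown'` is an arbitrary choice for illustration.

```python
country_code = {'UK': 44, 'US': 1, 'South Korea': 82}

print(country_code.get('US', 'unknown'))    # 1
print(country_code.get('USD', 'unknown'))   # unknown

# get() only reads; it does not add the missing key
print('USD' in country_code)                # False
```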

**Identity Operators**

* is : evaluates if both sides have the same identity.
* is not : evaluates if both sides have different identities.

In [None]:
n = country_code.get('USD')
n is None

True

In [None]:
n is not None

False

**Identity Operators vs. Comparison Operators**

> Equal VS Identical

In [None]:
# Test the code here if you'd like
a = [1, 2, 3]
b = a  # a and b are identical
c = [1, 2, 3] # a, b and c are equal (having the same content), but not identical

print(a == b) # equality test
print(a is b) # identity test
print(a == c) # equality test
print(a is c) # identity test

True
True
True
False


List a and list b are equal and identical. List c is equal to a (and b for that matter) since they have the same contents. But a and c (and b for that matter, again) point to two different objects, i.e., they aren't identical objects. That is the difference between checking for equality vs. identity.

In [None]:
dana = [33, 'South Korean', 'Female']
hana = [33, 'South Korean', 'Female']

print(dana == hana) # the same elements
print(dana is hana) # not identical

True
False


# Compound Data Structure

We can include containers in other containers to create compound data structures.

In [None]:
# compound dictionary
elements = {'hydrogen': {'number':1,
                         'weight': 1.00794,
                         'symbol': 'H'},
           "helium": {"number": 2,
                      "weight": 4.002602,
                      "symbol": "He"}}

In [None]:
# access the element
elements['helium']['weight']

4.002602

In [None]:
elements.get('helium').get('weight')

4.002602

In [None]:
# add a new key to the elements dictionary
oxygen = {'number':8, 'weight':15.999, 'symbol': 'O'}
elements['oxygen'] = oxygen
elements

{'hydrogen': {'number': 1, 'weight': 1.00794, 'symbol': 'H'},
 'helium': {'number': 2, 'weight': 4.002602, 'symbol': 'He'},
 'oxygen': {'number': 8, 'weight': 15.999, 'symbol': 'O'}}

In [None]:
# todo: Add an 'is_noble_gas' entry to the hydrogen and helium dictionaries
# hint: helium is a noble gas, hydrogen isn't
elements['hydrogen']['is_noble_gas'] = False
elements['helium']['is_noble_gas'] = True

In [None]:
elements

{'hydrogen': {'number': 1,
  'weight': 1.00794,
  'symbol': 'H',
  'is_noble_gas': False},
 'helium': {'number': 2,
  'weight': 4.002602,
  'symbol': 'He',
  'is_noble_gas': True},
 'oxygen': {'number': 8, 'weight': 15.999, 'symbol': 'O'}}

# Recap

Data Structure | Ordered | Mutable | Constructor | Example
- | - | - | - | - |
List | Yes | Yes | list() or [ ] | [5.7, 4, 'yes', 5.7]
Tuple | Yes | No | tuple() or ( ) | (5.7, 4, 'yes', 5.7)
Set | No | Yes | set() or { }* | {5.7, 4, 'yes', 5.7}
Dictionary | No | Yes** | dict() or { } | {'Jun': 75, 'Jul': 89}

\*You can use curly braces to define a set like this: {1,2,3}. However, if you leave the curly braces empty like this: {}, Python will instead create an empty dictionary. To create an empty set, use set().
<br> \**A dictionary itself is mutable, but each of its individual keys must be immutable.
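
The first footnote can be verified with `type()`. This is a quick check; the variable names are illustrative.

```python
empty_braces = {}     # empty curly braces create a dictionary
empty_set = set()     # set() is the way to create an empty set

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>
```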