# 1. Data Structures


A data structure stores data in an organized way so that it can be accessed and manipulated efficiently.

Data structures in Python are a key concept to learn before we dive into the nuances of data science and model building.

There is no one-size-fits-all data structure. Different tasks call for different ways of storing data, so it pays to know what each structure offers.

## 1.1 Variables

In [1]:
# variable assignment
a = 2
b = "variable"
c = 3.0
d = True

In [3]:
b

'variable'

In [4]:
# calculation
a + c
a / a

1.0

In [6]:
b + ' this is new ' + b

'variable this is new variable'

**Python Operators**

| Python Operator | Description |
| :---: | :---: |
| `+` | addition |
| `-` | subtraction |
| `*` | multiplication |
| `/` | division |
| `%` | modulus |
| `**` | power |
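As a quick illustration of the table above, the sketch below applies each operator to two integers. The operands `x` and `y` are hypothetical values chosen only for this example.

```python
# Hypothetical operands, chosen only for illustration
x = 7
y = 2

total = x + y        # addition
difference = x - y   # subtraction
product = x * y      # multiplication
quotient = x / y     # division always returns a float in Python 3
remainder = x % y    # modulus: the remainder of the division
power = x ** y       # power: x raised to y

print(total, difference, product, quotient, remainder, power)
```

Note that `/` returns a float even when both operands are integers, which is why `a / a` gives `1.0` rather than `1`.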

In [26]:
print(d)
print(a)

True
2


In [8]:
e = d

In [9]:
e

True

In [11]:
type(d), type(a)

(bool, int)

In [12]:
b

'variable'

In [15]:
print(type(d), type(b), type(a))

<class 'bool'> <class 'str'> <class 'int'>


In [16]:
d

True

In [17]:
# type conversion
d_str = str(d)
type(d_str)

str
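The same pattern works for the other built-in types. The following is a small sketch with made-up input values, showing a few common conversions and one behaviour that is easy to miss: `int()` truncates a float instead of rounding it.

```python
# Illustrative values only; none of these come from the cells above
n = int("42")       # str -> int
f = float("3.5")    # str -> float
s = str(2.0)        # float -> str, gives '2.0'
flag = bool(0)      # 0 converts to False, any non-zero number to True
whole = int(3.9)    # float -> int truncates toward zero

print(n, f, s, flag, whole)
```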

## 1.2 Lists

Lists are among the most versatile data structures in Python. They can store heterogeneous data items, from integers to strings or even other lists. They are also mutable, which means that their elements can be changed after the list is created.

In [33]:
# lists
l_0 = ['cat',
     'dog', 
     'horse']
print(type(l_0))

<class 'list'>


In [34]:
l = [a, False, b, c, 6, l_0]

In [38]:
l

[2, False, 'variable', 3.0, 6, ['cat', 'dog', 'horse']]

In [40]:
# access list element
# print(l[2])
print(l[-2])

6
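Because lists are mutable, an index can be used to replace an element as well as to read it. The sketch below uses a hypothetical list named `animals` to show in-place assignment, slicing, and negative indexing.

```python
# Hypothetical list used only for this illustration
animals = ['cat', 'dog', 'horse']

animals[1] = 'mouse'      # replace the element at index 1 in place
first_two = animals[:2]   # slicing returns a new list with indices 0 and 1
last = animals[-1]        # negative indices count from the end

print(animals, first_two, last)
```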


In [44]:
# append elements to list
l = ['cat', 'dog', 'horse']

l.append('mouse')
print(l)

['cat', 'dog', 'horse', 'mouse']


In [None]:
# remove elements
l.remove('dog')
print(l)

In [None]:
l.pop(0)

In [None]:
l

In [45]:
# sorting lists
l = [5, 10, 3, 2, 3]
l.sort()
l

[2, 3, 3, 5, 10]

In [46]:
# concatenating lists
list_1 = l
list_2 = ['cat', 'dog']

list_new = list_1 + list_2

In [47]:
list_new

[2, 3, 3, 5, 10, 'cat', 'dog']

In [50]:
# for loop
for x in list_new:
    print(type(x), x)

<class 'int'> 2
<class 'int'> 3
<class 'int'> 3
<class 'int'> 5
<class 'int'> 10
<class 'str'> cat
<class 'str'> dog


In [53]:
# list comprehension: keep only the elements that are not integers
list_non_numeric = [x for x in list_new if type(x) != int]

In [54]:
list_non_numeric

['cat', 'dog']

In [55]:
for x in list_new:
    if type(x) != int:
        print(x)

cat
dog
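A comprehension can also transform the elements it keeps. The sketch below is one possible variation, not part of the original notebook: it uses `isinstance` for the type test and a hypothetical list `mixed` with the same contents as `list_new`.

```python
# Same contents as list_new above, redefined so the example is self-contained
mixed = [2, 3, 3, 5, 10, 'cat', 'dog']

# Filter: keep only the integers
numbers = [x for x in mixed if isinstance(x, int)]

# Transform: square every integer that was kept
squares = [x ** 2 for x in numbers]

print(numbers)
print(squares)
```

One caveat: `isinstance(True, int)` is `True` because `bool` is a subclass of `int`, so a list containing booleans would need the stricter `type(x) == int` test.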


## 1.3 Tuples

Tuples are another popular built-in data structure in Python. They are similar to lists except for one difference: they are immutable. Once a tuple is created, no element can be added, deleted, or replaced.

In [56]:
# tuples
a = (2, 3)
b = (False, 'variable')
c = (e, d)
print(type(c))

<class 'tuple'>


In [57]:
d = (6, a, b, l)

In [58]:
d

(6, (2, 3), (False, 'variable'), [2, 3, 3, 5, 10])

In [60]:
# tuple packing
planets = ('mercury', 'venus', 'earth')
print(planets)
# tuple unpacking
a, b, c = planets

('mercury', 'venus', 'earth')


In [61]:
b

'venus'

In [62]:
planets

('mercury', 'venus', 'earth')
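To make the immutability concrete, the sketch below tries to assign to a tuple element and catches the resulting `TypeError`. The names `point` and `record` are hypothetical. It also shows a subtlety: a tuple cannot be changed, but a mutable object stored inside it (such as a list) still can.

```python
point = (2, 3)          # hypothetical tuple
error_message = None

try:
    point[0] = 10       # tuples do not support item assignment
except TypeError as err:
    error_message = str(err)
    print("cannot modify a tuple:", error_message)

# The tuple itself is fixed, but the list it holds is still mutable
record = (6, [1, 2])
record[1].append(3)
print(record)
```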

## 1.4 Dictionaries

A dictionary is another Python data structure for storing heterogeneous objects. Dictionaries are mutable, but their keys must be immutable (hashable) objects such as strings or numbers. Since Python 3.7, dictionaries preserve insertion order, so iterating over one returns the elements in the order in which they were inserted; in older versions the order was not guaranteed.

What sets dictionaries apart from lists is the way elements are stored. Elements in a dictionary are accessed via their keys instead of their index, as we did with a list. So dictionaries contain key-value pairs instead of just single elements.

In [63]:
d = {
  "brand": "Ford",
  "model": "Mustang",
  "year": 1964
}
print(d)

{'brand': 'Ford', 'model': 'Mustang', 'year': 1964}


In [64]:
age = [5, 6, 7, 8, 5]
name = ['john', 'laura', 'ten', 'x', 'y']

In [65]:
class_age = {
    "age": age,
    "name": name
            }

In [66]:
class_age

{'age': [5, 6, 7, 8, 5], 'name': ['john', 'laura', 'ten', 'x', 'y']}

In [67]:
d_new = {'car_1':d,
        'car_2': {"brand": "Toyota"}}

In [68]:
d_new

{'car_1': {'brand': 'Ford', 'model': 'Mustang', 'year': 1964},
 'car_2': {'brand': 'Toyota'}}

In [69]:
d_new['car_2']

{'brand': 'Toyota'}

In [70]:
# Accessing keys and values
d.keys()

dict_keys(['brand', 'model', 'year'])

In [71]:
d.values()

dict_values(['Ford', 'Mustang', 1964])

In [74]:
for k, v in d.items():
    print(k, v)

brand Ford
model Mustang
year 1964


In [76]:
d.items()

dict_items([('brand', 'Ford'), ('model', 'Mustang'), ('year', 1964)])

In [77]:
d_new.items()

dict_items([('car_1', {'brand': 'Ford', 'model': 'Mustang', 'year': 1964}), ('car_2', {'brand': 'Toyota'})])
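The cells above only read from dictionaries. The sketch below shows the mutating operations on a copy of the car example; the `"color"` entry and the `"owner"` lookup are made-up values added only for illustration.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

car["year"] = 1965            # update the value of an existing key
car["color"] = "red"          # add a new key-value pair (illustrative value)
removed = car.pop("model")    # remove a key and get its value back

keys_in_order = list(car)     # keys come back in insertion order (Python 3.7+)
missing = car.get("owner", "unknown")  # .get avoids a KeyError for absent keys

print(car)
print(removed, keys_in_order, missing)
```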

## 1.5 Sets

Sometimes you don’t want multiple occurrences of the same element in your list or tuple. This is where a set is useful. A set is an unordered but mutable collection of elements that contains only unique values.

In [78]:
print(l)

[2, 3, 3, 5, 10]


In [79]:
s = set(l)

In [82]:
s

{2, 3, 5, 10}
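Beyond removing duplicates, sets support the usual set algebra. The following sketch rebuilds the set from the same list and combines it with a second, hypothetical set named `other`.

```python
values = [2, 3, 3, 5, 10]
unique = set(values)             # duplicates collapse into one element

unique.add(7)                    # sets are mutable: add a new element
unique.add(3)                    # adding an existing element has no effect

other = {3, 4, 5}                # hypothetical second set
common = unique & other          # intersection: elements in both sets
combined = unique | other        # union: elements in either set
only_in_unique = unique - other  # difference: elements only in the first set
has_five = 5 in unique           # membership test

print(common, combined, only_in_unique, has_five)
```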

# 2. Functions

A function is a set of statements that takes inputs, performs a specific computation, and produces an output. The idea is to put a commonly or repeatedly performed task into a function, so that instead of writing the same code again and again for different inputs, we can call the function.

Python provides built-in functions such as `print()`, but we can also create our own functions. These are called user-defined functions.

In [None]:
def cube(x):
    
    x = x ** 3
    
    return x

In [None]:
sample_variable = 5
cube(sample_variable)

In [None]:
def even_or_odd(x):
    if x % 2 == 0:
        print("even")
    else:
        print("odd")

In [None]:
even_or_odd(123)

In [None]:
even_or_odd(4)
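`even_or_odd` prints its answer, so the result cannot be stored or reused. A possible variation, sketched below under the hypothetical name `parity`, returns the string instead, which lets the caller decide what to do with it.

```python
def parity(x):
    """Return 'even' or 'odd' for an integer x (hypothetical helper)."""
    return "even" if x % 2 == 0 else "odd"

# The returned value can be stored, compared, or collected in a list
labels = [parity(n) for n in [123, 4]]
print(labels)
```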

**Python Comparison Operators**

| Python Operator | Description  |
| :---: | :---: |
| `==` | equal |
| `!=` | Not equal |
| `>` | Greater than |
| `<` | Less than |
| `>=` | Greater than or equal to  |
| `<=` | Less than or equal to |


**Python Logical Operators**

| Python Operator | Description  | Example  |
| :---: | :---: | :---: |
| `and` | Returns True if both statements are true | x<5 and x<10 |
| `or` | Returns True if at least one of the statements is true | x<5 or x<4 |
| `not` | Reverses the result, returns False if the result is true | not(x<5 and x<10) |
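The examples in the tables can be evaluated directly. The sketch below assumes `x = 3`, a value chosen only for illustration, and stores each result in a variable.

```python
x = 3  # assumed value, chosen only for this illustration

is_equal = x == 3                  # comparison operators return a bool
is_not_equal = x != 3

both = x < 5 and x < 10            # True only if both comparisons are true
either = x < 5 or x < 4            # True if at least one comparison is true
negated = not (x < 5 and x < 10)   # reverses the result of the expression

print(is_equal, is_not_equal, both, either, negated)
```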

In [None]:
def data_generating_function():
    
    return [1, 2, 3, 4]

data_generating_function()

In [None]:
def swap(x, y):
    
    temp = x   # keep the original x before overwriting it
    x = y
    y = temp
    return x, y
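Rebinding `x` and `y` inside a function does not change the caller's variables, so a swap only takes effect if the new values are returned and reassigned. The sketch below uses a hypothetical function `swap_values` and also shows the tuple-unpacking idiom, which needs no function at all.

```python
def swap_values(x, y):
    """Return the two arguments in reversed order (hypothetical helper)."""
    return y, x

first = 1
second = 2

# The caller must reassign the returned values for the swap to take effect
first, second = swap_values(first, second)
print(first, second)

# Tuple packing and unpacking performs the same swap in one line
first, second = second, first
print(first, second)
```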

In [None]:
# keyword argument

# The idea is to allow caller to specify argument name with values
# so that caller does not need to remember order of parameters.

def student(first_name, last_name):
    return first_name + ' ' + last_name

In [None]:
student(first_name='amir', last_name='imani')
# student('amir', 'imani')
# student(first_name='amir', last_name=123)  # raises TypeError: a str and an int cannot be concatenated
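One way to catch a wrong argument type early is an assertion. The sketch below is an assumption about how that check might look, using a hypothetical function `student_checked`; it also shows that keyword arguments may be passed in any order.

```python
def student_checked(first_name, last_name):
    """Join two names, asserting first that both are strings (hypothetical)."""
    assert isinstance(first_name, str), "first_name must be a string"
    assert isinstance(last_name, str), "last_name must be a string"
    return first_name + ' ' + last_name

# Keyword arguments can be given in any order
full_name = student_checked(last_name='imani', first_name='amir')
print(full_name)

# A non-string argument now fails with a clear message
failure = None
try:
    student_checked(first_name='amir', last_name=123)
except AssertionError as err:
    failure = str(err)
    print("assertion failed:", failure)
```

Assertions are skipped when Python runs with the `-O` flag, so code that must always validate its inputs would more likely raise a `TypeError` explicitly.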

In [None]:
# variable length arguments
def my_function(*argv):
    for arg in argv:
        print(arg) 
        
# for a variable number of keyword arguments, use: def my_function(**kwargs):
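Both forms can be combined in one signature. The sketch below uses a hypothetical function `describe`: inside it, `args` is a tuple of the positional arguments and `kwargs` is a dictionary of the keyword arguments.

```python
def describe(*args, **kwargs):
    """Collect positional and keyword arguments (hypothetical helper)."""
    positional = list(args)   # args is a tuple of the positional arguments
    named = dict(kwargs)      # kwargs is a dict mapping names to values
    return positional, named

positional, named = describe(1, 2, brand="Ford", year=1964)
print(positional)

for key, value in named.items():
    print(key, value)
```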