# Python Basics 2

## What will we cover?

Now that we have a basic idea of Python data types and have explored lists, tuples and dictionaries, we're getting closer to the concepts we'll use for data analysis. Especially in the Data Understanding and Data Preparation stages, we will often need to transform or clean up our data. The concepts in this tutorial help us do this. Here's what we'll cover:

* Conditions
* Loops
* Functions




## Conditions

Conditions are a way to test whether something is true and, if so, to do something about it. Let's explore a bit.


In [None]:
a = 1

In [None]:
if a == 1:
    print('a is 1!!!')

In [None]:
a = 0

In [None]:
if a == 1:
    print('a is 1!')

In [None]:
if a == 2:
    print('a is 2!')
else:
    print('a is not 2!')

Some important things that we've done above:
* First, we wrote a condition by checking IF SOMETHING is equal (==) to SOMETHING ELSE. Note that == compares two values, while a single = assigns a value.
* Notice that after the condition, we have a colon (:). It introduces what needs to be done if that condition is TRUE.
* After the colon, we have indented text (by convention four spaces; a tab also works, as long as you are consistent). Everything within that indented area (including the lines below, as long as they are also indented) belongs to that condition.
* We also used ELSE to indicate what to do if the condition was FALSE.
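
To make the role of indentation concrete, here is a minimal sketch. The list name `messages` is made up for illustration; it collects text instead of printing it directly, so the result is easy to inspect at the end.

```python
a = 0
messages = []  # hypothetical list used to collect the results

if a == 1:
    # Both indented lines belong to the if block.
    messages.append('a is 1')
    messages.append('still inside the if block')
else:
    messages.append('a is not 1')

# This line is not indented, so it runs whether or not the condition was true.
messages.append('done')
print(messages)
```

Changing `a` to 1 and running the cell again should add both lines of the if block instead of the else line.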


Let's look at another example:

In [None]:
mylist = [1, 2, 3]

if 3 in mylist:
    print("there's a 3 in my list")
    print("my list has", len(mylist), "items")
else:
    print("there's no 3 in my list")

In [None]:
if 4 in mylist:
    print("there's a 4 in my list")
    print("my list has", len(mylist), "items")
else:
    print("there's no 4 in my list")

In [None]:
a = 1
b = 1000000

if a > b: 
    print('a > b')
elif a < b: 
    print('a < b')
elif a == b: 
    print('a == b')

The "elif" condition is shorthand for "else if". It lets us check one condition and, only if that condition is not true, check another one. In the example above this makes no difference, because at most one of the three conditions can be true at a time. Compare the two examples below to see when it does matter.

In [None]:
a = 0
b = 0


if a == b: 
    print('a == b')
elif a == 0: 
    print('a == 0')


In [None]:
a = 0
b = 0


if a == b: 
    print('a == b')
if a == 0: 
    print('a == 0')
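
The difference between the two cells above can also be seen side by side. The sketch below repeats both versions but stores the messages in two lists (`with_elif` and `with_two_ifs`, names chosen just for this illustration) so they can be compared.

```python
a = 0
b = 0

# Version 1: if / elif. The elif is skipped once the first condition is true.
with_elif = []
if a == b:
    with_elif.append('a == b')
elif a == 0:
    with_elif.append('a == 0')

# Version 2: two separate ifs. Each condition is checked on its own.
with_two_ifs = []
if a == b:
    with_two_ifs.append('a == b')
if a == 0:
    with_two_ifs.append('a == 0')

print(with_elif)
print(with_two_ifs)
```

With elif only the first matching branch runs, while two separate ifs can both run when both conditions are true.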

Enough about conditions for now. We'll use them a lot more later on, combined with loops and functions.

## Loops
Loops are a way to ask Python to do something repeatedly, either while a certain condition holds or once for each item in a collection. While it is possible to create a loop that does not end (one without an exit condition), we should never do that... otherwise, the code would run forever. Let's see some basic loops.

In [2]:
counter = 0
while counter < 10:
    print('counter at', counter)
    counter += 1

print('counter finished at', counter)

counter at 0
counter at 1
counter at 2
counter at 3
counter at 4
counter at 5
counter at 6
counter at 7
counter at 8
counter at 9
counter finished at 10


In [3]:
mylist = [1,2,3,4,5,6,7,8,9,10]

for i in mylist:
    print(i)

1
2
3
4
5
6
7
8
9
10


**while** and **for** are the two common ways to write loops in Python:
* while will repeat the code (that is in the indented area) as long as a certain condition is true, and stops once it becomes false
* for will run the code once for each element of a list (or a string, or a tuple) until there are no elements left
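
As a rough illustration of how the two relate, the sketch below adds up the same numbers once with for and once with while. The variable names (`numbers`, `total_for`, `total_while`, `index`) are made up for this example.

```python
numbers = (2, 4, 6)  # a tuple works the same way as a list here

# for: visit each element in turn.
total_for = 0
for n in numbers:
    total_for += n

# while: keep going as long as the index is still inside the tuple.
total_while = 0
index = 0
while index < len(numbers):
    total_while += numbers[index]
    index += 1  # without this line the condition never becomes false

print(total_for, total_while)
```

When we simply want to go through every element, for is usually the shorter and safer choice, because there is no counter that we can forget to update.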

In [4]:
for l in 'this is a sentence!':
    print(l)

t
h
i
s
 
i
s
 
a
 
s
e
n
t
e
n
c
e
!


## Functions
Functions are a way to package a certain activity so that it can be repeated many times in Python. They let you write an action once and reuse it wherever you need it. As you can imagine, this is really powerful for data preparation. Let's see some examples.


Let's say we want a function that always adds 1 to any number we give it.


In [None]:
def add_one(number):
    number = number + 1
    return number

In [None]:
add_one(3)

In [None]:
add_one(1009209)

In [None]:
add_one(-1)

In [None]:
mylist = [1,2,3,4,5,6]

In [None]:
for item in mylist:
    print(add_one(item))

In [None]:
mylist
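
Note that calling add_one inside the loop only prints the results; the original list stays as it was. One possible way to keep the results, sketched here with a hypothetical list called `plus_one`, is to append each returned value to a new list.

```python
def add_one(number):
    number = number + 1
    return number

mylist = [1, 2, 3, 4, 5, 6]

# Hypothetical name: a new list that stores the results.
plus_one = []
for item in mylist:
    plus_one.append(add_one(item))

print(plus_one)  # the values returned by add_one
print(mylist)    # the original list, unchanged
```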

Let's say we have a list with data that needs to be cleaned. The list has numerical values (which we want), but also some unneeded text data (which we want to replace with 99, our code for a missing value). How can we do this?

In [None]:
mydata = [1,2.5,3,4,1,2,3,4,2,23,4,5,6,'firhkj', 1,2,3, 'dejde']

In [None]:
len(mydata)

First, let's write a function to do this. To write the function, we need to work out what kind of check it has to make. The easiest way is to try it out by hand first.

In [None]:
type(1)

In [None]:
type(2.5)

In [None]:
type('firhkj')

Okay, so it looks like our condition needs to use the output of **type** (because we want to keep numbers but not text).

In [None]:
def check_number(item):
    if type(item) == str:
        return 99
    else:
        return item
        

In [None]:
check_number(1)

In [None]:
check_number(2.5)

In [None]:
check_number('firhkj')

Looks like it worked! Let's now create a new list and add the items from the previous list to it, cleaning them up along the way.

In [None]:
clean_list = []

In [None]:
for item in mydata:
    new_item = check_number(item)
    clean_list.append(new_item)

In [None]:
clean_list

In [None]:
mydata
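
The loop above can itself be wrapped in a function, so the whole cleaning step becomes reusable. This is only a sketch: `clean_data` and `cleaned` are hypothetical names, and the list used here is a shortened sample of the data above.

```python
def check_number(item):
    if type(item) == str:
        return 99
    else:
        return item

def clean_data(data):
    # Hypothetical helper: returns a cleaned copy and leaves `data` untouched.
    cleaned = []
    for item in data:
        cleaned.append(check_number(item))
    return cleaned

mydata = [1, 2.5, 3, 'firhkj', 1, 2, 3, 'dejde']  # shortened sample of the list above
print(clean_data(mydata))
print(mydata)
```

Because the function builds a new list, the original data is still available if we later decide to clean it differently.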

# Assignments

## Challenge 1
Create a function that tests if an element of a list is a number or a string. If it is a number, it should return its value divided by two. If it is a string, it should return the length of the string.

In [1]:
def half_or_length(element):
    # Strings return their length; numbers return half their value.
    if type(element) == str:
        return len(element)
    elif type(element) == int or type(element) == float:
        return element / 2
    else:
        return "Invalid element"

testList = [46, "Swa", 567987, 56.9, "Amb", 6723413, "Advay", 677777552533, "Samay", "Pepernotten", 87646.0986644]
for element in testList:
    print(half_or_length(element))

23.0
3
283993.5
28.45
3
3361706.5
5
338888776266.5
5
11
43823.0493322


## Challenge 2

The table below is a dataset with customer IDs, ages, the name of the website each customer visited, and whether they clicked on the ad. You need to do three things:

1. Move this table to a data structure (list, dictionary, list of lists... or whatever you prefer)
2. Create a function that categorizes the website type (define a categorization that you find meaningful)
3. Create a function that categorizes the customers according to age groups (define which age groups you want to use)


| CID  | Age  | Website  | Click  |
|---|---|---|---|
|  1 | 19  | NBC  |  1 |
|  2 | 25  | New York Times  |  0 |
|  3 | 41  | Facebook  | 0  |
|  4 | 37  | Twitter  |  1 |
|  5 | 64  | Buzzfeed  | 1  |
|  6 | 50  | CBS  |  0 |
|  7 | 18  | The Guardian  | 0  |
|  8 | 55  | Google News  | 1  |





