# Python Basics 2

## What will we cover?

Now that we have a basic idea of Python data types, and have explored lists, tuples and dictionaries, we're getting closer to the concepts we'll use for data analysis. Especially in the Data Understanding and Data Preparation stages, we will often need to transform or clean up our data. The concepts in this tutorial help us do this. Here's what we'll cover:

* Conditions
* Loops
* Functions




## Conditions

Conditions are a way to test whether something is true and, if so, to do something about it. Let's explore a bit.


In [3]:
a = 1

In [4]:
if a == 1:
    print('a is 1!!!')

a is 1!!!


In [5]:
a = 0

In [6]:
if a == 1:
    print('a is 1!')

In [7]:
if a == 2:
    print('a is 2!')
else:
    print('a is not 2!')

a is not 2!


Some important things that we've done above:
* First, we indicated a condition by checking IF SOMETHING is equal (==) to SOMETHING ELSE.
* Notice that after the condition, we have a colon (:). It marks the start of what needs to be done if that condition is TRUE.
* After the colon, we have indented text (by convention, four spaces). Everything within that indented area (including the lines below, as long as they are also indented) belongs to that condition.
* We also used ELSE to indicate what to do if the condition was FALSE.
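
To make the role of indentation concrete, here is a minimal sketch. The variable names `temperature` and `messages` are made up for this illustration, and the value 15 is chosen so that the condition is FALSE.

```python
temperature = 15  # hypothetical value, chosen so the condition below is False

messages = []
if temperature > 20:
    # both indented lines belong to the condition and are skipped here
    messages.append('warm')
    messages.append('no jacket needed')

# this line is not indented, so it runs no matter what the condition was
messages.append('check finished')

print(messages)  # ['check finished']
```

Changing `temperature` to 25 would add all three messages, because the indented lines would then run as well.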


Let's look at another example:

In [8]:
mylist = [1, 2, 3]

if 3 in mylist:
    print("there's a 3 in my list")
    print("my list has", len(mylist), "items")
else:
    print("there's no 3 in my list")

there's a 3 in my list
my list has 3 items


In [9]:
if 4 in mylist:
    print("there's a 4 in my list")
    print("my list has", len(mylist), "items")
else:
    print("there's no 4 in my list")

there's no 4 in my list


In [10]:
a = 1
b = 1000000

if a > b: 
    print('a > b')
elif a < b: 
    print('a < b')
elif a == b: 
    print('a == b')

a < b


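The "elif" keyword is shorthand for "else if". It lets us check one condition and, only if that condition is not true, check another. In the example above, at most one of the three conditions can be true, so separate "if" statements would have given the same result. The two examples below show when it matters: with "elif", Python stops checking as soon as one condition is true, whereas separate "if" statements are each checked on their own.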

In [11]:
a = 0
b = 0


if a == b: 
    print('a == b')
elif a == 0: 
    print('a == 0')


a == b


In [12]:
a = 0
b = 0


if a == b: 
    print('a == b')
if a == 0: 
    print('a == 0')

a == b
a == 0


Enough about conditions for now. We'll use them a lot more later on, combined with loops and functions.

## Loops
Loops are a way to ask Python to do something repeatedly - until a certain condition is met. While it is possible to create a loop that never ends (one without an exit condition), we should never do that... otherwise, the code would run forever. Let's see some basic loops.

In [13]:
counter = 0
while counter < 10:
    print('counter at', counter)
    counter += 1

print('counter finished at', counter)

counter at 0
counter at 1
counter at 2
counter at 3
counter at 4
counter at 5
counter at 6
counter at 7
counter at 8
counter at 9
counter finished at 10


In [14]:
mylist = [1,2,3,4,5,6,7,8,9,10]

for i in mylist:
    print(i)

1
2
3
4
5
6
7
8
9
10


**while** and **for** are the two ways to write loops in Python:
* while repeats the code (in the indented area) for as long as its condition is true, i.e., until the condition becomes false
* for loops over each element of a list (or a string, or a tuple) until there are no elements left

In [15]:
for l in 'this is a sentence!':
    print(l)

t
h
i
s
 
i
s
 
a
 
s
e
n
t
e
n
c
e
!
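
Loops and conditions can also be combined. As a minimal sketch, the code below loops over a tuple and then counts the spaces in the sentence used above. The names `coordinates` and `spaces` and the tuple values are made up for this illustration.

```python
# A for loop works on a tuple in the same way as on a list
coordinates = (52.37, 4.90)  # hypothetical values
for value in coordinates:
    print(value)

# Combining a loop with a condition: count the spaces in a sentence
sentence = 'this is a sentence!'
spaces = 0
for letter in sentence:
    if letter == ' ':
        spaces += 1

print('number of spaces:', spaces)  # number of spaces: 3
```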


## Functions
Functions are a way to write a certain action once and reuse it many times in Python. As you can imagine, this is really powerful for data preparation. Let's see some examples.


Let's say we want a function that always adds 1 to any number we give to it.


In [16]:
def add_one(number):
    number = number + 1
    return number

In [17]:
add_one(3)

4

This may be going too fast. Let's look at the elements above that constitute a function.

In the **first line**:
* ```def``` indicates to Python that you are creating (defining) a function
* ```add_one``` is the name of the function. You can give it any name you want, as long as (a) it starts with a letter or an underscore, (b) it contains only letters, digits and underscores (so no spaces), and (c) it is not one of Python's reserved keywords (such as ```if```, ```for``` or ```def```).
* ```(number)``` specifies the arguments that will be passed along to the function (i.e., that the function will use). It is up to you what name you give to the argument. A function can also be written without arguments - i.e., just ```()``` - but in this course we will mostly use arguments.
* ```:``` don't forget this - it indicates that the first line of the definition is complete, and that what comes underneath is the body of the function

**After the first line**, notice the indent: everything that happens inside the function is indented (with spaces). That way, Python knows which lines of code belong to the function.

In this area, the function is being defined. The arguments - in our case, ```number``` - can then be used by the function in whatever operations we want to run. In this case, we are saying to add 1 to number or, in code, ```number = number + 1```.

**In the last line of the function**, we have ```return```, which specifies the value that the function gives back to the code that called it.
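
As a small sketch of what ```return``` makes possible: the returned value can be stored in a variable and used again. The variable names `result`, `x` and `y` are made up for this illustration.

```python
def add_one(number):
    number = number + 1
    return number

# The returned value can be stored in a variable and reused
result = add_one(3)
print(result)  # 4

# The variable we pass in is not changed by the function
x = 10
y = add_one(x)
print(x, y)  # 10 11
```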







*Tip:* the argument name is not important, as long as you use it consistently. Here we used ```number```, but we could have called it whatever we wanted. 

So:

```
def add_one(number):
    number = number + 1
    return number
```

Does the same thing as:
```
def add_one(whateveriwant):
    whateveriwant = whateveriwant + 1
    return whateveriwant
```

Or as:
```
def add_one(a):
    a = a + 1
    return a
```

In [18]:
add_one(1009209)

1009210

In [19]:
add_one(-1)

0

In [20]:
mylist = [1,2,3,4,5,6]

In [21]:
for item in mylist:
    print(add_one(item))

2
3
4
5
6
7


In [22]:
mylist

[1, 2, 3, 4, 5, 6]
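
Notice that ```mylist``` itself did not change: the loop only printed the results. If we want to keep the results, one option is to collect them in a new list. This is a minimal sketch, and the name `newlist` is made up for this illustration.

```python
def add_one(number):
    number = number + 1
    return number

mylist = [1, 2, 3, 4, 5, 6]

# Start with an empty list and append each result to it
newlist = []
for item in mylist:
    newlist.append(add_one(item))

print(newlist)  # [2, 3, 4, 5, 6, 7]
print(mylist)   # [1, 2, 3, 4, 5, 6] (the original list is untouched)
```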

Let's say we have a list with data that needs to be cleaned. The list has numerical values (which we want), but also some unneeded text data (which we want to replace with 99, our code for a missing value). How can we do this?

In [23]:
mydata = [1,2.5,3,4,1,2,3,4,2,23,4,5,6,'firhkj', 1,2,3, 'dejde']

In [24]:
len(mydata)

18

First, let's write a function to do this. To write the function, we need to figure out what kind of check we need to make. The easiest way is to first try it out ourselves.

In [25]:
type(1)

int

In [26]:
type(2.5)

float

In [27]:
type('firhkj')

str

Okay, so it looks like our condition needs to use the output of **type** (because we want numbers but not text).

In [28]:
def check_number(item):
    # keep integers and floats as they are
    if type(item) == int:
        return item
    elif type(item) == float:
        return item
    else:
        # anything else (e.g., text) becomes 99, our code for a missing value
        return 99

In [29]:
check_number(1)

1

In [30]:
check_number(2.5)

2.5

In [31]:
check_number('firhkj')

99
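
To clean the whole list, we can combine the function with a loop. The sketch below restates the function with 99 as the missing-value code, as described earlier, and collects the results in a new list. The name `cleaned` is made up for this illustration.

```python
mydata = [1, 2.5, 3, 4, 1, 2, 3, 4, 2, 23, 4, 5, 6, 'firhkj', 1, 2, 3, 'dejde']

def check_number(item):
    if type(item) == int:
        return item
    elif type(item) == float:
        return item
    else:
        return 99  # our code for a missing value

# Run every item through the function and keep the results
cleaned = []
for item in mydata:
    cleaned.append(check_number(item))

print(cleaned)
# [1, 2.5, 3, 4, 1, 2, 3, 4, 2, 23, 4, 5, 6, 99, 1, 2, 3, 99]
```

The cleaned list has the same length as the original (18 items); only the two text values were replaced.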

Now for something more complex (and useful): let's say we have a list of visitors coming from different websites, and we want to categorize the type of website.

In [32]:
visitors = ['Facebook', 'Twitter', 'Twitter', 'YouTube', 'NYT',
           'Facebook', 'WP', 'YouTube', 'NYT', 'NYT', 'Instagram']

In [33]:
visitors

['Facebook',
 'Twitter',
 'Twitter',
 'YouTube',
 'NYT',
 'Facebook',
 'WP',
 'YouTube',
 'NYT',
 'NYT',
 'Instagram']

Let's create a function for this:

In [34]:
def check_website(website):
    if website in ['Facebook', 'Twitter', 'Instagram']:
        return 'Social Media'
    if website in ['NYT', 'WP']:
        return 'News website'
    return 'not categorized'

In [35]:
for item in visitors:
    print(check_website(item))

Social Media
Social Media
Social Media
not categorized
News website
Social Media
News website
not categorized
News website
News website
Social Media
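
Printing the categories is a first step. A possible next step, sketched below, is to count how many visitors fall into each category using a dictionary. The names `counts` and `category` are made up for this illustration.

```python
visitors = ['Facebook', 'Twitter', 'Twitter', 'YouTube', 'NYT',
            'Facebook', 'WP', 'YouTube', 'NYT', 'NYT', 'Instagram']

def check_website(website):
    if website in ['Facebook', 'Twitter', 'Instagram']:
        return 'Social Media'
    if website in ['NYT', 'WP']:
        return 'News website'
    return 'not categorized'

# Keep one counter per category in a dictionary
counts = {}
for item in visitors:
    category = check_website(item)
    if category in counts:
        counts[category] += 1   # category seen before: add one
    else:
        counts[category] = 1    # first time we see this category

print(counts)
# {'Social Media': 5, 'not categorized': 2, 'News website': 4}
```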
