# Variables

The first thing we need is a way to store information. Whether it is a whole dataset, your birthday or the pictures you took on your holidays, we will store all of it in variables. Remember, Python is dynamically typed, so a variable that holds a string at the beginning can hold an integer later on.

In [1]:
# Assign the variable age the value 47, which is an integer
age = 47

# Print the variable
print(age)

# print('Age: '+age) would raise a TypeError, because a string and an integer cannot be concatenated
# But this will work:
print('Age: '+str(age))
print('Age: ',age)

47
Age: 47
Age:  47
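
As a side note, the `str()` conversion above can also be avoided with an f-string (available since Python 3.6). A minimal sketch, reusing the `age` variable from the cell above; the name `message` is introduced only for illustration:

```python
# Assumes Python 3.6 or later, where f-strings are available
age = 47

# The expression inside the braces is converted to text automatically
message = f"Age: {age}"
print(message)
```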


Printing is great for keeping track of what your code is doing:

In [2]:
age = 47
print(age)

age = 'forty seven'
print(age)

47
forty seven


### Various ways of printing:

In [4]:
#Various ways of printing
name = "xuefei lu"
print(name.lower())
print(name.upper())
print(name.title())

#Prints a new line
print("\n")

#Stripping whitespace
name = " xuefei "
print("|"+name.lstrip()+"|")
print("|"+name.rstrip()+"|")
print("|"+name.strip()+"|")

xuefei lu
XUEFEI LU
Xuefei Lu


|xuefei |
| xuefei|
|xuefei|


### Numbers:

In [6]:
a = 10
b = -10.1023

#Some operations illustrated (\t stands for a tab)
print("a: \t\t"+str(a))
print("b: \t\t"+str(b))
print("absolute of b: \t\t\t"+str(abs(b)))
print("rounded b: \t\t\t"+str(round(b,3)))
print("square of a: \t\t\t"+ str(pow(a,2)))
print("cube of a: \t\t\t"+ str(a**3))
print("integer part of b: \t\t"+ str(int(b)))

a: 		10
b: 		-10.1023
absolute of b: 			10.1023
rounded b: 			-10.102
square of a: 			100
cube of a: 			1000
integer part of b: 		-10
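
Note that `int()` drops the fractional part, which for negative numbers is not the same as rounding down. The sketch below contrasts it with `math.floor` and `math.ceil` from the standard library; the variable names are chosen for illustration only:

```python
import math

b = -10.1023

truncated = int(b)       # drops the fractional part, moving toward zero
floored = math.floor(b)  # rounds down to the next lower integer
ceiled = math.ceil(b)    # rounds up to the next higher integer

print(truncated, floored, ceiled)
```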


# Control flow

Control flow statements manage the order in which operations are executed, depending on conditions. Most notably, if and for statements determine how your program handles different scenarios. Notice the indentation that is used. Python is strict about this, because indentation determines which block a statement belongs to.

## If statements

In [7]:
price = -5

if price < 0:
    print("Price is negative!")
elif price < 1:
    print("Price is too small!")
else:
    print("Price is suitable.")

Price is negative!


### Comparing strings:

In [9]:
name1 = "Xuefei"
name2 = "xuefei"

if name1 == name2:
    print("Equal")
else:
    print("Not equal")

if name1.lower() == name2.lower():
    print("Equal")
else:
    print("Not equal")

Not equal
Equal


### Connecting conditions with logical operators:

In [7]:
number = 9
if number > 1 and not number > 9:
    print("Number is between 1 and 10")
    
if number < 3 or number > 6:
    print("Number is lower than 3 or higher than 6")

Number is between 1 and 10
Number is lower than 3 or higher than 6
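
Python also allows comparisons to be chained, which can be easier to read than combining `and` with `not`. A small sketch of the first condition above written that way; `in_range` is a name introduced here for illustration:

```python
number = 9

# Equivalent to: number > 1 and not number > 9
in_range = 1 < number <= 9

if in_range:
    print("Number is between 1 and 10")
```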


### Indentation:

In [19]:
# Be careful with indentation
number_1 = 3
number_2 = 5

print('No indent (no tabs used)')
if number_1 > 1:
    print('\tNumber 1 higher than 1.')
    if number_2 > 5:
        print('\t\tnumber 2 higher than 5')        
    print('\tnumber 2 higher than 5, just print')

number_1 = 3
number_2 = 6

print('No indent (no tabs used)')
if number_1 > 1:
    print('\tNumber 1 higher than 1.')
    if number_2 > 5:
        print('\t\tnumber 2 higher than 5')       
    print('\tnumber 2 higher than 5, just print')

No indent (no tabs used)
	Number 1 higher than 1.
	number 2 higher than 5, just print
No indent (no tabs used)
	Number 1 higher than 1.
		number 2 higher than 5
	number 2 higher than 5, just print


## For loops

In [23]:
# List will be explained below
number_list = [1, 2, 3, 4]
for item in number_list:
    print(item)

print("\n")

for i in range(1,4):
    print(i)

print("\n")

letter_list = ['a', 'b', 'c']
for item in letter_list:
    print(item)

1
2
3
4


1
2
3


a
b
c
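
As the second loop shows, `range(1,4)` stops before 4. `range` also accepts an optional third argument, the step. A short sketch, with variable names chosen for illustration:

```python
# range(start, stop, step): the stop value itself is never included
evens = list(range(0, 10, 2))

# A negative step counts downwards
countdown = list(range(3, 0, -1))

print(evens)
print(countdown)
```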


## While loops

In [24]:
number = 4
while number > 0:
    print(number)
    number = number - 1

4
3
2
1


# Collection data types

In data analysis, it is important to be able to store large amounts of data. Collections help to structure that data. Remember: indexing starts at 0 in Python (unlike MATLAB, where it starts at 1).

## Lists

### Basics:

In [89]:
names = ["Xuefei", "Giovanni", "Rose", "Yongzhe", "Luciana", "Imani"]

# Loop names
for name in names:
    print('Name: '+name)

# Get 'Giovanni' from list
# Lists start counting at 0
giovanni = names[1]
print(giovanni.upper())

# Get last item
name = names[-1]
print(name.upper())

# Get second to last item
name = names[-2]
print(name.upper())

print("First three: "+str(names[0:3]))
print("First four: "+str(names[:4]))
print("Up until the second to last one: "+str(names[:-2]))
print("Last two: "+str(names[-2:]))

Name: Xuefei
Name: Giovanni
Name: Rose
Name: Yongzhe
Name: Luciana
Name: Imani
GIOVANNI
IMANI
LUCIANA
First three: ['Xuefei', 'Giovanni', 'Rose']
First four: ['Xuefei', 'Giovanni', 'Rose', 'Yongzhe']
Up until the second to last one: ['Xuefei', 'Giovanni', 'Rose', 'Yongzhe']
Last two: ['Luciana', 'Imani']
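
Slices accept an optional step as a third value, in the form `list[start:stop:step]`. The sketch below reuses the `names` list from above; `every_second` and `reversed_names` are illustrative names:

```python
names = ["Xuefei", "Giovanni", "Rose", "Yongzhe", "Luciana", "Imani"]

# Every second item, starting at index 0
every_second = names[::2]

# A step of -1 walks the list backwards and returns a reversed copy
reversed_names = names[::-1]

print(every_second)
print(reversed_names)
```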


### Enumeration:

In [90]:
# Enumeration, notice how indentation influences the for loop
for index, name in enumerate(names):
    print(index, name, "is in the list.")

# changing the default counter
print(list(enumerate(names,10)))

0 Xuefei is in the list.
1 Giovanni is in the list.
2 Rose is in the list.
3 Yongzhe is in the list.
4 Luciana is in the list.
5 Imani is in the list.
[(10, 'Xuefei'), (11, 'Giovanni'), (12, 'Rose'), (13, 'Yongzhe'), (14, 'Luciana'), (15, 'Imani')]


### Searching and editing:

In [91]:
# Finding an element
print(names.index("Xuefei"))

# Adding an element
names.append("Kumiko")
print(names)
names.insert(2, "Roberta")
print(names)

#Removal
fruits = ["apple","orange","pear"]
del fruits[0]
fruits.remove("pear")
print(fruits)

# Modifying an element
names[5] = "Tom"
print(names)

# Test whether an item is in the list (best do this before removing to avoid raising errors)
print("Tom" in names)

# Length of a list
print("Length of the list: " + str(len(names)))

0
['Xuefei', 'Giovanni', 'Rose', 'Yongzhe', 'Luciana', 'Imani', 'Kumiko']
['Xuefei', 'Giovanni', 'Roberta', 'Rose', 'Yongzhe', 'Luciana', 'Imani', 'Kumiko']
['orange']
['Xuefei', 'Giovanni', 'Roberta', 'Rose', 'Yongzhe', 'Tom', 'Imani', 'Kumiko']
True
Length of the list: 8
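
Calling `remove()` with a value that is not in the list raises a `ValueError`, which is why the membership test is recommended above. Two possible ways to guard a removal are sketched here; the `removed` flag is introduced only for illustration:

```python
fruits = ["apple", "orange", "pear"]

# Option 1: check membership before removing
if "banana" in fruits:
    fruits.remove("banana")

# Option 2: attempt the removal and handle the error
try:
    fruits.remove("banana")
    removed = True
except ValueError:
    removed = False

print(fruits, removed)
```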


### Sorting and copying:

In [93]:
names = ["Xuefei", "Giovanni", "Rose", "Yongzhe", "Luciana", "Imani"]
# Temporary sorting:
print(sorted(names))
print(names)

# Make changes permanent
names.sort()
print("Sorted names: " + str(names))
names.sort(reverse=True)
print("Reverse sorted names: " + str(names))

# Assigning a list to a new name does not copy it: both names refer to the same list object
namez = names
namez.remove("Xuefei")
print(namez)
print(names)

# Now a real copy (list.copy() makes a shallow copy, which is enough for a flat list of strings)
print("After copy")

namez = names.copy()
namez.remove("Giovanni")
print(namez)
print(names)

# Alternative: slicing the whole list also makes a shallow copy
print("Alternative way to copy")
namez = names[:]
namez.remove("Giovanni")
print(namez)
print(names)

['Giovanni', 'Imani', 'Luciana', 'Rose', 'Xuefei', 'Yongzhe']
['Xuefei', 'Giovanni', 'Rose', 'Yongzhe', 'Luciana', 'Imani']
Sorted names: ['Giovanni', 'Imani', 'Luciana', 'Rose', 'Xuefei', 'Yongzhe']
Reverse sorted names: ['Yongzhe', 'Xuefei', 'Rose', 'Luciana', 'Imani', 'Giovanni']
['Yongzhe', 'Rose', 'Luciana', 'Imani', 'Giovanni']
['Yongzhe', 'Rose', 'Luciana', 'Imani', 'Giovanni']
After copy
['Yongzhe', 'Rose', 'Luciana', 'Imani']
['Yongzhe', 'Rose', 'Luciana', 'Imani', 'Giovanni']
Alternative way to copy
['Yongzhe', 'Rose', 'Luciana', 'Imani']
['Yongzhe', 'Rose', 'Luciana', 'Imani', 'Giovanni']
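
For a flat list of strings, `names.copy()` and `names[:]` give an independent list. When a list contains other lists, however, those inner lists are still shared between the original and the copy. One way to duplicate them as well is `copy.deepcopy` from the standard library. A sketch with a hypothetical nested list `groups`:

```python
import copy

# Hypothetical nested list, used only for illustration
groups = [["Xuefei", "Rose"], ["Imani"]]

shallow = groups.copy()       # new outer list, same inner lists
deep = copy.deepcopy(groups)  # new outer list and new inner lists

# Change an inner list of the original
groups[0].append("Tom")

print(shallow[0])  # the shallow copy sees the change
print(deep[0])     # the deep copy does not
```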


### Strings as lists:

In [88]:
course = "Predictive analytics"
print("Last nine letters: "+course[-9:])
print("\'analytics\' in course title? "+str("analytics" in course))
print("Start location of analytics: "+str(course.find("analytics")))
print(course.replace("analytics","analysis"))
list_of_words = course.split(" ")
for word in list_of_words:
    print("Word: "+word)

print(course.find("Analytics")) # find() returns -1 if the value is not found; the search is case sensitive

Last nine letters: analytics
'analytics' in course title? True
Start location of analytics: 11
Predictive analysis
Word: Predictive
Word: analytics
-1
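
The counterpart of `split()` is `join()`, which glues a list of strings back together with a chosen separator. A brief sketch; `words`, `rejoined` and `hyphenated` are illustrative names:

```python
course = "Predictive analytics"
words = course.split(" ")

# join() is called on the separator, not on the list
rejoined = " ".join(words)
hyphenated = "-".join(words)

print(rejoined)
print(hyphenated)
```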


## Sets

In [94]:
name_set = set(names)
print(name_set)

# Add an element
name_set.add("Galina")
print(name_set)

# Discard an element
name_set.discard("Xuefei")
print(name_set)

name_set2 = set(["Rose", "Tom"])
# Difference and intersection
difference = name_set - name_set2
print(difference)
intersection = name_set.intersection(name_set2)
print(intersection)

{'Luciana', 'Yongzhe', 'Giovanni', 'Imani', 'Rose'}
{'Luciana', 'Yongzhe', 'Galina', 'Giovanni', 'Imani', 'Rose'}
{'Luciana', 'Yongzhe', 'Galina', 'Giovanni', 'Imani', 'Rose'}
{'Luciana', 'Yongzhe', 'Galina', 'Imani', 'Giovanni'}
{'Rose'}
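
Besides `-` and `intersection()`, sets support the operators `|` (union), `&` (intersection) and `^` (symmetric difference). The sketch below uses two small hypothetical sets rather than `name_set`, so that the result does not depend on earlier cells; `sorted()` is used because sets have no fixed order:

```python
# Hypothetical sets, used only for illustration
set_a = {"Rose", "Imani", "Galina"}
set_b = {"Rose", "Tom"}

union = set_a | set_b            # items in either set
common = set_a & set_b           # items in both sets
either_not_both = set_a ^ set_b  # items in exactly one of the sets

print(sorted(union))
print(sorted(common))
print(sorted(either_not_both))
```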


## Dictionaries

Dictionaries store data as key-value pairs, which mimics the basic structure of a simple database.

In [97]:
courses = {"Xuefei" : "Predictive analytics", "Kumiko" : "Prescriptive analytics", "Luciana" : "Descriptive analytics"}

for organizer in courses:
    print(organizer +" teaches " + courses[organizer])

print('\n')
    
# or alternatively
for organizer, course in courses.items():
    print(organizer +" teaches " + course)

# Adding items
courses["Imani"] = "Other analytics"

# Overwrite
courses["Xuefei"] = "Business analytics"
print(courses)

# Remove
del courses["Xuefei"]
print(courses)

print("\nLooping values")
# Looping values
for course in courses.values():
    print(course)

print("\nSorted output")
# Sorted output
for organizer, course in sorted(courses.items()):
    print(organizer +" teaches " + course)

Xuefei teaches Predictive analytics
Kumiko teaches Prescriptive analytics
Luciana teaches Descriptive analytics


Xuefei teaches Predictive analytics
Kumiko teaches Prescriptive analytics
Luciana teaches Descriptive analytics
{'Xuefei': 'Business analytics', 'Kumiko': 'Prescriptive analytics', 'Luciana': 'Descriptive analytics', 'Imani': 'Other analytics'}
{'Kumiko': 'Prescriptive analytics', 'Luciana': 'Descriptive analytics', 'Imani': 'Other analytics'}

Looping values
Prescriptive analytics
Descriptive analytics
Other analytics

Sorted output
Imani teaches Other analytics
Kumiko teaches Prescriptive analytics
Luciana teaches Descriptive analytics
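
Looking up a key that does not exist with square brackets raises a `KeyError`. The `get()` method is one way around this: it returns `None`, or a default of your choice, when the key is missing. A sketch on a shortened version of the `courses` dictionary; the default text "No course" is made up for illustration:

```python
courses = {"Kumiko": "Prescriptive analytics", "Luciana": "Descriptive analytics"}

# courses["Xuefei"] would raise a KeyError, because the key is missing
missing = courses.get("Xuefei")                 # returns None
fallback = courses.get("Xuefei", "No course")   # returns the given default
present = courses.get("Kumiko", "No course")    # key exists, default is ignored

print(missing)
print(fallback)
print(present)
```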


# Functions

Functions form the backbone of all code. You have already used some, like print(). You can easily define them yourself as well.

In [98]:
def my_function(a, b):
    a = a.title()
    b = b.upper()
    print(a+ " "+b)


def my_function2(a, b):
    a = a.title()
    b = b.upper()
    return a + " " + b

my_function("xuefei","lu")

output = my_function2("xuefei","lu")
print(output)


# A function can return different types depending on its input
def calculate_mean(a, b):
    if a >= 0:
        return (a + b) / 2
    else:
        return "a is negative"

output = calculate_mean(1, 2)
print(output)
output = calculate_mean(-1, 1)
print(output)

Xuefei LU
Xuefei LU
1.5
a is negative
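
Returning a number in one case and a string in another means the caller has to check the type of the result. A common alternative is to raise an exception for invalid input. The sketch below shows a hypothetical variant, `calculate_mean_strict`, which is not part of the original example:

```python
def calculate_mean_strict(a, b):
    """Hypothetical variant of calculate_mean that raises instead of returning text."""
    if a < 0:
        raise ValueError("a is negative")
    return (a + b) / 2

# Valid input always gives a number
print(calculate_mean_strict(1, 2))

# Invalid input is handled explicitly by the caller
try:
    result = calculate_mean_strict(-1, 2)
except ValueError as error:
    result = None
    print("Could not compute mean:", error)
```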
