# Python Basics 1

## What will we cover?

* Data types
* Variables in SPSS versus in Python

## What are data types?

Before we start using Python for data analysis, it is important to understand the different types of "data" that it can handle. For now, think of them as the different types of variables (numeric, text and so on) that SPSS or other statistical programs can handle.


## Floats and Integers

Starting from the basics, we have integers and floats. They are... numbers. The difference is that integers are whole numbers, and floats allow for decimals. In the example below, we have an integer.

In [1]:
1

1

How do I know? I can use a function to check this out.

In [2]:
type(1)

int

The number below is a float. 

In [3]:
100.5

100.5

Let's check this as well, this time with another float (any number written with a decimal point is a float, even 1.0):

In [4]:
type(1.0)

float
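
As a small sketch of how the two types interact (the variable names are made up for this illustration): adding an integer to a float gives a float, and dividing with `/` gives a float even when both numbers are integers. Note that `print(type(...))` shows `<class 'int'>`, whereas a notebook cell that ends in `type(...)` shows just `int`.

```python
# Hypothetical names, chosen only for this illustration
whole = 7          # an integer
decimal = 2.5      # a float

mixed_sum = whole + decimal    # int + float gives a float
quotient = 10 / 2              # "/" returns a float in Python 3

print(type(whole))       # <class 'int'>
print(type(decimal))     # <class 'float'>
print(mixed_sum)         # 9.5
print(type(quotient))    # <class 'float'>
```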

## Strings

Strings are pieces of text and are **always** enclosed in either single quotes (') or double quotes (").

In [5]:
'This is some text!'

'This is some text!'

If you forget to put the quotes, you will have a problem...

In [6]:
This is some text!

SyntaxError: invalid syntax (<ipython-input-6-77284ae4aac9>, line 1)

The opening and closing quotes also need to match each other. Otherwise, you will have a problem...

In [7]:
"This is some text!'

SyntaxError: EOL while scanning string literal (<ipython-input-7-ab9a94c9fc6d>, line 1)

A single quote can appear *inside* a string that is enclosed in double quotes, though:

In [8]:
"There's some text"

"There's some text"

If you create a string with single or double quotes, it can only contain one line. Look what happens if we add a line break:

In [9]:
'This is 
some text!'

SyntaxError: EOL while scanning string literal (<ipython-input-9-f895fb063bb6>, line 1)

How can we handle this? Well... we can put three quotes at the beginning and at the end of the string. For example:

In [10]:
'''I can type whatever I want
And add some lines
And more lines
and more lines '''

'I can type whatever I want\nAnd add some lines\nAnd more lines\nand more lines '

You will notice that there is an \n appearing where the line-break should be. That's normal. If we print the same string, you will see that the \n disappears:

In [11]:
print('''I can type whatever I want
And add some lines
And more lines
and more lines ''')

I can type whatever I want
And add some lines
And more lines
and more lines 
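
A minimal sketch of the same idea, using made-up variable names: a line break typed inside triple quotes and a `\n` written inside an ordinary one-line string produce the same text.

```python
# The same two-line text written in two ways (hypothetical names)
with_triple_quotes = '''First line
Second line'''
with_escape = 'First line\nSecond line'

print(with_triple_quotes == with_escape)   # True
print(with_escape)                         # prints the text on two lines
```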


## Special Cases: True, False and None

Python also has some special cases when it comes to data types. Without getting into too much detail, they are:
* True 
* False
* None

It is important to know them for three reasons:
1. They will be used later on in conditions (we'll get into that later)
2. None, in particular, can also mean a missing value
3. You cannot use True, False or None as a variable name (we'll get into that later as well)



In [12]:
True

True
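
A short sketch of where these values show up, with hypothetical variable names: a comparison produces `True` or `False`, and `None` can stand in for a value that is missing.

```python
print(type(True))     # <class 'bool'>
print(type(None))     # <class 'NoneType'>

is_bigger = 5 > 3     # a comparison produces True or False
print(is_bigger)      # True

missing_age = None    # hypothetical variable standing in for a missing value
print(missing_age is None)   # True
```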

# Data structures: Variables, Lists, Tuples and Dictionaries

So far, we've seen different data types, e.g., integers, floats and strings. Now we will see how we can work with (and organize) data in Python.

## Variables

The first concept we will explore is that of variables. To start with, it's important to keep in mind that **variable** in Python means something very different from what it means in SPSS or other statistical programs.

### Variables in statistical programs
In SPSS, we're used to understanding variables as specific types of measures for a given observation (case). For example:

| Respondent ID  | Height  | Weight   | Age  | Nationality  |
|---|---|---|---|---|
| 1  | 1.78  | 75  | 25  | Dutch  |
| 2  | 1.80  | 63  | 27  | German  |
| 3  | 1.67  | 57  | 22  | French  |


In the table above, we would say that we have five variables: Respondent ID, Height, Weight, Age and Nationality. That's usually how a statistical program defines a variable.


### Variables in programming languages
In Python, a variable is simply a "space" in memory that can hold a specific value. This is somewhat generic, so let's see some examples.


In [13]:
a = 1

By typing a = 1, we have created a variable (named *a*) and assigned a value of 1 to it. We can then print a to see what it contains:

In [14]:
print(a)

1


We can also do some operations with this variable we just created. These operations are always dependent on the data type. For example, we can do mathematical operations with variables that contain integers or floats.

In [15]:
a + 3

4

Keep in mind that, with the operation above, we did not *change*  the contents of a, but simply used it in an operation. So if we print a again, it will still have the same value.

In [16]:
print(a)

1


To change the value of a, we need to assign a new value to it. For example:

In [17]:
a = a + 4

In [18]:
print(a)

5
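
As a sketch with a hypothetical variable name, the pattern "compute a new value, then assign it back to the same name" can also be written with the shorthand `+=`:

```python
counter = 1              # hypothetical variable name
counter = counter + 4    # compute a new value and assign it back
counter += 10            # shorthand for: counter = counter + 10
print(counter)           # 15
```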


Variables don't need to be only numbers (integers or floats). We can create a string variable. For example:

In [19]:
b = 'This is some text!'

In [20]:
print(b)

This is some text!


In [21]:
type(b)

str

Because the value contained in b is a string, however, we cannot do mathematical operations with it.

In [22]:
b + 1

TypeError: must be str, not int

Well, we actually can... but with another string.

In [23]:
c = 'And now even more text'

In [24]:
b + c

'This is some text!And now even more text'

In [25]:
d = b + c

In [26]:
d

'This is some text!And now even more text'
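
If you do need to combine text and numbers, one option is to convert between the types first. The sketch below uses made-up example values together with the built-in functions `str()` and `int()`:

```python
label = 'Number of respondents: '   # hypothetical example values
count = 3

# label + count would raise a TypeError, so convert the integer first
message = label + str(count)
print(message)          # Number of respondents: 3

# The opposite direction: turn a string of digits into an integer
print(int('25') + 1)    # 26
```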

## Lists

Another interesting concept in Python is the ability to create lists. Lists are what their name says... a list of items. For example:

In [27]:
[1,2,'text', 3]

[1, 2, 'text', 3]

The item above is a list. Lists are created like this:
* They start with a [
* Inside them, we include the elements separated by commas
* We end a list with a ]

What we created above was a list, but it's not really useful because we did not store it anywhere, so we cannot use it again. We can fix that by creating a variable and assigning the list to it. For example:

In [28]:
mylist = [1, 2, 'text', 3, 4, 5]

In [29]:
print(mylist)

[1, 2, 'text', 3, 4, 5]


Lists have a few advantages. You can, for example, add an item to a list:

In [30]:
mylist.append(100)

In [31]:
mylist

[1, 2, 'text', 3, 4, 5, 100]

You can also access an item in a list by its location. You just need to keep in mind that Python starts counting at 0. So the first element in the list we created is at location 0, the second at location 1 etc... let's see this in practice.

In [32]:
myitems = ['First item', 'Second item', 'Third item']

To access an item, I just need to type the name of the list, followed by the location of the item I want (between []). Sounds complicated, but it's simply like this:

In [33]:
myitems[0]

'First item'

In [34]:
myitems[1]

'Second item'

In [35]:
myitems[2]

'Third item'

In [36]:
myitems[-1]

'Third item'

As you saw in the previous example, we can also access the items in a list by counting from the end. When I asked for the item in location -1, Python showed the last item in the list (-2 would give the second-to-last item, and so on).

Lists can have almost any type of data inside them... also other lists!

In [37]:
g = [1, 2, [1,2]]

In [38]:
g[0]

1

In [39]:
g[1]

2

In [40]:
g[2]

[1, 2]

How can we access the first item of the list inside the list?

In [41]:
g[2][0]

1
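
The sketch below pulls these ideas together on a made-up list: `len()` counts the items, assigning to a location replaces an item, and two pairs of brackets reach into a nested list.

```python
scores = [10, 20, [30, 40]]   # hypothetical list with a nested list

print(len(scores))     # 3 (the nested list counts as one item)

scores[0] = 99         # replace the item at location 0
print(scores)          # [99, 20, [30, 40]]

print(scores[2][1])    # 40: second item of the nested list
print(scores[-1][0])   # 30: first item of the last element
```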

## Tuples

Tuples are similar to (but not the same as) lists. We will not use them as often as lists in this course, but it's good to know that they exist. They are defined by items in parentheses, like this:

In [42]:
mytuple = (1,2,3)

We can also access items inside a tuple by their location:

In [43]:
mytuple[-2]

2

But, unlike lists, we cannot change them:

In [44]:
mytuple.append(4)

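AttributeError: 'tuple' object has no attribute 'append'

If we want a tuple with different contents, we have to create a new tuple and assign it to the variable again: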

In [45]:
mytuple = (1,2,3,4)

In [46]:
mytuple

(1, 2, 3, 4)
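
A minimal sketch of what "cannot change" means in practice, using a hypothetical tuple. The `try`/`except` lines are only there to catch the error so that the rest of the example keeps running; they are not covered in this section.

```python
point = (1, 2, 3)   # hypothetical tuple

try:
    point[0] = 100            # item assignment is not allowed on tuples
except TypeError as error:
    print('Cannot change a tuple:', error)

# Build a new tuple instead; note the comma in the one-item tuple (4,)
longer_point = point + (4,)
print(point)          # (1, 2, 3)
print(longer_point)   # (1, 2, 3, 4)
```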

## Dictionaries

A very handy concept in Python is the option to create dictionaries. As you have noticed with lists and tuples, you can only access items by their location - so you need to know if they are the first, second, third element in the list to be able to get their value.

With dictionaries, we can store information with keys. So instead of accessing something by its location (or position), we find it by its key. Let's create an example:

In [47]:
world_capitals = {
    'Netherlands': 'Amsterdam',
    'Germany' : 'Berlin',
    'Italy' : 'Rome',
    'France' : 'Paris',
    'UK' : 'London',
    'US': 'Washington'
}       

First of all, notice that to create a dictionary, we opened it with a **{** and closed it with a **}**. Also, each item contains two elements (divided by **:**):
* A first element that we call key
* A second element that we call value

If I want to know the capital of Italy, I just need to:

In [48]:
world_capitals['Italy']

'Rome'

But notice that we cannot access dictionary items by their location. 

In [49]:
world_capitals[3]

KeyError: 3

And we cannot append items the way we do with lists.

In [50]:
world_capitals.append('Brazil': 'Brasilia')

SyntaxError: invalid syntax (<ipython-input-50-258bfe6956cc>, line 1)

To add an item, we simply assign a value to a new key:

In [51]:
world_capitals['Brazil'] = 'Buenos Aires'

In [52]:
world_capitals

{'Brazil': 'Buenos Aires',
 'France': 'Paris',
 'Germany': 'Berlin',
 'Italy': 'Rome',
 'Netherlands': 'Amsterdam',
 'UK': 'London',
 'US': 'Washington'}

And to correct a mistake - i.e., to update a dictionary item - I just assign it again.

In [53]:
world_capitals['Brazil'] = 'Brasilia'

In [54]:
world_capitals

{'Brazil': 'Brasilia',
 'France': 'Paris',
 'Germany': 'Berlin',
 'Italy': 'Rome',
 'Netherlands': 'Amsterdam',
 'UK': 'London',
 'US': 'Washington'}

In [55]:
world_capitals['Brazil']

'Brasilia'
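
Asking for a key that does not exist raises a KeyError, so it can help to check first. The sketch below uses a smaller, hypothetical dictionary together with the `in` operator and the dictionary method `.get()`:

```python
capitals = {'Italy': 'Rome', 'France': 'Paris'}   # hypothetical, smaller dictionary

print('Italy' in capitals)      # True
print('Spain' in capitals)      # False

# .get() returns None (or a default you choose) instead of raising a KeyError
print(capitals.get('Spain'))              # None
print(capitals.get('Spain', 'unknown'))   # unknown

print(sorted(capitals))         # ['France', 'Italy'] (the keys, sorted)
print(len(capitals))            # 2
```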

# Some examples


## 1. Removing items from a list and a dictionary

We saw how to add items to a list (using append) and to a dictionary (by assigning a new item). How can we remove items from a list or a dictionary though? Create a list and a dictionary below, and remove one item from each of them.

In [56]:
myitems = ['First item', 'Second item', 'Third item']

In [58]:
myitems.remove('First item')

In [59]:
myitems

['Second item', 'Third item']
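
`remove` looks for an item by its value. As a sketch of two alternatives (on a hypothetical list), `del` removes an item by its location, and `.pop()` removes the last item and hands it back:

```python
items = ['First item', 'Second item', 'Third item']   # hypothetical list

del items[0]              # delete by location
print(items)              # ['Second item', 'Third item']

last = items.pop()        # remove the last item and return it
print(last)               # Third item
print(items)              # ['Second item']
```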

To remove an item from a dictionary, we can use del followed by the item's key:

In [60]:
world_capitals = {
    'Netherlands': 'Amsterdam',
    'Germany' : 'Berlin',
    'Italy' : 'Rome',
    'France' : 'Paris',
    'UK' : 'London',
    'US': 'Washington'
}

In [63]:
del world_capitals['Netherlands']

In [64]:
world_capitals

{'France': 'Paris',
 'Germany': 'Berlin',
 'Italy': 'Rome',
 'UK': 'London',
 'US': 'Washington'}

## 2. Playing with strings

* How can you convert the variable below to lower case?


In [None]:
allupper = 'THIS TEXT IS COMPLETELY IN UPPER CASE!'

In [None]:
allupper.lower()

* How can we convert this string into a list of words?

In [67]:
sentence = '''This is a sentence with words separated by a lot of spaces.
How can we split it into a list of words?'''

In [68]:
print(sentence)

This is a sentence with words separated by a lot of spaces.
How can we split it into a list of words?


In [69]:
print(sentence.split())

['This', 'is', 'a', 'sentence', 'with', 'words', 'separated', 'by', 'a', 'lot', 'of', 'spaces.', 'How', 'can', 'we', 'split', 'it', 'into', 'a', 'list', 'of', 'words?']
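
Without arguments, `split()` cuts the string at any whitespace (spaces and line breaks). As a sketch on a made-up string, you can also split on another separator, and `join()` does the reverse by gluing a list of strings back together:

```python
text = 'one,two,three'           # hypothetical example string

parts = text.split(',')          # split on commas instead of whitespace
print(parts)                     # ['one', 'two', 'three']

rejoined = ' '.join(parts)       # glue the words back together with spaces
print(rejoined)                  # one two three
print(rejoined.upper())          # ONE TWO THREE
```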
