# Week 1: Introduction to Python (Part 1)

This is a Jupyter notebook, a web-based interactive computational environment.
- Cells can contain markdown or code.
- To run a code cell, press Shift+Enter.
- Jupyter will print the output from a cell beneath it.

This session is designed to give you the working knowledge of Python necessary to complete the lab sessions for Natural Language Engineering. 

- Run all of the code cells as you work through the notebook. 
- Try to understand what is happening in each code cell and predict the output before running it.
- Complete all of the exercises.
- Discuss answers and ask questions!


## Python types

### String
Strings are enclosed in double or single quotes in Python.

In [1]:
print('Hello World')

Hello World


In [2]:
print("Hello World")

Hello World


In [3]:
# This is a comment (# at the beginning of the line)
# Note that a string enclosed in double quotes can contain single quotes as part of the string:
print("'A reader lives a thousand lives before he dies,' said Jojen. 'The man who never reads lives only one.'")

'A reader lives a thousand lives before he dies,' said Jojen. 'The man who never reads lives only one.'


In [4]:
# ...and a string enclosed in single quotes can contain double quotes as part of the string:
print('"A reader lives a thousand lives before he dies," said Jojen. "The man who never reads lives only one."')

"A reader lives a thousand lives before he dies," said Jojen. "The man who never reads lives only one."


As an alternative to calling the `print` function explicitly, Jupyter displays the value of the last expression in a cell when the cell is run. Only the last value is shown, so the first line of the following cell produces no output. Try running it.

In [5]:
"Hello World"
'"A reader lives a thousand lives before he dies," said Jojen. "The man who never reads lives only one."'

'"A reader lives a thousand lives before he dies," said Jojen. "The man who never reads lives only one."'

### Integer

In [3]:
75

75

### Float

In [7]:
6.3646

6.3646

When a string contains just digits, the function `int` will **cast** that string to an integer.

In [8]:
# give the type of the string '623'
type('623')

str

In [9]:
# cast the string '623' to an integer
int('623')

623

In [10]:
# give the type that results from casting the string '623' to an integer.
type(int('623'))

int
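
The same idea works for other built-in types. The short sketch below (the variable names are illustrative and not part of the original notebook) uses `float` and `str` to convert in other directions. Note that `int` only accepts strings that look like whole numbers, so `int('6.3646')` would raise a `ValueError`.

```python
# Convert between strings and numbers with the built-in type functions.
as_float = float('6.3646')   # string -> float
as_text = str(623)           # integer -> string

print(as_float, as_text)
print(type(as_float), type(as_text))
```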

## Basic operations

Strings can be joined using `+`.

In [11]:
"Hello " + "World"

'Hello World'

Standard operators are used on integers and floats: `+`, `-`, `*`, and `/`.

In [12]:
7 - 3 + 5

9

In [4]:
100*200*1000000

20000000000

In [13]:
3.5*8/4

7.0

For floor division (division rounded down to the nearest integer), use `//`.

In [14]:
7//2

3
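
"Rounded down" matters most for negative numbers and floats. A minimal sketch, with variable names chosen only for illustration:

```python
# Floor division always rounds down, i.e. towards negative infinity.
positive = 7 // 2      # 3
negative = -7 // 2     # -4, not -3
with_float = 7.0 // 2  # 3.0, a float because one operand is a float

print(positive, negative, with_float)
```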

Use `**` for exponentiation, e.g. `3**2` computes 3 squared, which is 9.

In [15]:
# This is equivalent to 2*2*2*2*2
2**5

32

In [8]:
10**4

10000

Use double equals, `==`, to check equality.

In [16]:
5*4 == 2*10

True

The modulo operator `%` returns the remainder after integer division.  
e.g. 13 divided by 5 is 2 with 3 left over, so `13 % 5` gives `3`.

In [17]:
7%3

1

In [18]:
4 % 2

0
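
A common use of `%` is testing whether a number is even. The sketch below combines it with `==` and `//`; the variable names are hypothetical.

```python
number = 10
# A number is even when dividing by 2 leaves no remainder.
is_even = number % 2 == 0
print(is_even)  # True

# Floor division and modulo together describe a whole division.
quotient = 13 // 5
remainder = 13 % 5
print(quotient, remainder)  # 2 3
```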

## Python error reports
Python reports an error when an operation is not valid, e.g. when attempting to join a **string** and an **integer**.

In [19]:
"Hello" + 3

TypeError: must be str, not int
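
The exact wording of the error message depends on the Python version. One way to avoid the error, sketched below, is to convert the integer to a string with the built-in `str` function before joining. The variable name `greeting` is just for illustration.

```python
# str(3) produces the string '3', which can be joined to another string.
greeting = "Hello " + str(3)
print(greeting)  # Hello 3
```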

### Exercise 1
In the empty cell below, write a single-line Python expression to print "Hello world! My name is", joined with another string containing your name.

## Python identifiers
Assign a variable name to any value (e.g. a string, integer, or float) using a single equals sign.

In [20]:
student_name = "Adam"
student_name

'Adam'

In [21]:
student_age = 21
student_age

21

Operations can be carried out as before, using the variable names.

In [22]:
student_age/2

10.5

We can update values associated with a variable using the operators `+=` , `-=` , `/=`, and `*=`.

- For example, `+=` adds the number on the right to the current value.

This is a useful shortcut - take your time to play around and familiarise yourself with this syntax.

In [23]:
# Note that each time you run this cell, it will add 5 to the stored value.
student_age += 5
student_age

26

In [24]:
age_next_year = student_age + 1
age_next_year

27
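
Each update operator is shorthand for an ordinary assignment that reuses the variable on the right-hand side. A small sketch using a hypothetical variable `score`:

```python
score = 10   # hypothetical starting value
score += 5   # same as score = score + 5, giving 15
score -= 3   # same as score = score - 3, giving 12
score *= 2   # same as score = score * 2, giving 24
score /= 4   # same as score = score / 4, giving 6.0 (/ produces a float)
print(score)
```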

### Exercise 2a
In the cell below, assign appropriate values to the variables `my_name`, `my_age`, and `years_at_sussex`.

### Exercise 2b
In the cell below subtract `years_at_sussex` from `my_age` and assign this value to a new variable called `age_started_sussex`.

### Exercise 2c
In the cell below, practise using the `**`, `+=`, `-=`, `/=`, and `*=` operators to update these values.

## Dynamic typing
The `type` function is used to get an object's type: `int` for integer, `str` for string, etc.

In [25]:
type(student_name)

str

In [26]:
type(student_age)

int

As Python has dynamic typing, if a variable name is assigned to a new value of a different type, the variable's type will change accordingly.

In [27]:
student_age = "Twenty"
type(student_age)

str

### Exercise 3
In the cell below, reassign your `my_age` and `years_at_sussex` variables, which currently hold `int` values, to strings giving the numbers in words. Print the type of these variables before and after.

## Lists

Lists are initialised using square brackets, with objects separated by commas.

In [28]:
primes = [2, 3, 5, 7, 11]
type(primes)

list

Lists can contain any data type.

In [29]:
list_of_strings = ['string', 'another string', 'a third string']
list_of_strings

['string', 'another string', 'a third string']

'Empty' lists with no elements can also be initialised.

In [30]:
empty_list = []

Indexing into lists uses square brackets.
- Note that indexing starts from zero.

In [31]:
primes[0]

2

A colon, `:`, can be used to take a slice of a list between two indices.
- Note that this will start from the first index, up to but NOT including the second index.

In [32]:
primes[1:3]

[3, 5]

If either index is omitted, the slice will go to the beginning/end of the list.

In [33]:
primes[:3]

[2, 3, 5]
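
Omitting the second index works the same way. In the sketch below, which uses a separate list so that `primes` is left untouched, the two slices share the index 3 and together cover the whole list.

```python
primes_example = [2, 3, 5, 7, 11]   # illustrative copy of the list above

first_three = primes_example[:3]    # indices 0, 1, 2
rest = primes_example[3:]           # index 3 up to the end

print(first_three, rest)
# Joining the two slices with + rebuilds the original list.
print(first_three + rest == primes_example)  # True
```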

To index from the end of the list use negative numbers.

In [34]:
primes[-1]

11

In [35]:
primes[-2:]

[7, 11]

To test for list membership use the keyword `in`.

In [36]:
5 in primes

True

In [37]:
6 in primes

False

The function `len` gives the length of a list.

In [38]:
len(primes)

5

To append an element to a list use `append`.

In [39]:
primes.append(13)

In [40]:
primes.append(17)

In [41]:
primes

[2, 3, 5, 7, 11, 13, 17]

Using `append` with a list as its argument adds the list as a single element, producing a list that contains a list as its last element.

In [42]:
primes = [2, 3, 5, 7, 11, 13]
primes.append([17,19])
primes

[2, 3, 5, 7, 11, 13, [17, 19]]

When we want to add the elements of one list individually to another list, use the `+=` operator to concatenate the two lists.

In [43]:
primes = [2, 3, 5, 7, 11, 13]
primes += [17,19]
primes

[2, 3, 5, 7, 11, 13, 17, 19]
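
The difference between the two approaches is easy to see from the lengths of the resulting lists. A minimal side-by-side sketch, using hypothetical list names:

```python
nested = [2, 3, 5]
nested.append([7, 11])   # adds one element, which is itself a list

flat = [2, 3, 5]
flat += [7, 11]          # adds two elements, 7 and 11

print(nested, len(nested))  # [2, 3, 5, [7, 11]] 4
print(flat, len(flat))      # [2, 3, 5, 7, 11] 5
```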

To write a `for` loop that iterates over a list, use the keywords `for` and `in`, followed by a colon, `:`. Indentation indicates the scope of the body of the loop.

In [44]:
for prime in primes:
    print(prime,"is a prime")

2 is a prime
3 is a prime
5 is a prime
7 is a prime
11 is a prime
13 is a prime
17 is a prime
19 is a prime
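
A loop body can also update a variable on each pass. The sketch below, with illustrative names, adds up the elements of a short list; the final `print` is not indented, so it runs once after the loop has finished.

```python
small_primes = [2, 3, 5, 7]   # illustrative list
total = 0
for prime in small_primes:
    total += prime            # indented: runs once per element
print(total)                  # not indented: runs once, prints 17
```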


### Exercise 4a
In the cell below initialise the variable `squares` to be a list of the square numbers from 1 to 16 inclusive.

### Exercise 4b
In the cell below append the next square number to the list `squares`.

### Exercise 4c
In the cell below make a list of the next two square numbers and concatenate this with `squares`.

### Exercise 4d
In the cell below check how many items are in the list now.

### Exercise 4e
In the cell below use indexing to print just the first 3 and last 3 items in the list `squares`.

### Exercise 4f
In the cell below, use a `for` loop to print each item in the list `squares` on its own line, as part of a sentence. The output should look like this:
```
The first square in the list is  1
The next square in the list is  4
The next square in the list is  9
The next square in the list is  16
The next square in the list is  25
The next square in the list is  36
The last square in the list is  49
```

## Strings

In [45]:
# Here we assign the string "Hello World" as the value of a variable called hello_world
hello_world = "Hello World"

String indexing is similar to list indexing, but works on a character-by-character basis.

In [46]:
hello_world[0]

'H'

In [47]:
hello_world[7]

'o'

In [48]:
hello_world[-3:]

'rld'

In [49]:
hello_world[-40]

IndexError: string index out of range

Test for substring presence using the keyword `in`.

In [50]:
"w" in hello_world

False

In [51]:
"W" in hello_world

True

In [52]:
"llo" in hello_world

True

Find the length of a string using `len`.

Note that the output value is a count including spaces, tabs and non-alphanumeric characters.

In [53]:
len(hello_world)

11

In [54]:
hello_world+="!"
hello_world

'Hello World!'

In [55]:
len(hello_world)

12

Iterating over a string involves similar syntax to list iteration, but works on a character-by-character basis.

In [56]:
for char in hello_world:
    print("the character >>>", char, "<<< is present")

the character >>> H <<< is present
the character >>> e <<< is present
the character >>> l <<< is present
the character >>> l <<< is present
the character >>> o <<< is present
the character >>>   <<< is present
the character >>> W <<< is present
the character >>> o <<< is present
the character >>> r <<< is present
the character >>> l <<< is present
the character >>> d <<< is present
the character >>> ! <<< is present


A string can be split into words using the `split` method. By default, it separates on whitespace and returns a list of *tokens*.

An optional separator string can be passed to `split` as an argument. See the difference if you change the following cell so that the second line reads `words = sentence.split('s')`.



In [57]:
sentence = "This is a sample sentence"
words = sentence.split()
print(words)

['This', 'is', 'a', 'sample', 'sentence']
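
As a further illustration of the separator argument, the sketch below splits a comma-separated string and compares the default behaviour with an explicit single-space separator. The example strings and variable names are hypothetical.

```python
csv_line = "red,green,blue"
colours = csv_line.split(',')
print(colours)  # ['red', 'green', 'blue']

# The default treats any run of whitespace as one separator...
default_tokens = "two  spaces".split()
# ...whereas an explicit ' ' splits at every single space,
# leaving an empty string between the two adjacent spaces.
explicit_tokens = "two  spaces".split(' ')
print(default_tokens)   # ['two', 'spaces']
print(explicit_tokens)  # ['two', '', 'spaces']
```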


To check for the presence of a token in a list of words use the `in` keyword.

In [58]:
"sample" in words

True

In [59]:
"Hello" in words

False

### Exercise 5a
In the empty cell below assign the string `"It was the best of times, it was the worst of times"` to the variable `opening_line`.

### Exercise 5b
In the empty cell below check whether `'worst'` appears in `opening_line`.

### Exercise 5c
In the empty cell below make a list of the words in `opening_line`, assigned to the variable `dickens_words`, and iterate over `dickens_words`, printing one word per line.

### Exercise 5d
In the empty cell below check whether `'blurst'` appears in the list you made.

## Conditions and booleans

In [60]:
if 2 > 3:
    print("yes")
else:
    print("no")

no
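
The condition after `if` can be any expression that evaluates to `True` or `False`, including a membership test with `in`. A minimal sketch with hypothetical data:

```python
# Hypothetical data, chosen only for illustration.
words_example = ["This", "is", "a", "sample", "sentence"]
query = "sample"

if query in words_example:
    result = "found"      # runs when the condition is True
else:
    result = "missing"    # runs when the condition is False
print(result)  # found
```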


There are some useful string *shape* methods, which are part of the `str` class and can be used to test for certain types of string. Work out what each of the following tests for:
- `astring.isalpha()`
- `astring.isalnum()`
- `astring.isdigit()`

In [61]:
"This".isalpha()

True

In [62]:
"This,".isalpha()

False

In [63]:
"M25".isalpha()

False

In [64]:
"M25".isalnum()

True

In [65]:
"463".isdigit()

True

In [66]:
# non-zero numbers are treated as True
print("yes" if 15 else "no")

yes


In [67]:
# zero is treated as False
print("yes" if 0 else "no")

no


In [68]:
# non-empty lists are treated as True
print("yes" if ["one element"] else "no")

yes


In [69]:
# the empty list is treated as False
print("yes" if [] else "no")

no


In [70]:
# non-empty character strings are treated as True
print("yes" if "Hello" else "no")

yes


In [71]:
# the empty string is treated as False
print("yes" if "" else "no")

no
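
The built-in `bool` function shows directly which boolean a value is treated as. The sketch below repeats the six cases above; the variable names are illustrative.

```python
# Values that count as true in a condition
truthy_results = [bool(15), bool(["one element"]), bool("Hello")]
# Values that count as false in a condition
falsy_results = [bool(0), bool([]), bool("")]

print(truthy_results)  # [True, True, True]
print(falsy_results)   # [False, False, False]
```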


Boolean statements can be combined using `and`. Both must be true for the combination to be evaluated as `True`.

In [72]:
True and True

True

In [73]:
False and True

False

Boolean statements can be combined using `or`. At least one statement must be true for the combination to be evaluated as `True`.

In [74]:
False or True

True

In [75]:
True or False

True

A boolean statement can be negated using `not`.

In [76]:
not True

False

In [77]:
not False

True
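
In practice, `and`, `or`, and `not` usually combine comparisons rather than literal `True` and `False` values. A short sketch with hypothetical variables:

```python
age = 21             # hypothetical values for illustration
has_ticket = False

can_enter = age >= 18 and has_ticket   # both must be true: False
can_browse = age >= 18 or has_ticket   # at least one is true: True
needs_ticket = not has_ticket          # negation of False: True

print(can_enter, can_browse, needs_ticket)  # False True True
```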