# More variables in Python: text and other beasts

In a previous notebook we looked at how to create variables in Python and perform a calculation with those.

Those variables all happened to be numbers. We're going to need them again so let's recreate them in this notebook first:

In [None]:
#store the number of requests in a new variable called 'fcorequests'
fcorequests = 48
#store the number of refusals in a new variable called 'sec27refusals'
sec27refusals = 28
#calculate a percentage by dividing the part (refusals) by the whole (requests)
#and store in a new variable called 'percrefused'
percrefused = sec27refusals/fcorequests
#print it
print(percrefused)
#multiply by 100 to make it easier to 'read' as a percentage
print(percrefused*100)

0.5833333333333334
58.333333333333336


*(A quick recap: these figures are taken from [Freedom of Information statistics: April to June 2021 bulletin](https://www.gov.uk/government/statistics/freedom-of-information-statistics-april-to-june-2021/freedom-of-information-statistics-april-to-june-2021-bulletin) - [the data tables link is here](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1017270/foi-statistics-q2-2021-statistical-tables.ods) and go to the sheet called '10_Exemptions'.)*

## Creating text variables: 'strings'

Data doesn't just come in the form of numbers, however. It's almost certain that at some point we're going to need to store some **text**. 

For example, we might need to store the recipients of money being spent, in order to calculate which one received the most spending. 

In our FOI story we might want to store the names of organisations, or the sections of the FOI Act that are being used as the basis for refusals, or a note from the spreadsheet.

Let's do that, then.

In [None]:
#create two variables to store the organisation and section of the FOI Act
org = "Foreign, Commonwealth and Development Office"
section = "S.27 - International relations"
#This is taken from the 'Notes' sheet in the spreadsheet
note4 = "Figures supplied by these departments of state count non-routine information requests received by one or more of their agencies, as well as those received by the departments themselves. The bulletin gives full details."

It's important to emphasise that there are two main differences between these variables and the number variables that we created before:

* The text is placed inside quotation marks, where the numbers were not
* The text is coloured differently by Colab: it's red (the numbers were coloured green)

If you forget to put text inside quotation marks, you'll get an error message...

In [None]:
org = FCO

NameError: name 'FCO' is not defined

This is because it is looking for a variable called `FCO` (which doesn't exist, or "is not defined"). 

By adding quotation marks you are signalling that you aren't referring to a variable, but just to a **string of characters**.

This idea of a string of characters is important. Text has no 'meaning' in a piece of code: as far as a computer is concerned, "qghjp" is no less meaningful than "hello" - each is just a string of five characters. 

In fact, a piece of text is called a **string** in Python. And when you type this:

`org = "Foreign, Commonwealth and Development Office"`

...you are creating a **string variable**: that is, a variable which contains a string.
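
One way to confirm that a variable holds a string is Python's built-in `type()` function, which this notebook hasn't introduced so far. The sketch below is only an illustration: it reuses the `org` variable, and the `org_type` variable and the use of `len()` to count characters are additions that aren't part of the original walkthrough.

```python
#store a piece of text in a variable
org = "Foreign, Commonwealth and Development Office"
#ask Python what type of value the variable holds
org_type = type(org)
print(org_type)
#a meaningless string and a meaningful one are each just five characters
print(len("qghjp"))
print(len("hello"))
```

The first line printed should be `<class 'str'>`, with `str` being Python's name for a string, followed by `5` twice.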

### Quotation marks in strings

I used double quotation marks to create my string but you can use single quotation marks, or even triple-quotation marks, too.

Here are some examples...

In [None]:
#using single quotation marks
org = 'Foreign, Commonwealth and Development Office'
#using double quotation marks
section = "S.27 - International relations"
#using triple quotation marks
note4 = '''Figures supplied by these departments of state count non-routine information requests 
received by one or more of their agencies, as well as those received by the departments themselves. 
The bulletin gives full details.'''

In general, it doesn't matter which you use, but there are particular reasons why it's useful to know about them all:

* Single quotation marks can be used when your string contains a quotation (that is, double quotation marks)
* Double quotation marks can be used when your string contains an apostrophe (that is, a single quotation mark)
* Triple quotation marks can be used when your string contains both, or runs over multiple lines

See what happens, for example, if we try to use double quotation marks in a string which contains a quote.

In [None]:
myquote = "She said "I am innocent""

SyntaxError: invalid syntax

The quotation mark that opens the quote ends up being interpreted as the *closing* quotation mark of the string overall, so Python can't make sense of what follows. To avoid this, you could use single quotation marks for the string:

In [None]:
myquote = 'She said "I am innocent"'

Below is an example where the string contains both double quotation marks and a single quotation mark (an apostrophe), so the string as a whole is wrapped in triple quotation marks (actually a single quotation mark typed three times).

In [None]:
myquote = '''She said "I'm innocent"'''
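
To check that the quotation marks wrapping a string are not stored as part of it, you can print the string. A minimal sketch, in which the `plainquote` variable name is made up for illustration:

```python
#a string containing double quotation marks, wrapped in single quotation marks
plainquote = 'She said "I am innocent"'
#a string containing both kinds of quotation mark, wrapped in triple quotation marks
myquote = '''She said "I'm innocent"'''
#printing shows the inner quotation marks but not the wrapping ones
print(plainquote)
print(myquote)
```

Only the quotation marks and apostrophe that belong to the text itself should appear in the output.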

## Other types of variables

String variables are just one of a number of types of variable. We've already encountered numerical variables. Here's a list of those and the other basic types of variable in Python:

* Strings (text)
* Integers (whole numbers)
* Floats (numbers with decimal places, like 58.3)
* Booleans (True/False)

In addition, you might store multiple items in these types of variable:

* Lists
* Dictionaries
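
As a rough sketch of what each of these types looks like, the example below creates one variable of each kind and uses the built-in `type()` function to show what Python calls it. The `figures` and `foi` variables, and their contents, are illustrative only and reuse numbers from earlier in this notebook.

```python
#a string
org = "Foreign, Commonwealth and Development Office"
#an integer
fcorequests = 48
#a float
percentage = 58.3
#a Boolean
above50 = True
#a list: multiple items inside square brackets
figures = [48, 28]
#a dictionary: labelled items inside curly brackets
foi = {"requests": 48, "refusals": 28}
#print the type of each variable
print(type(org))
print(type(fcorequests))
print(type(percentage))
print(type(above50))
print(type(figures))
print(type(foi))
```

Python's own names for these types are `str`, `int`, `float`, `bool`, `list` and `dict`.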

## Why the type of variable matters

Knowing what type of variable you're creating or dealing with is important in coding, because this determines what you can *do* with a variable. 

Here's an example:

In [None]:
#create a number variable
fcorequests = 48
#create another variable which contains the string '28'
sec27refusals = "28"
#try to divide one by the other
percrefused = sec27refusals/fcorequests

TypeError: unsupported operand type(s) for /: 'str' and 'int'

In this code we stored the 28 as a string, not a number:

`sec27refusals = "28"`

...and this meant that an error was generated when we tried to divide that number (ahem, *string*) by the other one.

Specifically we got a `TypeError`, which is an error relating to the *type* of variable being dealt with. 

That error can actually be broken down and understood, to help us learn more about variable types. Here it is:

`unsupported operand type(s) for /: 'str' and 'int'`

An 'operand' [is](https://link.springer.com/chapter/10.1007%2F978-0-85729-404-3_10):

> "the constants or variables which the operators operate upon."

So the operands in this case are 48 and "28". 

And the operator? Well an equals sign is an operator, as is the multiplication sign, 'add', 'subtract' and... divide. So the `/` sign, in this case, is the operator. 

Finally, `'str'` and `'int'`: these are terms for 'string' and 'integer' in Python.

So what it's saying is that a string and an integer are "unsupported" types of values to use the 'divide by' operator with. Which makes sense: you can't divide a piece of text by a number.

The good news is that you can convert some values from one type to another - which we will do later.
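
As a preview, here is a minimal sketch of one such conversion using Python's built-in `int()` function, which turns a string of digits into an integer. It assumes the string contains nothing but a whole number, and the `sec27refusals_number` variable name is made up for illustration.

```python
#create a number variable
fcorequests = 48
#create another variable which contains the string '28'
sec27refusals = "28"
#convert the string to an integer and store it in a new variable
sec27refusals_number = int(sec27refusals)
#the division now works because both operands are numbers
percrefused = sec27refusals_number/fcorequests
print(percrefused)
```

This should print the same result as the calculation at the top of the notebook.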

### Booleans

A Boolean variable can have one of two values: either `True` or `False`.

These are most often created as a result of asking a question. For example, you might ask if one year's number of crimes was higher than the previous year's figure. 


In [None]:
#store the number of crimes in the latest data
crimes_now = 3000
#store the number of crimes in the previous year
crimes_then = 2000
#test if one variable is higher than the other, and store the result (True or False) in a variable
crime_went_up = crimes_now > crimes_then
#print it
print(crime_went_up)

True


A result of `True` could be used to direct our attention.

But, perhaps more importantly, it can also be used to trigger certain lines of code based on if a condition is True, or if it is False.
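
A minimal sketch of that idea, using an `if`/`else` statement (not covered in this notebook) to run different lines depending on the Boolean. The `message` variable and its wording are invented for illustration.

```python
#store the number of crimes in the latest data and the previous year
crimes_now = 3000
crimes_then = 2000
#store the result of the comparison (True or False)
crime_went_up = crimes_now > crimes_then
#run one line if the result is True, and another if it is False
if crime_went_up:
    message = "Crime went up"
else:
    message = "Crime did not go up"
#print whichever message was stored
print(message)
```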

Note that the True/False result is generated by a **comparison operator**: `>`, meaning 'greater than'.

Other comparison operators include:

* `<` (less than)
* `==` (is equal to)
* `!=` (is not equal to)

Note that, because a single equals sign is already used in Python to assign a value to a variable, we have to use a *double* equals sign to ask if something 'is equal to' something else.
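
The other comparison operators can be tried out in the same way. This sketch reuses the illustrative crime figures; the variable names holding the results are made up for this example.

```python
#store the number of crimes in the latest data and the previous year
crimes_now = 3000
crimes_then = 2000
#is the latest figure less than the previous one?
went_down = crimes_now < crimes_then
#are the two figures equal? (note the double equals sign)
stayed_same = crimes_now == crimes_then
#are the two figures different?
changed = crimes_now != crimes_then
#print the three results
print(went_down)
print(stayed_same)
print(changed)
```

With these figures the three results should be `False`, `False` and `True`.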

Of course, we can create a Boolean variable directly, too. Below is some code that creates a Boolean variable called 'above50':

In [None]:
#create a Boolean variable called 'above50', set to True
above50 = True

Note that the colour of a Boolean variable is different again: this time `True` is blue in Colab, where a string was red, and a number was green. 

While we're at it, we should point out that variables have been coloured black, and commands like `print` have been brown.

The word `True` must be typed with a capital T and the rest in lower case. Watch what happens when it isn't:

In [None]:
#create a variable called 'above50', set to true
above50 = true

NameError: name 'true' is not defined

In [None]:
#create a variable called 'above50', set to TRUE
above50 = TRUE

NameError: name 'TRUE' is not defined

Hopefully you're starting to get used to these errors by now!