<img src="./intro_images/introbanner.png" width="100%" align="left" />

<table style="float:right;">
    <tr>
        <td>                      
            <div style="text-align: right"><a href="https://alandavies.netlify.com" target="_blank">Dr Alan Davies</a></div>
            <div style="text-align: right">Lecturer health data science</div>
            <div style="text-align: right">University of Manchester</div>
         </td>
         <td>
             <img src="./intro_images/alan.png" width="30%" />
         </td>
     </tr>
</table>

# Variables
****

#### About this Notebook
This notebook introduces variables: their data types, how to name them and how to work with them.

This notebook is at <code>Beginner</code> level and will take approximately 1 hour to complete.

<div class="alert alert-block alert-warning"><b>Learning Objectives:</b> 
<br/> This notebook will help you start to:
    
- Express a clear understanding of the basic principles of the Python programming language.
- Explain the features of Python that support object-oriented programming

</div> 


<a id="top"></a>

<b>Table of contents</b><br/>

1.0 [About Variables](#var)

2.0 [Naming Variables](#namingvar)

3.0 [Changing Variable Types](#changingvar)


---

<a name="var"></a>
## 1.0 About Variables

Most programs receive data from some input that the program then manipulates in some way to produce a result or output. This input can be very diverse. For example, it may be a keypress in a video game to move a character on the screen, a list of payroll numbers or some data from a Martian probe. In all cases we need to store this data somewhere. In computer programs we use **`variables`** to store data. The name variable implies that the thing being stored may vary. Let's look at some examples.

In [6]:
x = 1
weight_kg = 10.56
my_name = "David Smith"
price = 6

In the example above we have 4 variables called  **`x`**, <b>`weight_kg`</b>, <b>`my_name`</b> and <b>`price`</b>. There are 4 main types of data including:<br>
<ul>
    <li><b>Integers</b> - whole number values <i>- e.g. 5, -3 or 131255</i></li>
    <li><b>Floating point</b> numbers - numbers with a dot in them <i>- e.g. 10.56</i></li>
    <li><b>Strings</b> - contain text <i>- e.g. "Hello" or "Hi everyone"</i> </li>
<li><b>Boolean</b> - True and False values</li>
</ul>
Python is able to work out which **`type`** a variable is based on the value stored within. For example, it knows **`price`** is an **`integer`** because it contains an integer (whole number) value **`6`**.

<div class="alert alert-success">
<b>Note:</b> We use the term <code>floating point</code> because the dot doesn't necessarily represent a decimal point (base 10). It could be binary (base 2), octal (base 8) or hexadecimal (base 16) to name a few.  
</div>

The equals operator (=) is used for variable assignment. It stores the value on the right of the equals sign under the label on the left. For example, x = 1 means put 1 into a label called x. We can see what is inside a variable (what value it contains) by using the print function and passing in the variable name. We will talk more about functions and passing variables to them later. For now we will just use **`print()`** to display values.

In [7]:
weight_kg = 10.56
print(weight_kg)

10.56
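
As a small sketch of why the name "variable" fits, the cell below assigns to the same label more than once. The values are chosen only for illustration and are not part of the original cells.

```python
# Illustrative values only; any numbers would do.
price = 6
print(price)        # 6

# Assigning again replaces the old value with the new one.
price = 8
print(price)        # 8

# A variable can also be updated using its own current value.
price = price + 2
print(price)        # 10
```

Each assignment simply overwrites what the label previously referred to.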


Another way of looking at a variable is as a box that stores some data. You give the box a meaningful label to help organise your data and find it again, e.g. **`weight_kg = 10.56`**

<img src="./intro_images/box.png" width="300" />

We can use the label when we want to retrieve that value later for some computation or other processing. We can also see what type a variable is by using the **`type`** function. For <b>`weight_kg`</b> the type is reported as <b>`float`</b> because its value, 10.56, has a point in it.

In [4]:
weight_kg = 10.56
type(weight_kg)

float

<div class="alert alert-success">
<b>Note:</b> We use the term <code>float</code> instead of decimal because there are many number systems with different bases. For example binary is base 2, hexadecimal is base 16, decimal is base 10 and octal is base 8.  
</div>
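
The sketch below shows how `type()` reports three of the main types, and how the reported type follows the value currently stored. The variable `is_adult` is a hypothetical name introduced only for this illustration.

```python
x = 1
weight_kg = 10.56
is_adult = True   # hypothetical Boolean variable, used only for this example

print(type(x))          # <class 'int'>
print(type(weight_kg))  # <class 'float'>
print(type(is_adult))   # <class 'bool'>

# Storing a whole number in the same label changes the type Python reports.
weight_kg = 104
print(type(weight_kg))  # <class 'int'>
```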

<div class="alert alert-block alert-info">
<b>Task 1:</b>
<br> 
1. What type do you think the variables <code>price</code> and <code>my_name</code> are?<br> 
2. Use the <code>type()</code> function to check in the cells below.
</div>

In [5]:
print(type(price))
print(type(my_name))

<class 'int'>
<class 'str'>

[Return to top](#top)

-------------

<a name="namingvar"></a>

## 2.0 Naming Variables

In maths, variables tend to be labelled with a single letter like <i>i</i>, <i>x</i> and <i>j</i>. In programming we can afford to use longer and more descriptive labels that better describe the value they hold, e.g. <b>`weight_kg`</b>, which suggests it might contain some data on weight measured in kilograms. The convention in Python is to use <b>`snake case`</b>. This is where words are written in lower case and separated by an underscore (e.g. `data_file_loader`). Other languages such as Java and JavaScript conventionally use <b>`camel case`</b>, where new words are capitalised like the humps on a camel's back (e.g. `dataFileLoader`). There are a few restrictions on how we can name a variable in Python. These include:
<ul>
<li>The first character cannot be a number</li>
<li>The name can't be the same as an existing Python keyword (more about this later)</li>
</ul>
Variables can start with an underscore, contain letters and numbers and be any length. Case is important though. A variable named <b>`my_name`</b> is not the same as one called <b>`My_name`</b>. In this case you would have made (declared) 2 separate variables.
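
One way to check these rules in code is sketched below, assuming the string method `isidentifier()` and the standard library `keyword` module. The candidate names are made up for illustration.

```python
import keyword

# Hypothetical candidate names, chosen only for illustration.
candidates = ["patient_age", "_hidden", "2nd_reading", "class", "heart rate"]

results = {}
for name in candidates:
    # isidentifier() checks the spelling rules; iskeyword() checks for reserved words.
    results[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, "->", results[name])

# Case matters: these are two separate variables.
my_name = "David"
My_name = "Claire"
print(my_name, My_name)   # David Claire
```

Here `2nd_reading` fails because it starts with a number, `class` fails because it is a Python keyword, and `heart rate` fails because names cannot contain spaces.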

<div class="alert alert-block alert-info">
<b>Task 2:</b>
<br> 
1. Which of these are legal variable names? <code>_accounts</code>, <code>1005_accounts</code> and <code>my_accounts</code><br> 
2. Use the cells below to check by assigning a value to each variable. The first one is done for you (<code>_accounts = 10</code>), so just press Ctrl + Enter.
</div>

In [14]:
_accounts = 10
1005_accounts = 10
my_accounts = 10

SyntaxError: invalid token (<ipython-input-14-5c849182ce75>, line 2)

In [15]:
_accounts = 10

In [16]:
1005_accounts = 10

SyntaxError: invalid token (<ipython-input-16-9b65dcd07650>, line 1)

In [17]:
my_accounts = 10

#### 2.1 Working with Strings

In software engineering, strings refer to textual data. A string is made up of a sequence of characters. In Python, strings are defined using either single or double quotes.

In [18]:
"This is a string."

'This is a string.'

In [19]:
'So is this.'

'So is this.'

Strings can also be joined together (**`concatenated`**) using the plus operator.

In [20]:
"This string can be " + "joined to that string."

'This string can be joined to that string.'

There are some useful ways of interacting with strings in Python. Let's say we had a string:

In [21]:
my_string = "This is a text string."

We can use the **`len()`** function to see the length of the string (how many characters it contains).

In [22]:
len(my_string)

22

<div class="alert alert-success">
<b>Note:</b> This count includes the spaces. These spaces are known as <code>white space</code>. 
</div>

To access a certain character in a string, you just need to specify its position (index) in the string, counting from 0. Take a look at the table below to see the index of each letter in the first word of the string:

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
.tg .tg-kiyi{font-weight:bold;border-color:inherit;text-align:left}
.tg .tg-fymr{font-weight:bold;border-color:inherit;text-align:left;vertical-align:top}
.tg .tg-xldj{border-color:inherit;text-align:left}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
</style>
<table class="tg">
  <tr>
    <th class="tg-kiyi">T</th>
    <th class="tg-kiyi">H</th>
    <th class="tg-kiyi">I</th>
    <th class="tg-kiyi">S</th>
  </tr>
  <tr>
    <td class="tg-xldj">0</td>   
    <td class="tg-0pky">1</td>
    <td class="tg-0pky">2</td>
    <td class="tg-0pky">3</td>
  </tr>

</table>



For example, to access the letter **`i`** in the word **`This`** we would write:

In [23]:
my_string[2]

'i'

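If we want to retrieve the whole word, we can provide a start (0) and end (4) position separated by a colon. The character at the end position is not included, so this returns the characters at positions 0 to 3.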

In [24]:
my_string[0:4]

'This'

This is called string **`slicing`**. Here are some further examples:

In [25]:
print(my_string[:])
print(my_string[-1])
print(my_string[3:-5])

This is a text string.
.
s is a text st
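
The slicing rules can be sketched on a different, made-up string (the variable `dna` is hypothetical and used only for illustration). The main point is that the end position is excluded and that negative positions count back from the end.

```python
dna = "ctgaccta"   # hypothetical example string, 8 characters long

first_half = dna[0:4]    # positions 0, 1, 2 and 3 (position 4 is excluded)
second_half = dna[4:]    # from position 4 to the end
last_char = dna[-1]      # negative positions count back from the end

print(first_half)    # ctga
print(second_half)   # ccta
print(last_char)     # a

# Leaving out the start or the end means "from the beginning" or "to the end".
print(dna[:4] + dna[4:] == dna)   # True
```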


<div class="alert alert-block alert-info">
<b>Task 3:</b>
<br> Using string slicing print the word <code>text</code> from <code>my_string</code>.
</div>

In [26]:
print(my_string[10:14])

text


In [29]:
print(my_string[5:7])

is


<div class="alert alert-success">
    <strong>Note:</strong> It is important to also realise that numbers enclosed in quotes are strings and not numbers.  
</div>



In [30]:
x = "123"
print(x)
print(type(x))

123
<class 'str'>


In the next cell we try to add the number 4 to our string, which causes an error. We get around this by using **`type casting`** to change the integer 4 into the string "4". We can then add it to the string. We discuss this in more detail in section 3.0.

In [31]:
x + 4

TypeError: can only concatenate str (not "int") to str

In [32]:
x + str(4)

'1234'

Alternatively we could turn the string **`x`** into an integer and add the number 4 to it.

In [33]:
int(x) + 4

127

You can display strings with variables in several ways including using a comma or the percentage operator: 

In [34]:
name = "Claire"
print("This is to say hi to", name)

This is to say hi to Claire


You can also use the percentage operator as a placeholder.

In [35]:
name = "Claire"
print("Hi %s nice to meet you" % name)

Hi Claire nice to meet you


The letter after the percent is related to the type of variable you want to print. The %s is for string. Some of the more commonly used letters can be seen below.

<table class="tg">
  <tr>
    <th class="tg-kiyi">Letter</th>
    <th class="tg-kiyi">Variable type</th>
  </tr>
  <tr>
    <td class="tg-xldj">d, i</td>   
    <td class="tg-0pky">Integer</td>
  </tr>
   <tr>
    <td class="tg-xldj">s</td>   
    <td class="tg-0pky">String</td>
  </tr>
   <tr>
    <td class="tg-xldj">f, g</td>   
    <td class="tg-0pky">Float</td>
  </tr>
</table>
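
As a brief sketch of the other letters in the table, the cell below formats an integer and a float with the percentage operator. The variable `age` and its value are made up for illustration. When more than one value is inserted, the values are grouped in round brackets after the `%`.

```python
name = "Claire"
age = 42            # hypothetical value, used only for this example
weight_kg = 10.56

line_one = "%s is %d years old" % (name, age)
line_two = "Weight: %f kg" % weight_kg      # %f shows six decimal places by default
line_three = "Weight: %.1f kg" % weight_kg  # .1 limits the output to one decimal place

print(line_one)     # Claire is 42 years old
print(line_two)     # Weight: 10.560000 kg
print(line_three)   # Weight: 10.6 kg
```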

Another useful thing to be able to do is to find a word within a string. Let's say we were looking for a certain keyword in a string of text, such as **`syncope`** (a temporary loss of consciousness).

In [36]:
presenting_complaint = "A 68 year old male complained of feeling feint followed by an episode of syncope and headache."

In [37]:
presenting_complaint.find("syncope")

73

This returns the index of the first character of the word if it is found, or -1 if it is not, e.g.

In [38]:
presenting_complaint.find("stroke")

-1
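
Because `find()` returns -1 rather than raising an error, it is worth checking the result before using it. The sketch below uses a made-up sentence and an `if` statement, which is covered properly later; treat it as an illustration rather than the only way to do this.

```python
note = "Patient reports chest pain and nausea."   # hypothetical example text
term = "nausea"

position = note.find(term)

if position != -1:
    # Slice from the found index to recover the matched word.
    matched = note[position:position + len(term)]
    print("Found", matched, "at index", position)
else:
    matched = None
    print(term, "not found")
```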

You can also split a string up based on a character called a **`delimiter`**. In this case we can split the string by spaces. We could also split by a comma or another character if relevant.

In [39]:
words = presenting_complaint.split(" ")
print(words)

['A', '68', 'year', 'old', 'male', 'complained', 'of', 'feeling', 'feint', 'followed', 'by', 'an', 'episode', 'of', 'syncope', 'and', 'headache.']


<div class="alert alert-success">
<b>Note:</b> This splits the string and stores the pieces as elements (items) in a <code>list</code>, shown separated by commas. We will cover lists a little later. 
</div>

The opposite of the **`split()`** function is the **`join()`** function which we can use to put the string back together again. 

In [40]:
joined_str = " ".join(words)
print(joined_str)

A 68 year old male complained of feeling feint followed by an episode of syncope and headache.


You can also add a character in between when joining. Let's say you had some data you wanted to separate with dashes or with no separation at all.

In [41]:
x = ["ctga", "ccta", "aact"]
x

['ctga', 'ccta', 'aact']

In [42]:
data_joined_dash = "-".join(x)
print(data_joined_dash)

ctga-ccta-aact


In [43]:
data_joined_nospace = "".join(x)
print(data_joined_nospace)

ctgacctaaact


In [44]:
comma_string = "name, dob, pmh, social_history, lab_results, next_of_kin"

<div class="alert alert-block alert-info">
<b>Task 4:</b>
<br> Using the string split function, split the string <code>comma_string</code> by comma.
</div>

In [45]:
words = comma_string.split(",")
print(words)

['name', ' dob', ' pmh', ' social_history', ' lab_results', ' next_of_kin']
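
Notice that splitting on a comma alone leaves a leading space on most items. Two possible ways to deal with this are sketched below: splitting on a comma followed by a space, or removing surrounding white space from an item with the `strip()` method.

```python
comma_string = "name, dob, pmh, social_history, lab_results, next_of_kin"

# Option 1: treat the comma and the space together as the delimiter.
clean_words = comma_string.split(", ")
print(clean_words)

# Option 2: strip white space from an individual item after splitting.
raw_words = comma_string.split(",")
second_item = raw_words[1].strip()
print(second_item)   # dob
```

The first option only works when every comma is followed by exactly one space, so `strip()` is the more forgiving choice for messy data.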


We can also replace a word or words in a string with other words. Let's say we wanted to replace **`syncope`** with **`LOC`** for Loss Of Consciousness because we think this is a more recognized word. Here we use the **`replace`** function and put the word we want to replace first followed by a comma and then the new word.

In [46]:
presenting_complaint.replace("syncope", "LOC")

'A 68 year old male complained of feeling feint followed by an episode of LOC and headache.'
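
One detail worth knowing is that `replace()` returns a new string and leaves the original unchanged. The sketch below uses a shortened, made-up sentence to show that the result has to be stored if you want to keep it.

```python
complaint = "An episode of syncope and headache."   # shortened example text

updated = complaint.replace("syncope", "LOC")

print(complaint)   # An episode of syncope and headache.
print(updated)     # An episode of LOC and headache.
```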

Other useful string functions include:

<table class="tg">
  <tr>
    <th class="tg-kiyi">Function</th>
    <th class="tg-kiyi">Description</th>
  </tr>
  <tr>
    <td class="tg-xldj">lower()</td>   
    <td class="tg-0pky">Changes text to lower case</td>
  </tr>
  <tr>
    <td class="tg-xldj">upper()</td>   
    <td class="tg-0pky">Changes text to upper case (capitals)</td>
  </tr>
  <tr>
    <td class="tg-xldj">isalpha()</td>   
    <td class="tg-0pky">Checks if text contains only letters</td>
  </tr>
  <tr>
    <td class="tg-xldj">isdigit()</td>   
    <td class="tg-0pky">Checks if text contains only digits</td>
  </tr>
  <tr>
    <td class="tg-xldj">isspace()</td>   
    <td class="tg-0pky">Checks if text contains only white space</td>
  </tr>
  <tr>
    <td class="tg-xldj">startswith()</td>   
    <td class="tg-0pky">Looks for a string at the start of another</td>
  </tr>
  <tr>
    <td class="tg-xldj">endswith()</td>   
    <td class="tg-0pky">Looks for a string at the end of another</td>
  </tr>
</table>
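
A short sketch of these string functions in use is shown below, with `endswith()` as the counterpart that checks the end of a string. The variable `code` and its value are hypothetical.

```python
code = "Abc123"   # hypothetical example value

print(code.lower())             # abc123
print(code.upper())             # ABC123
print(code.isalpha())           # False, because it also contains digits
print("abc".isalpha())          # True
print("123".isdigit())          # True
print("   ".isspace())          # True
print(code.startswith("Abc"))   # True
print(code.endswith("123"))     # True
```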

There is also a set of **`escape characters`** that can be used inside strings. For example **`\t`** for tab.

In [47]:
print("This is \t a tab")

This is 	 a tab


In [48]:
print("This is a \n newline")

This is a 
 newline


In [49]:
print("This is a \n\r newline and carriage return (like an old typewriter)")

This is a 
 newline and carriage return (like an old typewriter)


We can use **`in`** and **`not in`** to see if a word or substring (string within a string) is present. For example, to see if the word **`chest`** is in the string below:

In [50]:
pc = "86 year old female with crushing central chest pain radiating down left arm."

In [51]:
"chest" in pc

True

In [52]:
"chest pain" in pc

True

In [53]:
"kidney pain" in pc

False

In [54]:
"kidney pain" not in pc

True
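
These checks are case sensitive. One common approach, sketched below, is to convert both sides to lower case with `lower()` before comparing. The variable names `exact_match` and `relaxed_match` are introduced only for this example.

```python
pc = "86 year old female with crushing central chest pain radiating down left arm."

# "Chest" with a capital C does not appear in the text as written.
exact_match = "Chest" in pc
print(exact_match)     # False

# Lower-casing both sides makes the check ignore case.
relaxed_match = "Chest".lower() in pc.lower()
print(relaxed_match)   # True
```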

[Return to top](#top)


----------
<a name="changingvar"></a>

## 3.0 Changing a Variable's Type

To store and process the values contained inside variables you may need to change their type from time to time (called <b>`type casting`</b>). For example, when storing a phone number we might want to store it as text rather than as an integer, since an integer would lose any leading zero. There are several functions in Python for altering a variable's type, including <b>`int()`</b>, <b>`float()`</b> and <b>`str()`</b>. 

In [55]:
pi = 3.141592
type(pi)

float

In [56]:
pi = str(pi)
type(pi)

str

In the example above, we declare a variable called <b>`pi`</b> and give it the value <b>`3.141592`</b> to represent the Greek letter pi ($\pi$), the ratio of a circle's circumference to its diameter. When we use the <b>`type`</b> function we can see that it is a <b>`float`</b>. Next we use the <b>`str()`</b> function to convert it into a string and overwrite the existing value. Now when we view the type it is a <b>`string (str)`</b>.

<div class="alert alert-block alert-info">
<b>Task 5:</b>
<br> 1. Cast the variable <code>pi</code> back into a <code>float</code> and then into an <code>integer</code>. Finally print its contents and its type.
<br> 2. What value would you expect to see?
</div>

In [57]:
pi = float(pi)
pi = int(pi)
print(pi)
type(pi)

3


int

Or combining steps

In [58]:
pi = int(float(pi))
print(pi)
type(pi)

3


int
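
Two details of `int()` are sketched below: it cuts off the fractional part rather than rounding, and it cannot convert a string containing a point directly. The `try` and `except` lines catch the resulting error so the cell keeps running; they are used here only for illustration, and the variable `text` is hypothetical.

```python
# int() drops everything after the point; it does not round.
print(int(3.99))     # 3
print(int(-3.99))    # -3
print(round(3.99))   # 4, for comparison

text = "3.141592"    # a number stored as a string

# Going through float() first works.
converted = int(float(text))
print(converted)     # 3

# Converting the string straight to an integer raises a ValueError.
try:
    int(text)
    direct_conversion_worked = True
except ValueError as error:
    direct_conversion_worked = False
    print("Could not convert:", error)
```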

#### Notebook details
<br>
<i>Notebook created by <strong>Dr. Alan Davies</strong> with <strong>Frances Hooley</strong> 
    

Publish date: October 2020<br>
Review date: October 2021</i>
