# Introduction to Python

## Learning Objectives
By the end of this practical lab you will be able to:

* Install Python and extension packages
* Calculate new values
* Set and use variables
* Identify and explain the different types of operators
* Create basic conditional statements

## Installing and using Python
Python is a general-purpose programming language that can be downloaded for free and installed on a range of platforms, including Windows, Mac OS and Linux (e.g. Ubuntu). Mac OS and most versions of Linux have Python built in; however, we will install Anaconda Python. Anaconda is a private company that bundles the Python language with many popular packages and provides an easy method for adding more packages, all for free. The appropriate installation files for your computer's operating system can be [downloaded here](https://www.anaconda.com/download) (we'll use 64-bit Python 3.x). Alternatively, if you are using a managed system (e.g. at a university or office) you might find Python already installed.

We will primarily use the Jupyter Notebook for running Python. The Notebook allows Python code, nicely formatted text (written using [Markdown](http://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Working%20With%20Markdown%20Cells.html)) and graphical output to exist in the same web-based interface.

Alternatively, Python can be run from...
- the command line ("Anaconda Prompt" on Windows or "Terminal" on Mac OS and Linux). 
- an Integrated Development Environment (IDE); the most common of these is [Spyder](https://opencollective.com/spyder), which comes with the Anaconda Python installation. Spyder provides a user-friendly interface to Python, and helps new users by bringing several components of Python together in a single window.

The Notebook has "markdown" cells for nicely formatted documentation and "code" cells for Python commands. This is a markdown cell; the following is a code cell. A code cell is run by selecting the cell and pressing Shift-Enter, or by clicking the "play" button in the toolbar above.


In [4]:
1 + 5

6

The basic Python functions can be expanded by installing additional "packages". The package `regex` can be installed at the command line using this syntax: 

`conda install regex`

It can also be installed from inside the Notebook like this:

In [5]:
import sys
!conda install --yes --prefix {sys.prefix} regex

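Fetching package metadata .............

CondaEnvironmentNotFoundError: Could not find environment: C:\Program .
You can list all discoverable environments with `conda info --envs`.

An error like the one above typically means the installation path contains a space, so the unquoted `{sys.prefix}` is cut off at that space. Wrapping it in double quotes, as in `--prefix "{sys.prefix}"`, should avoid this.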

The following loads the package, which makes the functions contained within it available to use:

In [6]:
import regex
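
Before importing, it can help to check whether a package is actually installed. The following is a minimal sketch using the standard-library function `importlib.util.find_spec`; the helper name `is_installed` is hypothetical and is not part of Python or conda.

```python
import importlib.util

def is_installed(package_name):
    """Return True if Python can locate a top-level package with this name."""
    return importlib.util.find_spec(package_name) is not None

# 'json' ships with Python, so this check succeeds
print(is_installed("json"))
# 'regex' is only found if the install step above succeeded
print(is_installed("regex"))
```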

The Jupyter Notebook is an ideal interface for interactive computing, but sometimes it is useful to store the code you are writing for later reuse. This is quite simple because Python code can be stored in text files, which are typically given the extension `.py`. These files are usually written in an IDE like Spyder or in a text editor. Writing scripts is outside the scope of this practical; however, [here is a simple intro to writing a script](https://en.wikibooks.org/wiki/Python_Programming/Creating_Python_Programs).
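
As a rough illustration of what such a file might contain, here is a minimal sketch of a script. The file name `add_numbers.py` and the function `add_numbers` are made up for this example.

```python
# Contents of a hypothetical script file, e.g. add_numbers.py

def add_numbers(first, second):
    """Return the sum of two numbers."""
    return first + second

# This block runs only when the file is executed directly
# (e.g. `python add_numbers.py`), not when it is imported.
if __name__ == "__main__":
    print(add_numbers(1, 5))
```

Saving the code in a file like this means the same calculation can be rerun later without retyping it.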

## Creating objects and their classes

Variables in Python are a way of storing values so that they can be reused later in your code. A value is assigned to a variable name using the "=" symbol. For example:

In [1]:
a = 5
b = 10
a + b

15

There is a range of data types, including integer, float, Boolean (logical), string and complex (which isn't covered here). The following code illustrates how variables of these types can be assigned and their types checked. Note also that we can add comments to our code using the "#" symbol; anything after it on that line is ignored by the Python interpreter.

In [2]:
#Creates a variable called z, which stores a float value
z = 54.8
type(z) #type function returns the type of the object

float

In [3]:
#Creates a variable called y, which stores an integer value
y = 51
print(y) #prints the value of a variable 

51


In [4]:
#Creates variables c and d, then q which stores the output of the logical query
c = 5
d = 2
q = c < d #stores True if c is less than d, otherwise False
q

False

In [5]:
#Creates a variable called s, which stores a string value
s = "Hello"
isinstance(s, str) #a function to check if an object (s) is certain type (str) - returns true or false

True
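
To see the types side by side, the short sketch below loops over one example value of each basic type and prints the name Python uses for it. The values are arbitrary and chosen only for illustration.

```python
# Example values, one per basic type (the values themselves are arbitrary)
examples = [51, 54.8, True, "Hello"]

for value in examples:
    # type(value).__name__ gives the type's name as a string
    print(value, "is of type", type(value).__name__)

type_names = [type(value).__name__ for value in examples]
```

Note that Python names the logical type `bool` and the string type `str`.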

### Converting Between Object Types

Sometimes it is necessary to convert between different object types, for example, numeric to string or vice versa.

In [6]:
u = 4 #creates an integer object
str(u) #Converts the variable to a string, which is visible in the printed result as the number is surrounded by quotes

'4'

In [7]:
i = "1" #creates a string object
int(i) #Converts the string object to an integer

1
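
Not every conversion succeeds, and some change the value. The sketch below shows a few cases worth knowing about; the variable name `conversion_failed` is just for illustration.

```python
print(float("4.5"))   # a string containing a decimal number converts to float
print(int(4.9))       # converting a float to int truncates towards zero
print(str(4) + "2")   # once converted, '+' concatenates rather than adds

# int() cannot parse a decimal string directly and raises ValueError
try:
    int("4.5")
    conversion_failed = False
except ValueError:
    conversion_failed = True
print(conversion_failed)
```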

### Lists
It is also possible to store multiple values within a single object called a list. Lists are created using the square bracket notation: `[]`. For example:

In [8]:
# creates a list of integers
list_1 = [2,3,5,6,7,8,9]
list_2 = [4,7,9,12,11,1,3] # creates a second integer list
# creates a list of strings
list_3 = ["I","like","python","it","is","fun"]

It is also possible to extract an element of a list using an index. Note that Python is a zero-indexed language, so the first element is at position zero.

In [9]:
#Returns the element at index 4 (the fifth element) of the list list_3
list_3[4]

'is'

Lists can include mixed object types.

In [10]:
list_4 = ["A", "B", 4, 5, list_1]
list_4

['A', 'B', 4, 5, [2, 3, 5, 6, 7, 8, 9]]

In [11]:
#Return the element at index 4 of list_4 (itself a list), then the element at index 3 of that inner list
list_4[4][3]

6
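
The zero-based indexing described above can trip up newcomers, so here is a small sketch of the first and last positions of a list. The name `words` is used only for this example and holds the same values as `list_3`.

```python
words = ["I", "like", "python", "it", "is", "fun"]  # same values as list_3

first = words[0]               # index 0 is the first element
last = words[len(words) - 1]   # the last valid index is len(words) - 1
also_last = words[-1]          # negative indices count back from the end

print(first, last, also_last)

# An index equal to the length is out of range and raises IndexError
try:
    words[len(words)]
except IndexError as error:
    print("IndexError:", error)
```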

### Arrays
Arrays are like lists except that they can only contain a single object type. However, they facilitate mathematical operations when they contain numeric data. Arrays are provided by the NumPy package.

In [None]:
import numpy as np #np is an alias we assign to reduce typing when accessing the numpy functionality
array_1 = np.array([2,3,5,6,7,8,9]) #arrays can take a list of elements
array_2 = np.array(list_2) #here we pass in a variable that is already a list
array_1

Notice that the numpy functionality is encapsulated in the `np` object. This functionality is accessed by using "dot notation"; i.e. `np.something()` would run the `something()` function from the numpy package.

In [None]:
array_1 - array_2 # arrays can be used as variables with operators - this calculates the difference between array_1 and array_2

In [None]:
array_1 * 10 #arrays can also be combined with constants
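
The difference between lists and arrays is easiest to see when the same operator is applied to both. This is a small sketch with made-up values, assuming NumPy is installed:

```python
import numpy as np

values = [2, 3, 5]
array_values = np.array(values)

repeated = values * 2          # a list is repeated: [2, 3, 5, 2, 3, 5]
doubled = array_values * 2     # an array is multiplied element by element

print(repeated)
print(doubled)
```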

The contents of an array are accessed in a similar way to those of a list.

In [None]:
array_1[4]

In [None]:
# Return the elements from index 2 up to and including index 4
array_1[2:5]

Notice that the first value referenced in `[2:5]` is included, but the second value referenced is excluded.
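
The same half-open rule applies when slicing plain lists and strings. The sketch below uses a list with the same values as `list_1` (named `numbers` here for illustration) so that it runs without NumPy.

```python
numbers = [2, 3, 5, 6, 7, 8, 9]  # same values as list_1

middle = numbers[2:5]    # indices 2, 3 and 4; index 5 is excluded
start = numbers[:3]      # omitting the start begins at index 0
rest = numbers[3:]       # omitting the end runs to the last element

print(middle, start, rest)
# The two parts rejoin into the original because the boundary index
# belongs to exactly one of them
print(start + rest == numbers)
```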

## Working with variables

There are also a series of functions within Python that can work with different object types.

Details about how to use any Python function can be found by using the `help` function. For example, in the case of "isinstance":

In [None]:
help(isinstance)

### Length function

In [None]:
len(list_3) #Return the length of a list (and other object types)

In [None]:
#Find out how many characters a string contains
len("I heart Python")

### String functions

In [None]:
#Create a new variable from the 2nd element of the list list_3
k = list_3[1]
k

In [None]:
#Use the slicing functionality to extract the characters from position 1 up to (but not including) position 3
#Note that the range is inclusive of the first value (1) and exclusive of the second (3)
k[1:3]

In [None]:
# Substitute an element of a string
u = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
u.replace("A","Z") #Replace the letter A with Z

In [None]:
#Find the position of a character within a string
u.index("R")

In [None]:
#Split a string on a particular character; returns a list (two elements here, because "D" occurs once)
u_2 = u.split("D")
u_2[0] #returns the first element of the list

In [None]:
#Concatenate strings
"A" + "B" #combines the two strings

In [None]:
#Change the case of a string
u.lower()

In [None]:
"hello".upper()
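
One point the examples above rely on is that string methods return a new string rather than changing the original. A brief sketch, using a made-up variable `letters`:

```python
letters = "ABCDEF"

replaced = letters.replace("A", "Z")  # returns a new string
print(replaced)
print(letters)                        # the original is unchanged

# Methods can be chained because each one returns a string
chained = letters.lower().replace("a", "z")
print(chained)
```

To keep a modified string, assign the result to a variable, as done with `replaced` here.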

### Numeric Functions

In [None]:
#Create 100 random numbers drawn uniformly between 0.0 and 1.0
h = np.random.uniform(0.0, 1.0, 100)

In [None]:
#Minimum, using Python's built-in function
min(h)

In [None]:
#Minimum, using the array's own method
h.min()

In [None]:
#Maximum
h.max()

In [None]:
#Standard deviation
h.std()

In [None]:
#Mean
h.mean()

In [None]:
#Median
np.median(h)

In [None]:
#Range
np.ptp(h)

In [None]:
#Round
round(5.6789, 2)

In [None]:
#Log10
np.log10(23)
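
For comparison, several of these summaries are also available without NumPy through Python's built-in functions and the standard-library `statistics` module. The sketch below uses a small fixed data set rather than random numbers so the results are repeatable. Note that NumPy's `std()` computes the population standard deviation by default, which corresponds to `statistics.pstdev`.

```python
import statistics

data = [2, 3, 5, 6, 7, 8, 9]  # small example data set (same values as list_1)

print(min(data), max(data))      # built-in minimum and maximum
print(max(data) - min(data))     # range, as np.ptp would return
print(statistics.mean(data))
print(statistics.median(data))
print(statistics.pstdev(data))   # population standard deviation
print(statistics.stdev(data))    # sample standard deviation (divides by n - 1)
```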

### Operators

There is a range of operators that can be used to work with variables. These include the standard arithmetic operators `+ - / *`, and `**` for exponents.

In [None]:
1 + 3

In [None]:
3 - 2

In [None]:
4 / 7

In [None]:
8 * 2

In [None]:
2**4
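
A few related details are worth a quick sketch: in Python 3 the `/` operator always returns a float, while `//` and `%` give the whole-number quotient and the remainder. Parentheses control the order of evaluation.

```python
print(7 / 2)       # true division always returns a float
print(7 // 2)      # floor division rounds down to a whole number
print(7 % 2)       # modulo returns the remainder
print(2 + 3 * 4)   # multiplication is evaluated before addition
print((2 + 3) * 4) # parentheses are evaluated first
```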

There are also comparison and logical operators, which return `True` or `False`:

In [None]:
#Five less than seven?
5 < 7

In [None]:
#Eight more than 2?
8 > 2

In [None]:
#7 less than or equal to 5
7<=5

In [None]:
#2 more than or equal to 2
2 >= 2

In [None]:
#Is a the same as b
a = "Hello"
b = "Goodbye"
a == b

In [None]:
#Is a not equal to b
a != b

In [None]:
#Does a equal Hello, or b equal Dog?
(a == "Hello") | (b == "Dog") #returns true because one side of the OR (|) operator is true

In [None]:
#Does a equal Hello and b equal Dog?
(a == "Hello") & (b == "Dog") #returns false because the AND (&) operator requires both sides to be true
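
For plain true/false values, Python's keyword operators `and`, `or` and `not` are the more common way to combine conditions. The `|` and `&` symbols are bitwise operators that also work on Booleans, and are what NumPy arrays use for element-wise logic. A minimal sketch of the keyword forms:

```python
a = "Hello"
b = "Goodbye"

either = (a == "Hello") or (b == "Dog")    # True: the left side is true
both = (a == "Hello") and (b == "Dog")     # False: the right side is false
negated = not (a == b)                     # True: a and b differ

print(either, both, negated)
```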

## Working with conditional statements
In many analysis tasks there is a need to make decisions based upon a condition being met. This is done with control structures, most commonly `if` and `else`. Python requires that the code within each branch of the conditional statement be indented.

In [None]:
a = 10
b = 15

if a > b: #Tests if a is greater than b (it isn't, so returns false)
    print(a * 10) #this statement would be run only if the if had evaluated true
else: #the else condition is run because the if statement evaluated as false
    print(a * 20)
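
When there are more than two possible outcomes, `elif` (short for "else if") adds further conditions between `if` and `else`. The sketch below wraps the comparison in a function; the name `compare` is made up for this example.

```python
def compare(a, b):
    """Return a short description of how a relates to b."""
    if a > b:
        return "a is greater than b"
    elif a == b:   # checked only if the first condition was false
        return "a is equal to b"
    else:          # runs when none of the conditions above were true
        return "a is less than b"

print(compare(10, 15))
```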

## The Python Environment

Throughout this practical you have been creating Python objects. You can see which objects are in the environment using the `dir()` command; it also shows a number of other things that are less important (you can ignore all the objects that begin with an underscore, `_`).

In [None]:
print(dir())
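
To hide the names that begin with an underscore, the output of `dir()` can be filtered. This is one possible sketch; `example_value` and `visible_names` are names invented for the illustration.

```python
example_value = 42  # a variable created purely for illustration

# Keep only the names that do not begin with an underscore
visible_names = [name for name in dir() if not name.startswith("_")]
print(visible_names)
```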

## Further resources / training
* [Introduction to Python](https://www.datacamp.com/courses/intro-to-python-for-data-science) - An interactive Python tutorial on Data Camp