# Python and Numpy Basics

Python is a flexible programming language that is used in a wide range of scenarios, from web applications to device programming. It's extremely popular in the data science and artificial intelligence (AI) community. If you're familiar with other programming languages, like Java or Microsoft C#, Python can seem a little confusing at first. This notebook is designed to offer a crash course in some of the key features of the language for existing programmers who are new to Python and the Jupyter notebook environment.

## Getting Help
One of the most useful features of Python is its built-in help feature. You can use the **help** function to view documentation for specific commands.

Run the following cell to try it for yourself.

(To run the cell, place the cursor in it and click the  **&#9658;I Run** button. The **&#9711;** symbol next to the **Python 3.6** kernel name at the top right will briefly turn to **&#9899;** while the cell runs before turning back to **&#9711;**.)

In [None]:
help(print)

Note that the **help** function provides documentation for the class or function you specify. In this case, `help(print)` displays information about the **print** function, including details of its arguments.
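
The same mechanism works for code you write yourself, because **help** reads a function's docstring. A minimal sketch, assuming a made-up function named `greet` that is not part of this notebook:

```python
# Hypothetical function used only for illustration
def greet(name):
    """Return a short greeting for the given name."""
    return 'Hello, ' + name

# help() prints the signature and the docstring defined above
help(greet)

# The docstring is also available directly as an attribute
print(greet.__doc__)
```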

The notebook environment also supports autocomplete. To see how this works, in the empty cell below type `print('Hello', 'World',` and then press the **Tab** key. This displays a scrollable list of contextual commands that you can use to complete your **print** statement. You can select a command in the list by scrolling to it with the arrow keys and pressing **Enter**, or by clicking it. Try using this technique to select the `sep=` argument to add it to your statement, and then complete the command to read `print('Hello', 'World',sep='\n')`. Running this command produces the following output:
>Hello<br/>World
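
As `help(print)` shows, **print** also accepts `end` and `file` arguments alongside `sep`. The sketch below is one way to check exactly what a call writes; the names `buffer` and `captured` are illustrative only:

```python
import io

# Write into an in-memory text buffer instead of the cell output
buffer = io.StringIO()

# sep goes between the values; end replaces the default trailing newline
print('Hello', 'World', sep='\n', end='', file=buffer)

captured = buffer.getvalue()
print(repr(captured))
```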

## Basics: Variables and Loops
Let's start with some basics; review the commented code in the following cell, and then run that cell to see the output.

In [None]:
# Declare, initialize, and display a variable - No explicit declaration keyword or data type is required
i = 0
print('i: ', i)

# We can use the *type* function to check the data type that has been automatically inferred
print(type(i), '\n')

# Now let's create another variable:
l = [1,2,3,4,5]
print('l:', l)
print(type(l), '\n')

# And another
t = (1,2,3,4,5)
print('t:', t)
print(type(t), '\n')

# Now let's modify our variables:
i = i + 1
print('i: ', i)

l = l + [6]
print('\nl: ', l)

t = t + (6,'seven', (1,2), [10,20,30])
print('\nt: ', t)

# l also has an *append* method:
l.append('seven')
l.append(t)
l.append([100,200,300])
print('l: ', l)

Review the output from the code, and note the following:
* **#** is the prefix for a comment
* There are no line-end indicators such as **;**
* Variable declarations require no type - **i** has automatically been detected as an *int*.
* **l** is a *list* - this is indicated by the use of square brackets around zero or more values.
* **t** is a *tuple* - an immutable, ordered sequence of zero or more values, indicated by the use of parentheses.
* For numeric types, arithmetic operators work as expected; so 0 + 1 is 1.
* For lists and tuples (and strings), the **+** operator concatenates two values of the same type.
* Lists and tuples are flexible - they can contain a mix of data types, including lists within lists.
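
One difference the cell above does not exercise is that a list can be changed in place, while a tuple cannot. A minimal sketch, using illustrative variable names that are not part of the notebook:

```python
# Lists are mutable: an element can be replaced in place
numbers_list = [1, 2, 3]
numbers_list[0] = 99

# Tuples are immutable: item assignment raises a TypeError
numbers_tuple = (1, 2, 3)
try:
    numbers_tuple[0] = 99
    tuple_changed = True
except TypeError as err:
    tuple_changed = False
    print('Tuples cannot be modified:', err)

print(numbers_list)
print(numbers_tuple)
```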

Since we've run the previous cell, as long as this notebook session remains active, our variables will still be in scope; so we can do some more work with them in the next cell:

In [None]:
# Create a list explicitly with a range
nums = list(range(0,7))
print('nums has %i elements.\n' % len(nums))

# reset i
i = 0

print('nums:')

# loop through the elements of nums by incrementing i
while(i < 7):
    print(nums[i])
    i = i + 1

print('\nl:')
# iterate through nums and print the corresponding element of l
for num in nums:
    print('l[%i] = %s' %(num, l[num]))
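
The `while` loop above tracks the index by hand. When both the index and the value are needed, a `for` loop with the built-in **enumerate** function is a common alternative. A small sketch that rebuilds `nums` so it can run on its own; the `lines` list is only there to collect the output:

```python
nums = list(range(0, 7))

lines = []
# enumerate yields (index, value) pairs, so no manual counter is needed
for index, value in enumerate(nums):
    lines.append('nums[%i] = %i' % (index, value))

print('\n'.join(lines))
```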

## Functions
Now let's look at how we can use functions to encapsulate reusable code. Again, the following cell assumes that the previous cells in this notebook have been run in this session:

In [None]:
# Function to add a value to a list
def add_element(lst, val):
    lst.append(val)

# Function to get an element from a list
def get_element(lst, idx):
    val = lst[idx]
    return val

# Function to get the size of a list and its last element
def get_last_element(lst):
    lst_size = len(lst)
    lst_end = lst[-1]
    return lst_size, lst_end

# call add_element
add_element(l, 'new element')
print(l)

# call get_element
v = get_element(l,0)
print (v)

# call get_last_element
l_len, l_last = get_last_element(l)
print(l_len,': ', l_last)

Note the following:
* Functions are defined using the **def** keyword.
* The code for a function is indented under the **def** declaration.
* Functions can accept zero or more parameters.
* Functions can return zero or more outputs.
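
To make the last point concrete: when a function returns several values they are packed into a single tuple, and a function with no `return` statement returns `None`. A minimal sketch with made-up function names:

```python
# Hypothetical helper that returns two values at once
def min_and_max(values):
    return min(values), max(values)

# Hypothetical helper with no return statement
def log_only(message):
    print(message)

result = min_and_max([4, 9, 1])   # the two values arrive as one tuple
lowest, highest = result          # ...which can be unpacked into names
nothing = log_only('no return statement here')

print(type(result), lowest, highest, nothing)
```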

## Packages
Python has evolved over a long period of time, and continues to evolve as new *packages* for specialist functionality are developed. Packages are the building blocks of Python. Many commonly used packages are pre-installed in popular Python distributions, and you can install new packages and manage existing ones using the **pip** and **conda** utilities. You run these utilities from a command line or shell, not from Python code. In Jupyter notebooks you can prefix a line with **!** to run it as a shell command, as shown in the following cell:

In [None]:
# Show pip usage help
!pip -h

print('\n-------------\n')

# Show conda usage help
!conda -h

print('\n-------------\n')

# Use pip to install the numpy package
!pip install numpy
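
Before installing a package, you may want to check from Python whether it is already available. A minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `is_installed` is made up for this example:

```python
import importlib.util

# Hypothetical helper: True if a top-level package or module can be found
def is_installed(package_name):
    return importlib.util.find_spec(package_name) is not None

for name in ['os', 'numpy']:
    print(name, 'available:', is_installed(name))
```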

### Importing package libraries
After installing a package, you can import the modules it contains into your Python code to use them. The following cell imports **numpy**, assigning it the alias **np**, and uses it to display the value of &pi;.

In [None]:
import numpy as np

print(np.pi)

You can also use the **from**...**import** construct to import specific functions or submodules from a package:

In [None]:
from numpy import pi

print(pi)
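
Both import styles work the same way for modules in the standard library. As a small illustration, here they are applied to the built-in **math** module; the alias `m` is an arbitrary choice:

```python
# Import a whole module under an alias
import math as m

# Import specific names directly into the current namespace
from math import pi, sqrt

print(m.pi == pi)   # both names refer to the same constant
print(sqrt(16))
```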

## Using numpy
The **numpy** library is at the core of most machine learning code in Python, so it's worth spending a little time getting to know some of its key objects and functions. As you may have guessed, numpy is specifically designed to work with ***num***bers in ***py***thon; and in particular with vectors and matrices, which are implemented as numpy *arrays*.

In [None]:
# Define a vector as an array
v = np.array([1,2,3])
print(v)

# Define a matrix as a 2D array
M = np.array([[3,1,4],
              [2,5,1],
              [4,6,2]])
print('\n',M)

# Calculate the dot product (Mv)
print('\n',np.dot(M,v))

Note that:
* **v** is a one-dimensional array containing three elements.
* **M** is a two-dimensional array containing three arrays, which each contain three elements.
* You can use numpy to perform linear algebra - in this case, calculating the dot-product of a matrix and a vector. Linear algebra is the foundation for many machine learning operations, which is one reason why numpy is used so extensively in machine learning code.
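
To see what the dot product computes, the sketch below repeats the calculation row by row in plain Python and compares it with numpy's result. It also uses the `@` operator, which for these arrays gives the same result as `np.dot`. The names `by_hand` and `with_operator` are illustrative only:

```python
import numpy as np

v = np.array([1, 2, 3])
M = np.array([[3, 1, 4],
              [2, 5, 1],
              [4, 6, 2]])

# Each output element is the sum of products of one row of M with v
by_hand = [int(sum(m * x for m, x in zip(row, v))) for row in M]

# The @ operator performs matrix multiplication on numpy arrays
with_operator = M @ v

print(by_hand)
print(with_operator)
```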

### Numpy Arrays and Lists
Numpy arrays are conceptually similar to lists, and you can exchange data between these two structures easily. For example, run the following code to load a two-dimensional (nested) list into a numpy array, and then load the first row of the array into a list:

In [None]:
l = [[2,6,5],[4,2,3]]
print(type(l))
print(l)
print('\n')

a = np.array(l)
print(type(a))
print(a)
print('\n')

l2 = list(a[0])
print(type(l2))
print(l2)

You may be wondering why we should bother with numpy arrays when we have lists. Well, now that we've got the same data in a list and a numpy array, we can see how they differ in behavior:

In [None]:
# Add two lists
print ('l + l2:\n', l + l2)

# Add an array to a list
print ('\na + l2:\n', a + l2)

Adding two lists just concatenates their elements; but when you add a numpy array to a list, the list is implicitly treated as an array and elementwise addition is performed (here, the three-element list is added to each row of the array). Fundamentally, numpy arrays are designed with linear algebra in mind, whereas lists are simply general-purpose multi-element types.
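
The `*` operator shows the same contrast. A minimal sketch with illustrative names, where multiplying a list repeats it and multiplying an array scales each element:

```python
import numpy as np

values = [2, 6, 5]
arr = np.array(values)

repeated = values * 2   # list: the sequence is repeated
doubled = arr * 2       # array: each element is multiplied by 2

print(repeated)
print(doubled)
```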

### Manipulating Array Shapes
Arrays are inherently n-dimensional, and provide a lot of flexibility for viewing and manipulating their shape.

In [None]:
# View a and its shape
print (a)
print (a.shape)

# reshape a
b = np.reshape(a, (3,2))
print('\n',b)
print(b.shape)

# flatten a
r = a.ravel()
print('\n',r)
print(r.shape)
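
A reshape only succeeds when the new shape holds exactly the same number of elements. The sketch below recreates the array so it can run on its own. It passes `-1` for one dimension so that numpy infers its size, and then shows that an incompatible shape raises a `ValueError`. The name `reshape_failed` is illustrative only:

```python
import numpy as np

a = np.array([[2, 6, 5],
              [4, 2, 3]])

# -1 asks numpy to infer that dimension: 6 elements / 3 rows = 2 columns
b = a.reshape(3, -1)
print(b.shape)

# 4 x 2 would need 8 elements, but a only has 6
try:
    a.reshape(4, 2)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('Cannot reshape:', err)
```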

## Working with the File System
You'll commonly need to work with files and folders in the file system, and Python includes libraries that enable you to do this as shown in the following code:

In [None]:
import os, shutil

# We'll work with a folder named "myfolder" in the folder containing this notebook
myfolder = 'myfolder'

# delete it if it already exists
if os.path.exists(myfolder):
    print("Deleting existing folder..\n")
    shutil.rmtree(myfolder)
    
# Create the folder and two subfolders
os.makedirs(myfolder)
os.makedirs(os.path.join(myfolder, "subfolder1"))
os.makedirs(os.path.join(myfolder, "subfolder2"))

# Create some text files
i = 0
for root, folders, filenames in os.walk(myfolder):
    with open(os.path.join(root, "file" + str(i) + ".txt"), mode="w") as newfile:
        newfile.write("This is text file " + str(i))
    i += 1
    
# List the items in the folder (a text file and two subfolders)
folder_items = os.listdir(myfolder)
for item in folder_items:
    print(os.path.join(myfolder, item))
            
print("\n")

# Loop recursively through the folder hierarchy, reading each file
for root, folders, filenames in os.walk(myfolder):
    for file in filenames:
        file_path = os.path.join(root,file)
        print(file_path)
        with open(file_path,mode="r") as txtfile:
            txt = txtfile.read()
        print("\t",txt)
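
The standard library's **pathlib** module offers an object-oriented alternative to `os.path`. The following is a rough sketch of the same kind of create, list, and read steps. It works in a temporary directory so that nothing is left behind, and the folder and file names are chosen only for illustration:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / 'myfolder'

    # Create the folder and a subfolder in one call
    (base / 'subfolder1').mkdir(parents=True)

    # Write a text file inside the subfolder
    note = base / 'subfolder1' / 'file0.txt'
    note.write_text('This is text file 0')

    # Recursively find every .txt file below base
    found = sorted(p.relative_to(base).as_posix() for p in base.rglob('*.txt'))
    content = note.read_text()

print(found)
print(content)
```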

## Debugging
Python includes a debugger that you can use to interactively stop code at breakpoints, examine variables, and step through your code. This can be useful as you write more complex Python code, and reduces the need to include lots of `print` statements to monitor variable values.

In the following code, the line `import pdb; pdb.set_trace()` sets a breakpoint. When you run this cell, execution will pause at the breakpoint and an interactive textbox will be displayed, enabling you to examine variables and step through the code.

1. Run the cell below, and when the breakpoint is hit, enter `i` in the textbox - this displays the value of the variable **i**.
2. Enter `c` in the text box to *continue* - the code will run until the breakpoint is hit again. Enter `i` to confirm that the value of **i** has incremented by one.
3. Enter `n` to move to the *next* line (`i_squared = square(i)`).
4. Enter `s` to *step* into the next line - this takes you into the **square** function.
5. For *help*, in the form of a full list of debugging commands, enter `h` in the text box.
6. Enter `c` again to continue. This should take you to the end of the code (because the next increment of **i** takes it beyond the bounds of the loop).

In [None]:
def square(x):
    x_squared = x**2
    return x_squared


i = 1
while i < 3:
    import pdb; pdb.set_trace() # set a breakpoint
    i_squared = square(i)
    print(i_squared)
    i += 1
    

You can also use the debugger to step through code in an entire cell by adding the `%%debug` *magic* command to the beginning of the cell.
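
Debugger commands can also be scripted rather than typed, which is one way to try them outside a notebook. The sketch below assumes that feeding commands through the `stdin` argument of `pdb.Pdb` is acceptable for a quick experiment. It runs `p i` (print the value of **i**) and then `c` (continue). The names `commands`, `output`, and `transcript` are illustrative only:

```python
import io
import pdb

# Commands the debugger will read instead of waiting for keyboard input
commands = io.StringIO('p i\nc\n')
output = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=output, nosigint=True, readrc=False)

i = 12345
debugger.set_trace()   # pause here, run the scripted commands, then continue
i += 1

# Everything the debugger wrote, including the value printed by "p i"
transcript = output.getvalue()
print(transcript)
```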

## Learn More
Hopefully, this short primer has given you a taste of working with Python. If you need a more thorough introduction, you could try the Python tutorial at http://docs.python.org/3.6/tutorial/