

# Python Coding Club

This series introduces Python as a programming language for data analysis, scientific computing and plotting. It is by __no means comprehensive__ but will provide a basis for further investigation and exploration into this powerful language. These notes are __best viewed__ in a __Jupyter Notebook__ and I encourage you to __follow along__ in your __preferred IDE__ (this will all make sense after Part 1).

# Part 1: Introduction and getting started

Part 1 in this series will introduce what Python is and show you the ways in which you can write and execute Python scripts on your machine.

## Part 1.1: Introduction to Python

### *What is Python?*

Python is an *__object-oriented__*, __‘high-level’ programming language__ created by Guido van Rossum and first released in 1991. Other high-level languages include __C++, R and Java__.

A ‘high-level’ language (HLL) is a language that is more __similar to human language__ than machine language. HLLs are typically __machine independent__ and can therefore run on a variety of hardware.

Once written, a Python script is __compiled__ into an intermediate form called bytecode, which the Python __interpreter__ then executes. Unlike the script itself, the instructions that finally run on the processor are machine specific __(1s and 0s, or bits)__.


### _What is object-oriented programming?_

__Object-oriented programming languages__ (like Python) consider every variable or function as a __specific object__. This means that every object in a Python script has __methods__ (functions) and __attributes__ (variables) associated with it. 

This type of programming is very powerful as it is __very logical__ and can lead to __modular readable code__ that can be reused without having to retype code.

New __types__ of object can be made using the `class` keyword. More on objects and classes later.
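As a small illustration of methods, attributes and the `class` keyword, the sketch below first inspects two built-in objects and then defines a new type. The names `Sample`, `label` and `describe` are made up for this example.

```python
# Even a simple piece of text is an object with a type and methods.
greeting = "hello"
print(type(greeting))      # the type (class) of the object
print(greeting.upper())    # a method: a function attached to the object

# A complex number carries attributes (data attached to the object).
z = 3 + 4j
print(z.real, z.imag)

# A new type of object is defined with the class keyword.
class Sample:
    def __init__(self, label):
        self.label = label            # attribute

    def describe(self):               # method
        return "Sample " + self.label

s = Sample("A1")
print(s.describe())
```

Do not worry about the details of the `class` block yet; the point is only that methods and attributes are reached with a dot after the object's name.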

### *Why Python?*

Python is __very powerful__ as it is __easy to read and heavily supported__ via open source libraries and modules (more about this later).

Python is particularly useful for __scientific computing__ as it is easy to learn due to its __clear syntax__ as well as offering many __mathematical libraries for data analysis__ that are __well documented__ on their respective websites.

Python itself comes with many useful __'modules'__ for computing but where its usefulness really shines is through the __open source packages/modules__ that have been written by software developers for use in __other programs__. The full library of these modules is called the __Python Package Index__ or __PyPI__ for short and can be found [here](https://pypi.org/). Modules and packages will be covered more in a later section. 

### _What are Python modules, packages and libraries?_

One of Python's main appeals is how well supported the language is. This includes __open source__, __well documented__ Python scripts called __modules__. __Modules__ are simply Python scripts containing __functions and/or classes__ for use in programs (more on functions and classes later).

Groups of modules are called __libraries__ or __packages__. Common libraries include NumPy for numerical Python programming, SciPy for scientific computing, Pandas for data analysis and matplotlib for publication quality data plotting.

To gain access to a particular module or package, an `import` statement is required in your Python program file. More on this later.

Python comes prepackaged with base modules such as `math`, `statistics`, `sys`, `os`, `collections` and `datetime` in Python's 'standard library'. It is worth having a look through [Python's documentation](https://docs.python.org/3/library/) on what modules the standard library contains.
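As a quick preview, here is a minimal sketch of importing two standard library modules and calling one function from each. The variable names `root` and `average` are just for illustration.

```python
import math          # mathematical functions
import statistics    # basic statistics

root = math.sqrt(16)                    # square root
average = statistics.mean([2, 4, 6])    # arithmetic mean of a list

print(root)
print(average)
```

A function inside a module is reached with a dot after the module's name, for example `math.sqrt`.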

Apart from modules, Python comes preloaded with built-in functions that can be used without importing any new modules. More on this later.
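For instance, the short sketch below uses the built-in functions `len`, `sum`, `max` and `print` with no `import` at all. The list `readings` is example data made up for illustration.

```python
readings = [3, 8, 5]        # example data

count = len(readings)       # how many items
total = sum(readings)       # add them up
largest = max(readings)     # largest item

print(count, total, largest)
```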

### Python guiding principles

Python's design philosophy encourages __code readability__, and a list of _guiding design principles_ for writing Python code has been established, called _'The Zen of Python'_:
>The Zen of Python, by Tim Peters<br>
<br>
Beautiful is better than ugly.<br>
Explicit is better than implicit.<br>
Simple is better than complex.<br>
Complex is better than complicated.<br>
Flat is better than nested.<br>
Sparse is better than dense.<br>
Readability counts.<br>
Special cases aren't special enough to break the rules.<br>
Although practicality beats purity.<br>
Errors should never pass silently.<br>
Unless explicitly silenced.<br>
In the face of ambiguity, refuse the temptation to guess.<br>
There should be one-- and preferably only one --obvious way to do it.<br>
Although that way may not be obvious at first unless you're Dutch.<br>
Now is better than never.<br>
Although never is often better than *right* now.<br>
If the implementation is hard to explain, it's a bad idea.<br>
If the implementation is easy to explain, it may be a good idea.<br>
Namespaces are one honking great idea -- let's do more of those!<br>

This essentially means that __*simple code is better than complex code but complex code is better than complicated code*__ and if the implementation of how your code works is difficult to explain __it is probably too complex__.

The purpose of Python is to be easily readable and easily understandable by other programmers.

These guiding principles of programming in Python can also be easily retrieved by running the following cell (__select the cell and press Shift+Enter__):

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


## Part 1.2: Setting up

### Installing Python

In order to begin writing and executing Python scripts, first you will need to install Python on your computer by following [this link](https://www.python.org/downloads/).

Choose the correct download for your machine and make sure to __download Python 3.x (current version)__ as Python 2.7 is no longer supported.

Once downloaded, follow the installer on your machine to install Python.

### Running Python in the Integrated Development and Learning Environment (IDLE)

Once installed, navigate to the __installation location on your machine__ and launch __'IDLE'__, the Integrated Development and Learning Environment. This will open a __Python 'shell'__ which simply means an interactive Python interpreter that will __execute each line as it is typed__.

Try typing the following code into IDLE to print __'Hello World'__ underneath.

In [2]:
print('Hello World')

Hello World


__Congratulations on running your first Python program!__

Printing 'Hello World' is typically a __first step__ when using a new programming language simply to check that the __installation has been successful__.

### Running Python in the command prompt (Windows) or terminal (macOS)

Python can also be accessed by opening either Command Prompt on Windows or Terminal on macOS and typing __python3__ (on Windows the command may be __python__ or __py__ instead, depending on how Python was installed).

This will open a Python interpreter in the terminal through which the same `print('Hello World')` should run.
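Once the interpreter is open, a quick way to confirm that it is running Python 3.x is to ask the standard library `sys` module. This is only a small sketch; the exact version text printed will differ from machine to machine.

```python
import sys

major_version = sys.version_info.major   # 3 for any Python 3.x install

print(sys.version)                        # full version string
print("Running Python 3:", major_version == 3)
```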

### Integrated Development Environments (IDEs)

Running Python code line by line is useful to test that the installation is successful, but typical programs are hundreds or thousands of lines long. This is where Integrated Development Environments (IDEs) come in.

IDEs are applications where Python scripts can be written and executed. There are many benefits to IDEs including __line numbering__, __syntax highlighting__ and __code completion__.

- __Line numbering__ means each line of code is numbered. This can help with debugging when errors occur, as the line number causing the issue will be shown.

- __Syntax highlighting__ shows different Python objects/methods in separate colours, making it easier to visualise how your script will be interpreted. It can also flag syntax errors, which occur when a line of code cannot be interpreted by the Python interpreter.

- __Code completion__ provides suggestions on what you are going to type (a variable or function name etc.) automatically and can also show you what type of object the name is referring to (class, function, variable etc. more on this later...).

IDEs also display a 'traceback' when an error is raised in a program (i.e. it does not run correctly). The traceback shows the line number where the error occurred and what type of error it is, which can be very useful when debugging your programs.
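To see what a traceback looks like, the sketch below deliberately triggers an error and prints the report that Python produces. The function `divide` is a made-up example, and the standard library `traceback` module is used here only so the report can be captured as text; an IDE shows the same information automatically.

```python
import traceback

def divide(a, b):
    return a / b

try:
    divide(1, 0)                          # dividing by zero raises an error
except ZeroDivisionError:
    report = traceback.format_exc()       # the traceback as a string

print(report)
```

The last line of the report names the error type, and the lines above it give the file, line number and function where it happened.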

There are __many different IDEs__ out there and __different programmers will have different preferences__ on which IDE is the best, therefore it may be useful to __try a couple and see which you prefer__.

#### A few common IDEs include:
 - [PyCharm](https://www.jetbrains.com/pycharm/download/)
 - [Microsoft Visual Studio](https://visualstudio.microsoft.com/)
 - [Eclipse](https://www.eclipse.org/downloads/)
 - [Spyder (part of Anaconda package)](https://www.spyder-ide.org/)

__PyCharm is my IDE of choice__ as it contains a __variety of background themes, is customisable and provides the above features__. PyCharm can also provide an __educational introductory course on Python by installing the plugin 'EduTools'__.

In this series of notes I will explain how to __install PyCharm__ and also the __Anaconda Package__ which includes the __IDE Spyder and a link to Jupyter notebooks__ (where you may be reading this currently).

### Jupyter Notebooks

Jupyter notebooks are __interactive Python notebooks__ (in which some of this series is written).

They can be used to show a __workflow of data analysis__ as well as markdown text such as the text shown here. Jupyter notebooks as well as the __Spyder IDE come prepackaged in the Anaconda distribution__ mentioned above.

Jupyter notebooks are a useful tool when it comes to __showing specific reports of programs__. Each cell is executed by pressing __Shift+Enter__ when the __cell is selected__. Jupyter notebooks therefore have the ability to act as a __Python interpreter like IDLE__ but also as an __IDE as the code written is saved and can be referenced later__, for instance:

In [3]:
x = 5+8

In [4]:
x

13

Running the cell containing `x` before the cell `x = 5+8` will cause a `NameError` as `x` has not been assigned any value yet. Therefore the cell `x = 5+8` must be executed first, then the cell containing `x`. More on variable assignment and errors later...
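The sketch below shows what happens when a name is used before anything has been assigned to it. The variable name `undefined_total` is made up for this example, and `try`/`except` is used only so the error message can be printed without stopping the notebook.

```python
try:
    print(undefined_total)       # this name has never been assigned
except NameError as err:
    message = str(err)
    print("NameError:", message)
```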

A __Jupyter notebook__ can be started by opening __Anaconda Navigator__ after installation and launching __Jupyter Notebook__ from there.

Alternatively, a __Jupyter notebook__ can be started from the __command prompt__ (Windows) or __terminal__ (macOS) after installing __Anaconda__ by typing:

`jupyter notebook`

This will load a webpage that shows your local filesystem.

From here you can either start a __new__ notebook or __navigate__ to where the __.ipynb__ files are located on your machine. 

### Installing and setting up PyCharm

First navigate to the PyCharm downloads link (linked above) and download the correct package for your machine. Make sure to pick the __community version__ of PyCharm.

Next, install PyCharm by clicking on the __installer__ and following the dialogues.

Once PyCharm has installed, open the application. Make a __new folder__ somewhere for your scripts to go, then click __File > Open__ and navigate to that folder, and a new project should start.

Now click on PyCharm's __Preferences/Settings__ tab. Navigate to __"Project: _Project_name_"__ and click on __Project Interpreter__.

Select the Python 3.x interpreter, which should appear if you installed Python correctly before.

Test that everything is working by going to __File > New__, clicking on Python File and entering the name Hello, World.

In the first line of the file type `print('Hello, World')`. 

Either right-click and press Run or press the green 'Run' or 'Play' triangle in the top right of the screen. 

If everything is working correctly, `Hello, World` should have printed to the Python console at the bottom of the screen. 

### Anaconda Distribution

The __Anaconda distribution__ is a distribution of Python that comes pre-packaged with many useful libraries for data science and scientific computing including __SciPy, NumPy, Pandas, Scikit-Learn, TensorFlow, matplotlib__ and many others as well as the __Spyder IDE__, the statistical __programming language 'R'__ and __Jupyter notebooks__.

Therefore __Anaconda__ is a very useful tool to have even if your preferred IDE is not Spyder. Downloading and installing the __Anaconda__ package is very easy and simply requires navigating to [this link](https://www.anaconda.com/distribution/) and following the __installation instructions__.

Once __Anaconda__ is installed, open Anaconda and then launch Spyder. Write `print('Hello, World')` to test that everything was successful.

## Part 1.3: Support and documentation

### Python documentation

Part of the appeal of Python is that it is an __open source language__ and it is well documented. The documentation for __some base functions and uses__ of Python can be found [here](https://www.python.org/doc/).

I strongly advise taking note of where this documentation is and __having it to hand in future whilst programming__. 

### Package documentation

While using the many __packages/modules__ that have been written for use in Python programs (NumPy, Pandas, SciPy, matplotlib etc.) it is important to take note of the __documentation for any packages__ you are using. This should be your __first go-to__ if you find something in your code not working the way you expect it to when using a package.

Navigate to the respective package's website, search for the specific function/class you are trying to use and try to figure out what is not correct. The 'traceback' in your IDE should also provide some clues as to why your code is not working.
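Most functions also carry a short description that can be read from within Python itself, which can be a quick first check before visiting a website. The sketch below uses `math.sqrt` from the standard library as a stand-in for a package function; typing `help(math.sqrt)` in an interactive session prints similar information.

```python
import inspect
import math

doc = inspect.getdoc(math.sqrt)   # the function's built-in description
print(doc)
```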

### Stack Overflow

If all else fails and you cannot understand why your code is not working OR if you have a specific question about the best way to implement your code, [Stack Overflow](https://stackoverflow.com/) is your best friend. 

Stack Overflow is a __computing forum__ where various questions are asked regarding how to __implement__ all sorts of code into programs. Before asking a question, bear in mind that it is __highly__ likely that your question has already been answered, so a simple search on Stack Overflow may provide the answer you are looking for.

__Be aware__ when reading answers on Stack Overflow: the _exact_ implementation shown in the answer __will__ almost certainly need to be altered to fit your own __use case__.

Alternatively you can ask your own question. When doing so, make sure your query includes a [minimal working example](https://en.wikipedia.org/wiki/Minimal_working_example) or MWE. This means that your question is as direct as possible about the exact problem you are facing and what you expect the code to be doing, without including any extra complexities that make it difficult to reach the correct solution.
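As a rough idea of what an MWE might look like, the sketch below boils a made-up problem down to a tiny input, the single call in question, and the result that was expected. The data and variable names are hypothetical.

```python
# Problem: sorting these values does not give numerical order.
values = ["10", "9", "2"]          # small made-up input (numbers stored as text)

actual = sorted(values)            # the one call being asked about
expected = ["2", "9", "10"]        # what I hoped to get

print("actual:  ", actual)
print("expected:", expected)
```

Anyone can copy and run this directly, and the gap between `actual` and `expected` makes the question clear.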

In [1]:
print("Time to start coding!")

Time to start coding!
