# Programming for Chemists


The aim of this course is to provide you with the skills to utilise
computers in your own scientific work, from data management,
analysis and presentation to solving real scientific problems.

## Learning Outcomes

At the end of this course participants should be able to
demonstrate:

* Knowledge of the Python 3 programming language
* An understanding of key data types and structures in Python
* The skills to write and develop simple programs in Python
* The ability to import data from files for analysis and presentation
* The ability to detect errors in programs
* The ability to describe and document the results of a programming project

## Resources

1. Recommended textbook

    - Learning Scientific Programming with Python, C. Hill, 2016. Cambridge University Press. **Sussex Ebook link:** https://sussex-primo.hosted.exlibrisgroup.com/permalink/f/c622i2/44SUS_ALMA_DS51141926100002461

2. Supplementary reading

     - Python notes for professionals: https://books.goalkicker.com/PythonBook/

3. Recommended online resources

     - Official Python documentation: https://docs.python.org/3/

4. Mathematical/computer programming problems (good for applying what you learn, but some are very advanced)

     - Project Euler: https://projecteuler.net/

## Course Overview

1. Introduction and Jupyter Notebook
2. Key data types and uses
3. Basic programming structures
4. File input/output
5. Plotting
6. Basic NumPy and arrays
7. Peak finding and smoothing data
8. Mathematics with NumPy and SciPy

# Introduction to Programming for Chemists

**Q. Why learn to program as a chemist?**
* The world of chemistry is changing: laboratories are generating increasing quantities of digital data, and chemists need to be able to process, analyze, and visualize these data efficiently.
* An increasing number of chemistry and biochemistry related jobs specifically ask for programming experience.
* It opens more opportunities in different areas of science outside of your degree subject.
* It will make you a more efficient and effective scientist.

**Q. Why use the Python programming language?**

* It is a powerful, general-purpose programming language.
* It is a high-level language, meaning it takes care of low-level operations such as memory management for you.
* It has a large variety of data structures such as lists, tuples, dictionaries and sets.
* It can easily interface with lower-level languages such as `C`, `C++`, `Fortran` and `Rust`.
* It has a shallow learning curve with a clean and simple syntax. 
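
As a brief preview of the data structures mentioned above, the sketch below creates one of each. The variable names and values are invented purely for illustration.

```python
# All names and values below are illustrative assumptions
elements = ["H", "He", "Li"]          # list: ordered and can be changed
point = (1.0, 2.5)                    # tuple: ordered but cannot be changed
atomic_number = {"H": 1, "He": 2}     # dictionary: maps keys to values
unique_symbols = {"H", "He", "H"}     # set: duplicate entries are discarded

print(elements[0])           # first item of the list
print(point[1])              # second item of the tuple
print(atomic_number["He"])   # value stored under the key "He"
print(len(unique_symbols))   # number of distinct items in the set
```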

**Example:** Consider printing items from a shopping list using Python vs. using C. To run the Python code, click on the code cell, then hold <kbd>SHIFT</kbd> and press <kbd>ENTER</kbd>. This notebook cannot run the `C` code, as it only works with one kernel at a time.

**Python Syntax:**

In [None]:
shopping = ["Bread", "Oranges", "Soup", "Tea"]
for item in shopping:
    print(item)

**C Syntax:**

In [None]:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_STRING_LENGTH 20
#define NUMBER_OF_STRINGS 4
int main()
{
    int i;
    char shopping[NUMBER_OF_STRINGS][MAX_STRING_LENGTH+1];
    strcpy(shopping[0], "Bread");
    strcpy(shopping[1], "Oranges");
    strcpy(shopping[2], "Soup");
    strcpy(shopping[3], "Tea");
    for (i=0; i<NUMBER_OF_STRINGS; i++) {
        fprintf(stdout, "%s\n", shopping[i]);
    }
}

Python was designed to be a highly readable language: it has a relatively uncluttered visual layout and frequently uses English keywords where other languages use punctuation. Python aims to be simple and consistent in the design of its syntax, which is hopefully clear from the examples above.

## Jupyter Notebook

* The Jupyter notebook is an open document that can contain documentation, code, interactive elements, and the output of your Python code such as plots and numerical values.
* It is an interactive environment, ideal for data processing and visualisation and suitable for sharing with others.
* It is Free and Open Source Software (FOSS) and accessible anywhere, either online or downloadable for offline use.
* Each interactive session will take place at a computer using a Jupyter Notebook and any required data files.

# Getting Started with Python 3

Using Python in a Jupyter notebook is easy. Just type your Python code into a code cell, which looks like this:

In [1]:
# This is a code cell!

then hold <kbd>SHIFT</kbd> and press <kbd>ENTER</kbd> to run the code. You are **highly encouraged** to add code to the example code cells as you progress through the sessions, to experiment with what you are learning and to test any ideas you have.

### Hello World

A tradition when learning a new language is to print "Hello World!" to the screen, which we will now do. To print this in Python we call the `print` function and, inside its parentheses, write "Hello World!" in quotation marks, which mark the text as a string, one of Python's data types. To learn the syntax and how to run the code, type `print("Hello World!")` into a code cell, then run it.

Note the importance of the quotation marks in the above code example. Run the following code which does not include them:

In [None]:
print(Hello World)

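This is invalid syntax: without quotation marks, Python reads `Hello` and `World` as two names written side by side with nothing joining them, so it raises a `SyntaxError` before running any code. (A single unquoted word, as in `print(Hello)`, would instead raise a `NameError`, because no variable called `Hello` has been defined.) We will cover strings along with other key data types in the next session, but for now let's learn some Python basics.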

### Creating variables and assigning values
One of the most important things that we want to do is create **variables**, names that represent a quantity whose value can change. To do this in Python, we specify the variable name and then assign a value to it using the syntax `variable_name = value`. Let's assign the value 10 to the variable `x`:

In [9]:
x = 10 # Create the variable x

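Assignment always has the name on the left of `=` and the value on the right: Python evaluates the right-hand side and then binds the result to the name on the left. Reversing the order gives a syntax error: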

In [6]:
10 = x

SyntaxError: can't assign to literal (<ipython-input-6-ea1f1a64d427>, line 1)

There are strict rules for naming of variables:

1. Variable names must start with a letter or an underscore:

In [None]:
x = 10 # valid
_y = 10 # valid

9x = 10 # Invalid as starts with numeral

$y = False # Invalid as starts with symbol


2. The remainder of your variable name may consist of letters, numbers and underscores.

3. Names are case sensitive:

In [8]:
x = 9 # Define variable lower case x
y = X*5 # 'Accidentally' call upper case X 

NameError: name 'X' is not defined
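
The naming rules above can be checked without triggering an error. The sketch below uses the built-in string method `str.isidentifier()`; the list of candidate names, including `melting_point_2`, is an assumption made up for this example.

```python
# Candidate names: the first four come from the rules above,
# "melting_point_2" is a made-up example mixing letters, digits and underscores
candidates = ["x", "_y", "9x", "$y", "melting_point_2"]

for name in candidates:
    # isidentifier() is True when the text follows Python's naming rules
    print(name, name.isidentifier())

# Names are case sensitive, so x and X are two separate variables
x = 9
X = 2
print(x * 5, X * 5)
```

Note that `isidentifier()` only checks the characters used. Reserved words such as `for` pass this check but still cannot be used as variable names.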

When you use `=` to do an assignment, what's on the left of `=` is a **name** for the **object** on the right. `=` assigns the **reference** of the object on the right to the **name** on the left. That is:

```a_name = an_object # "a_name" is now a name for the reference to the object "an_object"```
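
One consequence of names being references is that two names can refer to the same object. The following is a minimal sketch of this, using a list with illustrative values (the variable names are assumptions for this example):

```python
masses = [1.008, 4.003]    # create a list object and name it "masses"
same_masses = masses       # a second name for the SAME object, not a copy

same_masses.append(6.94)   # change the object through the second name

print(masses)                  # the change is visible through the first name
print(masses is same_masses)   # "is" tests whether two names share one object
```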

You can assign multiple values to multiple variables in one line, but there must be the same number of names on the left of the `=` operator as values on the right:

In [12]:
x, y, z = 1, 2, 3
print(x, y, z)

1 2 3
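
Two short illustrations of multiple assignment follow: swapping two values in one line, and what happens when the number of names and values differ. The `try`/`except` block is used here only so that the example runs to completion instead of stopping at the error.

```python
x, y = 1, 2
x, y = y, x      # the right-hand side is evaluated first, so this swaps the values
print(x, y)

try:
    a, b = 1, 2, 3   # two names but three values
except ValueError as error:
    # Python reports the mismatch as a ValueError
    print("ValueError:", error)
```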


You can also assign a single value to several variables simultaneously:

In [13]:
x = y = z = 1
print(x, y, z)

1 1 1
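
After a chained assignment, all of the names refer to the same object. Assigning a new value to one of them afterwards only changes that one name, as this small sketch shows:

```python
x = y = z = 1
print(x is y and y is z)   # all three names refer to one object

x = 5                      # give x a new value; y and z are unaffected
print(x, y, z)
```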


### Comments
In the above examples you may have noticed `#` symbols followed by text. These are **comments**, a crucial and often underused feature of programming languages. Python ignores the text after the `#` up to the end of the line, so comments exist purely for human readers: they explain what each part of a program is doing. It is a good idea to write comments while you are writing or updating a program, because it is easy to forget your thought process later on, and comments added afterwards tend to be less useful. Comments are also greatly appreciated by anyone who uses or modifies your program, as they need an insight into your thought process.

Try to avoid W.E.T. comments, meaning you 'Wrote Everything Twice': comments that simply restate the code, such as:

In [None]:
def identity(a):  # hypothetical function, shown only so the cell runs
    return a  # Returns a

Your comments should be D.R.Y. (Don’t Repeat Yourself) and offer insight into how the code truly functions.
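
By way of contrast, here is a sketch of comments that add information the code does not already state, such as units and the reason for a calculation. The sample mass and the variable names are assumptions for illustration.

```python
mass_g = 5.85             # hypothetical sample mass, in grams
molar_mass_nacl = 58.44   # molar mass of NaCl, in g/mol

# Convert the weighed mass to an amount of substance in moles
moles = mass_g / molar_mass_nacl

print(round(moles, 4))
```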

