<CENTER><img src="images/logos.png" style="width:50%"></CENTER>

# Introduction to coding in Python

Being able to code is an essential skill for a Particle Physicist (or any scientist, for that matter).  Our datasets are simply too large to process without the assistance of computers!  An ATLAS Physicist typically uses some combination of the C++ and Python programming languages to accomplish everything from simulating proton-proton collisions to searching for Higgs bosons.

As being code-literate is a prerequisite to analysing ATLAS data, we will in this notebook review some of the basics of coding in Python.  We will do this by presenting a diluted and interactive version of the tutorial of the [official Python documentation][PyTutorial].  For more information on any topic, let the official Python documentation be your first port of call.  We will link to specific parts of the tutorial as we go along.

Python is extensively used by beginners and software engineers alike, for both business and pleasure.  It can be fun!  It is named after the BBC series "Monty Python's Flying Circus" and refers to its founder as a Benevolent Dictator For Life ([BDFL][BDFL]).

For the duration of the Co-Creation workshop, you will be accompanied by expert Pythonistas, so if you have any questions, do ask!

[PyTutorial]: <https://docs.python.org/3/tutorial/index.html>
[BDFL]: <https://docs.python.org/3/glossary.html>

> ## A note on Jupyter notebooks
>
> The web-based, interactive coding environment you are currently seeing is a [Jupyter notebook][Jupyter].  With Jupyter, you can edit and execute code in your browser, and write and edit text boxes to explain/document what you are doing as you go using the [Markdown][Markdown] language.
>
> To get started, memorise these [keyboard shortcuts][kbd]:
> * `Shift + Enter`: run cell
> * `Enter`: enter edit mode
> * `Esc`: exit edit mode (return to command mode)
>
> Give these commands a quick go by clicking on this text, hitting `Enter` to enter edit mode, then pressing `Shift + Enter` to 'run' the cell.  Feel free to make a change!  If you ever change something to the point of breaking, just reload the page to retrieve a fresh copy of the notebook.
>
> To learn more about using Jupyter notebooks, browse through the items of the menu above (Edit, View, Insert, etc.) and note what you can (and cannot) do.  As an extra perk, see that under Help, there is a series of links to Reference material for some of the most widely used Python libraries in science.

[Jupyter]: <https://jupyter-notebook.readthedocs.io/en/stable/index.html>
[Markdown]: <https://daringfireball.net/projects/markdown/>
[kbd]: <https://jupyter-notebook.readthedocs.io/en/stable/notebook.html#keyboard-shortcuts>

## Hello, World!

The ["Hello, World!" programme][HelloWorld] is a time-honoured tradition in Computer Science which will be respected here.  The idea of Hello World is to illustrate the basics of a language and to verify that the coding environment has been properly installed and set up.  So to test Python in this notebook, have a go at running the code of the next cell (`Shift + Enter`)...if it does what you expect, then you are good to go!

[HelloWorld]: <https://en.wikipedia.org/wiki/%22Hello,_World!%22_program>

In [None]:
print("Hello, World!")

## Numbers, Strings, and Compound Data Types

[Following [An Informal Introduction to Python](https://docs.python.org/3/tutorial/introduction.html)]

### Python as a calculator

Python is good at maths!  Run the following code cells to see that the operators `+`, `-`, `*` and `/` perform addition, subtraction, multiplication, and division.

In [None]:
2 + 2

In [None]:
50 - 5*6

In [None]:
(50 - 5*6) / 4

In [None]:
8 / 5

Python additionally provides a convenient power operator `**`.

In [None]:
2**7 # Power

Why do some of the numbers produced by these operations have decimal points, while others do not?  It is because we have here two _types_ of numbers: `int` types and `float` types.  The `int` type represents integer values.  The `float` type represents a [floating point number][fpn], which is a computer's finite-precision binary approximation of a real number.  Note that division with `/` always returns a `float`, even when the result is a whole number.
> If you are lucky, you will never have to worry about 'floating point precision', but it can be a significant consideration, with errors here having in the past caused [rockets to explode][Ariane]!

[fpn]: <https://en.wikipedia.org/wiki/Floating-point_arithmetic>
[Ariane]: <http://www-users.math.umn.edu/~arnold/disasters/ariane.html>
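
As a small sketch of the distinction between the two types, the built-in `type()` function reports the type of any value.  The floor-division operator `//` and the variable names used here are not introduced elsewhere in this notebook and serve purely as an illustration.

```python
# Division with `/` always produces a float, even when the result is whole
quotient = 8 / 4
print(quotient, type(quotient))

# Floor division with `//` on two ints produces an int
floor_quotient = 8 // 5
print(floor_quotient, type(floor_quotient))

# Floats have finite precision, so this sum is close to, but not exactly, 0.3
total = 0.1 + 0.2
print(total == 0.3)
print(abs(total - 0.3) < 1e-9)
```

Comparing floats with a small tolerance, as in the last line, is usually safer than testing them for exact equality.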

It is possible to assign a value to a variable using the `=` operator.

In [None]:
x = 4
x**2

We also have the handy in-place operators `+=`, `-=`, `*=` and `/=`, which update a variable using its current value.

In [None]:
y = 10
y += 2
print(y)

In [None]:
z = 2
z *= 2
z -= 1
z /= 2
print(z)

### Strings

The Python string (type `str`) is a sequence of characters enclosed in quotation marks (`'...'` or `"..."`).  Strings respond to the `+` and `*` operators in a curious way (concatenation and repetition) and can be indexed as if they were lists of characters!

In [None]:
prefix = 'Py'
prefix + 'thon'

In [None]:
# 3 times 'un', followed by 'ium'
3 * 'un' + 'ium'

In [None]:
word = 'Python'

In [None]:
# Access the first character of the string which is indexed by 0
word[0]

In [None]:
# Access the last character of the string which is indexed by -1
word[-1]

In [None]:
# Slice the string from index 1 (inclusive) to 5 (not inclusive)
word[1:5]

### Compound Data Types

A Python list is a mutable, compound data type for grouping together a sequence of values.

In [None]:
# Here is an example list
nums = [1, 2, 3]

In [None]:
# Lists are mutable
nums[0] = 4
nums

In [None]:
# Lists can contain different data types
nums += ['a']
nums

In [None]:
# Lists can be 'sliced'
nums[1:3]

The [built-in][builtin-funcs] function [`len(s)`][len] returns the length of, or number of items in, a sequence or collection `s`.  One excellent example use case is to find which of the words ['Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch'][Llanfairpwll] and 'supercalifragilisticexpialidocious' is longer.

[builtin-funcs]: <https://docs.python.org/3/library/functions.html>
[len]: <https://docs.python.org/3/library/functions.html#len>
[Llanfairpwll]: <https://en.wikipedia.org/wiki/Llanfairpwllgwyngyll>

In [None]:
len_llanfair = len('Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch')
len_supercali = len('supercalifragilisticexpialidocious')

print(len_llanfair)
print(len_llanfair / len_supercali)

As compound data types, [_tuples_][tuples] and [_dictionaries_][dicts] are also frequently used.  Can you figure out what they do from the linked pages?  Feel free to make new code cells here to explore.

[tuples]: <https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences>
[dicts]: <https://docs.python.org/3/tutorial/datastructures.html#dictionaries>
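
As a starting point for that exploration, here is a minimal sketch of both types.  The variable names and the choice of particle masses (approximate values in GeV) are illustrative assumptions, not something used elsewhere in these notebooks.

```python
# A tuple is an immutable sequence: it can be indexed, but not modified
point = (3.0, 4.0)
x_coord, y_coord = point   # 'unpack' the tuple into two variables
print(x_coord, y_coord)

# A dictionary maps keys to values
# (approximate masses in GeV, for illustration only)
masses = {'electron': 0.000511, 'muon': 0.106}
masses['tau'] = 1.777      # add a new key-value pair
print(masses['muon'])
print(len(masses))
```

Try assigning to `point[0]` in a new cell to see what happens when you attempt to modify a tuple.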

## Control Flow

In the preceding example code snippets (the mathematical and string operations, and list manipulations), we programmed our commands to be executed line-by-line.  It would be fair to say that these top-to-bottom programmes are quite dull.  A programme may be made to exhibit a more complex [control flow][flow] by use of [_control flow statements_][PFlow].  

As control flow statements in Python, there are conditional statements and loop constructs.  Conditional statements (`if`, `elif`, `else`) are used to execute blocks of code only when certain conditions are met.  Loop constructs are used to execute blocks of code some number of times (`for`) or while certain conditions are met (`while`).

[flow]: <https://en.wikipedia.org/wiki/Control_flow>
[PFlow]: <https://docs.python.org/3/tutorial/controlflow.html>

### `if` Statements

In [None]:
x = int(input('Please enter an integer:  '))

In [None]:
# Example conditional 'if' block
if x < 0:
    print('You entered a negative number!')
elif x == 0:
    print('You entered zero')
else:
    print('You entered a positive number')

### `for` Statements

`for` statements in Python allow you to iterate over the items of any sequence (like a list or string) in order.

In [None]:
# Measure some strings in a `for` loop
words = ['cat', 'window', 'defenestrate', 'quark']
for w in words:
    print(w, len(w))

In conjunction with `for` statements, the built-in [`range()`][range] function is often useful.  It returns a range object, constructed by calling `range(stop)` or `range(start, stop[, step])`, that represents a sequence of numbers that goes from `start` (0 by default) up to, but not including, `stop` in steps of `step` (1 by default).

[range]: <https://docs.python.org/3/library/stdtypes.html#range>

In [None]:
range(10)

In [None]:
list(range(10))

In [None]:
list(range(0, 10, 2))

In [None]:
# Example 'for' loop over a range that pushes items onto a list
items = []
for i in range(10):
    items.append(i)
print(items)

### `while` Statements

A `while [condition]` loop executes for as long as `condition` is true.  If `condition` is _always_ true, then you have an infinite loop: a loop that will never end.  In Jupyter, you can interrupt a cell by clicking the stop button in the menu above, or by pressing `Esc` to enter command mode and then double-tapping `i` on your keyboard.  If you choose to run the following cell, it is up to you to interrupt it!  Can you think of a more interesting use of `while`?

In [1]:
while True:
    # Do nothing
    pass

KeyboardInterrupt: 

In [4]:
# A more interesting while loop: calculate the Fibonacci series!
a, b = 0, 1
while a < 1000:
    print(a, end=' ')
    a, b = b, a + b

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 

## Functions

What if we want to use some block of code multiple times and in different places?  We could simply copy-and-paste that block of code every time we want to use it, but there is a better way!  We can wrap the block of code in a [_function_][functions] and 'call' that function as many times as we like.

In the preceding section, we calculated all the terms of the Fibonacci sequence that are less than 1000.  By making a function `fibonacci(n)` of our Fibonacci code, we could provide the upper limit as a parameter `n` of the function and calculate the series up to many different values of `n`!

[functions]: <https://docs.python.org/3/tutorial/controlflow.html#defining-functions>

In [None]:
def fibonacci(n):
    '''Calculate and print out the terms of the Fibonacci series 
    that are less than `n`.'''
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a + b
    print()
    
    return

In [None]:
# Print the terms of the Fibonacci series that are less than n = 10
fibonacci(10)

In [None]:
# Print the terms of the Fibonacci series that are less than n = i for each i < 25!
for i in range(25):
    fibonacci(i)

Did you notice the `return` statement in the definition of the `fibonacci` function?  It did nothing!  (A bare `return` simply ends the function and hands back the special value `None`.)  But we can in general use the `return` statement to return (i.e. to _pass_) information from inside a function to outside.  Consider the following update of the original `fibonacci` function.  It returns a list of the terms of the Fibonacci series, which may be more useful than printing them!

In [None]:
def return_fibonacci_series(n):
    '''Calculate the terms of the Fibonacci series 
    that are less than `n`, returning a list of the result.'''
    
    # Make a list called 'series' to store the terms
    series = []
    
    # Calculate the terms up to n
    a, b = 0, 1
    while a < n:
        series.append(a)
        a, b = b, a + b
        
    # Return the series
    return series
        

Let's check to see if this function behaves as we would expect.  We will do that by calling it with `n = 100`, and then by operating on the returned list.

In [None]:
result = return_fibonacci_series(100)

# Print the result...
print(result)

# Reverse the result and print it, just for fun...
reversed_result = list(reversed(result))
print(reversed_result)

# Do anything you like with the Fibonacci series!
# . . .

When we printed the Fibonacci series in a `for` loop, we ended up printing the same series many times.  By using the returned list of the updated Fibonacci function, we could now print the series only if it differs from the previous series!  The next example illustrates how to implement this, and is logically as complicated as it gets...

In [None]:
# Variable to hold the currently-largest term
largest_term = -1

for n in range(10000):
    # Call the updated Fibonacci function that returns a list of terms
    series = return_fibonacci_series(n)
    
    # If the series contains terms (`if series` checks that `series` is not empty = [])
    if series:
        
        # If the largest term of this series is larger than the largest term seen so far
        series_largest_term = series[-1]
        if series_largest_term > largest_term:
            
            # Print the series
            for term in series:
                print(term, end=' ')
            print()
            
            # Update the largest term
            largest_term = series_largest_term

So you see in this last example that the coding techniques that we have learned in this notebook enable us to write some really quite complex programmes!

You will meet functions repeatedly throughout the course of these notebooks. Whenever you execute code that has the signature `function(...)`, you are calling a function!  Furthermore, in the notebook on _'Searching for the Higgs boson'_, you will in fact get to write your own functions to help you along the way to observing the Higgs boson...

## Modules

In this notebook, we have been writing small snippets of disposable code and executing them, before moving on and forgetting about them.  When it comes to writing a more elaborate programme, it is more convenient to put your code in a file.  When a file is populated with Python definitions, it becomes a [_module_][module] which can be _imported_ from other Python-speaking files such that its content may be used.

The module-oriented approach to software development has the effect of keeping your code organised, but more importantly facilitates code-sharing.  In the world of open-source and free software, much of the code you will ever need to write has already been written and is available to use!  One rarely has to code everything from scratch.
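
As a minimal sketch of what importing looks like, the cell below uses the `math` module from Python's standard library.  The `math` module is chosen here only as a convenient illustration; it is not otherwise used in these notebooks.

```python
# Import the whole module, then access its contents with a dot
import math
print(math.sqrt(16))

# Alternatively, import only the names you need
from math import pi
print(pi)
```

Both forms make code written by somebody else available in your own programme; which one to use is largely a matter of readability.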

### The `ROOT` library

The [ROOT][ROOT] library is very useful to a Particle Physicist.  It will be used extensively in this course of notebooks, and you and ROOT will become close allies in data analysis.  In the [words of the maintainers][AboutROOT],
> "ROOT is a framework for data processing, born at CERN, at the heart of the research on high-energy physics."

Here, we introduce you to it.  Start by importing it.

[module]: <https://docs.python.org/3/tutorial/modules.html>
[ROOT]: <https://root.cern/>
[AboutROOT]: <https://root.cern/about/>

In [None]:
import ROOT

Let's use ROOT to generate ten random numbers by way of the `TRandom3` class, just to show that it works.

In [None]:
# Construct an instance of the TRandom3 class and call it 'randg' (a RANDom number Generator)
randg = ROOT.TRandom3()

# Generate 10 random numbers
for i in range(10):
    print(randg.Rndm())

We were here able to generate random numbers without having to programme a random number generator for ourselves.  It was so easy because there is already a random number generator implemented in ROOT.  So with ROOT at our fingertips, we saved a lot of time!

ROOT is not the only library that is available to us.  Other popular libraries include
* [`numpy`][numpy] for numerical computing
* [`matplotlib`][matplotlib] for data visualisation
* [`tensorflow`][tensorflow] for machine learning
* [`pandas`][pandas] for data manipulation

These libraries of modules are there to be used at no cost.  But this is a look ahead!  In these notebooks, we make use of ROOT and numpy only.

[numpy]: <https://numpy.org/>
[matplotlib]: <https://matplotlib.org/>
[tensorflow]: <https://www.tensorflow.org/>
[pandas]: <https://pandas.pydata.org/>

# Conclusion

In this notebook, we have very quickly gone from zero to sixty at coding in Python!  We have been working in a Jupyter notebook where we can run code interactively and write text to annotate what we are doing.  After saying `'Hello, World!'`, we learned how to do maths with Python and how to use strings and compound data types.  By using control flow statements, we saw that we can write quite complex programmes, which we can organise into functions and modules for convenience and shareability!

On all of these features, we were brief so that we can get on to more interesting topics quickly.  To that end, we omitted or glossed over many details and many technicalities - there is much more to learn!  But independently learning how to do something new can be a part of the fun and is certainly a part of the job.  When coding, it is normal to not immediately know how to do something.

In using Python, you are a part of a large global community.  This means that the internet is full of advice on how to write Python code well.  Describing your problem to a search engine more often than not brings up a solution straight away.  Make use of what other Pythonistas know about Python!

Good luck analysing ATLAS Open Data!  We hope that this _Introduction to coding in Python_ will serve you well.