<a href="https://colab.research.google.com/github/sigvehaug/Introduction-to-Python-Programming-For-Medical-Researchers/blob/master/6-Basic-Python-2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<!--BOOK_INFORMATION-->
<img align="left" style="padding-right:10px;" src="https://github.com/sigvehaug/Introduction-to-Python-Programming-For-Medical-Researchers/blob/master/Course/fig/cover-small.jpg?raw=1">

*This notebook contains an excerpt from the [Whirlwind Tour of Python](http://www.oreilly.com/programming/free/a-whirlwind-tour-of-python.csp) by Jake VanderPlas; the content is available [on GitHub](https://github.com/jakevdp/WhirlwindTourOfPython).*

*The text and code are released under the [CC0](https://github.com/jakevdp/WhirlwindTourOfPython/blob/master/LICENSE) license; see also the companion project, the [Python Data Science Handbook](https://github.com/jakevdp/PythonDataScienceHandbook).*


Introduction to Python Programming for Medical Researchers, University of Bern, Sigve Haug

# Basic Python 2 (60 min)

This notebook is a systematic and very condensed overview of
- Python Control Flow
- Python Functions
- Python Strings and Regular expressions
- Python File and Operating System Operations
- Python File Reading and Writing
- Python Modules and Packages

The content is basic and is part of what every Python programmer needs to know. You do not have to learn it by heart; after some hours of practicing Python, it becomes active knowledge on its own.

It corresponds to chapters 7, 8, 13, 14 and 15 of the book referenced above, where you can find more detailed descriptions. We do not cover the chapters on Errors and Exceptions, Iterators, List Comprehensions, and Generators here; you may read them in the book.

## Control Flow

Control flow is where the rubber really meets the road in programming. Without it, a program is simply a list of statements that are sequentially executed. With control flow, you can execute certain code blocks conditionally and/or repeatedly: these basic building blocks can be combined to create surprisingly sophisticated programs!

Here we'll cover conditional statements (including "if", "elif", and "else") and loop statements (including "for" and "while" and the accompanying "break", "continue", and "pass").

### Conditional Statements: if-elif-else

In [1]:
x = -15
if x == 0:
    print(x, "is zero")
elif x > 0:
    print(x, "is positive")
elif x < 0:
    print(x, "is negative")
else:
    print(x, "is unlike anything I've ever seen...")

-15 is negative


### for Loops

In [5]:

for N in [2, 3, 5, 7]:
    print(N, end=' ') # print all on same line
print()
for i in range(10):
    print(i, end=' ')
print()
for i in range(6,60,5):
    print(i, end=' ') 

2 3 5 7 
0 1 2 3 4 5 6 7 8 9 
6 11 16 21 26 31 36 41 46 51 56 

### while Loops

In [6]:
i = 0
while i < 10:
    print(i, end=' ')
    i += 1


0 1 2 3 4 5 6 7 8 9 

### break and continue

In [7]:
for n in range(20):
    # if the remainder of n / 2 is 0, skip the rest of the loop
    if n % 2 == 0:
        continue
    print(n, end=' ')

1 3 5 7 9 11 13 15 17 19 

In [8]:
a, b = 0, 1
amax = 100
L = []

while True:
    (a, b) = (b, a + b)
    if a > amax:
        break
    L.append(a)

print(L)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]


In [9]:
# Loop with an else block
L = []
nmax = 30

for n in range(2, nmax):
    for factor in L:
        if n % factor == 0:
            break
    else: # no break
        L.append(n)
print(L)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
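The else block attached to a loop is easy to misread, so here is a minimal sketch of the same idea in a simple search: the else branch runs only when the loop finishes without hitting break. The function name first_even and the sample lists are made up for illustration.

```python
def first_even(numbers):
    """Return the first even number in numbers, or None if there is none."""
    for n in numbers:
        if n % 2 == 0:
            result = n
            break          # leaving the loop via break skips the else block
    else:
        result = None      # reached only if the loop never hit break
    return result

found = first_even([3, 7, 8, 11])
missing = first_even([3, 7, 11])
print(found, missing)
```

Reading the else as "no break" (as in the comment of the cell above) is usually the easiest way to remember what it does.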


## Defining and Using Functions

So far, our scripts have been simple, single-use code blocks. One way to organize our Python code and to make it more readable and reusable is to factor out useful pieces into functions. Here we'll cover two ways of creating functions: the def statement, useful for any type of function, and the lambda expression, useful for creating short anonymous functions.

In [10]:
# Define/write a function calculating the fibonacci numbers
def fibonacci(N):
    L = []
    a, b = 0, 1
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L
# Call that function
fibonacci(10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

In [11]:
def real_imag_conj(val):
    return val.real, val.imag, val.conjugate()

r, i, c = real_imag_conj(3 + 4j)
print(r, i, c)

3.0 4.0 (3-4j)


In [14]:
# Default arguments
def fibonacci(N, a=0, b=1):
    L = []
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

print(fibonacci(10))
print(fibonacci(10,3,1))
print(fibonacci(10, b=3, a=1))


[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
[1, 4, 5, 9, 14, 23, 37, 60, 97, 157]
[3, 4, 7, 11, 18, 29, 47, 76, 123, 199]


### \*args and **kwargs: Flexible Arguments

Sometimes you might wish to write a function in which you don't initially know how many arguments the user will pass. In this case, you can use the special forms *args and **kwargs to catch all arguments that are passed. Here is an example:

In [15]:
def catch_all(*args, **kwargs):
    print("args =", args)
    print("kwargs = ", kwargs)

catch_all(1, 2, 3, a=4, b=5)
catch_all('a', keyword=2)

args = (1, 2, 3)
kwargs =  {'a': 4, 'b': 5}
args = ('a',)
kwargs =  {'keyword': 2}
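The same star syntax also works in the other direction, when calling a function: a sequence prefixed with * is expanded into positional arguments and a dictionary prefixed with ** into keyword arguments. The sketch below illustrates this; the function name collect and the variables inputs and keywords are chosen only for this example.

```python
def collect(*args, **kwargs):
    # Return what was caught so that it can be inspected afterwards
    return args, kwargs

inputs = (1, 2, 3)
keywords = {'pi': 3.14}

# Equivalent to calling collect(1, 2, 3, pi=3.14)
caught_args, caught_kwargs = collect(*inputs, **keywords)
print("args =", caught_args)
print("kwargs =", caught_kwargs)
```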



### Anonymous (lambda) Functions

Earlier we quickly covered the most common way of defining functions, the def statement. You'll likely come across another way of defining short, one-off functions: the lambda expression. It looks something like this:

In [16]:

add = lambda x, y: x + y
add(1, 2)

3
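A typical place where a lambda is handy is the key argument of the built-in sorted function, which expects a function that returns the value to sort by. The following is a small sketch; the records list is invented sample data.

```python
# Invented sample data: a list of dictionaries
records = [
    {'name': 'Ada', 'age': 62},
    {'name': 'Ben', 'age': 35},
    {'name': 'Cleo', 'age': 48},
]

# Sort by age (ascending); the lambda extracts the sort key from each record
by_age = sorted(records, key=lambda record: record['age'])

# Sort by name in reverse alphabetical order
by_name_desc = sorted(records, key=lambda record: record['name'], reverse=True)

for record in by_age:
    print(record['name'], record['age'])
```

The same result could be obtained with a named function defined with def; the lambda simply avoids naming a function that is used only once.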

## String Manipulation and Regular Expressions

One place where the Python language really shines is in the manipulation of strings. This section will cover some of Python's built-in string methods and formatting operations, before moving on to a quick guide to the extremely useful subject of regular expressions. Such string manipulation patterns come up often in data science work and are one big perk of Python in this context.

Strings in Python can be defined using either single or double quotations (they are functionally equivalent):

In [17]:
x = 'a string'
y = "a string"
x == y

True

In [18]:
# Multiline strings
multiline = """
one
two
three
"""


With this, let's take a quick tour of some of Python's string manipulation tools.

In [20]:
fox = "tHe qUICk bROWn fOx."
print(fox.upper())
print(fox.lower())
print(fox.title())
print(fox.capitalize())
print(fox.swapcase())

THE QUICK BROWN FOX.
the quick brown fox.
The Quick Brown Fox.
The quick brown fox.
ThE QuicK BrowN FoX.


In [29]:
fox = "       tHe   qUICk bROWn fOx."
print(fox)
print(fox.strip())
print(fox.lstrip())
print(fox.rstrip())
print(fox.rstrip('.'))
print(fox.replace('Ox','XX'))

       tHe   qUICk bROWn fOx.
tHe   qUICk bROWn fOx.
tHe   qUICk bROWn fOx.
       tHe   qUICk bROWn fOx.
       tHe   qUICk bROWn fOx
       tHe   qUICk bROWn fXX.


There are many more convenient string methods. You can look them up in the online documentation.
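As a small taste of those further methods, the sketch below shows a few that are commonly used for searching and joining: find, startswith, endswith, and join. The sample sentence is only an illustration.

```python
line = 'the quick brown fox jumped over a lazy dog'

position = line.find('fox')       # index of the first match
absent = line.find('bear')        # find returns -1 when nothing matches
starts = line.startswith('the')   # True if the string begins with 'the'
ends = line.endswith('cat')       # False here, the string ends with 'dog'

# join glues a list of strings together with the given separator
joined = '--'.join(['1', '2', '3'])

print(position, absent, starts, ends, joined)
```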

In [36]:
# Split and work on substrings
haiku = """matsushima-ya
aah matsushima-ya
matsushima-ya"""
print(haiku,'\n')  # \n is the code for newline, i.e. this statement prints an empty line after the string
lines = haiku.splitlines()
i=0
for line in lines:
    i+=1
    words = line.split()
    print('Line',i,'has', len(words), 'words.')

matsushima-ya
aah matsushima-ya
matsushima-ya 

Line 1 has 1 words.
Line 2 has 2 words.
Line 3 has 1 words.



### Format Strings

In the preceding methods, we have learned how to extract values from strings, and to manipulate strings themselves into desired formats. Another use of string methods is to manipulate string representations of values of other types. Of course, string representations can always be found using the str() function; for example:

In [44]:
pi = 3.14159
print("The value of pi is " + str(pi))
print("pi = %.4f" % pi)               # % formatting, old style
print("pi = {0:.3f}".format(pi))      # format formatting
f'pi = {pi:.3f}'                      # fstring formatting

The value of pi is 3.14159
pi = 3.1416
pi = 3.142


'pi = 3.142'

This style of format specification is very flexible, and the examples here barely scratch the surface of the formatting options available. For more information on the syntax of these format strings, see the Format Specification section of Python's online documentation.
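To give a rough idea of what else the format specification can do, here is a short sketch with a few commonly used options: field width, alignment, thousands separators, and percentages. The variable names are chosen for this example only.

```python
pi = 3.14159
label = 'pi'

padded = f'{pi:10.2f}'        # 2 decimals, right-aligned in a field of width 10
left = f'{label:<6}|'         # left-aligned in a field of width 6
thousands = f'{1234567:,}'    # comma as thousands separator
percent = f'{0.256:.1%}'      # multiply by 100 and append a percent sign

print(repr(padded))
print(repr(left))
print(thousands)
print(percent)
```

The part after the colon follows the same rules in f-strings and in the format method, so "{0:10.2f}".format(pi) gives the same result as the first line.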

### Flexible Pattern Matching with Regular Expressions

The methods of Python's str type give you a powerful set of tools for formatting, splitting, and manipulating string data. But even more powerful tools are available in Python's built-in regular expression module. Regular expressions are a huge topic; there are entire books written on it (including Jeffrey E.F. Friedl’s Mastering Regular Expressions, 3rd Edition), so it will be hard to do it justice within just a single subsection.

My goal here is to give you an idea of the types of problems that might be addressed using regular expressions, as well as a basic idea of how to use them in Python. I'll suggest some references for learning more in Further Resources on Regular Expressions.

Fundamentally, regular expressions are a means of flexible pattern matching in strings. If you frequently use the command line, you are probably familiar with this type of flexible matching with the "*" character, which acts as a wildcard. For example, we can list all the IPython notebooks (i.e., files with extension .ipynb) with "Python" in their filename by using the "*" wildcard to match any characters in between.

In [45]:
# Just one example with regular expressions
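As one possible example, the sketch below uses Python's built-in re module to split a string on whitespace, locate and replace a substring, and pull email-like strings out of a text. The sample sentences are chosen for illustration, and the email pattern is deliberately naive: it would miss many valid addresses.

```python
import re

line = 'the quick brown fox jumped over a lazy dog'

# \s+ matches one or more whitespace characters
words = re.split(r'\s+', line)

# search returns a match object for the first match, or None if there is none
match = re.search(r'fox', line)
start = match.start()

# sub replaces every match of the pattern with the given string
replaced = re.sub(r'fox', 'BEAR', line)

# findall returns all non-overlapping matches as a list of strings
text = 'To email Guido, try guido@python.org or the older address guido@google.com.'
emails = re.findall(r'\w+@\w+\.[a-z]{3}', text)

print(words)
print(start)
print(replaced)
print(emails)
```

For simple cases like the first three, the string methods split, find, and replace would do the same job; regular expressions pay off once the pattern is more flexible than a fixed substring, as in the last case.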

### Further Resources on Regular Expressions

The above discussion is just a quick (and far from complete) treatment of this large topic.
If you'd like to learn more, I recommend the following resources:

- [Python's ``re`` package Documentation](https://docs.python.org/3/library/re.html): I find that I promptly forget how to use regular expressions just about every time I use them. Now that I have the basics down, I have found this page to be an incredibly valuable resource to recall what each specific character or sequence means within a regular expression.
- [Python's official regular expression HOWTO](https://docs.python.org/3/howto/regex.html): a more narrative approach to regular expressions in Python.
- [Mastering Regular Expressions (O'Reilly, 2006)](http://shop.oreilly.com/product/9780596528126.do) is a 500+ page book on the subject. If you want a really complete treatment of this topic, this is the resource for you.

For some examples of string manipulation and regular expressions in action at a larger scale, see [Pandas: Labeled Column-oriented Data](15-Preview-of-Data-Science-Tools.ipynb#Pandas:-Labeled-Column-oriented-Data), where we look at applying these sorts of expressions across *tables* of string data within the Pandas package.

## File and Operating System Operations


In a notebook, filesystem operations can be carried out by executing a normal shell command preceded by an exclamation mark. Here is how this may look on a Linux machine.

In [46]:
! ls -l

total 4
drwxr-xr-x 1 root root 4096 Aug 31 13:18 sample_data


Python has a library, the [operating system module (os)](https://docs.python.org/3/library/os.html), which keeps the code independent of the operating system and returns its results as ordinary Python objects that the rest of your code can use directly.

In [51]:
import os
cwd = os.getcwd()        # Returns the path to the current working directory.
os.mkdir(os.path.join(cwd, 'data'))   # Creates a single directory; raises FileExistsError if it already exists (os.makedirs also creates intermediate directories).
os.chdir('data')         # Changes the current working directory to the given path.
print(os.listdir())    # Returns the list of entries in the current directory (omitting '.' and '..')
os.chdir('..')
print(os.listdir())
os.rename('data','data-new') # Renames a file or directory from old to new.
print(os.listdir())

[]
['data', 'data-new']
['data-new']
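Note that os.mkdir creates only one directory level and raises FileExistsError if the target already exists. When you want nested directories, or want to rerun a cell without errors, os.makedirs with exist_ok=True is one option. The sketch below shows the difference; it works inside a temporary directory (via the standard tempfile module) so that it leaves nothing behind, and the directory names are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    nested = os.path.join(base, 'data', 'raw')

    # Creates 'data' and 'data/raw' in one go
    os.makedirs(nested, exist_ok=True)
    # A second call is harmless because of exist_ok=True
    os.makedirs(nested, exist_ok=True)

    # os.mkdir, in contrast, fails if the directory is already there
    try:
        os.mkdir(nested)
        mkdir_failed = False
    except FileExistsError:
        mkdir_failed = True

    entries = os.listdir(base)

print(entries, mkdir_failed)
```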


In [53]:
# More stuff
path = 'data-new'
print(os.path.exists(path))   # Returns True if path exists.
print(os.path.isdir(path))    # Returns True if path is a directory.
print(os.path.isfile(path))   # Returns True if path is a regular file.
print(os.path.basename(path)) # Returns the base name (the part after the last ‘/’ character)
print(os.path.dirname(path))  # Returns the directory name (the part before the last / character).
print(os.path.abspath(path))  # Make path absolute (i.e., start with a /).

True
True
False
data-new

/content/data/data-new
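The path used above has no directory part, which is why dirname prints an empty line. The following sketch applies the same functions to a longer path, built with os.path.join so that the correct separator is used on every operating system, and adds os.path.splitext to separate the file extension. The path components are hypothetical; the file does not need to exist for these functions to work.

```python
import os

# Build a path from its parts; the separator depends on the operating system
path = os.path.join('data-new', 'results', 'patients.csv')

folder = os.path.dirname(path)               # everything before the last separator
filename = os.path.basename(path)            # the part after the last separator
stem, extension = os.path.splitext(filename) # split off the extension

print(path)
print(folder)
print(filename, stem, extension)
```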


The shell utilities module [shutil](https://docs.python.org/3/library/shutil.html) is also very useful.

In [61]:
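import shutil as sh
sh.copy('../sample_data/README.md','./')  # Copies the file into the current directory.
sh.move('README.md','ReadMe.md')          # Moves the file; here this amounts to renaming it.
! ls -l
os.remove('ReadMe.md')                    # Deletes the file.
! ls -l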

total 8
drwxr-xr-x 2 root root 4096 Sep  2 14:54 data-new
-rwxr-xr-x 1 root root  930 Sep  2 15:11 ReadMe.md
total 4
drwxr-xr-x 2 root root 4096 Sep  2 14:54 data-new
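The cell above depends on the sample_data folder that Colab provides. As a self-contained variant, here is a sketch that runs the same copy, move, and remove steps inside a temporary directory; the file names are made up, and the small text file is created with open, which is introduced in the next section.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Create a small file to work with
    source = os.path.join(base, 'notes.txt')
    with open(source, 'w') as f:
        f.write('hello')

    # copy and move both return the path of the new file
    copy_path = shutil.copy(source, os.path.join(base, 'copy.txt'))
    moved_path = shutil.move(copy_path, os.path.join(base, 'renamed.txt'))
    after_move = sorted(os.listdir(base))

    os.remove(moved_path)
    after_remove = sorted(os.listdir(base))

print(after_move)
print(after_remove)
```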


## Reading and Writing (text) Files
