<a href="https://colab.research.google.com/github/xMCTH/DSFMCTH/blob/main/03_Basic_Python_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<!--BOOK_INFORMATION-->
<img align="left" style="padding-right:10px;" src="https://github.com/sigvehaug/Introduction-to-Python-Programming-For-Medical-Researchers/blob/master/Course/fig/cover-small.jpg?raw=1">

*This notebook contains an excerpt from the [Whirlwind Tour of Python](http://www.oreilly.com/programming/free/a-whirlwind-tour-of-python.csp) by Jake VanderPlas; the content is available [on GitHub](https://github.com/jakevdp/WhirlwindTourOfPython).*

*The text and code are released under the [CC0](https://github.com/jakevdp/WhirlwindTourOfPython/blob/master/LICENSE) license; see also the companion project, the [Python Data Science Handbook](https://github.com/jakevdp/PythonDataScienceHandbook).*


Data Science Fundamentals for Chemists and Biochemists, University of Bern, Sigve Haug

# Basic Python 2 (60 min)

This notebook is a systematic and very condensed overview of
- Python Control Flow
- Python Functions
- Python Strings and Regular expressions
- Python File and Operating System Operations
- Python File Reading and Writing
- Python Modules and Packages

The content is basic and is necessary knowledge for any Python programmer. One does not learn it by heart; however, after some hours of practicing Python, it automatically becomes active knowledge.

It corresponds to chapters 7, 8, 13, 14 and 15 of the book referenced above, where you can find more detailed descriptions. We don't cover the chapters on Errors and Exceptions, Iterators, List Comprehensions and Generators here; you may read them in the book.

## Control Flow

Control flow is where the rubber really meets the road in programming. Without it, a program is simply a list of statements that are sequentially executed. With control flow, you can execute certain code blocks conditionally and/or repeatedly: these basic building blocks can be combined to create surprisingly sophisticated programs!

Here we'll cover conditional statements (including "if", "elif", and "else") and loop statements (including "for" and "while" and the accompanying "break", "continue", and "pass").

Practise by executing the code snippets yourself!

### Conditional Statements: if-elif-else

In [None]:

x = -15
if x == 0:
    print(x, "is zero")
elif x > 0:
    print(x, "is positive")
elif x < 0:
    print(x, "is negative")
else:
    print(x, "is unlike anything I've ever seen...")


-15 is negative


### for Loops

In [None]:
for N in [2, 3, 5, 7]:
    print(N, end=' ') # print all on same line
print()
for i in range(10):
    print(i, end=' ')
print()
for i in range(6,60,5):
    print(i, end=' ')

2 3 5 7 
0 1 2 3 4 5 6 7 8 9 
6 11 16 21 26 31 36 41 46 51 56 

### while Loops

In [None]:
i = 0
while i < 10:
    print(i, end=' ')
    i += 1

0 1 2 3 4 5 6 7 8 9 

### break and continue

In [None]:
for n in range(20):
    # if the remainder of n / 2 is 0, skip the rest of the loop
    if n % 2 == 0:
        continue

    print(n, end=' ')

1 3 5 7 9 11 13 15 17 19 

In [None]:
a, b = 0, 1
amax = 100
L = []

while True:
    (a, b) = (b, a + b)
    if a > amax:
        break
    L.append(a)

print(L)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]


In [None]:
L = []
nmax = 30

for n in range(2, nmax):
    for factor in L:
        if n % factor == 0:
            break
    else: # no break
        L.append(n)
print(L)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
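
The "pass" statement listed at the start of this section does not appear in the examples above. As a minimal sketch, with made-up readings that are not part of the course data: "pass" does nothing at all and only fills a spot where Python's syntax requires a statement, for example in a branch you intend to write later.

```python
# Hypothetical measurement values, chosen only for illustration
readings = [4.2, -1.0, 3.7, -0.5, 5.1]
valid = []

for value in readings:
    if value < 0:
        pass  # placeholder: nothing is done for negative values (yet)
    else:
        valid.append(value)

print(valid)
```

Unlike "continue", which jumps to the next iteration, "pass" simply lets execution carry on with whatever follows.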


## Defining and Using Functions

So far, our scripts have been simple, single-use code blocks. One way to organize our Python code and to make it more readable and reusable is to factor out useful pieces into reusable functions. Here we'll cover two ways of creating functions: the def statement, useful for any type of function, and the lambda statement, useful for creating short anonymous functions.

In [None]:
def real_imag_conj(val):
    return val.real, val.imag, val.conjugate()

r, i, c = real_imag_conj(3 + 4j)
print(r, i, c)

3.0 4.0 (3-4j)


In [None]:
def fibonacci(N, a=0, b=1):
    L = []
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

print(fibonacci(10))
print(fibonacci(10,3,1))
print(fibonacci(10, b=3, a=1))

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
[1, 4, 5, 9, 14, 23, 37, 60, 97, 157]
[3, 4, 7, 11, 18, 29, 47, 76, 123, 199]


### \*args and **kwargs: Flexible Arguments

Sometimes you might wish to write a function in which you don't initially know how many arguments the user will pass. In this case, you can use the special forms \*args and \*\*kwargs to catch all arguments that are passed. Here is an example:

In [None]:
def catch_all(*args, **kwargs):
    print("args =", args)
    print("kwargs = ", kwargs)

catch_all(1, 2, 3, a=4, b=5)
catch_all('a', keyword=2)

args = (1, 2, 3)
kwargs =  {'a': 4, 'b': 5}
args = ('a',)
kwargs =  {'keyword': 2}
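
The \* and \*\* syntax can also be used in the other direction, when calling a function: a sequence is expanded into positional arguments and a dictionary into keyword arguments. A minimal sketch, in which the names "inputs" and "keywords" are chosen only for illustration and the function returns its arguments purely so that the result can be inspected:

```python
def catch_all(*args, **kwargs):
    print("args =", args)
    print("kwargs =", kwargs)
    return args, kwargs  # returned only so the result can be checked

inputs = (1, 2, 3)          # will be unpacked into positional arguments
keywords = {'pi': 3.14}     # will be unpacked into keyword arguments

result = catch_all(*inputs, **keywords)
```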



### Anonymous (lambda) Functions

Earlier we quickly covered the most common way of defining functions, the def statement. You'll likely come across another way of defining short, one-off functions with the lambda statement. It looks something like this:

In [None]:

add = lambda x, y: x + y
add(1, 2)

3
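
One situation where a lambda function is handy is when a function must be passed as an argument to another function. The following sketch, using a small list of example records, passes a lambda as the key argument of the built-in sorted function to choose what the records are sorted by:

```python
# Small example records, used only for illustration
data = [{'first': 'Guido', 'last': 'Van Rossum', 'YOB': 1956},
        {'first': 'Grace', 'last': 'Hopper', 'YOB': 1906},
        {'first': 'Alan', 'last': 'Turing', 'YOB': 1912}]

# Sort alphabetically by first name
by_name = sorted(data, key=lambda item: item['first'])
# Sort by year of birth
by_year = sorted(data, key=lambda item: item['YOB'])

print([item['first'] for item in by_name])
print([item['first'] for item in by_year])
```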

## String Manipulation and Regular Expressions

One place where the Python language really shines is in the manipulation of strings. This section will cover some of Python's built-in string methods and formatting operations, before moving on to a quick guide to the extremely useful subject of regular expressions. Such string manipulation patterns come up often in the context of data science work, and are one big perk of Python in this context.

Strings in Python can be defined using either single or double quotations (they are functionally equivalent):

In [None]:
 x = 'a string'
 y = "a string"
 x == y

True

In [None]:
 # Multiline strings
 multiline = """
 one
 two
 three
 """


With this, let's take a quick tour of some of Python's string manipulation tools.

In [None]:
 fox = "tHe qUICk bROWn fOx."
 print(fox.upper())
 print(fox.lower())
 print(fox.title())
 print(fox.capitalize())
 print(fox.swapcase())

THE QUICK BROWN FOX.
the quick brown fox.
The Quick Brown Fox.
The quick brown fox.
ThE QuicK BrowN FoX.


In [None]:
 fox = "       tHe   qUICk bROWn fOx."
 print(fox)
 print(fox.strip())
 print(fox.lstrip())
 print(fox.rstrip())
 print(fox.rstrip('.'))
 print(fox.replace('Ox','XX'))

       tHe   qUICk bROWn fOx.
tHe   qUICk bROWn fOx.
tHe   qUICk bROWn fOx.
       tHe   qUICk bROWn fOx.
       tHe   qUICk bROWn fOx
       tHe   qUICk bROWn fXX.


There are many more convenient string methods. You can look them up in the online documentation.
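
As a small sample of these further methods, the sketch below searches within a string, tests its beginning and end, and joins substrings back together. The example sentence is arbitrary.

```python
line = 'the quick brown fox jumped over a lazy dog'

print(line.find('fox'))         # index of the first match
print(line.find('bear'))        # -1 when the substring is absent
print(line.startswith('the'))   # True
print(line.endswith('cat'))     # False

words = line.split()            # split on whitespace
joined = '--'.join(words[:3])   # join the first three words with a separator
print(joined)
```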

In [None]:
# Split and work on substrings
haiku = """matsushima-ya
aah matsushima-ya
matsushima-ya"""
print(haiku, '\n')  # \n is the code for a newline, i.e. this statement prints an empty line after the string
lines = haiku.splitlines()
for i, line in enumerate(lines, start=1):  # count the lines starting from 1
    words = line.split()
    print('Line', i, 'has', len(words), 'words.')

matsushima-ya
aah matsushima-ya
matsushima-ya 

Line 1 has 1 words.
Line 2 has 2 words.
Line 3 has 1 words.



### Format Strings

In the preceding methods, we have learned how to extract values from strings, and to manipulate strings themselves into desired formats. Another use of string methods is to manipulate string representations of values of other types. Of course, string representations can always be found using the str() function; for example:

In [None]:
pi = 3.14159
print("The value of pi is " + str(pi))
print("pi = %.4f" % pi)               # % formatting, old style
print("pi = {0:.3f}".format(pi))      # format formatting
f'pi = {pi:.3f}'                      # fstring formatting

The value of pi is 3.14159
pi = 3.1416
pi = 3.142


'pi = 3.142'

This style of format specification is very flexible, and the examples here barely scratch the surface of the formatting options available. For more information on the syntax of these format strings, see the Format Specification section of Python's online documentation.
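
To give a slightly wider impression of the format specification, here is a minimal sketch that combines alignment, field width, fixed precision and scientific notation in one f-string. The variable names and values are illustrative only and are not taken from the course data.

```python
name = 'ethanol'     # illustrative sample name
mass = 46.068        # approximate molar mass in g/mol
conc = 0.000123      # made-up concentration in mol/L

# <10   : left-align in a field of width 10
# >8.2f : right-align in width 8, fixed-point with 2 decimals
# .2e   : scientific notation with 2 decimals
row = f'{name:<10}|{mass:>8.2f}|{conc:.2e}'
print(row)
```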

### Flexible Pattern Matching with Regular Expressions

The methods of Python's str type give you a powerful set of tools for formatting, splitting, and manipulating string data. But even more powerful tools are available in Python's built-in regular expression module. Regular expressions are a huge topic; there are entire books written on the topic (including Jeffrey E.F. Friedl’s Mastering Regular Expressions, 3rd Edition), so it will be hard to do justice within just a single subsection.


In [None]:
# Just one example with regular expressions
import re
email = re.compile(r'\w+@\w+\.[a-z]{3}')  # raw string, so that \w and \. reach the regex engine unchanged
text = "To email Guido, try guido@python.org or the older address guido@google.com."
email.findall(text)

['guido@python.org', 'guido@google.com']
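
Compiled patterns offer more than findall. As a sketch building on the same example text, the code below replaces every match with the sub method and uses parentheses to define groups, so that findall returns the individual parts of each address. The replacement string and the variable names are arbitrary choices.

```python
import re

text = "To email Guido, try guido@python.org or the older address guido@google.com."

# Replace every address with a placeholder
email = re.compile(r'\w+@\w+\.[a-z]{3}')
masked = email.sub('--@--.--', text)
print(masked)

# Parentheses define groups: user, domain and suffix
email_parts = re.compile(r'(\w+)@(\w+)\.([a-z]{3})')
parts = email_parts.findall(text)
print(parts)
```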

### Further Resources on Regular Expressions

If you'd like to learn more, I recommend the following resources:

- [Python's ``re`` package Documentation](https://docs.python.org/3/library/re.html): I find that I promptly forget how to use regular expressions just about every time I use them. Now that I have the basics down, I have found this page to be an incredibly valuable resource to recall what each specific character or sequence means within a regular expression.
- [Python's official regular expression HOWTO](https://docs.python.org/3/howto/regex.html): a more narrative approach to regular expressions in Python.
- [Mastering Regular Expressions (O'Reilly, 2006)](http://shop.oreilly.com/product/9780596528126.do) is a 500+ page book on the subject. If you want a really complete treatment of this topic, this is the resource for you.

For some examples of string manipulation and regular expressions in action at a larger scale, see [Pandas: Labeled Column-oriented Data](15-Preview-of-Data-Science-Tools.ipynb#Pandas:-Labeled-Column-oriented-Data), where we look at applying these sorts of expressions across *tables* of string data within the Pandas package.

## File and Operating System Operations


Filesystem operations can be carried out by executing a normal shell command preceded by an exclamation mark. Here is how this may look on a Linux machine.

In [None]:
 ! ls -l

total 4
drwxr-xr-x 1 root root 4096 May 22 13:33 sample_data


Python has a library, the [operating system module (os)](https://docs.python.org/3/library/os.html), which makes the code independent of the operating system and keeps the interface to the rest of the Python code transparent.

In [None]:
import os
cwd = os.getcwd()                    # Returns the path to the current working directory.
os.mkdir(os.path.join(cwd, 'data'))  # Creates a single directory; raises FileExistsError if it already exists.
                                     # (os.makedirs also creates the intermediate-level directories.)
os.chdir('data')                     # Changes the current working directory to path.
print(os.listdir())                  # Returns the list of entries in the directory (omitting '.' and '..').
os.chdir('..')
print(os.listdir())
os.rename('data', 'data-new')        # Renames a file or directory from old to new.
print(os.listdir())

[]
['.config', 'data', 'sample_data']
['.config', 'data-new', 'sample_data']


In [None]:
 # More stuff
 path = 'data-new'
 print(os.path.exists(path))   # Returns True if path exists.
 print(os.path.isdir(path))    # Returns True if path is a directory.
 print(os.path.isfile(path))   # Returns True if path is a regular file.
 print(os.path.basename(path)) # Returns the base name (the part after the last ‘/’ character)
 print(os.path.dirname(path))  # Returns the directory name (the part before the last / character).
 print(os.path.abspath(path))  # Make path absolute (i.e., start with a /).

True
True
False
data-new

/content/data-new
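
Note that os.mkdir creates only one directory level and fails if the directory already exists. When nested directories are needed, or when a cell may be run more than once, os.makedirs with exist_ok=True is one option. The following sketch works inside a temporary directory, with made-up folder names, so that nothing is left behind on disk:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Build the path from its parts instead of concatenating strings with '/'
    nested = os.path.join(base, 'data', 'raw', 'run1')

    os.makedirs(nested, exist_ok=True)   # creates data/, data/raw/ and data/raw/run1/
    os.makedirs(nested, exist_ok=True)   # second call: no error, nothing changes

    created = os.path.isdir(nested)
    entries = os.listdir(os.path.join(base, 'data'))
    print(created, entries)
```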


The shell utilities module [shutil](https://docs.python.org/3/library/shutil.html) is also very useful.

In [None]:
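import shutil as sh
sh.copy('sample_data/README.md', './')  # Copies the file into the current working directory.
sh.move('README.md', 'ReadMe.md')       # Moves (here: renames) the file.
! ls -l
os.remove('ReadMe.md')                  # Deletes the file.
! ls -l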

## Reading and Writing (text) Files


There are many types of files, i.e. they have different formats. For data analysis, the Pandas library has methods for reading and writing common data files (csv, excel, json, html ...). There are other libraries for handling other types of files such as images and audio files. Quite often one needs to deal with plain text files in a data analysis process. Here we show how this can be done.

In [None]:
# Here we have some text to be saved to a file
mytext = """WORKSHOP HAIKU
translated by Éva Antal

Perhaps do not even touch it.
Just look at it, look at it,
until it becomes beautiful.

 

TEST QUESTION FOR EVERY DAY
translated by Éva Antal

Do you still see
what you look at, or you only
know: "there" "it" "is"?

 

FROM THE BEST OF INTENTIONS
translated by Gábor G. Gyukics and Michael Castro

fall asleep;
die the same way a child
bites into an apple.

 

MEETING
translated by Gábor G. Gyukics and Michael Castro

I plan it as a farewell

 

THE HAIKU
translated by Tamás Révbíró

in front of my feet
a bird sat, and then took flight.
Now I'm heavier.

 

AXIOM
translated by Tamás Révbíró

You should try and help
everything to be the way
it is anyway.

 

ECHO ON EPICTETUS
translated by Tamás Révbíró

Don't say, "I lost it",
about anything. Rather
say, "I gave it back".

 

AXIOM
translated by Tamás Révbíró

Parents and killers:
almost-innocent servants.
They just execute.

 

ZENsation
translated by Tamás Révbíró

Look, the snow gives body to the wind!

 

DISILLUSIONIST
translated by Tamás Révbíró

Why should I travel
when I can be a stranger
right here, standing still?"""

In [None]:
# Create a file with argument w and use it as an object
with open('Haikus.txt', 'w') as outstream:
    outstream.write(mytext)

In [None]:
 # Open the file and read the first two lines
 with open('Haikus.txt', 'r') as instream:
     print(instream.readline())
     print(instream.readline())

In [None]:
 # Read all the lines and print the first 6
 with open('Haikus.txt', 'r') as instream:
     textlines = instream.readlines()
 for i in range(6):
     print(textlines[i])

Now that we know how to read and write files, we can use the string methods or regular expressions to work on them.
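
As a minimal sketch of this combination, the code below writes a short stand-in text to a temporary file (so that it runs on its own, independently of Haikus.txt), loops over the file line by line and applies string methods to each line. The file name, the variable names and the explicit encoding argument are choices made for this illustration.

```python
import os
import tempfile

# A short stand-in for Haikus.txt, so the sketch is self-contained
sample = """THE HAIKU
translated by Tamás Révbíró

in front of my feet
a bird sat, and then took flight.
Now I'm heavier.
"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')
    with open(path, 'w', encoding='utf-8') as outstream:
        outstream.write(sample)

    n_words = 0
    translators = []
    with open(path, 'r', encoding='utf-8') as instream:
        for line in instream:           # a file object yields one line at a time
            line = line.strip()         # drop the trailing newline
            if line.startswith('translated by'):
                translators.append(line.replace('translated by ', ''))
            else:
                n_words += len(line.split())

print(translators, n_words)
```

Looping over the file object directly avoids loading all lines into memory at once, which matters for large data files.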

## Modules and Packages



One feature of Python that makes it useful for a wide range of tasks is the fact that it comes "batteries included" – that is, the Python standard library contains useful tools for a wide range of tasks. On top of this, there is a broad ecosystem of third-party tools and packages that offer more specialized functionality. Here we'll take a look at importing standard library modules and tools for installing third-party modules.

### Loading Modules: the import Statement
For loading built-in and third-party modules, Python provides the import statement. There are a few ways to use the statement.

In [None]:
 # Explicit 
 import math
 math.cos(math.pi)

In [None]:
 # Explicit by alias
 import numpy as np
 np.cos(np.pi)

In [None]:
 # Explicit import of contents from a module
 from math import cos, pi
 cos(pi)

In [None]:
 # Implicit import of module contents
 from math import *
 sin(pi) ** 2 + cos(pi) ** 2


### Importing from Python's Standard Library
Python's standard library contains many useful built-in modules, which you can read about fully in Python's documentation. Any of these can be imported with the import statement, and then explored using the help function.

You can find information on these, and many more, in the Python standard library documentation: https://docs.python.org/3/library/.

In [None]:
 import math
 help(math)

### Importing third-party modules

One of the things that makes Python useful, especially within the world of data science, is its ecosystem of third-party modules. These can be imported just as the built-in modules, but first the modules must be installed on your system. The standard registry for such modules is the Python Package Index (PyPI for short), found on the Web at http://pypi.python.org/. For convenience, Python comes with a program called pip (a recursive acronym meaning "pip installs packages"), which will automatically fetch packages released and listed on PyPI (if you use Python version 2, pip must be installed separately). For example, if you'd like to install the supersmoother package, all that is required is to type the following at the command line:

In [None]:
!pip install supersmoother

### Writing your own modules and packages

You may of course write your own Python modules and import them when needed. You can also package them and make them available to others. We don't cover this in this course.

# Exercise

- Write a function which takes the path and name of the CCD data file on your Google Drive, opens the file, reads the metadata at the beginning of the file and returns it as a string which you can print.
- Modify the function such that it saves the metadata in a file called "CCD-Data.txt" and the rest, i.e. only the data, in a file called "CCD-Data.csv".
- Write a script which counts the number of time stamps in the data file.