# Introduction to Python - Programming

## What is Python?

[Python](http://www.python.org/) is a modern, general-purpose, object-oriented, high-level programming language.

General characteristics of Python:

* **clean and simple language:** Easy-to-read and intuitive code, easy-to-learn minimalistic syntax, maintainability scales well with size of projects.
* **expressive language:** Fewer lines of code, fewer bugs, easier to maintain.

Technical details:

* **dynamically typed:** No need to define the type of variables, function arguments or return types.
* **automatic memory management:** No need to explicitly allocate and deallocate memory for variables and data arrays, which avoids most memory-leak bugs.
* **interpreted:** No need to compile the code. The Python interpreter reads and executes the Python code directly.

Advantages:

* The main advantage is ease of programming, minimizing the time required to develop, debug and maintain the code.
* Well-designed language that encourages many good programming practices:
 * Modular and object-oriented programming, good system for packaging and re-use of code. This often results in more transparent, maintainable and bug-free code.
 * Documentation tightly integrated with the code.
* A large standard library, and a large collection of add-on packages.

Disadvantages:

* Since Python is an interpreted and dynamically typed programming language, the execution of Python code can be slow compared to compiled, statically typed programming languages such as C and Fortran.
* Somewhat decentralized, with different environments, packages and documentation spread out over different places. This can make it harder to get started.

## What makes Python suitable for scientific computing?

<img src="images/optimizing-what.png" width="600">

* Python has a strong position in scientific computing: 
    * Large community of users, easy to find help and documentation.

* Extensive ecosystem of scientific libraries and environments
    * numpy: http://numpy.scipy.org - Numerical Python
    * scipy: http://www.scipy.org -  Scientific Python
    * matplotlib: http://www.matplotlib.org - graphics library

* Great performance due to close integration with time-tested and highly optimized codes written in C and Fortran:
    * blas, atlas blas, lapack, arpack, Intel MKL, ...

* Good support for 
    * Parallel processing with processes and threads
    * Interprocess communication (MPI)
    * GPU computing (OpenCL and CUDA)

* Readily available and suitable for use on high-performance computing clusters. 

* No license costs, no unnecessary use of research budget.

### Python environments

Python is not only a programming language, but often also refers to the standard implementation of the interpreter (technically referred to as [CPython](http://en.wikipedia.org/wiki/CPython)) that actually runs the Python code on a computer.

There are also many different environments through which the Python interpreter can be used. Each environment has different advantages and is suitable for different workflows. One strength of Python is that it is versatile and can be used in complementary ways, but this can be confusing for beginners, so we will start with a brief survey of Python environments that are useful for scientific computing.

### IPython

IPython is an interactive shell that addresses the limitations of the standard Python interpreter, and it is a workhorse for scientific use of Python. It provides an interactive prompt to the Python interpreter with greatly improved user-friendliness.

<!-- <img src="files/images/ipython-screenshot.jpg" width="600"> -->
<img src="images/ipython-screenshot.jpg" width="600">

Some of the many useful features of IPython include:

* Command history, which can be browsed with the up and down arrows on the keyboard.
* Tab auto-completion.
* In-line editing of code.
* Object introspection, and automatic extraction of documentation strings from Python objects like classes and functions.
* Good interaction with the operating system shell.
* Support for multiple parallel back-end processes, which can run on computing clusters or cloud services like Amazon EC2.


### IPython notebook

[IPython notebook](http://ipython.org/notebook.html) is an HTML-based notebook environment for Python, similar to Mathematica or Maple. It is based on the IPython shell, but provides a cell-based environment with great interactivity, where calculations can be organized and documented in a structured way.

<!-- <img src="files/images/ipython-notebook-screenshot.jpg" width="800"> -->
<img src="images/ipython-notebook-screenshot.jpg" width="800">

Although they use a web browser as the graphical interface, IPython notebooks are usually run locally, on the same computer that runs the browser. To start a new IPython notebook session, run the following command:

    $ ipython notebook

from a directory where you want the notebooks to be stored. This will open a new browser window (or a new tab in an existing window) with an index page where existing notebooks are shown and from which new notebooks can be created.

## Versions of Python

There are currently two versions of Python: Python 2 and Python 3. Python 3 will eventually supersede Python 2, but it is not backward-compatible with Python 2. A lot of existing Python code and packages have been written for Python 2, and it is still the most widespread version. For these lectures either version will be fine, but it is probably easier to stick with Python 2 for now, because it is more readily available via prebuilt packages and binary installers.

To see which version of Python you have, run
    
    $ python --version
    Python 2.7.3
    $ python3.2 --version
    Python 3.2.3

Several versions of Python can be installed in parallel, as shown above.
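
The same check can be made from inside a running program. The following is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original lecture.

```python
import sys
import platform

# sys.version_info is a named tuple: (major, minor, micro, releaselevel, serial)
major, minor = sys.version_info.major, sys.version_info.minor
print("Running Python {}.{}".format(major, minor))

# platform.python_version() returns the full version as a string
version_string = platform.python_version()
print(version_string)

# Comparing against a tuple is a common way to branch on the version
is_python3 = sys.version_info >= (3, 0)
print("Python 3?", is_python3)
```

This kind of check can be useful when the same code has to run under both Python 2 and Python 3.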


## Installation

### Conda

The best way to set up a scientific Python environment is to use the cross-platform package manager `conda` from Continuum Analytics. First download and install miniconda (http://conda.pydata.org/miniconda.html) or Anaconda (see below). Next, to install the required libraries for these notebooks, simply run:

    $ conda install ipython ipython-notebook spyder numpy scipy sympy matplotlib cython

This should be sufficient to get a working environment on any platform supported by `conda`.

## Python and module versions

Since there are several different versions of Python and each Python package has its own release cycle and version number (for example scipy, numpy, matplotlib, etc., which we installed above and will discuss in detail in the following lectures), it is important for the reproducibility of an IPython notebook to record the versions of all these different software packages. If this is done properly it will be easy to reproduce the environment that was used to run a notebook, but if not it can be hard to know what was used to produce the results in a notebook.

To encourage the practice of recording Python and module versions in notebooks, I've created a simple IPython extension that produces a table with version numbers of selected software components. I believe that it is good practice to include this kind of table in every notebook you create.

To install this IPython extension, use `pip install version_information`:

In [None]:
# you only need to do this once
!pip install --upgrade version_information

Collecting version_information
  Downloading https://files.pythonhosted.org/packages/ff/b0/6088e15b9ac43a08ccd300d68e0b900a20cf62077596c11ad11dd8cc9e4b/version_information-1.0.3.tar.gz
Building wheels for collected packages: version-information
  Building wheel for version-information (setup.py) ... done
  Created wheel for version-information: filename=version_information-1.0.3-cp36-none-any.whl size=3881 sha256=b7b41464f727440977a0af198d6cd04de76cabc1e8ebeae584aac59924884dd9
  Stored in directory: /root/.cache/pip/wheels/1f/4c/b3/1976ac11dbd802723b564de1acaa453a72c36c95827e576321
Successfully built version-information
Installing collected packages: version-information
Successfully installed version-information-1.0.3


In [None]:
%load_ext version_information

%version_information numpy, scipy, matplotlib, sympy, version_information

| Software | Version |
|---|---|
| Python | 3.6.9 64bit [GCC 8.4.0] |
| IPython | 5.5.0 |
| OS | Linux 4.19.112+ x86_64 with Ubuntu 18.04 bionic |
| numpy | 1.19.5 |
| scipy | 1.4.1 |
| matplotlib | 3.2.2 |
| sympy | 1.1.1 |
| version_information | 1.0.3 |

Thu Feb 04 12:41:35 2021 UTC
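
If the extension is not available, similar information can be gathered with the standard library alone. The sketch below is not the `version_information` extension itself. The helper name `collect_versions` is hypothetical, and the code assumes that a package exposes its version as `__version__`, which is a common convention but not a guarantee.

```python
import importlib
import platform


def collect_versions(module_names):
    """Return a dict mapping 'Python' and each module name to a version string."""
    versions = {"Python": platform.python_version()}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
            continue
        # Fall back to 'unknown' when the module has no __version__ attribute
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in collect_versions(["numpy", "scipy"]).items():
    print(name, version)
```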


## Launching the notebook

**On Windows:** press the Windows key + R, type `cmd` and press Enter, then type `jupyter notebook`. The notebook interface opens in the browser; click 'New' and choose a Python notebook.

**On Mac/Linux:** open a terminal and type `jupyter notebook`. The notebook interface opens in the browser; click 'New' and choose a Python notebook.

Note

- Autocomplete function and object names with the Tab key.
- Get help on functions, objects and methods by appending a `?` and pressing Shift-Enter.

## Python program files

* Python code is usually stored in text files with the file ending "`.py`":

        myprogram.py

* Every line in a Python program file is assumed to be a Python statement, or part thereof. 

    * The only exception is comment lines, which start with the character `#` (optionally preceded by an arbitrary number of white-space characters, i.e., tabs or spaces). Comment lines are usually ignored by the Python interpreter.


* To run our Python program from the command line we use:

        $ python myprogram.py

* On UNIX systems it is common to define the path to the interpreter on the first line of the program (note that this is a comment line as far as the Python interpreter is concerned):

        #!/usr/bin/env python

  If we do, and if we additionally set the script file to be executable, we can run the program like this (the `./` prefix is needed unless the script's directory is on the `PATH`):

        $ ./myprogram.py
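
As a sketch of how these pieces fit together, a complete program file might look as follows. The file name `myprogram.py` follows the text above; the function name `greeting` and its message are illustrative assumptions.

```python
#!/usr/bin/env python
# myprogram.py: a minimal stand-alone program (illustrative example)


def greeting():
    # Return the message instead of printing it, so it is easy to reuse
    return "Hello, world!"


# This guard runs the code only when the file is executed as a program,
# not when it is imported as a module
if __name__ == "__main__":
    print(greeting())
```

On UNIX systems the file can be made executable with `chmod +x myprogram.py`.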

### Example:

In [None]:
ls scripts/hello-world*.py

ls: cannot access 'scripts/hello-world*.py': No such file or directory


In [None]:
cat scripts/hello-world.py

cat: scripts/hello-world.py: No such file or directory


In [None]:
!python scripts/hello-world.py

python3: can't open file 'scripts/hello-world.py': [Errno 2] No such file or directory


## IPython notebooks

This file - an IPython notebook -  does not follow the standard pattern with Python code in a text file. Instead, an IPython notebook is stored as a file in the [JSON](http://en.wikipedia.org/wiki/JSON) format. The advantage is that we can mix formatted text, Python code and code output. It requires the IPython notebook server to run it though, and therefore isn't a stand-alone Python program as described above. Other than that, there is no difference between the Python code that goes into a program file or an IPython notebook.
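
Because a notebook is plain JSON, it can be inspected with the standard `json` module. The sketch below uses a small hand-written stand-in rather than a real notebook file, and it assumes the version 4 notebook layout with a top-level `cells` list whose entries carry a `cell_type` field.

```python
import json

# A hand-written stand-in for a notebook file (assumed nbformat 4 layout)
notebook_text = """
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["## Modules"]},
    {"cell_type": "code", "metadata": {}, "execution_count": null,
     "outputs": [], "source": ["import math"]}
  ]
}
"""

notebook = json.loads(notebook_text)

# Count the cells of each type
counts = {}
for cell in notebook["cells"]:
    counts[cell["cell_type"]] = counts.get(cell["cell_type"], 0) + 1

print(counts)
```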

## Modules

Most of the functionality in Python is provided by *modules*. The Python Standard Library is a large collection of modules that provides *cross-platform* implementations of common facilities such as access to the operating system, file I/O, string management, network communication, and much more.

To use a module in a Python program it first has to be imported. A module can be imported using the `import` statement. For example, to import the module `math`, which contains many standard mathematical functions, we can do:

In [None]:
import math

This includes the whole module and makes it available for use later in the program. For example, we can do:

In [None]:
x = math.cos(2 * math.pi)

print(x)

1.0


Alternatively, we can choose to import all symbols (functions and variables) in a module to the current namespace, so that we don't need to use the prefix "`math.`" every time we use something from the `math` module:

In [None]:
from math import *

x = cos(2 * pi)

print(x)

1.0


This pattern can be very convenient, but in large programs that include many modules it is often a good idea to keep the symbols from each module in their own namespaces, by using the `import math` pattern. This would eliminate potentially confusing problems with namespace collisions.

As a third alternative, we can choose to import only a few selected symbols from a module by explicitly listing which ones we want to import instead of using the wildcard character `*`:

In [None]:
from math import cos, pi

x = cos(2 * pi)

print(x)

1.0
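
To illustrate the namespace collisions mentioned above, here is a small sketch using the standard modules `math` and `cmath`, which both define a function called `sqrt`. The choice of these two modules is only an example.

```python
import math
import cmath

from math import sqrt
from cmath import sqrt  # silently replaces the sqrt imported from math

# The bare name now refers to cmath.sqrt, which accepts negative numbers
print(sqrt(-1))

# With the module prefix there is no ambiguity about which function is called
print(math.sqrt(4.0))
print(cmath.sqrt(-1))

# An alias keeps the prefix short while preserving the namespace
import math as m
print(m.cos(2 * m.pi))
```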


### Looking at what a module contains, and its documentation

Once a module is imported, we can list the symbols it provides using the `dir` function:

In [None]:
import math

print(dir(math))

['__doc__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc']


In [None]:
import pandas
print(dir(pandas))

['BooleanDtype', 'Categorical', 'CategoricalDtype', 'CategoricalIndex', 'DataFrame', 'DateOffset', 'DatetimeIndex', 'DatetimeTZDtype', 'ExcelFile', 'ExcelWriter', 'Float64Index', 'Grouper', 'HDFStore', 'Index', 'IndexSlice', 'Int16Dtype', 'Int32Dtype', 'Int64Dtype', 'Int64Index', 'Int8Dtype', 'Interval', 'IntervalDtype', 'IntervalIndex', 'MultiIndex', 'NA', 'NaT', 'NamedAgg', 'Panel', 'Period', 'PeriodDtype', 'PeriodIndex', 'RangeIndex', 'Series', 'SparseArray', 'SparseDataFrame', 'SparseDtype', 'SparseSeries', 'StringDtype', 'Timedelta', 'TimedeltaIndex', 'Timestamp', 'UInt16Dtype', 'UInt32Dtype', 'UInt64Dtype', 'UInt64Index', 'UInt8Dtype', '__Datetime', '__DatetimeSub', '__SparseArray', '__SparseArraySub', '__builtins__', '__cached__', '__doc__', '__docformat__', '__file__', '__git_version__', '__loader__', '__name__', '__numpy', '__package__', '__path__', '__spec__', '__version__', '_config', '_hashtable', '_is_numpy_dev', '_lib', '_libs', '_np_version_under1p16', '_np_version_under

And using the function `help` we can get a description of almost every function. Not all functions have docstrings, as they are technically called, but the vast majority of functions are documented this way.

In [None]:
help(math.log)

Help on built-in function log in module math:

log(...)
    log(x[, base])
    
    Return the logarithm of x to the given base.
    If the base not specified, returns the natural logarithm (base e) of x.



In [None]:
log(10)

2.302585092994046

In [None]:
log(10, 2)

3.3219280948873626

We can also use the `help` function directly on modules: Try

    help(math) 
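
Functions we write ourselves are documented in the same way. In the sketch below, the function `circle_area` is a hypothetical example; the string on the first line of its body is the docstring that `help` would display.

```python
import math


def circle_area(radius):
    """Return the area of a circle with the given radius."""
    return math.pi * radius ** 2


# The docstring is stored in the __doc__ attribute;
# help(circle_area) would show the same text
print(circle_area.__doc__)
print(circle_area(1.0))
```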

Some very useful modules from the Python standard library are `os`, `sys`, `math`, `shutil`, `re`, `subprocess`, `multiprocessing`, `threading`.

Complete lists of standard modules for Python 2 and Python 3 are available at http://docs.python.org/2/library/ and http://docs.python.org/3/library/, respectively.

## Variables and types

### Symbol names 

Variable names in Python can contain alphanumerical characters `a-z`, `A-Z`, `0-9` and some special characters such as `_`. Normal variable names must start with a letter or an underscore; they cannot start with a digit.

By convention, variable names start with a lower-case letter, and class names start with a capital letter.

In addition, there are a number of Python keywords that cannot be used as variable names. The Python 2 keywords are listed below; in Python 3, `exec` and `print` are no longer keywords, while `True`, `False`, `None` and `nonlocal` have been added:

    and, as, assert, break, class, continue, def, del, elif, else, except, 
    exec, finally, for, from, global, if, import, in, is, lambda, not, or,
    pass, print, raise, return, try, while, with, yield

Note: Be aware of the keyword `lambda`, which could easily be a natural variable name in a scientific program. But being a keyword, it cannot be used as a variable name.
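
The keyword list depends on the Python version, so it can be safer to ask the interpreter itself. A minimal sketch using the standard `keyword` module follows; the variable names `lam` and `lambda_` are only illustrative workarounds.

```python
import keyword

# keyword.kwlist holds the reserved words of the running interpreter
print(keyword.kwlist)

# keyword.iskeyword() tests a single name
print(keyword.iskeyword("lambda"))
print(keyword.iskeyword("lam"))

# Common workarounds: a shortened name or a trailing underscore
lam = 0.5
lambda_ = 0.5
```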

### Assignment



The assignment operator in Python is `=`. Python is a dynamically typed language, so we do not need to specify the type of a variable when we create one.

Assigning a value to a new variable creates the variable:

In [None]:
# variable assignments
x = 1.0
my_variable = 12.2

Although not explicitly specified, a variable does have a type associated with it. The type is derived from the value that was assigned to it.

In [None]:
type(x)

float

If we assign a new value to a variable, its type can change.

In [None]:
x = 1

In [None]:
type(x)

int

If we try to use a variable that has not yet been defined we get a `NameError`:

In [None]:
print(y)

NameError: name 'y' is not defined

### Fundamental Data Types

> #### Numeric types: int, long (Python 2 only), float, bool, complex; plus the special value None
> #### Compound types: strings, tuples, lists, dictionaries, sets

In [None]:
# integers
i = 1
type(i)

int

In [None]:
# float
f = 1.0
type(f)

float

In [None]:
# Int stored as string => int
int('124')

124

In [None]:
# Int to float
float(1234)

1234.0

In [None]:
# float to str
str(123.5)

'123.5'

In [None]:
# float to int
int(3.15)

3

In [None]:
# Adding a numeric value to a string

# Method 1: use '+' to concatenate
print('The first four digits of Pi are ' + str(3.142))

# Method 2: use .format()
print('The first four digits of Pi are {} and the value of e is {}'.format(3.142, 2.718))

The first four digits of Pi are 3.142
The first four digits of Pi are 3.142 and the value of e is 2.718


In [None]:
# Checking the types
print(isinstance(f, float))
print(isinstance(i, int))
print(type(i) == int)
print(type(f) == float)

True
True
True
True
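
The two ways of checking a type shown above do not always agree. The sketch below uses `bool`, which is a subclass of `int`, to show the difference; the variable names are illustrative.

```python
flag = True

# bool is a subclass of int, so isinstance accepts it as an int
print(isinstance(flag, int))
# An exact type comparison does not take subclasses into account
print(type(flag) == int)

# isinstance also accepts a tuple of types
value = 2.5
print(isinstance(value, (int, float)))
```

For this reason `isinstance` is usually the more flexible choice, while `type(...) ==` is appropriate only when an exact match is wanted.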


In [None]:
# boolean
b1 = True
b2 = False

type(b1)

bool

In [None]:
# complex numbers: note the use of `j` to specify the imaginary part
x = 1.0 - 1.0j
type(x)

complex

In [None]:
print(x)

(1-1j)


In [None]:
print(x.real, x.imag)

1.0 -1.0


In [None]:
x=124
y='124'
print(y)

124


### Type casting

In [None]:
x = 1.5

print(x, type(x))

1.5 <class 'float'>


In [None]:
x = int(x)

print(x, type(x))

1 <class 'int'>


In [None]:
z = complex(x)

print(z, type(z))

(1+0j) <class 'complex'>


In [None]:
x = float(z)

TypeError: can't convert complex to float

Complex variables cannot be cast to floats or integers. We need to use `z.real` or `z.imag` to extract the part of the complex number we want:

In [None]:
y = bool(z.real)

print(z.real, " -> ", y, type(y))

y = bool(z.imag)

print(z.imag, " -> ", y, type(y))

1.0  ->  True <class 'bool'>
0.0  ->  False <class 'bool'>


In [None]:
int(z.real)
int(z.imag)

0
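
Note that `int` discards the fractional part rather than rounding. The following sketch compares it with a few alternatives for a negative number, where the difference is easiest to see.

```python
import math

x = -1.7

print(int(x))         # truncates toward zero
print(math.floor(x))  # rounds down, toward negative infinity
print(math.ceil(x))   # rounds up, toward positive infinity
print(round(x))       # rounds to the nearest integer
```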

## Operators and comparisons

Most operators and comparisons in Python work as one would expect:

* Arithmetic operators `+`, `-`, `*`, `/`, `//` (integer division), `**` (power)


In [None]:
1 + 2, 1 - 2, 1 * 2, 1 / 2

(3, -1, 2, 0.5)

In [None]:
1.0 + 2.0, 1.0 - 2.0, 1.0 * 2.0, 1.0 / 2.0

(3.0, -1.0, 2.0, 0.5)

In [None]:
# Integer division of float numbers
3.0 // 2.0

1.0

In [None]:
# Note! The power operator in Python isn't ^, but **
2 ** 2

4

Note: The `/` operator always performs a floating point division in Python 3.x.
This is not true in Python 2.x, where the result of `/` is always an integer if the operands are integers.
To be more specific, `1/2 = 0.5` (`float`) in Python 3.x, and `1/2 = 0` (`int`) in Python 2.x (but `1.0/2 = 0.5` in Python 2.x).
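
The division operators can be compared side by side in a short sketch. It assumes Python 3 behaviour for `/`; the variable names are illustrative.

```python
# True division always returns a float in Python 3
true_div = 7 / 2
print(true_div)

# Floor division rounds toward negative infinity
floor_pos, floor_neg = 7 // 2, -7 // 2
print(floor_pos, floor_neg)

# The remainder operator pairs with floor division
rem_pos, rem_neg = 7 % 2, -7 % 2
print(rem_pos, rem_neg)

# divmod returns quotient and remainder at once
quotient, remainder = divmod(7, 2)
print(quotient, remainder)
```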

* The boolean operators are spelled out as the words `and`, `not`, `or`. 

### Complex Logic using `and`, `or` and `xor`

- `[A and B]` will give `True` only if both A and B are `True`
- `[A or  B]` will give `True` if one or both of A, B are `True`
- `[A xor B]` will give `True` if one is `True` but not both. Python has no `xor` keyword; for boolean operands, `A != B` gives this result.

In [None]:
True and False

False

In [None]:
not False

True

In [None]:
True or False

True
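
Since there is no `xor` keyword in Python, exclusive or has to be expressed in another way. The sketch below shows three options for boolean operands; the variable names are illustrative.

```python
import operator

a, b = True, False

# For bool operands, inequality behaves as exclusive or
print(a != b)
# The ^ operator is bitwise xor; applied to two bools it returns a bool
print(a ^ b)
# operator.xor is the function form of ^
print(operator.xor(a, a))
```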

* Comparison operators `>`, `<`, `>=` (greater or equal), `<=` (less or equal), `==` equality, `is` identical.

In [None]:
2 > 1, 2 < 1

(True, False)

In [None]:
2 > 2, 2 < 2

(False, False)

In [None]:
2 >= 2, 2 <= 2

(True, True)

In [None]:
# equality
[1,2] == [1,2]

True

In [None]:
# objects identical?
l1 = l2 = [1,2]

l1 is l2

True
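
The difference between `==` and `is` becomes visible when two separate objects hold the same contents. A minimal sketch, with illustrative names:

```python
a = [1, 2]
b = [1, 2]   # a second list with the same contents
c = a        # a second name for the same list

equal_ab, same_ab = a == b, a is b
equal_ac, same_ac = a == c, a is c
print(equal_ab, same_ab)
print(equal_ac, same_ac)

# Because a and c are one object, a change made through c is visible through a
c.append(3)
print(a)
```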

## Compound types: Strings, lists and dictionaries

### Strings

Strings are the variable type that is used for storing text messages. 

In [None]:
# str

s1 = 'Delhi Technological University'

s2 = """
Delhi Technological University
Rohini, New Delhi
"""

print(s1)
print(s2)


Delhi Technological University

Delhi Technological University
Rohini, New Delhi



In [None]:
type(s1)

str

In [None]:
# length of the string: the number of characters
len(s1)

30

In [None]:
# replace a substring in a string with something else
s4 = s1.replace("Delhi Technological University", "DTU")
print(s4)

DTU


#### String subsetting (slicing)

Syntax: `my_str[start:stop:step]`

We can index a character in a string using `[]`:

- Indices run from 0 to length-1
- Subsetting can be done in ranges as `start_index:end_index`; the character at `end_index` is not included

In [None]:
s1

'Delhi Technological University'

In [None]:
s1[0]

'D'

In [None]:
s1[0:5]

'Delhi'

In [None]:
s1[4:5]

'i'

If we omit either (or both) of `start` or `stop` from `[start:stop]`, the default is the beginning and the end of the string, respectively:

In [None]:
s1[:5]

'Delhi'

In [None]:
s1[6:]

'Technological University'

In [None]:
s1[:]

'Delhi Technological University'

We can also define the step size using the syntax `[start:end:step]` (the default value for `step` is 1, as we saw above):

In [None]:
s1[::1]

'Delhi Technological University'

In [None]:
s1[::2]

'DliTcnlgclUiest'

In [None]:
s1[:-1]

'Delhi Technological Universit'

This technique is called *slicing*. Read more about the syntax here: http://docs.python.org/release/2.7.3/library/functions.html?highlight=slice#slice
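
Negative indices and negative steps follow the same slicing rules. The sketch below redefines `s1` so that it runs on its own; `first_word` is an illustrative name.

```python
s1 = 'Delhi Technological University'

print(s1[-1])     # last character
print(s1[-10:])   # last ten characters
print(s1[::-1])   # a negative step walks the string backwards

# The built-in slice object expresses the same thing as start:stop:step
first_word = slice(0, 5)
print(s1[first_word])
```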

#### String Methods

In [None]:
s1.endswith('ty')

True

In [None]:
s1.split(" ")

['Delhi', 'Technological', 'University']

In [None]:
s1.split("l")

['De', 'hi Techno', 'ogica', ' University']

In [None]:
s1

'Delhi Technological University'

In [None]:
s1.find('Tech')

6

In [None]:
# Splitting a string gives a list
s5 = "DTU is a premier institute for engineering"
s6 = s5.split(" ")

In [None]:
s6

['DTU', 'is', 'a', 'premier', 'institute', 'for', 'engineering']

In [None]:
type(s6)

list
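
The reverse operation, turning a list of strings back into one string, is done with the string method `join`. A minimal sketch, redefining `s5` so that it runs on its own:

```python
s5 = "DTU is a premier institute for engineering"

words = s5.split(" ")
# join is called on the separator and takes the list as its argument
rejoined = " ".join(words)
print(rejoined == s5)

# A different separator gives a different string
hyphenated = "-".join(words)
print(hyphenated)

# split() without an argument splits on any run of whitespace
print("a  b   c".split())
```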

#### String Arithmetic

In [None]:
s5 + ' based out of Rohini,' + ' New Delhi'

'DTU is a premier institute for engineering based out of Rohini, New Delhi'

In [None]:
'DTU ' * 4

'DTU DTU DTU DTU '

#### Type Conversions to/from strings

In [None]:
str(12345)

'12345'

In [None]:
'Apple' + str(7)

'Apple7'

In [None]:
int('123')

123

In [None]:
float('3.142')

3.142

In [None]:
int('e')

ValueError: invalid literal for int() with base 10: 'e'
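
When the input may not be a valid number, the error can be caught instead of stopping the program. The helper `to_int` below is a hypothetical example, not a built-in function.

```python
def to_int(text, default=None):
    """Convert text to an int, returning default if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default


print(to_int('123'))
print(to_int('e'))
print(to_int('e', default=0))
```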

#### String formatting examples

In [None]:
print("str1", "str2", "str3")  # The print function joins its arguments with a space

str1 str2 str3


In [None]:
print("str1", 1.0, False, -1j)  # The print function converts all arguments to strings

str1 1.0 False (-0-1j)


In [None]:
print("str1" + "str2" + "str3") # strings added with + are concatenated without space

str1str2str3


In [None]:
y=print("str1" + "str2" + "str3")

str1str2str3


In [None]:
print(y)

None


In [None]:
print("value = %f" % 1.0)       # we can use C-style string formatting

value = 1.000000
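
The number of decimals can be controlled with a precision specifier. The sketch below shows the same formatting done in three ways; the f-string form assumes Python 3.6 or later, and the variable names are illustrative.

```python
value = 3.14159

# C-style formatting with a precision of two decimals
c_style = "value = %.2f" % value
# The same with str.format
with_format = "value = {:.2f}".format(value)
# The same with an f-string (Python 3.6 or later)
f_string = f"value = {value:.2f}"

print(c_style)
print(with_format)
print(f_string)
```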
