# Introduction to Python - Programming

## What is Python?

[Python](http://www.python.org/) is a modern, general-purpose, object-oriented, high-level programming language.

General characteristics of Python:

* **clean and simple language:** Easy-to-read and intuitive code, easy-to-learn minimalistic syntax, maintainability scales well with size of projects.
* **expressive language:** Fewer lines of code, fewer bugs, easier to maintain.

Technical details:

* **dynamically typed:** No need to define the type of variables, function arguments or return types.
* **automatic memory management:** No need to explicitly allocate and deallocate memory for variables and data arrays, which avoids most memory-leak bugs.
* **interpreted:** No need to compile the code. The Python interpreter reads and executes the python code directly.

Advantages:

* The main advantage is ease of programming, minimizing the time required to develop, debug and maintain the code.
* Well-designed language that encourages many good programming practices:
 * Modular and object-oriented programming, good system for packaging and re-use of code. This often results in more transparent, maintainable and bug-free code.
 * Documentation tightly integrated with the code.
* A large standard library, and a large collection of add-on packages.

Disadvantages:

* Since Python is an interpreted and dynamically typed programming language, the execution of python code can be slow compared to compiled statically typed programming languages, such as C and Fortran. 
* Somewhat decentralized, with different environments, packages and documentation spread out in different places. This can make it harder to get started.

## What makes python suitable for scientific computing?

<img src="images/optimizing-what.png" width="600">

* Python has a strong position in scientific computing: 
    * Large community of users, easy to find help and documentation.

* Extensive ecosystem of scientific libraries and environments
    * numpy: http://numpy.scipy.org - Numerical Python
    * scipy: http://www.scipy.org -  Scientific Python
    * matplotlib: http://www.matplotlib.org - graphics library

* Great performance due to close integration with time-tested and highly optimized codes written in C and Fortran:
    * blas, atlas blas, lapack, arpack, Intel MKL, ...

* Good support for 
    * Parallel processing with processes and threads
    * Interprocess communication (MPI)
    * GPU computing (OpenCL and CUDA)

* Readily available and suitable for use on high-performance computing clusters. 

* No license costs, no unnecessary use of research budget.

### Python environments

Python is not only a programming language, but often also refers to the standard implementation of the interpreter (technically referred to as [CPython](http://en.wikipedia.org/wiki/CPython)) that actually runs the python code on a computer.

There are also many different environments through which the python interpreter can be used. Each environment has different advantages and is suitable for different workflows. One strength of python is that it is versatile and can be used in complementary ways, but it can be confusing for beginners so we will start with a brief survey of python environments that are useful for scientific computing.

### Python interpreter

The standard way to use the Python programming language is to use the Python interpreter to run python code. The python interpreter is a program that reads and executes the python code in files passed to it as arguments. At the command prompt, the command ``python`` is used to invoke the Python interpreter.

For example, to run a file ``my-program.py`` that contains python code from the command prompt, use:

    $ python my-program.py

We can also start the interpreter by simply typing ``python`` at the command line, and interactively type python code into the interpreter. 

<img src="images/python-screenshot.jpg" width="600">


This is often how we want to work when developing scientific applications, or when doing small calculations. But the standard python interpreter is not very convenient for this kind of work, due to a number of limitations.

### IPython

IPython is an interactive shell that addresses the limitations of the standard python interpreter, and it is a work-horse for scientific use of python. It provides an interactive prompt to the python interpreter with greatly improved user-friendliness.

<img src="images/ipython-screenshot.jpg" width="600">

Some of the many useful features of IPython include:

* Command history, which can be browsed with the up and down arrows on the keyboard.
* Tab auto-completion.
* In-line editing of code.
* Object introspection, and automatic extraction of documentation strings from python objects like classes and functions.
* Good interaction with operating system shell.
* Support for multiple parallel back-end processes, that can run on computing clusters or cloud services like Amazon EC2.


### IPython notebook

[IPython notebook](http://ipython.org/notebook.html) is an HTML-based notebook environment for Python, similar to Mathematica or Maple. It is based on the IPython shell, but provides a cell-based environment with great interactivity, where calculations can be organized and documented in a structured way.

<img src="images/ipython-notebook-screenshot.jpg" width="800">

Although it uses a web browser as its graphical interface, IPython notebooks are usually run locally, on the same computer that runs the browser. To start a new IPython notebook session, run the following command:

    $ ipython notebook

from a directory where you want the notebooks to be stored. This will open a new browser window (or a new tab in an existing window) with an index page where existing notebooks are shown and from which new notebooks can be created.

### Spyder

[Spyder](http://code.google.com/p/spyderlib/) is a MATLAB-like IDE for scientific computing with python. It has the many advantages of a traditional IDE environment, for example that everything from code editing, execution and debugging is carried out in a single environment, and work on different calculations can be organized as projects in the IDE environment.

<img src="images/spyder-screenshot.jpg" width="800">

Some advantages of Spyder:

* Powerful code editor, with syntax high-lighting, dynamic code introspection and integration with the python debugger.
* Variable explorer, IPython command prompt.
* Integrated documentation and help.

## Versions of Python

There are currently two versions of Python: Python 2 and Python 3. Python 3 will eventually supersede Python 2, but it is not backward-compatible with Python 2. A lot of existing Python code and packages have been written for Python 2, and it is still the most widespread version. For these lectures either version will be fine, but it is probably easier to stick with Python 2 for now, because it is more readily available via prebuilt packages and binary installers.

To see which version of Python you have, run
    
    $ python --version
    Python 2.7.3
    $ python3.2 --version
    Python 3.2.3

Several versions of Python can be installed in parallel, as shown above.


## Installation

### Conda

The best way to set up a scientific Python environment is to use the cross-platform package manager `conda` from Continuum Analytics. First download and install miniconda (http://conda.pydata.org/miniconda.html) or Anaconda (see below). Next, to install the required libraries for these notebooks, simply run:

    $ conda install ipython ipython-notebook spyder numpy scipy sympy matplotlib cython

This should be sufficient to get a working environment on any platform supported by `conda`.

### Windows

Windows lacks a good packaging system, so the easiest way to set up a Python environment is to install a pre-packaged distribution. Some good alternatives are:

 * [Enthought Python Distribution](http://www.enthought.com/products/epd.php). EPD is a commercial product but is available free for academic use.
 * [Anaconda](http://continuum.io/downloads.html). The Anaconda Python distribution comes with many scientific computing and data science packages and is free, including for commercial use and redistribution. It also has add-on products such as Accelerate, IOPro, and MKL Optimizations, which have free trials and are free for academic use.
 * [Python(x,y)](http://code.google.com/p/pythonxy/). Fully open source.



#### Note

EPD and Anaconda are also available for Linux and Mac OS X.

## Python and module versions

Since there are several different versions of Python and each Python package has its own release cycle and version number (for example scipy, numpy, matplotlib, etc., which we installed above and will discuss in detail in the following lectures), it is important for the reproducibility of an IPython notebook to record the versions of all these different software packages. If this is done properly it will be easy to reproduce the environment that was used to run a notebook, but if not it can be hard to know what was used to produce the results in a notebook.

To encourage the practice of recording Python and module versions in notebooks, I've created a simple IPython extension that produces a table with version numbers of selected software components. I believe that it is good practice to include this kind of table in every notebook you create. 

To install this IPython extension, use `pip install version_information`:

In [289]:
# you only need to do this once
!pip install --upgrade version_information


In [290]:
%load_ext version_information

%version_information numpy, scipy, matplotlib, sympy, version_information

| Software | Version |
|---|---|
| Python | 3.5.1 64bit [GCC 4.2.1 (Apple Inc. build 5577)] |
| IPython | 4.1.2 |
| OS | Darwin 16.7.0 x86_64 i386 64bit |
| numpy | 1.11.1 |
| scipy | 0.18.0 |
| matplotlib | 1.5.1 |
| sympy | The 'sympy' distribution was not found and is required by the application |
| version_information | 1.0.3 |
| Tue Aug 01 05:51:12 2017 IST | |


## Launching the notebook

**On Windows** - press the Windows button + R, type `cmd`, then type `jupyter notebook`. The notebook opens in the browser; choose 'New' and then a Python notebook.

**On Mac/Linux** - open a Terminal and type `jupyter notebook`. The notebook opens in the browser; choose 'New' and then a Python notebook.

Note

- Autocomplete function and object names with `<Tab>`
- Get help on functions, objects and methods by appending a `?` and hitting Shift-Enter.

### Python Programming Principles

- DRY
- The Zen of Python

In [291]:
print("DRY: Don't Repeat Yourself")

DRY: Don't Repeat Yourself

In [None]:
import this

## Python program files

* Python code is usually stored in text files with the file ending "`.py`":

        myprogram.py

* Every line in a Python program file is assumed to be a Python statement, or part thereof. 

    * The only exception is comment lines, which start with the character `#` (optionally preceded by an arbitrary number of white-space characters, i.e., tabs or spaces). Comment lines are usually ignored by the Python interpreter.


* To run our Python program from the command line we use:

        $ python myprogram.py

* On UNIX systems it is common to define the path to the interpreter on the first line of the program (note that this is a comment line as far as the Python interpreter is concerned):

        #!/usr/bin/env python

  If we do, and if we additionally set the script file to be executable, we can run the program like this:

        $ ./myprogram.py

### Example:

In [None]:
ls scripts/hello-world*.py

In [None]:
cat scripts/hello-world.py

In [None]:
!python scripts/hello-world.py

## IPython notebooks

This file - an IPython notebook -  does not follow the standard pattern with Python code in a text file. Instead, an IPython notebook is stored as a file in the [JSON](http://en.wikipedia.org/wiki/JSON) format. The advantage is that we can mix formatted text, Python code and code output. It requires the IPython notebook server to run it though, and therefore isn't a stand-alone Python program as described above. Other than that, there is no difference between the Python code that goes into a program file or an IPython notebook.

## Modules

Most of the functionality in Python is provided by *modules*. The Python Standard Library is a large collection of modules that provides *cross-platform* implementations of common facilities such as access to the operating system, file I/O, string management, network communication, and much more.

To use a module in a Python program it first has to be imported. A module can be imported using the `import` statement. For example, to import the module `math`, which contains many standard mathematical functions, we can do:

In [None]:
import math

This includes the whole module and makes it available for use later in the program. For example, we can do:

In [None]:
import math

x = math.cos(2 * math.pi)

print(x)

Alternatively, we can choose to import all symbols (functions and variables) in a module into the current namespace, so that we don't need to use the prefix "`math.`" every time we use something from the `math` module:

In [1]:
from math import *

x = cos(2 * pi)

print(x)

1.0


This pattern can be very convenient, but in large programs that include many modules it is often a good idea to keep the symbols from each module in their own namespaces, by using the `import math` pattern. This eliminates potentially confusing problems with namespace collisions.
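
To make the risk concrete, here is a small sketch of such a collision. It assumes the standard library module `cmath` (complex-valued math functions), which is not discussed above; both `math` and `cmath` define `sqrt`. The sketch uses explicit `from ... import` statements, but the same rebinding happens, less visibly, with wildcard imports.

```python
import math
import cmath

# With qualified names there is no ambiguity about which sqrt is called.
real_root = math.sqrt(4)        # real-valued square root
complex_root = cmath.sqrt(-4)   # complex-valued square root

# Importing the same name from two modules: the later import wins.
from math import sqrt
from cmath import sqrt

shadowed = sqrt(4)              # this is cmath.sqrt, so the result is complex
print(real_root, complex_root, shadowed, type(shadowed))
```

Nothing warns us that `sqrt` changed meaning, which is why the qualified `math.sqrt` form is easier to reason about in larger programs.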

As a third alternative, we can choose to import only a few selected symbols from a module by explicitly listing which ones we want to import instead of using the wildcard character `*`:

In [2]:
from math import cos, pi

x = cos(2 * pi)

print(x)

1.0


### Looking at what a module contains, and its documentation

Once a module is imported, we can list the symbols it provides using the `dir` function:

In [3]:
import math

print(dir(math))

['__doc__', '__file__', '__name__', '__package__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'hypot', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'modf', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'trunc']



In [4]:
import pandas
print(dir(pandas))




In [6]:
import os
dir(os)

['EX_CANTCREAT',
 'EX_CONFIG',
 'EX_DATAERR',
 'EX_IOERR',
 'EX_NOHOST',
 'EX_NOINPUT',
 'EX_NOPERM',
 'EX_NOUSER',
 'EX_OK',
 'EX_OSERR',
 'EX_OSFILE',
 'EX_PROTOCOL',
 'EX_SOFTWARE',
 'EX_TEMPFAIL',
 'EX_UNAVAILABLE',
 'EX_USAGE',
 'F_OK',
 'NGROUPS_MAX',
 'O_APPEND',
 'O_ASYNC',
 'O_CREAT',
 'O_DIRECTORY',
 'O_DSYNC',
 'O_EXCL',
 'O_EXLOCK',
 'O_NDELAY',
 'O_NOCTTY',
 'O_NOFOLLOW',
 'O_NONBLOCK',
 'O_RDONLY',
 'O_RDWR',
 'O_SHLOCK',
 'O_SYNC',
 'O_TRUNC',
 'O_WRONLY',
 'P_NOWAIT',
 'P_NOWAITO',
 'P_WAIT',
 'R_OK',
 'SEEK_CUR',
 'SEEK_END',
 'SEEK_SET',
 'TMP_MAX',
 'UserDict',
 'WCONTINUED',
 'WCOREDUMP',
 'WEXITSTATUS',
 'WIFCONTINUED',
 'WIFEXITED',
 'WIFSIGNALED',
 'WIFSTOPPED',
 'WNOHANG',
 'WSTOPSIG',
 'WTERMSIG',
 'WUNTRACED',
 'W_OK',
 'X_OK',
 '_Environ',
 '__all__',
 '__builtins__',
 '__doc__',
 '__file__',
 '__name__',
 '__package__',
 '_copy_reg',
 '_execvpe',
 '_exists',
 '_exit',
 '_get_exports_list',
 '_make_stat_result',
 '_make_statvfs_result',
 '_pickle_stat_result'

In [7]:
!ls /

[1m[34mApplications[m[m              [1m[34mVolumes[m[m                   [1m[34mopt[m[m
[1m[34mDeveloper[m[m                 [1m[34mbin[m[m                       [1m[34mprivate[m[m
[1m[34mLibrary[m[m                   [1m[34mcores[m[m                     [1m[34mpython[m[m
[1m[34mNetwork[m[m                   [1m[34mdev[m[m                       [1m[34msbin[m[m
[1m[34mSystem[m[m                    [1m[35metc[m[m                       [1m[35mtmp[m[m
TEST.txt                  [1m[34mhome[m[m                      [1m[34musr[m[m
[1m[35mUser Information[m[m          installer.failurerequests [1m[35mvar[m[m
[1m[34mUsers[m[m                     [1m[34mnet[m[m


In [None]:
%pwd


Using the function `help`, or by appending `?` to a name in IPython, we can get a description of each function (almost: not all functions have docstrings, as they are technically called, but the vast majority of functions are documented this way). 

In [9]:
math.log?

In [None]:
log(10)

In [None]:
log(10, 2)

We can also use the `help` function directly on modules: Try

    help(math) 
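
The text shown by `help` comes from the object's docstring, which is also available as the `__doc__` attribute. The following is a minimal sketch; the function `half_life` is a hypothetical example written for this illustration and is not part of any library.

```python
import math

def half_life(decay_constant):
    """Return the half-life ln(2) / decay_constant of an exponential decay."""
    return math.log(2) / decay_constant

# help(half_life) would display the same text in an interactive session.
print(half_life.__doc__)

# The optional second argument of math.log selects the base.
print(math.log(8, 2))
```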

Some very useful modules from the Python standard library are `os`, `sys`, `math`, `shutil`, `re`, `subprocess`, `multiprocessing`, `threading`. 

Complete lists of standard modules for Python 2 and Python 3 are available at http://docs.python.org/2/library/ and http://docs.python.org/3/library/, respectively.

## Variables and types

### Symbol names 

Variable names in Python can contain alphanumerical characters `a-z`, `A-Z`, `0-9` and some special characters such as `_`. Normal variable names must start with a letter. 

By convention, variable names start with a lower-case letter, and class names start with a capital letter. 

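In addition, there are a number of Python keywords that cannot be used as variable names. The Python 2 keywords are listed below; in Python 3, `exec` and `print` are no longer keywords, while `False`, `None`, `True` and `nonlocal` (among others) have been added: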

    and, as, assert, break, class, continue, def, del, elif, else, except, 
    exec, finally, for, from, global, if, import, in, is, lambda, not, or,
    pass, print, raise, return, try, while, with, yield

Note: Be aware of the keyword `lambda`, which could easily be a natural variable name in a scientific program. But being a keyword, it cannot be used as a variable name.
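
Because the keyword list depends on the Python version, it can be safer to ask the interpreter directly. A small sketch using the standard library module `keyword`; the candidate names in the loop are arbitrary examples chosen for this illustration.

```python
import keyword

# The reserved words of the interpreter that is running this code.
print(keyword.kwlist)

# Check candidate variable names before using them.
for name in ["lambda", "lam", "print", "wavelength"]:
    print(name, keyword.iskeyword(name))
```

A common workaround for `lambda` is a shortened or suffixed name such as `lam` or `lambda_`.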

### Assignment



The assignment operator in Python is `=`. Python is a dynamically typed language, so we do not need to specify the type of a variable when we create one.

Assigning a value to a new variable creates the variable:

In [10]:
# variable assignments
x = 1.0
my_variable = 12.2

Although not explicitly specified, a variable does have a type associated with it. The type is derived from the value that was assigned to it.

In [11]:
type(x)

float

If we assign a new value to a variable, its type can change.

In [12]:
x = 1

In [13]:
type(x)

int

If we try to use a variable that has not yet been defined we get a `NameError`:

In [14]:
print(y)

NameError: name 'y' is not defined

### Fundamental Data Types

> #### Numeric Types: int, long (Python 2 only), float, bool, complex (plus the special null value None)
> #### Compound Types: strings, tuples, lists, dictionaries, sets

In [15]:
# integers
i = 1
type(i)

int

In [16]:
# float
f = 1.0
type(f)

float

In [17]:
# Int stored as string => int
int('124')

124

In [18]:
# Int to float
float(1234)

1234.0

In [19]:
# float to str
str(123.5)

'123.5'

In [20]:
# float to int
int(3.15)

3
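
Note that `int` drops the fractional part rather than rounding. The sketch below contrasts it with `math.floor` and the built-in `round`; the sample values are arbitrary.

```python
import math

for value in [3.15, 3.99, -3.99]:
    truncated = int(value)        # drops the fractional part (rounds toward zero)
    floored = math.floor(value)   # rounds toward negative infinity
    nearest = round(value)        # rounds to the nearest integer
    print(value, truncated, floored, nearest)
```

The difference between truncating and flooring only shows up for negative numbers.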

In [21]:
# Adding a numeric value to a string

# Method 1: use '+' to concatenate
print('The first four digits of Pi are ' + str(3.142))

# Method 2: use .format()
print('The first four digits of Pi are {} and the value of e is {}'.format(3.142, 2.73))

The first four digits of Pi are 3.142
The first four digits of Pi are 3.142 and the value of e is 2.73


In [24]:
# Checking the types
print(isinstance(i, float))
print(isinstance(i, int))
print(type(i) == int)
type(f) == float

False
True
True


True

In [25]:
# boolean
b1 = True
b2 = False

type(b1)

bool

In [26]:
# complex numbers: note the use of `j` to specify the imaginary part
x = 1.0 - 1.0j
type(x)

complex

In [27]:
print(x)

(1-1j)


In [28]:
print(x.real, x.imag)

(1.0, -1.0)


### Type utility functions


The module `types` contains a number of type name definitions that can be used to test if variables are of certain types:

In [29]:
import types

# print all types defined in the `types` module
print(dir(types))

['BooleanType', 'BufferType', 'BuiltinFunctionType', 'BuiltinMethodType', 'ClassType', 'CodeType', 'ComplexType', 'DictProxyType', 'DictType', 'DictionaryType', 'EllipsisType', 'FileType', 'FloatType', 'FrameType', 'FunctionType', 'GeneratorType', 'GetSetDescriptorType', 'InstanceType', 'IntType', 'LambdaType', 'ListType', 'LongType', 'MemberDescriptorType', 'MethodType', 'ModuleType', 'NoneType', 'NotImplementedType', 'ObjectType', 'SliceType', 'StringType', 'StringTypes', 'TracebackType', 'TupleType', 'TypeType', 'UnboundMethodType', 'UnicodeType', 'XRangeType', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__']


In [30]:
x = 1.0

# check if the variable x is a float
type(x) is float

True

In [31]:
# check if the variable x is an int
type(x) is int

False

We can also use the `isinstance` function for testing types of variables:

In [32]:
isinstance(x, float)

True
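
Unlike an exact `type(x) is ...` comparison, `isinstance` accepts a tuple of types and takes subclassing into account. The sketch below illustrates both points; `is_real_number` is a hypothetical helper written for this example.

```python
def is_real_number(value):
    """Return True for int and float values, but not for bool."""
    # isinstance accepts a tuple of types and honours subclassing,
    # so bool (a subclass of int) has to be excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

print(isinstance(True, int))   # bool is a subclass of int
print(type(True) is int)       # an exact type comparison ignores subclassing
print(is_real_number(1), is_real_number(1.0), is_real_number(True), is_real_number("1"))
```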

### Type casting

In [33]:
x = 1.5

print(x, type(x))

(1.5, <type 'float'>)


In [35]:
x = int(x)

print(x, type(x))

(1, <type 'int'>)


In [36]:
z = complex(x)

print(z, type(z))

((1+0j), <type 'complex'>)


In [37]:
x = float(z)

TypeError: can't convert complex to float

Complex variables cannot be cast to floats or integers. We need to use `z.real` or `z.imag` to extract the part of the complex number we want:

In [38]:
y = bool(z.real)

print(z.real, " -> ", y, type(y))

y = bool(z.imag)

print(z.imag, " -> ", y, type(y))

(1.0, ' -> ', True, <type 'bool'>)
(0.0, ' -> ', False, <type 'bool'>)


## Operators and comparisons

Most operators and comparisons in Python work as one would expect:

* Arithmetic operators `+`, `-`, `*`, `/`, `//` (integer division), `**` (power)


In [39]:
1 + 2, 1 - 2, 1 * 2, 1 / 2

(3, -1, 2, 0)

In [40]:
1.0 + 2.0, 1.0 - 2.0, 1.0 * 2.0, 1.0 / 2.0

(3.0, -1.0, 2.0, 0.5)

In [41]:
# Integer division of float numbers
3.0 // 2.0

1.0

In [42]:
# Note! The power operator in Python is not ^, but **
2 ** 2

4

Note: The `/` operator always performs a floating point division in Python 3.x.
This is not true in Python 2.x, where the result of `/` is always an integer if the operands are integers.
To be more specific, `1/2 = 0.5` (`float`) in Python 3.x, and `1/2 = 0` (`int`) in Python 2.x (but `1.0/2 = 0.5` in Python 2.x).
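
Code that has to behave the same under both versions can avoid the ambiguity by being explicit about which kind of division is meant. A minimal sketch with arbitrary sample values:

```python
a, b = 7, 2

true_quotient = float(a) / b     # floating point division in Python 2 and Python 3
floor_quotient = a // b          # floor division in both versions
remainder = a % b                # remainder of the floor division
quotient, rest = divmod(a, b)    # both results in a single call
negative_floor = -7 // 2         # floor division rounds toward negative infinity

print(true_quotient, floor_quotient, remainder, (quotient, rest), negative_floor)
```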

* The boolean operators are spelled out as the words `and`, `not`, `or`. 

### Complex Logic using `and`, `or` and `xor`

- `[A and B]` will give `True` only if both A and B are `True`
- `[A or  B]` will give `True` if one or both of A, B are `True`
- `[A xor B]` will give `True` if one is `True` but not both. Python has no `xor` keyword; for boolean operands, `A != B` or `A ^ B` gives the same result.

In [43]:
True and False

False

In [44]:
not False

True

In [45]:
True or False

True
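
Python has no `xor` keyword, so exclusive or has to be expressed in another way. One possible sketch follows; the helper name `xor` is hypothetical and not a built-in.

```python
def xor(a, b):
    """Exclusive or of two truth values (hypothetical helper, not a built-in)."""
    return bool(a) != bool(b)

# Print the full truth table.
for a in (False, True):
    for b in (False, True):
        # for bool operands, the ^ operator gives the same result
        print(a, b, xor(a, b), a ^ b)
```

Converting to `bool` first lets the helper accept any values with a truth value, whereas `^` on integers is a bitwise operation.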

* Comparison operators `>`, `<`, `>=` (greater or equal), `<=` (less or equal), `==` equality, `is` identical.

In [46]:
2 > 1, 2 < 1

(True, False)

In [47]:
2 > 2, 2 < 2

(False, False)

In [48]:
2 >= 2, 2 <= 2

(True, True)

In [49]:
# equality
[1,2] == [1,2]

True

In [50]:
# objects identical?
l1 = l2 = [1,2]

l1 is l2

True
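
The difference between `==` and `is` becomes visible when two separate objects hold equal contents. A short sketch, reusing the names `l1` and `l2` from the cell above and adding a third list `l3` for comparison:

```python
l1 = [1, 2]
l2 = l1          # a second name for the same list object
l3 = [1, 2]      # a new list with equal contents

print(l1 == l3, l1 is l3)   # equal, but not identical
l2.append(3)                # modifying l2 also changes l1 ...
print(l1, l3)               # ... while l3 is unaffected
```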

## Compound types: Strings, lists and dictionaries

### Strings

Strings are the variable type that is used for storing text messages. 

In [51]:
# str

s1 = 'Analytixlabs'

s2 = "This is also Analytixlabs"

s3 = """
Another name of Analytixlabs is
Alabs
"""

print(s1)
print(s2)
print(s3)

Analytixlabs
This is also Analytixlabs

Another name of Analytixlabs is
Alabs



In [52]:
type(s1)

str

In [53]:
# length of the string: the number of characters
len(s1)

12

In [56]:
# replace a substring in a string with something else
s1 = s1.replace("Analytix", "A")
print(s1)

Alabs


#### String subsetting (slicing)

Syntax: `my_str[start:stop:step]`

We can index a character in a string using `[]`:

- Indices run from 0 to length-1
- Subsetting can be done in ranges as `start_index:end_index`, where the character at `end_index` is not included

In [57]:
s1[0]

'A'

In [59]:
len(s1[0:5])

5

In [60]:
s1[4:5]

's'

If we omit either (or both) of `start` or `stop` from `[start:stop]`, the default is the beginning and the end of the string, respectively:

In [61]:
s1[:5]

'Alabs'

In [62]:
s1[6:]

''

In [63]:
s1[:]

'Alabs'

We can also define the step size using the syntax `[start:end:step]` (the default value for `step` is 1, as we saw above):

In [64]:
s1

'Alabs'

In [65]:
s1[::1]

'Alabs'

In [70]:
test = '''This is a test string'''

In [71]:
test

'This is a test string'

In [72]:
test[2::2]

'i sats tig'

In [73]:
s1[::2]

'Aas'

This technique is called *slicing*. Read more about the syntax here: http://docs.python.org/release/2.7.3/library/functions.html?highlight=slice#slice
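
Indices and steps may also be negative, which counts from the end of the string or walks it backwards. A brief sketch on a sample string:

```python
text = "This is a test string"

last_char = text[-1]         # negative indices count from the end
last_word = text[-6:]        # the final six characters
reversed_text = text[::-1]   # a negative step walks the string backwards

print(last_char)
print(last_word)
print(reversed_text)
```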

#### String Methods

In [74]:
s1.endswith('labs')

True

In [75]:
test

'This is a test string'

In [79]:
type(test.split(" "))

list

In [80]:
s1.find('lytix')

-1

In [81]:
s1.index('lyt')

ValueError: substring not found

In [82]:
# Difference between `find` and `index`
s1.find("cat")

-1

In [83]:
s1.index("cat")

ValueError: substring not found
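
In practice the choice between `find` and `index` is a choice between checking a return value and handling an exception. A minimal sketch of both styles:

```python
s = "Alabs"

position = s.find("cat")        # returns -1 when the substring is absent
if position == -1:
    print("find: not found")

try:
    position = s.index("cat")   # raises ValueError when the substring is absent
except ValueError as error:
    print("index raised:", error)

print("lab" in s)               # a plain membership test needs neither
```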

In [84]:
### Splitting a string gives a list
s = "Analytixlabs is premier capability building company"
s1 = s.split(" ")

In [85]:
s1

['Analytixlabs', 'is', 'premier', 'capability', 'building', 'company']

In [86]:
type(s1)

list
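
The inverse of `split` is the string method `join`, which glues a list of strings back together with a separator. A short sketch; the variable names are chosen for this example only.

```python
sentence = "Analytixlabs is premier capability building company"

words = sentence.split(" ")        # string -> list of words
rebuilt = " ".join(words)          # list of words -> string
hyphenated = "-".join(words[:3])   # any separator string can be used

print(words)
print(rebuilt)
print(hyphenated)
```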

#### String arithmetic

In [87]:
s + ' Based out of Gurgaon.' + ' Also it has offices in Bangalore and Malaysia'

'Analytixlabs is premier capability building company Based out of Gurgaon. Also it has offices in Bangalore and Malaysia'

In [88]:
'Alabs ' * 4

'Alabs Alabs Alabs Alabs '

#### Type Conversions to/from strings

In [89]:
str(12345)

'12345'

In [90]:
'Apple' + str(7)

'Apple7'

In [91]:
int('123')

123

In [None]:
float('3.142')

#### String formatting examples

In [92]:
print("str1", "str2", "str3")  # in Python 3, the print function joins its arguments with a space

str1 str2 str3


In [93]:
print("str1", 1.0, False, -1j)  # the print function converts all arguments to strings

str1 1.0 False (-0-1j)


In [94]:
print("str1" + "str2" + "str3") # strings added with + are concatenated without space

str1str2str3


In [98]:
print("value = %f" % 1.0)       # we can use C-style string formatting

value = 1.000000

In [96]:
# this formatting creates a string
s2 = "value1 = %.2f. value2 = %d" % (3.1415, 1.5)

print(s2)

value1 = 3.14. value2 = 1


In [102]:
"apple" + "ball"

'appleball'

In [101]:
# alternative, more intuitive way of formatting a string 
s3 = 'value1 = {1}, value2 = {0}'.format(3.1415, "test")

print(s3)

value1 = test, value2 = 3.1415
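
The replacement fields of `format` also accept a format specification after a colon, which controls precision, notation and column width. A small sketch with arbitrary sample values:

```python
import math

fixed = "{:.3f}".format(math.pi)          # three decimals
scientific = "{:.2e}".format(12345.678)   # scientific notation, two decimals
# named fields, right-aligned in columns of width 8
row = "{name:>8}|{value:8.2f}|".format(name="pi", value=math.pi)

print(fixed)
print(scientific)
print(row)
```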


### List

Lists are very similar to strings, except that each element can be of any type.

The syntax for creating lists in Python is `[...]`:

"A list is an **ordered**, indexable collection of data."

- created using the square brackets `[]`
- created using functions like `range` and `list` (in Python 3, wrap `range` in `list(...)`); `xrange` (Python 2 only) and NumPy's `arange` and `linspace` produce list-like sequences and arrays rather than lists
- List methods
    - `.append, .remove, .pop, .reverse, .sort`
- Subsetting lists using integer indexes (and slices)
- Lists are **mutable**, i.e., they can be modified once declared.
- Finding things inside lists - using `in` and `.index()`
- Lists as iterables

In [103]:
l = [1,2,3,4]

print(type(l))
print(l)

<type 'list'>
[1, 2, 3, 4]


We can use the same slicing techniques to manipulate lists as we could use on strings:

In [104]:
print(l)

print(l[1:3])

print(l[::2])

[1, 2, 3, 4]
[2, 3]
[1, 3]


In [105]:
l[0]

1

Elements in a list do not all have to be of the same type:

In [106]:
l = [1, 'a', 1.0, 1-1j]

print(l)

[1, 'a', 1.0, (1-1j)]


Python lists can be inhomogeneous and arbitrarily nested:

In [112]:
nested_list = [1, [2, [3, [4, [5]]]]]

nested_list
nested_list[1][1][1][0]

4

In [113]:
nested_list[1][1][0]

3

Lists play a very important role in Python. For example they are used in loops and other flow control structures (discussed below). There are a number of convenient functions for generating lists of various types, for example the `range` function:

In [115]:
start = 10
stop = 30
step = 2

list(range(start, stop, step))

[10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

In [None]:
# in Python 3, range returns a lazy range object, which can be converted to a list using 'list(...)'.
# In Python 2, range already returns a list, so the conversion has no effect.
list(range(start, stop, step))

In [116]:
list(range(-10, 10))

[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [None]:
s

In [117]:
# convert a string to a list by type casting:
s2 = list(s)

s2

['A',
 'n',
 'a',
 'l',
 'y',
 't',
 'i',
 'x',
 'l',
 'a',
 'b',
 's',
 ' ',
 'i',
 's',
 ' ',
 'p',
 'r',
 'e',
 'm',
 'i',
 'e',
 'r',
 ' ',
 'c',
 'a',
 'p',
 'a',
 'b',
 'i',
 'l',
 'i',
 't',
 'y',
 ' ',
 'b',
 'u',
 'i',
 'l',
 'd',
 'i',
 'n',
 'g',
 ' ',
 'c',
 'o',
 'm',
 'p',
 'a',
 'n',
 'y']

In [118]:
# sorting lists
s2.sort()

print(s2)

[' ', ' ', ' ', ' ', ' ', 'A', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'e', 'e', 'g', 'i', 'i', 'i', 'i', 'i', 'i', 'i', 'l', 'l', 'l', 'l', 'm', 'm', 'n', 'n', 'n', 'o', 'p', 'p', 'p', 'r', 'r', 's', 's', 't', 't', 'u', 'x', 'y', 'y', 'y']
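
The `sort` method changes the list in place, and by default capital letters sort before lower-case ones. The built-in `sorted` returns a new list instead, and both accept a `key` function. A sketch with a hypothetical list of words:

```python
fruits = ["banana", "Cherry", "apple"]

ordered = sorted(fruits)                 # new list; capital letters sort first
folded = sorted(fruits, key=str.lower)   # case-insensitive ordering
unchanged = list(fruits)                 # sorted() left the original untouched

fruits.sort(reverse=True)                # in place; the method itself returns None

print(ordered, folded, unchanged, fruits)
```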


#### Adding, inserting, modifying, and removing elements from lists

In [130]:
# create a new empty list
l = []

# add elements using `append`
l.append("A")
l.append("d")
l.append("d")

print(l)

['A', 'd', 'd']


We can modify lists by assigning new values to elements in the list. In technical jargon, lists are *mutable*.

In [131]:
l[1] = "p"
l[2] = "p"

print(l)

['A', 'p', 'p']


In [132]:
l[1:3]

['p', 'p']

In [134]:
l[1:3] = ["x"]

print(l)

['A', 'x']


Insert an element at a specific index using `insert`:

In [135]:
l.insert(0, "i")
l.insert(1, "n")
l.insert(2, "s")
l.insert(3, "e")
l.insert(4, "r")
l.insert(5, "t")

print(l)

['i', 'n', 's', 'e', 'r', 't', 'A', 'x']


Remove the first element with a specific value using `remove`:

In [136]:
l.remove("A")

print(l)

['i', 'n', 's', 'e', 'r', 't', 'x']


Remove an element at a specific location using `del`:

In [139]:
del l[1]
del l[2]

print(l)

['i', 's', 'r', 't', 'x']


See `help(list)` for more details, or read the online documentation 
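
A few of the remaining list operations in one place, as a sketch that starts from a fresh example list so that it does not depend on the cells above:

```python
l = ["i", "n", "s", "e", "r", "t"]

last = l.pop()           # removes and returns the last element
first = l.pop(0)         # removes and returns the element at index 0
l.reverse()              # reverses the list in place
has_s = "s" in l         # membership test
where_s = l.index("s")   # position of the first match; raises ValueError if absent

print(last, first, l, has_s, where_s)
```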

### Tuples

Tuples are like lists, except that they cannot be modified once created, that is they are *immutable*. 

In Python, tuples are created using the syntax `(..., ..., ...)`, or even `..., ...`:

In [140]:
point = (10, 20)

print(point, type(point))

((10, 20), <type 'tuple'>)


In [141]:
point = 10, 20

print(point, type(point))

((10, 20), <type 'tuple'>)


We can unpack a tuple by assigning it to a comma-separated list of variables:

In [145]:
x, y = point
print(x)
print(y)

10
20

If we try to assign a new value to an element in a tuple we get an error:

In [147]:
type(point)

tuple

In [146]:
point[0] = 20

TypeError: 'tuple' object does not support item assignment

### Dictionaries

Dictionaries are also like lists, except that each element is a key-value pair. The syntax for dictionaries is `{key1 : value1, ...}`:

---
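- **unordered** in Python versions before 3.7 (as in the outputs shown here); since Python 3.7, dictionaries preserve insertion order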
- declared using the curly braces `{ }`
- Data exists in the form of `key-value` pairs

In [149]:
print(type(l),type(point))

(<type 'list'>, <type 'tuple'>)


In [155]:
# keys must be hashable: the tuple `point` is accepted, the list `l` is not
doc_info = {
    point : "Prasad",
    l : "Data Science in Python",
    "chapters" : 10,
    "tags" : ["#data", "#science", "#datascience", "#python", "#analysis"]
}

TypeError: unhashable type: 'list'

In [154]:
doc_info

{'chapters': 10,
 'l': 'Data Science in Python',
 'tags': ['#data', '#science', '#datascience', '#python', '#analysis'],
 (10, 20): 'Prasad'}
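
The `TypeError` above arises because dictionary keys must be hashable: immutable objects such as strings, numbers and tuples qualify, while lists do not. A small sketch (the names `grid` and `coords` are hypothetical), including the common workaround of converting a list to a tuple:

```python
grid = {}

grid[(0, 1)] = "tuple keys work"      # tuples of hashable items are hashable

coords = [2, 3]
try:
    grid[coords] = "list keys fail"   # lists are mutable, hence unhashable
except TypeError as e:
    print("TypeError:", e)

grid[tuple(coords)] = "converted to a tuple first"
print(grid)
```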

In [160]:
doc_info = {
"author" : ["Prasad", "Prasad"],
"title" : "Data Science in Python",
"chapters" : 10,
"tags" : ["#data", "#science", "#datascience", "#python", "#analysis"],

}

In [161]:
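print(doc_info)

{'tags': ['#data', '#science', '#datascience', '#python', '#analysis'], 'title': 'Data Science in Python', 'chapters': 10, 'author': ['Prasad', 'Prasad']}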

In [162]:
doc_info.keys()

['tags', 'title', 'chapters', 'author']

In [163]:
doc_info.values()

[['#data', '#science', '#datascience', '#python', '#analysis'],
 'Data Science in Python',
 10,
 ['Prasad', 'Prasad']]

In [159]:
for k, v in doc_info.items():  # iteritems() exists only in Python 2
    print(k,v)

('tags', ['#data', '#science', '#datascience', '#python', '#analysis'])
((10, 20), 'Prasad')
('chapters', 10)
('l', 'Data Science in Python')


In [None]:
for k, v in doc_info.items():
    print("Key: {0} ; Value = {1}".format(k,v))

In [None]:
# Assigning to a key adds the pair, or overwrites the value if the key already exists
doc_info['author'] = ['Prasad', "Prasad", "Mouli"]
doc_info['chapters']= [20,10,20]

In [None]:
doc_info

In [164]:
doc_info['tags']

['#data', '#science', '#datascience', '#python', '#analysis']

In [165]:
doc_info.keys()

['tags', 'title', 'chapters', 'author']

In [166]:
doc_info.pop('chapters')

10

In [167]:
doc_info.get('title', 'NA')

'Data Science in Python'

In [168]:
doc_info.keys()

['tags', 'title', 'author']

In [169]:
d={}

In [170]:
d

{}

In [171]:
type(d)

dict

In [172]:
d["Name"]="Alabs"; d["Location"]="Ggn"

In [173]:
d

{'Location': 'Ggn', 'Name': 'Alabs'}

In [174]:
d['Name']

'Alabs'

In [175]:
params = {"parameter1" : 1.0,
          "parameter2" : 2.0,
          "parameter3" : 3.0,}

print(type(params))
print(params)

<type 'dict'>
{'parameter1': 1.0, 'parameter3': 3.0, 'parameter2': 2.0}


In [176]:
print("parameter1 = " + str(params["parameter1"]))
print("parameter2 = " + str(params["parameter2"]))
print("parameter3 = " + str(params["parameter3"]))

parameter1 = 1.0
parameter2 = 2.0
parameter3 = 3.0


In [177]:
params["parameter1"] = "A"
params["parameter2"] = "B"

# add a new entry
params["parameter4"] = "D"

print("parameter1 = " + str(params["parameter1"]))
print("parameter2 = " + str(params["parameter2"]))
print("parameter3 = " + str(params["parameter3"]))
print("parameter4 = " + str(params["parameter4"]))

parameter1 = A
parameter2 = B
parameter3 = 3.0
parameter4 = D


In [178]:
#Sets
s1=set([1,2,2,3,4,4,4,4,5,5,5])

In [179]:
s1

{1, 2, 3, 4, 5}

In [180]:
s1.add(10)

In [181]:
s1

{1, 2, 3, 4, 5, 10}

In [182]:
s1.union([1,10,20])

{1, 2, 3, 4, 5, 10, 20}

In [183]:
s1.intersection([1,2,4,20,30,40])

{1, 2, 4}

In [184]:
s2=set([1,20,20,30,40])

In [185]:
print (s1); print (s2)

set([1, 2, 3, 4, 5, 10])
set([40, 1, 20, 30])


In [186]:
s1.union(s2)

{1, 2, 3, 4, 5, 10, 20, 30, 40}
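
The same set operations are also available as operators. A brief sketch, rebuilding `s1` and `s2` with the values they hold at this point:

```python
s1 = {1, 2, 3, 4, 5, 10}
s2 = {1, 20, 30, 40}

print(s1 | s2)   # union, same as s1.union(s2)
print(s1 & s2)   # intersection, same as s1.intersection(s2)
print(s1 - s2)   # difference: elements of s1 that are not in s2
print(s1 ^ s2)   # symmetric difference: elements in exactly one of the sets

# union() and | return a new set; s1 itself is unchanged
print(20 in s1)
```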

## Control Flow

### Conditional statements: if, elif, else

The Python syntax for conditional execution of code uses the keywords `if`, `elif` (else if), `else`:

> ### Basic Syntax

    if (condition 1):
        action 1
    elif (condition 2):
        action 2
        ...
    elif (condition n):
        action n
        ...
    else:
        alternative
        
        
> ### Ternary if-then-else        
        
    action if condition else alternative   

In [None]:
statement1 = False
statement2 = False

if statement1:
    print("statement1 is True")
    
elif statement2:
    print("statement2 is True")
    
else:
    print("statement1 and statement2 are False")

#### Examples:

In [None]:
statement1 = statement2 = True

if statement1:
    if statement2:
        print("both statement1 and statement2 are True")

In [None]:
# Bad indentation!
if statement1:
    if statement2:
    print("both statement1 and statement2 are True")  # this line is not properly indented

In [None]:
name = 'Prasad'

In [None]:
if (name == 'Prasad'):
    print('Hi Prasad')

In [None]:
if name != 'Alice':
    print ('You are not Alice')

In [None]:
if name == 'John':
    print('How are you?')
else:
    print('Nice to meet you.')

In [None]:
name = 'Joe'
age = 15
lastname = 'Doc'

In [None]:
if (name == 'John'):
    print('How are you?')
elif age < 18:
    print("You're just a teenager")
elif (lastname == 'Doe'):
    print("Never heard that name before")
else:
    print('you do not qualify')

#### Ternary if-then-else example

In [None]:
x = 5; y = 13

1 if (x > y) else 0

In [None]:
'How are you' if name == 'John' else 'Who are you again?'

In [None]:
name='Prasad'

In [None]:
if name == 'John':
    print ('How are you')
else:
    print ('Who are you again?')

In [None]:
'How are you' if name=='John' else 'Who are you?'

## Loops

In Python, loops can be programmed in a number of different ways. The most common is the `for` loop, which is used together with iterable objects, such as lists.

### **`for` loops**:
> #### Generating numbers for iterating over using <br><br> `range(), arange(), linspace(), list()`

In [None]:
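# Python's built-in range function - gives integers only
list(range(0, 10, 2))

# In Python 3, range() returns a lazy range object; list() makes the values visible
# (in Python 2, range() returned a list directly)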

In [None]:
# arange from the numpy module - gives fractional numbers as well
import numpy as np
np.arange(0, 1, .2)

# arange() => numpy array

In [None]:
list('abcde')

### lists are iterables

In [None]:
for x in [1,2,3]:
    print(x)

The `for` loop iterates over the elements of the supplied list, and executes the containing block once for each element. Any iterable, such as a list, a `range` or a string, can be used in the `for` loop. For example:

In [None]:
for x in range(4): # by default range starts at 0
    print(x)

Note: `range(4)` does not include 4 !

In [None]:
for x in range(-3,3):
    print(x)

In [None]:
for word in ["scientific", "computing", "with", "python"]:
    print(word)

To iterate over key-value pairs of a dictionary:

In [None]:
for key, value in params.items():
    print(key + " = " + str(value))

Sometimes it is useful to have access to the indices of the values when iterating over a list. We can use the `enumerate` function for this:

In [None]:
for idx, x in enumerate(range(-3,3)):
    print(idx, x)

In [None]:
for i in range(6):
    print( '*' * i)


for i in range(5, 0, -1):
    print ('*' * i )

### List comprehensions: Creating lists using `for` loops:

A convenient and compact way to initialize lists:

In [None]:
l1 = [x**2 for x in range(0,5)]

print(l1)
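
For comparison, here is the explicit loop that the comprehension above replaces, plus a variant with a filtering condition (the names `l2` and `even_squares` are hypothetical):

```python
# Explicit loop equivalent to [x**2 for x in range(0, 5)]
l2 = []
for x in range(0, 5):
    l2.append(x**2)
print(l2)

# A trailing `if` keeps only the elements that satisfy the condition
even_squares = [x**2 for x in range(0, 5) if x % 2 == 0]
print(even_squares)
```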

### `while` loops:

In [None]:
i = 0

while i < 5:
    print(i)
    
    i = i + 1
    
print("done")

Note that the `print("done")` statement is not part of the `while` loop body because of the difference in indentation.

---
### Searching for an object belonging to a collection is faster for sets and dicts than lists

In [None]:
odd = [i**3 for i in range(1,100) if(i%2!=0)]; print(odd)

In [None]:
set_1 = {1, 2, 3, 1, 2, 5, 6, 7}; set_1

In [None]:
%%timeit 
5 in set_1

In [None]:
list_1 = [1,2,3,4,510,20,5,30,4,5,3,20]

In [None]:
%%timeit
5 in list_1

In [201]:
dict_1 = {k:v for k, v in zip(range(1, 8), list('abcdefg'))}
dict_1

{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e', 6: 'f', 7: 'g'}

In [204]:
%%timeit
5 in dict_1

10000000 loops, best of 3: 43.2 ns per loop


In [206]:
list_1 = list(range(10000))  # an actual list; a Python 3 range object has its own fast membership test
set_1 = set(list_1)

In [207]:
len(set_1)

10000

In [208]:
%timeit 1123 in list_1

100000 loops, best of 3: 14.8 µs per loop


In [209]:
%timeit 1123 in set_1

The slowest run took 50.58 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 61.3 ns per loop
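
`%timeit` is an IPython magic and is not available in a plain Python script. Outside a notebook, a comparable measurement can be sketched with the standard-library `timeit` module. The collection size and repetition count below are arbitrary choices, the names `data_list`, `data_set`, `t_list` and `t_set` are hypothetical, and absolute timings will vary by machine:

```python
import timeit

data_list = list(range(10000))
data_set = set(data_list)

# Time 1000 membership tests against each collection
t_list = timeit.timeit(lambda: 9999 in data_list, number=1000)
t_set = timeit.timeit(lambda: 9999 in data_set, number=1000)

print("list: {:.6f} s, set: {:.6f} s".format(t_list, t_set))
```

A list is scanned element by element, whereas sets and dictionaries use hashing, so their lookup time stays roughly constant as the collection grows.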


## Functions

A function in Python is defined using the keyword `def`, followed by a function name, a signature within parentheses `()`, and a colon `:`. The following code, with one additional level of indentation, is the function body.

A function typically does three things:
1. Takes zero or more arguments
2. Passes them through the body of the function
3. Returns an object (`None` if there is no explicit `return` statement)

Syntax

Function Definition:

```
def func_name(parameters):
    body of the function
    return something
```

Function Call:

`func_name(arguments)`

In [210]:
def udf1(a):
    v=a**2+2*a+10
    return v

In [None]:
udf1(2)

In [None]:
def func0():   
    print("test")

In [None]:
func0()

Optionally, but highly recommended, we can define a so-called "docstring", which is a description of the function's purpose and behavior. The docstring should follow directly after the function definition, before the code in the function body.

In [None]:
def func1(s):
    """
    Print a string 's' and tell how many characters it has    
    """
    
    print(s + " has " + str(len(s)) + " characters")

In [None]:
help(func1)

In [None]:
func1("test")

Functions that return a value use the `return` keyword:

In [None]:
def square(x):
    """
    Return the square of x.
    """
    return x ** 2

In [None]:
square(4)

We can return multiple values from a function using tuples (see above):

In [None]:
def powers(x):
    """
    Return a few powers of x.
    """
    return (x ** 2, x ** 3, x ** 4)

In [None]:
powers(3)

In [None]:
x2, x3, x4 = powers(3)

print(x3)

### Default argument and keyword arguments

In a definition of a function, we can give default values to the arguments the function takes:

In [198]:
def myfunc(x, p=2, debug=False):
    if debug:
        print("evaluating myfunc for x = " +\
              str(x) + " using exponent p = " + str(p))
    return x**p

In [199]:
print(myfunc(5, debug= True))

evaluating myfunc for x = 5 using exponent p = 2
25


In [200]:
myfunc(5, debug=True)

evaluating myfunc for x = 5 using exponent p = 2


25

If we explicitly list the names of the arguments in the function call, they do not need to come in the same order as in the function definition. These are called *keyword* arguments, and they are often very useful in functions that take a lot of optional arguments.

In [192]:
myfunc(p=3, debug=True, x=7)

evaluating myfunc for x = 7 using exponent p = 3


343

#### Functions are Objects

In [211]:
def add_one(num):
    return num + 1

print (add_one(99))


def add_two(n):
    return n+2

print(add_two(998))


100
1000


In [212]:
list_of_funcs = [add_one, add_two]

print (list_of_funcs[0](49))
print (list_of_funcs[1](48))

50
50
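
Since functions are ordinary objects, they can also be stored in dictionaries or passed to other functions. A small sketch (the names `operations` and `apply_twice` are hypothetical):

```python
def add_one(num):
    return num + 1

def add_two(n):
    return n + 2

# Look up a function by name and call it
operations = {"one": add_one, "two": add_two}
print(operations["two"](40))

# Pass a function as an argument to another function
def apply_twice(func, value):
    return func(func(value))

print(apply_twice(add_one, 5))
```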


### Unnamed functions (lambda function)

In Python we can also create unnamed functions, using the `lambda` keyword. Lambda functions:

- do not have a name
- are temporary in nature and intent
- are typically used once and then discarded

In [213]:
def squarer(x):
    """
    This function takes a number and returns its square
    """
    return x**2

In [214]:
squarer(10)

100

In [215]:
squarer?

In [216]:
type(kx)

NameError: name 'kx' is not defined

In [217]:
f1 = lambda kx: kx**2

In [218]:
type(f1)

function

In [219]:
f1 = lambda x: x**2
    
# is equivalent to 

def f2(x):
    return x**2

In [220]:
f1(2), f2(2)

(4, 4)

This technique is useful for example when we want to pass a simple function as an argument to another function, like this:

In [221]:
# map is a built-in Python function; in Python 2 it returns a list, as shown here
map(lambda x: x**2, range(-3,4))

[9, 4, 1, 0, 1, 4, 9]

In [222]:
# in python 3 we can use `list(...)` to convert the iterator to an explicit list
list(map(lambda x: x**2, range(-3,4)))

[9, 4, 1, 0, 1, 4, 9]
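
Another common place for a lambda is the `key` argument of the built-in `sorted` function. A brief sketch with made-up data (the names `words` and `by_length` are hypothetical):

```python
words = ["scientific", "computing", "with", "python"]

# Sort by word length instead of alphabetically
by_length = sorted(words, key=lambda w: len(w))
print(by_length)
```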

# Lambda Functions with `map(), filter(), reduce()`

> ### `map` <br>
    map(function that will transform each element of a sequence, sequence) -> iterator (a list in Python 2)

In [223]:
list(map(lambda x: x + 10, range(10)))

[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

---
> ### `reduce` <br>
    reduce(function that works with pairs of values, sequence) -> value

In [225]:
import functools
functools.reduce(lambda x, y: x + y, range(50, 55))

260
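
To make the pairwise behaviour concrete, `reduce` with an addition lambda does the same work as the accumulator loop below (the names `reduced` and `total` are hypothetical):

```python
import functools

# reduce folds the sequence from the left: ((((50 + 51) + 52) + 53) + 54)
reduced = functools.reduce(lambda x, y: x + y, range(50, 55))

# Explicit loop with an accumulator that produces the same sum
total = 0
for value in range(50, 55):
    total = total + value

print(reduced, total)
```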

---
> ### `filter` <br>
    filter(function that returns a bool, sequence) -> iterator (a list in Python 2)

In [None]:
list(filter(lambda x: x % 2 == 0, range(10)))

---
## `zip()` and `enumerate()` 

> #### `zip()` - Returns an iterator of tuples (a list in Python 2), where each tuple contains the i-th element from each of the argument sequences. The result is truncated to the length of the shortest argument sequence.

> #### `enumerate()` - Returns an iterator of `(index, value)` pairs for a given sequence

In [None]:
seq1 = list('abcde'); seq2 = range(5); print(seq1); print(seq2);
print(len(seq1) == len(seq2))

In [None]:
zipped_list = []
for i in range(5):
    zipped_list.append((seq1[i], seq2[i]))

print(seq1)
print(seq2)
print(zipped_list)

In [None]:
list(zip(seq1, seq2))

In [None]:
list(zip([1, 2, 3], (4, 5, 6), ['a', 'b', 'c']))

#### Zipping lists of different length

In [None]:
print(list(range(1, 7)))
print(list('abcdefghi'))
print()
print(list(zip(range(1, 7), list('abcdefghi'))))

In [None]:
enumerate?

In [None]:
str_1 = "Enumerate vs. len + for loop"; print(str_1); print(list(enumerate(str_1)))

In [None]:
for i in range(len(str_1)):
    if i % 2 == 0:
        print(str_1[i])

In [None]:
for i, j in enumerate(str_1):
    if i % 2 == 0:
        print(j, end = '')

In [None]:
for i, j in enumerate(list('theskyisgray')):
    print (i, j)

In [None]:
for i,j in enumerate('abcde'):
    print (i, j)

---
# Dictionary Comprehensions

> Condenses loops and if-then routines in a single line

Used to

- Create new lists and dictionaries
- Filter existing lists and dictionaries

Create a dictionary using the syntax:

`{key:value for key, value in <iterator of k-v pairs> if <condition>}`


---
```
result = []
for x in iterable:
    if condition(x):
        result.append(f(x))
```

can be condensed into

```
result = [f(x) for x in iterable if condition(x)]
```

- This is what we call a **List** comprehension

---
```
for k, v in zip(seqK, seqV):
    if condition on k or v:
        dict_1[k] = f(v)
```

is condensed into

`{k: f(v) for k, v in zip(seqK, seqV) if <condition on k or v>}`

- And this is known as a **Dictionary** comprehension
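
A concrete instance of the pattern above, with hypothetical names `keys`, `values`, `loop_result` and `comp_result`; both versions keep only the pairs with an even value and store its square:

```python
keys = list("abcde")
values = range(1, 6)

# Loop version
loop_result = {}
for k, v in zip(keys, values):
    if v % 2 == 0:
        loop_result[k] = v**2

# Dictionary comprehension version
comp_result = {k: v**2 for k, v in zip(keys, values) if v % 2 == 0}

print(loop_result)
print(comp_result)
```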

---

### List Comprehensions

### Squares of the even numbers between 1 and 20

In [None]:
%%timeit
list_1 = [n**2 for n in range(1, 21) if n % 2 == 0]

In [None]:
%%timeit
list_2 = []
for n in range(1, 21):
    if n % 2 == 0:
        list_2.append(n**2)

### Convert the consonants of a given string to uppercase

In [None]:
alphabets = list('abcdefghijklmnopqrstuvwxyz')
vowels = set('aeiou')
print ([a.upper() for a in alphabets if a not in vowels])

### Dictionary Comprehensions

In [None]:
list(zip(range(1, 10), 
    range(11, 20), 
    range(21, 30)))

In [None]:
for i,j in enumerate('abcdef'):
    print (i, j)

In [None]:
nums = range(9)
chars = list('abcdefghi')

len(nums) == len(chars)

kv = zip(chars, nums)
print (dict(kv))

In [None]:
dict_1 = {k:v for k, v in zip(chars, nums)}

In [None]:
print (dict_1)

In [None]:
{j:i for i, j in enumerate('eggscoffeebacontoast')}

In [None]:
# Using Zip
print ({k:v for k, v in zip(range(5), 'abcde')})

# Using Enumerate
print ({k:v for k, v in enumerate('abcde')})

---
## Comparison with Comprehensions

- Comprehensions are usually slightly faster than the equivalent `filter` call with a lambda

In [None]:
%timeit list(filter(lambda x: x % 2 == 0, range(10)))

In [None]:
%timeit [x for x in range(10) if x % 2 == 0]

## Classes

Classes are the key features of object-oriented programming. A class is a structure for representing an object and the operations that can be performed on the object. 

In Python a class can contain *attributes* (variables) and *methods* (functions).

A class is defined almost like a function, but using the `class` keyword, and the class definition usually contains a number of class method definitions (a function in a class).

* Each class method should have an argument `self` as its first argument. This object is a self-reference.

* Some class method names have special meaning, for example:

    * `__init__`: The name of the method that is invoked when the object is first created.
    * `__str__` : A method that is invoked when a simple string representation of the class is needed, as for example when printed.
    * There are many more, see http://docs.python.org/2/reference/datamodel.html#special-method-names

In [None]:
class Point:
    """
    Simple class for representing a point in a Cartesian coordinate system.
    """
    
    def __init__(self, x, y):
        """
        Create a new Point at x, y.
        """
        self.x = x
        self.y = y
        
    def translate(self, dx, dy):
        """
        Translate the point by dx and dy in the x and y direction.
        """
        self.x += dx
        self.y += dy
        
    def __str__(self):
        return("Point at [%f, %f]" % (self.x, self.y))

To create a new instance of a class:

In [None]:
p1 = Point(0, 0) # this will invoke the __init__ method in the Point class

print(p1)         # this will invoke the __str__ method

In [None]:
p1.translate(5,10)

In [None]:
print( p1)

To invoke a class method on the class instance `p1`:

In [None]:
p2 = Point(1, 1)

p1.translate(0.25, 1.5)

print(p1)
print(p2)

Note that calling class methods can modify the state of that particular class instance, but does not affect other class instances or any global variables.

That is one of the nice things about object-oriented design: code such as functions and related variables are grouped in separate and independent entities. 
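
The independence of instances, together with two further special methods, can be sketched as follows. The class below is a reduced, hypothetical variant of `Point` (named `Point2D` to avoid clashing with the class above), with `__repr__` and `__eq__` added for illustration:

```python
class Point2D:
    """Hypothetical variant of Point with two extra special methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def translate(self, dx, dy):
        self.x += dx
        self.y += dy

    def __repr__(self):
        # Used by the interactive prompt and inside containers such as lists
        return "Point2D({!r}, {!r})".format(self.x, self.y)

    def __eq__(self, other):
        # Invoked by the == operator
        return isinstance(other, Point2D) and (self.x, self.y) == (other.x, other.y)


a = Point2D(0, 0)
b = Point2D(0, 0)
print(a == b)        # compares coordinates, not object identity

a.translate(1, 2)    # only `a` changes; `b` keeps its own state
print([a, b])
print(a == b)
```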

## Modules

One of the most important concepts in good programming is to reuse code and avoid repetitions.

The idea is to write functions and classes with a well-defined purpose and scope, and reuse these instead of repeating similar code in different parts of a program (modular programming). The result is usually that the readability and maintainability of a program are greatly improved. What this means in practice is that our programs have fewer bugs and are easier to extend, debug and troubleshoot.

Python supports modular programming at different levels. Functions and classes are examples of tools for low-level modular programming. Python modules are a higher-level modular programming construct, where we can collect related variables, functions and classes in a module. A python module is defined in a python file (with file-ending `.py`), and it can be made accessible to other Python modules and programs using the `import` statement. 

Consider the following example: the file `mymodule.py` contains simple example implementations of a variable, function and a class:

In [None]:
%%file mymodule.py
"""
Example of a python module. Contains a variable called my_variable,
a function called my_function, and a class called MyClass.
"""

my_variable = 0

def my_function():
    """
    Example function
    """
    return my_variable
    
class MyClass:
    """
    Example class.
    """

    def __init__(self):
        self.variable = my_variable
        
    def set_variable(self, new_value):
        """
        Set self.variable to a new value
        """
        self.variable = new_value
        
    def get_variable(self):
        return self.variable

We can import the module `mymodule` into our Python program using `import`:

In [None]:
import mymodule

Use `help(module)` to get a summary of what the module provides:

In [None]:
help(mymodule)

In [None]:
mymodule.my_variable

In [None]:
mymodule.my_function() 

In [None]:
my_class = mymodule.MyClass() 
my_class.set_variable(10)
my_class.get_variable()

If we make changes to the code in `mymodule.py`, we need to reload it. In Python 3 use `importlib.reload` (in Python 2, `reload` was a built-in function):

In [None]:
import importlib
importlib.reload(mymodule)

## Exceptions

In Python, errors are managed with a special language construct called "exceptions". When an error occurs, an exception can be raised, which interrupts the normal program flow and transfers control to the closest enclosing try-except statement.

To generate an exception we can use the `raise` statement, which takes an argument that must be an instance of the class `BaseException` or a class derived from it. 

In [None]:
raise Exception("description of the error")

A typical use of exceptions is to abort functions when some error condition occurs, for example:

    def my_function(arguments):
    
        if not verify(arguments):
            raise Exception("Invalid arguments")
        
        # rest of the code goes here

To gracefully catch errors that are generated by functions and class methods, or by the Python interpreter itself, use the `try` and  `except` statements:

    try:
        # normal code goes here
    except:
        # code for error handling goes here
        # this code is not executed unless the code
        # above generated an error

For example:

In [None]:
try:
#     print("test")
#     generate an error: the variable test is not defined
    print(test)
except:
    print("Caught an exception")

To get information about the error, we can access the `Exception` class instance that describes the exception by using for example:

    except Exception as e:

In [None]:
try:
    print("test")
    # generate an error: the variable test is not defined
    print(test)
except Exception as e:
    print("Caught an exception:" + str(e))
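
A bare `except:` catches every error, including ones that were not anticipated. A common refinement is to catch specific exception types, optionally with `else` and `finally` clauses. A minimal sketch (the function name `parse_number` is hypothetical):

```python
def parse_number(text):
    """Convert text to a float, returning None if that is not possible."""
    try:
        value = float(text)
    except ValueError as e:
        # Raised by float() for strings that are not numbers
        print("Caught a ValueError: " + str(e))
        return None
    else:
        # Runs only if the try block raised no exception
        return value
    finally:
        # Runs in every case, even when a return statement was executed
        print("finished parsing " + repr(text))


print(parse_number("123.5"))
print(parse_number("Prasad"))
```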

### Example 

In [None]:
def div_by(a, b):
    try:
        return a/float(b)
    except:
        return 'Invalid Input'

In [None]:
div_by(1245, 2)

In [None]:
div_by(0, 4)

In [None]:
div_by(4,0)

In [None]:
float?

In [None]:
float('123.5')

In [None]:
float('Prasad')

In [None]:
float(1)
float('123.45')