<a href="https://colab.research.google.com/github/schandrase/Comp-chem-course/blob/main/Copy_of_IntroToPython.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<div class="alert alert-block alert-info">
    
# Introduction to Python
    
</div>

## Python is great!

An example [Fortran](https://en.wikipedia.org/wiki/Fortran) code to compute the factorial of 4.
```fortran
program factorial  
implicit none  

   ! define variables
   integer :: n
   integer :: nfact = 1
   
   ! compute factorials   
   do n = 1, 4
      nfact = nfact * n 
      ! print values
      print*,  n, " ", nfact   
   end do 
   
end program factorial
```

Now with a recursive function in Python:

In [None]:
def factorial_recursive(n):
     return 1 if (n==1 or n==0) else n * factorial_recursive(n - 1)
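
For a closer comparison with the Fortran loop, here is a minimal iterative sketch in Python. The name `factorial_iterative` is our own choice for illustration and is not part of the original material.

```python
def factorial_iterative(n):
    """Compute n! with an explicit loop, mirroring the Fortran example."""
    nfact = 1
    for k in range(1, n + 1):
        nfact = nfact * k
    return nfact


print(factorial_iterative(4))  # 24
```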

Evaluate cells by pressing `Shift + Enter`

In [None]:
factorial_recursive(4)  - 4*3*2*1

In [None]:
varb = 5
print(varb)


5


In [None]:
varb 

5

In [None]:
import math

math.factorial(4)

In [None]:
%timeit -n 10_000 factorial_recursive(4)

In [None]:
%timeit -n 10_000 math.factorial(4)

What makes Python great?

- Dynamic language (no compiling)
- Interactive (Jupyter and Colab notebooks)
- Lots of packages
- Great **community**

In [None]:
# Zen of Python
import this

The material for this lecture is collected primarily from: [Python Programming And Numerical Methods: A Guide For Engineers And Scientists](https://pythonnumericalmethods.berkeley.edu/notebooks/Index.html)
and [Python Scripting for Computational Molecular Science](https://education.molssi.org/python_scripting_cms/)

## Basic Syntax

### Python as a calculator

In [None]:
7 + 11

The usual order of operations is respected. You can also use parentheses to ensure a certain order.

In [None]:
(3*4)/(2**2 + 4/2)

You can use `_` to refer to the output of the previous computation.

In [None]:
_*4

Complex numbers are written with a `j` suffix on the imaginary part, or created with the built-in `complex` function.

In [None]:
(3+4j)

In [None]:
complex(3,4)

In [None]:
abs(3+4j)

Basic functions such as `sin`, `cos`, `exp`, `log`, and `sqrt` can be found in various packages, for example the `math` module we imported above. Try `TAB` completion (`Control+SPACE` in Colab) to see what's in a package.

In [None]:
math.

There are different ways to get information about a certain function.

In [None]:
math.factorial?

In [None]:
help(math.factorial)
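
As a small illustration of the basic functions mentioned above, the following sketch calls a few of them from the `math` module. The input values are arbitrary choices for demonstration.

```python
import math

print(math.sqrt(16))   # 4.0
print(math.exp(0))     # 1.0
print(math.cos(0))     # 1.0

# Floating-point results are best compared with a tolerance
print(math.isclose(math.sin(math.pi / 2), 1.0))  # True
print(math.isclose(math.log(math.e), 1.0))       # True
```

Note that the trigonometric functions in `math` expect angles in radians.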

### Variables and assignments

The assignment operator, denoted by the `=` symbol, assigns values to variables. The line `var_a=3` takes the known value, 3, and assigns that value to the variable `var_a`.

Don't confuse assignment with equality! Assignment is not symmetric, whereas equality is.

In [None]:
var_a = 3
var_b = 7

You can do algebra with variables as usual

In [None]:
var_a + var_b

or assign the result of a calculation to a new variable

In [None]:
var_sum = var_a + var_b
var_sum

Variables in Python come in many forms. Below are some examples.

In [None]:
"string"

In [None]:
an_integer = 42 # Just an integer
a_float = 0.1 # A non-integer number, up to a fixed precision
a_boolean = True # A value that can be True or False
a_string = '''just enclose text between two 's, or two "s, or do what we did for this string''' # Text
none_of_the_above = None # The absence of any actual value or variable type

Equality is tested with `==`.

In [None]:
var_sum == var_a + var_b

In [None]:
var_a == var_b

We can assign the value of one variable to another

In [None]:
var_a = var_b
print("Variable a is: ",var_a, "\nVariable b is: ", var_b)

We can assign multiple variables in one line

In [None]:
var_a, var_b = 3, 7

In [None]:
var_a + var_b

You can clear a variable using `del`

In [None]:
del var_a

In [None]:
var_a

One more difference between mathematical equality and computational assignment: the expression below doesn't make sense mathematically, but is common in computation

In [None]:
var_b = var_b + 1

It's so common that there is a short form for the increment

In [None]:
var_b += 1

In [None]:
var_b

List all the variables in this notebook

In [None]:
%whos

### Lists, tuples, sets, and dictionaries

**Lists** are groups of objects (values, variables, even functions). We construct them using square brackets [] with objects separated by commas.

In [None]:
values = [7, 11, 13, 17, 19, 23, 29]
type(values)

list

In [None]:
varb = True
type(varb)

bool

**Counting starts at 0.**

In [None]:
values[2]

13

In [None]:
values[0], values[1], values[2], values[3], values[4], values[5], values[6]

(7, 11, 13, 17, 19, 23, 29)

There are built-in functions for handling lists.

In [None]:
len(values)

7

In [None]:
# This doesn't work
values[7]

IndexError: list index out of range

You can also index a list from the end using negative indices: `values[-1]` is the last element, and the second-to-last element is

In [None]:
values[-2]

23

Sometimes you want to access portions of lists. These are called slices

In [None]:
values[0:4]

[7, 11, 13, 17]

When you specify the last element for the slice, it goes up to but not including that element of the list.

If you do not include a start index, the slice automatically starts at the first element. If you do not include an end index, the slice automatically goes to the last element.

In [None]:
values[1:7], values[4:]

([11, 13, 17, 19, 23, 29], [19, 23, 29])
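
The default start and end indices described above can be sketched as follows, reusing the same `values` list. The negative start index in the last line is an extra illustration.

```python
values = [7, 11, 13, 17, 19, 23, 29]

print(values[:3])   # no start index: begins at the first element -> [7, 11, 13]
print(values[4:])   # no end index: runs to the last element -> [19, 23, 29]
print(values[:])    # neither: a copy of the whole list
print(values[-3:])  # negative start: the last three elements -> [19, 23, 29]
```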

Lists can include objects of different types. Most common data types in Python are strings, integers, and floats. You can put these different types together in a list.

In [None]:
mixed_list = ['one', 1, 1e6, [1,1], 1==1, None]

In [None]:
type(mixed_list[2])

float

In [None]:
for item in mixed_list:
    print(str(type(item))[8:-2])

str
int
float
list
bool
NoneType


You can join lists together by "adding" them

In [None]:
values + mixed_list

[7, 11, 13, 17, 19, 23, 29, 'one', 1, 1000000.0, [1, 1], True, None]

Addition and multiplication don't work the way you might expect for lists of numbers: `+` concatenates lists and `*` repeats them, rather than operating element by element.

In [None]:
values + values

[7, 11, 13, 17, 19, 23, 29, 7, 11, 13, 17, 19, 23, 29]

In [None]:
3*values

[7,
 11,
 13,
 17,
 19,
 23,
 29,
 7,
 11,
 13,
 17,
 19,
 23,
 29,
 7,
 11,
 13,
 17,
 19,
 23,
 29]

Strings are similar (but not equal) to lists of characters

In [None]:
w = 'Hello World!'

You can slice them just like lists

In [None]:
w[:5]

'Hello'

But strings are not lists. There are various built-in methods for handling strings.

In [None]:

w.

In [None]:
w.split()[0], w.upper()

('Hello', 'HELLO WORLD!')

Another data structure similar to lists is the **tuple**.

In [None]:
mixed_tuple = ('one', 1, 1e6, [1,1], 1==1, None)
mixed_tuple[0]

'one'

A major difference between the list and the tuple is that list items can be changed

In [None]:
mixed_list[0] = 'apple'
print(mixed_list)

['apple', 1, 1000000.0, [1, 1], True, None]


whereas tuple elements cannot (tuples are "immutable")

In [None]:
mixed_tuple[0] = 'apple'

TypeError: 'tuple' object does not support item assignment

Also, we can add an element to the end of a list, which we cannot do with tuples.

In [None]:
mixed_list.append( 3.14 )
print(mixed_list)

['apple', 1, 1000000.0, [1, 1], True, None, 3.14]


**Sets** are unordered collections with no duplicate elements. They support mathematical operations like union, intersection, and difference. A set is defined using a pair of braces, with its elements separated by commas. Because sets are unordered, they cannot be indexed.

In [None]:
my_set = {3, 3, 2, 3, 1, 4, 5, 6, 4, 2}

In [None]:
my_set[0]

TypeError: 'set' object is not subscriptable

One quick usage of this is to find out the unique elements in a string, list, or tuple.

In [None]:
set([1, 2, 2, 3, 2, 1, 2])

{1, 2, 3}

In [None]:
set('Banana')

{'B', 'a', 'n'}
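
The mathematical set operations mentioned above can be sketched with the built-in operators. The sets `a` and `b` are made-up examples.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(a | b)   # union: elements in a or b
print(a & b)   # intersection: elements in both a and b
print(a - b)   # difference: elements in a but not in b
print(3 in a)  # membership test -> True
```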

Another useful data structure is the **dictionary**. Instead of using a sequence of numbers to index the elements (such as lists or tuples), dictionaries are indexed by keys, which could be a string, number or even a tuple. Values are labeled by a unique key and can be any data type.

So a dictionary consists of key-value pairs, and each key maps to a corresponding value. 

In [None]:
dict_1 = {34:3, 100:4, 2:2}

In [None]:
dict_1[34]

3

Dictionary elements cannot be accessed by position with a sequence of index numbers (although, since Python 3.7, dictionaries do preserve insertion order). To access an element, we need to use its key. Looking up a key that does not exist raises a `KeyError`.

In [None]:
dict_1['apple']

KeyError: 'apple'
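
To avoid a `KeyError`, you can test for a key first or use the `get` method, which returns a default value when the key is missing. A minimal sketch, reusing `dict_1` from above:

```python
dict_1 = {34: 3, 100: 4, 2: 2}

print('apple' in dict_1)       # False: the key does not exist
print(dict_1.get('apple'))     # None: default when the key is missing
print(dict_1.get('apple', 0))  # 0: a custom default
print(dict_1.get(34))          # 3: existing keys behave as usual

# Loop over key-value pairs
for key, value in dict_1.items():
    print(key, value)
```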

Keys and values can be listed by corresponding methods

In [None]:
dict_1.keys()

dict_keys([34, 100, 2])

In [None]:
dict_1.values()

dict_values([3, 4, 2])

Keys and values can have many different data types

In [None]:
a_dict = { 1:'This is the value, for the key 1', 'This is the key for a value 1':1, False:':)', (0,1):256 }

In [None]:
a_dict[1]

'This is the value, for the key 1'

New key/value pairs can be added by just supplying the new value for the new key

In [None]:
a_dict['new key'] = 'new_value'
a_dict

{1: 'This is the value, for the key 1',
 'This is the key for a value 1': 1,
 False: ':)',
 (0, 1): 256,
 'new key': 'new_value'}

### Loops

The power of coding comes from **loops** (repetitions) and **conditionals** (choices). There are different ways to construct loops. The most common way is to construct a for loop:

```python
for variable in list:
    do things using variable
```
- Watch for the colon `:` at the end of a `for` statement.
- Watch for the indentation on the second line. 

Indentation is *very* important in Python. There is no separate statement that closes code blocks such as loops and conditionals; it's all done via indentation. By convention, each level of indentation is 4 spaces, and editors that understand Python syntax will take care of that for you.

In [None]:

values

In [None]:
value_squared = []
for value in values:
    value_squared.append(value**2)

print(value_squared)
# try this with different indentations

To loop over a range of numbers, the syntax is

In [None]:
n = 5
for j in range(n+1):
    print(j)

Note that it starts at 0 (by default), and ends at n-1 for range(n).
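
`range` also accepts an explicit start and an optional step. A short sketch (wrapping in `list` only to display the numbers):

```python
print(list(range(5)))         # [0, 1, 2, 3, 4]
print(list(range(2, 6)))      # [2, 3, 4, 5]
print(list(range(1, 10, 3)))  # [1, 4, 7]
print(list(range(5, 0, -1)))  # [5, 4, 3, 2, 1]
```

In every case the end value itself is excluded.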

What is the sum of every integer from 1 to 10?

In [None]:
total = 0
for i in range(1, 11):
    total += i
    
print(total)

List comprehensions allow sequences to be created from other sequences with very compact syntax.

For example, below we append the square numbers up to 25 in a list.

In [None]:
y = []
for i in range(1, 6):
    y.append(i**2)
y

With list comprehension, we can write this as

In [None]:
y = [i**2 for i in range(1, 6)]
y
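
A list comprehension can also filter elements with an `if` clause. As a sketch, the following keeps only the squares of even numbers; the name `even_squares` is just an illustrative choice.

```python
even_squares = [i**2 for i in range(1, 11) if i % 2 == 0]
print(even_squares)  # [4, 16, 36, 64, 100]
```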

### Conditional statements

A conditional statement is a code construct that executes blocks of code only if certain conditions are met. These conditions are represented as logical expressions.

```python
if logical expression:
    code block
```

The word `if` is a keyword. When Python sees an if-statement, it will determine if the associated logical expression is true. If it is true, then the code in code block will be executed. If it is false, then the code in the if-statement will not be executed. The way to read this is “If logical expression is true then do code block.”

In [None]:
mixed_list

In [None]:
mixed_list.append('banana')

In [None]:
if 'apple' in mixed_list or 'banana' in mixed_list:
    print('We have a fruit!')
else:
    print('There is no fruit')

What will be the value of y after the code is executed?

In [None]:
x = 3
if x > 1:
    y = 2
elif x > 2:
    y = 4
else:
    y = 0
print(y)

You can use if statements in a concise way (ternary operators)

In [None]:
is_student = True
person = 'student' if is_student else 'not student'
print(person)

This is equivalent to 

In [None]:
is_student = True
if is_student:
    person = 'student'
else:
    person = 'not student'
print(person)

## What is a function?

In mathematics, functions are maps: they assign each element of their domain (inputs) exactly one element of their range (outputs). In programming, a function is a sequence of instructions that performs a specific task. Functions break up our code into smaller, more easily understandable pieces and make it modular.

For example, the `math.sin` function in Python is a set of tasks (i.e., mathematical operations) that computes an approximation for $\sin x$. Rather than having to retype or copy these instructions every time you want to use the sin function, it is useful to store this sequence of instructions as a function that you can call over and over again.

In general, each function should perform only one computational task.

Here's the pseudocode for a function definition:
```python
def function_name(parameters):
    """ documentation"""
    ** function body code **
    return output
```

- Keyword `def`,
- Function name,
- Arguments,
- A colon to mark the end of the function header.
- Documentation to describe what the function does,
- Statements at the same indentation level (4 spaces, tab)
- Return statement for the value

Python has about 70 built-in functions (such as len or print). A Python package such as numpy includes hundreds of functions.

You can and should define your own functions.

In [None]:
type(len)

In [None]:
len?

In [None]:
def greet(name):
    print("Hello, " + name + ". Good morning!")

In [None]:
greet('1')

In [None]:
help(greet)

Include some documentation in your functions. 

```python
def greet(name):
    """This function greets a person by name"""
    print("Hello, " + name + ". Good morning!")
```

In [None]:
def greet(name):
    """
    This function greets a person by name

    Args:
        name (str): The name of person that is greeted.

    Returns:
        None: The personalized greeting is printed, not returned.
    """
    print("Hello, " + name + ". Good morning!")

In [None]:
help(greet)

Define a function that adds three numbers

In [None]:
def triple_adder(a, b, c):
    """
    This function sums up 3 numbers

    Args:
        a, b, c: Three numbers to be added

    Returns:
        out: The result of a+b+c
    """
    # Sum the inputs together
    out = a + b + c
    
    return out

In [None]:
triple_adder(1,2,3)

The code doesn't check the type of its inputs and will try to add whatever we give it.

In [None]:
triple_adder(1,1.2,4+1j)

In [None]:
out=10

In [None]:
triple_adder('Python ', 'is ', 'great!')

Parameters and variables defined inside a function are not visible from outside the function. Hence, they have a local scope.

A function does not remember the value of a variable from its previous calls.

Here is an example to illustrate the scope of a variable inside a function.

In [None]:
# Note the indentation
def some_func():
    x = 10
    print("Value inside function:",x)

x = 20
some_func()
print("Value outside function:",x)

You can change the values of variables outside the function using the `global` keyword.

In [None]:
meaning_of_life = 42

print(f'The meaning of life is {meaning_of_life}.')

def my_world():
    global meaning_of_life
    print(f'Within my world, meaning of life was {meaning_of_life}.')
    meaning_of_life = 1
    print(f'But the meaning of life changed to {meaning_of_life}.')

my_world()
print(f'Outside my world, meaning of life is now {meaning_of_life}.')
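
Modifying global variables can make code harder to follow. A common alternative, sketched here, is to pass the value in and return the new one. The function name `updated_meaning` is hypothetical.

```python
def updated_meaning(current_value):
    """Return a new value instead of modifying a global variable."""
    return current_value - 41


meaning_of_life = 42
meaning_of_life = updated_meaning(meaning_of_life)  # explicit reassignment
print(meaning_of_life)  # 1
```

Here the change is visible at the call site, so no `global` statement is needed.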

## Useful packages

[xkcd](https://xkcd.com) captured the power of Python packages in 2007.

In [None]:
from IPython import display
display.Image("https://imgs.xkcd.com/comics/python.png")

Importing packages is done with a line such as

In [None]:
import antigravity

A useful package to generate random numbers or to make random choices (for statistics and sampling) is `random`.

In [None]:
import random

In [None]:
mixed_list

In [None]:
for j in range(3):
    print('* Results from sample',j+1)
    print('\n    Random number from 0 to 1:', random.random() )
    print("\n    Random choice from our list:", random.choice( mixed_list ) )
    print('\n')

There are hundreds of thousands of Python packages. Some of the most useful ones for data science and machine learning are numpy, scipy, pandas, matplotlib, scikit-learn, and pytorch. Today, we will just focus on numpy for number crunching and matplotlib for visualization.

In [None]:
random.choices?

### NumPy

Numpy comes with powerful features:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

In [None]:
import numpy as np

Numpy is important for mathematics (trigonometry, linear algebra, calculus, statistics, optimization, and more).

It's at the core of scientific computing in Python. It provides arrays, matrices, and fast routines such as additions, dot products, or sorting. 

At the core of the package is the **n-dimensional array**. Numpy arrays have **fixed sizes and datatypes** (as opposed to lists). 
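
One consequence of the fixed datatype can be sketched as follows: assigning a float into an integer array silently truncates the value, whereas a list would simply store the float. The variable names are illustrative.

```python
import numpy as np

int_array = np.array([1, 2, 3])
print(int_array.dtype)   # an integer dtype (exact name is platform dependent)

int_array[0] = 7.9       # the value is truncated to fit the integer dtype
print(int_array[0])      # 7

# Convert explicitly if floating-point values are needed
float_array = int_array.astype(float)
float_array[0] = 7.9
print(float_array[0])    # 7.9
```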

In [None]:

np.

In [None]:
# One dimensional array (a vector)
x = np.array([1, 2, 3]) 

# Two dimensional array (a matrix)
y = np.array([[1.1, 2j, 3],[4, 5, 6]])

In [None]:
2*x

These arrays can be represented in mathematical formalism as
$$ x = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}$$
and 
$$ y = \begin{pmatrix}
1.1 & 2i & 3\\
4 & 5 & 6
\end{pmatrix} $$

In [None]:

x.

In [None]:
z=np.array(['a','b','c'])

In [None]:
z.dtype

In [None]:
# Array attributes

print('shape: ', x.shape, y.shape)
print('size: ', x.size, y.size)
print('type: ', x.dtype, y.dtype)
print('ndim: ', x.ndim, y.ndim)

Vectorization gives numpy a significant speed advantage. Loops over indices are handled by pre-compiled C code in the background.

To see the difference, let's compare addition using lists and numpy arrays.

In [None]:
list1 = [i for i in range(1000)]
list2 = [3*i for i in range(1000)]
array1 = np.array(list1)
array2 = np.array(list2)

In [None]:
%timeit -n 1_000 array1 - 2*array2;

In [None]:
%timeit -n 1_000 [x - 2*y for x, y in zip(list1, list2)]

Very often we generate arrays that have a structure. For generating arrays that are in order and evenly spaced, it is useful to use the arange function in Numpy.

In [None]:
z = np.arange(1, 2000,10)

In [None]:
z[-1]

You can prescribe the increment

In [None]:
np.arange(0, 3, 0.5)

Sometimes we want to guarantee a start and end point for an array but still have evenly spaced elements. For instance, we may want an array that starts at -1, ends at 1, and has exactly 21 elements, or 20 grid cells. For this purpose you can use the function `np.linspace`.

In [None]:
grid = np.linspace(-1, 1, 21)

grid

In [None]:
another_grid = np.linspace(-1, 1, 20)


In [None]:
np.linspace?
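
The difference between `np.arange` and `np.linspace` can be sketched side by side. The numbers are arbitrary: `arange` fixes the step and excludes the end point, while `linspace` fixes the number of points and includes both end points.

```python
import numpy as np

# arange: fixed step, the stop value is excluded
stepped = np.arange(0, 10, 2)
print(stepped)   # [0 2 4 6 8]

# linspace: fixed number of points, both end points included
spaced = np.linspace(0, 10, 6)
print(len(spaced), spaced[0], spaced[-1])

# The spacing is (end - start) / (number of points - 1)
print(spaced[1] - spaced[0])   # 2.0
```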

Arithmetic operations work as in linear algebra. For example, operations between a scalar and an array are performed element-wise.

In [None]:
2*grid

In [None]:
# This doesn't work: grid has 21 elements but another_grid has 20
grid+another_grid
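
When two arrays have the same shape, `+` and `*` act element by element. Note that `*` is therefore not the dot product, which is written with `@` (or `np.dot`). A minimal sketch with made-up vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

print(a + b)   # element-wise sum
print(a * b)   # element-wise product
print(a @ b)   # dot product: 1*10 + 2*20 + 3*30 = 140.0
```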

Numpy has its own random package

In [None]:
x = np.random.rand(100)
y = np.random.rand(100)

In [None]:
%%timeit -n 1_000
for i in range(0, len(x)):
    x[i] + y[i]

In [None]:
%%timeit -n 1_000
x+y;

Vectorization is much faster (around 70 times in this example).

Such differences matter in modern data science and machine learning applications. A computation that runs for 2 weeks without vectorization can run in an afternoon with vectorization.

## Data handling

Storing data and the results of your programming efforts is important for working over multiple sessions and sharing your results with collaborators. When Python closes, all the variables in the memory are lost, so data must be stored in the file system. 

To work with text files, we use the `open` function, which returns a file object. It is commonly used with two arguments:

```
f = open(filename, mode) 
```

`f` is the returned file object. The filename is a string giving the location of the file you want to open, and the mode is another string containing a few characters that describe how the file will be used. The common modes are:

- ‘r’, this is the default mode, which opens a file for reading
- ‘w’, this mode opens a file for writing. If the file does not exist, it creates a new file; if it exists, its contents are overwritten.
- ‘a’, open a file in append mode, append data to end of file. If the file does not exist, it creates a new file.
- ‘b’, open a file in binary mode.
- ‘r+’, open a file (do not create) for reading and writing.
- ‘w+’, open or create a file for writing and reading, discard existing contents.
- ‘a+’, open or create file for reading and writing, and append data to end of file.


Write into a file

In [None]:
f = open('test.txt', 'w')
for i in range(5):
    f.write(f"This is line {i}\n")
    
f.close()
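
A common alternative to calling `f.close()` yourself is the `with` statement, which closes the file automatically when the block ends, even if an error occurs. A minimal sketch; the file name `with_example.txt` is a hypothetical choice.

```python
# The file is closed automatically at the end of each `with` block
with open('with_example.txt', 'w') as f:
    for i in range(3):
        f.write(f"This is line {i}\n")

with open('with_example.txt', 'r') as f:
    lines = f.readlines()

print(lines)
print(f.closed)  # True: the file was closed when the block ended
```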

Append into an existing file

In [None]:
f = open('test.txt', 'a')
f.write("This is another line\n")
f.close()

Read a file

In [None]:
f = open('./test.txt', 'r')
content = f.read()
f.close()
print(content)

Read this way, all the lines of the file are stored in one string variable. We can verify that the variable `content` is a string.

In [None]:
content[0]

In [None]:
type(content)

But sometimes we want to read the contents of the file line by line and store them in a list. We can use `f.readlines()` to achieve this.

In [None]:
f = open('./test.txt', 'r')
contents = f.readlines()
f.close()
print(contents)
print(type(contents))

In [None]:
contents[0]

When we work with numbers or arrays, we can use the numpy package to directly save/read an array.

In [None]:
arr = np.array([[1.20, 2.20, 3.00], [4.14, 5.65, 6.42]])

In [None]:

np.savetxt?

In [None]:
np.savetxt('my_arr.txt', arr, fmt='%.5f', header = 'Col1 Col2 Col3')

The first argument is the file name and the second is the array we save. The `fmt` keyword sets the output format (`'%.5f'` indicates 5 decimals), and the `header` keyword sets the header line, which is written as a comment starting with `#`.

We can load the array back into memory as follows.

In [None]:
my_arr = np.loadtxt('my_arr.txt')
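
As a quick check of the save/load round trip, the following sketch writes an array and verifies that the loaded copy matches. The file name `roundtrip_example.txt` is a hypothetical choice.

```python
import numpy as np

arr = np.array([[1.20, 2.20, 3.00], [4.14, 5.65, 6.42]])
np.savetxt('roundtrip_example.txt', arr, fmt='%.5f', header='Col1 Col2 Col3')

# Lines starting with '#' (such as the header) are skipped when loading
loaded = np.loadtxt('roundtrip_example.txt')

print(loaded.shape)              # (2, 3)
print(np.allclose(arr, loaded))  # True
```

Values are only recovered up to the precision written by `fmt`, so comparing with `np.allclose` is safer than testing exact equality.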

In [None]:
my_arr.dot(my_arr.T)

Scientific data are sometimes stored in the comma-separated values (CSV) file format, a delimited text file that uses a comma to separate values. It is a very useful format that can store large tables of data (numbers and text) in plain text. Each line (row) in the data is one data record, and each record consists of one or more fields, separated by commas. It also can be opened using Microsoft Excel.

Python has its own `csv` module that can handle reading and writing csv files, but we can also use numpy.

In [None]:
data = np.random.random((100,5))
np.savetxt('test.csv', data, fmt = '%.2f', delimiter=',', header = 'c1, c2, c3, c4, c5')
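
For comparison, here is a minimal sketch of the same idea with the built-in `csv` module. The file name `csv_module_example.csv` and the small table are made up for illustration.

```python
import csv

rows = [['c1', 'c2'], [0.12, 0.34], [0.56, 0.78]]

# newline='' lets the csv module control line endings itself
with open('csv_module_example.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(rows)

with open('csv_module_example.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)  # first row holds the column names
    # csv.reader returns strings, so convert the fields to floats
    data_rows = [[float(v) for v in row] for row in reader]

print(header)     # ['c1', 'c2']
print(data_rows)  # [[0.12, 0.34], [0.56, 0.78]]
```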

Download data from an online csv file.

In [None]:
import pandas as pd
url = 'https://raw.githubusercontent.com/hsiav2000/simple-regression/master/Salary_Data.csv'
data = pd.read_csv(url)

In [None]:
data

In [None]:
data.plot.scatter(x='YearsExperience', y='Salary')

## Visualization

Visualizing data is usually the best way to convey important engineering and science ideas and information, especially if the information is made up of many numbers. The ability to visualize and plot data quickly and in many different ways is one of Python’s most powerful features.

Python has numerous graphics functions that enable you to efficiently display plots, surfaces, volumes, vector fields, histograms, animations, and many other data plots. The most common package for visualization in Python is [matplotlib](https://matplotlib.org/).

Have a look at the [matplotlib gallery](https://matplotlib.org/stable/gallery/index.html) and get a sense of what could be done there. We'll cover the basic syntax for plotting here.

### Basic plotting

The `pyplot` module of matplotlib, conventionally imported as `plt`, provides a convenient plotting interface that works well in Jupyter notebooks.

In [None]:
import matplotlib.pyplot as plt

Given the lists x = [0, 1, 2, 3] and y = [0, 1, 4, 9], use the plot function to produce a plot of x versus y.

In [None]:
x = [0, 1, 2, 3] 
y = [0, 1, 4, 9]
plt.plot(x, y)

By default, each point is connected with a blue line. To make the function look smooth, use a finer grid for the x-axis.

Let's plot the parabolic function $y = x^2$ on the domain $x\in[-5,5]$.

In [None]:
x = np.linspace(-5,5,100)
y = x**2

plt.plot(x,y)

You can play around with various customizations. Plot the sine function with green dashed lines and a star marking data points.

In [None]:
x = np.linspace(-np.pi, np.pi, 100)
y = np.sin(x)

In [None]:
plt.plot(x[::10], y[::10],'g*--')

You can choose predefined styles for your plots.

In [None]:
print(plt.style.available)
# plt.style.use('seaborn-paper')

In [None]:
plt.plot(x,np.sin(x), color='tab:blue',  linestyle='--', linewidth=2, label=r'$\sin x$')
plt.plot(x,np.cos(x), color='tab:orange', linestyle='-.',linewidth=6, label=r'$\cos x$')
plt.title('Phase-shifted waves')
plt.xlabel('x')
plt.ylabel('y')
plt.grid(True)
plt.legend(loc='lower right')
plt.ylim(-1,1)
plt.xlim(-2,2);

Scatter plots work exactly the same as regular plots above except they have default behavior where the dots are not connected.

In [None]:
# Generate 20, normally distributed, random points
x, y = np.random.randn(2, 20)

plt.scatter(x, y)

In [None]:
plt.plot(x, y,'o', color='tab:blue')

Data points with a linear relationship

In [None]:
x = np.arange(100)
delta = np.random.poisson(40, size=100)

y = 0.8*x + 5 + delta

plt.scatter(x, y)

There are several other plotting functions that plot x versus y data. Some of them are `bar`, `loglog`, `semilogx`, and `semilogy`.  The bar function plots bars centered at x with height y. The loglog, semilogx, and semilogy functions plot the data in x and y with the x and y axis on a log scale, the x axis on a log scale and the y axis on a linear scale, and the y axis on a log scale and the x axis on a linear scale, respectively.

In [None]:
x = np.arange(11)
y = x**2

plt.figure(figsize = (14, 8))

plt.subplot(2, 3, 1)
plt.plot(x,y)
plt.title('Plot')
plt.xlabel('X')
plt.ylabel('Y')
plt.grid()

plt.subplot(2, 3, 2)
plt.scatter(x,y)
plt.title('Scatter')
plt.xlabel('X')
plt.ylabel('Y')
plt.grid()

plt.subplot(2, 3, 3)
plt.bar(x,y)
plt.title('Bar')
plt.xlabel('X')
plt.ylabel('Y')
plt.grid()

plt.subplot(2, 3, 4)
plt.loglog(x,y)
plt.title('Loglog')
plt.xlabel('X')
plt.ylabel('Y')
plt.grid(which='both')

plt.subplot(2, 3, 5)
plt.semilogx(x,y)
plt.title('Semilogx')
plt.xlabel('X')
plt.ylabel('Y')
plt.grid(which='both')

plt.subplot(2, 3, 6)
plt.semilogy(x,y)
plt.title('Semilogy')
plt.xlabel('X')
plt.ylabel('Y')
plt.grid()

plt.tight_layout()


The statement `plt.tight_layout()` ensures that the sub-figures do not overlap with each other.

Sometimes, you want to save the figures in a specific format, such as pdf, jpeg, png, and so on. You can do this with the function plt.savefig.

In [None]:
plt.figure(figsize = (8,6))
plt.plot(x,y)
plt.xlabel('x')
plt.ylabel('y')
plt.savefig('image.pdf')

### "Object-oriented" plotting

So far, we have used the procedural interface to make plots. You can get more granular control of your plot by using the object-oriented interface.

You start the plot with the figure object.

In [None]:
fig = plt.figure()

In [None]:
type(fig)

In [None]:
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
# Set the title of plot
ax.set_title("Empty plot")

In [None]:
fig = plt.figure()

# Generate a grid of 2x2 subplots
# Axes object for 1st location
ax1 = fig.add_subplot(2,2,1)
ax1.set_title('First Location')

# Axes object for 2nd location
ax2 = fig.add_subplot(2,2,2)
ax2.set_title('Second Location')

# Axes object for 3rd location
ax3 = fig.add_subplot(2,2,3)
ax3.set_xlabel('Third Location')

# Axes object for 4th location
ax4 = fig.add_subplot(2,2,4)
ax4.set_xlabel('Fourth Location')

# Nice layout and display
plt.tight_layout()
plt.show()

In [None]:
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2)
ax1.set_title('First Location')
ax2.set_title('Second Location')
ax3.set_xlabel('Third Location')
ax4.set_xlabel('Fourth Location')
plt.tight_layout()

In [None]:
# Import ticker to control tick labels and positions
import matplotlib.ticker as tck

# Generate the x-axes grid
x = np.linspace(-np.pi, np.pi, 100, endpoint=True)

# Figure
fig, (ax1, ax2, ax3) = plt.subplots(3, 1, sharex=True)
fig.set_dpi(100)
fig.set_size_inches(10,6)

# First plot - sine
ax1.plot(x, np.sin(x))
ax1.set_title("sin")
ax1.set_ylabel("y")

# Second plot - cosine
ax2.plot(x, np.cos(x))
ax2.set_title("cos")
ax2.set_ylabel("y")

# Third plot - tangent
ax3.plot(x, np.tan(x))
ax3.set_title("tan")
ax3.set_ylabel("y")
ax3.set_xlabel("x")

# Set ticks for the shared axes on the bottom of the plot
# Five ticks from -pi to pi in steps of pi/2 (linspace avoids float rounding in the tick count)
x_ticks = np.linspace(-np.pi, np.pi, 5)
ax3.set_xticks(x_ticks, [r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$', r'$\frac{\pi}{2}$', r'$\pi$'])

plt.tight_layout()

## Exercises

1. Check if ‘Python’ is in the string ‘Python is great!’.
1. Get the last word ‘great’ from ‘Python is great!’
1. Turn ‘Python is great!’ to a list.
1. Compute sin(87°).
1. Write a Python statement that generates the following error:
    "TypeError: math.sin() takes exactly one argument (0 given)"
1. Compute the surface area and volume of a cylinder from given radius and height. Make it a function.
1. Compute the slope between two points $p_1=(x_1,y_1)$ and $p_2=(x_2,y_2)$. Recall that the slope between two such points is 
$\frac{y_2-y_1}{x_2-x_1}$. Make it a function.
1. Compute the distance between two points as above. Recall that the distance between points in two dimensions is $\sqrt{(x_2-x_1)^2+(y_2-y_1)^2}$. Make it a function.
1. Generate an array of 100 evenly spaced values between -10 and 10.
1. Consider a triangle with vertices at (0,0), (1,0), and (0,1). Write a function `my_inside_triangle(x,y)` where the output is the string 'outside' if the point (x,y) is outside of the triangle, 'border' if the point is exactly on the border of the triangle, and 'inside' if the point is on the inside of the triangle.
1. Plot the functions $y_1(x)=3+e^{-x}\sin(6x)$ and $y_2(x)=4+e^{-x} \cos(6x)$ for $0\leq x \leq 5$ on a single axis. Give the plot axis labels, a title, and a legend.
1. A cycloid is the curve traced by a point located on the edge of a wheel rolling along a flat surface. The $(x,y)$ coordinates of a cycloid generated from a wheel with radius, $r$, can be described by the parametric equations:
$$ x = r (\phi - \sin \phi), \qquad  y = r (1-\cos\phi) $$
where $\phi$ is the angle in radians through which the wheel has rolled. Generate a plot of the cycloid for $\phi \in [0,2\pi]$ using 1000 increments and $r=3$. Give your plot a title and labels. Turn the grid on and modify the axis limits to make the plot neat.
1. Generate 1000 normally distributed random numbers using the `np.random.randn` function. Use the `plt.hist` function to plot a histogram of the randomly generated numbers. Use the `plt.hist` function to distribute the randomly generated numbers into 10 bins. Create a bar graph of output of hist using the `plt.bar` function. It should look very similar to the plot produced by `plt.hist`.