# Overview of Python

We learned **a little** Python in our course; we just scratched the surface. As a programming language, Python has many capabilities and is consequently in widespread use.

In this overview, we'll demonstrate uses of Python in different fields.

* Scientific computing (in addition to NumPy)
  * SciPy
  * SymPy
  * NetworkX
  * Image processing
* Data science
  * Pandas
  * Bokeh
* Machine learning / deep learning
* Web content and web apps
  * Django
  * Flask
* Dask: Scalable analytics in Python
* Bioinformatics
  * BioPython

# Scientific computing

Here is the ecosystem of Python scientific computing.

![](images/scipy-eco.png)

[image source](https://www.datacamp.com/community/blog/python-scientific-computing-case)

As you can see, there are many packages for scientific computing. We are somewhat familiar with NumPy, which is an essential package not only in scientific computing but also in data science, machine learning, etc.

Now let's go over some packages briefly.

## SciPy

[SciPy](https://www.scipy.org/) is a Python-based ecosystem of open-source software for mathematics, science, and engineering. 

Please go over the reference manual and [tutorial page](https://docs.scipy.org/doc/scipy/reference/tutorial/index.html).

## SymPy

[SymPy](https://www.sympy.org/en/index.html) is a Python library for symbolic mathematics. The [features page](https://www.sympy.org/en/features.html) summarizes the capabilities of this library.

Below is a small example.

In [None]:
from sympy import *
x, y, z, t = symbols('x y z t')
expand((x + 2)*(x - 3))

In [None]:
factor(x**3 - x**2 + x - 1)

In [None]:
cancel((x**2 + 2*x + 1)/(x**2 + x))

Solving equations is also possible.

In [None]:
from sympy.solvers import solve
from sympy import Symbol
x = Symbol('x')
solve(x**2 - 1, x)

## NetworkX

[NetworkX](https://networkx.github.io/) is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import networkx as nx


G = nx.random_geometric_graph(200, 0.125)
# position is stored as node attribute data for random_geometric_graph
pos = nx.get_node_attributes(G, 'pos')

# find node near center (0.5,0.5)
dmin = 1
ncenter = 0
for n in pos:
    x, y = pos[n]
    d = (x - 0.5)**2 + (y - 0.5)**2
    if d < dmin:
        ncenter = n
        dmin = d

# color by path length from node near center
p = dict(nx.single_source_shortest_path_length(G, ncenter))

plt.figure(figsize=(8, 8))
nx.draw_networkx_edges(G, pos, nodelist=[ncenter], alpha=0.4)
nx.draw_networkx_nodes(G, pos, nodelist=list(p.keys()),
                       node_size=80,
                       node_color=list(p.values()),
                       cmap=plt.cm.Reds_r)

plt.xlim(-0.05, 1.05)
plt.ylim(-0.05, 1.05)
plt.axis('off')
plt.show()

## Image processing

**Face detection**: With the help of the `opencv` package, image processing is easy. Detecting faces in a photo is a neat example.

In [None]:
import cv2
from matplotlib import pyplot as plt

# Create the haar cascade
cascPath = "data/haarcascade_frontalface_default.xml"
faceCascade = cv2.CascadeClassifier(cascPath)

In [None]:
# Read the image
image = cv2.imread("images/SolvayConference1927.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

In [None]:
# Detect faces in the image
faces = faceCascade.detectMultiScale(
    gray,
    scaleFactor=1.2,
    minNeighbors=5,
    minSize=(30, 30),
    flags=cv2.CASCADE_SCALE_IMAGE
)

In [None]:
print("Found {0} faces!".format(len(faces)))

# Draw a rectangle around the faces
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)

In [None]:
plt.figure(figsize = (20,10))
# OpenCV stores images as BGR, while matplotlib expects RGB
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.title('Solvay Conference')
plt.show()

Please visit [this page](https://rarehistoricalphotos.com/solvay-conference-probably-intelligent-picture-ever-taken-1927/) for the attendees of the conference.

> Can you fix the false detection (third from left, sitting) by adjusting `scaleFactor` or other parameters?

**Image processing with scikit-image**: This package contains many functions for image processing. Below is an example of *edge detection*.

In [None]:
from skimage import data, io, filters

image = data.coins()
io.imshow(image)
io.show()


In [None]:
edges = filters.sobel(image)
io.imshow(edges)
io.show()

# Data science

Python is also very prominent in the field of data science. Here's a list of packages grouped by the different steps of data analysis.

![](images/py-data-science.png)

## Pandas

[Pandas](https://pandas.pydata.org/) is an open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Pandas is an essential library for data science.

Here's a glimpse of what Pandas can do.

In [None]:
import pandas as pd
import numpy as np

In [None]:
df = pd.read_csv("data/OfficeSupplies.csv")

In [None]:
df.head()

Let's try to find out:

* Which rep sold the most?
* Which region sold the most?

In [None]:
# who sold the most (numeric_only=True sums only the numeric columns)
df.groupby("Rep").sum(numeric_only=True).sort_values("Units", ascending=False)

In [None]:
df["Total Price"] = df["Units"] * df["Unit Price"]
df.head()

In [None]:
# who sold the most in total dollar amount
df.groupby("Rep").sum(numeric_only=True).sort_values("Total Price", ascending=False).head()

In [None]:
# what region sold the most: top 5 reps within each region by total dollar amount
group = df.groupby(["Region", "Rep"]).sum(numeric_only=True)
total_price = group["Total Price"].groupby(level=0, group_keys=False)
total_price.nlargest(5)

## Bokeh

Visualization is a must in data science, both to explore the data and to communicate the findings. `matplotlib` is widely used for visualization, but it generates static images. If you need interactive plots, Bokeh is the library to use. Bokeh is an interactive visualization library that targets modern web browsers for presentation.

Please visit [its gallery](https://bokeh.pydata.org/en/latest/docs/gallery.html) for stunning examples.

In [None]:
import numpy as np

from bokeh.io import output_notebook, show
from bokeh.models import HoverTool
from bokeh.plotting import figure
output_notebook()

In [None]:
n = 500
x = 2 + 2*np.random.standard_normal(n)
y = 2 + 2*np.random.standard_normal(n)

p = figure(title="Hexbin for 500 points", match_aspect=True,
           tools="wheel_zoom,reset", background_fill_color='#440154')
p.grid.visible = False

r, bins = p.hexbin(x, y, size=0.5, hover_color="pink", hover_alpha=0.8)

p.circle(x, y, color="white", size=1)

p.add_tools(HoverTool(
    tooltips=[("count", "@c"), ("(q,r)", "(@q, @r)")],
    mode="mouse", point_policy="follow_mouse", renderers=[r]
))

show(p)

In [None]:
from bokeh.models import ColumnDataSource
from bokeh.layouts import gridplot
from bokeh.sampledata.autompg import autompg

source = ColumnDataSource(autompg)

options = dict(width=300, height=300,
               tools="pan,wheel_zoom,box_zoom,box_select,lasso_select")

p1 = figure(title="MPG by Year", **options)
p1.circle("yr", "mpg", color="blue", source=source)

p2 = figure(title="HP vs. Displacement", **options)
p2.circle("hp", "displ", color="green", source=source)

p3 = figure(title="MPG vs. Displacement", **options)
p3.circle("mpg", "displ", size="cyl", line_color="red", fill_color=None, source=source)

p = gridplot([[ p1, p2, p3]], toolbar_location="right")

show(p)

# Machine learning / Deep learning

[TensorFlow](https://www.tensorflow.org/) and [PyTorch](https://pytorch.org/) are frameworks used for machine learning and deep learning.

Below are two notebooks demonstrating deep learning approaches (they will run in Google Colab).

* [mnist example](https://colab.research.google.com/drive/1rWbhvaRQUDK1Nu79RH570UMqMKpMkhMP)
* [fashion-mnist example](https://colab.research.google.com/drive/1fmWvLdV1QOx1rvDlVwizeIXIY3A-MNEk)


# Web Content and Web Apps via API

## Generate web content

Writing the HTML code for web pages by hand does not scale. Thus, there are many languages that generate HTML code programmatically; PHP is one of them. Python has a similar ability, thanks to the [Django](https://www.djangoproject.com/) framework.

Suppose you connected to a database and ran a query. If you want to show the results on a web page, a small template (or view) like the one shown below can be used to generate the content:

```html
<table>
    <tr>
        <th>Field 1</th>
        ...
        <th>Field N</th>
    </tr>
    {% for item in query_results %}
    <tr> 
        <td>{{ item.field1 }}</td>
        ...
        <td>{{ item.fieldN }}</td>
    </tr>
    {% endfor %}
</table>
```

As you can see, loops in the template can be used to programmatically generate a table (or any other HTML structure).
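The same loop-driven idea can be tried without Django. The following is a minimal sketch in plain Python, not Django's template engine; `query_results`, `field1` and `field2` are made-up stand-ins for the rows a database query might return, and `render_table` is a helper name chosen for this example.

```python
from html import escape

# Hypothetical query results: a list of dictionaries, one per row
query_results = [
    {"field1": "apple", "field2": 3},
    {"field1": "pear & plum", "field2": 5},
]

def render_table(rows, columns):
    """Build an HTML table string by looping over rows and columns."""
    lines = ["<table>", "    <tr>"]
    for col in columns:
        lines.append(f"        <th>{escape(col)}</th>")
    lines.append("    </tr>")
    for row in rows:
        lines.append("    <tr>")
        for col in columns:
            # escape() keeps characters such as & and < from breaking the HTML
            lines.append(f"        <td>{escape(str(row[col]))}</td>")
        lines.append("    </tr>")
    lines.append("</table>")
    return "\n".join(lines)

html_table = render_table(query_results, ["field1", "field2"])
print(html_table)
```

A template engine does essentially this for you, while keeping the HTML layout separate from the Python code.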

## API

If you are just sharing data, you don't need to design or generate an HTML view. Users can send their input or query and receive the output or result as plain text. Most data-intensive websites have APIs for that reason. API stands for [Application Programming Interface](https://en.wikipedia.org/wiki/Application_programming_interface).

Before using an API, let's look at the output of a regular HTML page. Here's the content returned by https://www.google.com/:

In [None]:
import requests
import json
google = requests.get("https://www.google.com/")

In [None]:
google.text[0:500]

A regular HTML page contains the HTML and JavaScript code that the browser needs to render the page. Here's an example of simple API output:

In [None]:
currency = requests.get("http://api.exchangeratesapi.io/v1/latest?access_key=571435552d0acaed7bc68b46c3604d9f&symbols=USD,TRY&format=1")
currency.json()

As you can see, the API output is just data; no HTML code is involved.

Let's give another example. OMDb (the Open Movie Database) provides a public API. We can search for movies, or get detailed information about a movie if the movie ID is provided. The [API page](http://www.omdbapi.com/#usage) describes the parameters and provides examples. Let's look up "The Matrix".

In [None]:
omdb = requests.get("http://www.omdbapi.com/?t=The+Matrix&y=1999&apikey=291b688d")

In [None]:
print(json.dumps(omdb.json(),indent=2))

Most APIs return JSON, which is plain text and can be parsed in most languages. In web apps especially, handling JSON data with JavaScript libraries is quite simple. In this sense, APIs democratize data: the output can be consumed by any programmer or viewed as-is in a browser. If you open the [URL](http://www.omdbapi.com/?t=The+Matrix&y=1999&apikey=291b688d) in your browser, you can see the same results there.

As you might have noticed, a JSON object is similar to a Python dictionary. Let's get the IMDb rating for the movie:
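The query part of such a URL is just a set of `key=value` pairs. As a small sketch, the standard library can assemble it from a dictionary. The parameter names `t`, `y` and `apikey` are the ones used in the request above, while `YOUR_KEY` is a placeholder and not a working key.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "http://www.omdbapi.com/"
params = {"t": "The Matrix", "y": 1999, "apikey": "YOUR_KEY"}  # placeholder key

# urlencode() escapes spaces and special characters for us
url = base_url + "?" + urlencode(params)
print(url)

# Going the other way: split a URL back into its parameters
parsed = parse_qs(urlsplit(url).query)
print(parsed["t"])
```

`requests.get` also accepts such a dictionary through its `params` argument, which avoids building the URL string by hand.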

In [None]:
omdb.json()['imdbRating']

A JSON object can be nested and rich in content. Let's get the Rotten Tomatoes rating this time. The key is 'Ratings', which refers to a list of dictionaries, and Rotten Tomatoes is the second source in that list.

In [None]:
omdb.json()['Ratings'][1]['Value']
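Relying on position 1 works only as long as the order of the list does not change. A slightly more defensive sketch looks the rating up by its source name. The JSON text below is a hand-written stand-in shaped like the response described above: its values are made up, and the `Source` key is an assumption about the field name, so check it against the real output.

```python
import json

# Hand-written sample shaped like the response; values are made up
raw = '''
{
  "Title": "Example Movie",
  "imdbRating": "7.5",
  "Ratings": [
    {"Source": "Internet Movie Database", "Value": "7.5/10"},
    {"Source": "Rotten Tomatoes", "Value": "80%"}
  ]
}
'''

movie = json.loads(raw)   # JSON text -> Python dict

# Build a {source: value} lookup from the list of dictionaries
ratings = {r["Source"]: r["Value"] for r in movie["Ratings"]}
print(ratings.get("Rotten Tomatoes"))
print(ratings.get("Metacritic", "not available"))
```

Using `.get()` with a default avoids a `KeyError` when a movie has no rating from a given source.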

## Turn an existing function into an API

The [Flask](https://palletsprojects.com/p/flask/) package helps turn an existing Python function into an API. In other words, your function can be accessed by anybody (provided that a server is configured).

Remember the `isPrime()` function? Let's make it public:

```python
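from flask import Flask
from flask_restplus import Resource, Api

app = Flask(__name__)   # the web application
api = Api(app)          # REST API wrapper around the app

# Requests to /isPrime/<number> are routed to this resource
@api.route('/isPrime/<number>')
class isPrime(Resource):
    def get(self, number):
        number = int(number)  # the URL segment arrives as a string
        # prime if greater than 1 and not divisible by any i up to sqrt(number)
        return number > 1 and all(number % i for i in range(2, int(number**0.5) + 1))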
```

As you can see, by just adding a decorator (`@api.route`) before the class that wraps the function, we can turn the function into an API.

Let's see it in action:

In [None]:
import requests
#data = requests.get("https://barebone-serverless-flask-api-zeit-now.alperyilmaz.now.sh/isPrime/982451653")
data = requests.get("https://ojncui8s6l.execute-api.us-west-2.amazonaws.com/api/isPrime/3")
data.json()
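Before exposing a function publicly, it helps to test it locally. The sketch below pulls the primality check out of the route into an ordinary function. The name `is_prime` and the test values are chosen here for illustration.

```python
def is_prime(number):
    """Return True if number is prime, using trial division up to sqrt(number)."""
    number = int(number)  # URL path segments arrive as strings
    return number > 1 and all(number % i for i in range(2, int(number**0.5) + 1))

for n in ["1", "2", "3", "4", "97", "100"]:
    print(n, is_prime(n))
```

Keeping the logic in a plain function like this means it can be checked without starting a web server.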

# DASK

![dask-intro](images/dask-intro.png)

Dask can utilize the threads of a local computer or scale out to clusters with thousands of cores.

![grid](images/grid_search_schedule.gif)

Here is a [Binder notebook](https://mybinder.org/v2/gh/dask/dask-examples/master?urlpath=lab/tree/array.ipynb) to try Dask.

## VAEX

Vaex can process a billion rows on a laptop computer. Please visit its [home page](https://vaex.io/docs/index.html#).

## Numba

Please visit [main page](https://numba.pydata.org/) for more information.

> Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. 


In [None]:
from numba import jit
import numpy as np
import random

In [None]:
def monte_carlo_pi(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

In [None]:
%%timeit
monte_carlo_pi(10_000_000)

In [None]:
@jit(nopython=True)
def monte_carlo_pi_numba(nsamples):
    acc = 0
    for i in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

In [None]:
monte_carlo_pi_numba(1000)

In [None]:
%%timeit
monte_carlo_pi_numba(10_000_000)
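The `%%timeit` magic only exists in IPython and Jupyter. In a plain script, a similar measurement can be sketched with the standard-library `timeit` module. The sample count is reduced here to keep the run short, and only the pure-Python version is timed so that the block runs without Numba installed.

```python
import random
import timeit

def monte_carlo_pi(nsamples):
    """Estimate pi from the fraction of random points inside the unit quarter circle."""
    acc = 0
    for _ in range(nsamples):
        x = random.random()
        y = random.random()
        if (x ** 2 + y ** 2) < 1.0:
            acc += 1
    return 4.0 * acc / nsamples

random.seed(0)  # reproducible run
estimate = monte_carlo_pi(100_000)
print("estimate:", estimate)

# Run the function 3 times and report the fastest, in seconds
best = min(timeit.repeat(lambda: monte_carlo_pi(100_000), number=1, repeat=3))
print(f"best of 3: {best:.3f} s")
```

The same `timeit.repeat` call could be pointed at the `@jit` version to compare the two, keeping in mind that the first call of a jitted function includes compilation time.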

# Bioinformatics

[BioPython](https://biopython.org/) is used for biological computation and bioinformatics. Please visit [another Jupyter notebook](http://bitly.com/biopython-jupyter) for examples.

In [None]:
from Bio.Seq import Seq
# GC returns the GC content as a percentage; newer Biopython releases provide gc_fraction instead
from Bio.SeqUtils import GC

coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
GC(coding_dna)


In [None]:
template_dna = coding_dna.reverse_complement()
template_dna

In [None]:
messenger_rna = coding_dna.transcribe()
messenger_rna.translate()

Sequences act like strings!

In [None]:
print(coding_dna[0])

In [None]:
coding_dna.count("AA")

### Parsing multi fasta files

In [None]:
from Bio import SeqIO

sequences = SeqIO.parse("data/ls_orchid.fasta", "fasta")

In [None]:
[seq.id for seq in sequences]

In [None]:
sequences = SeqIO.parse("data/ls_orchid.fasta", "fasta")
[record.seq.translate() for record in sequences if len(record)>780]

# Raspberry Pi

Here is the Raspberry Pi product line.

### Raspberry Pi 4

Raspberry Pi is a tiny, fanless, energy-efficient desktop computer. It has USB and Ethernet ports along with WiFi connectivity. Please check the link for [success stories](https://www.raspberrypi.com/success-stories/).

![](images/wide-hero-shot-6b39618796ad96d159acebd1e4d1bcf2.png)

### Raspberry Pi Zero 2 W

Raspberry Pi Zero 2 W is a tiny computer (65mm × 30mm) and has a quad-core 64-bit ARM Cortex-A53 processor clocked at 1GHz and 512MB of SDRAM. Wireless LAN is built-in giving more flexibility. 

![](https://assets.raspberrypi.com/static/51035ec4c2f8f630b3d26c32e90c93f1/2b8d7/zero2-hero.webp)

Here's a list of [Raspberry Pi Zero projects](https://www.raspberrypi.com/news/tag/raspberry-pi-zero/)

Both Raspberry Pi 4 and Raspberry Pi Zero run a full Linux operating system and support Python code that interacts with peripherals (sensors, etc.).

### Raspberry Pi Pico

It is a tiny microcontroller board that can be programmed with [MicroPython](https://www.raspberrypi.com/products/micropython-pico/), which makes it roughly comparable to an Arduino.

![](images/rasp-pico.png)

## Sensors for Raspberry Pi

There are numerous sensors that can be integrated with Raspberry Pi. Please go over the [list of sensors](https://thepihut.com/collections/adafruit-sensors) and also the possible [Pi cases](https://thepihut.com/collections/raspberry-pi-cases).

### Python for Raspberry Pi

Here's the setup:
* Raspberry Pi Zero
* NFC card reader
* buzzer

![](images/pi_nfc.jpg)

Below is sample code that reads a card ID and prints it to the screen.


```python
import RPi.GPIO as GPIO
from mfrc522 import SimpleMFRC522

reader=SimpleMFRC522()

id,text = reader.read()
print(id)
```

The sample code is run with `python scriptname.py` in a terminal on the Raspberry Pi. This simple code reads the card ID, prints it, and then exits, so it cannot read multiple cards in a row.

If you want to read cards continuously, you need to add a loop with `while True:`.

```python
import RPi.GPIO as GPIO
from mfrc522 import SimpleMFRC522

reader=SimpleMFRC522()

while True:
    id,text = reader.read()
    print(id)
```

Here's sample code that reads cards, writes each card ID to a local SQLite database, and then makes a buzz sound.

```python
import RPi.GPIO as GPIO
from mfrc522 import SimpleMFRC522
#from gpiozero import Buzzer
import sqlite3 as lite
import sys
from time import sleep

reader=SimpleMFRC522()
BUZZER = 11

def buzz(BUZZER, noteFreq, duration):
    GPIO.setmode(GPIO.BOARD)
    GPIO.setup(BUZZER, GPIO.OUT)
    halveWaveTime = 1 / (noteFreq * 2 )
    waves = int(duration * noteFreq)
    for i in range(waves):
       GPIO.output(BUZZER, True)
       sleep(halveWaveTime)
       GPIO.output(BUZZER, False)
       sleep(halveWaveTime)

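def insert_new_line_in_rfid_table(PathToDatabase, TableName, card_id):
    con = None  # so the except/finally blocks work even if connect() fails
    try:
        con = lite.connect(PathToDatabase)
        cur = con.cursor()
        # The table name cannot be a bound parameter, so it is formatted in;
        # the card id is passed as a bound parameter (?) instead.
        cur.execute("INSERT INTO %s (card_id) VALUES (?)" % TableName,
                    (card_id,))
        con.commit()

    except lite.Error as e:
        if con:
            con.rollback()
        print("Error %s:" % e.args[0])
        sys.exit(1)

    finally:
        if con:
            con.close()

# Keep reading cards until interrupted (e.g. Ctrl+C); release the GPIO pins once on exit
try:
    while True:
        id = reader.read_id()
        print(id)
        insert_new_line_in_rfid_table("card_reading.db", "cardreadings", id)
        buzz(BUZZER, 44, 0.1)
        buzz(BUZZER, 440, 0.1)
        sleep(0.5)
finally:
    GPIO.cleanup()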
```
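The database step can be tried on any computer, without a card reader. The sketch below uses an in-memory SQLite database and passes the card ID as a bound parameter (`?`) rather than formatting it into the SQL string. The table layout, the helper name `save_card` and the card IDs are assumptions made for this example.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cardreadings (card_id INTEGER)")  # assumed schema

def save_card(con, card_id):
    """Insert one card id, passing the value as a bound parameter."""
    with con:  # commits on success, rolls back on error
        con.execute("INSERT INTO cardreadings (card_id) VALUES (?)", (card_id,))

for fake_id in (1234567890, 987654321):  # made-up card ids
    save_card(con, fake_id)

rows = con.execute("SELECT card_id FROM cardreadings ORDER BY rowid").fetchall()
print(rows)
con.close()
```

Bound parameters let SQLite handle quoting and type conversion, which is generally safer than building the statement with string formatting.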

# Installing packages 

During the lecture we used several libraries and they were already installed in the notebook. What if you want to use special commands for your project, assignment or hobby? It's highly likely that someone else wrote a package for your topic of interest. In that case, it's very easy to install and use a package.

A package is installed with the `pip install packagename` command. However, this is a terminal command, not a Python command, so it needs to be sent to the terminal. In a notebook, we do that by putting an exclamation mark, `!`, in front of the command.

You can search for packages at [PyPI](https://pypi.org/) website. 
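From inside Python, you can check whether a package is already importable before installing it. This is a small sketch using the standard library; the helper name `is_installed` is made up for this example, and for some packages the import name differs from the name on PyPI.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None

for name in ["json", "dnacurve", "surely_not_a_real_package_xyz"]:
    print(name, is_installed(name))
```

The result for `dnacurve` depends on whether it has been installed in the current environment.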

Let's try a package called `dnacurve`. When we search for that package name, we land on the [dnacurve package information page](https://pypi.org/project/dnacurve/). The site tells us how to install the package, which is `pip install dnacurve`. Let's run the command with an exclamation mark.

In [None]:
!pip install dnacurve

The message says the package is successfully installed. From the [package information page](https://pypi.org/project/dnacurve/) we can get sample code.

In [None]:
from dnacurve import CurvedDNA
result = CurvedDNA('ATGCAAATTG'*5, 'trifonov', name='Example')
result.curvature[:, 18:22]

In [None]:
result.save_csv('_test.csv')
result.save_pdb('_test.pdb')
result.plot('_test.png', dpi=160)

You can check the PNG or CSV file in the current working directory.