SciPy
SciPy is a scientific computation library that uses NumPy underneath.
SciPy stands for Scientific Python.
It provides more utility functions for optimization, stats and signal processing.

Why Use SciPy?

If SciPy uses NumPy underneath, why can we not just use NumPy?
SciPy builds on NumPy by adding optimized functions that are frequently used in scientific computing and Data Science.

SciPy is predominantly written in Python, but performance-critical segments are written in compiled languages such as C and Fortran.

In [12]:
# constants: SciPy offers a set of mathematical and physical constants and units. One of them is liter, which gives 1 liter in cubic meters.

from scipy import constants
print(constants.liter)
print(constants.pi)
# print(dir(constants)) # A list of all units under the constants module

0.001
3.141592653589793
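
The same module also exposes SI prefixes and unit conversion factors as plain floats. The sketch below assumes the attribute names kilo, minute, hour and inch from scipy.constants, which the text above does not use; it only illustrates how a unit factor can be multiplied into a value.

```python
from scipy import constants

# Each constant is a plain float expressed in SI units.
print(constants.kilo)    # SI prefix: 1000.0
print(constants.minute)  # one minute in seconds: 60.0
print(constants.hour)    # one hour in seconds: 3600.0
print(constants.inch)    # one inch in meters: 0.0254

# Convert 5 liters to cubic meters by multiplying with the unit factor.
volume_m3 = 5 * constants.liter
print(volume_m3)
```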


=== SciPy Optimizers

** Optimizers are a set of procedures defined in SciPy that either find the minimum value of a function, or the root of an equation.

** Many Machine Learning algorithms essentially come down to a complex function that needs to be minimized with the help of given data.


=== Roots of an Equation

NumPy is capable of finding roots for polynomials and linear equations, but it cannot find roots for general nonlinear equations, like this one:

x + cos(x) = 0

For that you can use SciPy's **optimize.root** function.

** This function takes two required arguments:
fun - a function representing an equation.
x0 - an initial guess for the root.

The function returns an object with information regarding the solution.
The actual solution is given under attribute x of the returned object:

In [11]:
from scipy.optimize import root
import numpy as np
# Use np.cos rather than math.cos: scipy.optimize.root() passes x as an array, while math.cos expects a scalar, which can trigger a warning.

def eqn(x):
    return x + np.cos(x)

rootFn = root(eqn, 0)
print(rootFn.x)
print(rootFn) # Print all information about the solution (not just x which is the root)

[-0.73908513]
 message: The solution converged.
 success: True
  status: 1
     fun: [ 0.000e+00]
       x: [-7.391e-01]
    nfev: 9
    fjac: [[-1.000e+00]]
       r: [-1.674e+00]
     qtf: [-2.668e-13]


=== Minimizing a Function
A function, in this context, represents a curve. Curves have high points and low points.

High points are called maxima.

Low points are called minima.

**The highest point in the whole curve is called the global maximum**, whereas the rest of them are called local maxima.

**The lowest point in the whole curve is called the global minimum**, whereas the rest of them are called local minima.

Finding Minima
We can use the **scipy.optimize.minimize()** function to minimize a function.

The minimize() function takes the following arguments:

fun - a function representing an equation.

x0 - an initial guess for the minimum.

method - name of the method to use. Legal values include:
    'CG'
    'BFGS'
    'Newton-CG'
    'L-BFGS-B'
    'TNC'
    'COBYLA'
    'SLSQP'

callback - function called after each iteration of optimization.

options - a dictionary defining extra params:

{
     "disp": boolean - print detailed description
     "gtol": number - gradient norm tolerance for termination (gradient-based methods such as BFGS)
}
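
As a rough sketch of how callback and options fit together, the snippet below records every iterate of a BFGS run. The names path and record are hypothetical helpers introduced for illustration, and the gtol value is an arbitrary choice.

```python
from scipy.optimize import minimize

def eqn(x):
    # minimize() passes x as a 1-D array, so take its single element.
    return x[0]**2 + x[0] + 2

path = []  # hypothetical helper list that records each iterate

def record(xk):
    # Called by minimize() after each iteration with the current x.
    path.append(xk.copy())

res = minimize(eqn, [0.0], method='BFGS', callback=record,
               options={'disp': False, 'gtol': 1e-6})

print(res.x)      # expected to be close to -0.5
print(len(path))  # number of iterates that were recorded
```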

In [2]:
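# Example:
# Minimize the function x^2 + x + 2 with BFGS.

from scipy.optimize import minimize, root

def eqn(x):
    return x**2 + x + 2

mymin = minimize(eqn, 0, method='BFGS')
# Note: x^2 + x + 2 has no real root (its minimum value is 1.75), so the x
# reported by root() below is not a root. Check myroot.success before using it.
myroot = root(eqn, 0)
print(mymin)
print(myroot.x)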

  message: Optimization terminated successfully.
  success: True
   status: 0
      fun: 1.75
        x: [-5.000e-01]
      nit: 2
      jac: [ 0.000e+00]
 hess_inv: [[ 5.000e-01]]
     nfev: 8
     njev: 4
[-0.49999999]
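
minimize() is a local method, so the minimum it reports can depend on the initial guess. The sketch below uses a hypothetical curve with two valleys (the function bumpy is made up for illustration) and starts the search once on each side.

```python
from scipy.optimize import minimize

def bumpy(x):
    # Hypothetical curve with two valleys: a deeper one near x = -1.3
    # and a shallower one near x = 1.1.
    return x[0]**4 - 3 * x[0]**2 + x[0]

left = minimize(bumpy, [-1.0], method='BFGS')
right = minimize(bumpy, [1.0], method='BFGS')

print(left.x, left.fun)    # minimum found when starting on the left
print(right.x, right.fun)  # minimum found when starting on the right
```

Under these assumptions, the two runs should end in different valleys, and only the one with the lower fun value is the global minimum of this curve.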


=== SciPy Sparse Data
Sparse data is data that has mostly unused elements (elements that don't carry any information).
In scientific computing, sparse data commonly comes up when dealing with partial derivatives in linear algebra.

""" Sparse Data: is a data set where most of the item values are zero.
Dense Array: is the opposite of a sparse array: most of the values are not zero. """

How to Work With Sparse Data
SciPy has a module, scipy.sparse, that provides functions to deal with sparse data.

There are primarily two types of sparse matrices that we use:

**CSC - Compressed Sparse Column. For efficient arithmetic, fast column slicing.**

**CSR - Compressed Sparse Row. For fast row slicing, faster matrix vector products**

In [13]:
""" CSR Matrix
We can create a CSR matrix by passing an array into the function **scipy.sparse.csr_matrix().** """

## Sparse Matrix Methods

import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([[0, 0, 0], [0, 1, 0], [0, 1, 2]])

print(csr_matrix(arr))
print('stored data : ', csr_matrix(arr).data) # Viewing stored data (not the zero items) with the data property
print('No. of non-zeros :', csr_matrix(arr).count_nonzero())

  (1, 1)	1
  (2, 1)	1
  (2, 2)	2
stored data :  [1 1 2]
No. of non-zeros : 3
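
To see what "compressed sparse row" means in practice, the following sketch prints the three arrays a CSR matrix keeps internally for the same arr as above. The attributes indices and indptr are not mentioned in the text above and are shown here only for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([[0, 0, 0], [0, 1, 0], [0, 1, 2]])
mat = csr_matrix(arr)

print(mat.data)     # stored values, row by row: [1 1 2]
print(mat.indices)  # column index of each stored value: [1 1 2]
print(mat.indptr)   # row i owns data[indptr[i]:indptr[i+1]]: [0 0 1 3]

# Converting back to a dense array recovers the original.
print(np.array_equal(mat.toarray(), arr))  # True
```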


In [None]:
# Removing zero-entries (explicitly stored zeros) from the matrix with the eliminate_zeros() method:
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])
mat = csr_matrix(arr)
mat.eliminate_zeros()
mat.sum_duplicates() # Eliminating duplicates by adding them
newarr = csr_matrix(arr).tocsc() # Converting from csr to csc with the tocsc() method
print('mat(csr) :\n',mat)
print('newarr(csc) :\n', newarr)

""" 
# Note: Apart from the mentioned sparse-specific operations, sparse matrices support many of the 
operations that normal matrices support, e.g. reshaping, summing, arithmetic, broadcasting etc. """

mat(csr) :
   (1, 2)	1
  (2, 0)	1
  (2, 2)	2
newarr(csc) :
   (2, 0)	1
  (1, 2)	1
  (2, 2)	2
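
A matrix built directly from a dense array has no stored zeros, so eliminate_zeros() changes nothing in the cell above. The sketch below forces an explicit zero by overwriting one stored value (a contrived step, used only for illustration) and then tries a few ordinary operations on the result.

```python
import numpy as np
from scipy.sparse import csr_matrix

mat = csr_matrix(np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]]))

# Overwrite one stored value with 0: the entry stays in storage.
mat.data[0] = 0
print(mat.nnz)              # stored entries, including the explicit zero: 3
print(mat.count_nonzero())  # entries that are actually non-zero: 2

mat.eliminate_zeros()       # drops the explicitly stored zero in place
print(mat.nnz)              # 2

# Ordinary operations still work on the sparse matrix.
v = np.array([1, 1, 1])
print(mat @ v)              # matrix-vector product: [0 0 3]
print(mat.sum())            # sum of all elements: 3
print((mat + mat).toarray())
```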
