### SciPy Introduction

#### What is SciPy?
    SciPy is a scientific computation library that uses NumPy underneath.

    SciPy stands for Scientific Python.

    It adds utility functions for optimization, statistics and signal processing.

    Like NumPy, SciPy is open source so we can use it freely.

    SciPy was created by NumPy's creator, Travis Oliphant.
    
#### Why Use SciPy?
    If SciPy uses NumPy underneath, why can we not just use NumPy?

    SciPy adds optimized implementations of functions that are frequently needed alongside NumPy in Data Science.
    

In [1]:
# SciPy Constants
    # As SciPy is more focused on scientific implementations, it provides many built-in scientific constants.

    # These constants can be helpful when you are working with Data Science.
    
from scipy import constants

print(constants.pi)



# Constant Units
    # A list of all units under the constants module can be seen using the dir() function.
    
print(dir(constants))


https://www.w3schools.com/python/scipy_constants.asp

3.141592653589793
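
As a small illustration of the unit constants, the sketch below converts a few quantities to SI units. `kilo`, `hour`, `minute` and `inch` are attributes of `scipy.constants`; the variable names and the sample quantities are made up for this example.

```python
from scipy import constants

# Each unit constant holds the value of one unit expressed in SI units.
distance_m = 5 * constants.kilo                           # 5 km in metres
duration_s = 2 * constants.hour + 30 * constants.minute   # 2 h 30 min in seconds
width_m = 10 * constants.inch                             # 10 inches in metres

print(distance_m)
print(duration_s)
print(width_m)
```

Multiplying by a unit constant keeps the conversion factor out of the code, which avoids hand-typed numbers such as 0.0254.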


### SciPy Optimizers

#### Optimizers in SciPy
    Optimizers are a set of procedures defined in SciPy that either find the minimum value of a function, or the root of an equation.
    
#### Optimizing Functions

    Essentially, many algorithms in Machine Learning come down to a complex equation (a loss function) that needs to be minimized with the help of given data.
    
#### Roots of an Equation

    NumPy is capable of finding roots for polynomials and linear equations, but it cannot find roots for nonlinear equations, like this one:
    x + cos(x)

    For that you can use SciPy's optimize.root function.

    This function takes two required arguments:

        fun - a function representing an equation.

        x0 - an initial guess for the root.

    The function returns an object with information regarding the solution.

    The actual solution is given under attribute x of the returned object:

In [3]:
from scipy.optimize import root
from numpy import cos  # root() passes x to eqn as a NumPy array, so use NumPy's cos

def eqn(x):
    return x + cos(x)

myroot = root(eqn, 0)

print(myroot.x)

# Print all information about the solution (not just x which is the root)
print(myroot)


[-0.73908513]
    fjac: array([[-1.]])
     fun: array([0.])
 message: 'The solution converged.'
    nfev: 9
     qtf: array([-2.66786593e-13])
       r: array([-1.67361202])
  status: 1
 success: True
       x: array([-0.73908513])
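
To contrast the two cases described in this section, here is a minimal sketch that solves a polynomial with NumPy's `roots` and the nonlinear equation with `scipy.optimize.root`, then checks the answer by plugging it back into the equation. The polynomial x**2 - 3x + 2 is an arbitrary example chosen for illustration; it does not come from the notes above.

```python
import numpy as np
from scipy.optimize import root

# Polynomial x**2 - 3x + 2: NumPy finds the roots from the coefficients alone.
poly_roots = np.sort(np.roots([1, -3, 2]))
print(poly_roots)

# Nonlinear equation x + cos(x) = 0: SciPy needs a function and a starting guess.
def eqn(x):
    return x + np.cos(x)

sol = root(eqn, 0)

# Check that the solver converged, then print the root and its residual.
if sol.success:
    print(sol.x[0], eqn(sol.x[0]))
```

Checking `success` first is a cheap safeguard, since `x` is filled in even when the solver did not converge.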


### Minimizing a Function
    A function, in this context, represents a curve. Curves have high points and low points.

    High points are called maxima.

    Low points are called minima.

    The highest point on the whole curve is called the global maximum, whereas the other high points are called local maxima.

    The lowest point on the whole curve is called the global minimum, whereas the other low points are called local minima.
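
The difference between local and global minima can be seen by sampling a curve on a grid. The sketch below is only an illustration: the curve x**4 - 3x**2 + x, the interval and the grid size are assumptions picked because this curve has two valleys of different depth.

```python
import numpy as np

# Sample an example curve with two valleys on a fine grid.
x = np.linspace(-2, 2, 401)
y = x**4 - 3 * x**2 + x

# An interior sample is a local minimum if it is lower than both neighbours.
is_local_min = (y[1:-1] < y[:-2]) & (y[1:-1] < y[2:])
local_minima_x = x[1:-1][is_local_min]

# The global minimum is the lowest of all sampled points.
global_min_x = x[np.argmin(y)]

print(local_minima_x)
print(global_min_x)
```

Both valleys show up as local minima, but only the deeper one (on the negative side of the x axis) is the global minimum.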

#### Finding Minima
    We can use the scipy.optimize.minimize() function to minimize a function.

    The minimize() function takes the following arguments:

        fun - a function representing an equation.

        x0 - an initial guess for the minimum.

        method - name of the method to use. Legal values include:

            'CG'
            'BFGS'
            'Newton-CG'
            'L-BFGS-B'
            'TNC'
            'COBYLA'
            'SLSQP'

        callback - function called after each iteration of optimization.

        options - a dictionary defining extra params:

            {
             "disp": boolean - print detailed description
             "gtol": number - the gradient tolerance used as the stopping criterion (for methods such as 'BFGS')
            }

In [4]:
from scipy.optimize import minimize

def eqn(x):
    return x**2 + x + 2

mymin = minimize(eqn, 0, method='BFGS')

print(mymin)

      fun: 1.75
 hess_inv: array([[0.50000001]])
      jac: array([0.])
  message: 'Optimization terminated successfully.'
     nfev: 8
      nit: 2
     njev: 4
   status: 0
  success: True
        x: array([-0.50000001])
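
The `callback` and `options` arguments are not used in the cell above, so here is a hedged sketch of how they might be passed. The helper `record` and the list `history` are hypothetical names introduced for this example, and the `gtol` value is an arbitrary choice.

```python
from scipy.optimize import minimize

def eqn(x):
    # x arrives as a one-element array; return a plain scalar.
    return x[0]**2 + x[0] + 2

history = []

def record(xk):
    # Called after each iteration with the current estimate of the minimum.
    history.append(xk.copy())

result = minimize(
    eqn,
    x0=[0.0],
    method='BFGS',
    callback=record,
    options={'disp': False, 'gtol': 1e-6},
)

print(result.x)
print(len(history))
```

Storing the intermediate points makes it possible to inspect how quickly the estimate approaches -0.5.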


In [11]:
# SciPy Sparse Data

# What is Sparse Data
    # Sparse data is data that has mostly unused elements (elements that don't carry any information).

    # It can be an array like this one: [1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0]
    
    
# Sparse data: a data set where most of the item values are zero.

# Dense array: the opposite of a sparse array; most of the values are not zero.



# How to Work With Sparse Data
    # SciPy has a module, scipy.sparse that provides functions to deal with sparse data.

    # There are primarily two types of sparse matrices that we use:

        # CSC - Compressed Sparse Column. For efficient arithmetic, fast column slicing.

        # CSR - Compressed Sparse Row. For fast row slicing and faster matrix-vector products.

# We will use the CSR matrix in this tutorial.



# CSR Matrix
    # We can create a CSR matrix by passing an array into the function scipy.sparse.csr_matrix().
    
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([0, 0, 0, 0, 0, 1, 1, 0, 2])

print(csr_matrix(arr))

    # The first item is in row 0, position 5, and has the value 1.

    # The second item is in row 0, position 6, and has the value 1.

    # The third item is in row 0, position 8, and has the value 2.

    

# Sparse Matrix Methods
    # Viewing stored data (not the zero items) with the data property:
    
arr = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])

print(csr_matrix(arr).data)


    # Counting nonzeros with the count_nonzero() method:
print(csr_matrix(arr).count_nonzero())


    # Removing zero-entries from the matrix with the eliminate_zeros() method:
mat = csr_matrix(arr)
mat.eliminate_zeros()

print(mat)

print('\n')
    # Eliminating duplicate entries by adding them together with the sum_duplicates() method:
arr = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])

mat = csr_matrix(arr)
mat.sum_duplicates()

print(mat)


print('\n')
# Converting from csr to csc with the tocsc() method:
arr = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])

newarr = csr_matrix(arr).tocsc()

print(newarr)

  (0, 5)	1
  (0, 6)	1
  (0, 8)	2
[1 1 2]
3
  (1, 2)	1
  (2, 0)	1
  (2, 2)	2


  (1, 2)	1
  (2, 0)	1
  (2, 2)	2


  (2, 0)	1
  (1, 2)	1
  (2, 2)	2
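
In the cells above, eliminate_zeros() and sum_duplicates() leave the matrix unchanged, because a matrix built from a dense array stores no zeros and no duplicates. The sketch below constructs matrices where the two methods have a visible effect. The values are made up for illustration, and the second matrix is built from raw `(data, indices, indptr)` arrays so that one position is deliberately stored twice.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Overwrite a stored value with 0: it stays in storage as an explicit zero.
mat = csr_matrix(np.array([[0, 0, 1], [1, 0, 2]]))
mat.data[0] = 0
stored_before = mat.nnz      # number of stored entries, explicit zero included
mat.eliminate_zeros()
stored_after = mat.nnz

# Build a CSR matrix whose first row stores column 0 twice (values 1 and 2).
dup = csr_matrix(([1, 2, 3], [0, 0, 2], [0, 2, 3]), shape=(2, 3))
dup_before = dup.nnz
dup.sum_duplicates()         # merges the two entries into a single value, 3
dup_after = dup.nnz

print(stored_before, stored_after)
print(dup_before, dup_after)
print(dup.toarray())
```

The `nnz` attribute counts stored entries, which is why it drops after each cleanup call.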


### SciPy Graphs

#### Working with Graphs
    # Graphs are an essential data structure.

    # SciPy provides us with the module scipy.sparse.csgraph for working with such data structures.
    
    
#### Adjacency Matrix
    # An adjacency matrix is an n x n matrix, where n is the number of elements in a graph.

    # The values represent the connections between the elements.
    
    # For a graph with elements A, B and C, suppose the connections are:

    # A & B are connected with weight 1.

    # A & C are connected with weight 2.

    # C & B are not connected.

    # The adjacency matrix would look like this:
    
       A B C
    A:[0 1 2]
    B:[1 0 0]
    C:[2 0 0]

    

In [12]:
# Connected Components
    # Find all of the connected components with the connected_components() method.
    
import numpy as np
from scipy.sparse.csgraph import connected_components
from scipy.sparse import csr_matrix

arr = np.array([
  [0, 1, 2],
  [1, 0, 0],
  [2, 0, 0]
])

newarr = csr_matrix(arr)

print(connected_components(newarr))



(1, array([0, 0, 0]))
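
The result above is a tuple: the number of connected components, followed by an array giving the component label of each element. A graph with an isolated element makes this easier to read. The matrix below is a hypothetical variation of the example in which C has no connections, and passing `directed=False` is a choice made here to treat the connections as two-way.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

# A and B are connected; C has no connections at all.
arr = np.array([
  [0, 1, 0],
  [1, 0, 0],
  [0, 0, 0]
])

n_components, labels = connected_components(csr_matrix(arr), directed=False)

print(n_components)   # 2
print(labels)         # [0 0 1]
```

A and B share label 0, while C gets its own label 1, so the graph has two components.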


In [None]:
# STOPPED HERE - Skipped ahead to Machine Learning 03/03/2021
https://www.w3schools.com/python/scipy_graphs.asp