## [Python packages or modules](https://pypi.org/)

Python files, and collections of such files, that contain classes and methods for a specific kind of computation are called Python packages or Python modules. Strictly speaking, a module is a single `.py` file and a package is a directory of modules. They are easy to find on the internet, and we can also write our own modules for a special purpose. Before developing a new module, it is usually worth searching the internet, because someone may already have had the same idea and developed a similar module that serves the purpose.

Some of the packages we get to know in this notebook are:
- [Numpy](https://numpy.org/) - the basic package for numerical computation
- [Matplotlib](https://matplotlib.org/) and other visualization packages
- [Pandas](https://pandas.pydata.org/) - data analysis

First we will learn how to 'import' a module, using as an example a simple module that Milica developed and saved in a file named `konverzija_temp.py`.

In [9]:
from konverzija_temp import Konverzija as konv

# which methods are available to us
help(konv)

Help on class Konverzija in module konverzija_temp:

class Konverzija(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  farenhajt_u_celzijus(temp_f)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



In [10]:
temp_fahr = 90

konv.farenhajt_u_celzijus(temp_fahr)

32.22222222222222
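
The contents of `konverzija_temp.py` are not shown here, so the following is only a guess at what such a module might contain. It assumes the conversion is the usual (F - 32) * 5/9 formula and uses a `staticmethod`, which is one way to make the function callable on the class without creating an instance; the real file may be written differently.

```python
class Konverzija:
    """Hypothetical stand-in for the class defined in konverzija_temp.py."""

    @staticmethod
    def farenhajt_u_celzijus(temp_f):
        # Fahrenheit to Celsius: subtract 32, then scale by 5/9
        return (temp_f - 32) * 5 / 9


# Called on the class itself, as in the cell above
print(Konverzija.farenhajt_u_celzijus(90))
```

Saving this class in a file named `konverzija_temp.py` next to the notebook would make the `from konverzija_temp import Konverzija as konv` line work.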

In [None]:
# Numpy and Matplotlib

#### Here I have included the code from the Python Boot Camp 2012 lecture on numpy and matplotlib.  You can follow along with the material on the slides without having to manually type the code itself.  The lecture was designed to provide an introduction to the numpy module (how python handles data arrays) and the matplotlib module (a python plotting module).

First we want to import the appropriate modules into our namespace (note that this is done automatically with the "--pylab" flag).

import numpy as np

The primary building block of the numpy module is the class "ndarray".  An ndarray object represents a multidimensional, homogeneous array of fixed-size items.  An associated data-type object describes the format of each element in the array.  An ndarray object is (almost) never instantiated directly, but instead using a function that returns an instance of the class.

a = np.array([1, 2, 3])

a

The "ones" and "zeros" methods return an array object of the requested shape and type.

b = np.ones((3,2))

b

b.shape

c = np.zeros((1,3), int)

c

type(c)

c.dtype

"linspace" creates a one-dimensional array of arg3 evenly spaced values running from arg1 to arg2, with both endpoints included.

d = np.linspace(1,5,11)

d
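
As a small illustration of how the point count determines the spacing, the sketch below recomputes the same array and contrasts it with `np.arange`, which takes a step size instead of a count. The variable names `step` and `e` are only illustrative.

```python
import numpy as np

d = np.linspace(1, 5, 11)      # 11 points, both endpoints included
step = d[1] - d[0]             # spacing is (5 - 1) / (11 - 1)

# arange takes a step instead of a count and excludes the stop value
e = np.arange(1, 5, 0.5)

print(d.size, step)
print(e)
```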

numpy provides a variety of functions to read and write data to disk (binary, ascii, csv, etc.; FITS files need an external package).  These include "loadtxt" and "tofile".  I haven't included these because of limitations with the Notebook.
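
For reference, a text round trip can be sketched without touching the disk by writing to an in-memory buffer. This uses `np.savetxt` and `np.loadtxt` with an `io.StringIO` object standing in for a real file; the array contents are made up for the example.

```python
import io
import numpy as np

data = np.arange(6.0).reshape(3, 2)

buf = io.StringIO()                      # in-memory stand-in for a file on disk
np.savetxt(buf, data, delimiter=",")     # write as comma-separated text
buf.seek(0)                              # rewind before reading back
loaded = np.loadtxt(buf, delimiter=",")

print(np.array_equal(data, loaded))
```

Passing a filename such as `"data.csv"` instead of `buf` would read and write an actual file.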

ndarray objects can be indexed, sliced, and iterated over much like lists.  The format for slicing is still "x1:x2:dx".

a = np.arange(10)

a

a[2]

a[2:5]

a[:6:2] = -1000

a

a[::-1]

a[2:-2]

Arrays can hold (almost) any type of data, as long as every element has the same type (i.e., requires the same amount of memory).  The format of the ndarray can be specified with the "dtype" argument.  Individual fields may be "named" in a structured array.

# fields get the default names f0 (int32), f1 (float32), f2 (10-byte string)
x = np.zeros((2,), dtype='i4,f4,S10')

x

x['f1']

Note that the same issues of references and copies that apply to other variables also apply to array objects.  Selecting a field, as below, gives a view onto the same memory, so changing y also changes x.

y = x['f1']

y

y += np.array([1.0, 1.0])

y

x
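
When an independent array is wanted instead, an explicit `.copy()` breaks the link. The sketch below repeats the field selection with both a view and a copy and uses `np.shares_memory` to check which one is still tied to `x`; the names `view` and `copy` are only illustrative.

```python
import numpy as np

x = np.zeros((2,), dtype='i4,f4,S10')

view = x['f1']            # a view: shares memory with x
copy = x['f1'].copy()     # an independent copy

view += 1.0               # changes x as well
copy += 5.0               # leaves x untouched

print(x['f1'])
print(np.shares_memory(view, x), np.shares_memory(copy, x))
```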

A universal function (or "ufunc" for short) is a function that operates on ndarrays in an element-by-element fashion, supporting array broadcasting, type casting, and several other standard features. That is, a ufunc is a “vectorized” wrapper for a function that takes a fixed number of scalar inputs and produces a fixed number of scalar outputs.  Examples include add, subtract, multiply, exp, log, and power.

Most array operations thus occur on an element-by-element basis:

a = np.array([[1, 2], [3, 4]])

b = np.array([[2, 3], [4, 5]])

a + b

np.multiply(a, b)

a ** b

Standard linear algebra (i.e., matrix) operations are also available.  Many are stored in the linalg module.

np.dot(a,b)
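
A few more matrix operations, as a brief sketch: the `@` operator gives the same matrix product as `np.dot` for 2-D arrays, and `np.linalg` supplies the determinant and a linear solver. The right-hand side `[5, 11]` is chosen only for the example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[2, 3], [4, 5]])

prod = a @ b                       # matrix product, same as np.dot(a, b) here
det = np.linalg.det(a)             # determinant of a
sol = np.linalg.solve(a, [5, 11])  # solve a @ v = [5, 11] for v

print(prod)
print(det, sol)
```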

Universal functions run *much* faster than for loops, which should be avoided whenever possible.

a = np.random.random((500,500))

b = np.random.random((500,500))

def mult1(a,b):
    return a * b

def mult2(a,b):
    c = np.empty(a.shape)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            c[i,j] = a[i,j] * b[i,j]
    return c

%timeit mult1(a,b)

%timeit mult2(a,b)
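
Outside IPython, roughly the same comparison can be made with the standard-library `timeit` module. The sketch below uses smaller 100 x 100 arrays and only 10 repetitions so that it finishes quickly; the function names are illustrative and the actual timings depend on the machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((100, 100))
b = rng.random((100, 100))

def mult_vectorized(a, b):
    # element-wise product via the multiply ufunc
    return a * b

def mult_loops(a, b):
    # the same product, one element at a time
    c = np.empty(a.shape)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            c[i, j] = a[i, j] * b[i, j]
    return c

t_vec = timeit.timeit(lambda: mult_vectorized(a, b), number=10)
t_loop = timeit.timeit(lambda: mult_loops(a, b), number=10)
print(f"vectorized: {t_vec:.5f} s, loops: {t_loop:.5f} s")
```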

numpy will (usually) intelligently deal with arrays of different sizes.  The smaller array is *broadcast* across the larger array so that they have compatible shapes.  Note that the rules for broadcasting are not always intuitive, so be careful!

a=np.array([1,2,3.])

a + 2

b=np.array([10,20,30.,40])

a * b  # raises ValueError: shapes (3,) and (4,) cannot be broadcast together

a = a.reshape(3,1)

a

a * b
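
The shape rule behind this can be sketched as follows: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The names `col` and `row` are only illustrative.

```python
import numpy as np

col = np.array([1, 2, 3.0]).reshape(3, 1)   # shape (3, 1)
row = np.array([10, 20, 30.0, 40])          # shape (4,)

out = col * row                             # (3, 1) with (4,) gives (3, 4)
print(out.shape)

try:
    np.array([1, 2, 3.0]) * row             # (3,) with (4,): 3 != 4, neither is 1
except ValueError as err:
    print("cannot broadcast:", err)
```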

Universal functions make it nearly trivial to compare arrays on an element-by-element basis.

a = np.array([1, 3, 0], float)

b = np.array([0, 3, 2], float)

a > b

a == b

c = a <= b

c

np.logical_and(a > 0, a < 3)

np.logical_or(a,b)
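
Boolean arrays like these are often reduced to a single answer. A short sketch, using the same two arrays, of counting matches and of `np.any` and `np.all`:

```python
import numpy as np

a = np.array([1, 3, 0], float)
b = np.array([0, 3, 2], float)

n_greater = (a > b).sum()        # True counts as 1, so this counts matches
any_equal = np.any(a == b)       # is at least one pair equal?
all_le = np.all(a <= b)          # is every element of a <= its partner in b?

print(n_greater, any_equal, all_le)
```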

The *where* function provides a fast way to search (and extract) individual elements of an array.  When called with a single (conditional) argument, it returns a tuple of index arrays (one per dimension) marking where the condition is met.  If two additional arguments are added, it takes elements from the second argument where the condition holds and from the third elsewhere.

a = np.array([1, 3, 0, -5, 0], float)

np.where(a != 0)

a[a != 0]

np.where(a != 0.0, 1 / a, a)

x = np.arange(9.).reshape(3, 3)

x

np.where( x > 5 )
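
Two details are worth a sketch. First, the tuple returned by the one-argument form unpacks into one index array per dimension. Second, in the three-argument form both value arguments are computed in full before selection, so `1 / a` still divides by zero; `np.divide` with its `where` and `out` arguments is one way to avoid that. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 3, 0, -5, 0], float)
x = np.arange(9.).reshape(3, 3)

idx = np.where(a != 0)[0]            # 1-D case: a tuple holding one index array
rows, cols = np.where(x > 5)         # 2-D case: row indices and column indices

# Divide only where a is non-zero; other positions keep the 0 from `out`
inv = np.divide(1.0, a, out=np.zeros_like(a), where=a != 0)

print(idx, rows, cols)
print(inv)
```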

numpy provides basic statistical functions (mean, median, standard deviation, etc.).

a = np.array([[1, 2], [3, 4]])

np.mean(a)

np.mean(a, axis=0)

np.mean(a, axis=1)

np.std(a)

np.average(range(1,11), weights=range(10,0,-1))
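
The weighted average in the last line is the sum of value times weight divided by the sum of the weights. A short check of that formula against `np.average`, with illustrative variable names:

```python
import numpy as np

values = np.arange(1, 11)          # 1, 2, ..., 10
weights = np.arange(10, 0, -1)     # 10, 9, ..., 1

manual = (values * weights).sum() / weights.sum()
builtin = np.average(values, weights=weights)

print(manual, builtin)
```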

The *random* module contains basic random number generation, as well as a few common probability distribution functions.  Many more (complex) pdfs are available within scipy.

np.random.rand(5)

np.random.randint(5, 10)

np.random.normal(1.5, 4.0)
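
Newer NumPy versions also offer a generator object through `np.random.default_rng`, which makes it easy to get reproducible draws from a seed. The sketch below mirrors the three calls above; the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
u = rng.random(5)                  # 5 uniform samples in [0, 1)
k = rng.integers(5, 10)            # one integer from 5 to 9
n = rng.normal(1.5, 4.0, size=3)   # 3 normal samples, mean 1.5, std 4.0

# A second generator with the same seed repeats the same sequence
rng2 = np.random.default_rng(seed=42)
u2 = rng2.random(5)

print(u, k, n)
print(np.array_equal(u, u2))
```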

Plotting is done with the *matplotlib* module.  If you have used MATLAB before, the syntax should look very familiar.

import matplotlib.pyplot as plt

The simplest (but still quite powerful) method is *plot*.

x = np.array([1,2,3])
y = x**2
plt.plot(x, y, "ro")

Another basic example.

x = np.linspace(0, 2*np.pi, 300)
y = np.sin(x)
plt.plot(x, y)

Here's a more realistic plot, including how to modify axis labels, create a legend, etc.

x = np.linspace(0, 2*np.pi, 300)
y = np.sin(x)
y2 = np.sin(x**2)
plt.plot(x, y, label=r'$\sin(x)$')
plt.plot(x, y2, label=r'$\sin(x^2)$')
plt.title('Some functions')
plt.xlabel('x')
plt.ylabel('y')
plt.grid()
plt.legend()
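
The same figure can also be built with matplotlib's object-oriented interface, which is convenient in scripts. The sketch below assumes the non-interactive "Agg" backend and writes the PNG into an in-memory buffer rather than a file; both choices are just for the example.

```python
import io
import numpy as np
import matplotlib
matplotlib.use("Agg")              # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 300)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label=r'$\sin(x)$')
ax.plot(x, np.sin(x**2), label=r'$\sin(x^2)$')
ax.set_title('Some functions')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.grid()
ax.legend()

buf = io.BytesIO()
fig.savefig(buf, format="png")     # a filename such as "plot.png" also works
plt.close(fig)
```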

numpy also provides capabilities for (least squares) polynomial fitting.

x = np.array([0.0, 1.0, 2.0, 3.0,  4.0,  5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])
z = np.polyfit(x, y, 3)
p = np.poly1d(z)
p30 = np.poly1d(np.polyfit(x, y, 30))
xp = np.linspace(-2, 6, 100)
plt.plot(x, y, '.', xp, p(xp), '-', xp, p30(xp), '--')
plt.ylim(-2,2)
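
To look at a fit numerically rather than visually, the residuals can be computed directly. The sketch below repeats the cubic fit, measures its root-mean-square residual, and then checks that `np.polyfit` recovers the coefficients of a known cubic; the test polynomial 2x^3 - x + 1 is made up for the example.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])

coeffs = np.polyfit(x, y, 3)             # coefficients, highest power first
p = np.poly1d(coeffs)
residuals = y - p(x)
rms = np.sqrt(np.mean(residuals**2))     # typical size of the misfit

# Sanity check on data that is exactly cubic
y_exact = 2 * x**3 - x + 1
recovered = np.polyfit(x, y_exact, 3)

print(coeffs, rms)
print(recovered)
```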

Here's an example of the histogram plotting method (as well as the random number generation).

mu, sigma = 0, 0.1
s = np.random.normal(mu, sigma, 1000)
count, bins, ignored = plt.hist(s, 30, density=True)
plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) * np.exp( - (bins - mu)**2 / (2 * sigma**2) ), color='r')
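
The normalization used here can be checked without plotting. As a sketch, `np.histogram` with `density=True` returns bar heights whose total area (height times bin width) is 1, which is what makes the histogram comparable to the probability density curve. The seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0, 0.1
s = rng.normal(mu, sigma, 1000)

density, edges = np.histogram(s, bins=30, density=True)
area = np.sum(density * np.diff(edges))   # sum of height * width over all bins

print(len(density), len(edges), area)
```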

Probably the best way to learn about making new plots is looking at the 
<a href="http://matplotlib.sourceforge.net/gallery.html">Matplotlib Gallery</a>.

In [None]:
pandas