## Hello Jupyter

In [2]:
print ('hello Alex')

hello Alex


In [3]:
x = 3
print (x)

3


In [4]:
# re-execute this cell several times: x grows by 5 on each run
x = x + 5
# the value of the last expression in a cell is displayed as its output
x

8

In [5]:
my_string = "string can contain characters including \n new lines and \t tabs"
my_string

'string can contain characters including \n new lines and \t tabs'

In [6]:
print (my_string)

string can contain characters including 
 new lines and 	 tabs


In [7]:
12 + 45

57

In [8]:
# mixing int and float promotes the result to float
24 * 3.5

84.0

## Lists

In [9]:
x = [1, 2, 3]

In [10]:
# in Python, indexing starts from 0
x[0], x[2]

(1, 3)

In [11]:
y = x[0:2] + [5, 10]
y

[1, 2, 5, 10]

In [12]:
y[::2]

[1, 5]

In [13]:
y[::-1]

[10, 5, 2, 1]
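
The slices above all follow the `start:stop:step` pattern. As a minimal sketch, the same pattern applied to a string and a tuple (the names `letters` and `pair` are illustrative and not part of the original notebook):

```python
letters = "abcdefgh"              # illustrative data, not from the notebook
pair = (10, 20, 30, 40, 50)

first_three = letters[:3]         # start defaults to 0
every_other = letters[::2]        # step of 2 skips every second item
reversed_letters = letters[::-1]  # negative step walks backwards
middle = pair[1:4]                # stop index is exclusive
last_two = pair[-2:]              # negative indices count from the end

print(first_three, every_other, reversed_letters, middle, last_two)
```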

## Dictionaries

In [14]:
d = {'a': 1, 'b': 2}

In [15]:
d['c'] = 3
d['casdsd'] = 34
d

{'a': 1, 'b': 2, 'c': 3, 'casdsd': 34}

In [16]:
for key, value in d.items():
    print (key, value)

a 1
b 2
c 3
casdsd 34
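
Besides item assignment and iteration, a few other dictionary operations come up often. A small sketch (the dictionary `prices` and its keys are made up for illustration):

```python
prices = {'apple': 3, 'pear': 5}      # hypothetical example data

has_apple = 'apple' in prices         # membership test checks keys
missing = prices.get('plum', 0)       # .get returns a default instead of raising KeyError
total = sum(prices.values())          # iterate over values only
names = sorted(prices.keys())         # iterate over keys only

print(has_apple, missing, total, names)
```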


## Important python libraries

 - numpy (numerics) + scipy (scientific functions) 
 - matplotlib - plotting
 - pandas - convenient operations on data for Data Science
 - scikit-learn - machine learning
 
We'll meet them very soon.

## Hello numpy!

`numpy` is the core of scientific Python. It is the most convenient way to organize number-crunching in Python.

In [17]:
import numpy

In [18]:
x = numpy.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [19]:
x.reshape(5, 2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [20]:
# x.reshape(3, 4)  # would raise ValueError: 10 elements cannot fill a 3x4 array
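
The commented-out call above would fail, because an array of 10 elements cannot be rearranged into 3 x 4 = 12 slots. A minimal sketch of what happens, plus the `-1` shorthand that asks numpy to infer one dimension (the variable names `reshape_failed` and `inferred` are illustrative):

```python
import numpy

x = numpy.arange(10)

try:
    x.reshape(3, 4)           # 10 elements do not fit a 3x4 shape
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("reshape failed:", error)

# -1 lets numpy work out the missing dimension from the array size
inferred = x.reshape(2, -1)
print(inferred.shape)
```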

In [21]:
# slicing has the same logic for lists / strings / tuples / numpy, etc
x[:4]

array([0, 1, 2, 3])

In [22]:
print (x[:3])
print (x[3:7])
print (x[7:])

[0 1 2]
[3 4 5 6]
[7 8 9]


### Vector operations

In [25]:
x = numpy.arange(10 ** 6)  # evenly spaced integers 0, 1, ..., 10**6 - 1
# vector operations apply the same computation to every element: here each element is multiplied by 3, then 12 is added
3 * x + 12.

array([1.200000e+01, 1.500000e+01, 1.800000e+01, ..., 3.000003e+06,
       3.000006e+06, 3.000009e+06])

In [26]:
# use the %timeit magic to see that this is quite fast
%timeit 3 * x + 12.

4.63 ms ± 325 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
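
To put the timing above in context, here is a rough sketch comparing the vectorized expression with an equivalent pure-Python loop. It uses the standard `timeit` module instead of the `%timeit` magic so it also runs outside a notebook. The helper names `with_loop` and `vectorized` are illustrative, and the array is smaller than in the cell above to keep the loop quick. Actual timings depend on the machine, so none are shown.

```python
import timeit

import numpy

x = numpy.arange(10 ** 5)

def with_loop(values):
    # one Python-level multiplication and addition per element
    return [3 * v + 12. for v in values]

def vectorized(values):
    # a single numpy expression applied to the whole array
    return 3 * values + 12.

# both approaches should produce the same numbers
same = numpy.allclose(numpy.array(with_loop(x)), vectorized(x))

loop_time = timeit.timeit(lambda: with_loop(x), number=3)
vector_time = timeit.timeit(lambda: vectorized(x), number=3)
print(same, loop_time, vector_time)
```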


In [30]:
Z = numpy.arange(15).reshape(5, 3)
Z

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [31]:
numpy.log(numpy.exp(Z)) # the result is float: exp and log return floating-point arrays

array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.],
       [ 6.,  7.,  8.],
       [ 9., 10., 11.],
       [12., 13., 14.]])

In [32]:
Z += 4

In [33]:
Z

array([[ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])

In [34]:
Z[::2, :]

array([[ 4,  5,  6],
       [10, 11, 12],
       [16, 17, 18]])

In [35]:
Z[[0, 2, 4], :]

array([[ 4,  5,  6],
       [10, 11, 12],
       [16, 17, 18]])

In [36]:
Z.sum(axis=1)

array([15, 24, 33, 42, 51])

In [37]:
# axes are also numbered from zero
Z.sum(axis=0)

array([50, 55, 60])

In [38]:
Z.max(axis=1)

array([ 6,  9, 12, 15, 18])

In [39]:
Z2 = - Z
Z2 = numpy.sort(Z2, axis=1)
Z2

array([[ -6,  -5,  -4],
       [ -9,  -8,  -7],
       [-12, -11, -10],
       [-15, -14, -13],
       [-18, -17, -16]])
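
One plausible use of the negation in the cell above (an assumption, the notebook does not say so) is sorting in descending order, since `numpy.sort` only sorts ascending. A sketch of two equivalent ways to get each row in descending order:

```python
import numpy

Z = numpy.arange(15).reshape(5, 3) + 4   # same values as Z in the cells above

# negate, sort ascending, negate back
desc_by_negation = -numpy.sort(-Z, axis=1)

# sort ascending, then reverse each row with a slice
desc_by_reversal = numpy.sort(Z, axis=1)[:, ::-1]

print(desc_by_negation)
```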

## Indexing with boolean array

In [40]:
x = numpy.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [41]:
x > 3

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [42]:
x[x < 7.4]

array([0, 1, 2, 3, 4, 5, 6, 7])
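
Boolean masks can also be combined and used for assignment. A short sketch (variable names are illustrative); note that numpy uses `&` and `|` for element-wise logic, and each comparison needs its own parentheses:

```python
import numpy

x = numpy.arange(10)

mask = (x > 3) & (x % 2 == 0)      # element-wise "and"
selected = x[mask]                 # even numbers greater than 3
outside = x[(x < 2) | (x > 7)]     # element-wise "or"
count = mask.sum()                 # True counts as 1

clipped = x.copy()
clipped[clipped > 5] = 5           # assign through a mask

print(selected, outside, count, clipped)
```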

## Copies

Many numpy operations, such as slicing, do not create copies; they return views that share memory with the original array.

In [44]:
x = numpy.arange(10)
y = x[:5]

print (x, y)
y[0] = 10
print (x, y)

[0 1 2 3 4 5 6 7 8 9] [0 1 2 3 4]
[10  1  2  3  4  5  6  7  8  9] [10  1  2  3  4]


This happened because `y` is a view of `x`: both use __the same place in memory__, so a write through one is visible through the other.

In [47]:
x = numpy.arange(10)
y = x[:5].copy()
print (x, y)
y[0] = 10
print (x, y)

[0 1 2 3 4 5 6 7 8 9] [0 1 2 3 4]
[0 1 2 3 4 5 6 7 8 9] [10  1  2  3  4]
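
When it is unclear whether an operation returned a view or a copy, `numpy.shares_memory` can check. A minimal sketch (the names `view`, `copied` and `fancy` are illustrative):

```python
import numpy

x = numpy.arange(10)

view = x[:5]             # basic slicing returns a view
copied = x[:5].copy()    # explicit copy owns its own memory
fancy = x[[0, 1, 2]]     # indexing with a list of indices returns a copy

view_shares = numpy.shares_memory(x, view)
copy_shares = numpy.shares_memory(x, copied)
fancy_shares = numpy.shares_memory(x, fancy)

print(view_shares, copy_shares, fancy_shares)
```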


## Random numbers

The `numpy.random` module generates random numbers from many distributions.

In [48]:
# generating 10000 random numbers at once
numpy.random.normal(loc=2, scale=12, size=10000)

array([-7.36130105, -3.89579987,  0.17604214, ...,  7.29555741,
       -1.65568502, 11.3454555 ])
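
The values above change on every run. If reproducible results are needed, one option is a seeded generator from `numpy.random.default_rng` (available in numpy 1.17 and later). This is a sketch, not what the notebook used; the seed value 0 is an arbitrary choice:

```python
import numpy

rng = numpy.random.default_rng(seed=0)           # arbitrary seed
samples = rng.normal(loc=2, scale=12, size=10000)

# a second generator with the same seed reproduces the same numbers
rng_again = numpy.random.default_rng(seed=0)
repeat = rng_again.normal(loc=2, scale=12, size=10000)

# sample mean and standard deviation should be close to loc and scale
print(samples.mean(), samples.std())
```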

## Sorting

In [49]:
x = numpy.random.random(size=1000)
x = numpy.sort(x)

In [51]:
print (x[:10])
print (x[-10:])

[0.00050614 0.00080774 0.00090352 0.00102425 0.00138886 0.00197455
 0.00375454 0.00412814 0.00436505 0.00503647]
[0.99073474 0.9911111  0.99331845 0.99376466 0.99423361 0.99551581
 0.99594238 0.99610095 0.99610258 0.99913341]


## Arg...

arg-functions (`argsort`, `argmin`, `argmax`) return indices instead of values, which makes it possible to write non-trivial operations in a couple of lines.

In [28]:
# random.random generates uniform samples in [0, 1)
random_numbers = numpy.random.random(size=1000)
indices = numpy.argsort(random_numbers)  # indices that would sort the array

In [29]:
numpy.all(random_numbers[indices] == numpy.sort(random_numbers))  # check that every element of the comparison is True

True

In [61]:
indices[:10]

array([ 60,   0, 546, 781, 353, 746, 356, 595, 466, 705])

In [62]:
random_numbers.min(), random_numbers.max()

(0.0005381736095251277, 0.9998512961979813)

In [63]:
random_numbers.argmax(), random_numbers[random_numbers.argmax()]

(446, 0.9998512961979813)

In [64]:
random_numbers.argmin(), random_numbers[random_numbers.argmin()]

(60, 0.0005381736095251277)
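
A common use of these index-returning functions is reordering one array by the values of another. A small sketch with made-up data (the arrays `names` and `scores` are hypothetical):

```python
import numpy

names = numpy.array(['ann', 'bob', 'cat', 'dan'])   # hypothetical labels
scores = numpy.array([0.3, 0.9, 0.1, 0.5])          # hypothetical values

order = numpy.argsort(scores)          # indices from lowest to highest score
names_by_score = names[order]          # labels reordered by score
top2 = names[order[::-1][:2]]          # two highest-scoring labels
best = names[scores.argmax()]          # label of the single highest score

print(order, names_by_score, top2, best)
```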

## Homework 3

In [31]:
import numpy as np

In [88]:
# 1. sample 1000 elements from normal distribution 
mu, sigma = 0, 0.1 
X = np.random.normal(mu,sigma,1000)

In [89]:
# 2. leave only positive numbers (from previous exercise)
poz = X[X>=0].copy()
print(poz)

[9.23050139e-02 5.94817434e-02 5.58088291e-02 7.00467949e-03
 8.85465082e-02 1.20347488e-02 7.21145203e-02 1.48341156e-01
 1.63089302e-02 2.04411506e-01 1.75615391e-02 1.20489835e-01
 3.95586584e-02 5.23925678e-02 2.37778589e-02 7.49925169e-02
 1.24581661e-01 8.97068064e-02 1.19195864e-01 9.17898524e-03
 3.54996747e-02 2.16363635e-02 2.97065366e-02 4.25490501e-02
 9.75417609e-02 5.64835242e-02 7.53194241e-02 1.35693259e-02
 2.57852131e-02 7.04359957e-02 3.13914421e-02 1.41895263e-01
 1.48338009e-01 1.76455787e-01 8.39481775e-02 1.80490644e-01
 7.00550594e-03 3.30602245e-02 6.57748321e-02 1.67548138e-01
 9.32958532e-02 1.21438684e-01 1.68363147e-02 5.12698170e-03
 1.46840100e-01 2.13239211e-02 4.08951234e-02 3.84490992e-02
 1.09899366e-01 3.73919952e-04 2.28317044e-02 8.21327167e-02
 1.19404937e-01 8.86689097e-02 6.77107846e-02 9.26039362e-02
 5.23682835e-02 2.56620804e-02 9.91118669e-02 4.73931298e-02
 1.75507581e-01 6.26288529e-02 1.29746097e-02 7.95514013e-03
 9.42530409e-02 4.773913

In [82]:
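x = np.array([-5,-4,-8,-3,0,8,3,-5,7,-1])
print(x)
# collect the indices of the negative entries
negs = []
for q in range(len(x)):
    if x[q]<0:
        negs.append(q)

# remove those entries by index; x becomes [0 8 3 7]
x = np.delete(x, negs)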

[-5 -4 -8 -3  0  8  3 -5  7 -1]


In [99]:
# 3. count the remaining numbers and report their minimum, maximum, mean and variance.
# note: this cell reports statistics for the negative values X[X<0], i.e. the ones removed in step 2
print('count of numbers less than 0: ' + str(len(X) - len(poz)))
print('their minimum: ' + str(np.amin(X[X<0])))
print('their maximum: ' + str(np.amax(X[X<0])))
print('their mean: ' + str(np.mean(X[X<0])))
print('their variance: ' + str(np.var(X[X<0])))

count of numbers less than 0: 498
their minimum: -0.2954555864699102
their maximum: -0.00024364853426902543
their mean: -0.08143156391019088
their variance: 0.0037285973651133997
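
The cell above summarizes the negative values `X[X<0]`, while step 3 as worded asks about the numbers that remain after keeping only the positive ones. A sketch of that reading, under the assumption that "left numbers" means the positive values; the seeded generator and the names `positive` and `summary` are illustrative choices, not part of the original homework:

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # arbitrary seed, for reproducibility only
X = rng.normal(0, 0.1, 1000)          # same mu and sigma as in step 1

positive = X[X > 0]                   # boolean indexing already returns a copy

summary = {
    'count': positive.size,
    'min': positive.min(),
    'max': positive.max(),
    'mean': positive.mean(),
    'variance': positive.var(),
}
for name, value in summary.items():
    print(name, value)
```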


## References:
* `numpy` documentation: https://docs.scipy.org/doc/numpy/reference/
    * almost any question about `numpy` is already answered on stackoverflow
* [From python to numpy: a beautiful book about numpy](https://github.com/rougier/from-python-to-numpy)
* Data manipulation with `numpy`: tips and tricks [part1](http://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html), [part2](http://arogozhnikov.github.io/2015/09/30/NumpyTipsAndTricks2.html)
