## Hello Jupyter

In [1]:
print ('hello Alex')

hello Alex


In [2]:
x = 3
print (x)

3


In [3]:
# re-execute this cell several times: x grows by 5 on every run
x = x + 5
# the value of the last expression in a cell is displayed
x

8

In [4]:
my_string = "string can contain characters including \n new lines and \t tabs"
my_string

'string can contain characters including \n new lines and \t tabs'

In [5]:
print (my_string)

string can contain characters including 
 new lines and 	 tabs


In [6]:
12 + 45

57

In [7]:
# mixing an int and a float produces a float
24 * 3.5

84.0

## Lists

In [8]:
x = [1, 2, 3]

In [9]:
# in Python, indexing starts from 0
x[0], x[2]

(1, 3)

In [10]:
y = x[0:2] + [5, 10]
y

[1, 2, 5, 10]

In [11]:
y[::2]

[1, 5]

In [12]:
y[::-1]

[10, 5, 2, 1]
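
The slices above all follow one pattern, `sequence[start:stop:step]`, where any part can be omitted. Here is a small sketch of the common cases; the list `letters` and the variable names are made up for illustration.

```python
# Slices follow the pattern sequence[start:stop:step]; any part may be omitted.
letters = ['a', 'b', 'c', 'd', 'e', 'f']  # hypothetical example data

first_three = letters[:3]      # stop index is exclusive
every_other = letters[1::2]    # start at index 1, take every second element
reversed_copy = letters[::-1]  # a negative step walks backwards
last_two = letters[-2:]        # negative indices count from the end

print(first_three, every_other, reversed_copy, last_two)
```

Slicing a list always returns a new list, so `letters` itself is unchanged.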

## Dictionaries

In [13]:
d = {'a': 1, 'b': 2}

In [14]:
d['c'] = 3
d['casdsd'] = 34
d

{'a': 1, 'b': 2, 'c': 3, 'casdsd': 34}

In [15]:
for key, value in d.items():
    print (key, value)

a 1
b 2
c 3
casdsd 34
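
Besides assignment and iteration, two lookups come up constantly: testing whether a key exists, and reading a key with a fallback value. This is a minimal sketch; the dictionary `counts` and its keys are illustrative only.

```python
counts = {'a': 1, 'b': 2}  # hypothetical example data

has_a = 'a' in counts            # membership test looks at keys
fallback = counts.get('z', 0)    # .get returns the default instead of raising KeyError
counts['a'] = counts['a'] + 10   # assigning to an existing key overwrites its value
keys_in_order = list(counts)     # dicts keep insertion order (Python 3.7+)

print(has_a, fallback, counts, keys_in_order)
```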


## Important Python libraries

 - numpy (numerics) and scipy (scientific functions)
 - matplotlib (plotting)
 - pandas (convenient operations on data for Data Science)
 - scikit-learn (machine learning)

We'll meet them very soon.

## Hello numpy!

`numpy` is the core of scientific Python. It is the most convenient way to organize number-crunching in Python.

In [16]:
import numpy

In [17]:
x = numpy.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [18]:
x.reshape(5, 2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [19]:
# x.reshape(3, 4)  # would raise ValueError: 10 elements cannot fill a 3 x 4 array
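
The call in the cell above is commented out because `reshape` must preserve the number of elements. A short sketch of what goes wrong and of the `-1` placeholder, which asks numpy to infer one dimension; the variable names are illustrative.

```python
import numpy

x = numpy.arange(10)

# 3 * 4 = 12 does not match the 10 elements of x, so numpy refuses
try:
    x.reshape(3, 4)
    error_message = None
except ValueError as error:
    error_message = str(error)

# -1 lets numpy work out the missing dimension: 10 / 5 = 2 rows
inferred = x.reshape(-1, 5)

print(error_message)
print(inferred.shape)
```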

In [20]:
# slicing works the same way for lists, strings, tuples, numpy arrays, etc.
x[:4]

array([0, 1, 2, 3])

In [21]:
print (x[:3])
print (x[3:7])
print (x[7:])

[0 1 2]
[3 4 5 6]
[7 8 9]


### Vector operations

In [22]:
x = numpy.arange(10 ** 6)  # the integers 0, 1, ..., 999999
# vectorized operations apply the same computation to every element at once;
# here each element is multiplied by 3, then 12 is added
3 * x + 12.

array([1.200000e+01, 1.500000e+01, 1.800000e+01, ..., 3.000003e+06,
       3.000006e+06, 3.000009e+06])

In [23]:
# the %timeit magic shows this is fast: a few milliseconds for a million elements
%timeit 3 * x + 12.

5.61 ms ± 93.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
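
To see what "fast" means here, it helps to time the same computation written as a plain Python loop. The sketch below uses the standard `timeit` module instead of the `%timeit` magic so that it also runs outside Jupyter. The array is smaller than in the cell above to keep the loop quick, and the function names are made up for this comparison.

```python
import timeit

import numpy

x = numpy.arange(10 ** 5)


def vectorized():
    # one numpy expression, executed in compiled code
    return 3 * x + 12.


def python_loop():
    # the same arithmetic, one element at a time in the interpreter
    return [3 * value + 12. for value in x]


vectorized_time = timeit.timeit(vectorized, number=10)
loop_time = timeit.timeit(python_loop, number=10)
print(f"vectorized: {vectorized_time:.4f} s, loop: {loop_time:.4f} s")
```

Exact timings depend on the machine, but the vectorized version is typically faster by a large factor.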


In [24]:
Z = numpy.arange(15).reshape(5, 3)
Z

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [25]:
numpy.log(numpy.exp(Z))  # the integer array was converted to float

array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.],
       [ 6.,  7.,  8.],
       [ 9., 10., 11.],
       [12., 13., 14.]])
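
The conversion can be checked directly through the `dtype` attribute. A minimal sketch, rebuilding `Z` so that it runs on its own; `roundtrip` and `as_float` are illustrative names.

```python
import numpy

Z = numpy.arange(15).reshape(5, 3)

roundtrip = numpy.log(numpy.exp(Z))  # exp returns floats, so the result is float too
as_float = Z.astype(float)           # explicit conversion, Z itself stays integer

# the integer dtype is platform dependent (commonly int64)
print(Z.dtype, roundtrip.dtype, as_float.dtype)
```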

In [26]:
Z += 4

In [27]:
Z

array([[ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])

In [28]:
Z[::2, :]

array([[ 4,  5,  6],
       [10, 11, 12],
       [16, 17, 18]])

In [29]:
Z[[0, 2, 4], :]

array([[ 4,  5,  6],
       [10, 11, 12],
       [16, 17, 18]])

In [30]:
Z.sum(axis=1)

array([15, 24, 33, 42, 51])

In [31]:
# axes are also numbered from zero: axis=0 runs down the rows, axis=1 across the columns
Z.sum(axis=0)

array([50, 55, 60])
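
A useful way to remember the `axis` argument: the axis you pass is the one that disappears from the result. The sketch below rebuilds the same `Z` (after the `+= 4` step) and compares shapes; the variable names are illustrative.

```python
import numpy

Z = numpy.arange(15).reshape(5, 3) + 4  # same values as Z above, shape (5, 3)

row_sums = Z.sum(axis=1)     # axis 1 (length 3) is collapsed, one sum per row
column_sums = Z.sum(axis=0)  # axis 0 (length 5) is collapsed, one sum per column
total = Z.sum()              # no axis given: sum over all elements

print(row_sums.shape, column_sums.shape, total)
```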

In [32]:
Z.max(axis=1)

array([ 6,  9, 12, 15, 18])

In [33]:
Z2 = - Z
Z2 = numpy.sort(Z2, axis=1)
Z2

array([[ -6,  -5,  -4],
       [ -9,  -8,  -7],
       [-12, -11, -10],
       [-15, -14, -13],
       [-18, -17, -16]])
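
`numpy.sort` only sorts in ascending order, and negating the input as above is one way to work around that. Below is a sketch of two equivalent ways to get each row in descending order, using a small made-up array.

```python
import numpy

values = numpy.array([[3, 1, 2],
                      [9, 7, 8]])  # hypothetical example data

# negate, sort ascending, negate back
descending_by_negation = -numpy.sort(-values, axis=1)

# sort ascending, then reverse each row with a slice
descending_by_slice = numpy.sort(values, axis=1)[:, ::-1]

print(descending_by_negation)
```

The negation trick only works for numeric data; the reversed slice works for any sortable dtype.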

## Indexing with boolean array

In [34]:
x = numpy.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [35]:
x > 3

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [36]:
x[x < 7.4]

array([0, 1, 2, 3, 4, 5, 6, 7])
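
Boolean masks can be combined element-wise with `&` (and), `|` (or) and `~` (not). The parentheses are required because these operators bind more tightly than comparisons. A minimal sketch with illustrative variable names:

```python
import numpy

x = numpy.arange(10)

mask = (x > 2) & (x % 2 == 0)          # greater than 2 AND even
even_above_two = x[mask]
outer_values = x[(x < 2) | (x > 7)]    # smaller than 2 OR greater than 7
matches = mask.sum()                   # True counts as 1, so this counts the matches

print(even_above_two, outer_values, matches)
```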

## Copies

Many numpy operations, such as slicing, don't create copies but return views that share memory with the original array.

In [37]:
x = numpy.arange(10)
y = x[:5]

print (x, y)
y[0] = 10
print (x, y)

[0 1 2 3 4 5 6 7 8 9] [0 1 2 3 4]
[10  1  2  3  4  5  6  7  8  9] [10  1  2  3  4]


This happened because `y` is a view of `x`: both point __to the same place in memory__. An explicit `.copy()` gives `y` its own memory:

In [38]:
x = numpy.arange(10)
y = x[:5].copy()
print (x, y)
y[0] = 10
print (x, y)

[0 1 2 3 4 5 6 7 8 9] [0 1 2 3 4]
[0 1 2 3 4 5 6 7 8 9] [10  1  2  3  4]
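
When it is unclear whether an operation returned a view or a copy, `numpy.shares_memory` can answer the question. A short sketch with illustrative names; it also shows that indexing with a list of positions produces a copy, unlike slicing.

```python
import numpy

x = numpy.arange(10)

view = x[:5]              # basic slicing returns a view
copied = x[:5].copy()     # explicit copy owns its data
picked = x[[0, 1, 2]]     # indexing with a list of positions returns a copy

print(numpy.shares_memory(x, view))    # True
print(numpy.shares_memory(x, copied))  # False
print(numpy.shares_memory(x, picked))  # False
print(view.base is x)                  # True: the view remembers its source array
```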


## Random numbers

The `numpy.random` module generates random numbers.

In [39]:
# generate 10000 normally distributed numbers at once (mean 2, standard deviation 12)
numpy.random.normal(loc=2, scale=12, size=10000)

array([ 6.36731563, -8.89638254, -7.64944699, ..., -1.18300449,
       -5.16008339, 10.12748805])
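
The numbers above change on every run. For reproducible results, one option is a seeded generator created with `numpy.random.default_rng` (available since NumPy 1.17). This is a sketch of that alternative, not what the cell above uses, and the seed value 0 is arbitrary.

```python
import numpy

rng = numpy.random.default_rng(seed=0)  # arbitrary seed, chosen for illustration
sample = rng.normal(loc=2, scale=12, size=10000)

# a second generator with the same seed reproduces the same numbers
repeat = numpy.random.default_rng(seed=0).normal(loc=2, scale=12, size=10000)

# the sample mean and standard deviation should be close to loc and scale
print(sample.mean(), sample.std())
print(numpy.array_equal(sample, repeat))
```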

## Sorting

In [40]:
x = numpy.random.random(size=1000)
x = numpy.sort(x)

In [41]:
print (x[:10])
print (x[-10:])

[0.00143608 0.00227151 0.00288429 0.00314382 0.00383709 0.00406789
 0.00517505 0.00774539 0.00803193 0.00943651]
[0.98685356 0.98899479 0.98933474 0.99071971 0.99231098 0.99319883
 0.99377646 0.99607943 0.99763553 0.99918855]


## Arg-functions

Arg-functions (`argsort`, `argmin`, `argmax`) return indices instead of values, which lets you write non-trivial operations in a couple of lines.

In [42]:
# random.random generates uniform numbers in [0, 1)
random_numbers = numpy.random.random(size=1000)
indices = numpy.argsort(random_numbers)  # the indices that would sort the array

In [43]:
numpy.all(random_numbers[indices] == numpy.sort(random_numbers))  # True if every element matches

True

In [44]:
indices[:10]

array([ 56, 284, 590, 347, 543, 597,  37, 634, 763, 440], dtype=int64)

In [45]:
random_numbers.min(), random_numbers.max()

(0.00017055672403576416, 0.9985329770280684)

In [46]:
random_numbers.argmax(), random_numbers[random_numbers.argmax()]

(234, 0.9985329770280684)

In [47]:
random_numbers.argmin(), random_numbers[random_numbers.argmin()]

(56, 0.00017055672403576416)
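
Indices become really useful when two arrays describe the same items: the order computed from one array can be applied to the other. A minimal sketch with made-up names and scores:

```python
import numpy

# hypothetical example data: names[i] belongs to scores[i]
names = numpy.array(['carol', 'alice', 'bob', 'dave'])
scores = numpy.array([72, 95, 88, 60])

order = numpy.argsort(scores)      # indices from lowest to highest score
ranked_names = names[order[::-1]]  # reverse to rank from highest to lowest
best = names[scores.argmax()]      # name belonging to the single highest score

print(order, ranked_names, best)
```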

## Homework 3

In [48]:
import numpy


In [61]:
sample = numpy.random.normal(loc=0, scale=10, size=1000)
print (sample)

In [77]:
positives = sample[sample > 0]

In [97]:
remaining = sample[sample <= 0]
num = len(remaining)
minimum = remaining.min()
maximum = remaining.max()
mean = remaining.mean()
variance = remaining.var()

print (num, minimum, maximum, mean, variance)


504 -31.551857715109087 -0.04911121257006529 -8.45084810788659 41.143052530517316


## References:
* `numpy` documentation: https://docs.scipy.org/doc/numpy/reference/
    * almost any question about `numpy` is already answered on Stack Overflow
* [From python to numpy: a beautiful book about numpy](https://github.com/rougier/from-python-to-numpy)
* Data manipulation with `numpy`: tips and tricks [part1](http://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html), [part2](http://arogozhnikov.github.io/2015/09/30/NumpyTipsAndTricks2.html)
