## Hello Jupyter

In [2]:
print ('hello Alex')

hello Alex


In [4]:
x = 3
print (x)

3


In [5]:
# reexecute this cell several times
x = x + 5
# output of the last line in the cell is shown
x

8

In [6]:
my_string = "string can contain characters including \n new lines and \t tabs"
my_string

'string can contain characters including \n new lines and \t tabs'

In [8]:
print (my_string)

string can contain characters including 
 new lines and 	 tabs


In [9]:
12 + 45

57

In [10]:
# multiplying an int by a float gives a float
24 * 3.5

84.0

## Lists

In [11]:
x = [1, 2, 3]

In [12]:
# in python indexing starts from 0
x[0], x[2]

(1, 3)

In [13]:
y = x[0:2] + [5, 10]
y

[1, 2, 5, 10]

In [14]:
y[::2]

[1, 5]

In [15]:
y[::-1]

[10, 5, 2, 1]
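
The slicing cells above all use the `start:stop:step` form. A minimal sketch that spells out each part (the variable names are only illustrative):

```python
# Slice syntax is sequence[start:stop:step]; stop is exclusive.
y = [1, 2, 5, 10]

first_two = y[0:2]      # elements at indices 0 and 1
every_second = y[::2]   # start and stop omitted, step of 2
reversed_y = y[::-1]    # a negative step walks backwards
last_item = y[-1]       # negative indices count from the end

print(first_two, every_second, reversed_y, last_item)
```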

## Dictionaries

In [16]:
d = {'a': 1, 'b': 2}

In [17]:
d['c'] = 3
d['casdsd'] = 34
d

{'a': 1, 'b': 2, 'c': 3, 'casdsd': 34}

In [19]:
for key, value in d.items():
    print (key, value)

a 1
b 2
c 3
casdsd 34
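
Besides assignment and iteration, a few other dictionary operations come up often. A short sketch, with a hypothetical missing key `'z'` used only for illustration:

```python
d = {'a': 1, 'b': 2, 'c': 3}

has_a = 'a' in d           # membership test checks the keys
missing = d.get('z', 0)    # .get returns a default instead of raising KeyError
keys = list(d.keys())      # keys come back in insertion order (Python 3.7+)
squares = {key: value ** 2 for key, value in d.items()}

print(has_a, missing, keys, squares)
```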


## Important python libraries

 - numpy (numerics) + scipy (scientific functions) 
 - matplotlib - plotting
 - pandas - convenient operations on data for Data Science
 - scikit-learn - machine learning
 
We'll meet them very soon.

## Hello numpy!

`numpy` is the core of scientific Python. It is the most convenient way to organize number-crunching in Python.

In [20]:
import numpy

In [21]:
x = numpy.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]:
x.reshape(5, 2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [23]:
# x.reshape(3, 4)  # would raise ValueError: 10 elements cannot fill a 3 x 4 array
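
`reshape` only succeeds when the new shape holds exactly as many elements as the array. A small sketch of both outcomes (the `reshape_failed` flag is just an illustrative name):

```python
import numpy

x = numpy.arange(10)

# -1 asks numpy to infer that dimension from the total size
inferred = x.reshape(-1, 2)
print(inferred.shape)

# a shape whose product differs from x.size is rejected
try:
    x.reshape(3, 4)
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("reshape failed:", error)
```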

In [24]:
# slicing works the same way for lists, strings, tuples and numpy arrays
x[:4]

array([0, 1, 2, 3])

In [26]:
print (x[:3])
print (x[3:7])
print (x[7:])

[0 1 2]
[3 4 5 6]
[7 8 9]


### Vector operations

In [28]:
x = numpy.arange(10 ** 6)  # integers 0, 1, ..., 999999
# vector operations apply the same arithmetic to every element: here each element is multiplied by 3, then 12 is added
3 * x + 12.

array([1.200000e+01, 1.500000e+01, 1.800000e+01, ..., 3.000003e+06,
       3.000006e+06, 3.000009e+06])

In [29]:
# the %timeit magic shows that this is quite fast
%timeit 3 * x + 12.

3.12 ms ± 310 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
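
To see what the vectorized expression replaces, it can be compared with an explicit Python loop. This is only a sketch on a smaller array; in a notebook, both versions could be wrapped in `%timeit` to compare their speed.

```python
import numpy

x = numpy.arange(1000)

# vectorized: one expression, the loop runs inside numpy
vectorized = 3 * x + 12.

# equivalent pure-python loop, element by element
looped = numpy.array([3 * value + 12. for value in x])

print(numpy.array_equal(vectorized, looped))
```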


In [30]:
Z = numpy.arange(15).reshape(5, 3)
Z

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [31]:
numpy.log(numpy.exp(Z)) # the integers were converted to floats

array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.],
       [ 6.,  7.,  8.],
       [ 9., 10., 11.],
       [12., 13., 14.]])
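
The conversion can be inspected through the `dtype` attribute. A minimal sketch (the exact integer dtype depends on the platform, so it is not hard-coded here):

```python
import numpy

Z = numpy.arange(15).reshape(5, 3)

print(Z.dtype)               # an integer dtype, platform dependent
print(numpy.exp(Z).dtype)    # exp returns floating-point values

# an explicit conversion returns a new array and leaves Z unchanged
Z_float = Z.astype(float)
print(Z_float.dtype)
```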

In [32]:
Z += 4

In [33]:
Z

array([[ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])

In [34]:
Z[::2, :]

array([[ 4,  5,  6],
       [10, 11, 12],
       [16, 17, 18]])

In [35]:
Z[[0, 2, 4], :]

array([[ 4,  5,  6],
       [10, 11, 12],
       [16, 17, 18]])

In [36]:
Z.sum(axis=1)

array([15, 24, 33, 42, 51])

In [37]:
# axes are also numbered from zero
Z.sum(axis=0)

array([50, 55, 60])

In [38]:
Z.max(axis=1)

array([ 6,  9, 12, 15, 18])

In [39]:
Z2 = - Z
Z2 = numpy.sort(Z2, axis=1)
Z2

array([[ -6,  -5,  -4],
       [ -9,  -8,  -7],
       [-12, -11, -10],
       [-15, -14, -13],
       [-18, -17, -16]])
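
Negating before sorting is one way to get a descending order. A sketch of that idea, assuming the same `Z` as in the cells above, next to an alternative that reverses an ascending sort:

```python
import numpy

Z = numpy.arange(15).reshape(5, 3) + 4

# negate, sort ascending, negate back: each row ends up in descending order
descending = -numpy.sort(-Z, axis=1)

# the same result by reversing an ascending sort along axis 1
reversed_sort = numpy.sort(Z, axis=1)[:, ::-1]

print(descending[0])
```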

## Indexing with boolean array

In [40]:
x = numpy.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [41]:
x > 3

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [42]:
x[x < 7.4]

array([0, 1, 2, 3, 4, 5, 6, 7])
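
Boolean masks can also be combined and used for assignment. A short sketch (the names `middle`, `edges` and `clipped` are illustrative):

```python
import numpy

x = numpy.arange(10)

# combine conditions with & (and), | (or), ~ (not); the parentheses are required
middle = x[(x > 2) & (x < 7)]
edges = x[(x < 2) | (x > 7)]

# a mask can also select elements for assignment
clipped = x.copy()
clipped[clipped > 5] = 5

print(middle, edges, clipped)
```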

## Copies

Many operations in numpy, such as slicing, don't create copies but return views that share memory with the original array.

In [44]:
x = numpy.arange(10)
y = x[:5]

print (x, y)
y[0] = 10
print (x, y)

[0 1 2 3 4 5 6 7 8 9] [0 1 2 3 4]
[10  1  2  3  4  5  6  7  8  9] [10  1  2  3  4]


This happened because `y` is a view of `x`: both point __to the same place in memory__. Calling `.copy()` gives `y` its own memory:

In [47]:
x = numpy.arange(10)
y = x[:5].copy()
print (x, y)
y[0] = 10
print (x, y)

[0 1 2 3 4 5 6 7 8 9] [0 1 2 3 4]
[0 1 2 3 4 5 6 7 8 9] [10  1  2  3  4]
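
Whether two arrays share memory can be checked directly instead of by modifying one of them. A minimal sketch using `numpy.shares_memory` and the `base` attribute:

```python
import numpy

x = numpy.arange(10)
view = x[:5]            # basic slicing returns a view
copy = x[:5].copy()     # an explicit copy owns its memory
fancy = x[[0, 1, 2]]    # indexing with a list of indices returns a copy

print(numpy.shares_memory(x, view))
print(numpy.shares_memory(x, copy))
print(numpy.shares_memory(x, fancy))
print(view.base is x)
```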


## Random numbers

The `numpy.random` module generates random numbers.

In [48]:
# generating 10000 random numbers at once
numpy.random.normal(loc=2, scale=12, size=10000)

array([-7.36130105, -3.89579987,  0.17604214, ...,  7.29555741,
       -1.65568502, 11.3454555 ])
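
Each run of the cell above gives different numbers. When reproducible results are needed, a seeded generator can be used instead. This is a sketch with `numpy.random.default_rng`; the seed value 42 is arbitrary.

```python
import numpy

# two generators created with the same seed produce the same stream
rng_a = numpy.random.default_rng(42)
rng_b = numpy.random.default_rng(42)

sample_a = rng_a.normal(loc=2, scale=12, size=10000)
sample_b = rng_b.normal(loc=2, scale=12, size=10000)

print(numpy.array_equal(sample_a, sample_b))
# the sample mean and standard deviation should be close to loc and scale
print(sample_a.mean(), sample_a.std())
```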

## Sorting

In [49]:
x = numpy.random.random(size=1000)
x = numpy.sort(x)

In [51]:
print (x[:10])
print (x[-10:])

[0.00050614 0.00080774 0.00090352 0.00102425 0.00138886 0.00197455
 0.00375454 0.00412814 0.00436505 0.00503647]
[0.99073474 0.9911111  0.99331845 0.99376466 0.99423361 0.99551581
 0.99594238 0.99610095 0.99610258 0.99913341]


## Arg...

Arg-functions (`argsort`, `argmin`, `argmax`) return indices instead of values, which lets you express non-trivial operations in a couple of lines.

In [59]:
# numpy.random.random draws uniform samples from [0, 1)
random_numbers = numpy.random.random(size=1000)
indices = numpy.argsort(random_numbers)  # indices that would sort the array

In [60]:
numpy.all(random_numbers[indices] == numpy.sort(random_numbers))  # True only if every element-wise comparison is True

True

In [61]:
indices[:10]

array([ 60,   0, 546, 781, 353, 746, 356, 595, 466, 705])

In [62]:
random_numbers.min(), random_numbers.max()

(0.0005381736095251277, 0.9998512961979813)

In [63]:
random_numbers.argmax(), random_numbers[random_numbers.argmax()]

(446, 0.9998512961979813)

In [64]:
random_numbers.argmin(), random_numbers[random_numbers.argmin()]

(60, 0.0005381736095251277)
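
A common use of `argsort` is to reorder one array by the values of another, or to pick the positions of the largest elements. A small sketch with made-up `scores` and `names` arrays:

```python
import numpy

scores = numpy.array([0.3, 0.9, 0.1, 0.7])
names = numpy.array(['a', 'b', 'c', 'd'])

order = numpy.argsort(scores)     # indices that sort scores in ascending order
names_by_score = names[order]     # reorder a second array the same way
top_two = order[::-1][:2]         # indices of the two largest scores

print(names_by_score, top_two)
```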

## Homework 3

In [1]:
# 0. import numpy
import numpy as np

In [4]:
# 1. sample 1000 elements from normal distribution
sample = np.random.normal(loc=0, size=1000)

In [5]:
# 2. leave only positive numbers (from previous exercise)
positive_sample = sample[sample > 0.]

In [9]:
# 3. count the remaining numbers and print their maximum, minimum, mean and variance
#    (this cell summarizes the negative part of the sample)
negative_sample = sample[sample < 0.]
number = len(negative_sample)
print(number)
print(max(negative_sample))
print(min(negative_sample))
print(np.mean(negative_sample))
print(np.var(negative_sample))

510
-0.0035187754005366573
-2.850032934347968
-0.801672796350423
0.35250454859760094
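
Step 2 keeps the positive numbers, whereas the statistics above are printed for the negative part of the sample. A sketch of the same summary for `positive_sample` might look like this; the seeded generator is an assumption added for reproducibility and is not part of the original exercise.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, assumed here for reproducibility
sample = rng.normal(loc=0, size=1000)
positive_sample = sample[sample > 0.]

# count, minimum, maximum, mean and variance of the positive part
print(len(positive_sample))
print(positive_sample.min(), positive_sample.max())
print(positive_sample.mean(), positive_sample.var())
```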


## References:
* `numpy` documentation: https://docs.scipy.org/doc/numpy/reference/
    * almost any question about `numpy` is already answered on stackoverflow
* [From python to numpy: a beautiful book about numpy](https://github.com/rougier/from-python-to-numpy)
* Data manipulation with `numpy`: tips and tricks [part1](http://arogozhnikov.github.io/2015/09/29/NumpyTipsAndTricks1.html), [part2](http://arogozhnikov.github.io/2015/09/30/NumpyTipsAndTricks2.html)
