# Introduction to NumPy


## Understanding Data Types in Python



```C
/* C code */
int result = 0;
for(int i=0; i<100; i++){
    result += i;
}
```


```python
# Python code
result = 0
for i in range(100):
    result += i
```


In [1]:
x = 10

In [2]:
x = "dsdsdsdsd"

### A Python Integer Is More Than Just an Integer



```C
struct _longobject {
    long ob_refcnt;
    PyTypeObject *ob_type;
    size_t ob_size;
    long ob_digit[1];
};
```



<img src="https://jakevdp.github.io/PythonDataScienceHandbook/figures/cint_vs_pyint.png" alt="Integer Memory Layout">
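To make the overhead of the structure above tangible, the following sketch compares the memory reported for one Python integer with the fixed size of a NumPy `int64` element. It assumes CPython; the value returned by `sys.getsizeof` differs between Python versions and platforms, so no concrete output is shown.

```python
import sys
import numpy as np

x = 10

# Size of the whole Python integer object (header fields plus digits),
# as reported by the interpreter.
python_int_bytes = sys.getsizeof(x)

# Size of one element in a NumPy int64 array: only the raw 8-byte value.
numpy_int_bytes = np.dtype(np.int64).itemsize

print(python_int_bytes, numpy_int_bytes)
```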

### A Python List Is More Than Just a List



In [5]:
L = [1,2,3,4,5,6]

In [6]:
L

[1, 2, 3, 4, 5, 6]

In [7]:
def dum():
    pass

In [8]:
L = [1, 45.67, "SDSDDS", {"SDSDDSD": 445},5,6, dum]

In [9]:
[type(el) for el in L]

[int, float, str, dict, int, int, function]


<img src="https://jakevdp.github.io/PythonDataScienceHandbook/figures/array_vs_list.png" alt="Array Memory Layout">
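The difference between the two layouts can be sketched with a short example: a list keeps a fully typed Python object per element, while an array stores a single shared type. The names `mixed_list` and `arr` are chosen here for illustration only.

```python
import numpy as np

# A list holds a separate Python object, with its own type, for each element.
mixed_list = [1, 45.67, 5]
print([type(el) for el in mixed_list])

# An array has one dtype for all elements; the integers are upcast to float64.
arr = np.array(mixed_list)
print(arr.dtype)  # float64
print(arr)
```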

## How Vectorization Makes Code Faster


<p><img alt="Translating Python code to bytecode" src="https://s3.amazonaws.com/dq-content/289/bytecode.svg"></p>


<table>
<thead>
<tr>
<th>Language Type</th>
<th>Example</th>
<th>Time taken to write program</th>
<th>Control over program performance</th>
</tr>
</thead>
<tbody>
<tr>
<td>High-Level</td>
<td>Python</td>
<td>Low</td>
<td>Low</td>
</tr>
<tr>
<td>Low-Level</td>
<td>C</td>
<td>High</td>
<td>High</td>
</tr>
</tbody>
</table>



<p><img alt="For loop to sum rows" src="https://s3.amazonaws.com/dq-content/289/for_loop.svg"></p>

In [10]:
my_numbers = [[6,5], [1,3], [5,6]]

sums = []

for row in my_numbers:
    row_sum = row[0] + row[1]
    sums.append(row_sum)
    
print(sums)    

[11, 4, 11]


<p><img src="./images/numpy_for.gif"></p>

<p><img src="./images/numpy_vectorized.gif"></p>
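For comparison with the loop above, here is a minimal vectorized sketch of the same row sums. It assumes the nested list is first converted to an ndarray; both forms below operate on whole columns or rows at once instead of one element at a time.

```python
import numpy as np

my_numbers = np.array([[6, 5], [1, 3], [5, 6]])

# Add the first column to the second column in a single operation.
sums = my_numbers[:, 0] + my_numbers[:, 1]
print(sums)  # [11  4 11]

# The same result as a reduction along axis 1 (across each row).
row_sums = my_numbers.sum(axis=1)
print(row_sums)  # [11  4 11]
```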

## NumPy library

[Documentation](http://www.numpy.org/)

In [13]:
import numpy as np

## Introduction to Ndarrays

<img alt="Dimensional Arrays" src="./images/one_dim.svg">

In [14]:
moj_array = np.array([5,10,15,20])

In [15]:
moj_array

array([ 5, 10, 15, 20])

In [16]:
print(type(moj_array))

<class 'numpy.ndarray'>


<img alt="Dimensional Arrays" src="./images/Two_Dim.svg">

In [17]:
data2 = [[1,2,3,4], [5,6,7,8]]
arr2 = np.array(data2)
print(arr2)

[[1 2 3 4]
 [5 6 7 8]]


## Preparing the Data

In [18]:
data = np.random.randint(1, 10, (8,4))

In [19]:
print(data)

[[3 1 8 2]
 [7 7 8 7]
 [2 5 7 7]
 [5 2 1 5]
 [8 3 3 7]
 [1 1 1 8]
 [9 6 3 5]
 [2 8 1 3]]


## Array Shapes

In [20]:
# shape
data.shape

(8, 4)

In [21]:
# number of elements
data.size

32

In [22]:
# number of dimensions
data.ndim

2

In [23]:
# size of a single element in bytes
data.itemsize

8

In [24]:
8*32

256

In [26]:
# total size of the array in bytes
data.nbytes

256

## Selecting and Slicing Rows and Items from ndarrays

<img alt="Dimensional Arrays" src="./images/selection_rows.svg">

    ndarray[row_index,column_index]

    # or if you want to select all
    # columns for a given set of rows
    ndarray[row_index]

<img alt="Dimensional Arrays" src="./images/selection_item.svg">

In [27]:
test = np.random.randint(0,10, (5,5))

In [28]:
print(test)

[[2 0 3 2 9]
 [8 8 5 9 0]
 [9 5 9 9 5]
 [6 5 2 3 9]
 [5 2 8 5 8]]


In [30]:
test[0]

array([2, 0, 3, 2, 9])

In [31]:
test[2:]

array([[9, 5, 9, 9, 5],
       [6, 5, 2, 3, 9],
       [5, 2, 8, 5, 8]])

In [32]:
test[2:4]

array([[9, 5, 9, 9, 5],
       [6, 5, 2, 3, 9]])

In [33]:
test[3,2]

2

## Selecting Columns and Custom Slicing ndarrays

<img alt="Dimensional Arrays" src="./images/selection_columns_updated.svg">

<img alt="Dimensional Arrays" src="./images/selection_1darray_updated.svg">

<img alt="Dimensional Arrays" src="./images/selection_2darray_updated.svg">

In [37]:
print(test)

[[2 0 3 2 9]
 [8 8 5 9 0]
 [9 5 9 9 5]
 [6 5 2 3 9]
 [5 2 8 5 8]]


In [38]:
test[:, 3]

array([2, 9, 9, 3, 5])

In [39]:
test[:, 3:]

array([[2, 9],
       [9, 0],
       [9, 5],
       [3, 9],
       [5, 8]])

In [41]:
test[:, 1:3]

array([[0, 3],
       [8, 5],
       [5, 9],
       [5, 2],
       [2, 8]])

In [42]:
test[1:3, 1:]

array([[8, 5, 9, 0],
       [5, 9, 9, 5]])

In [43]:
test[3:, 3:]

array([[3, 9],
       [5, 8]])

## Vector Math

In [45]:
print(test)

[[2 0 3 2 9]
 [8 8 5 9 0]
 [9 5 9 9 5]
 [6 5 2 3 9]
 [5 2 8 5 8]]


In [46]:
x1 = test[1]

In [47]:
x1

array([8, 8, 5, 9, 0])

In [48]:
x2 = test[2]

In [49]:
x2

array([9, 5, 9, 9, 5])

In [50]:
x1 + x2

array([17, 13, 14, 18,  5])

In [52]:
addarr = test[1] + test[2]

In [53]:
print(addarr)

[17 13 14 18  5]


In [55]:
addarr + 2

array([19, 15, 16, 20,  7])

In [56]:
addarr * 15

array([255, 195, 210, 270,  75])

In [58]:
addarr / 3 

array([5.66666667, 4.33333333, 4.66666667, 6.        , 1.66666667])



<div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>The following table lists the arithmetic operators implemented in NumPy:</p>
<table>
<thead><tr>
<th>Operator</th>
<th>Equivalent ufunc</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>+</code></td>
<td><code>np.add</code></td>
<td>Addition (e.g., <code>1 + 1 = 2</code>)</td>
</tr>
<tr>
<td><code>-</code></td>
<td><code>np.subtract</code></td>
<td>Subtraction (e.g., <code>3 - 2 = 1</code>)</td>
</tr>
<tr>
<td><code>-</code></td>
<td><code>np.negative</code></td>
<td>Unary negation (e.g., <code>-2</code>)</td>
</tr>
<tr>
<td><code>*</code></td>
<td><code>np.multiply</code></td>
<td>Multiplication (e.g., <code>2 * 3 = 6</code>)</td>
</tr>
<tr>
<td><code>/</code></td>
<td><code>np.divide</code></td>
<td>Division (e.g., <code>3 / 2 = 1.5</code>)</td>
</tr>
<tr>
<td><code>//</code></td>
<td><code>np.floor_divide</code></td>
<td>Floor division (e.g., <code>3 // 2 = 1</code>)</td>
</tr>
<tr>
<td><code>**</code></td>
<td><code>np.power</code></td>
<td>Exponentiation (e.g., <code>2 ** 3 = 8</code>)</td>
</tr>
<tr>
<td><code>%</code></td>
<td><code>np.mod</code></td>
<td>Modulus/remainder (e.g., <code>9 % 4 = 1</code>)</td>
</tr>
</tbody>
</table>

</div>
</div>
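The operator/ufunc pairs in the table above can be checked directly. This sketch reuses the values of `x1` and `x2` from the earlier cells and asserts that each operator gives the same result as its ufunc.

```python
import numpy as np

x1 = np.array([8, 8, 5, 9, 0])
x2 = np.array([9, 5, 9, 9, 5])

# Each operator produces the same array as the ufunc listed in the table.
assert np.array_equal(x1 - x2, np.subtract(x1, x2))
assert np.array_equal(-x1, np.negative(x1))
assert np.array_equal(x1 * x2, np.multiply(x1, x2))
assert np.array_equal(x1 // x2, np.floor_divide(x1, x2))
assert np.array_equal(x1 ** 2, np.power(x1, 2))
assert np.array_equal(x1 % x2, np.mod(x1, x2))

floor_div = x1 // x2
remainder = x1 % x2
print(floor_div)  # [0 1 0 1 0]
print(remainder)  # [8 3 5 0 0]
```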


In [59]:
x1+x2

array([17, 13, 14, 18,  5])

In [60]:
np.add(x1, x2)

array([17, 13, 14, 18,  5])

In [62]:
# the shapes must match!
np.array([1,2,3]) + x2

ValueError: operands could not be broadcast together with shapes (3,) (5,) 
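The error message refers to broadcasting, the rule NumPy uses to combine arrays of different shapes. As a rough sketch (the array `column` is introduced here only for illustration), a scalar or an array with a length-1 axis can be stretched to fit, while shapes `(3,)` and `(5,)` cannot.

```python
import numpy as np

x2 = np.array([9, 5, 9, 9, 5])

# A scalar is stretched to every element.
plus_one = x2 + 1
print(plus_one)  # [10  6 10 10  6]

# Shapes (3, 1) and (5,) are compatible: the result has shape (3, 5).
column = np.array([[1], [2], [3]])
grid = column + x2
print(grid.shape)  # (3, 5)

# Shapes (3,) and (5,) are not compatible, so NumPy raises ValueError.
try:
    np.array([1, 2, 3]) + x2
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```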

## Calculating Statistics For 1D ndarrays

In [63]:
x1

array([8, 8, 5, 9, 0])

In [64]:
x1.max()

9

In [65]:
x1.min()

0

In [68]:
# as a function
np.min(x1)

0

In [66]:
x1.mean()

6.0

In [67]:
x1.sum()

30


<p><img alt="Method syntax" src="https://s3.amazonaws.com/dq-content/289/Method_syntax.svg"></p>


<div>

<table>
<thead>
<tr>
<th>Calculation</th>
<th>Function Representation</th>
<th>Method Representation</th>
</tr>
</thead>
<tbody>
<tr>
<td>Calculate the minimum value of <code>trip_mph</code></td>
<td><code>np.min(trip_mph)</code></td>
<td><code>trip_mph.min()</code></td>
</tr>
<tr>
<td>Calculate the maximum value of <code>trip_mph</code></td>
<td><code>np.max(trip_mph)</code></td>
<td><code>trip_mph.max()</code></td>
</tr>
<tr>
<td>Calculate the <a target="_blank" href="https://en.wikipedia.org/wiki/Mean">mean average</a> value of <code>trip_mph</code></td>
<td><code>np.mean(trip_mph)</code></td>
<td><code>trip_mph.mean()</code></td>
</tr>
<tr>
<td>Calculate the <a target="_blank" href="https://en.wikipedia.org/wiki/Median">median average</a> value of <code>trip_mph</code></td>
<td><code>np.median(trip_mph)</code></td>
<td>There is no ndarray median method</td>
</tr>
</tbody>
</table>
</div>
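The table uses a `trip_mph` array that is not defined in this notebook, so the sketch below applies the same calls to the `x1` values from the cells above. It also shows the last row of the table in practice: the median exists only as a function.

```python
import numpy as np

x1 = np.array([8, 8, 5, 9, 0])

# Function and method forms give the same result where both exist.
print(np.mean(x1), x1.mean())  # 6.0 6.0

# The median is available only as a function, not as an ndarray method.
median_value = np.median(x1)
print(median_value)            # 8.0
print(hasattr(x1, "median"))   # False
```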

## Calculating Statistics For 2D ndarrays

<img alt="Dimensional Arrays" src="./images/array_method_axis_none.svg">

<img alt="Dimensional Arrays" src="./images/array_method_axis_1.svg">

<img alt="Dimensional Arrays" src="./images/array_method_axis_0.svg">



<p><img alt="The axis parameter" src="https://s3.amazonaws.com/dq-content/289/axis_param.svg"></p>


In [69]:
print(test)

[[2 0 3 2 9]
 [8 8 5 9 0]
 [9 5 9 9 5]
 [6 5 2 3 9]
 [5 2 8 5 8]]


In [70]:
test.max()

9

In [71]:
test.max(axis=1)

array([9, 9, 9, 9, 8])

In [72]:
test.max(axis=0)

array([9, 8, 9, 9, 9])
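The `axis` argument works the same way for other reductions. As a small sketch, here is `sum` applied to the same `test` values (written out explicitly so the block runs on its own): `axis=1` gives one result per row, `axis=0` one result per column.

```python
import numpy as np

test = np.array([[2, 0, 3, 2, 9],
                 [8, 8, 5, 9, 0],
                 [9, 5, 9, 9, 5],
                 [6, 5, 2, 3, 9],
                 [5, 2, 8, 5, 8]])

total = test.sum()            # every element
row_sums = test.sum(axis=1)   # one value per row
col_sums = test.sum(axis=0)   # one value per column

print(total)     # 136
print(row_sums)  # [16 30 37 25 28]
print(col_sums)  # [30 20 27 28 31]
```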

## Reading CSV files with NumPy

<p>Below is information about selected columns from the data set:</p>
<ul>
<li><code>pickup_year</code>: The year of the trip.</li>
<li><code>pickup_month</code>: The month of the trip (January is <code>1</code>, December is <code>12</code>).</li>
<li><code>pickup_day</code>: The day of the month of the trip.</li>
<li><code>pickup_location_code</code>: The airport or <a target="_blank" href="https://en.wikipedia.org/wiki/Boroughs_of_New_York_City">borough</a> where the trip started.</li>
<li><code>dropoff_location_code</code>: The airport or borough where the trip finished.</li>
<li><code>trip_distance</code>: The distance of the trip in miles.</li>
<li><code>trip_length</code>: The length of the trip in seconds.</li>
<li><code>fare_amount</code>: The base fare of the trip, in dollars.</li>
<li><code>total_amount</code>: The total amount charged to the passenger, including all fees, tolls and tips.</li>
</ul>


In [75]:
taxi = np.genfromtxt("data/nyc_taxis.csv", delimiter=",", skip_header=1)

In [77]:
taxi.shape

(89560, 15)
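To see what `np.genfromtxt` does without the data file, the sketch below reads a tiny in-memory CSV. The three-column sample and its values are hypothetical and are not taken from `nyc_taxis.csv`; the text column is included only to show that values which cannot be parsed as numbers come back as `nan` with the default float dtype.

```python
import io
import numpy as np

# Hypothetical sample, not the real nyc_taxis.csv file.
csv_text = (
    "pickup_year,pickup_location,trip_distance\n"
    "2016,JFK,21.0\n"
    "2016,LGA,16.29\n"
)

# skip_header=1 drops the line with the column names.
sample = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(sample.shape)  # (2, 3)
print(sample.dtype)  # float64
print(sample)        # the text column is read as nan
```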

## Datatypes

<div class="text_cell_render border-box-sizing rendered_html">
<table>
<thead><tr>
<th>Data type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>bool_</code></td>
<td>Boolean (True or False) stored as a byte</td>
</tr>
<tr>
<td><code>int_</code></td>
<td>Default integer type (same as C <code>long</code>; normally either <code>int64</code> or <code>int32</code>)</td>
</tr>
<tr>
<td><code>intc</code></td>
<td>Identical to C <code>int</code> (normally <code>int32</code> or <code>int64</code>)</td>
</tr>
<tr>
<td><code>intp</code></td>
<td>Integer used for indexing (same as C <code>ssize_t</code>; normally either <code>int32</code> or <code>int64</code>)</td>
</tr>
<tr>
<td><code>int8</code></td>
<td>Byte (-128 to 127)</td>
</tr>
<tr>
<td><code>int16</code></td>
<td>Integer (-32768 to 32767)</td>
</tr>
<tr>
<td><code>int32</code></td>
<td>Integer (-2147483648 to 2147483647)</td>
</tr>
<tr>
<td><code>int64</code></td>
<td>Integer (-9223372036854775808 to 9223372036854775807)</td>
</tr>
<tr>
<td><code>uint8</code></td>
<td>Unsigned integer (0 to 255)</td>
</tr>
<tr>
<td><code>uint16</code></td>
<td>Unsigned integer (0 to 65535)</td>
</tr>
<tr>
<td><code>uint32</code></td>
<td>Unsigned integer (0 to 4294967295)</td>
</tr>
<tr>
<td><code>uint64</code></td>
<td>Unsigned integer (0 to 18446744073709551615)</td>
</tr>
<tr>
<td><code>float_</code></td>
<td>Shorthand for <code>float64</code> (this alias was removed in NumPy 2.0; use <code>float64</code> there).</td>
</tr>
<tr>
<td><code>float16</code></td>
<td>Half precision float: sign bit, 5 bits exponent, 10 bits mantissa</td>
</tr>
<tr>
<td><code>float32</code></td>
<td>Single precision float: sign bit, 8 bits exponent, 23 bits mantissa</td>
</tr>
<tr>
<td><code>float64</code></td>
<td>Double precision float: sign bit, 11 bits exponent, 52 bits mantissa</td>
</tr>
<tr>
<td><code>complex_</code></td>
<td>Shorthand for <code>complex128</code> (this alias was removed in NumPy 2.0; use <code>complex128</code> there).</td>
</tr>
<tr>
<td><code>complex64</code></td>
<td>Complex number, represented by two 32-bit floats</td>
</tr>
<tr>
<td><code>complex128</code></td>
<td>Complex number, represented by two 64-bit floats</td>
</tr>
</tbody>
</table>

</div>

In [79]:
x = np.array([1,2])
print(x.dtype)
print(x.nbytes)

int64
16


In [80]:
x = np.array([1.0,2.0])
print(x.dtype)
print(x.nbytes)

float64
16


In [81]:
x = np.array([1,2], dtype=np.int32)
print(x.dtype)
print(x.nbytes)

int32
8


In [82]:
x = np.array([1,2], dtype=np.int8)
print(x.dtype)
print(x.nbytes)

int8
2


In [83]:
x = np.array([189, 127, 128, 255], dtype=np.int8)
print(x)
print(x.dtype)
print(x.nbytes)

[ -67  127 -128   -1]
int8
4
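The negative numbers above come from values that do not fit into the `int8` range wrapping around. Depending on the installed NumPy version, passing out-of-range Python integers directly with `dtype=np.int8` may produce a warning or an `OverflowError` instead. The sketch below is one way to reproduce the wrap-around explicitly, by building the array with the default integer type and then casting with `astype`.

```python
import numpy as np

# The representable range of int8.
info = np.iinfo(np.int8)
print(info.min, info.max)  # -128 127

# Build with the default integer type, then cast down to int8.
wide = np.array([189, 127, 128, 255])
narrow = wide.astype(np.int8)

print(narrow)         # [ -67  127 -128   -1]
print(narrow.nbytes)  # 4
```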


In [98]:
a = np.array([3,4,5, "red", False])

In [99]:
print(a.dtype)

<U21


## Boolean Indexing

### Boolean Arrays

In [85]:
3 == 3

True

In [86]:
24 != 24

False

In [87]:
3 > 5

False

In [89]:
np.array([2,4,6,8]) + 10

array([12, 14, 16, 18])

In [90]:
np.array([2,4,6,8]) > 5

array([False, False,  True,  True])

In [91]:
a = np.array([1, 2, 3, 4, 5])
b = np.array(["blue", "blue", "red", "blue"])
c = np.array([80.0, 103.4, 96.9, 200.3])

In [92]:
a < 3

array([ True,  True, False, False, False])

In [93]:
b == "blue"

array([ True,  True, False,  True])

In [101]:
c > 80.6

array([False,  True,  True,  True])

### Boolean Indexing with 1D ndarrays

In [105]:
data = np.array([2,4,6,8])

In [106]:
b = data > 5

In [107]:
b

array([False, False,  True,  True])

In [108]:
data[b]

array([6, 8])

In [109]:
data[data > 5]

array([6, 8])

### Boolean Indexing with 2D ndarrays

<img alt="Dimensional Arrays" src="./images/bool_dims_updated.svg">

In [110]:
print(test)

[[2 0 3 2 9]
 [8 8 5 9 0]
 [9 5 9 9 5]
 [6 5 2 3 9]
 [5 2 8 5 8]]


In [111]:
test > 5

array([[False, False, False, False,  True],
       [ True,  True, False,  True, False],
       [ True, False,  True,  True, False],
       [ True, False, False, False,  True],
       [False, False,  True, False,  True]])

In [112]:
test[test > 5]

array([9, 8, 8, 9, 9, 9, 9, 6, 9, 8, 8])
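A boolean array can also be applied along a single dimension. In this sketch (the name `first_col_large` is illustrative), a 1D mask built from the first column selects whole rows, and it can be combined with a column index.

```python
import numpy as np

test = np.array([[2, 0, 3, 2, 9],
                 [8, 8, 5, 9, 0],
                 [9, 5, 9, 9, 5],
                 [6, 5, 2, 3, 9],
                 [5, 2, 8, 5, 8]])

# One boolean per row: is the value in the first column greater than 5?
first_col_large = test[:, 0] > 5
print(first_col_large)  # [False  True  True  True False]

# Keep only the rows where the mask is True.
selected_rows = test[first_col_large]
print(selected_rows)

# Same rows, but only the last column.
last_col = test[first_col_large, 4]
print(last_col)  # [0 5 9]
```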

## The meaning of shapes in NumPy

**1-DIMENSIONAL NUMPY ARRAYS ONLY HAVE ONE AXIS**

<p><img alt="An image that shows that 1-d NumPy arrays have 1 axis, axis 0." width="385" src="https://cdn-coiao.nitrocdn.com/CYHudqJZsSxQpAPzLkHFOkuzFKDpEHGF/assets/static/optimized/rev-46bfc56/wp-content/uploads/2018/12/1d-array-has-one-axis.png"></p>

**2-DIMENSIONAL NUMPY ARRAYS**

<p><img alt="An image that shows that NumPy arrays have axes." width="550" src="https://cdn-coiao.nitrocdn.com/CYHudqJZsSxQpAPzLkHFOkuzFKDpEHGF/assets/static/optimized/rev-46bfc56/wp-content/uploads/2018/11/numpy-arrays-have-axes.png"></p>

In [113]:
a = np.arange(12)

In [114]:
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

<pre><code>┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
</code></pre>

In [115]:
a.shape

(12,)

In [116]:
a.__array_interface__['data']

(56578752, False)

In [117]:
a[2]

2

<pre><code>i= 0    1    2    3    4    5    6    7    8    9   10   11
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
</code></pre>

In [118]:
b = a.reshape((3,4))
print(b)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [119]:
b.shape

(3, 4)

In [120]:
b.__array_interface__['data']

(56578752, False)

<pre><code>i= 0    0    0    0    1    1    1    1    2    2    2    2
j= 0    1    2    3    0    1    2    3    0    1    2    3
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
</code></pre>

In [121]:
b[0,2]

2

In [124]:
d = a.reshape((12,1))

In [125]:
print(d)

[[ 0]
 [ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]
 [11]]


In [126]:
d.shape

(12, 1)

In [127]:
d[10,0]

10

In [128]:
d.__array_interface__['data']

(56578752, False)

In [130]:
e = a.reshape((1, 12))

In [131]:
print(e)

[[ 0  1  2  3  4  5  6  7  8  9 10 11]]


In [132]:
e.shape

(1, 12)

In [133]:
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [134]:
a.shape

(12,)

In [136]:
e[0, 5]

5
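The identical addresses printed by `__array_interface__['data']` suggest that the reshaped arrays share one buffer with `a`. A short sketch to confirm this uses `np.shares_memory` and the `base` attribute, and then writes through the 2D array to show that the 1D original changes too.

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, 4))

# The reshaped array is a view onto the same data buffer.
print(np.shares_memory(a, b))  # True
print(b.base is a)             # True

# Writing through the 2D view changes the original 1D array.
b[0, 2] = 99
print(a[2])  # 99

# -1 asks NumPy to infer that dimension from the total size.
inferred = a.reshape((2, -1))
print(inferred.shape)  # (2, 6)
```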

## Assigning Values

### Assigning Values in ndarrays

In [137]:
a = np.array(['red','blue','black','blue','purple'])

In [138]:
a[3:] = "roza"

In [139]:
a

array(['red', 'blue', 'black', 'roza', 'roza'], dtype='<U6')

In [140]:
ones = np.ones((3,5))

In [141]:
ones

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [142]:
ones[2,1] = 99

In [143]:
ones

array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1., 99.,  1.,  1.,  1.]])

In [144]:
ones[0] = 0

In [145]:
ones

array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1., 99.,  1.,  1.,  1.]])

### Assignment Using Boolean Arrays

In [146]:
ones

array([[ 0.,  0.,  0.,  0.,  0.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1., 99.,  1.,  1.,  1.]])

In [147]:
ones[ones == 1] = 13

In [148]:
ones

array([[ 0.,  0.,  0.,  0.,  0.],
       [13., 13., 13., 13., 13.],
       [13., 99., 13., 13., 13.]])
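Boolean assignment can also target a single column of the selected rows. The following sketch uses a smaller array named `arr` with made-up values in the same spirit as `ones` above: first it caps large values, then it changes the last column only in rows whose first column equals 13.

```python
import numpy as np

arr = np.array([[0., 0., 0.],
                [13., 13., 13.],
                [13., 99., 13.]])

# Replace every value above 50 with 50.
arr[arr > 50] = 50

# In rows where the first column equals 13, set the last column to -1.
arr[arr[:, 0] == 13, 2] = -1

print(arr)
```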

## Adding Rows and Columns to ndarrays

In [151]:
ones = np.ones((2,3))

In [152]:
ones

array([[1., 1., 1.],
       [1., 1., 1.]])

In [153]:
zeros = np.zeros(3)

In [154]:
zeros

array([0., 0., 0.])

In [155]:
np.concatenate([ones, zeros], axis=0)

ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)

In [156]:
ones.shape

(2, 3)

In [157]:
zeros.shape

(3,)

In [158]:
zeros = zeros.reshape(1,3)

In [160]:
zeros.shape

(1, 3)

In [161]:
np.concatenate([ones, zeros], axis=0)

array([[1., 1., 1.],
       [1., 1., 1.],
       [0., 0., 0.]])
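Adding a column follows the same pattern with `axis=1`: the new data needs one value per row, that is, shape `(2, 1)` for this array. The sketch below shows two ways to get that shape; the names `new_col`, `flat` and `as_column` are illustrative.

```python
import numpy as np

ones = np.ones((2, 3))

# A new column must have shape (2, 1): one value for each of the 2 rows.
new_col = np.zeros((2, 1))
with_zeros = np.concatenate([ones, new_col], axis=1)
print(with_zeros)

# np.expand_dims turns a 1D array of shape (2,) into shape (2, 1).
flat = np.array([7., 8.])
as_column = np.expand_dims(flat, axis=1)
wider = np.concatenate([ones, as_column], axis=1)
print(wider.shape)  # (2, 4)
```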

## Computation on NumPy Arrays: Universal Functions


### The Slowness of Loops



In [162]:
np.random.seed(0)

def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output
        
values = np.random.randint(1, 10, size=5)
compute_reciprocals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

In [163]:
big_array = np.random.randint(1, 100, size=1_000_000)
%timeit compute_reciprocals(big_array)

7.61 s ± 121 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Introducing UFuncs (Universal functions)


In [164]:
values = np.random.randint(1, 10, size=5)

In [165]:
values

array([3, 6, 4, 2, 7])

In [166]:
1 / values

array([0.33333333, 0.16666667, 0.25      , 0.5       , 0.14285714])

In [167]:
%timeit (1.0 / big_array)

9.18 ms ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
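`%timeit` is an IPython magic, so the comparison above only runs inside a notebook or IPython shell. As a rough equivalent for a plain script, the sketch below uses the standard `timeit` module on a smaller array (the size of 10,000 and the name `small_array` are arbitrary choices). It first checks that both approaches agree; the timings themselves depend on the machine, so none are shown.

```python
import timeit
import numpy as np

def compute_reciprocals(values):
    # Element-by-element loop, as in the cell above.
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

rng = np.random.default_rng(0)
small_array = rng.integers(1, 100, size=10_000)

# Both approaches compute the same values.
loop_result = compute_reciprocals(small_array)
ufunc_result = 1.0 / small_array
print(np.allclose(loop_result, ufunc_result))  # True

# Time 5 calls of each version.
loop_time = timeit.timeit(lambda: compute_reciprocals(small_array), number=5)
ufunc_time = timeit.timeit(lambda: 1.0 / small_array, number=5)
print(f"loop: {loop_time:.4f} s, ufunc: {ufunc_time:.4f} s")
```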


## Subarrays as no-copy views



In [168]:
r = np.ones((4,4))
print(r)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


In [169]:
r2 = r[:2, :2]
print(r2)

[[1. 1.]
 [1. 1.]]


In [170]:
r2[:] = 0

In [171]:
r2

array([[0., 0.],
       [0., 0.]])

In [172]:
r

array([[0., 0., 1., 1.],
       [0., 0., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

## Copying Data



In [173]:
r = np.ones((4,4))
print(r)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


In [174]:
r2 = r[:2, :2].copy()
print(r2)

[[1. 1.]
 [1. 1.]]


In [175]:
r2[:] = 0

In [176]:
r2

array([[0., 0.],
       [0., 0.]])

In [177]:
r

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
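Whether a subarray is a view or an independent copy can be checked directly rather than inferred from side effects. This sketch uses `np.shares_memory` and the `base` attribute; the variable names `view` and `copied` are illustrative.

```python
import numpy as np

r = np.ones((4, 4))
view = r[:2, :2]            # slicing returns a view
copied = r[:2, :2].copy()   # .copy() allocates new memory

print(np.shares_memory(r, view))    # True
print(np.shares_memory(r, copied))  # False

# A view remembers the array it was taken from; a copy owns its data.
print(view.base is r)       # True
print(copied.base is None)  # True
```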