# Chapter 7 - Course Text

## 7.2 Creating Arrays from Existing Data

In [1]:
import numpy as np

In [2]:
numbers = np.array([2, 3, 5, 7, 11])

In [3]:
type(numbers)  # The array's type is numpy.ndarray

numpy.ndarray

In [4]:
numbers  # The output is displayed as array(...), not ndarray(...), even though the object's type is numpy.ndarray

array([ 2,  3,  5,  7, 11])

### Multidimensional Arguments

In [5]:
np.array([[1, 2, 3], [4, 5, 6]])  # A list of lists is a multidimensional argument: each nested list becomes one row of the array

# NumPy auto-formats arrays based on their number of dimensions, aligning the columns within each row.

array([[1, 2, 3],
       [4, 5, 6]])

In [6]:
np.array([x for x in range(2, 21, 2)])

# This is a 1-dimensional array of the even integers from 2 through 20. range's stop value (21) is exclusive, so counting ends at 20.

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18, 20])

In [7]:
np.array([[2, 4, 6, 8, 10], [1, 3, 5, 7, 9]])

array([[ 2,  4,  6,  8, 10],
       [ 1,  3,  5,  7,  9]])

## 7.3 Array Attributes

In [8]:
integers = np.array([[1, 2, 3], [4, 5, 6]])

In [9]:
integers

array([[1, 2, 3],
       [4, 5, 6]])

In [10]:
floats = np.array([0.0, 0.1, 0.2, 0.3, 0.4])

In [11]:
floats  # NumPy does not display trailing 0s to the right of the decimal point in floating-point values.

array([0. , 0.1, 0.2, 0.3, 0.4])

### Determining an Array's Element Type

In [12]:
integers.dtype # dtype will allow us to see an array's element type

dtype('int64')

In [13]:
floats.dtype

dtype('float64')

NumPy is written in the C programming language and uses C's data types, which is why the element types appear as `int64` and `float64` (64-bit integers and 64-bit floating-point values). Other common types include `bool` for Boolean values and `object` for non-numeric data (such as strings).
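As a small sketch of the other element types mentioned above, the snippet below builds arrays whose dtypes are `bool`, `object`, and a 32-bit integer. The array names (`flags`, `mixed`, `small`) are illustrative only and do not come from the course text.

```python
import numpy as np

flags = np.array([True, False, True])            # Boolean elements
mixed = np.array(['a', 1, 2.5], dtype=object)    # arbitrary Python objects
small = np.array([1, 2, 3], dtype=np.int32)      # explicitly request 32-bit ints

print(flags.dtype)
print(mixed.dtype)
print(small.dtype, small.itemsize)
```

Passing `dtype=` to `np.array` is one way to choose the element type instead of letting NumPy infer it from the data.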

### Determining an Array's Dimensions

In [14]:
integers.ndim

# The ndim attribute contains an array's number of dimensions

2

In [15]:
floats.ndim

1

In [16]:
integers.shape

# The shape attribute contains a tuple specifying the array's dimensions

(2, 3)

In [17]:
# (2, 3) means 2 rows by 3 columns, containing 6 elements

In [18]:
floats.shape

(5,)

In [19]:
# A shape like (5,) is a one-element tuple: the array has one dimension containing 5 elements. Remember, floats is 1-dimensional!

### Determining an Array's Number of Elements and Element Size

In [20]:
integers.size

# The size attribute is the array's total number of elements (2 * 3 = 6 here)

6

In [21]:
integers.itemsize

# The itemsize attribute is the number of bytes required to store each element. It would be 4 on a platform whose C compiler uses 32-bit ints

8

In [22]:
floats.size

5

In [23]:
floats.itemsize

8
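The attributes in this section are related to one another. The sketch below checks two of those relationships for the `integers` array: `size` is the product of the dimensions in `shape`, and the total storage for the elements is `size * itemsize`, which NumPy also exposes as the `nbytes` attribute. The names `element_count` and `total_bytes` are illustrative.

```python
import math
import numpy as np

integers = np.array([[1, 2, 3], [4, 5, 6]])

# size is the product of the dimensions in shape
element_count = math.prod(integers.shape)

# Total storage for the elements is size * itemsize (also available as nbytes)
total_bytes = integers.size * integers.itemsize

print(element_count == integers.size)
print(total_bytes == integers.nbytes)
```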

### Iterating Through a Multidimensional Array's Elements

In [24]:
for row in integers:
    for column in row:
        print(column, end='  ')
    print()

# This is an example of external iteration: we write explicit loops to visit each element of the array.

1  2  3  
4  5  6  


In [25]:
for i in integers.flat:
    print(i, end='  ')

# This is another example of external iteration. The flat attribute lets us iterate through the array as if it were 1-dimensional.

1  2  3  4  5  6  
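If the position of each element is needed as well as its value, one option (not used in the course text) is `np.ndenumerate`, which yields an index tuple together with each element. The following is a minimal sketch; the name `triples` is illustrative.

```python
import numpy as np

integers = np.array([[1, 2, 3], [4, 5, 6]])

# Collect (row, column, value) triples while iterating over every element
triples = [(row, col, int(value))
           for (row, col), value in np.ndenumerate(integers)]

print(triples)
```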

## 7.4 Filling Arrays with Specific Values

In [26]:
np.zeros(5)

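# The zeros, ones, and full functions create arrays containing 0s, 1s, or a specified value. The first argument specifies the array's shape: an integer or a tuple of integers. By default, zeros and ones create float64 elements; the dtype keyword argument customizes the element type (see the next snippet).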

array([0., 0., 0., 0., 0.])

In [27]:
np.ones((2, 4), dtype=int)

# The ones function created a 2-by-4 array of 1s; dtype=int makes the elements integers

array([[1, 1, 1, 1],
       [1, 1, 1, 1]])

In [28]:
np.full((3, 5), 13)

# The full function created a 3-by-5 array in which every element is 13

array([[13, 13, 13, 13, 13],
       [13, 13, 13, 13, 13],
       [13, 13, 13, 13, 13]])
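To make the effect of the `dtype` argument visible, the sketch below compares the element types produced by these functions with and without it. The variable names are illustrative.

```python
import numpy as np

zeros_default = np.zeros((2, 3))           # float64 unless told otherwise
zeros_int = np.zeros((2, 3), dtype=int)    # integer zeros
halves = np.full((2, 2), 0.5)              # element type inferred from the fill value

print(zeros_default.dtype, zeros_int.dtype, halves.dtype)
```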

## 7.5 Creating Arrays from Ranges

### Creating Integer Ranges with arange

In [29]:
np.arange(5)

# arange(5) creates an array of the integers from 0 up to, but not including, 5

array([0, 1, 2, 3, 4])

In [30]:
np.arange(5, 10)

# With two arguments, arange takes a start value and a stop value. The start value is included; the stop value is not

array([5, 6, 7, 8, 9])

In [31]:
np.arange(10, 1, -2)

# arange can also count backward by using a negative step (the third argument)

array([10,  8,  6,  4,  2])

### Creating Floating-Point Ranges with linspace

In [32]:
np.linspace(0.0, 1.0, num=5)

# linspace creates an array of evenly spaced floating-point values. The num keyword argument specifies how many elements to produce, and both endpoints (0.0 and 1.0) are included

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
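Unlike `arange`, `linspace` includes the stop value by default. The sketch below contrasts that default with the `endpoint=False` keyword argument and computes the spacing between neighboring elements. The variable names are illustrative.

```python
import numpy as np

with_endpoint = np.linspace(0.0, 1.0, num=5)                     # includes 1.0
without_endpoint = np.linspace(0.0, 1.0, num=5, endpoint=False)  # stops short of 1.0

# With the endpoint included, the spacing is (stop - start) / (num - 1)
step = with_endpoint[1] - with_endpoint[0]

print(with_endpoint)
print(without_endpoint)
print(step)
```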

### Reshaping an Array

In [33]:
np.arange(1, 21).reshape(4, 5)

# The reshape method can transform a 1D array into a multidimensional array. It can transform any array shape into another, as long as both shapes have the same number of elements!

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])
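The "same number of elements" rule can be seen directly. In this sketch, passing `-1` for one dimension asks NumPy to work out that dimension from the element count, and a shape that does not fit raises a `ValueError`. The names `values`, `inferred`, and `mismatch_message` are illustrative.

```python
import numpy as np

values = np.arange(1, 21)           # 20 elements

inferred = values.reshape(4, -1)    # -1 lets NumPy infer the missing dimension
print(inferred.shape)

try:
    values.reshape(3, 7)            # 3 * 7 = 21 elements, but only 20 exist
except ValueError as error:
    mismatch_message = str(error)
    print('reshape failed:', mismatch_message)
```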

### Displaying Large Arrays

In [34]:
np.arange(1, 100001).reshape(4, 25000)

# Ellipses (...) represent data not shown!

array([[     1,      2,      3, ...,  24998,  24999,  25000],
       [ 25001,  25002,  25003, ...,  49998,  49999,  50000],
       [ 50001,  50002,  50003, ...,  74998,  74999,  75000],
       [ 75001,  75002,  75003, ...,  99998,  99999, 100000]],
      shape=(4, 25000))

In [35]:
np.arange(1, 100001).reshape(100, 1000)

array([[     1,      2,      3, ...,    998,    999,   1000],
       [  1001,   1002,   1003, ...,   1998,   1999,   2000],
       [  2001,   2002,   2003, ...,   2998,   2999,   3000],
       ...,
       [ 97001,  97002,  97003, ...,  97998,  97999,  98000],
       [ 98001,  98002,  98003, ...,  98998,  98999,  99000],
       [ 99001,  99002,  99003, ...,  99998,  99999, 100000]],
      shape=(100, 1000))
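The point at which NumPy starts abbreviating output is configurable. As a hedged sketch (this goes beyond what the course text covers), `np.array2string` accepts `threshold` and `edgeitems` arguments, so a small array can be summarized the same way the large ones above are.

```python
import numpy as np

values = np.arange(1, 11)

# threshold: summarize when the array has more elements than this
# edgeitems: how many elements to show at each end of a summarized dimension
summary = np.array2string(values, threshold=5, edgeitems=2)

print(summary)
```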

In [36]:
np.arange(2, 41, 2).reshape(4, 5)

array([[ 2,  4,  6,  8, 10],
       [12, 14, 16, 18, 20],
       [22, 24, 26, 28, 30],
       [32, 34, 36, 38, 40]])

In [37]:
import random

## 7.6 List vs. Array Performance: Introducing %timeit

### Timing the Creation of a List Containing Results of 6,000,000 Dice Rolls

In [38]:
%timeit rolls_list = \
    [random.randrange(1,7) for i in range(0, 6_000_000)]

# This times how long the list comprehension takes to roll a six-sided die 6,000,000 times.

2.84 s ± 254 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Timing the Creation of an Array Containing the Results of 6,000,000 Dice Rolls

In [39]:
%timeit rolls_array = np.random.randint(1, 7, 6_000_000)

# This times how long NumPy's random.randint function takes to create an array of 6,000,000 six-sided-die rolls

80.7 ms ± 16.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### 60,000,000 and 600,000,000 Dice Rolls

In [40]:
%timeit rolls_array = np.random.randint(1, 7, 60_000_000)

# This times how long NumPy's random.randint function takes to create an array of 60,000,000 six-sided-die rolls

713 ms ± 41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [41]:
%timeit rolls_array = np.random.randint(1, 7, 600_000_000)

# This times how long NumPy's random.randint function takes to create an array of 600,000,000 six-sided-die rolls

7.06 s ± 427 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Customizing the %timeit Iterations

In [42]:
%timeit -n3 -r2 rolls_array = np.random.randint(1, 7, 6_000_000)

# This is how we can customize the %timeit magic. -n# sets the number of loops per run and -r# sets the number of runs

62.9 ms ± 1.52 ms per loop (mean ± std. dev. of 2 runs, 3 loops each)
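`%timeit` is only available inside IPython and Jupyter. In a plain `.py` script, a rough equivalent can be sketched with the standard library's `timeit` module, where `repeat` plays the role of `-r` and `number` the role of `-n`. The helper `roll_dice` and the smaller roll count are assumptions made to keep the example quick.

```python
import timeit
import numpy as np

def roll_dice():
    """Create an array of 600,000 simulated six-sided-die rolls."""
    return np.random.randint(1, 7, 600_000)

# repeat=2 corresponds to -r2 and number=3 corresponds to -n3
run_totals = timeit.repeat(roll_dice, repeat=2, number=3)

# Each entry is the total time for one run of 3 loops, so divide for seconds per loop
per_loop = [total / 3 for total in run_totals]

print(per_loop)
```

Unlike `%timeit`, this reports raw times and leaves computing the mean and standard deviation to the caller.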


In [43]:
%timeit sum([x for x in range(10_000_000)])

662 ms ± 22.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [44]:
%timeit np.arange(10_000_000).sum()

29 ms ± 2.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


### Other IPython Magics
`%load` to read code into IPython from a local file or URL\
`%save` to save snippets to a file\
`%run` to execute a `.py` file from IPython\
`%precision` to change the default floating point precision for IPython outputs\
`%cd` to change directories without having to exit IPython first\
`%edit` to launch an external editor - handy if you need to modify more complex snippets\
`%history` to view a list of all snippets and commands you've executed in the current IPython session

## 7.7 Array Operators

### Arithmetic Operations with Arrays and Individual Numeric Values

In [45]:
numbers = np.arange(1, 6)

In [46]:
numbers

array([1, 2, 3, 4, 5])

In [47]:
numbers * 2

array([ 2,  4,  6,  8, 10])

In [48]:
numbers ** 3

array([  1,   8,  27,  64, 125])

In [49]:
numbers # numbers is unchanged by arithmetic operators

array([1, 2, 3, 4, 5])

In [50]:
numbers += 10

In [51]:
numbers

array([11, 12, 13, 14, 15])

### Arithmetic Operations Between Arrays

In [52]:
numbers2 = np.linspace(1.1, 5.5, 5)

In [53]:
numbers2

array([1.1, 2.2, 3.3, 4.4, 5.5])

In [54]:
numbers * numbers2

array([12.1, 26.4, 42.9, 61.6, 82.5])
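The multiplication above pairs up elements by position, and because one operand holds integers and the other holds floating-point values, the result contains floating-point values. The sketch below rebuilds the two arrays from this section and checks the result's element type; `product` is an illustrative name.

```python
import numpy as np

numbers = np.arange(11, 16)            # integer elements
numbers2 = np.linspace(1.1, 5.5, 5)    # float64 elements

product = numbers * numbers2           # element-wise: numbers[i] * numbers2[i]

print(product.dtype)
```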

### Comparing Arrays

In [55]:
numbers

array([11, 12, 13, 14, 15])

In [56]:
numbers >= 13

array([False, False,  True,  True,  True])

In [57]:
numbers2

array([1.1, 2.2, 3.3, 4.4, 5.5])

In [58]:
numbers2 < numbers

array([ True,  True,  True,  True,  True])

In [59]:
numbers == numbers2

array([False, False, False, False, False])

In [60]:
numbers == numbers

array([ True,  True,  True,  True,  True])
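Each comparison produces an array of Boolean values, one per element. When a single answer is wanted instead, the Boolean array can be summarized. The following sketch uses the `any` and `all` methods and `np.count_nonzero`; the name `at_least_13` is illustrative.

```python
import numpy as np

numbers = np.arange(11, 16)
at_least_13 = numbers >= 13

print(at_least_13.any())               # is at least one element True?
print(at_least_13.all())               # are all elements True?
print(np.count_nonzero(at_least_13))   # how many elements are True?
```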

In [61]:
np.arange(1, 6) ** 2

array([ 1,  4,  9, 16, 25])

## 7.8 NumPy Calculation Methods

In [62]:
grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77, 90], [100, 81, 82]])

In [63]:
grades

array([[ 87,  96,  70],
       [100,  87,  90],
       [ 94,  77,  90],
       [100,  81,  82]])

In [64]:
grades.sum()

np.int64(1054)

In [65]:
grades.min()

np.int64(70)

In [66]:
grades.mean()

np.float64(87.83333333333333)

In [67]:
grades.std()

np.float64(8.792357792739987)

In [68]:
grades.var()

np.float64(77.30555555555556)

### Calculations by Row or Column

In [69]:
grades.mean(axis=0)

array([95.25, 85.25, 83.  ])

In [70]:
grades.mean(axis=1)

array([84.33333333, 92.33333333, 87.        , 87.66666667])
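The `axis` argument names the dimension that the calculation collapses. With `axis=0` the rows are collapsed, leaving one result per column; with `axis=1` the columns are collapsed, leaving one result per row. The sketch below confirms this by recomputing the column averages one column at a time; `manual_by_column` is an illustrative name.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77, 90], [100, 81, 82]])

by_column = grades.mean(axis=0)   # one average per column
by_row = grades.mean(axis=1)      # one average per row

# The same column averages, computed one column at a time
manual_by_column = [grades[:, col].mean() for col in range(grades.shape[1])]

print(by_column.shape, by_row.shape)
```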

In [71]:
grades = np.random.randint(60, 101, 12).reshape(4, 3)

In [72]:
grades

array([[ 89,  84,  97],
       [100,  75,  92],
       [ 78,  74,  76],
       [ 91,  85,  65]], dtype=int32)

In [73]:
grades.mean(axis=0)

array([89.5, 79.5, 82.5])

In [74]:
grades.mean(axis=1)

array([90.        , 89.        , 76.        , 80.33333333])

## 7.9 Universal Functions

In [75]:
numbers = np.array([1, 4, 9, 16, 25, 36])

In [76]:
np.sqrt(numbers)

array([1., 2., 3., 4., 5., 6.])

In [77]:
numbers2 = np.arange(1, 7) * 10

In [78]:
numbers2

array([10, 20, 30, 40, 50, 60])

In [79]:
np.add(numbers, numbers2)

array([11, 24, 39, 56, 75, 96])

### Broadcasting with Universal Functions

In [80]:
np.multiply(numbers2, 5)

array([ 50, 100, 150, 200, 250, 300])

In [81]:
numbers3 = numbers2.reshape(2, 3)

In [82]:
numbers3

array([[10, 20, 30],
       [40, 50, 60]])

In [83]:
numbers4 = np.array([2, 4, 6])

In [84]:
np.multiply(numbers3, numbers4)

array([[ 20,  80, 180],
       [ 80, 200, 360]])
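In the multiplication above, the one-dimensional `numbers4` is treated as if it were repeated for each row of `numbers3`. As a sketch of what that stretching looks like, `np.broadcast_to` can produce the expanded array explicitly; the name `stretched` is illustrative, and this explicit step is not needed in practice.

```python
import numpy as np

numbers3 = np.array([[10, 20, 30], [40, 50, 60]])
numbers4 = np.array([2, 4, 6])

# The (2, 3) array that numbers4 is treated as during the multiplication
stretched = np.broadcast_to(numbers4, numbers3.shape)
print(stretched)

product = np.multiply(numbers3, stretched)
print(product)
```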

In [85]:
selfcheck2 = np.arange(1, 6) ** 3

In [86]:
selfcheck2

array([  1,   8,  27,  64, 125])

## 7.10 Indexing and Slicing

### Indexing with 2D Arrays

In [87]:
grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77, 90], [100, 81, 82]])

In [88]:
grades

array([[ 87,  96,  70],
       [100,  87,  90],
       [ 94,  77,  90],
       [100,  81,  82]])

In [89]:
grades[0, 1]

np.int64(96)

### Selecting a Subset of a 2D Array's Rows

In [90]:
grades[1]

array([100,  87,  90])

In [91]:
grades[0: 2]

array([[ 87,  96,  70],
       [100,  87,  90]])

In [92]:
grades[1: 3]

array([[100,  87,  90],
       [ 94,  77,  90]])

### Selecting a Subset of a 2D Array's Columns

In [93]:
grades[:, 0]

# The colon (:) in the first position is a slice selecting ALL rows; the 0 selects column 0

array([ 87, 100,  94, 100])

In [94]:
grades[:, 1:3]

array([[96, 70],
       [87, 90],
       [77, 90],
       [81, 82]])

In [95]:
grades[:, [0, 2]]

array([[ 87,  70],
       [100,  90],
       [ 94,  90],
       [100,  82]])
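Row and column selections can be combined in a single indexing expression. This sketch slices both dimensions at once to pick out a rectangular block of `grades`; the name `block` is illustrative.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77, 90], [100, 81, 82]])

# Rows 1 and 2, columns 0 and 1
block = grades[1:3, 0:2]

print(block)
```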

## 7.11 Views: Shallow Copies

In [96]:
numbers = np.arange(1, 6)

In [97]:
numbers

array([1, 2, 3, 4, 5])

In [None]:
numbers2 = numbers.view()  # This makes numbers2 a view (a SHALLOW copy) of numbers: a separate array object that shares numbers' data

In [99]:
numbers2

array([1, 2, 3, 4, 5])

In [121]:
id(numbers)  # The built-in id function returns an object's unique identity. The two ids differ, so numbers and numbers2 are different objects

2580352278160

In [122]:
id(numbers2)

2580353839728

In [102]:
numbers[1] *= 10

In [103]:
numbers2

array([ 1, 20,  3,  4,  5])

In [104]:
numbers

array([ 1, 20,  3,  4,  5])

In [105]:
numbers2[1] /= 10

In [106]:
numbers2

array([1, 2, 3, 4, 5])

A shallow copy (view) does not hold its own data; it refers to the original array's data. That is why modifying an element through either `numbers` or `numbers2` is visible through both.

In [107]:
numbers

array([1, 2, 3, 4, 5])

In [108]:
numbers2 = numbers[0:3]

In [109]:
numbers2

array([1, 2, 3])

In [111]:
numbers[1] *= 20

In [112]:
numbers2

array([ 1, 40,  3])

In [113]:
numbers

array([ 1, 40,  3,  4,  5])
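Both `view` and slicing produce arrays that share the original's data. One way to check this, sketched below, is the `base` attribute, which for a view refers to the array that owns the data. The names `whole_view` and `slice_view` are illustrative.

```python
import numpy as np

numbers = np.arange(1, 6)
whole_view = numbers.view()
slice_view = numbers[0:3]

# A view's base attribute refers to the array that owns the data
print(whole_view.base is numbers)
print(slice_view.base is numbers)

slice_view[0] = 99    # writing through the view changes the original too
print(numbers)
```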

## 7.12 Deep Copies

In [114]:
numbers[1] /= 20

In [115]:
numbers

array([1, 2, 3, 4, 5])

In [123]:
numbers2 = numbers.copy() # This creates a deep copy of the original source (numbers)

In [117]:
numbers2

array([1, 2, 3, 4, 5])

In [118]:
numbers[1] *= 10

In [119]:
numbers

array([ 1, 20,  3,  4,  5])

In [120]:
numbers2

array([1, 2, 3, 4, 5])

A deep copy gives the new array its own copy of the data, so a change made through `numbers` does not affect `numbers2`.
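The independence works in the other direction as well. In this sketch the copy is modified instead of the original, and `np.shares_memory` is used to confirm that the two arrays do not overlap in memory.

```python
import numpy as np

numbers = np.arange(1, 6)
numbers2 = numbers.copy()

numbers2[0] = 99                              # modify the copy this time

print(numbers)                                # the original keeps its own data
print(np.shares_memory(numbers, numbers2))
```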

## 7.13 Reshaping and Transposing

### Reshaping v. Resizing

`reshape` returns a view (shallow copy) of the original array with new dimensions. It does not modify the original array.

In [124]:
grades = np.array([[87, 96, 70], [100, 87, 90]])

In [125]:
grades

array([[ 87,  96,  70],
       [100,  87,  90]])

In [127]:
grades.reshape(1, 6) # Instead of 2x3, it's now 1x6

array([[ 87,  96,  70, 100,  87,  90]])

In [128]:
grades  # reshape does not modify the original array

array([[ 87,  96,  70],
       [100,  87,  90]])

`resize` modifies the original array's shape in place.

In [130]:
grades.resize(1, 6)

In [131]:
grades

array([[ 87,  96,  70, 100,  87,  90]])

### flatten and ravel

The `flatten` and `ravel` methods turn a multidimensional array into a 1-dimensional array.

`flatten` makes a deep copy of the original array's data, so changes to the flattened array do not affect the original.

In [132]:
grades.resize(2, 3)

In [133]:
grades

array([[ 87,  96,  70],
       [100,  87,  90]])

In [134]:
flattened = grades.flatten()

In [135]:
flattened

array([ 87,  96,  70, 100,  87,  90])

In [136]:
grades

array([[ 87,  96,  70],
       [100,  87,  90]])

In [137]:
flattened[0] = 100

In [138]:
flattened

array([100,  96,  70, 100,  87,  90])

`ravel` produces a view of the original array, which shares the `grades` array's data, so changes to the raveled array also change `grades`.

In [139]:
raveled = grades.ravel()

In [140]:
raveled

array([ 87,  96,  70, 100,  87,  90])

In [141]:
grades

array([[ 87,  96,  70],
       [100,  87,  90]])

In [142]:
raveled[0] = 100

In [143]:
raveled

array([100,  96,  70, 100,  87,  90])

In [144]:
grades

array([[100,  96,  70],
       [100,  87,  90]])
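The difference between the two methods can also be checked without modifying any data. The sketch below uses `np.shares_memory` on a fresh `grades` array. Note the assumption in the comment: `ravel` returns a view for a contiguous array like this one, and NumPy may fall back to a copy in other cases.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90]])

flattened = grades.flatten()   # independent copy of the data
raveled = grades.ravel()       # view, for a contiguous array like this one

print(np.shares_memory(grades, flattened))
print(np.shares_memory(grades, raveled))
```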

### Transposing Rows and Columns