#### PGGM Data Science Bootcamp 2020
*Notebook by [Pedro V Hernandez Serrano](https://github.com/pedrohserrano)*

---
![](../img/image_1.png)

# 1. Data Science with Python
* [1.1. Python Data Structures](#1.1)
* [1.2. Numpy](#1.2)

---

Guido van Rossum | Monty Python
- | - 
![](https://gvanrossum.github.io/images/guido-portrait-dan-stroud.jpg) | ![](https://upload.wikimedia.org/wikipedia/en/c/cd/Monty_Python%27s_Flying_Circus_Title_Card.png)


The history of Python starts with ABC.

- ABC is a general-purpose programming language and programming 
environment that was developed at the CWI (Centrum Wiskunde & Informatica) 
in Amsterdam, the Netherlands.

- The greatest achievement of ABC was to influence the design of Python, 
which emphasizes the DRY (Don’t Repeat Yourself) principle and readability.

- Python was conceptualized in the late 1980s. At that time, Guido van Rossum 
was working at the CWI on a project called Amoeba, a distributed operating system.

- Python was designed as a simple scripting language that possessed some of 
ABC's better properties, but without its problems.

-  So, what about the name "Python": Most people think about snakes, but the 
name has something to do with excellent British humour. A show called Monty Python's Flying Circus was the culprit.

## Tutorials for Learning Python
    
- [Codecademy](https://www.codecademy.com/tracks/python) is great for beginner levels.
- There is also the [Official Beginners Guide](https://wiki.python.org/moin/BeginnersGuide).
- [Learn Python the Hard Way](https://learnpythonthehardway.org/book/) is a great tutorial for a more in-depth overview.
    - It isn't actually particularly hard, although note that the currently available version is in Python2.
- [Whirlwind Tour of Python](https://github.com/jakevdp/WhirlwindTourOfPython) is a free collection of Jupyter notebooks that takes you through Python. 

## Python Practice
    
 - [Python Challenge](http://www.pythonchallenge.com/) is a good place for (sometimes infuriating) programming challenges.
 - [Leet Code](https://leetcode.com/) is a place for more intense technical coding questions and challenges (geared towards industry interviews).

## Getting Un-Stuck
At some point, you will get stuck. It happens. The internet is your friend.
    
If you get an error, or aren't sure how to proceed, use {your favourite search engine} with specific search terms relating to what you are trying to do. Sometimes this just means searching the error that you got.
   
You are likely to find responses on [StackOverflow](https://stackoverflow.com), which is basically a forum for programming questions, and a good place to find answers.

## Python2 vs. Python3
    
Python3 was a break from Python2, because there were some larger changes that broke [backwards compatibility](https://en.wikipedia.org/wiki/Backward_compatibility).
  
Python2 remained popular for a long time, partly because it takes a while with major new releases for everything to be available, and to get things updated. Now though, Python2 has reached its end of life (1 January 2020), Python3 has pretty much everything available, and it is the future of Python.

In practice, Python 2 & 3 are very similar - learning one will be mostly relevant for knowing the other, and code can usually be made compatible between both with minimal changes.

This programme will use Python3, the actively developed version of Python. Specifically, we use version 3.7.

## Packages

Packages are basically just collections of code. The Anaconda distribution comes with all the core packages you will need for this class. 
  
To get other packages, Anaconda comes with
    <a href="https://conda.io/docs/using/pkgs.html" class="alert-link">conda</a>,
    a package manager with support for downloading and installing other packages.

---
### 1.1. Python Data Structures
<a id="1.1">

Many of the things I used to use a calculator for, I now use Python for:

In [121]:
2+2

4

In [122]:
(50-5*6)/4

5.0

There are some gotchas compared to using a normal calculator.

In [123]:
7/3

2.3333333333333335

In Python 2, dividing two integers with `/` discarded the remainder and returned an integer, as integer division does in C or Fortran. In Python 3, which we use here, `/` always returns a floating point number, as the result above shows. In Python 2 you could get a sneak preview of this behaviour by importing it from the future features:

    from __future__ import division

Alternatively, in Python 2 you could convert one of the integers to a floating point number, in which case the division returns another floating point number. In Python 3 the conversion is harmless but no longer needed:

In [124]:
7/3.0

2.3333333333333335

In [125]:
7/float(3)

2.3333333333333335
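As a minimal sketch of how division behaves in Python 3, the snippet below compares true division (`/`), floor division (`//`) and the remainder operator (`%`). The variable names are only illustrative.

```python
# Python 3 division operators (variable names are illustrative only)
true_quotient = 7 / 3      # true division always returns a float
floor_quotient = 7 // 3    # floor division returns an int for int operands
remainder = 7 % 3          # remainder of the division

print(true_quotient, floor_quotient, remainder)

# divmod() returns the floor quotient and the remainder together as a tuple
quotient, rest = divmod(7, 3)
print(quotient, rest)
```

Use `//` when you want the integer result that `/` gave for integers in Python 2.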

In the last few lines, we have sped by a lot of things that we should stop for a moment and explore a little more fully. We've seen, however briefly, two different data types: **integers**, also known as *whole numbers* to the non-programming world, and **floating point numbers**, also known (incorrectly) as *decimal numbers* to the rest of the world.

We've also seen the first instance of an **import** statement. Python has a huge number of libraries included with the distribution. To keep things simple, most of these variables and functions are not accessible from a normal Python interactive session. Instead, you have to import the name. For example, there is a **math** module containing many useful functions. To access, say, the square root function, you can either first

    from math import sqrt

and then

In [126]:
from math import sqrt
sqrt(81)

9.0

In [127]:
#dir(sqrt)

or you can simply import the math library itself

In [128]:
import math as mt

In [129]:
mt.sqrt(81)

9.0

You can define variables using the equals (=) sign:

In [130]:
width = 20
length = 30
area = length*width

In [131]:
print(area)

600


If you try to access a variable that you haven't yet defined, you get an error:

In [132]:
#volume
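The error in question is a `NameError`. The sketch below triggers it on purpose and catches it, so that the rest of the notebook keeps running; `undefined_volume` is a hypothetical name chosen because it is not defined anywhere.

```python
# Accessing an undefined name raises NameError. We catch it here so the
# example keeps running. `undefined_volume` is a hypothetical name.
try:
    print(undefined_volume)
except NameError as error:
    message = str(error)
    print("NameError:", message)
```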

In [133]:
depth = 10
volume = area*depth
volume

6000

In [134]:
# #MyVariableIsThis
# #my_variables_is_this
# from_ 
# _while
# _def

You can name a variable *almost* anything you want. The name needs to start with an alphabetical character or "\_", and can contain alphanumeric characters plus underscores ("\_"). Certain words, however, are reserved for the language. In Python 3.7 these are:

    False, None, True, and, as, assert, async, await, break, class, continue,
    def, del, elif, else, except, finally, for, from, global, if, import, in,
    is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield

Trying to define a variable using one of these will result in a syntax error:

In [135]:
#return = 0
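If you are unsure whether a name is reserved, the standard library `keyword` module can tell you. The sketch below also uses `compile()` to show the syntax error without interrupting the notebook; the file name `<example>` is just a placeholder.

```python
import keyword

# keyword.kwlist holds the reserved words of the running interpreter
print(keyword.kwlist)

# iskeyword() tells us whether a name is reserved
print(keyword.iskeyword("return"))   # a reserved word
print(keyword.iskeyword("volume"))   # an ordinary name

# Assigning to a keyword is a SyntaxError; compiling the statement from a
# string lets us catch the error instead of stopping the notebook
syntax_error_caught = False
try:
    compile("return = 0", "<example>", "exec")
except SyntaxError as error:
    syntax_error_caught = True
    print("SyntaxError:", error.msg)
```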

The [Python Tutorial](http://docs.python.org/2/tutorial/introduction.html#using-python-as-a-calculator) has more on using Python as an interactive shell. The [IPython tutorial](http://ipython.org/ipython-doc/dev/interactive/tutorial.html) makes a nice complement to this, since IPython has a much more sophisticated interactive shell.

## Strings
Strings are sequences of characters, and can be defined using either single quotes

In [136]:
'Hello, PGGM!'

'Hello, PGGM!'

or double quotes

In [137]:
"Hello, PGGM!"

'Hello, PGGM!'

But not both at the same time, unless you want one of the symbols to be part of the string.

In [138]:
"He's a Data Scientist"

"He's a Data Scientist"

In [139]:
'She asked, "How are you today?"'

'She asked, "How are you today?"'

Just like the other two data objects we're familiar with (ints and floats), you can assign a string to a variable

In [140]:
greeting = "Hello, PGGM!"

In [141]:
print(greeting)

Hello, PGGM!


The **print()** function is often used for printing character strings:

In [142]:
print (greeting)

Hello, PGGM!


But it can also print data types other than strings:

In [143]:
area = 60

In [144]:
print ("The area is ",area, volume, 10, 5*4)

The area is  60 6000 10 20


The same is also possible with the format method:

In [145]:
print ("The area is {} and volume is {}".format(area, volume))

The area is 60 and volume is 6000
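Since Python 3.6 there is a third option, the formatted string literal or f-string. The sketch below redefines `area` and `volume` so that it runs on its own.

```python
area = 60
volume = 6000

# f-strings (Python 3.6+) evaluate the expressions inside the braces
message = f"The area is {area} and volume is {volume}"
print(message)

# A format specification after the colon controls the presentation,
# here two digits after the decimal point
ratio_text = f"{volume / area:.2f}"
print(ratio_text)
```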


In the above snippet, the number 60 (stored in the variable "area") is converted into a string before being printed out.

You can use + to concatenate multiple strings in a single statement:

In [146]:
print ("This " + "is " + "a " + "longer " + "statement."+ str(volume))

This is a longer statement.6000


If you have a lot of words to concatenate together, there are other, more efficient ways to do this. But this is fine for linking a few strings together.
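One such way is the string method `join`, which builds the result in a single pass. A minimal sketch, with made-up example lists:

```python
words = ["This", "is", "a", "longer", "statement."]

# str.join places the separator string between the items of the list
sentence = " ".join(words)
print(sentence)

# Non-string items must be converted first, for example with str()
values = [60, 6000, 10]
joined_values = ", ".join(str(value) for value in values)
print(joined_values)
```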

In [147]:
text1 = "The company ABN AMRO is a modern, full-service bank with a transparent and client-driven business model, a moderate risk profile"
text1

'The company ABN AMRO is a modern, full-service bank with a transparent and client-driven business model, a moderate risk profile'

In [148]:
len(text1) # The length of text1

128

In [149]:
text1

'The company ABN AMRO is a modern, full-service bank with a transparent and client-driven business model, a moderate risk profile'

In [150]:
text2 = text1.split(' ') # Return a list of the words in text1, separating by ' '.

In [151]:
#text2

In [152]:
len(text2)

20

In [153]:
len(text2[1])

7

List comprehension allows us to find specific words:

In [154]:
#[w for w in text2 if len(w) > 3] # Words that are greater than 3 letters long in text2

In [155]:
# alist = []
# for w in text2:
#     if len(w) > 3:
#         alist.append(w)
#     else:
#         pass
# alist

In [156]:
[w for w in text2 if w.istitle()] # Capitalized words in text2

['The']

In [157]:
[w for w in text2 if w.endswith('s')] # Words in text2 that end in 's'

['is', 'business']
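A list comprehension is shorthand for a loop that appends the matching items to a new list. The sketch below shows both forms side by side on a shortened, illustrative sentence.

```python
words = "The company ABN AMRO is a modern bank".split(" ")

# List comprehension: keep the words longer than 3 characters
long_words = [w for w in words if len(w) > 3]

# Equivalent explicit loop
long_words_loop = []
for w in words:
    if len(w) > 3:
        long_words_loop.append(w)

print(long_words)
print(long_words == long_words_loop)
```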

We can find unique words using `set()`.

In [158]:
text3 = 'The annual Report for 2019 showing annual results'
text4 = text3.split(' ')

In [159]:
len(text4)

8

In [160]:
len(set(text4))

7

In [161]:
set(text4)

{'2019', 'Report', 'The', 'annual', 'for', 'results', 'showing'}

In [162]:
len(set([w.lower() for w in text4])) # .lower converts the string to lowercase.

7

In [163]:
set([w.lower() for w in text4])

{'2019', 'annual', 'for', 'report', 'results', 'showing', 'the'}

## Lists
Very often in a programming language, one wants to keep a group of similar items together. Python does this using a data type called **lists**.

In [164]:
days_of_the_week = ["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]

In [165]:
type(days_of_the_week)

list

In [166]:
list(range(3))

[0, 1, 2]

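You can access members of the list using the **index** of that item. Indexing twice, as below, first selects the list item "Tuesday" and then the character at index 2 of that string: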

In [167]:
days_of_the_week[2][2]

'e'

Python lists, like C arrays but unlike Fortran arrays, use 0 as the index of the first element. Thus, in this example, the 0 element is "Sunday", 1 is "Monday", and so on. If you need to access the *n*th element from the end of the list, you can use a negative index. For example, the -1 element of a list is the last element and the -2 element is the second to last:

In [168]:
print(days_of_the_week[-2] == days_of_the_week[5])

True


You can add additional items to the list using the .append() command:

In [169]:
languages = ["Fortran","C","C++"]
languages.append("Python")
print (languages)

['Fortran', 'C', 'C++', 'Python']


In [170]:
#languages.remove('Python')

In [171]:
languages

['Fortran', 'C', 'C++', 'Python']

In [172]:
del languages[-1]

In [173]:
languages

['Fortran', 'C', 'C++']

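The **range()** command is a convenient way to make sequences of consecutive numbers. In Python 3 it returns a lazy `range` object rather than a list, which is why the output below is not a list of numbers: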

In [174]:
range(10)

range(0, 10)

Note that range(n) starts at 0 and gives the sequence of integers less than n. If you want to start at a different number, use range(start,stop). Wrapping the result in list() shows the individual numbers:

In [175]:
list(range(2,8))

[2, 3, 4, 5, 6, 7]

The ranges created above have a *step* of 1 between elements. You can also give a fixed step size via a third argument:

In [176]:
evens = range(0,20,2)
evens

range(0, 20, 2)

In [177]:
type(evens)#[3]

range
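A short sketch of what a `range` object offers: it behaves like a sequence without storing its elements, and `list()` turns it into an ordinary list when you need one.

```python
evens = range(0, 20, 2)

# A range is a lazy sequence: it stores start, stop and step, not the items
print(evens)

# It still supports len(), indexing and membership tests
print(len(evens), evens[3], 8 in evens)

# Convert to a list when you need the actual elements
even_list = list(evens)
print(even_list)
```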

Lists do not have to hold the same data type. For example,

In [178]:
["Today",7,99.3,"", evens, days_of_the_week]

['Today',
 7,
 99.3,
 '',
 range(0, 20, 2),
 ['Sunday',
  'Monday',
  'Tuesday',
  'Wednesday',
  'Thursday',
  'Friday',
  'Saturday']]

However, it's good (but not essential) to use lists for similar objects that are somehow logically connected. If you want to group different data types together into a composite data object, it's best to use **tuples**, which we will learn about below.

You can find out how long a list is using the **len()** command:

In [179]:
help(len)

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.



In [180]:
len(evens)

10

## Iteration, Indentation, and Blocks
One of the most useful things you can do with lists is to *iterate* through them, i.e. to go through each element one at a time. To do this in Python, we use the **for** statement:

In [181]:
for day in days_of_the_week:
    print (day)

Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday


This code snippet goes through each element of the list called **days_of_the_week** and assigns it to the variable **day**. It then executes everything in the indented block (in this case only one line of code, the print call) using those variable assignments. When the program has gone through every element of the list, it exits the block.

(Almost) every programming language defines blocks of code in some way. In Fortran, one uses END statements (ENDDO, ENDIF, etc.) to define code blocks. In C, C++, and Perl, one uses curly braces {} to define these blocks.

Python uses a colon (":"), followed by indentation level to define code blocks. Everything at a higher level of indentation is taken to be in the same block. In the above example the block was only a single line, but we could have had longer blocks as well:

In [182]:
for day in days_of_the_week[2:4]:
    statement = "Today is " + day
    print (statement)

Today is Tuesday
Today is Wednesday


The **range()** command is particularly useful with the **for** statement to execute loops of a specified length:

In [183]:
for i in range(20):
    print ("The square of ",i," is ",i*i)
    #print("this")

print("something else")

The square of  0  is  0
The square of  1  is  1
The square of  2  is  4
The square of  3  is  9
The square of  4  is  16
The square of  5  is  25
The square of  6  is  36
The square of  7  is  49
The square of  8  is  64
The square of  9  is  81
The square of  10  is  100
The square of  11  is  121
The square of  12  is  144
The square of  13  is  169
The square of  14  is  196
The square of  15  is  225
The square of  16  is  256
The square of  17  is  289
The square of  18  is  324
The square of  19  is  361
something else


## Slicing
Lists and strings have something in common that you might not suspect: they can both be treated as sequences. You already know that you can iterate through the elements of a list. You can also iterate through the letters in a string:

In [184]:
for letter in "Sunday":
    print (letter)

S
u
n
d
a
y


This is only occasionally useful. Slightly more useful is the *slicing* operation, which you can also use on any sequence. We already know that we can use *indexing* to get the first element of a list:

In [185]:
days_of_the_week[0]

'Sunday'

If we want the list containing the first two elements of a list, we can do this via

In [186]:
days_of_the_week[0:2]

['Sunday', 'Monday']

or simply

In [187]:
days_of_the_week[:2]

['Sunday', 'Monday']

If we want the last two items of the list, we can do this with negative slicing:

In [188]:
days_of_the_week[-2:]

['Friday', 'Saturday']

which is somewhat logically consistent with negative indices accessing the last elements of the list.

You can also slice from the middle of a list:

In [189]:
workdays = days_of_the_week[1:6]
print (workdays)

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']


Since strings are sequences, you can also do this to them:

In [190]:
day = "Monday"
abbreviation = day[:3]
print (abbreviation)

Mon


If we really want to get fancy, we can pass a third element into the slice, which specifies a step length (just like a third argument to the **range()** function specifies the step):

In [191]:
numbers = range(0,40)
evens = numbers[2::2]
evens

range(2, 40, 2)

Note that in this example I was even able to omit the second argument, so that the slice started at index 2, went to the end of the sequence, and took every second element, to generate the even numbers from 2 up to, but not including, 40. Because `numbers` is a range, the slice is a range as well.
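The step works the same way on lists and strings. As a small sketch, a step of 2 takes every other element and a negative step walks through the sequence backwards; the list is redefined here so the snippet runs on its own.

```python
days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

every_other_day = days[::2]     # from start to end, in steps of 2
reversed_days = days[::-1]      # a negative step walks backwards
reversed_word = "Sunday"[::-1]  # strings slice the same way

print(every_other_day)
print(reversed_days[0])
print(reversed_word)
```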

## Booleans and Truth Testing
We have now learned a few data types. We have integers and floating point numbers, strings, and lists, a container that can hold any data type. We have learned to print things out, and to iterate over items in lists. We will now learn about **boolean** variables that can be either True or False.

We invariably need some concept of *conditions* in programming to control branching behavior, to allow a program to react differently to different situations. If it's Monday, I'll go to work, but if it's Sunday, I'll sleep in. To do this in Python, we use a combination of **boolean** variables, which evaluate to either True or False, and **if** statements, that control branching based on boolean values.

In [192]:
if day == "Sunday":
    print ("Sleep in")
else:
    print ("Go to work")

Go to work


(Quick quiz: why did the snippet print "Go to work" here? What is the variable "day" set to?)

Let's take the snippet apart to see what happened. First, note the statement

In [193]:
day == "Sunday"

False

In [194]:
day

'Monday'

If we evaluate it by itself, as we just did, we see that it returns a boolean value, False. The "==" operator performs *equality testing*. If the two items are equal, it returns True, otherwise it returns False. In this case, it is comparing two values, the string "Sunday", and whatever is stored in the variable "day", which, in this case, is the other string "Monday". Since the two strings are not equal to each other, the truth test has the false value.

The if statement that contains the truth test is followed by a code block (a colon followed by an indented block of code). If the boolean is true, it executes the code in that block. Since it is false in the above example, we don't see that code executed.

The first block of code is followed by an **else** statement, which is executed if nothing else in the above if statement is true. Since the value was false, this code is executed, which is why we see "Go to work".

You can compare many data types in Python:

In [195]:
1 == 2

False

In [196]:
50 == 2*25

True

In [197]:
3 < 3.14159

True

In [198]:
1 == 1.0

True

In [199]:
1 != 0

True

In [200]:
1 <= 2

True

In [201]:
1 >= 1

True

We see a few other boolean operators here, all of which should be self-explanatory: less than, equality, non-equality, and so on.

Particularly interesting is the 1 == 1.0 test, which is true, since even though the two objects are different data types (integer and floating point number), they have the same *value*. There is another boolean operator **is**, that tests whether two objects are the same object. For numbers and strings the result depends on how the interpreter happens to reuse objects, so use == to compare values:

In [202]:
float(1.) is 1.0

True

In [203]:
day is 'Wednesday'

False
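The difference between `==` and `is` is easiest to see with lists. In the sketch below, all variable names are illustrative: `alias` is a second name for the same list, while `copy` is a separate list with equal contents.

```python
first = [1, 2, 3]
alias = first          # a second name for the same list object
copy = list(first)     # a new list with equal contents

print(first == copy)   # compares values
print(first is copy)   # compares identity: two different objects
print(first is alias)  # one object with two names

# Because alias and first are one object, a change is visible through both
alias.append(4)
print(first)
print(copy)

# `is` is the idiomatic way to test for None
result = None
print(result is None)
```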

We can do boolean tests on lists as well:

In [204]:
[1,2,3] == [1,2,4]

False

In [205]:
[1,2,3] < [1,2,4]

True

Finally, note that you can also string multiple comparisons together, which can result in very intuitive tests:

In [207]:
hours = 5
0 < hours < 24

True

If statements can have **elif** parts ("else if"), in addition to if/else parts. For example:

In [208]:
if day == "Sunday":
    print ("Sleep in")
elif day == "Saturday":
    print ("Go run")
else:
    print ("Go to work")

Go to work


Of course we can combine if statements with for loops, to make a snippet that is almost interesting:

In [209]:
for day in days_of_the_week:
    statement = "Today is " + day
    print (statement)    
    if day == "Sunday":
        print ("   Sleep in")
    elif day == "Saturday":
        print ("   Do chores")
    else:
        print ("   Go to work")

Today is Sunday
   Sleep in
Today is Monday
   Go to work
Today is Tuesday
   Go to work
Today is Wednesday
   Go to work
Today is Thursday
   Go to work
Today is Friday
   Go to work
Today is Saturday
   Do chores


In [210]:
type(1 == 2)

bool

In [211]:
bool(["This "," is "," a "," list"])

True
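The last cell shows that `bool()` accepts any object, not just comparisons. As a minimal sketch of the rule behind it: empty containers, zero and `None` count as false, and most other values count as true. The variable `shopping_list` is made up for illustration.

```python
# Empty containers, zero and None are falsy
print(bool([]), bool(""), bool(0), bool(None))

# Non-empty containers and non-zero numbers are truthy
print(bool([0]), bool("text"), bool(-1))

# This lets an if statement test a list directly
shopping_list = []
if shopping_list:
    status = "items to buy"
else:
    status = "nothing to buy"
print(status)
```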

## Functions

A function is a block of organized, reusable code that can make your scripts more effective, easier to read, and simpler to manage. You can think of functions as little self-contained programs that perform a specific task and that you can use repeatedly in your code.

During the course we have already used some functions, such as `print()`, which is a built-in function in Python.  

We define our own functions with the **def** statement in Python.

Let’s define our first function called celsiusToFahr

In [212]:
def celsiusToFahr(tempCelsius):
    fahrenheit_value = 9/5 * tempCelsius + 32
    return fahrenheit_value

The function definition opens with the keyword def followed by the name of the function and a list of parameter names in parentheses. The body of the function — the statements that are executed when it runs — is indented below the definition line.

Now let’s try using our function. Calling our self-defined function is no different from calling any other function such as print(). 

In [213]:
freezingPoint =  celsiusToFahr(0)

print('The freezing point of water in Fahrenheit is:', freezingPoint)
print('The boiling point of water in Fahrenheit is:', celsiusToFahr(100))

The freezing point of water in Fahrenheit is: 32.0
The boiling point of water in Fahrenheit is: 212.0


Now that we know how to create a function to convert Celsius to Fahrenheit, let’s create another function called `kelvinsToCelsius`

In [214]:
def kelvinsToCelsius(tempKelvins):
    return tempKelvins - 273.15

And let’s use it in the same way as the earlier one

In [215]:
absoluteZero = kelvinsToCelsius(tempKelvins=0)

print('Absolute zero in Celsius is:', absoluteZero)

Absolute zero in Celsius is: -273.15


What about converting Kelvins to Fahrenheit? We could write out a new formula for it, but we don’t need to. Instead, we can do the conversion using the two functions we have already created and calling those from the function we are now creating

In [216]:
def kelvinsToFahrenheit(tempKelvins):
    '''This function converts kelvin to fahrenheit'''
    tempCelsius = kelvinsToCelsius(tempKelvins)
    tempFahr = celsiusToFahr(tempCelsius)
    return tempFahr

In [217]:
help(kelvinsToFahrenheit)

Help on function kelvinsToFahrenheit in module __main__:

kelvinsToFahrenheit(tempKelvins)
    This function converts kelvin to fahrenheit



Now let’s use the function

In [218]:
absoluteZeroF = kelvinsToFahrenheit(tempKelvins=0)

print('Absolute zero in Fahrenheit is:', absoluteZeroF)

Absolute zero in Fahrenheit is: -459.66999999999996


We've introduced several new features here. First, note that the function itself is defined as a code block (a colon followed by an indented block). This is the standard way that Python delimits things. Next, note that the first line of the body of `kelvinsToFahrenheit` is a single string. This is called a **docstring**, and is a special kind of comment that is available to people using the function through the `help()` function:

In [219]:
help(kelvinsToFahrenheit)

Help on function kelvinsToFahrenheit in module __main__:

kelvinsToFahrenheit(tempKelvins)
    This function converts kelvin to fahrenheit



If you define a docstring for all of your functions, it makes it easier for other people to use them, since they can get help on the arguments and return values of the function.

Finally, note that the functions above do not check their input. Rather than only putting a comment in about which input values are invalid, a function can test its arguments and raise an error or issue a warning when a value is invalid, for example a temperature below absolute zero.
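One possible way to test input values is sketched below. The function `kelvins_to_celsius_checked` is a hypothetical variant written for this illustration, not one of the functions defined earlier, and raising `ValueError` is just one reasonable choice.

```python
def kelvins_to_celsius_checked(temp_kelvins):
    '''Convert Kelvin to Celsius, rejecting values below absolute zero.

    Hypothetical helper for illustration only.
    '''
    if temp_kelvins < 0:
        raise ValueError("temperature in Kelvin cannot be negative")
    return temp_kelvins - 273.15

print(kelvins_to_celsius_checked(273.15))

# An invalid value raises an error that the caller can handle
try:
    kelvins_to_celsius_checked(-5)
except ValueError as error:
    error_message = str(error)
    print("Invalid input:", error_message)
```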

## Two More Data Structures: Tuples and Dictionaries
Before we end the Python overview, I wanted to touch on two more data structures that are very useful (and thus very common) in Python programs.

A **tuple** is a sequence object like a list or a string. It's constructed by grouping a sequence of objects together with commas, either without brackets, or with parentheses:

In [220]:
t = (1,2,'hi',9.0)
t

(1, 2, 'hi', 9.0)

Tuples are like lists, in that you can access the elements using indices:

In [221]:
t[1]

2

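However, tuples are *immutable*: you can't append to them or change their elements. Lists, by contrast, are mutable, which is why the assignment to an element of `days_of_the_week` below works: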

In [222]:
#t.append(7)

In [223]:
days_of_the_week

['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']

In [224]:
days_of_the_week[1]=77
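To see the difference without stopping the notebook, the sketch below attempts the same kind of assignment on a tuple and catches the resulting `TypeError`. The list `items` is an illustrative copy of the tuple.

```python
t = (1, 2, 'hi', 9.0)

# Item assignment on a tuple raises TypeError
try:
    t[1] = 77
except TypeError as error:
    tuple_error = str(error)
    print("TypeError:", tuple_error)

# The same assignment on a list succeeds
items = list(t)
items[1] = 77
print(items)
print(t)   # the tuple itself is unchanged
```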

Tuples are useful anytime you want to group different pieces of data together in an object, but don't want to create a full-fledged class for them. For example, let's say you want to store the name and Cartesian coordinates of some objects in your program. Tuples are a good way to do this:

In [225]:
('Bob',0.0,21.0)

('Bob', 0.0, 21.0)

Again, it's not a necessary distinction, but one way to distinguish tuples and lists is that tuples are a collection of different things, here a name, and x and y coordinates, whereas a list is a collection of similar things, like if we wanted a list of those coordinates:

In [226]:
positions = [
             ('Bob',0.0,21.0),
             ('Cat',2.5,13.1),
             ('Dog',33.0,1.2)
             ]

In [227]:
positions

[('Bob', 0.0, 21.0), ('Cat', 2.5, 13.1), ('Dog', 33.0, 1.2)]

In [228]:
x = 4
y = 5

In [229]:
x, y = 4, 5

In [230]:
x+y

9

Here we used *tuple assignment*: the right-hand side `4, 5` is a tuple, and its elements are unpacked into the named variables on the left. The same works for any tuple with the right number of elements, for example one of the records in `positions`:

    >>> name,x,y = positions[0]

A function can also return multiple values as a tuple, such as (minx,miny), which can then be assigned to two variables by tuple assignment. This makes what would have been complicated code in C++ rather simple.
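As a sketch of returning multiple values, the hypothetical helper `minimum_coordinates` below returns the smallest x and the smallest y found in a list of `(name, x, y)` tuples. The function name and its logic are assumptions made for this illustration.

```python
positions = [('Bob', 0.0, 21.0), ('Cat', 2.5, 13.1), ('Dog', 33.0, 1.2)]

def minimum_coordinates(objects):
    '''Return the smallest x and the smallest y as a tuple (hypothetical helper).'''
    minx = min(x for name, x, y in objects)
    miny = min(y for name, x, y in objects)
    return minx, miny   # the comma builds a tuple

# Unpack one record into three names
name, x, y = positions[0]
print(name, x, y)

# Unpack the returned tuple into two names
minx, miny = minimum_coordinates(positions)
print(minx, miny)
```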

Tuple assignment is also a convenient way to swap variables:

In [231]:
x,y = 1,2
y,x = x,y
x,y

(2, 1)

**Dictionaries** 

Dictionaries are objects known as "mappings" or "associative arrays" in other languages. A list associates an integer index with each of its items:

In [232]:
mylist = [1,2,9,21]

A dictionary generalizes this idea. The index in a dictionary is called the *key*, and the corresponding dictionary entry is the *value*. A dictionary can use (almost) anything as the key. Whereas lists are formed with square brackets [], dictionaries use curly brackets {}:

In [235]:
ages = {"Rick": 46, 
        "Bob": 86,
        "Fred": 21}

print ("Rick's age is ",ages["Rick"])

Rick's age is  46


There's also a convenient way to create dictionaries without having to quote the keys.

In [236]:
dict(Rick=46,Bob=86,Fred=20)

{'Rick': 46, 'Bob': 86, 'Fred': 20}

The **len()** command works on both tuples and dictionaries:

In [237]:
len(t)

4

In [238]:
ages.items()

dict_items([('Rick', 46), ('Bob', 86), ('Fred', 21)])
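The `(key, value)` tuples returned by `items()` combine nicely with tuple assignment in a for loop. The sketch below also shows `len()` on a dictionary and the `get()` method; the entries "Alice" and "Zoe" are made up for illustration.

```python
ages = {"Rick": 46, "Bob": 86, "Fred": 21}

# items() yields (key, value) tuples, which unpack neatly in a for loop
for name, age in ages.items():
    print(name, "is", age)

# Adding or updating an entry uses the same square-bracket syntax
ages["Alice"] = 34        # "Alice" is a made-up entry
print(len(ages))          # number of key/value pairs

# max() over the keys, ranked by their values, finds the oldest person
oldest = max(ages, key=ages.get)
print(oldest)

# get() returns a default instead of raising KeyError for a missing key
missing_age = ages.get("Zoe", "unknown")
print(missing_age)
```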

---
### 1.2. Numpy
<a id="1.2">

Numpy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Load Numpy

In [239]:
import numpy as np

**Creating a Vector**  

Here we use Numpy to create a 1-D Array which we then call a vector.

In [241]:
vector = np.array([1,2,3])
vector

array([1, 2, 3])

In [242]:
type(vector)

numpy.ndarray

**Creating a Matrix**

We create a 2-D array in Numpy and call it a matrix. It contains 2 rows and 3 columns.

In [262]:
matrix = np.array([[1,2,3],[4,5,6]])
print(matrix)

[[1 2 3]
 [4 5 6]]


**Selecting Elements**

When you need to select one or more elements in a vector or matrix:

Select 3rd element of Vector

In [281]:
print(vector[2])

3


Select 2nd row 2nd column

In [282]:
print(matrix[1,1])

5


Select all elements of a vector

In [283]:
print(vector[:])

[1 2 3]


Select everything up to, but not including, the 3rd element

In [266]:
print(vector[:2])

[1 2]


Select all rows and the 2nd column of the matrix

In [267]:
print(matrix[:,1:2])

[[2]
 [5]]


**Describing a Matrix**

When you want to know about the shape, size and dimensions of a matrix.

In [268]:
matrix = np.array([[1,3,3],
                   [4,5,6],
                   [7,8,9]
])

In [269]:
print(matrix.shape)
print(matrix.size)
print(matrix.ndim)

(3, 3)
9
2


What do these values correspond to?

**Applying operations to elements** 

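You want to apply some function to multiple elements in an array.
Arithmetic such as `x+100` is applied element by element when `x` is a Numpy array, so the functions below work on a whole matrix directly. For functions that only accept single values, Numpy’s vectorize class converts the function into one that can be applied to every element in an array or slice of an array.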

Create a function that adds 100 to something

In [270]:
add_100 = lambda x: x+100

In [271]:
def my(i):
    value = i+100
    return value    

Apply function to all elements in matrix

In [272]:
add_100(matrix)

array([[101, 103, 103],
       [104, 105, 106],
       [107, 108, 109]])
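The call above works without `np.vectorize` because `x+100` is plain arithmetic. The sketch below shows a case where `np.vectorize` is actually needed: the hypothetical function `collatz_step` contains an `if` statement, which only works on single values.

```python
import numpy as np

matrix = np.array([[1, 3, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

def collatz_step(value):
    '''Scalar-only function: the if statement cannot take a whole array.'''
    if value % 2 == 0:
        return value // 2
    return 3 * value + 1

# np.vectorize wraps the scalar function so it is called once per element
vectorized_step = np.vectorize(collatz_step)
stepped = vectorized_step(matrix)
print(stepped)

# Plain arithmetic needs no wrapper: NumPy applies it element by element
shifted = matrix + 100
print(shifted)
```

Note that `np.vectorize` is a convenience and still calls the Python function for every element, so it is not faster than a loop.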

**Finding the max and min values**

We use Numpy’s max and min functions:

In [273]:
print(np.max(matrix))
print(np.min(matrix))
print(np.max(matrix,axis=0))
print(np.max(matrix,axis=1))

9
1
[7 8 9]
[3 6 9]


What do these values correspond to?
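As a hint, the `axis` argument names the dimension that is collapsed by the operation. The sketch below redefines the matrix so that it runs on its own; the variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 3, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# axis=0 collapses the rows, leaving one result per column
column_max = matrix.max(axis=0)
# axis=1 collapses the columns, leaving one result per row
row_max = matrix.max(axis=1)

print(column_max)
print(row_max)

# The same convention applies to other reductions such as sum()
column_sum = matrix.sum(axis=0)
print(column_sum)
```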

**Reshaping Arrays**

When you want to reshape an array (changing the number of rows and columns) without changing the elements.

In [274]:
print(matrix.reshape(9,1))

[[1]
 [3]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]
 [9]]


In [275]:
print(matrix.flatten())

[1 3 3 4 5 6 7 8 9]


**Operations with Matrices**

Adding, Subtracting and Multiplying

In [276]:
matrix_2 = np.array([[7,8,9],[4,5,6],[1,2,3]])

Add and subtract 2 matrices

In [277]:
print(np.add(matrix,matrix_2))

[[ 8 11 12]
 [ 8 10 12]
 [ 8 10 12]]


In [278]:
print(np.subtract(matrix,matrix_2))

[[-6 -5 -6]
 [ 0  0  0]
 [ 6  6  6]]


**Multiplication Element wise, and Dot Product**

In [279]:
print(matrix*matrix_2)
print(matrix@matrix_2)

[[ 7 24 27]
 [16 25 36]
 [ 7 16 27]]
[[ 22  29  36]
 [ 54  69  84]
 [ 90 114 138]]



---
#### *Learn more about Python fundamentals at [dataquest.io](https://www.dataquest.io/blog/web-scraping-beautifulsoup/). You can check some differences and similarities between syntaxes at [NumPy for Matlab users](https://numpy.org/doc/1.18/user/numpy-for-matlab-users.html); the document is also available [here](https://mas-dse.github.io/DSE200/cheat_sheets/1_python/6_2_NumPy_for_MATLAB_users.pdf). Some extra Python blogs are on [towardsdatascience.com](https://towardsdatascience.com/a-beginners-guide-to-python-for-data-science-60ef022b7b67).*