# Introduction to Python

<font size="3"> 

- General considerations
- Python basics
- Lists
- Functions and packages
- NumPy
    
</font> 

# General considerations

## Data Science Process

![title](imgs/ds_process2.png)

## Python

<font size="3"> 

- Created by Guido van Rossum   
    <https://gvanrossum.github.io/>   
    <https://twitter.com/gvanrossum?lang=en>
- General Purpose: build anything
- Open Source! Free!
- Python Packages, also for Data Science
    - Many applications and fields
- Version 3.x <https://www.python.org/downloads/>.
    
</font> 




## Python Data Types


<font size="3"> 
    
- **float** - real numbers
- **int** - integer numbers
- **str** - string, text
- **bool** - True, False
    
</font>

## Variables and Types

- Specific, case-sensitive name
- Call up the value through the variable name
- Example: a height of 1.80 m and a weight of 79.8 kg

- PEP 8 -- Style Guide for Python Code

<https://www.python.org/dev/peps/pep-0008/#function-and-variable-names>

In [1]:
height = 1.80

weight = 79.8

print(height)
print(weight)

1.8
79.8
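A minimal sketch of what "case-sensitive" means in practice. The names `Height` and `body_height_m` are illustrative assumptions and do not appear in the original notebook.

```python
# Variable names are case-sensitive: height and Height are two different names.
height = 1.80
Height = 180  # hypothetical second variable, shown only to illustrate the point

print(height)  # 1.8
print(Height)  # 180

# PEP 8 recommends lowercase names with underscores for variables.
body_height_m = 1.80

# Using a name that was never assigned raises a NameError.
try:
    print(HEIGHT)
    name_error_raised = False
except NameError:
    name_error_raised = True

print(name_error_raised)  # True
```

Keeping one naming style throughout a notebook makes this kind of mistake easier to spot.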


## Calculate BMI

![title](imgs/bmi.png)

In [4]:
height = 1.80

weight = 79.8

bmi = weight / height ** 2

print(bmi)

24.629629629629626


## Comments

In [5]:
# bmi

#round(bmi, 2)

print(round(bmi, 2))

24.63


## Python Types

In [7]:
type(bmi)

float

In [10]:
# Help
?type

#help(type)

In [11]:
day_of_week = 5

type(day_of_week)

int

## Python Types (2)

In [12]:
x = "body mass index"

y = 'this works too'

In [13]:
x

'body mass index'

In [15]:
type(y)

str

In [16]:
z = True

type(z)

bool

## Python Types (3)

In [17]:
2 + 3

5

In [18]:
'ab' + 'cd'

'abcd'

<font size="5"> Different type = different behavior!</font> 
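The point about types and behavior can be checked with a few more combinations. This is a small sketch; the values are arbitrary and only the built-in `+`, `str()` and `int()` are used.

```python
# The + operator depends on the operand types.
print(2 + 3)        # 5
print('ab' + 'cd')  # abcd
print(2.5 + 1)      # 3.5 (an int and a float give a float)
print(True + True)  # 2 (bool values behave like the integers 1 and 0)

# Mixing a str and an int with + is not defined and raises a TypeError.
try:
    result = 'ab' + 2
except TypeError as err:
    result = None
    print("TypeError:", err)

# Converting explicitly makes the intent clear.
print('ab' + str(2))  # ab2
print(int('2') + 2)   # 4
```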

# Data Structures

### Python Lists

**A Python list is an ordered, mutable collection that stores multiple items in a single variable and allows duplicate values.**
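Each property in that definition can be observed directly. The sketch below uses a hypothetical `heights` list whose values are chosen only for illustration.

```python
# Hypothetical example values, chosen only to illustrate the definition.
heights = [1.73, 1.68, 1.73]

# Ordered: items keep their insertion order and are reached by position.
print(heights[0])  # 1.73

# Allows duplicates: 1.73 appears twice.
print(heights.count(1.73))  # 2

# Mutable: an item can be replaced in place, and the list can grow.
heights[1] = 1.70
heights.append(1.89)
print(heights)  # [1.73, 1.7, 1.73, 1.89]
```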


## Data Types


<font size="3"> 
    
- **float** - real numbers
- **int** - integer numbers
- **str** - string, text
- **bool** - True, False

**Each variable represents a single value**

</font>

<https://docs.python.org/3/tutorial/datastructures.html>

## Problem

Data science involves many data points, for example the heights of an entire family:

In [19]:
height1 = 1.73
height2 = 1.68
height3 = 1.71
height4 = 1.89

## Python List

In [20]:
fam = [1.73, 1.68, 1.71, 1.89]

In [21]:
fam

[1.73, 1.68, 1.71, 1.89]

In [22]:
type(fam)

list

- Names a collection of values
- Can contain any type
- Can contain different types at once

In [23]:
# list
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fam

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [24]:
type(fam)

list

In [26]:
# -- list of lists
fam2 = [
        ["liz", 1.73], 
        ["emma", 1.68], 
        ["mom", 1.71], 
        ["dad", 1.89]
       ]

fam2

[['liz', 1.73], ['emma', 1.68], ['mom', 1.71], ['dad', 1.89]]

In [27]:
type(fam2)

list

- Specific functionality
- Specific behavior

## Subsetting Lists

In [28]:
# list
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fam

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

![title](imgs/index.png)

In [29]:
fam[3]

1.68

In [30]:
fam[6]

'dad'

In [31]:
fam[7]

1.89

In [32]:
fam[-1]

1.89

![title](imgs/index2.png)

## List slicing

In [35]:
fam[3:5]

[1.68, 'mom']

<font size="3"> 
    
**[start:end]**

**inclusive:exclusive**

</font>

In [36]:
fam[1:4]

[1.73, 'emma', 1.68]

In [37]:
fam[:5]

['liz', 1.73, 'emma', 1.68, 'mom']

In [38]:
fam[5:]

[1.71, 'dad', 1.89]
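The slicing rules above can be verified with a short sketch. The step form `fam[1::2]` goes slightly beyond what the notebook shows; it is standard Python slicing and is included as an optional extra.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# The start index is included, the end index is not.
print(fam[3:5])       # [1.68, 'mom']
print(len(fam[3:5]))  # 2, which is end - start

# Splitting at the same index never loses or repeats an element.
print(fam[:5] + fam[5:] == fam)  # True

# Negative indices count from the end; the last two items:
print(fam[-2:])  # ['dad', 1.89]

# A third value is the step; every second item starting at index 1
# gives the heights only.
print(fam[1::2])  # [1.73, 1.68, 1.71, 1.89]
```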

## Manipulating Lists
 


<font size="3"> 
    
List Manipulation
- Change list elements
- Add list elements
- Remove list elements

</font>

### Changing list elements

In [39]:
# list
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fam

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [41]:
# -- Assigning a new value
fam[7] = 1.86

fam

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86]

![title](imgs/index.png)

In [42]:
# -- Assigning new values to a slice

fam[0:2] = ["lisa", 1.84]

fam

['lisa', 1.84, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86]

## Adding and removing elements

In [1]:
# list
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fam

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [2]:
# -- Concatenation returns a new list; fam itself is not modified
fam + ["Alan", 1.81]


['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'Alan', 1.81]

In [3]:
print(fam)

fam_ext = fam + ["Alan", 1.81]

fam_ext

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]


['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'Alan', 1.81]

In [4]:
#print(fam)

# -- Removing emma

del fam[2]

fam

['liz', 1.73, 1.68, 'mom', 1.71, 'dad', 1.89]

![title](imgs/index.png)

In [5]:
print(fam)

# -- Removing 1.68

del fam[2]

fam

['liz', 1.73, 1.68, 'mom', 1.71, 'dad', 1.89]


['liz', 1.73, 'mom', 1.71, 'dad', 1.89]
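The cells above use `+` and `del`. As a complementary sketch, the built-in list methods `extend()`, `remove()` and `pop()` do similar work in place. They are not used in the original notebook and are shown here only as alternatives.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# + builds a new list and leaves fam unchanged.
fam_ext = fam + ["Alan", 1.81]
print(len(fam), len(fam_ext))  # 8 10

# extend() adds the items to the existing list in place.
fam.extend(["Alan", 1.81])
print(fam == fam_ext)  # True

# Deleting a slice removes a name and its height in one step,
# which avoids the index shift seen with two separate del statements.
del fam[2:4]
print(fam)  # ['liz', 1.73, 'mom', 1.71, 'dad', 1.89, 'Alan', 1.81]

# remove() deletes the first item equal to a value; pop() removes by
# position (the last item by default) and returns the removed item.
fam.remove("Alan")
last = fam.pop()
print(last)  # 1.81
print(fam)   # ['liz', 1.73, 'mom', 1.71, 'dad', 1.89]
```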

# Functions
 

<font size="3"> 
    
- Example seen so far: **type()**
- Piece of reusable code
- Solves particular task
- Call function instead of writing code yourself

</font>
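To make "piece of reusable code" concrete, here is a minimal sketch that wraps the earlier BMI formula in a function. The name `compute_bmi` and its parameter names are hypothetical and not taken from the notebook.

```python
def compute_bmi(weight_kg, height_m):
    """Return the body mass index for a weight in kg and a height in m."""
    return weight_kg / height_m ** 2


# The same code is reused for different inputs instead of being rewritten.
bmi_a = compute_bmi(79.8, 1.80)
bmi_b = compute_bmi(65.4, 1.73)

print(round(bmi_a, 2))  # 24.63
print(round(bmi_b, 2))  # 21.85
```

Built-in functions such as `type()`, `max()` and `round()` are called the same way; the only difference is that someone else has already written them.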

### Example

In [54]:
fam = [1.73, 1.68, 1.71, 1.89]

fam

[1.73, 1.68, 1.71, 1.89]

In [55]:
#-- Return its biggest item

max(fam)

1.89

In [56]:
#-- Asking for help

help(max)

Help on built-in function max in module builtins:

max(...)
    max(iterable, *[, default=obj, key=func]) -> value
    max(arg1, arg2, *args, *[, key=func]) -> value
    
    With a single iterable argument, return its biggest item. The
    default keyword-only argument specifies an object to return if
    the provided iterable is empty.
    With two or more arguments, return the largest argument.



In [57]:
help(round)

Help on built-in function round in module builtins:

round(number, ndigits=None)
    Round a number to a given precision in decimal digits.
    
    The return value is an integer if ndigits is omitted or None.  Otherwise
    the return value has the same type as the number.  ndigits may be negative.
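The help text for `round` can be checked with a short sketch that reuses the BMI value from earlier in the notebook.

```python
bmi = 79.8 / 1.80 ** 2

# Without ndigits the result is an int.
print(round(bmi))        # 25
print(type(round(bmi)))  # <class 'int'>

# With ndigits the result keeps the type of the input (here float).
print(round(bmi, 2))        # 24.63
print(type(round(bmi, 2)))  # <class 'float'>

# A negative ndigits rounds to tens, hundreds, and so on.
print(round(1234, -2))  # 1200
```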



## Built-in Functions

### Methods

<font size="3"> 
    
- Functions that belong to objects
- Everything = object
- Objects have associated methods, depending on their type

</font>

### list methods

In [58]:
# list
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fam

['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]

In [59]:
type(fam)

list

In [60]:
# fam.index()


list.index(x[, start[, end]])   

Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item.    

<https://docs.python.org/3/tutorial/datastructures.html>

In [61]:
#-- "Call method index() on fam"

fam.index("mom")

4
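The documented behavior of `index()`, including the `ValueError` for a missing item and the optional `start` argument, can be sketched as follows. The lookup value `"alan"` is a hypothetical name that is not in the list.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# index() returns the position of the first match.
position = fam.index("mom")
print(position)           # 4
print(fam[position + 1])  # 1.71, the height stored right after the name

# Searching for a missing value raises a ValueError.
try:
    fam.index("alan")
    found = True
except ValueError:
    found = False
print(found)  # False

# The in operator checks membership without raising.
print("alan" in fam)  # False

# The optional start argument limits the search to later positions.
print([1, 2, 1, 2].index(2, 2))  # 3
```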

![title](imgs/index.png)

In [62]:
#-- "Count how many times 1.73 appears in fam"

fam.count(1.73)

1

list.count(x)   

Return the number of times x appears in the list.    

<https://docs.python.org/3/tutorial/datastructures.html>

In [63]:
sister = "liz"

sister

'liz'

In [64]:
sister.capitalize()

'Liz'

In [65]:
sister.replace("z", "sa")

'lisa'

In [66]:
sister

'liz'
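The last cell shows that `sister` is unchanged after `replace()`. A small sketch of why: string methods return a new string, while list methods such as `append()` modify the list itself. The variable names `renamed` and `returned` are illustrative.

```python
sister = "liz"

# String methods return a new string; the original is left unchanged.
renamed = sister.replace("z", "sa")
print(renamed)  # lisa
print(sister)   # liz

# To keep the result, assign it to a name.
sister = sister.capitalize()
print(sister)  # Liz

# List methods such as append() change the list in place and return None.
fam = ["liz", 1.73]
returned = fam.append("emma")
print(returned)  # None
print(fam)       # ['liz', 1.73, 'emma']

# A method only exists on the types that define it.
print(hasattr(sister, "capitalize"))  # True
print(hasattr(fam, "capitalize"))     # False
```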

## Packages



<font size="3"> 

- To leverage the code that Python developers have written, you'll learn about using functions, methods and packages. 
- This will help you to reduce the amount of code you need to solve challenging problems!
- Thousands of packages available
    - Numpy
    - Matplotlib
    - Scikit-learn

</font>

### Install package


<font size="3"> 

- **pip is already installed if you are using Python 2 >=2.7.9 or Python 3 >=3.4**
    
    - http://pip.readthedocs.org/en/stable/installing/
    - Otherwise, download get-pip.py and run it in the terminal:
     - python3 get-pip.py

- Installing packages
     - pip3 install numpy

</font>


### Import package

<font size="3"> 

- Installing Python Packages from a Jupyter Notebook. 
    - https://jakevdp.github.io/blog/2017/12/05/installing-python-packages-from-jupyter/

</font>

In [None]:
# Install a pip package in the current Jupyter kernel

#import sys
#!{sys.executable} -m pip install numpy

In [67]:
#-- Importing package by name

import numpy

In [None]:
#numpy.

In [68]:
numpy.array([1, 2, 3])

array([1, 2, 3])

In [69]:
#-- Importing a package under an alias

import numpy as np

In [70]:
#np.

In [71]:
np.array([1, 2, 3])

array([1, 2, 3])

In [72]:
# -- Importing specific functions

from numpy import array

In [73]:
array([1, 2, 3])

array([1, 2, 3])
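The three import forms give different names to the same objects. The sketch below, which assumes NumPy is installed, checks this with the `is` operator.

```python
import numpy
import numpy as np
from numpy import array

# All three names refer to the same function object.
print(numpy.array is np.array)  # True
print(np.array is array)        # True

# The alias and the full name refer to the same module.
print(np is numpy)  # True

values = array([1, 2, 3])
print(type(values).__name__)  # ndarray
```

The `np` alias is the form most NumPy code uses, because it keeps the origin of each function visible while staying short.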

### Package documentation

- <https://numpy.org/>

- <https://numpy.org/install/>

- <https://numpy.org/doc/stable/>

# NumPy
 

<font size="3"> 

- Numeric Python
- Alternative to the Python list: the NumPy array
- Calculations over entire arrays
- Easy and Fast
- Installation
    - In the terminal: pip3 install numpy

</font>

In [74]:
import numpy as np

In [77]:
height = [1.73, 1.68, 1.71, 1.89, 1.79]

weight = [65.4, 59.2, 63.6, 88.4, 68.7]

type(height)

list

In [81]:
np_height = np.array(height)

print(np_height)
type(np_height)


[1.73 1.68 1.71 1.89 1.79]


numpy.ndarray

In [82]:
np_weight = np.array(weight)

print(np_weight)
type(np_weight)

[65.4 59.2 63.6 88.4 68.7]


numpy.ndarray

In [83]:
bmi = np_weight / np_height ** 2

bmi

array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])
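For comparison, here is a sketch of the same calculation with plain lists. The names `bmi_list` and `bmi_array` are illustrative; the data are the values used above.

```python
import numpy as np

height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Plain lists do not support element-wise arithmetic.
try:
    weight / height
    list_division_works = True
except TypeError:
    list_division_works = False
print(list_division_works)  # False

# With lists, the calculation needs an explicit loop over pairs.
bmi_list = [w / h ** 2 for w, h in zip(weight, height)]

# With arrays, the same formula applies to every element at once.
bmi_array = np.array(weight) / np.array(height) ** 2

print(np.allclose(bmi_list, bmi_array))  # True
```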

### Different types: different behavior!

In [84]:
python_list = [1, 2, 3]

python_list

[1, 2, 3]

In [85]:
numpy_array = np.array([1, 2, 3])

numpy_array

array([1, 2, 3])

In [86]:
python_list + python_list

[1, 2, 3, 1, 2, 3]

In [87]:
numpy_array + numpy_array

array([2, 4, 6])

### NumPy Subsetting

In [88]:
bmi

array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])

In [89]:
# -- Subsetting

bmi[1]

20.97505668934241

In [90]:
# -- Array of booleans

bmi > 23

array([False, False, False,  True, False])

In [91]:
# -- Subsetting with conditions

bmi[bmi > 23]

array([24.7473475])
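A boolean array can be stored and reused. The sketch below applies one mask to a second array and combines two conditions; the names `mask` and `in_range` and the thresholds 21 and 22 are illustrative assumptions.

```python
import numpy as np

np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2

# The comparison yields one boolean per element.
mask = bmi > 23
print(mask.tolist())  # [False, False, False, True, False]
print(mask.sum())     # 1, since True counts as 1

# The same mask can select from any array of the same length.
print(np_height[mask].tolist())  # [1.89]

# Conditions combine with & (and) and | (or); each needs parentheses.
in_range = (bmi > 21) & (bmi < 22)
print(in_range.tolist())  # [True, False, True, False, True]
```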

### 2D NumPy Arrays
 

In [92]:
#import numpy as np

np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])

type(np_height)

numpy.ndarray

In [93]:
np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

np_2d

array([[ 1.73,  1.68,  1.71,  1.89,  1.79],
       [65.4 , 59.2 , 63.6 , 88.4 , 68.7 ]])

In [94]:
np_2d.shape  # 2 rows, 5 columns

(2, 5)

In [95]:
# -- Numeric values and one string

np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
          [65.4, 59.2, 63.6, 88.4, "68.7"]])

array([['1.73', '1.68', '1.71', '1.89', '1.79'],
       ['65.4', '59.2', '63.6', '88.4', '68.7']], dtype='<U32')

A NumPy array holds a single type, so the one string forces every element to be stored as a string.
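The single-type rule can be inspected through the array's `dtype` attribute. This is a small sketch with made-up values; `astype(float)` is shown as one way to convert numeric strings back to numbers.

```python
import numpy as np

# Integers and floats together are stored as floats.
mixed_numbers = np.array([1, 2.5, 3])
print(mixed_numbers.dtype)  # float64

# One string turns every element into a string.
with_text = np.array([1.73, "68.7"])
print(with_text.dtype.kind)  # U (Unicode string)

# Converting back explicitly restores numeric values.
as_numbers = with_text.astype(float)
print(as_numbers.tolist())  # [1.73, 68.7]
```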

### Subsetting

![title](imgs/subsetting_2array.png)

In [None]:
np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

np_2d

In [None]:
np_2d[0]

In [None]:
np_2d[0][2]

In [None]:
np_2d[1,4]
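The cells above have no recorded output, so here is a sketch of what each kind of 2D subsetting returns. The column slice `[:, 1:3]` is an extra case not in the notebook; `.tolist()` is used only to print plain Python values.

```python
import numpy as np

np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

# A single index selects a whole row.
print(np_2d[0].tolist())  # [1.73, 1.68, 1.71, 1.89, 1.79]

# Both forms reach row 0, column 2.
print(np_2d[0][2] == np_2d[0, 2])  # True

# Row 1, column 4 is the last weight.
print(float(np_2d[1, 4]))  # 68.7

# A colon keeps every row; 1:3 keeps columns 1 and 2.
print(np_2d[:, 1:3].tolist())  # [[1.68, 1.71], [59.2, 63.6]]

# One full column comes back as a 1D array.
print(np_2d[:, 0].shape)  # (2,)
```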

### NumPy: Basic Statistics
 

<font size="3"> 

- Get to know your data
- Little data -> simply look at it
- Big data -> ?

</font>

### Generate data

In [None]:
#import numpy as np

#help(np.round)

In [None]:
#help(np.random.normal)

<https://numpy.org/doc/stable/reference/random/generated/numpy.random.normal.html>

In [None]:
height = np.round(np.random.normal(1.75, 0.20, 5000), 2)

height

In [None]:
#Number of elements in the array.

height.size

In [None]:
weight = np.round(np.random.normal(60.32, 15, 5000), 2)

weight

In [None]:
np_city = np.column_stack((height, weight))

np_city

In [None]:
np_city.size

In [None]:
#help(np.mean)

In [None]:
np.mean(np_city[:,0])

<https://numpy.org/doc/stable/reference/generated/numpy.mean.html>

In [None]:
#help(np.median)

In [None]:
np.median(np_city[:,0])

<https://numpy.org/doc/stable/reference/generated/numpy.median.html>
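A repeatable variant of the workflow above, as a sketch. It swaps `np.random.normal` for a seeded generator from `np.random.default_rng`, so every run gives the same numbers; the seed 42 is arbitrary. `np.std` and the `axis=0` argument are additions that the notebook does not use.

```python
import numpy as np

# A seeded generator makes the random draws repeatable (seed value is arbitrary).
rng = np.random.default_rng(42)

height = np.round(rng.normal(1.75, 0.20, 5000), 2)
weight = np.round(rng.normal(60.32, 15, 5000), 2)
np_city = np.column_stack((height, weight))

print(np_city.shape)  # (5000, 2)
print(np_city.size)   # 10000, rows times columns

# Column 0 holds heights, column 1 holds weights.
mean_height = np.mean(np_city[:, 0])
median_height = np.median(np_city[:, 0])
std_height = np.std(np_city[:, 0])

# axis=0 computes one statistic per column in a single call.
column_means = np.mean(np_city, axis=0)

print(mean_height, median_height, std_height)
print(column_means)
```

With 5000 samples, the mean and standard deviation should land close to the parameters passed to `normal` (1.75 and 0.20 for height), which is a quick sanity check on generated data.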

## Student exercise

[Tutorial for accessing the course](https://docs.google.com/presentation/d/1hz3ot-Kr30fvMTZMmK24O6pn7kclHWH1_oJu4jV_YIs/edit?ts=60cab5fe#slide=id.gddd0b6f1c4_2_0)

[Introduction to Python](https://www.datacamp.com/community/open-courses/introduccion-a-python)