# Data Science with Python - More Topics
---

# Introduction

> Python is a high-level programming language that supports multiple programming paradigms. It is an open source project and since its inception in 1991, it has become one of the most popular interpreted programming languages.
> 
> In recent years, Python has developed an active scientific computing and data analysis community and has stood out as one of the most relevant languages for data science and machine learning, both in academia and in industry.

## Installation and Development Environment

### Local Installation

> https://www.python.org/downloads/ or https://www.anaconda.com/distribution/

### Google Colaboratory

> https://colab.research.google.com

### Check Python version

In [None]:
!python -V      # OR !python --version

# Math Operations

### Arithmetic operators: $+$, $-$, $*$, $/$, $**$, $\%$, $//$

### Sum ($+$)

In [None]:
2 + 2

### Subtraction ($-$)

In [None]:
2 - 2

### Product ($*$)

In [None]:
2 * 3

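### Division ($/$) and ($//$)

> The division operator $/$ always returns a floating point number. The floor division operator $//$ rounds the quotient down and returns an integer when both operands are integers.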

In [None]:
10 / 3

In [None]:
10 // 3
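As a small sketch of the difference between the two operators (the variable names below are illustrative and not part of the original notebook): `/` yields a float even when the quotient is exact, while `//` rounds toward negative infinity.

```python
# True division: the result is a float, even for an exact quotient
true_quotient = 10 / 2
print(true_quotient)      # 5.0

# Floor division: the quotient is rounded down; int operands give an int
floor_quotient = 10 // 3
print(floor_quotient)     # 3

# Rounding goes toward negative infinity, not toward zero
negative_floor = -10 // 3
print(negative_floor)     # -4

# If either operand is a float, the floored result is also a float
float_floor = 10.0 // 3
print(float_floor)        # 3.0
```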

### Exponentiation ($**$)

In [None]:
2 ** 3

### Modulus Operator ($\%$)

In [None]:
10 % 3

In [None]:
10 % 2
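The remainder is tied to floor division: the quotient times the divisor, plus the remainder, gives back the dividend. A minimal sketch using the built-in `divmod()` (the variable names are illustrative):

```python
dividend, divisor = 10, 3

# divmod() returns the floor quotient and the remainder in one call
quotient, remainder = divmod(dividend, divisor)
print(quotient, remainder)    # 3 1

# The dividend can be rebuilt from the two results
rebuilt = quotient * divisor + remainder
print(rebuilt == dividend)    # True

# A common use of %: testing whether a number is even
is_even = 10 % 2 == 0
print(is_even)                # True
```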

### Mathematical Expressions

In [None]:
5 * 2 + 3 * 2

In [None]:
(5 * 2) + (3 * 2)

In [None]:
5 * (2 + 3) * 2
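The expressions above depend on operator precedence: `**` binds tighter than `*`, `/`, `//` and `%`, which in turn bind tighter than `+` and `-`. A short sketch with illustrative variable names:

```python
# Multiplication is evaluated before addition
without_parentheses = 5 * 2 + 3 * 2
print(without_parentheses)    # 16

# Parentheses change the evaluation order
grouped = 5 * (2 + 3) * 2
print(grouped)                # 50

# Exponentiation is right-associative: 2 ** (3 ** 2)
power_chain = 2 ** 3 ** 2
print(power_chain)            # 512

# ** binds tighter than the unary minus: -(2 ** 2)
negated_power = -2 ** 2
print(negated_power)          # -4
```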

### The variable _

> In interactive mode, the last printed result is assigned to the variable _

In [None]:
5 * 2

In [None]:
_ + 3 * 2

In [None]:
_ / 2

# Variables 

### Variable names

- Variable names can start with letters (a - z, A - Z) or the underscore character (_):

  > height
  > 
  > _weight

- The rest of the name can contain letters, numbers and the "_" character:

  > variable_name
  > 
  > _value
  > 
  > day_28_11_

- The names are *case sensitive*:

  > Variable_Name $\ne$ variable_name $\ne$ VARIABLE_NAME
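One way to check these naming rules is the string method `str.isidentifier()`. The candidate names below are made up for illustration. Note that this method also returns `True` for reserved words, so it only checks the character rules.

```python
candidates = ['height', '_weight', 'day_28_11_', '2fast', 'km-total']

valid_names = []
for candidate in candidates:
    # True when the text follows the character rules for a name
    if candidate.isidentifier():
        valid_names.append(candidate)

print(valid_names)    # ['height', '_weight', 'day_28_11_']
```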

### Note:

- There are some **reserved words** in the language that **cannot** be used as variable names:

| |Reserved Word List in Python| |
|:-------------:|:------------:|:-------------:|
| and           | as           | not           | 
| assert        | finally      | or            | 
| break         | for          | pass          | 
| class         | from         | nonlocal      | 
| continue      | global       | raise         | 
| def           | if           | return        | 
| del           | import       | try           | 
| elif          | in           | while         | 
| else          | is           | with          | 
| except        | lambda       | yield         | 
| False         | True         | None          | 
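The exact list of reserved words depends on the Python version in use (for example, `async` and `await` are reserved words from Python 3.7 on). As a sketch, the standard library module `keyword` reports the list for the running interpreter:

```python
import keyword

# Full list of reserved words for the running interpreter
print(keyword.kwlist)

# Check individual names
lambda_is_reserved = keyword.iskeyword('lambda')
height_is_reserved = keyword.iskeyword('height')
print(lambda_is_reserved)    # True
print(height_is_reserved)    # False
```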

### Declaration of Variables

### Assignment Operators: $=$, $+=$, $-=$, $*=$, $/=$, $**=$, $\%=$, $//=$

In [2]:
year_current = 2019
year_manufacture = 2003
km_total = 44410.0

In [None]:
year_current

In [None]:
year_manufacture

In [None]:
km_total

# $$km_{avg} = \frac {km_{total}}{(year_{current} - year_{manufacture})}$$

### Operations with Variables

In [None]:
km_avg = km_total / (year_current - year_manufacture)
km_avg

In [None]:
year_current = 2019
year_manufacture = 2003
km_total = 44410.0
km_avg = km_total / (year_current - year_manufacture)
km_avg

In [None]:
year_current = 2019
year_manufacture = 2003
km_total = 44410.0
km_avg = km_total / (year_current - year_manufacture)

km_total = km_total + km_avg
km_total

In [None]:
year_current = 2019
year_manufacture = 2003
km_total = 44410.0
km_avg = km_total / (year_current - year_manufacture)

km_total += km_avg
km_total

### Conclusion:
```
"value = value + 1" is equivalent to "value += 1"
```
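The same equivalence holds for the other assignment operators listed earlier. A minimal sketch, using an illustrative variable `value`:

```python
value = 10

value += 5     # value = value + 5   -> 15
value -= 3     # value = value - 3   -> 12
value *= 2     # value = value * 2   -> 24
value //= 5    # value = value // 5  -> 4
value **= 2    # value = value ** 2  -> 16
value %= 7     # value = value % 7   -> 2
value /= 2     # value = value / 2   -> 1.0 (true division gives a float)

print(value)   # 1.0
```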

### Multiple Declaration

In [10]:
year_current, year_manufacture, km_total = 2019, 2003, 44410.0

In [None]:
year_current

In [None]:
year_manufacture

In [None]:
km_total

In [None]:
year_current, year_manufacture, km_total = 2019, 2003, 44410.0
km_avg = km_total / (year_current - year_manufacture)
km_avg
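Multiple assignment evaluates the whole right-hand side first, so it can also swap values without a temporary variable. The sketch below assumes, for illustration only, that the two years were entered in the wrong order. It also shows that the number of names and values must match.

```python
# Values entered in the wrong order (illustrative scenario)
year_current, year_manufacture = 2003, 2019

# Swap: the right-hand side is evaluated before any name is rebound
year_current, year_manufacture = year_manufacture, year_current
print(year_current, year_manufacture)    # 2019 2003

# A mismatch between names and values raises ValueError
try:
    first, second = 1, 2, 3
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

print(mismatch_detected)    # True
```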

# Data Types

Data types specify how numbers and characters will be stored and manipulated in a program. Python's basic data types are:

- **Numbers**
    - ***int*** - integers
    - ***float*** - floating point
- **Booleans** - Take the value True or False. Essential when we start working with conditional statements
- **Strings** - A sequence of one or more characters that can include letters, numbers, and other types of characters. Represents a text.
- **None** - represents the absence of a value


### Numbers

In [15]:
year_current = 2019

In [None]:
type(year_current)

In [17]:
km_total = 44410.0

In [None]:
type(km_total)

### Booleans

In [19]:
zero_km = True

In [20]:
type(zero_km)

bool

In [21]:
zero_km = False

In [22]:
type(zero_km)

bool

### Strings

In [None]:
name = 'Jetta Variant'
name

In [None]:
name = "Jetta Variant"
name

In [None]:
name = 'Jetta "Variant"'
name

In [None]:
name = "Jetta 'Variant'"
name

In [23]:
car = '''
  Name
  Age
  Grade
'''

In [None]:
type(car)

### None

In [25]:
mileage = None
mileage

In [None]:
type(mileage)
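Since `None` is a single object, the usual way to test for it is the identity operator `is`. A small sketch, where the `status` text is made up for illustration:

```python
mileage = None

# Identity test: is the value the None object?
if mileage is None:
    status = 'mileage not informed'
else:
    status = f'{mileage} km'

print(status)                    # mileage not informed
print(type(mileage).__name__)    # NoneType
```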

# Type Conversion

In [28]:
a = 10
b = 20
c = 'Python is '
d = 'cool'

In [None]:
type(a)

In [None]:
type(b)

In [None]:
type(c)

In [None]:
type(d)

In [None]:
a + b

In [None]:
c + d

In [None]:
# c + a
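The cell above is commented out because adding a string to an integer raises a `TypeError`. The sketch below captures the error so that the cell can run to the end, then shows the version that works after an explicit conversion with `str()`:

```python
a = 10
c = 'Python is '

try:
    result = c + a               # str + int is not allowed
except TypeError as error:
    result = None
    message = str(error)         # the exact wording depends on the Python version
    print(message)

# Converting the integer first makes the concatenation valid
fixed = c + str(a)
print(fixed)                     # Python is 10
```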

### Type Conversions

Functions int(), float(), str()

In [None]:
str(a)

In [None]:
type(str(a))

In [None]:
c + str(a)

In [None]:
float(a)

In [None]:
var = 3.141592

In [None]:
int(var)

In [31]:
var = 3.99

In [None]:
int(var)
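As the last two cells suggest, `int()` discards the fractional part instead of rounding. A short sketch comparing it with the built-in `round()`, plus conversions from text (the variable names are illustrative):

```python
truncated = int(3.99)            # fractional part is dropped
print(truncated)                 # 3

truncated_negative = int(-3.99)  # truncation goes toward zero
print(truncated_negative)        # -3

rounded = round(3.99)            # rounds to the nearest integer
print(rounded)                   # 4

# Strings containing numbers can be converted as well
year_from_text = int('2019')
km_from_text = float('44410.0')
print(year_from_text, km_from_text)    # 2019 44410.0
```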

# Indentation, comments, and *string* formatting

### Indentation

In Python, programs are structured using indentation. In any programming language, the practice of indentation is very useful, making the code easier to read and also to maintain. In Python, indentation is not just a matter of organization and style, but a language requirement.

In [None]:
year_current = 2019
year_manufacture = 2019

if (year_current == year_manufacture):
  print('True')
else:
  print('False')

### Comments

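Comments are extremely important in a program. They consist of text that describes what the program, or a specific part of it, is doing. Comments are ignored by the Python interpreter.

We can have **single-line** or **multi-line** comments. Python has no dedicated multi-line comment syntax: a multi-line comment is either a run of consecutive `#` lines or a triple-quoted string literal that is not assigned to anything.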

In [None]:
# This is a comment
year_current = 2019
year_current

In [None]:
# This
# is also 
# a comment
year_current = 2019
year_current

In [None]:
'''
This is
a multi-line
comment
'''
year_current = 2019
year_current
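The triple-quoted block used above is an ordinary string: it works as a comment only because its value is never stored or used. A minimal sketch that assigns the same kind of block to an illustrative variable `note`:

```python
note = '''
This is
a multi-line
string
'''

print(type(note))           # <class 'str'>
print(repr(note))           # line breaks are part of the value

newline_count = note.count('\n')
print(newline_count)        # 4
```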

In [None]:
# Define variables
year_current = 2019
year_manufacture = 2019

'''
Conditional structure
'''
if (year_current == year_manufacture):    # Testing if condition is true
  print('True')
else:                                     # Runs when the condition is false
  print('False')


### Formatting *Strings* 

#### *str.format()*

https://docs.python.org/3.6/library/stdtypes.html#str.format

In [None]:
print('Hello, {}!'.format('Alexandre'))

In [None]:
print('Hello, {}! This is your access #{}'.format('Alexandre', 32))

In [None]:
print('Hello, {name}! This is your access #{accesses}'.format(accesses = 32, name = 'Alexandre'))

#### *f-Strings*

https://docs.python.org/3.6/reference/lexical_analysis.html#f-strings

In [42]:
name = 'Alexandre'
accesses = 32

In [43]:
print(f'Hello, {name}! This is your access # {accesses}')

Hello, Alexandre! This is your access # 32
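Both `str.format()` and f-strings accept a format specification after a colon inside the braces, which controls details such as decimal places and thousands separators. A sketch using values borrowed from the car examples in this notebook:

```python
name = 'Jetta Variant'
price = 88078.64
km_total = 44410.0

# ',' adds a thousands separator; '.2f' keeps two decimal places
price_text = f'{price:,.2f}'
print(price_text)            # 88,078.64

# '.0f' shows no decimal places
km_text = f'{km_total:.0f}'
print(km_text)               # 44410

# The same specification works with str.format()
same_price_text = '{:,.2f}'.format(price)

summary = f'{name}: {km_text} km, price {price_text}'
print(summary)               # Jetta Variant: 44410 km, price 88,078.64
```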


# Lists

Lists are **mutable** sequences that are used to store collections of items, usually homogeneous. They can be built in several ways:

```
- Using a pair of square brackets to denote an empty list: [ ]
- Using a pair of square brackets with comma-separated items: [ 1 ], [ 1, 2, 3 ]
```

In [None]:
accessories = ['Alloy wheels', 'Power locks', 'Autopilot', 'Leather seats', 'Air conditioning', 'Parking sensor', 'Twilight sensor', 'Rain sensor']
accessories

In [None]:
type(accessories)

### List with different data types

In [48]:
car_1 = ['Jetta Variant', '4.0 Turbo Engine', 2003, 44410.0, False, ['Alloy Wheels', 'Power Locks', 'Autopilot'], 88078.64]
car_2 = ['Passat', 'Diesel Engine', 1991, 5712.0, False, ['Multimedia Center', 'Panoramic Roof', 'ABS Brakes'], 106161.94]

In [None]:
car_1

In [None]:
car_2

In [None]:
cars = [car_1, car_2]
cars

### List operations

https://docs.python.org/3.6/library/stdtypes.html#common-sequence-operations

#### *x in A*

Returns **True** if an element in the list *A* is equal to *x*.

In [52]:
accessories

['Alloy wheels',
 'Power locks',
 'Autopilot',
 'Leather seats',
 'Air conditioning',
 'Parking sensor',
 'Twilight sensor',
 'Rain sensor']

In [None]:
'Alloy wheels' in accessories

In [None]:
'4 X 4' in accessories

In [None]:
'Alloy wheels' not in accessories

In [None]:
'4 X 4' not in accessories

#### *A + B*

Concatenates *A* and *B* lists.

In [57]:
A = ['Alloy wheels', 'Power locks', 'Autopilot', 'Leather seats']
B = ['Air conditioning', 'Parking sensor', 'Twilight sensor', 'Rain sensor']

In [None]:
A

In [None]:
B

In [None]:
A + B
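Concatenation with `+` builds a new list and leaves both operands unchanged. To add the items of *B* to *A* in place, lists also offer the `extend()` method. A short sketch with shortened versions of the lists above:

```python
A = ['Alloy wheels', 'Power locks']
B = ['Autopilot', 'Leather seats']

# A + B creates a new list; A and B are not modified
combined = A + B
print(combined)    # ['Alloy wheels', 'Power locks', 'Autopilot', 'Leather seats']
length_after_concatenation = len(A)
print(length_after_concatenation)    # 2

# extend() changes A itself
A.extend(B)
print(A == combined)    # True
```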

#### *len(A)*

Returns the number of items in list *A*.

In [None]:
len(accessories)

### Selections in lists

#### *A[ i ]*

Returns the i-th item in the list *A*.

**Note:** Lists use zero-based indexing, so the first item has index 0.

In [None]:
accessories

In [None]:
accessories[0]

In [None]:
accessories[1]

In [None]:
accessories[-1]

In [None]:
cars

In [None]:
cars[0]

In [None]:
cars[0][0]

In [None]:
cars[0][-2]

In [None]:
cars[0][-2][1]
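Negative indexes count from the end of the list, so `A[-1]` is the same item as `A[len(A) - 1]`. An index beyond the end of the list raises an `IndexError`. A sketch with a shortened accessories list:

```python
accessories = ['Alloy wheels', 'Power locks', 'Autopilot']

last_item = accessories[-1]
same_item = accessories[len(accessories) - 1]
print(last_item == same_item)    # True

# Valid indexes here are 0, 1 and 2 (or -1, -2 and -3)
try:
    accessories[3]
    out_of_range = False
except IndexError:
    out_of_range = True

print(out_of_range)    # True
```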

#### *A[ i : j ]*

Slices list *A* from index *i* to index *j*. The element with index *i* is **included** in the result and the element with index *j* is **not included**.

In [None]:
accessories

In [None]:
accessories[2:5]

In [None]:
accessories[2:]

In [None]:
accessories[:5]
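Slices also accept negative indexes and an optional third value, the step (`A[i:j:k]`). Unlike single-item indexing, a slice that runs past the end of the list does not raise an error. A sketch with a shortened accessories list:

```python
accessories = ['Alloy wheels', 'Power locks', 'Autopilot', 'Leather seats', 'Air conditioning']

last_two = accessories[-2:]
print(last_two)         # ['Leather seats', 'Air conditioning']

every_other = accessories[::2]
print(every_other)      # ['Alloy wheels', 'Autopilot', 'Air conditioning']

reversed_copy = accessories[::-1]
print(reversed_copy[0]) # Air conditioning

# Out-of-range bounds are clipped to the list size
clipped = accessories[3:100]
print(clipped)          # ['Leather seats', 'Air conditioning']
```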

## List methods

https://docs.python.org/3.6/library/stdtypes.html#mutable-sequence-types

In [None]:
accessories = ['Alloy wheels', 'Power locks', 'Autopilot', 'Leather seats', 'Air conditioning', 'Parking sensor', 'Twilight sensor', 'Rain sensor']

#### *A.sort()*

Sorts list *A* in place, that is, it modifies the list itself and returns `None`.

In [None]:
accessories

In [None]:
accessories.sort()
accessories
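When the original order must be kept, the built-in function `sorted()` returns a new sorted list instead. Both `sort()` and `sorted()` accept the optional arguments `reverse` and `key`. A sketch with a shortened list:

```python
accessories = ['Power locks', 'Autopilot', 'Alloy wheels']

# sorted() returns a new list; the original is unchanged
ordered = sorted(accessories)
print(ordered)        # ['Alloy wheels', 'Autopilot', 'Power locks']
print(accessories)    # ['Power locks', 'Autopilot', 'Alloy wheels']

# key=len sorts by text length instead of alphabetical order
by_length = sorted(accessories, key=len)
print(by_length)      # ['Autopilot', 'Power locks', 'Alloy wheels']

# sort() works in place and returns None
sort_result = accessories.sort(reverse=True)
print(sort_result)    # None
print(accessories)    # ['Power locks', 'Autopilot', 'Alloy wheels']
```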

#### *A.append(x)*

Add the *x* element to the end of the *A* list.

In [None]:
accessories.append('4 X 4')
accessories

#### *A.pop(i)*

Removes and returns the element at index *i* of list *A*.

**Note:** By *default* the *pop()* method removes and returns the last element of a list.

In [None]:
accessories.pop()

In [None]:
accessories

In [None]:
accessories.pop(3)

In [None]:
accessories
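Two related list methods, not covered above, complement `pop()`: `insert(i, x)` places an item at a given index and `remove(x)` deletes the first item equal to *x*. A brief sketch with a shortened list:

```python
accessories = ['Alloy wheels', 'Power locks', 'Autopilot']

# pop(i) removes by position and returns the item
removed = accessories.pop(1)
print(removed)        # Power locks

# insert(i, x) puts the item back at index 1
accessories.insert(1, removed)
print(accessories)    # ['Alloy wheels', 'Power locks', 'Autopilot']

# remove(x) removes by value and returns None
accessories.remove('Autopilot')
print(accessories)    # ['Alloy wheels', 'Power locks']
```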

#### *A.copy()*

Creates a copy of the *A* list.

**Note:** The same result can be obtained with the following code: 
```
A[:]
```

In [None]:
accessories_2 = accessories
accessories_2

In [None]:
accessories_2.append('4 X 4')
accessories_2

In [None]:
accessories

In [None]:
accessories.pop()
accessories

In [None]:
accessories_2


In [None]:
accessories_2 = accessories.copy()
accessories_2

In [None]:
accessories_2.append('4 X 4')
accessories_2

In [None]:
accessories

In [None]:
accessories_2 = accessories[:]
accessories_2
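Both `A.copy()` and `A[:]` create a *shallow* copy: the outer list is new, but nested lists are still shared with the original. For lists that contain other lists, such as the car records shown earlier, `copy.deepcopy()` from the standard library copies the nested items too. A sketch with an illustrative record:

```python
import copy

car = ['Jetta Variant', ['Alloy wheels', 'Power locks']]

shallow = car.copy()          # new outer list, same inner list
deep = copy.deepcopy(car)     # new outer list and new inner list

# Changing the nested list through the original...
car[1].append('Autopilot')

# ...is visible through the shallow copy, but not the deep copy
print(shallow[1])    # ['Alloy wheels', 'Power locks', 'Autopilot']
print(deep[1])       # ['Alloy wheels', 'Power locks']
```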

# *For* loops

#### Standard format

```
for <variable> in <collection>:
    <instructions>
```

### Loops with lists

In [None]:
accessories = ['Alloy wheels', 'Power locks', 'Autopilot', 'Leather seats', 'Air conditioning', 'Parking sensor', 'Twilight sensor', 'Rain sensor']
accessories

In [None]:
for item in accessories:
  print(item)
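When the position of each item is needed along with the item itself, the built-in `enumerate()` provides both. The sketch below numbers a shortened accessories list starting from 1:

```python
accessories = ['Alloy wheels', 'Power locks', 'Autopilot']

numbered = []
for position, item in enumerate(accessories, start=1):
    # position counts 1, 2, 3...; item is the list element
    numbered.append(f'{position}. {item}')

for line in numbered:
    print(line)
# 1. Alloy wheels
# 2. Power locks
# 3. Autopilot
```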

### List comprehensions

https://docs.python.org/3.6/tutorial/datastructures.html#list-comprehensions

*range()* -> https://docs.python.org/3.6/library/functions.html#func-range

In [None]:
range(10)

In [None]:
list(range(10))

In [None]:
for i in range(10):
  print(i ** 2)

In [None]:
square = []
for i in range(10):
  square.append(i ** 2)
  
square

In [None]:
[i ** 2 for i in range(10)]
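A list comprehension can also filter items with an `if` clause at the end, and `range()` accepts start, stop and step arguments. A short sketch with illustrative variable names:

```python
# Squares of the even numbers only
even_squares = [i ** 2 for i in range(10) if i % 2 == 0]
print(even_squares)    # [0, 4, 16, 36, 64]

# range(start, stop, step): odd numbers from 1 up to (not including) 10
odd_numbers = list(range(1, 10, 2))
print(odd_numbers)     # [1, 3, 5, 7, 9]
```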

# Nested loops

In [None]:
data = [
     ['Alloy wheels', 'Power locks', 'Autopilot', 'Leather seats', 'Air conditioning', 'Parking sensor', 'Twilight sensor', 'Rain sensor'],
     ['Multimedia center', 'Panoramic roof', 'ABS brakes', '4 X 4', 'Digital panel', 'Autopilot', 'Leather seats', 'Parking camera'],
     ['Autopilot', 'Stability control', 'Twilight sensor', 'ABS brakes', 'Automatic transmission', 'Leather seats', 'Multimedia center', 'Power windows']
]
data

In [None]:
for items in data:
  print(items)

In [None]:
for items in data:
  for item in items:
    print(item)

In [None]:
accessories = []

for items in data:
  for item in items:
    accessories.append(item)
    
accessories

### *set()*

https://docs.python.org/3.6/library/stdtypes.html#types-set


In [None]:
list(set(accessories))

### List comprehensions

In [None]:
[item for items in data for item in items]

In [None]:
list(set([item for items in data for item in items]))
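A set does not keep its items in any particular order, so the result of `list(set(...))` may be ordered differently from run to run. Two ways to get a predictable order are sketched below on a shortened, illustrative `data` list: sorting the unique items, or using `dict.fromkeys()`, which keeps the first occurrence of each item in its original position (Python 3.7 or later).

```python
data = [
    ['Autopilot', 'Leather seats'],
    ['ABS brakes', 'Autopilot'],
    ['Leather seats', 'ABS brakes']
]

# Set comprehension removes duplicates; sorted() fixes the order
unique_sorted = sorted({item for items in data for item in items})
print(unique_sorted)      # ['ABS brakes', 'Autopilot', 'Leather seats']

# dict keys are unique and keep insertion order
unique_in_order = list(dict.fromkeys(item for items in data for item in items))
print(unique_in_order)    # ['Autopilot', 'Leather seats', 'ABS brakes']
```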

# *If* statement

#### Standard format

```
if <condition>:
     <instructions to be followed if the condition is true>
```

#### Comparison operators: 
> $==$, $!=$, $>$, $<$, $>=$, $<=$

#### Logical operators: 
> $and$, $or$, $not$

In [None]:
# 1st item on the list - Vehicle name
# 2nd item on the list - Year of manufacture
# 3rd item on the list - Vehicle is zero km?

data = [
     ['Jetta Variant', 2003, False],
     ['Passat', 1991, False],
     ['Crossfox', 1990, False],
     ['DS5', 2019, True],
     ['Aston Martin DB4', 2006, False],
     ['Palio Weekend', 2012, False],
     ['A5', 2019, True],
     ['Series 3 Cabrio', 2009, False],
     ['Dodge Jordan', 2019, False],
     ['Carens', 2011, False]
]

data

In [None]:
zero_km_Y = []

for items in data:
  if(items[2]):
    zero_km_Y.append(items)
    
zero_km_Y

In [None]:
zero_km_N = []

for items in data:
  if not (items[2]):
    zero_km_N.append(items)
    
zero_km_N

### List comprehensions

In [None]:
[items for items in data if items[2]]
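Each record can also be unpacked into named variables inside the comprehension, which avoids numeric indexes such as `items[2]`. The sketch below uses a shortened version of `data`. The names `name`, `year` and `zero_km` are chosen here for illustration.

```python
data = [
    ['Jetta Variant', 2003, False],
    ['DS5', 2019, True],
    ['A5', 2019, True],
    ['Carens', 2011, False]
]

# Keep only the names of the zero km vehicles
zero_km_names = [name for name, year, zero_km in data if zero_km]
print(zero_km_names)    # ['DS5', 'A5']

# Keep only the names of the used vehicles
used_names = [name for name, year, zero_km in data if not zero_km]
print(used_names)       # ['Jetta Variant', 'Carens']
```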

# *If-else* and *If-elif-else* statements

#### *if-else* standard format

```
if <condition>:
    <instructions to be followed if the condition is true>
else:
    <instructions to be followed if the condition is not true>
```

In [102]:
zero_km_Y, zero_km_N = [], []

for items in data:
  if (items[2]):
    zero_km_Y.append(items)
  else:
    zero_km_N.append(items)

In [None]:
zero_km_Y

In [None]:
zero_km_N
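When an *if-else* only chooses between two values, Python also offers a one-line form, the conditional expression `<value if true> if <condition> else <value if false>`. A sketch on a shortened `data` list, with label texts made up for illustration:

```python
data = [
    ['Jetta Variant', 2003, False],
    ['DS5', 2019, True]
]

labels = []
for items in data:
    # Pick one of two texts depending on the zero km flag
    label = 'zero km' if items[2] else 'used'
    labels.append(f'{items[0]}: {label}')

print(labels)    # ['Jetta Variant: used', 'DS5: zero km']
```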

#### *if-elif-else* standard format

```
if <condition #1>:
    <instructions to be followed if condition #1 is true>
elif <condition #2>:
    <instructions to be followed if condition #2 is true>
elif <condition #3>:
    <instructions to be followed if condition #3 is true>
                        .
                        .
                        .
else:
    <instructions to be followed if none of the previous conditions is true>
```

In [None]:
data

In [None]:
print('AND')
print(f'(True and True) is: {True and True}')
print(f'(True and False) is: {True and False}')
print(f'(False and True) is: {False and True}')
print(f'(False and False) is: {False and False}')

In [None]:
print('OR')
print(f'(True or True) is: {True or True}')
print(f'(True or False) is: {True or False}')
print(f'(False or True) is: {False or True}')
print(f'(False or False) is: {False or False}')
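The third logical operator, `not`, inverts a Boolean value. When the operators are combined without parentheses, `not` is evaluated first, then `and`, then `or`. A short sketch with illustrative variable names:

```python
print('NOT')
print(f'(not True) is: {not True}')      # False
print(f'(not False) is: {not False}')    # True

# 'and' is evaluated before 'or': True or (False and False)
without_parentheses = True or False and False
print(without_parentheses)    # True

# Parentheses change the result
with_parentheses = (True or False) and False
print(with_parentheses)       # False
```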

In [108]:
A, B, C = [], [], []

for items in data:
  if(items[1] <= 2000):
    A.append(items)
  elif(items[1] > 2000 and items[1] <= 2010):
    B.append(items)
  else:
    C.append(items)

In [None]:
A

In [None]:
B

In [None]:
C

In [109]:
A, B, C = [], [], []

for items in data:
  if(items[1] <= 2000):
    A.append(items)
  elif(2000 < items[1] <= 2010):
    B.append(items)
  else:
    C.append(items)
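The last two cells differ only in the `elif` condition: `2000 < items[1] <= 2010` is a chained comparison, equivalent to joining the two comparisons with `and`. The sketch below checks this on a few illustrative years, including the boundary values:

```python
years = [1990, 2000, 2001, 2010, 2011, 2019]

all_equal = True
for year in years:
    explicit = year > 2000 and year <= 2010
    chained = 2000 < year <= 2010
    if explicit != chained:
        all_equal = False

print(all_equal)    # True

# 2000 is excluded and 2010 is included
in_range = [year for year in years if 2000 < year <= 2010]
print(in_range)     # [2001, 2010]
```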