# Lighthouse Labs
### W01D2 Programming in Python
Slides adapted from originals by: Socorro Dominguez  
2 May 2022

# Agenda

 - Lecture Basics
 - Introductions
 - Introduction to Python
    - What is Python?
    - Python Ecosystem
    - IDEs
 - Python Programming
     - Basic Syntax and Data Types
     - Data Structures
 - Challenge


# Lecture Basics

- Mics muted (to avoid background noises)
- "Cameras on" if possible
- Feel free to ask questions anytime
- We will take a 10 min break about halfway through the class

#### [Download this notebook](https://downgit.github.io/#/home?url=https://github.com/sedv8808/LighthouseLabs/blob/main/W01D2/W01D2_PPDS.ipynb)

# What is Python?

[<img src="Python-logo-notext.svg" width=200 align="center">](https://commons.wikimedia.org/w/index.php?curid=34991651)

Python
 - is an interpreted, object-oriented, high-level, general-purpose programming language with dynamic typing  
 - has a simple, easy-to-learn syntax  
 - has extensive error handling
 - has a large standard library
 - supports modules and packages  
 - runs anywhere
 - is free

([source 1](https://www.python.org/doc/essays/blurb/) and [source 2](https://wiki.python.org/moin/BeginnersGuide/Overview))

## [Zen of Python](https://peps.python.org/pep-0020/)

> Beautiful is better than ugly. 

> Explicit is better than implicit.

> Simple is better than complex.

> Complex is better than complicated.

> Readability counts.



## Python History

Python was first introduced by **Guido van Rossum** in 1991 at the National Research Institute for Mathematics and Computer Science, Netherlands. For more details see [Python's Wikipedia page](https://en.wikipedia.org/wiki/Python_(programming_language)).

## Python Versions

 - Python 1.0 was released in 1994 
 - Python 2.0 was released in 2000  
 - Python 3.0 was released in 2008 (generally not backwards compatible)

## Why Python for Data Science?

 - Simple programming language to pick up, from a syntax point of view
 - Python also has an active community with a vast selection of libraries and resources
 - Well-suited to an iterative process

## IDEs for Python

![](https://static.javatpoint.com/python/images/python-ides.png)

### What is Jupyter?

- For this Bootcamp, we will mostly be using Python via [Jupyter](https://jupyter.org/index.html)
- You can think of Python like a car’s engine, while Jupyter is like a car’s dashboard
  - Python is the programming language that runs computations
  - Jupyter is an integrated development environment (IDE) that provides an interface by adding convenient features and tools

### Jupyter Notebooks

 - [Literate programming](https://en.wikipedia.org/wiki/Literate_programming)
 - Code, plots, formatted text, equations, etc. in a single document
 - Run Python code interactively
 - Also supports R, Julia, Perl, and over 100 other languages (and counting!)
 - Notebooks are great for exploration and for documenting your workflow
- Many options for sharing notebooks in a human-readable format:
  - Share online with [nbviewer.jupyter.org](http://nbviewer.jupyter.org/)
  - If you use GitHub, any notebooks you upload are automatically rendered on the site
  - Convert to HTML, PDF, etc. with [nbconvert](https://nbconvert.readthedocs.io/en/latest/)

### Working with Notebooks

A notebook consists of a series of "cells":
- **Code cells**: execute snippets of code and display the output
- **Markdown cells**: formatted text, equations, images, and more

#### Images
![](https://numpy.org/doc/stable/_images/broadcasting_1.png)

#### Code

In [1]:
print("hello world")

hello world


#### Math 

An example of in-line math: $\sum_i x_i$

And for math that needs a bit more space/attention: 

$$
w_i = w_i - \eta \frac{\partial \cal{L}}{\partial w_i}
$$

# Python Ecosystem

The Python libraries for data science are developed and maintained by external "3rd party" development teams
- Python core + 3rd party libraries = **ecosystem** 
- To install and manage 3rd party libraries, you need to use a package manager such as `conda` or `pip`

![](http://res.cloudinary.com/dyd911kmh/image/upload/f_auto,q_auto:best/v1509622333/scipy-eco_kqi2su.png)

During the Bootcamp, we will be working with the `pandas`, `numpy`, `seaborn`, `matplotlib`, `plotly`, `sklearn` and `keras` libraries.

# Python Programming
## Basic Syntax and Data Types

### Built in Data Types
* Integers - `int`
* Floating-point numbers - `float` - for example `4.5`; `NaN` also belongs to this group
* Strings - `str`
* Booleans - `bool` - two values: True and False.
* Lists - `list`
* Tuples - `tuple`
* Sets - `set`
* Dictionaries - `dict`


In [3]:
length = 46

In [5]:
type(length)

int

In [4]:
string = "This is just a regular sentence."

In [6]:
type(string)

str

In [1]:
x = False

In [2]:
type(x)

bool

### Strings

There are several methods/verbs to transform strings or extract information from them.

In [7]:
len(string)

32

In [8]:
string.split()

['This', 'is', 'just', 'a', 'regular', 'sentence.']

In [9]:
string.upper()

'THIS IS JUST A REGULAR SENTENCE.'

In [5]:
string.lower()

'this is just a regular sentence.'
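A few more string methods follow the same pattern. The sketch below is only illustrative: the variable `sentence` and its padded value are made up for this example. Because strings are immutable, each method returns a new string and leaves the original unchanged.

```python
# Illustrative example; `sentence` is a made-up variable for this sketch.
sentence = "  This is just a regular sentence.  "

stripped = sentence.strip()                      # remove leading/trailing whitespace
replaced = stripped.replace("regular", "short")  # swap one substring for another
words = stripped.split()                         # split on whitespace into a list
rejoined = "-".join(words)                       # join list elements with a separator

print(stripped)
print(replaced)
print(rejoined)
print(stripped.startswith("This"))  # True

# The original string was not modified by any of the calls above.
print(len(sentence), len(stripped))
```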

### Numerical Data Types and Casting

In [9]:
age = 6/2
print(age)

print(type(age))

3.0
<class 'float'>


In [10]:
age = 6.0 
type(age)

float

In [41]:
import numpy as np

type(np.nan)

float

**Casting**

- `int` to `float`:

In [11]:
four = 4
float(four)

4.0

- `int` to `str`

In [12]:
length = 40
length_string = str(length)
print(length_string)
type(length_string)

40


str

- `float` to `int`

In [13]:
int(4.99)

4
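Casting also works from strings, with one common pitfall. The following sketch (the variable names are made up for illustration) shows that `int()` truncates toward zero rather than rounding, and that it refuses a string containing a decimal point.

```python
# Strings can be cast when their text matches the target type.
whole = int("42")
decimal = float("4.99")
print(whole, decimal)

# int() does not parse decimal strings directly; it raises ValueError.
try:
    int("4.99")
except ValueError as err:
    print("ValueError:", err)

# Going through float first works. int() truncates toward zero.
truncated = int(float("4.99"))
truncated_negative = int(-4.99)
rounded = round(4.99)
print(truncated, truncated_negative, rounded)
```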

## Python Data Structures

### Lists

Similarly to how a string is a sequence of characters in order, a   
list is a sequence of elements with a particular order.

Lists can be identified by their square brackets.

The elements in a list can be any objects, and they don’t all need to have the same type.

In [24]:
my_list = ["hello", 4, 3.0, {"good-bye!": 0}]

In [25]:
my_list[3]

{'good-bye!': 0}

In [26]:
my_list_2 = [my_list, "hello2"]
my_list_2

[['hello', 4, 3.0, {'good-bye!': 0}], 'hello2']

In [27]:
my_list_2[0]

['hello', 4, 3.0, {'good-bye!': 0}]

In [28]:
my_list.append("hello3")
my_list

['hello', 4, 3.0, {'good-bye!': 0}, 'hello3']

In [29]:
my_list_2

[['hello', 4, 3.0, {'good-bye!': 0}, 'hello3'], 'hello2']

In [30]:
my_list.extend([991, 992, 993])
my_list

['hello', 4, 3.0, {'good-bye!': 0}, 'hello3', 991, 992, 993]

### Slicing

We slice with `[]`; the start is inclusive, and the end is exclusive.

So, `string_list[1:3]` fetches elements 1 and 2, but not 3.

In [45]:
string_list = string.split()
string_list[1:3]

['is', 'just']
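Slices can also omit the start or end, use negative indices, or take a step. A minimal sketch, using a made-up list named `words` so that it runs on its own:

```python
# Illustrative list; `words` is a made-up name for this sketch.
words = ["This", "is", "just", "a", "regular", "sentence."]

print(words[:2])    # from the start up to, but not including, index 2
print(words[3:])    # from index 3 to the end
print(words[-1])    # negative indices count from the end
print(words[-2:])   # the last two elements
print(words[::2])   # every second element
print(words[::-1])  # a reversed copy of the list
```

The same syntax works on strings and tuples, since they are also ordered sequences.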

### Iteration

In [31]:
for i in string_list:
    print(i)

This
is
just
a
regular
sentence.
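When the position of each element is needed as well, `enumerate()` is a common choice, and `range()` produces a sequence of integers to loop over. A short sketch, with a made-up list `words` so the example is self-contained:

```python
words = ["This", "is", "just", "a", "regular", "sentence."]

# enumerate() yields (index, element) pairs.
for index, word in enumerate(words):
    print(index, word)

# range(3) yields the integers 0, 1 and 2.
squares = []
for n in range(3):
    squares.append(n ** 2)
print(squares)
```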


### List Comprehension

We can transform a whole list with a `for` loop, or more concisely with a list comprehension:

In [47]:
new_list = [0, 3, 4, 5, 7, 8, 3, 3, 7, 9, 5, 5]

other_new_list = []
another_new_list = []

for i in new_list:
    x = i+5
    y = i+3
    other_new_list.append(x)
    another_new_list.append(y)
    
other_new_list

[5, 8, 9, 10, 12, 13, 8, 8, 12, 14, 10, 10]

In [49]:
other_new_list2 = [x + 5 for x in new_list]

other_new_list2

[5, 8, 9, 10, 12, 13, 8, 8, 12, 14, 10, 10]
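A comprehension can also filter elements or choose between values. The sketch below reuses the values of `new_list`; the names `evens_plus_five` and `labels` are made up for illustration.

```python
new_list = [0, 3, 4, 5, 7, 8, 3, 3, 7, 9, 5, 5]

# Filter: keep only the even values, then add 5 to each.
evens_plus_five = [x + 5 for x in new_list if x % 2 == 0]
print(evens_plus_five)

# Conditional expression: label every element instead of filtering.
labels = ["even" if x % 2 == 0 else "odd" for x in new_list]
print(labels[:4])
```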

### Tuples
Tuples are a data structure very similar to lists, but with two main differences:

- they are written with parentheses instead of square brackets, and
- they are immutable: once created, their elements cannot be changed

In [50]:
my_tuple = ('I', None,  'do', 1, False)
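To see what immutability means in practice, the following sketch reads from a tuple, then tries to assign to one of its elements. The replacement value `'You'` is made up for illustration.

```python
my_tuple = ('I', None, 'do', 1, False)

# Indexing and slicing work the same way as for lists.
print(my_tuple[0])
print(my_tuple[1:3])

# Assigning to an element raises TypeError because tuples are immutable.
try:
    my_tuple[0] = 'You'
except TypeError as err:
    print("TypeError:", err)

# Tuples can be unpacked into separate names.
first, *rest = my_tuple
print(first, rest)
```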

### Sets

Sets are data structures that:
- are unordered, meaning there is no element 0 or element 1,
- contain unique values, meaning there are no duplicate entries, and
- are written with curly brackets

In [31]:
my_set = {2, 1.0, 'apple', 1.0, 'apple'}
my_set

{1.0, 2, 'apple'}

In [10]:
my_list = [2, 1.0, 'apple', 1.0, 'apple']
my_set = set(my_list)
my_set

{1.0, 2, 'apple'}

In [55]:
import json

with open("some_text.txt", "r") as f:
    some_text = json.load(f)

In [57]:
some_text[:100]

'The hint was immediately taken up by Mr Shepherd, whose interest was involved in the reality of Sir '

In [58]:
len(some_text.split())

19735

In [62]:
list(set(some_text.split()))[:10]

['parting,',
 'meeting.',
 'board',
 'shift',
 'nodded',
 'shooting,',
 'autumn,',
 'son',
 'asking.',
 'directly,']

In [64]:
len(set(some_text.split()))

4365
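Beyond removing duplicates, sets support the usual operations from set theory. A small sketch with two made-up sets, `a` and `b`:

```python
# Made-up example sets for this sketch.
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b         # elements in either set
intersection = a & b  # elements in both sets
difference = a - b    # elements in a but not in b

print(union)
print(intersection)
print(difference)
print(3 in a)         # membership test

# Sets are mutable: add() inserts an element in place.
a.add(10)
print(10 in a)
```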

### Dictionaries
Dictionaries are collections of key-value pairs. Keys must be unique, and since Python 3.7 a dictionary preserves the order in which its keys were inserted.

In [51]:
account_details = {'Name':'Jack Sparrow',
                   'Account_Type':'Checking',
                   'Branch': 13,
                   'Age': 23}

account_details

{'Name': 'Jack Sparrow', 'Account_Type': 'Checking', 'Branch': 13, 'Age': 23}

In [52]:
account_details.keys()

dict_keys(['Name', 'Account_Type', 'Branch', 'Age'])

In [53]:
account_details.items()

dict_items([('Name', 'Jack Sparrow'), ('Account_Type', 'Checking'), ('Branch', 13), ('Age', 23)])

In [54]:
account_details['Age']

23

#### Why do we need dictionaries if we have lists?

Dictionaries associate each value with a label (key), whereas a list only has a positional index. This can make a dictionary a much more natural choice for your data.
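Looking values up by key is what sets dictionaries apart. The sketch below updates and extends the `account_details` dictionary; the `'City'` and `'Balance'` keys and their values are made up for illustration.

```python
account_details = {'Name': 'Jack Sparrow',
                   'Account_Type': 'Checking',
                   'Branch': 13,
                   'Age': 23}

account_details['Age'] = 24            # update an existing key
account_details['City'] = 'Vancouver'  # add a new key (made-up value)

# get() returns a default instead of raising KeyError for a missing key.
balance = account_details.get('Balance', 'not available')
print(balance)

# Iterate over key-value pairs in insertion order.
for key, value in account_details.items():
    print(key, '->', value)
```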

### Summary

|Data Structure | Preserves order | Mutable | Symbol | Can contain duplicates | Can be sliced |
|---------|------|------|------|------|------|
|str |✓ |☓ |'' , "" |✓ |✓ |
|list |✓ |✓ |[] |✓ |✓ |
|tuple |✓ |☓ |() |✓ |✓ |
|set |☓ |✓ |{} |☓ |☓ |
|dict |✓ (insertion order) |✓ |{key: value} |keys ☓, values ✓ |☓ |

### Shallow vs Deep Copy

In [71]:
x = [[1, 2, 3], 4, 5, 6]
y = x

print(x)
print(y)

[[1, 2, 3], 4, 5, 6]
[[1, 2, 3], 4, 5, 6]


In [72]:
x[-1] = 999
print(x)
print(y)

[[1, 2, 3], 4, 5, 999]
[[1, 2, 3], 4, 5, 999]


In [73]:
x = [[1, 2, 3], 4, 5, 6]
y = list(x)

print(x)
print(y)

[[1, 2, 3], 4, 5, 6]
[[1, 2, 3], 4, 5, 6]


In [74]:
x[-1] = 999
print(x)
print(y)

[[1, 2, 3], 4, 5, 999]
[[1, 2, 3], 4, 5, 6]


In [76]:
x = [[1, 2, 3], 4, 5, 6]
y = list(x)

x[0][0] = 999

print(x)
print(y)

[[999, 2, 3], 4, 5, 6]
[[999, 2, 3], 4, 5, 6]


In [79]:
import copy

x = [[1, 2, 3], 4, 5, 6]
y = copy.deepcopy(x)

x[0][0] = 999

print(x)
print(y)

[[999, 2, 3], 4, 5, 6]
[[1, 2, 3], 4, 5, 6]
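The three behaviours above (plain assignment, `list(x)`, and `copy.deepcopy(x)`) can be told apart with the `is` operator, which checks whether two names refer to the same object. In this sketch the names `alias`, `shallow` and `deep` are made up, and `copy.copy(x)` stands in for `list(x)` since both create a shallow copy of a list.

```python
import copy

x = [[1, 2, 3], 4, 5, 6]

alias = x                # plain assignment: a second name for the same list
shallow = copy.copy(x)   # new outer list, but the inner list is shared
deep = copy.deepcopy(x)  # new outer list and a new inner list

print(alias is x)          # True
print(shallow is x)        # False
print(shallow[0] is x[0])  # True: the nested list is the same object
print(deep[0] is x[0])     # False: the nested list was copied too
```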


### Errors and Exceptions Handling

In [36]:
def div_func(a,b):
    '''Return the result of dividing a by b.'''
    result = a/b
    return result

In [37]:
div_func(5,2.0)

2.5

In [None]:
div_func(5,0.0)

In [39]:
def div_func(a,b):
    try:
        result = a/b
    except ZeroDivisionError:
        print('b is zero and division is not possible')
        return
    return result

In [40]:
div_func(5,-5)

-1.0

In [41]:
div_func(10,0.0)

b is zero and division is not possible


In [42]:
def count_letter_in_word(input_word='gulp'):
    return len(input_word)

In [None]:
count_letter_in_word(10)

In [44]:
def count_letter_in_word(input_word='gulp'):
    if not isinstance(input_word, str):
        raise TypeError('incorrect input type. Please provide a string input')
    return len(input_word)

In [None]:
count_letter_in_word(10)

In [None]:
count_letter_in_word(5)

In [47]:
def count_letter_in_word_new(input_word='gulp'):
    result=0
    try:
        result=len(input_word)
    except Exception as e:
        print(e)
    return result

In [48]:
count_letter_in_word_new(103984)

object of type 'int' has no len()


0
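A `try` statement can also carry several `except` clauses plus optional `else` and `finally` blocks. The function `safe_divide` below is a hypothetical helper written for this sketch, not part of the original notebook.

```python
def safe_divide(a, b):
    """Return a / b, or None when the division fails (hypothetical helper)."""
    try:
        result = a / b
    except ZeroDivisionError:
        print('b is zero and division is not possible')
        return None
    except TypeError as err:
        print('inputs must be numbers:', err)
        return None
    else:
        # Runs only when the try block raised no exception.
        return result
    finally:
        # Runs in every case, even when a return was already reached.
        print('division attempted')

print(safe_divide(5, 2))
print(safe_divide(5, 0))
print(safe_divide(5, 'two'))
```

Catching specific exception types, rather than a bare `Exception`, keeps unexpected errors visible.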

### Handling Date and Time in Python

**Get current date time**

In [11]:
# import datetime class from datetime module
from datetime import datetime

# get current date and time
datetime_object = datetime.now()
print(datetime_object)
print('Type :- ', type(datetime_object))

2022-05-02 09:52:46.639563
Type :-  <class 'datetime.datetime'>


In [30]:
my_string = '2022-May-3'

# Create a datetime object from a string in the format yyyy-Mon-d (e.g. 2022-May-3)
my_date = datetime.strptime(my_string, "%Y-%b-%d")

print(my_date)
print('Type: ',type(my_date))

2022-05-03 00:00:00
Type:  <class 'datetime.datetime'>


#### Accessing certain date attributes

In [33]:
print('Year: ', my_date.year) # year as an integer
print('Month: ', my_date.month) # month as an integer (1-12)
print('Day: ', my_date.day) # day of the month
print('Weekday: ', my_date.weekday()) # day of the week as an integer (Monday is 0)

Year:  2022
Month:  5
Day:  3
Weekday:  1


#### Measuring Time Span with Timedelta Objects

In [19]:
#import datetime
from datetime import datetime, timedelta

# get current time
now = datetime.now()
print("Today's date: ", str(now))

Today's date:  2022-05-02 10:00:41.593773


In [20]:
#add 15 days to current date
future_date_after_15days = now + timedelta(days = 15)
print('Date after 15 days: ', future_date_after_15days)

#subtract 2 weeks from current date
two_weeks_ago = now - timedelta(weeks = 2)
print('Date two weeks ago: ', two_weeks_ago)
print('two_weeks_ago object type: ', type(two_weeks_ago))

Date after 15 days:  2022-05-17 10:00:41.593773
Date two weeks ago:  2022-04-18 10:00:41.593773
two_weeks_ago object type:  <class 'datetime.datetime'>


#### Find the Difference Between Two Dates and Times

In [22]:
# import datetime
from datetime import date

# Create two dates
date1 = date(2008, 8, 18)
date2 = date(2008, 8, 10)

# Difference between two dates
delta = date2 - date1
print(delta)
print("Difference: ", delta.days)
print('delta object type: ', type(delta))

-8 days, 0:00:00
Difference:  -8
delta object type:  <class 'datetime.timedelta'>
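Subtracting two `datetime` objects (rather than two `date` objects) also gives a `timedelta`, which then includes the time of day. A sketch with two made-up timestamps, `start` and `end`:

```python
from datetime import datetime

# Made-up timestamps for this sketch.
start = datetime(2022, 5, 2, 9, 0, 0)
end = datetime(2022, 5, 3, 10, 30, 0)

elapsed = end - start           # a timedelta object
print(elapsed)                  # 1 day, 1:30:00
print(elapsed.days)             # whole days only: 1
print(elapsed.seconds)          # leftover seconds after the whole days: 5400
print(elapsed.total_seconds())  # the full span in seconds: 91800.0

hours = elapsed.total_seconds() / 3600
print(hours)                    # 25.5
```

Note that `.seconds` is only the remainder after whole days; `total_seconds()` is usually the one to use for the full duration.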


#### Formatting Dates: More on `strftime()` and `strptime()`

In [39]:
# import datetime
from datetime import datetime
date_string = "1 August 2019"

# parse the string into a datetime object
date_object = datetime.strptime(date_string, "%d %B %Y")

print("date_object: ", date_object)

date_object:  2019-08-01 00:00:00


In [40]:
# import datetime
from datetime import datetime
now = datetime.now()

# convert datetime to string
date_string = now.strftime("%d %B %Y")

print("date_string: ", date_string)

date_string:  02 May 2022


#### Datetime formatting

| Directive | Meaning|
|-----------|--------|
| %a | Abbreviated weekday name: Sun, Mon, …, Sat (en_US) |
| %A | Full weekday name: Sunday, Monday, …, Saturday (en_US) |
| %w | Weekday as a decimal number: 0 (Sunday), 1, …, 6 (Saturday) |
| %d | Day of the month as a zero-padded decimal number: 01, 02, …, 31 |
| %b | Abbreviated month name: Jan, Feb, …, Dec (en_US) |
| %B | Full month name: January, February, …, December (en_US) |
| %m | Month as a zero-padded decimal number: 01, 02, …, 12 |
| %y | Year without century as a zero-padded decimal number: 00, 01, …, 99 |
| %Y | Year with century: 1970, 1980, etc. |
| %H | Hour (24-hour clock) as a zero-padded decimal number: 00, 01, …, 23 |
| %I | Hour (12-hour clock) as a zero-padded decimal number: 01, 02, …, 12 |
| %p | AM or PM (en_US) |
| %M | Minute as a zero-padded decimal number: 00, 01, …, 59 |
| %S | Second as a zero-padded decimal number: 00, 01, …, 59 |
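
Several directives can be combined in one format string. The sketch below uses a fixed, made-up timestamp named `moment` so that the output is reproducible, and shows that `strptime()` with the same format string reverses `strftime()`.

```python
from datetime import datetime

# Fixed, made-up timestamp so the output does not depend on the current time.
moment = datetime(2022, 5, 2, 14, 5, 9)

iso_like = moment.strftime("%Y-%m-%d %H:%M:%S")
short = moment.strftime("%d/%m/%y")
print(iso_like)  # 2022-05-02 14:05:09
print(short)     # 02/05/22

# The AM/PM text produced by %p depends on the locale.
print(moment.strftime("%I:%M %p"))

# Parsing with the same format string gives back an equal datetime.
parsed = datetime.strptime(iso_like, "%Y-%m-%d %H:%M:%S")
print(parsed == moment)  # True
```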
