# Automate Boring Stuff with Python

## I. Basics

### Lesson 1 : Getting Help
* Explain what you are trying to do, not just what you did.
* If you get an error message, specify the point at which the error happens. (Including the line number.)
* Explain what you’ve already tried to do to solve your problem.
* List the version of Python you’re using.

### Lesson 2: Getting Started

* An instruction that evaluates to a single value is an expression. An instruction that doesn't is a statement.
* An IDE is an editor for writing and running code. Jupyter is geared toward data science and scientific computing, while VS Code is geared toward general software development.
* The interactive shell window has the >>> prompt.
* Data types: 
    * Numeric: int, float, complex
    * Character: String
    * Bool
* Strings hold text and begin and end with quotes: 'Hello world!'
* Values can be stored in variables: spam = 42
* Variables can be used anywhere values can be used in expressions: spam + 1

In [1]:
2 + 2 #This is an Expression

4

In [2]:
X = 2 + 2 # This is a statement (an assignment that creates a variable)

In [3]:
# Operator Precedence Concept

(2 + 2) * 5 # Parentheses are evaluated first (BODMAS)

20

In [4]:
# A variable holds only its most recent value

x = 2  #Old value
x = 3  #New value
x

3

### Lesson 3: Writing First Program

* The execution starts at the top and moves down.
* Comments begin with a # character and are ignored by Python; they are notes & reminders for the programmer.
* Functions are like mini-programs in your program.
* The print() function displays the value passed to it.
* The input() function lets users type in a value. (String by Default)
* The len() function takes a string value and returns an integer value of the string's length.
* The int(), str(), and float() functions can be used to convert values' data type.

In [6]:
print('Hey Buddy, What is your name?')
myname = input()
print('What is your age?')
myage = int(input())
lenname = len(myname)
print('Nice to meet you', myname)
print('Your name has', lenname, 'letters')
print('You will be', myage + 1, 'on your next birthday, Cheers!')

Hey Buddy, What is your name?
What is your age?
Nice to meet you tajamul
Your name has 7 letters
You will be 30 on your next birthday, Cheers!


## II. Flow Control

*Flow Control controls the order of execution of statements based on Conditions*

Three Types:
  * If Else, Elif
  * While Loops
  * For Loop
  
Flow Control involves three things:
 
  * Boolean Value
  * Comparison Operators
  * Boolean or Logical Operators: *are used to evaluate boolean values or expressions.*

### Lesson 4: How Flow Control Works

* The Boolean data type has only two values: True and False (both beginning with capital letters).
* Comparison operators compare two values and evaluate to a Boolean value: ==, !=, <, >, <=, >=
* == is a comparison operator, while = is the assignment operator for variables.
* Logical or Boolean operators (and, or, not) also evaluate to Boolean values.

In [7]:
True and True # and is True only when both operands are True

True

In [8]:
print(True or True) # or is True when at least one operand is True
print(False or True)
print(True or False)

True
True
True


In [9]:
x = 1   # Here = is the assignment operator
print(x)
2 == 1 # Here == is the comparison operator

1


False

In [61]:
# not reverses a Boolean value

not 5 == 5

False

### Lesson 5 : If Else, Elif Else, Truthy Concept

* An if statement can be used to conditionally execute code, depending on whether the "if" statement's condition is True or False.
* An elif (that is, "else if") statement can follow an if statement. Its block executes if its condition is True and all the previous conditions have been False.
* An else statement comes at the end. Its block is executed if all of the previous conditions have been False.
* The values 0 (integer), 0.0 (float), and '' (the empty string) are considered to be "falsey" values. When used in conditions they are considered False. You can always see for yourself which values are truthy or falsey by passing them to the bool() function.

In [96]:
#if else Program - True or False

password = 'ucan'

yourpassword = input('Hi, please enter password: ')

if yourpassword == password:
    print('Access Granted')
else:
    print('Blocked')

Hi, please enter password: uca
Blocked


In [83]:
#if elif else Program - Provides the first True Result (Order Matters)

password = 'ucan'

yourpassword = input('Hi, please enter password: ')

if yourpassword == password:
    print('Access Granted')
elif yourpassword == 'uca':
    print('Almost there')
else:
    print('Blocked')

Hi, please enter password: uca
Almost there


The concepts of truthy and falsy are fundamental for control flow and decision-making within programs.

**Truthy Values**

* Any non-zero numeric value (e.g., 1, 2.5, -3.14).
* Non-empty sequences or collections (e.g., lists, tuples, sets, dictionaries).
* Non-empty strings (e.g., 'hello').
* Any object that is not explicitly defined as falsey.

**Falsy Values**

* The numerical value 0 (integer or float).
* Empty sequences or collections (e.g., [], (), {}, '').
* The special value None.
* Boolean value False itself.

In [100]:
# Truthy and Falsy Concept

print(bool(0))  # Falsy
print(bool(101))   # Truthy   

print(bool(""))  # Falsy
print(bool("abc"))  # Truthy

False
True
False
True
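The same check works for the other values in the truthy and falsy lists above. A minimal sketch, where `values`, `results`, and `items` are illustrative names that do not come from the notes:

```python
# Each value is passed to bool() to see how a condition would treat it.
values = [0, 0.0, '', [], (), {}, None, False, 1, -3.14, 'hello', [0]]

# Map the printed form of each value to its truthiness.
results = {repr(v): bool(v) for v in values}

for text, truth in results.items():
    print(text, '->', truth)

# An empty list is falsy, so "if not items:" is a common emptiness check.
items = []
if not items:
    print('items is empty')
```

Note that `[0]` is truthy: the list is not empty, even though the item inside it is falsy.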


### Lesson 6 : While Loops

*A while loop is a control flow statement which executes the code in a loop as long as the condition is True*

* A while loop is usually condition based
* You can press ctrl-c to interrupt an infinite loop. This hotkey stops Python programs.
* A "break" statement causes the execution to immediately leave the loop, without re-checking the condition.
* A "continue" statement causes the execution to immediately jump back to the start of the loop and re-check the condition.

In [2]:
#Simple While Loop
x = 0
while x<=10:
    print(x)
    x+=1

0
1
2
3
4
5
6
7
8
9
10


In [23]:
# Break

x = 0
while True:
    x += 1
    print(x)
    if x == 10:
        break

1
2
3
4
5
6
7
8
9
10


In [28]:
# Continue
x = 0
while x <= 5:
    x+=1
    if x == 3: # when x is 3, continue skips the print below
        continue
    print(x)

1
2
4
5
6


### Lesson 7 : For Loop

*A for loop is a control flow statement which executes the code in a loop a fixed number of times*

* A for loop will loop a specific number of times.
* The range() function can be called with one, two, or three arguments.
* The break and continue statements can be used in for loops just like they're used in while loops.

In [34]:
# for loop 

for i in range(1,10,2):
    print(i)

1
3
5
7
9
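The cell above uses the three-argument form of range(). A short sketch of the other forms, plus break and continue inside a for loop, might look like this (`seen` is an illustrative name):

```python
# range(stop): starts at 0 and stops before stop
print(list(range(5)))          # [0, 1, 2, 3, 4]

# range(start, stop): stop is still excluded
print(list(range(2, 6)))       # [2, 3, 4, 5]

# range(start, stop, step): a negative step counts down
print(list(range(10, 0, -3)))  # [10, 7, 4, 1]

# break and continue behave as they do in while loops
seen = []
for i in range(10):
    if i % 2 == 0:
        continue   # skip even numbers
    if i > 7:
        break      # leave the loop once i exceeds 7
    seen.append(i)
print(seen)        # [1, 3, 5, 7]
```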


## III. Functions

### Lesson 8 : Python Built in Functions

* You can import modules and get access to new functions.
* The modules that come with Python are called the standard library, but you can also install third-party modules using the pip tool.
* The sys.exit() function will immediately quit your program.

In [56]:
# Importing modules

# ! pip install pandas -- for installing modules

import math
import pandas
import sys
import os
import warnings
warnings.filterwarnings("ignore")

In [53]:
# sys.exit()

print(5)
sys.exit() # Execution stops
print(7)

5


SystemExit: 

### Lesson 9 : Writing your own Functions

* Functions are like a mini-program inside your program.
* The main point of functions is to get rid of duplicate code.
* The def statement defines a function.
* The inputs to functions are called arguments. The output is called the return value.
* The parameters are the variables in between the function's parentheses in the def statement.
* The return value is specified using the return statement.
* Every function has a return value. If your function doesn't have a return statement, the default return value is None. (Like True and False, None is written with a capital first letter.)
* Keyword arguments to functions are usually for optional arguments. The print() function has keyword arguments "end" and "sep".

In [76]:
# Self Defined Function

def salary_increment(x):  #Here x is a parameter not an argument
    return 'your revised salary is', x * 1.15
    
salary_increment(2000)

('your revised salary is', 2300.0)

In [83]:
# Type Conversion 

'hi ' + str(5)

'hi 5'

In [86]:
# args and kwargs i.e., arguments and keyword arguments

print(10, end= ', ')  #Kwargs are optional
print(20)

10, 20
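Two points from the bullets are not shown in the cells above: the default None return value and the "sep" keyword argument of print(). A small sketch, assuming hypothetical functions `greet` and `revised_salary` that are not part of the notes:

```python
def greet(name):
    # No return statement, so the call evaluates to None.
    print('Hello,', name)

result = greet('Alice')
print(result)                      # None

# sep sets the text placed between the values; end sets what follows them.
print('cat', 'dog', 'mouse', sep=', ', end='!\n')

def revised_salary(amount, rate=0.15):
    # rate is an optional parameter with a default value
    return amount * (1 + rate)

print(revised_salary(2000))            # uses the default rate
print(revised_salary(2000, rate=0.5))  # keyword argument overrides the default
```

Giving `rate` a default value is what makes it optional, which is the usual reason for passing it as a keyword argument.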


### Lesson 10 : Global and Local Scopes

* A scope can be thought of as a container of variables.
* The global scope is code outside of all functions. Variables assigned here are global variables.
* Each function's code is in its own local scope. Variables assigned here are local variables.
* Code in the global scope cannot use any local variables.
* Code in a function's local scope cannot use variables in any other function's local scope, but it can use global variables.
* If there's an assignment statement for a variable in a function, that is a local variable. The exception is if there's a global statement for that variable; then it's a global variable.

In [109]:
# Global Variable
x = 10 

# Local Variable
def tajamul():
    x = 5
    print(x)

In [106]:
# using global variable inside function 

x = 100

def ik(): # No assignment to x inside the function, so x refers to the global variable
    print(x)

ik() 

100


In [102]:
# A local variable is created by assignment inside a function

def ik():
    egg = 10 # Assignment inside the function makes egg a local variable
    print(egg)

ik() 

10


In [107]:
# Creating a global variable from inside a function

x = 10
def ty():
    global tj  # tj now refers to a global variable, not a local one
    tj = 100
    print(tj)
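The effect of the global statement can be checked by calling the functions and inspecting the variable afterwards. A minimal sketch, using the illustrative names `count`, `increment`, and `shadow`:

```python
count = 0          # global variable

def increment():
    global count   # without this line, the assignment below would create a local
    count += 1

def shadow():
    count = 99     # local variable; the global count is untouched
    return count

increment()
increment()
local_value = shadow()
print(count)        # 2
print(local_value)  # 99
```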

## IV. Exception Handling

### Lesson 11 : Try and Except Statements

* A divide-by-zero error happens when Python divides a number by zero.
* Errors cause the program to crash. (This doesn't damage your computer at all. It's just that the computer doesn't know how to carry out this instruction, so it immediately stops the program by "crashing" rather than continue.)
* An error that happens inside a try block will cause code in the except block to execute. That code can handle the error or display a message to the user so that the program can keep going.
* The try and except blocks should handle the exception when calling the function, not when defining it.

In [115]:
# Dividing number by zero gives error

def divider(x):
    return 42/x

divider(0)

ZeroDivisionError: division by zero

In [98]:
# Try and Except Method -- ZeroDivisionError

def divider(x):
    try:
        return 42/x
    except ZeroDivisionError:
        print('Dont divide by zero')
            
print(divider(0))

Dont divide by zero
None


In [141]:
# Cat Counter -- ValueError

try:
    x = int(input('How many cats do you have? '))
    if x <= 4:
        print('less cats')
    else:
        print('more cats')
except ValueError:
    print('Enter Numbers only')

How many cats do you have? six
Enter Numbers only
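The same ValueError handling can be wrapped in a small helper so that it does not depend on input(). This is only a sketch; `to_int` and its `default` parameter are hypothetical names:

```python
def to_int(text, default=None):
    # int() raises ValueError for text such as 'six'
    try:
        return int(text)
    except ValueError:
        return default

print(to_int('6'))       # 6
print(to_int('six'))     # None
print(to_int('six', 0))  # 0
```

Only ValueError is caught here, so any other kind of error would still stop the program and stay visible.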


## V. Guess the Number Game

In [None]:
import random

x = input('Hello, what is your name? ')
print('so, ' +  x + ', I am thinking of a number between 1 and 10')

secret_number = random.randint(1,10)

for i in range(1,7):
    print('take a guess')
    
    guess = int(input())
    if guess < secret_number:
        print('Too Low')
    elif guess > secret_number:
        print('Too High')
    else:
        break
        
if guess == secret_number:
    print('good job, It took you ' + str(i) + ' guesses to guess my number')
else:
    print('Exceeded guess limits, My guess number was ' + str(secret_number))

## VI. List

*A list is a built-in data structure that represents an ordered, mutable collection of elements, i.e., [1,2,3] != [3,2,1].*

### Lesson 13: List Basics

* A list is a value that contains multiple values: [42, 3.14, 'hello']
* The values in a list are called items.
* You can access items in a list with its integer index.
* The indexes start at 0, not 1.
* You can also use negative indexes: -1 refers to the last item, -2 refers to the second to last item, and so on.
* You can get multiple items from the list using a slice.
* The slice has two indexes. 
* The new list's items start at the first index and go up to, but do not include, the second index.
* The len() function, concatenation, and replication work the same way on lists that they do with strings.
* You can convert a value into a list by passing it to the list() function.

**Indexing**

In [14]:
# Basic Indexing

x = ['a','b','c']
x[0]

'a'

In [15]:
# Nested List Indexing

y = [['n','c'], 'a', 'b']

In [17]:
y[0][1]

'c'

In [18]:
# Negative Indexing

x = ['a','b','c']
x[-1]

'c'

**Slicing**

In [30]:
# Basic Slicing

x = ['a','b','c']

x[0:2] # the start index is included, the end index is excluded

['a', 'b']

In [29]:
#Negative Slicing

x = ['a','b','c']

x[-3::]

['a', 'b', 'c']

**Mutability of Lists**

In [35]:
x = ['a','b','c']
x[2:] = [1,2]
x

['a', 'b', 1, 2]

**List Functions**

In [60]:
fruits = ['apple','banana','grapes']

In [61]:
len(fruits)  #len

3

In [64]:
del fruits[2]   #del
fruits

['apple', 'banana']

In [71]:
bind = [1,2,3,4] + [5,6,7,8]  #concatenate
bind

[1, 2, 3, 4, 5, 6, 7, 8]

In [75]:
x = ['a', 'b', 'c', 'd', 'e']  #Membership

print('a' in x)
print('a' not in x)

True
False


In [82]:
x = (1,2,3)  #List conversion
list(x)

[1, 2, 3]
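Replication and a few more slice forms from the bullets are not covered by the cells above. A brief sketch (`letters`, `repeated`, and `chars` are illustrative names):

```python
letters = ['a', 'b', 'c', 'd', 'e']

print(letters[1:4])    # ['b', 'c', 'd']  (index 4 is excluded)
print(letters[:2])     # ['a', 'b']       (start defaults to 0)
print(letters[-2:])    # ['d', 'e']       (last two items)

repeated = ['x', 'y'] * 3   # replication
print(repeated)             # ['x', 'y', 'x', 'y', 'x', 'y']

chars = list('hello')       # a string becomes a list of characters
print(chars)                # ['h', 'e', 'l', 'l', 'o']
```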

### Lesson 14: For Loops with Lists, Multiple Assignment and Augmented Operators

* A for loop technically iterates over the values in a list.
* The range() function returns a list-like value, which can be passed to the list() function if you need an actual list value.
* Variables can swap their values using multiple assignment: a, b = b, a
* Augmented assignment operators like += are used as shortcuts.

In [83]:
# A for loop iterates over the items of a list, one at a time
for i in [0,1,2,3]:
    print(i)

0
1
2
3


In [91]:
# Conversion of Range to List
list(range(0,10,2))

[0, 2, 4, 6, 8]

**For loop for List with strings**

In [100]:
# How to use for loop for list containing String elements?
animals = ["Bat", "cat", "Rat", "Dog"]

for i in range(len(animals)):
    print(i, animals[i]) #index and value

0 Bat
1 cat
2 Rat
3 Dog
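An alternative to range(len(...)) is the built-in enumerate() function, which is not covered in these notes and is shown here only as a sketch. It yields the index and the item together (`pairs` is an illustrative name):

```python
animals = ['Bat', 'cat', 'Rat', 'Dog']

pairs = []
for index, animal in enumerate(animals):
    # index and animal are unpacked with multiple assignment
    print(index, animal)
    pairs.append((index, animal))
```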


**Multiple Assignment**

In [103]:
Dog = ['Slim', 'Black', 'Loud']

In [106]:
Size, Color, Bark = Dog  # Multiple Assignment

In [105]:
Size

'Slim'

In [107]:
Color

'Black'

In [108]:
Bark

'Loud'

**Swapping Values in Variables**

In [117]:
a = 'cat'
b = 'rat'

In [118]:
a, b = b, a #Swapping Values

In [119]:
a

'rat'

In [120]:
b

'cat'

**Augmented Assignment Operator**

In [122]:
x = 2 + 1
x

3

In [126]:
y = 1
y += 3  # Augmented assignment operator, same as y = y + 3
y

4

### Lesson 15: List Methods
*Methods are functions that are "called on" values.*

A method that changes a list in place returns None, so do not assign its result back: write x.append('z'), not x = x.append('z').

* The index() list method returns the index of an item in the list.
* The append() list method adds a value to the end of a list.
* The insert() list method adds a value anywhere inside a list.
* The remove() list method removes an item, specified by the value, from a list.
* The sort() list method sorts the items in a list.
* The sort() method's reverse=True keyword argument can make the sort() method sort in reverse order.
* The sort() method can't be used on a list that mixes data types such as strings and numbers - important
* These list methods operate on the list "in place", rather than returning a new list value.
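The "in place" behaviour can be seen by looking at what these methods return. A short sketch follows; the built-in sorted() function is not covered in these notes and is included only for contrast:

```python
fruits = ['banana', 'apple']

result = fruits.append('cherry')  # append() changes fruits and returns None
print(result)                     # None
print(fruits)                     # ['banana', 'apple', 'cherry']

ordered = sorted(fruits)          # sorted() returns a new list
print(ordered)                    # ['apple', 'banana', 'cherry']
print(fruits)                     # ['banana', 'apple', 'cherry'] (unchanged)

fruits.sort()                     # sort() reorders fruits itself
print(fruits)                     # ['apple', 'banana', 'cherry']
```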

**Functions vs Methods**

In [156]:
print(len('Hello'))  #Function

x = ['Hello', 'Hey', 'Hi']

x.index('Hello')  #Method because it is called on Value

5


0

In [157]:
fruits = ['apple','banana','grapes']

fruits.append('orange')  #append
fruits

['apple', 'banana', 'grapes', 'orange']

In [158]:
fruits.pop()   #pop
fruits

['apple', 'banana', 'grapes']

In [159]:
fruits.insert(1, 'Mango')  #insert (Index, New Value)
fruits

['apple', 'Mango', 'banana', 'grapes']

In [160]:
fruits.remove('apple')  #remove
fruits

['Mango', 'banana', 'grapes']

In [165]:
number = [5, 2.5, 7.15, 7, 9, 8.15] #Sorting Numbers in Asc
number.sort()
number

[2.5, 5, 7, 7.15, 8.15, 9]

In [169]:
animals = ['monkeys', 'dogs', 'zebra', 'cat'] #Sorting Strings alphabetically
animals.sort()
animals

['cat', 'dogs', 'monkeys', 'zebra']

In [170]:
number = [5, 2.5, 7.15, 7, 9, 8.15] #Sorting Numbers in Desc
number.sort(reverse = True)
number

[9, 8.15, 7.15, 7, 5, 2.5]

In [171]:
# sort() can't be used on a list that mixes strings and numbers - important

x = ['monkeys', 'dogs', 1 , 0]
x.sort()

TypeError: '<' not supported between instances of 'int' and 'str'

**Alphabetical vs ASCII-betical Order (ASCII is pronounced "askey")**

*This means strings that begin with an uppercase letter sort before strings that begin with a lowercase letter.*

In [187]:
x = ['andy', 'Andrew', 'Bob', 'bob']  # Uppercase letters sort before lowercase
x.sort()
x

['Andrew', 'Bob', 'andy', 'bob']

In [179]:
# To sort without regard to case

x = ['andy', 'Andrew', 'Bob', 'bob']  
x.sort(key = str.lower)  # Compares the lowercase form of each item
x

['Andrew', 'andy', 'Bob', 'bob']

### Lesson 16: Similarities between List and String
*Lists are Mutable while strings are not*

* Strings can do a lot of the same things lists can do, but strings are immutable.
* Immutable values like strings and tuples cannot be modified "in place".
* Mutable values like lists can be modified in place.
* Variables don't contain lists, they contain references to lists.
* When passing a list argument to a function, you are actually passing a list reference.
* Changes made to a list in a function will affect the list outside the function.
* The \ line continuation character can be used to stretch Python instruction across multiple lines.

In [193]:
# String does not support Mutability

x = 'I am a Data Analyst'
x[0] = 'U'

TypeError: 'str' object does not support item assignment

In [198]:
# List Supports Mutability

x = list(range(0,10,2))
print(x)
x[0] = 10
x

[0, 2, 4, 6, 8]


[10, 2, 4, 6, 8]

**References**

In [202]:
A = 20
B = A
A += 2

print(A)
print(B) # B is unchanged: A += 2 rebinds A to a new integer, and B still refers to 20

22
20


In [224]:
# Variables hold references to lists, not the lists themselves.
# Assigning spam to cheese copies the reference, so both names point to the same list.

spam = [1,2,3,4,5]
cheese = spam

spam[0] = 22

print(spam)
print(cheese)  #both getting updated

[22, 2, 3, 4, 5]
[22, 2, 3, 4, 5]
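The bullets also say that a list passed to a function is passed by reference. A minimal sketch of that, with a hypothetical `add_item` function, and with copy.copy() from the standard library used to get an independent list:

```python
import copy

def add_item(items):
    # items refers to the same list object as the caller's variable
    items.append('new')

spam = [1, 2, 3]
add_item(spam)
print(spam)            # [1, 2, 3, 'new']

cheese = copy.copy(spam)   # a separate (shallow) copy
cheese[0] = 'changed'
print(spam)            # [1, 2, 3, 'new']
print(cheese)          # ['changed', 2, 3, 'new']
print(spam is cheese)  # False
```

copy.copy() makes a shallow copy, so nested lists inside the copy would still be shared with the original.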




**Line Continuation Character**

In [222]:
print('Automate Boring Stuff with Python \
you can literally enjoy and relax')

Automate Boring Stuff with Python you can literally enjoy and relax


## VII. Dictionaries

*A dictionary in Python is a mutable collection of key-value pairs, indexed by key rather than by position. Each key is unique and is used to access the corresponding value in the dictionary. Dictionaries are highly efficient for lookups, insertion, and deletion of key-value pairs.*

### Lesson 17: The Dictionary Data Type

**Keys:** *Must be immutable and unique. Examples include integers, strings, and tuples.*

**Values:** *Can be mutable or immutable. Examples include integers, strings, lists, and dictionaries.*


* Dictionaries contain key-value pairs. Keys are like a list's indexes.
* Dictionaries are mutable. 
* Variables hold references to dictionary values, not the dictionary value itself.
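* Order does not matter when comparing dictionaries: two dictionaries with the same key-value pairs are equal regardless of the order in which the pairs were added. (Since Python 3.7, dictionaries do preserve insertion order when iterated.)
* The keys(), values(), and items() methods will return list-like values of a dictionary's keys, values, and both keys and values, respectively.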
* The get() method can return a default value if a key doesn't exist.
* The setdefault() method can set a value if a key doesn't exist.
* The pprint module's pprint() "pretty print" function can display a dictionary value cleanly. 
* The pformat() function returns a string value of this output.

In [234]:
#Key value pair
mycat = {'size' : 'fat', 'color' : 'orange' , 'age': [3,4]}

In [231]:
mycat['size']

'fat'

In [233]:
mycat['age'][1]

4

In [240]:
# Order matters when comparing lists, but not when comparing dictionaries

print([1,2,3] == [3,2,1]) #False, because list is ordered

print({'size' : 'fat', 'color' : 'orange'} == {'color' : 'orange', 'size' : 'fat'}) #True (order is ignored)

False
True


In [245]:
#check if key exists or not in Dict (in and not in)

dict = {'size' : 'fat', 'color' : 'orange'}

print('size' in dict) #only for keys
print('fat' in dict)   #not for values

True
False


In [247]:
# Conversion of Dict into list

#Keys
print(list(dict.keys()))

#Values
print(list(dict.values()))

#Items
print(list(dict.items()))

['size', 'color']
['fat', 'orange']
[('size', 'fat'), ('color', 'orange')]


In [265]:
# Unpacking with for loop

dict = {'size' : 'fat', 'color' : 'orange'}

for i in dict.keys():
    print(i)

for i in dict.values():
    print(i)
    
for i in dict.items():  #Returns tuples
    print(i)
    
for a,b in dict.items():
    print(a, b)

size
color
fat
orange
('size', 'fat')
('color', 'orange')
size fat
color orange


In [269]:
# Checking Membership

dict = {'size' : 'fat', 'color' : 'orange'}

'fat' in dict.values()

'size' in dict.keys()

if 'size' in dict.keys():
    print(dict.values())

dict_values(['fat', 'orange'])


**get Method**

In [272]:
category = {'size' : 'fat', 'color' : 'orange', 'age' : 8}

In [276]:
category.get('speed', 0) #returns 0 if not found

0

**setdefault**

In [278]:
category.setdefault('size', 'thin') # 'thin' is stored only if the key 'size' is missing; here the existing value is returned

'fat'
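The difference between get() and setdefault() shows up when the key is missing. A short sketch, reusing the `category` dictionary from above with an illustrative 'speed' key:

```python
category = {'size': 'fat', 'color': 'orange', 'age': 8}

# get() only reads; the dictionary is not changed
print(category.get('speed', 0))          # 0
print('speed' in category)               # False

# setdefault() adds the key only when it is missing
print(category.setdefault('speed', 10))  # 10
print(category.setdefault('speed', 99))  # 10 (existing value is kept)
print(category['speed'])                 # 10
```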

**Count Repetition of Letters in String**

In [289]:
m = 'a big fat hen'

count = {}

for character in m.upper():
    count.setdefault(character, 0)
    count[character] += 1

count


{'A': 2,
 ' ': 3,
 'B': 1,
 'I': 1,
 'G': 1,
 'F': 1,
 'T': 1,
 'H': 1,
 'E': 1,
 'N': 1}

**pprint function**

*Pretty print function*
The pprint module's pprint() "pretty print" function can display a dictionary value cleanly

In [294]:
import pprint

m = 'a big fat hen'

count = {}

for character in m.upper():
    count.setdefault(character, 0)
    count[character] += 1

pprint.pprint(count)

{' ': 3, 'A': 2, 'B': 1, 'E': 1, 'F': 1, 'G': 1, 'H': 1, 'I': 1, 'N': 1, 'T': 1}


**pformat function**

*The pformat() function returns a string value of this output.*

In [302]:
m = 'a big fat hen'

count = {}

for character in m.upper():
    count.setdefault(character, 0)
    count[character] += 1

pprint.pformat(count)

"{' ': 3, 'A': 2, 'B': 1, 'E': 1, 'F': 1, 'G': 1, 'H': 1, 'I': 1, 'N': 1, 'T': 1}"

### Lesson 18: Data Structures

In [349]:
#Tic Tac Toe Game

theboard = {
'top-L' : 'O',
'top-M' : 'X',
'top-R' : 'O',
'mid-L' : 'O',
'mid-M' : 'X',
'mid-R' : 'X',
'low-L' : 'X',
'low-M' : 'O',
'low-R' : 'O'
}

In [350]:
def printboard(x):
    print(x['top-L'] + '|' + x['top-M'] + '|' + x['top-R'])
    print('-----')
    print(x['mid-L'] + '|' + x['mid-M'] + '|' + x['mid-R'])
    print('-----')
    print(x['low-L'] + '|' + x['low-M'] + '|' + x['low-R'])

In [351]:
printboard(theboard)

O|X|O
-----
O|X|X
-----
X|O|O


**type Function**

In [348]:
print(type(6))
print(type('ab'))
print(type(1.2))
print(type(2j))
print(type(True))

<class 'int'>
<class 'str'>
<class 'float'>
<class 'complex'>
<class 'bool'>


## VIII. More about Strings

### Lesson 19: Advanced String Syntax

* Strings are enclosed by a pair of single quotes or double quotes (as long as the same kind are used).
* Escape characters let you put quotes and other characters that are hard to type inside strings.
* Raw strings (which have the r prefix before the first quote) will literally print any backslashes in the string and ignore escape characters.
* Multiline strings begin and end with three quotes, and can span multiple lines.
* Indexes, slices, and the "in" and "not in" operators all work with strings.

In [355]:
#escape character

'that is alice\'s cat'

"that is alice's cat"


In [360]:
# new line

print('Hey\nHow are you\nHow is everything going')

Hey
How are you
How is everything going


**raw string**

In [363]:
# raw string helps us literally print even escape characters

print(r'Hey\nHow are you\nHow is everything going') #raw character

Hey\nHow are you\nHow is everything going


**multiple line string**

In [367]:
print("""This is useful for really 
really
large string""")

This is useful for really 
really
large string


**Similarities between lists and strings**

In [381]:
spam = 'Hello world'

spam[0] #same indexing
spam[0:6] #same slicing
'Hello' in spam #Membership also works

True

### Lesson 20: String Methods

* upper() and lower() return an uppercase or lowercase string.
* isupper(), islower(), isalpha(), isalnum(), isdecimal(), isspace(), istitle() return True or False depending on whether the string is uppercase, lowercase, alphabetical letters only, and so on.
* startswith() and endswith() also return bools.
* ','.join(['cat', 'dog']) returns a string that combines the strings in the given list.
* 'Hello world'.split() returns a list of strings split from the string it's called on.
* rjust(), ljust(), center() return a string padded with spaces.
* strip(), rstrip(), lstrip() return a string with whitespace stripped off the sides.
* replace() will replace all occurrences of the first string argument with the second string argument.

In [388]:
x = 'Hello world'

print(x.upper())
print(x.lower())

HELLO WORLD
hello world


In [399]:
# Solving Case of input

x = input('do you wana code? ')

if x.lower() == 'yes':
    print('Thanks for help!')

do you wana code? YES
Thanks for help!


In [402]:
# isupper

x = 'Hello world'
x.isupper()

False

In [406]:
# isalpha() is True only when every character is a letter (no digits or spaces)

x = 'Helloworld'
x.isalpha()

True

In [408]:
# isalnum means is alphanumerical

x = 'Helloworld12'
x.isalnum()

True

In [418]:
# isspace() is True only when every character in the string is whitespace

x = ' '
x.isspace()

True

In [423]:
# Start with , End with

print('Helloword'.startswith('H'))
print('Helloword'.endswith('d'))

True
True


In [440]:
# Join String

print('ing '.join(['sleep', 'eat', 'beat']))

print('\n\n'.join(['sleep', 'eat', 'beat']))

sleeping eating beat
sleep

eat

beat


In [443]:
# Split String

'There is a data, trick'.split(',')

['There is a data', ' trick']

In [None]:
# Strip String

x = ' There '
print(x.strip())

# lstrip = left strip

y = ' There '
print(y.lstrip())

# rstrip = right strip

z = ' There '
print(z.rstrip())

In [454]:
# Replace String

x = 'Hello There'

x.replace('Hello', 'Hey')

'Hey There'
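The padding methods and a couple of the is...() checks from the bullets have no cell above. A small sketch (the variable names are illustrative; repr() is used so the padding is visible):

```python
right = 'Hello'.rjust(10)          # pad on the left with spaces
left = 'Hello'.ljust(10, '.')      # pad on the right with a fill character
middle = 'Hello'.center(11, '=')   # pad on both sides

print(repr(right))    # '     Hello'
print(repr(left))     # 'Hello.....'
print(repr(middle))   # '===Hello==='

print('123'.isdecimal())         # True
print('Hello World'.istitle())   # True
print('hello World'.istitle())   # False
```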

### Lesson 21: String Formatting

In [460]:
name = 'Tajamul'
address = 'Main colony'

f' I think I know u, Are u {name} from {address}'

' I think I know u, Are u Tajamul from Main colony'

## IX. Regular Expressions

*A regular expression (regex or regexp) is a sequence of characters that specifies a text pattern, used to search, match, replace, and split text.*



### Lesson 23: Regex Basics

* Writing code to do pattern matching without regular expressions is a huge pain.
* Regex strings often use backslashes (like \d), so they are often written using raw strings: r'\d'
* \d is the regex for a numeric digit character.
* Import the re module first.
* Call the re.compile() function to create a regex object.
* Call the regex object's search() method to create a match object.
* Call the match object's group() method to get the matched string.

**Search**

In [346]:
import re
message = 'My phone number is 415-555-1011 and alternate will be 415-504-1022'

phoneregex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')  #\d used for digit search
mo = phoneregex.search(message)
print(mo)

<re.Match object; span=(19, 31), match='415-555-1011'>


**findall**

In [88]:
phoneregex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')  #\d used for digit search
print(phoneregex.findall('My phone number is 415-555-1011 and alternate will be 415-504-1022'))

['415-555-1011', '415-504-1022']
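The search cell above prints the match object itself. To get the matched text, call group(), and note that search() returns None when nothing matches. A minimal sketch, with a hypothetical helper `first_phone`:

```python
import re

phone_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

def first_phone(text):
    mo = phone_regex.search(text)   # a match object, or None
    if mo is None:
        return None
    return mo.group()               # the matched text

print(first_phone('Call 415-555-1011 today'))  # 415-555-1011
print(first_phone('No number here'))           # None
```

Checking for None first avoids an AttributeError from calling group() on a failed search.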


### Lesson 24: Regex Groups and the Pipe Character
* Groups are created in regex strings with parentheses.
* The first set of parentheses is group 1, the second is 2, and so on.
* Calling group() or group(0) returns the full matching string, group(1) returns group 1's matching string, and so on.
* Use \( and \) to match literal parentheses in the regex string.
* The | pipe can match one of many possible groups.

**Group**

In [68]:
message = 'My phone number is 415-555-1011 and alternate will be 415-504-1022'

phoneregex = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)')
mo = phoneregex.search(message)

print(mo.group(1))
print(mo.group(2))
print(mo.group(3))

415
555
1011


**Searching Parenthesis**

In [69]:
message = 'My phone number is (415) 555-1011'

regexp = re.compile(r'\(\d\d\d\) \d\d\d-\d\d\d\d')
regexp.search(message)

<re.Match object; span=(19, 33), match='(415) 555-1011'>

**| Pipe Character**

In [70]:
regexp = re.compile(r'Hel(icopter|ipad|lo)')

mo = regexp.search('there was a Helicopter')
print(mo.group(1)) # group 1: the alternative that matched
print(mo.group()) 
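As a further sketch of groups and the pipe, the match object's groups() method returns all groups at once as a tuple, and with a pipe the leftmost match in the text is the one returned. The sample strings and variable names here are illustrative:

```python
import re

phone_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
mo = phone_regex.search('My number is 415-555-1011')

print(mo.group())     # 415-555-1011
print(mo.groups())    # ('415', '555-1011')
area, main = mo.groups()   # multiple assignment unpacks the tuple

hel_regex = re.compile(r'Hel(icopter|ipad|lo)')
mo2 = hel_regex.search('Hello there, Helicopter')
print(mo2.group())    # Hello
print(mo2.group(1))   # lo
```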

### Lesson 25: Repetition and Greedy/Nongreedy matching

* The ? says the group matches zero or one times.
* The * says the group matches zero or more times.
* The + says the group matches one or more times.
* The curly braces can match a specific number of times.
* The curly braces with two numbers matches a minimum and maximum number of times.
* Leaving out the first or second number in the curly braces says there is no minimum or maximum.
* "Greedy matching" matches the longest string possible, "nongreedy matching" (or "lazy matching") matches the shortest string possible.
* Putting a question mark after the curly braces makes it do a nongreedy/lazy match.
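The difference between greedy and nongreedy matching, and the open-ended forms of the curly braces, can be sketched with findall() and search(). The sample strings and variable names are illustrative:

```python
import re

digits = '123456789'

greedy = re.findall(r'\d{3,5}', digits)   # takes as many digits as allowed
lazy = re.findall(r'\d{3,5}?', digits)    # takes as few digits as allowed
print(greedy)   # ['12345', '6789']
print(lazy)     # ['123', '456', '789']

laugh = 'HaHaHaHaHaHa'
at_least_three = re.search(r'(Ha){3,}', laugh).group()  # no maximum
at_most_two = re.search(r'(Ha){,2}', laugh).group()     # no minimum
print(at_least_three)  # HaHaHaHaHaHa
print(at_most_two)     # HaHa
```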

**star character**

In [106]:
batregex = re.compile(r'Bat(wo)*man')  #wo can appear 0 or more times

In [347]:
m1 = batregex.search('She is a Batwowoman')
print(m1.group())

m1 = batregex.search('She is a Batman')
print(m1.group())

Batwowoman
Batman


In [117]:
# Escaping special characters (+, *, ?) with a backslash to match them literally

plusregex = re.compile(r'\+\*\?\d\d')
plusregex.search('my number starts with +*?91')

**Quantifiers**

* '?' = 0 or 1
* '*' = 0 or more
* '+' = 1 or more
* {x} = exactly x times

In [350]:
# ? for 0 or 1 (here it applies only to the preceding e, so 'nam' or 'name' matches)
Qregex = re.compile(r'name?')
Qregex.search('his name is')

<re.Match object; span=(4, 8), match='name'>

In [352]:
# + for 1 or more
plusregex = re.compile(r'(name)+')
plusregex.findall('his name name is joy')

['name', 'name']

In [354]:
# {x} for exact iterations
exactregex = re.compile(r'(name){3}')
exactregex.search('his namenamename is joy')

<re.Match object; span=(4, 16), match='namenamename'>

In [159]:
# {x,y} range of desired outcomes 
exactregex = re.compile(r'(name){3,5}')
exactregex.search('his namenamenamename is joy')

<re.Match object; span=(4, 20), match='namenamenamename'>

**Greedy Match**

In [157]:
greedyregex = re.compile(r'(\d){3,5}')
greedyregex.search('123456789')  #It will return maximum result hence greedy

<re.Match object; span=(0, 5), match='12345'>

**Non Greedy Match**

In [158]:
nongreedyregex = re.compile(r'(\d){3,5}?')
nongreedyregex.search('123456789')  # It returns the shortest possible match, hence non-greedy

<re.Match object; span=(0, 3), match='123'>

### Lesson 26: Regex Character Classes and findall() Method

* The regex method findall() is passed a string, and returns all matches in it, not just the first match.
* If the regex has 0 or 1 group, findall() returns a list of strings.
* If the regex has 2 or more groups, findall() returns a list of tuples of strings.
* \d is a shorthand character class that matches digits. \w matches "word characters" (letters, numbers, and the underscore). \s matches whitespace characters (space, tab, newline).
* The uppercase shorthand character classes \D, \W, and \S match characters that are not digits, word characters, and whitespace.
* You can make your own character classes with square brackets: [aeiou]
* A ^ caret makes it a negative character class, matching anything not in the brackets: [^aeiou]

**findall**

In [164]:
phoneregex = re.compile(r'\d\d\d-\d\d\d')

resume = """my number is 123-456 and my friend's number is 789-012"""
phoneregex.findall(resume)

['123-456', '789-012']

In [167]:
#Searching in groups

phoneregex = re.compile(r'((\d\d\d)-(\d\d\d))')

resume = """my number is 123-456 and my friend's number is 789-012"""
phoneregex.findall(resume) #Three groups one outer, 2 inner

[('123-456', '123', '456'), ('789-012', '789', '012')]

**Regex Character Classes**

Hint
* A lowercase shorthand (\d, \w, \s) matches that class.
* The uppercase version (\D, \W, \S) matches everything except that class.

![image.png](attachment:image.png)

In [175]:
# Digits + Words

sentence = 'There were 12 friends, and 2 dogs, 3 coffee mugs and 1 orange'
DigitRegex = re.compile(r'\d+\s\w+')
DigitRegex.findall(sentence)

['12 friends', '2 dogs', '3 coffee', '1 orange']

**Creating own Regex Character Class**

Syntax: RegexObj = re.compile(r'[text]'). For example, r'[aeiou]' is similar to r'(a|e|i|o|u)'.

In [182]:
#Creating Vowel regex character

RVC = re.compile(r'[aeiouAEIOU]')
RVC.findall('A quick Brown Fox runs over a lAzy dOg')

['A', 'u', 'i', 'o', 'o', 'u', 'o', 'e', 'a', 'A', 'O']

**Ignoring Case = re.IGNORECASE (re.I)**

In [290]:
#Case Insensitive Matching 

RVC = re.compile(r'[aeiou]', re.I)  # re.I makes the matching case-insensitive
RVC.findall('A quick Brown Fox runs over a lAzy dOg')

['A', 'u', 'i', 'o', 'o', 'u', 'o', 'e', 'a', 'A', 'O']

In [291]:
#Creating Double Vowel regex character

DRVC = re.compile(r'[aeiouAEIOU]{2}')
DRVC.findall('A quick Brown Fox runs over a lAAzy dOOg')

['ui', 'AA', 'OO']

In [292]:
#Creating Not in Vowel regex character with ^ Caret symbol

RVC = re.compile(r'[^aeiouAEIOU]')
RVC.findall('A quick Brown')  #none of the vowels

[' ', 'q', 'c', 'k', ' ', 'B', 'r', 'w', 'n']

### Lesson 27: Regex Dot-Star and the Caret/Dollar Characters

* ^ means the string must start with the pattern, $ means the string must end with the pattern. Using both means the entire string must match the entire pattern.
* The . dot is a wildcard; it matches any character except newlines.
* Pass re.DOTALL as the second argument to re.compile() to make the . dot match newlines as well.
* Pass re.I as the second argument to re.compile() to make the matching case-insensitive.

**^ this pattern begins**

In [309]:
RgexS = re.compile(r'^Hello')
Sent = ('Hello smile Hello darkness friend') # only the first Hello matches

RgexS.search(Sent)

<re.Match object; span=(0, 5), match='Hello'>

**$ this pattern ends**

In [310]:
RgexE = re.compile(r'world$')
Sent = ('Hello world') # matches only because the string ends with world

RgexE.search(Sent)

<re.Match object; span=(6, 11), match='world'>

**using ^ and $ Together**

In [311]:
RgexSE = re.compile(r'^Hello world$')
Sent = ('Hello world') # the entire string must match the pattern

RgexSE.search(Sent)

<re.Match object; span=(0, 11), match='Hello world'>
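A common use of ^ and $ together is checking that a whole string has a given shape. The sketch below is an illustrative addition (the pattern and inputs are assumptions, not from the original notes) that accepts only strings made entirely of digits.

```python
import re

# ^ anchors the start and $ anchors the end, so nothing else may appear
whole_number = re.compile(r'^\d+$')

all_digits = bool(whole_number.search('12345'))
trailing_text = bool(whole_number.search('123abc'))
leading_text = bool(whole_number.search('abc123'))

print(all_digits)     # True
print(trailing_text)  # False
print(leading_text)   # False
```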

**dot . = Any SINGLE character except a newline**

In [312]:
RgexD = re.compile(r'.at')
Sent = ('cat sat on a flat mat')

RgexD.findall(Sent) # 'flat' comes back as 'lat' because the dot matches exactly one character

['cat', 'sat', 'lat', 'mat']

In [313]:
# One or two characters with dot

RgexDD = re.compile(r'.{1,2}at') # {1,2} lets the dot match one or two characters
Sent = ('cat sat on a flat mat')

RgexDD.findall(Sent) 

['cat', ' sat', 'flat', ' mat']

**dot-star (.*) = Any Run of Characters**

In [314]:
x = 'First Name: Tajamul, Last Name: Khan'
dotstarregex = re.compile(r'First Name: (.*) Last Name: (.*)')
dotstarregex.findall(x)

[('Tajamul,', 'Khan')]

In [315]:
# Dot-star is greedy by default

y = '<he is my boss> he is truely amazing>'

Dotstarreg = re.compile(r'<.*>') # to make it non-greedy, put ? after the *
Dotstarreg.findall(y)

['<he is my boss> he is truely amazing>']
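For comparison, here is a minimal sketch of the non-greedy form on the same string. The variable names are chosen for illustration only.

```python
import re

y = '<he is my boss> he is truely amazing>'  # same string as the cell above

greedy = re.compile(r'<.*>')       # .* grabs as much text as possible
non_greedy = re.compile(r'<.*?>')  # .*? stops at the first closing >

greedy_result = greedy.findall(y)
non_greedy_result = non_greedy.findall(y)

print(greedy_result)      # ['<he is my boss> he is truely amazing>']
print(non_greedy_result)  # ['<he is my boss>']
```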

In [316]:
# The dot does not match a newline by default; re.DOTALL changes that

tex = 'hey!\nhow are you\nhow is it going'

rege = re.compile(r'.*', re.DOTALL)  # DOTALL lets the dot match newlines too
rege.search(tex)

<re.Match object; span=(0, 32), match='hey!\nhow are you\nhow is it going'>

### Lesson 28: Regex sub() Method and Verbose Mode

* The sub() regex method will substitute matches with some other text.
* Using \1, \2 and so on in the replacement string will substitute the text of group 1, 2, etc. from the regex pattern.
* Passing re.VERBOSE lets you add whitespace and comments to the regex string passed to re.compile().
* If you want to pass multiple arguments (re.DOTALL , re.IGNORECASE, re.VERBOSE), combine them with the | bitwise operator.
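The \1 back-reference mentioned above has no example in these notes, so here is a small sketch. The sample sentence and the masking format are assumptions made for illustration.

```python
import re

text = 'Agent Alice gave the documents to Agent Bob'  # made-up sample text

# Group 1 captures only the first letter of each name
agent_regex = re.compile(r'Agent (\w)\w*')

# \1 in the replacement re-inserts whatever group 1 matched
censored = agent_regex.sub(r'Agent \1****', text)

print(censored)  # Agent A**** gave the documents to Agent B****
```

The replacement is written as a raw string so that \1 reaches the regex engine unchanged.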

**Matching Word Character**

In [323]:
tex = 'Mr Tajamul is also known as Mr Khan'
regexp = re.compile(r'Mr \w+')

regexp.findall(tex)

['Mr Tajamul', 'Mr Khan']

**sub Method = Find and Replace**

In [326]:
tex = 'Mr Tajamul is also known as Mr Khan'
regexp = re.compile(r'Mr \w+')

regexp.sub('Anonymus',tex) #Finds and replaces value

'Anonymus is also known as Anonymus'

**Verbose Format**

In [343]:
#Can ignore whitespaces and add comments

number = '(630)-1234 , (630)-1245'
regexp = re.compile(r'''
\(\d\d\d\) #Area code
-
\d\d\d\d  #Extension''' 
, re.VERBOSE)

regexp.findall(number)

['(630)-1234', '(630)-1245']

**Passing multiple Options**

In [345]:
number = '(630)-1234 , (630)-1245'
regexp = re.compile(r'''
\(\d\d\d\) #Area code
-
\d\d\d\d''' 
, re.VERBOSE | re.I | re.DOTALL)

regexp.findall(number)

['(630)-1234', '(630)-1245']

### Lesson 29: Regex Example Program: Phone and Email Scraper

In [52]:
pdf = """EXAMPLE PHONE AND EMAIL
DIRECTORY
This example PDF was created to practice writing programs that could automatically get phone numbers
and email addresses from PDFs.
You can learn to program with the free resources at https://inventwithpython.com
PUBLIC DOMAIN IMAGE OF THE SEAL OF APPROVAL
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et
dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore
eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
deserunt mollit anim id est laborum."
Jessie Mckay jmckay67@aol.com 479-205-4874
Tom Jordan tjordan@msn.com 678-560-3485
Clayton Cross ccross20@gmail.com 724-900-2986
Rayford Sutton rayfords66@hotmail.com 242-391-3183
Jerome Gentry jgentry@me.com 604-720-6426
Weldon Camacho wcamacho57@icloud.com 651-807-8065
Quinton Franks qfranks@comcast.net 209-754-9111
Adam Hubbard cygzfjd61@outlook.com 641-433-6698
Jarred Fox jfox39@live.com 701-528-9851
Arnoldo Parker aparker39@sbcglobal.net 304-491-9583
Sid Mcdaniel mcdanie3354@att.net 863-583-8107
Raymon Combs uqcwsntti71@att.net 507-948-3980
Ervin Francis efrancis@optonline.net 546-367-3454
Gilberto Austin austi363@optonline.net 321-854-5616
Lino Barlow lbarlow22@me.com 904-896-2920
Stacey Shepherd sshepherd61@sbcglobal.net 309-387-1990
Roscoe Terry rterry64@outlook.com 605-373-2329
Eddie Meadows eddiem89@yahoo.com 573-454-1209
Carlos Simpson csimpson8@verizon.net 252-822-2439
"""

**Phone Scraper**

In [58]:
import re 
Phoneregex = re.compile(
r'''
(
((\d\d\d) | (\(\d\d\d\)))?    # Optional area code: can be 123 or (123)
(\s|-)?                       # Optional separator: space or dash
\d\d\d                        # First 3 digits of the phone number
-                             # Separator: dash
\d\d\d\d                      # Last 4 digits of the phone number
(((ext(\.)?\s) | x)          # Optional extension prefix: ext, ext., or x
(\d{2,5}))?                   # Optional extension number: 2 to 5 digits
)
''', re.VERBOSE)

In [59]:
phoneNumbers = Phoneregex.findall(pdf)

In [60]:
allPhoneNumbers = []
for Number in phoneNumbers:
    allPhoneNumbers.append(Number[0])

In [56]:
allPhoneNumbers

['479-205-4874',
 '678-560-3485',
 '724-900-2986',
 '242-391-3183',
 '604-720-6426',
 '651-807-8065',
 '209-754-9111',
 '641-433-6698',
 '701-528-9851',
 '304-491-9583',
 '863-583-8107',
 '507-948-3980',
 '546-367-3454',
 '321-854-5616',
 '904-896-2920',
 '309-387-1990',
 '605-373-2329',
 '573-454-1209',
 '252-822-2439']

**Email Scraper**

In [65]:
import re 

emailregex = re.compile(
    r'''
    [a-zA-Z0-9_.+]+  #name
    @                #symbol
    [a-zA-Z0-9_.+]+  #domain name part
    ''', re.VERBOSE)

In [67]:
emails = emailregex.findall(pdf)
emails

['jmckay67@aol.com',
 'tjordan@msn.com',
 'ccross20@gmail.com',
 'rayfords66@hotmail.com',
 'jgentry@me.com',
 'wcamacho57@icloud.com',
 'qfranks@comcast.net',
 'cygzfjd61@outlook.com',
 'jfox39@live.com',
 'aparker39@sbcglobal.net',
 'mcdanie3354@att.net',
 'uqcwsntti71@att.net',
 'efrancis@optonline.net',
 'austi363@optonline.net',
 'lbarlow22@me.com',
 'sshepherd61@sbcglobal.net',
 'rterry64@outlook.com',
 'eddiem89@yahoo.com',
 'csimpson8@verizon.net']

## X. Files

### Lesson 30: File Names and Absolute/Relative File Path

* Files have a name and a path.
* The root folder is the top-level folder that contains everything else: C:\ on Windows and / on macOS and Linux.
* In a file path, the folders and filename are separated by backslashes on Windows and forward slashes on Linux and Mac.
* Use the os.path.join() function to combine folders with the correct slash.
* The current working directory is the folder that any relative paths are relative to.
* os.getcwd() will return the current working directory.
* os.chdir() will change the current working directory.
* Absolute paths begin with the root folder, relative paths do not.
* The . folder represents "this folder", the .. folder represents "the parent folder".
* os.path.abspath() returns an absolute path form of the path passed to it.
* os.path.relpath() returns the relative path between two paths passed to it.
* os.makedirs() can make folders.
* os.path.getsize() returns a file's size.
* os.listdir() returns a list of strings of filenames.
* os.path.exists() returns True if the filename passed to it exists.
* os.path.isfile() returns True if the path passed to it is an existing file; os.path.isdir() returns True if it is an existing folder.
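Several of the functions listed above can be tried safely inside a temporary folder. This is a minimal sketch; the file name notes.txt and the use of the tempfile module are assumptions added for illustration, so nothing on your real disk is touched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build the path with the correct separator for this operating system
    file_path = os.path.join(tmp, 'notes.txt')  # hypothetical file name
    with open(file_path, 'w') as f:
        f.write('hello')

    exists = os.path.exists(file_path)       # True: the file was just created
    is_file = os.path.isfile(file_path)      # True: it is a file
    file_is_dir = os.path.isdir(file_path)   # False: a file is not a folder
    is_dir = os.path.isdir(tmp)              # True: tmp is a folder
    size = os.path.getsize(file_path)        # 5 bytes for 'hello'
    names = os.listdir(tmp)                  # ['notes.txt']

print(exists, is_file, file_is_dir, is_dir, size, names)
```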

In [1]:
r'/Users/tajamulk2/Downloads/lesson6-recap.txt'

'/Users/tajamulk2/Downloads/lesson6-recap.txt'

**os library**

The os module in Python helps you work with your computer's operating system. It lets you do things like create, delete, and move files and folders, get information about your system, and manage processes. 

It acts as a bridge between your Python program and the computer's operating system.

In [146]:
import os 

# To combine folder with file name

os.path.join('folder1', 'folder2', '.png')

'folder1/folder2/.png'

In [3]:
# current working directory

os.getcwd()

'/Users/tajamulk2'

In [4]:
# change directory

os.chdir('/Users/tajamulk2/Downloads/')
os.getcwd()

'/Users/tajamulk2/Downloads'

**Absolute and Relative File Path**

In [119]:
# Absolute file path = complete path, starting from the root folder

'/Users/tajamulk2/Downloads/Hey.png'

# Relative file path = incomplete path, interpreted relative to the current working directory

'Downloads/Hey.png'

![image.png](attachment:image.png)

In [7]:
# get absolute and relative paths

print(os.path.abspath('shinchan.png'))

# relpath(path, start) returns the route from start to path
os.path.relpath('/Users/tajamulk2/Downloads', '/Users/tajamulk2')

/Users/tajamulk2/Downloads/shinchan.png


'Downloads'

In [133]:
#conversion of relative file path into abs

os.path.abspath('../../shinchan.png')

'/Users/shinchan.png'

**dirname basename**

In [148]:
# dirname = everything before the last slash (the folder part)
print(os.path.dirname('Users/tajamulk2/folder1/del.img'))

# basename = everything after the last slash (the file name)
print(os.path.basename('Users/tajamulk2/folder1/del.img'))

Users/tajamulk2/folder1
del.img


**exists**

In [151]:
os.path.exists('/Users/tajamulk2/Downloads/lesson6-recap.txt')

True

**size**

In [152]:
os.path.getsize('/Users/tajamulk2/Downloads/lesson6-recap.txt')

434

**list**

In [32]:
#lists all files and folders
os.listdir('/Users/tajamulk2/Downloads/P3-BFSI-Home-Loan-Data-Analysis-with-Power-BI-main')

['README.md', 'home_loan.pbix', '1.png']

**create folder AKA directory**

In [164]:
os.makedirs('/Users/tajamulk2/Downloads/new/walnut/waffles')

### Lesson 31: Reading and Writing Plain Text Files

* The open() function will return a file object which has reading- and writing-related methods.
* Pass 'r' (or nothing) to open() to open the file in read mode. Pass 'w' for write mode. Pass 'a' for append mode.
* Opening a nonexistent filename in write or append mode will create that file.
* Call read() or write() to read the contents of a file or write a string to a file.
* Call readlines() to return a list of strings of the file's content.
* Call close() when you are done with the file.
* The shelve module can store Python values in a binary file.
* The shelve.open() function returns a dictionary-like shelf value.
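The three modes can be compared side by side in a throwaway folder. This is a sketch under stated assumptions: the with statement and the tempfile module are additions not used elsewhere in these notes (with closes the file automatically), and demo.txt is a hypothetical file name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')  # hypothetical file name

    with open(path, 'w') as f:   # write mode: creates or overwrites the file
        f.write('first line\n')

    with open(path, 'a') as f:   # append mode: adds to the end
        f.write('second line\n')

    with open(path) as f:        # read mode is the default
        whole = f.read()         # one single string

    with open(path) as f:
        lines = f.readlines()    # a list with one string per line

print(repr(whole))  # 'first line\nsecond line\n'
print(lines)        # ['first line\n', 'second line\n']
```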

**open read & close text file**

In [59]:
# open file
file = open('/Users/tajamulk2/Downloads/lesson26-recap.txt')

# read returns single string
file.read()

''

In [60]:
# readlines returns list of strings

file = open('/Users/tajamulk2/Downloads/lesson26-recap.txt')

file.readlines()

[]

In [61]:
# closing the file

file.close()

**Writing and Appending Text Files**

In [62]:
# Write mode: overwrites the existing content

nfile = open('/Users/tajamulk2/Downloads/lesson26-recap.txt', 'w')
nfile.write('naanana')

7

In [63]:
# Append mode: adds to the end without overwriting the existing content

nfile = open('/Users/tajamulk2/Downloads/lesson26-recap.txt', 'a')
nfile.write('naanana')

7

**Shelve Module**

Shelf files store Python values (lists, dictionaries and so on) in a binary file on disk so they can be loaded again later. A shelf behaves much like a dictionary: data is stored and retrieved by key.

In [64]:
import shelve

shelf_file = shelve.open('mydata')
shelf_file['cats'] = ['Hurair', 'Tiku']
shelf_file.close()  # The parentheses are needed to actually call close

nshelf_file = shelve.open('mydata')
nshelf_file['cats']

print(list(nshelf_file.keys()))
list(nshelf_file.values())

['cats']


[['Hurair', 'Tiku']]
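A shelf can also be opened in a with block so it is closed automatically. The sketch below is an illustrative variant of the cell above; storing the shelf in a temporary folder is an assumption made so the example leaves no files behind.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    shelf_path = os.path.join(tmp, 'mydata')

    # Store a value; the shelf is closed when the with block ends
    with shelve.open(shelf_path) as shelf:
        shelf['cats'] = ['Hurair', 'Tiku']

    # Reopen the shelf and read the value back
    with shelve.open(shelf_path) as shelf:
        keys = list(shelf.keys())
        cats = shelf['cats']

print(keys)  # ['cats']
print(cats)  # ['Hurair', 'Tiku']
```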

### Lesson 32: Copying and Moving Files and Folders

shutil is the module used to copy, move, and rename files and folders.

In [66]:
import shutil

In [69]:
# Copy File into another Folder
shutil.copy('/Users/tajamulk2/Downloads/lesson26-recap.txt', '/Users/tajamulk2/Desktop/')

'/Users/tajamulk2/Desktop/lesson26-recap.txt'

In [71]:
# Copy and Rename File into another Folder
shutil.copy('/Users/tajamulk2/Downloads/lesson26-recap.txt', '/Users/tajamulk2/Desktop/kuuch.txt')

'/Users/tajamulk2/Desktop/kuuch.txt'

In [73]:
# Copy Entire Folder and Content
shutil.copytree('/Users/tajamulk2/Downloads', '/Users/tajamulk2/Desktop/Downloads backup')

'/Users/tajamulk2/Desktop/Downloads backup'

In [76]:
# Moving Files from one folder to other
shutil.move('/Users/tajamulk2/Desktop/Downloads backup/02.jpeg', '/Users/tajamulk2/Desktop')

'/Users/tajamulk2/Desktop/02.jpeg'

In [77]:
# Moving and Renaming Files from one folder to other
shutil.move('/Users/tajamulk2/Desktop/02.jpeg', '/Users/tajamulk2/Desktop/meme.jpeg')

'/Users/tajamulk2/Desktop/meme.jpeg'

### Lesson 33: Deleting Files

* os.unlink() will delete a file.
* os.rmdir() will delete a folder (but the folder must be empty).
* shutil.rmtree() will delete a folder and all its contents.
* Deleting can be dangerous, so do a "dry run" first.
* send2trash.send2trash() will send a file or folder to the recycling bin.

In [79]:
# Deleting single file
os.unlink('/Users/tajamulk2/Desktop/meme.jpeg')

In [83]:
# Deleting an empty folder
os.rmdir('/Users/tajamulk2/Desktop/Empty Folder') #Folder has to be completely Empty

In [84]:
# Deleting Entire Folder
shutil.rmtree('/Users/tajamulk2/Desktop/Downloads backup')

In [93]:
# Deleting files based on a condition

for file in os.listdir(): # First have a dry run 
    if file.endswith('.txt'):
        print(file)

for file in os.listdir(): # Then delete files
    if file.endswith('.txt'):
        os.unlink(file)

lesson33-recap.txt


In [96]:
os.getcwd()

'/Users/tajamulk2/Desktop/Test Folder'

In [97]:
# Sending Files to Trash Can

import send2trash

send2trash.send2trash('/Users/tajamulk2/Desktop/Test Folder/1.jpeg')

### Lesson 34: Walking a Directory Tree

In [104]:
for folder, subfolder, file in os.walk('/Users/tajamulk2/Desktop/Test Folder/'):
    print(folder)
    print(str(subfolder))
    print(str(file))
    print()

/Users/tajamulk2/Desktop/Test Folder/
['Sub Test']
['.DS_Store', 'videoplayback.mp4']

/Users/tajamulk2/Desktop/Test Folder/Sub Test
[]
[]



In [106]:
# Removing .py files from a folder and its subfolders
for folder, subfolder, file in os.walk('/Users/tajamulk2/Desktop/Test Folder/'):
    for f in file:
        if f.endswith('.py'):
            os.unlink(os.path.join(folder, f))  # os.walk yields bare names, so join with the folder
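Walking a tree and the "dry run first" advice from Lesson 33 combine naturally. The sketch below is one possible way to do it; the function name delete_by_extension, its dry_run parameter, and the temporary test files are all hypothetical and not part of the original notes.

```python
import os
import tempfile

def delete_by_extension(root, extension, dry_run=True):
    """Return matching paths under root; delete them only if dry_run is False."""
    matched = []
    for folder, subfolders, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extension):
                full_path = os.path.join(folder, name)  # folder + bare file name
                matched.append(full_path)
                if not dry_run:
                    os.unlink(full_path)
    return matched

with tempfile.TemporaryDirectory() as tmp:
    # Create a small throwaway tree: a.py, sub/b.py and keep.txt
    os.makedirs(os.path.join(tmp, 'sub'))
    for rel in ('a.py', os.path.join('sub', 'b.py'), 'keep.txt'):
        with open(os.path.join(tmp, rel), 'w') as f:
            f.write('x')

    preview = delete_by_extension(tmp, '.py')                 # dry run: nothing deleted
    still_there = os.path.exists(os.path.join(tmp, 'a.py'))
    deleted = delete_by_extension(tmp, '.py', dry_run=False)  # real deletion
    remaining = sorted(os.listdir(tmp))

print(len(preview), still_there, len(deleted), remaining)
```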

## XI. Debugging

### Lesson 35: Raise and Assert Statements

* You can raise your own exceptions: raise Exception('This is the error message.')
* You can also use assertions: assert condition, 'Error message'
* Assertions (Sanity Checks) are for detecting programmer errors that are not meant to be recovered from. User errors should raise exceptions. 
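The split between user errors and programmer errors can be sketched in a single function. The function parse_age and its messages are hypothetical, chosen only to illustrate the rule above.

```python
def parse_age(text):
    """Convert user input to an age."""
    # Bad input is a user error, so raise an exception the caller can handle
    if not text.strip().isdigit():
        raise ValueError('Age must be a whole number, got: ' + repr(text))
    age = int(text)
    # Sanity check on our own logic: isdigit() already rules out a minus sign
    assert age >= 0, 'age should never be negative here'
    return age

valid_age = parse_age(' 30 ')

try:
    parse_age('abc')
except ValueError as err:
    message = str(err)

print(valid_age)  # 30
print(message)    # Age must be a whole number, got: 'abc'
```

One reason for the split: Python skips assert statements when run with the -O option, so they should not be the only guard against bad user input.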

**Try Except**

In [140]:
def myfunc(x):
    try:   #Executes the Code
        return(40/x)
    except ZeroDivisionError: #Catches the Error and Returns custom Message
        print('Not allowed')
        
myfunc(0)

Not allowed


**Raise Exception**

In [141]:
def positivity(num):
    if num < 0:
        raise Exception('The Number must be positive')
    return num

positivity(-2)

Exception: The Number must be positive

**Copying all error messages in a text file**

In [142]:
import traceback

try:
    raise Exception('This is the Error Message 1')
    raise Exception('This is the Error Message 2')  # Never reached: the first raise exits the try block
except Exception:
    errorfile = open('errorlog.txt', 'a')
    errorfile.write(traceback.format_exc())
    errorfile.close()
    print('The traceback info was written into errorlog.txt')

The traceback info was written into errorlog.txt


**Assert**

*You can think of assert as a check statement in Python. It's a way to ensure that certain conditions hold true at specific points in your code. If the condition being asserted is not met, an AssertionError is raised, providing a clear indication that something has gone wrong.*

In [143]:
def positive_number_check(x):
    assert x > 0, "The number must be positive"
    return x

# This raises an AssertionError because -5 is not positive
print(positive_number_check(-5)) 

AssertionError: The number must be positive

### Lesson 36: Logging

* The logging module lets you display logging messages.
* Log messages create a "breadcrumb trail" of what your program is doing.
* After calling logging.basicConfig() to set up logging, call logging.debug('This is the message') to create a log message.
* When done, you can disable the log messages with logging.disable(logging.CRITICAL)
* Don't use print() for log messages: it's hard to remove them all when you're done debugging.
* The five log levels are: DEBUG, INFO, WARNING, ERROR, and CRITICAL.
* You can also log to a file instead of the screen with the filename keyword argument in the logging.basicConfig() function.
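The effect of logging.disable() can be seen in a small self-contained sketch. To keep the output easy to inspect, it sends messages to an in-memory stream through a named logger rather than using logging.basicConfig(); the logger name lesson36_demo and the messages are assumptions made for illustration.

```python
import io
import logging

stream = io.StringIO()                       # in-memory target instead of the screen
logger = logging.getLogger('lesson36_demo')  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False                     # keep messages out of the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))
logger.addHandler(handler)

logger.debug('details for the programmer')
logger.warning('something looks off')

logging.disable(logging.CRITICAL)            # silence CRITICAL and everything below
logger.error('this message is suppressed')
logging.disable(logging.NOTSET)              # re-enable logging

output = stream.getvalue()
print(output)
```

Only the DEBUG and WARNING lines end up in output; the ERROR message was logged while logging was disabled.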

![image.png](attachment:image.png)

In [186]:
#logging.disable(logging.CRITICAL) # helps hide log messages

In [10]:
import logging

# Configure the logger
logging.basicConfig(
    filename = 'errorloging.txt',
    level=logging.DEBUG,  # Set the logging level to DEBUG
    format='%(asctime)s - %(levelname)s - %(message)s',  # Format of the log messages
    datefmt='%Y-%m-%d %H:%M:%S'  # Format of the date in log messages
)

# Simple function that uses logging
def calculate(a, b, c):
    logging.debug("Function calculate called with arguments: %s, %s, %s", a, b, c)
    resultab = a + b
    logging.debug("Result of a+b: %s", resultab)
    totalresult = resultab + c
    logging.debug("Calculated result: %s", totalresult)
    return totalresult

# Call the function
result = calculate(5, 7, 10)
print(f"Result: {result}")

2024-07-31 06:55:52 - DEBUG - Function calculate called with arguments: 5, 7, 10
2024-07-31 06:55:52 - DEBUG - Result of a+b: 12
2024-07-31 06:55:52 - DEBUG - Calculated result: 22


Result: 22


### Lesson 37: Using the Debugger

* The debugger is a tool that lets you execute Python code one instruction at a time and shows you the values in variables.
* Open the Debug Control window with Debug > Debugger before running the program.
* The Over button will step over the current line of code and pause on the next one.
* The Step button will step into a function call.
* The Out button will step out of the current function you are in.
* The Go button will continue the program until the next breakpoint or until the end of the program if there are no breakpoints.
* The Quit button will immediately terminate the program.
* Breakpoints are lines where the debugger will pause and let you choose when to continue running the program.
* Breakpoints can be set by right-clicking the file editor window and selecting "Set Breakpoint"

## XII. Web Scraping

### Lesson 38: Webbrowser Module

Several modules make it easy to scrape web pages in Python:

* webbrowser. Comes with Python and opens a browser to a specific page.

* Requests. Downloads files and web pages from the Internet.

* Beautiful Soup. Parses HTML, the format that web pages are written in.

* Selenium. Launches and controls a web browser. Selenium is able to fill in forms and simulate mouse clicks in this browser.

In [17]:
import webbrowser
webbrowser.open('https://www.linkedin.com/in/tajamulk/')

True

In [23]:
import webbrowser
import sys

sys.argv  # ['mapit.py', '870', 'Valencia', 'St.']

if len(sys.argv) > 1:
    # ['mapit.py', '870', 'Valencia', 'St.'] -> '870 Valencia St.'
    address = ' '.join(sys.argv[1:])
else:
    address = input("Please enter the address: ")

# Assuming you want to open this address in a web browser
webbrowser.open(f'https://www.google.com/maps/place/{address}')


True

**How to Automate Searching for Web Address**

On Windows, a batch file (mapit.bat) lets you run the script from the command prompt without typing the full Python command:

1. Save above Python script as mapit.py.
2. Create a batch file named mapit.bat with the following content:
    * @echo off
    * python path\to\your\mapit.py %*
3. Open the command prompt and run:
    * For example: mapit.bat 870 Valencia St.
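Addresses contain spaces and other characters that are safer to percent-encode before they go into a URL. The sketch below shows one way to build the link; the helper name build_maps_url is hypothetical, and it only constructs the string without opening a browser.

```python
from urllib.parse import quote

def build_maps_url(address_parts):
    """Join command-line words into an address and percent-encode it."""
    address = ' '.join(address_parts)
    return 'https://www.google.com/maps/place/' + quote(address)

# The same words that sys.argv[1:] would hold in the example above
url = build_maps_url(['870', 'Valencia', 'St.'])
print(url)  # https://www.google.com/maps/place/870%20Valencia%20St.
```

The result can then be passed to webbrowser.open() as in the script above.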

### Lesson 39: Downloading files from the Web

* The Requests module is a third-party module for downloading web pages and files.
* requests.get() returns a Response object.
* The raise_for_status() Response method will raise an exception if the download failed.
* You can save a downloaded file to your hard drive with calls to the iter_content() method.

In [25]:
import requests

# URL of a free and SSL-verified text file
url = 'https://www.gutenberg.org/files/84/84-0.txt'

# Send a GET request to the URL
res = requests.get(url)

res.raise_for_status()  # Raises an exception if the download failed (e.g. 404)
print(len(res.text))
print(res.status_code)  # 200 means success, 404 means not found
print(res.text[:550])

448642
200
The Project Gutenberg eBook of Frankenstein, by Mary Wollstonecraft Shelley

This eBook is for the use of anyone anywhere in the United States and
most other parts of the world at no cost and with almost no restrictions
whatsoever. You may copy it, give it away or re-use it under the terms
of the Project Gutenberg License included with this eBook or online at
www.gutenberg.org. If you are not located in the United States, you
will have to check the laws of the country where you are located before
using this eBook.

Title: Frankenst


In [30]:
# Saving the above downloaded file

downloadfile = open('gutenbergfile', 'wb') #wb = write binary
for chunk in res.iter_content(100000):
    downloadfile.write(chunk)
downloadfile.close()  # Close the file so all chunks are flushed to disk

### Lesson 40: Parsing HTML with the Beautiful Soup Module

* Web pages are plaintext files formatted as HTML.
* HTML stands for HyperText Markup Language, the language used to create web pages.
* HTML can be parsed with the BeautifulSoup module.
* Parsing refers to the process of analyzing and converting an HTML or XML document into a format that allows you to navigate and manipulate the data easily.
* Beautiful Soup parses the raw HTML or XML content and creates a parse tree, which represents the structure and content of the document. This tree can then be traversed and queried to extract specific elements or information.
* BeautifulSoup is imported with the name bs4.
* Pass the string with the HTML to the bs4.BeautifulSoup() function to get a Soup object.
* The Soup object has a select() method that can be passed a string of the CSS selector for an HTML tag.
* You can get a CSS selector string from the browser's developer tools by right-clicking the element and selecting Copy CSS Path.
* The select() method will return a list of matching Element objects.

In [27]:
pip install beautifulsoup4

Defaulting to user installation because normal site-packages is not writeable
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


In [28]:
from bs4 import BeautifulSoup
import requests

# URL of an example HTML page with a price element
url = 'https://www.amazon.com/dp/B08N5WRWNW'  # Replace with a valid URL

# Send a GET request to the URL
res = requests.get(url)

# Create a BeautifulSoup object
soup = BeautifulSoup(res.text, 'html.parser')

# Use CSS Selector to find elements; Also select returns list of elements
soup.select('.a-price .a-price-whole, .a-price .a-price-symbol')  # Adjust selector to match the page structure

# Saving into a List
elements = soup.select('.a-price .a-price-whole, .a-price .a-price-symbol')  # Adjust selector to match the page structure

elements[0].text.strip()

#OR

for element in elements:
    print(element.get_text())  #For Multiple Instances


IndexError: list index out of range

**Creating a Beautiful Soup Function**

In [26]:
from bs4 import BeautifulSoup
import requests

def getAmazonPrices(ProductUrl):
    res = requests.get(ProductUrl)
    res.raise_for_status()  # Stop early if the download failed

    soup = BeautifulSoup(res.text, 'html.parser')
    elems = soup.select('.a-price .a-price-whole, .a-price .a-price-symbol')
    return elems[0].text.strip()

### Lesson 41: Controlling the Browser with the Selenium Module 

* To import selenium, you need to run: "from selenium import webdriver" (and not "import selenium").
* To open the browser, run: browser = webdriver.Firefox()
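* To send the browser to a URL, run: browser.get('https://inventwithpython.com')
* The browser.find_elements_by_css_selector() method will return a list of WebElement objects: elems = browser.find_elements_by_css_selector('img'). Recent Selenium 4 releases removed this method; there, use browser.find_elements(By.CSS_SELECTOR, 'img') after running "from selenium.webdriver.common.by import By".
* WebElement objects have a "text" attribute that contains the element's visible text as a string: elems[0].text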
* The click() method will click on an element in the browser.
* The send_keys() method will type into a specific element in the browser.
* The submit() method will simulate clicking on the Submit button for a form.
* The browser can also be controlled with these methods: back(), forward(), refresh(), quit().

![image.png](attachment:image.png)

In [5]:
#! pip install webdriver-manager
#! pip install selenium

from selenium import webdriver

browser = webdriver.Chrome()
