# Python IT Automation: Intro to Python, Regex, and Bash Scripting


## Hello World in Python

**What is Python?**

Python is a dynamic, interpreted (bytecode-compiled) language. 

**Why program in Python?**
* Easy-to-read syntax
* One of the most widely chosen languages in IT
* Omnipresent



In [1]:
! python --version

Python 3.9.5


In [2]:
print('hello, bangkit!')

hello, bangkit!


## Basic Python Syntax

### Data Types

* **String** (str): text.
* **Integer** (int): whole numbers, without a fractional part.
* **Float** (float): numbers with a fractional part.
* **Boolean** (bool): a type with only two values, True and False.

We can convert a value from one data type to another, either by relying on implicit conversion or by calling an explicit conversion function such as str() or int().
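
As a minimal sketch of the two kinds of conversion (the variable names are illustrative, not part of the course material): Python implicitly promotes an int to a float in mixed arithmetic, while converting between strings and numbers has to be requested explicitly.

```python
# Implicit conversion: int + float gives a float without any extra code.
total = 1 + 2.0
print(total)        # 3.0
print(type(total))  # <class 'float'>

# Explicit conversion: call the target type as a function.
count = int("42")               # str -> int
label = "count=" + str(count)   # int -> str, needed before concatenation
print(label)        # count=42
```

Without the str() call, the last concatenation would raise a TypeError, because Python does not convert numbers to strings implicitly.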


### Variables
* A variable is a name given to a value.
* The value can be of any data type.
* Storing a value in a variable is called assignment.
* Names can contain only letters, numbers, and underscores, and can't start with a number.
* Names can't be Python reserved keywords.

In [3]:
# String (str)
color = "red"
print(type(color))

<class 'str'>


In [4]:
# Integer (int)
length = 10
print(type(length))

<class 'int'>


In [5]:
# Float
width = 2.0
print(type(width))

<class 'float'>


In [6]:
# Boolean
Boolean = True
print(type(Boolean))

<class 'bool'>


In [7]:
str(width)

'2.0'

In [8]:
print(type(int(width)))

<class 'int'>


**Don't do this:** `def` and `class` are reserved keywords, so they can't be used as variable names.

In [9]:
def = "Function"
class = "Class"

SyntaxError: invalid syntax (Temp/ipykernel_14524/2213772507.py, line 1)

### Functions
* Define a function with the def keyword.
* A function has a body, written as a block after the colon in the function definition. The block is indented to the right.
* To get a value back from a function, use the return keyword. A function without a return statement returns None.


In [None]:
def greeting(name):
  return 'Hello, ' + name

print(greeting("Bangkit 2022"))

Hello, Bangkit 2022


In [None]:
def greeting(name):
  text =  'Hello, ' + name

print(greeting("Bangkit 2022"))

None


### Comparison
* The Boolean (bool) data type represents one of two possible states, either True or False.
* Not all data types can be compared with each other, so take care when comparing values of different types.
* Besides checking equality and ordering, conditions can be combined with the logical operators and, or, and not.


In [10]:
print(1 < 10)

print("Linux" == "Windows")

print(1 != "1")

print(not True)


True
False
True
False


* Comparison operators
  * a == b: a is equal to b
  * a != b: a is not equal to b
  * a < b: a is less than b
  * a <= b: a is less than or equal to b
  * a > b: a is greater than b
  * a >= b: a is greater than or equal to b

* Logical operators
  * a and b: True if both a and b are True. False otherwise.
  * a or b: True if either a or b or both are True. False if both are False.
  * not a: True if a is False, False if a is True.
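
The points above can be sketched in a few lines. This is an illustrative example with made-up values (`cpu_load` and `disk_free_gb` are hypothetical names): equality between different types simply evaluates to False, whereas an ordering comparison between an int and a str raises a TypeError in Python 3.

```python
# Equality across types is allowed and is simply False.
same = (1 == "1")
print(same)  # False

# Ordering across unrelated types is not allowed in Python 3.
try:
    print(1 < "1")
    comparison_failed = False
except TypeError as error:
    comparison_failed = True
    print("Cannot compare:", error)

# Logical operators combine boolean results.
cpu_load = 0.4       # hypothetical values for illustration
disk_free_gb = 12
healthy = cpu_load < 0.8 and disk_free_gb > 10
print(healthy)                        # True
print(not healthy or cpu_load > 0.9)  # False
```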


### Conditional & If Statements
* The ability of a program to alter its execution sequence is called branching.
* The if block will be executed only if the condition is True.
* Use elif and else statements to handle multiple conditions.

In [11]:
hour = 11
if hour < 12:
  print("Good morning!")

Good morning!


In [12]:
def check(number):
    if number > 0:
        return "Positive"
    elif number == 0:
        return "Zero"
    else: 
        return "Negative"

print(check(-10))

Negative


### Loops

A **while** loop instructs the computer to keep executing a block of code for as long as a condition is True.




In [14]:
x = 30  # also try with x = 0

while x > 0:
  print("positive x=" + str(x))
  x = x - 1
  print("now x=" + str(x))


positive x=30
now x=29
positive x=29
now x=28
positive x=28
now x=27
positive x=27
now x=26
positive x=26
now x=25
positive x=25
now x=24
positive x=24
now x=23
positive x=23
now x=22
positive x=22
now x=21
positive x=21
now x=20
positive x=20
now x=19
positive x=19
now x=18
positive x=18
now x=17
positive x=17
now x=16
positive x=16
now x=15
positive x=15
now x=14
positive x=14
now x=13
positive x=13
now x=12
positive x=12
now x=11
positive x=11
now x=10
positive x=10
now x=9
positive x=9
now x=8
positive x=8
now x=7
positive x=7
now x=6
positive x=6
now x=5
positive x=5
now x=4
positive x=4
now x=3
positive x=3
now x=2
positive x=2
now x=1
positive x=1
now x=0



A **for** loop iterates over a sequence of values.

In [None]:
for x in range(3):
  print("x=" + str(x))

x=0
x=1
x=2


Both while and for loops can be interrupted using the **break** keyword. 



In [None]:
for x in range(3):
    print("x=" + str(x))
    if x == 1:
        break  # quit from loop


x=0
x=1


Use the **continue** keyword to skip the current iteration and continue with the next one.

In [None]:
for x in range(3, 0, -1):
    if x % 2 == 0:
        continue  # skip even
    print(x)


3
1


## Python Data Structure


### Strings

* **Represent a piece of text**.

A string is a data type in Python used to represent a piece of text. It's written between quotes: single quotes, double quotes, or triple quotes. To escape a character, put a backslash (\) in front of it.

A string can be as short as zero characters (an empty string) or very long. Strings are concatenated with the plus sign (+). The **len** function returns the number of characters in the string.


In [None]:
program_name = 'bangkit'
program_year = "it's the 2nd"
multi_line = """hello,
email test.
signature."""
  
# let's
# "bangkit"
print("let's\n\""+program_name+"\"")


print(len(''))  # 0
print(len(program_name)==7)  # True


let's
"bangkit"
0
True


* **To access substring, use index or slicing**.

Python starts counting indexes from 0, not 1. Accessing an index greater than the string's length - 1 raises an IndexError (string index out of range). Negative indexes count from the end of the string.

To access a substring, use slicing. It works like indexing but takes a range, with a colon as the separator: the slice starts at the first index and goes up to, but does not include, the second.

Leaving out one of the two indexes means it defaults to 0 (for the first) or to the string's length (for the second).


In [None]:
name = 'bangkit'
print(name[1])  # a
print(name[len(name)-1])  # t
print(name[-1])  # t
print(name[-2])  # i
  
print(name[4:len(name)-1])  # ki
  
print(name[:4])  # bang (0-3)
print(name[4:])  # kit  (4-len)

a
t
t
i
ki
bang
kit
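
As a small illustrative sketch of the out-of-range behavior described above: indexing past the end raises an IndexError, while slicing quietly clips the range to the string's length.

```python
name = 'bangkit'

# Indexing past the last position (len(name) - 1) raises IndexError.
try:
    print(name[7])
    index_failed = False
except IndexError as error:
    index_failed = True
    print("IndexError:", error)

# Slicing is more forgiving: out-of-range bounds are clipped.
tail = name[4:100]
print(tail)        # kit
print(name[-3:])   # kit
```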


* **Strings in Python are immutable**

Strings in Python are immutable, meaning individual characters can't be changed in place. Trying to do so raises a TypeError: 'str' object does not support item assignment.

To change a string, build a new string and assign it to the variable.

Use the in keyword to check whether a substring is part of a string.

In [None]:
year = "it's 2021"
year[-1] = "0"  # TypeError

TypeError: 'str' object does not support item assignment
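
A minimal sketch of the working alternative: instead of assigning to an index, build a new string from a slice and rebind the variable. The in keyword is shown alongside it.

```python
year = "it's 2021"

# Build a new string and rebind the name instead of editing in place.
year = year[:-1] + "0"
print(year)  # it's 2020

# The in keyword checks whether a substring occurs in the string.
print("2020" in year)  # True
print("2021" in year)  # False
```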

* **Provide a bunch of methods for working with text**

The string class provides many methods for working with text. Some transform the text, and many others check its contents.

Remember, the goal is not to memorize all of the methods. Check the documentation or search the web whenever needed.

In [None]:
program = 'bangkit 2021'
print(program.index('g'))  # 3

print(program.upper())  # BANGKIT 2021
print(program.endswith('2021'))  # True
print(program.replace('2021', '2020'))

year = 2021  # integer 2021
print(str(year).isnumeric())  # True
# bangkit for 2021
print("{} for {}".format("bangkit", year))


3
BANGKIT 2021
True
bangkit 2020
True
bangkit for 2021


### List
Think of a list as a container divided into slots. Each slot can hold a different value.

Python uses square brackets [] to indicate where a list starts and ends. List indexes start from 0, just like strings, and slicing a list returns another list.



In [None]:
program_year = [2020, 2021]

print(type(program_year))    # list
print(program_year)
print(len(program_year))     # 2
print(2019 in program_year)  # False

print(program_year[0])       # 2020
print(program_year[:1])     # [2020]

for year in program_year:
    print(year)  # element per line


<class 'list'>
[2020, 2021]
2
False
2020
[2020]
2020
2021


While strings are immutable, lists are mutable, which means elements can be added, removed, or modified.

Use append to add an element at the end, or insert to add one at a specific index. To delete an element, use remove with the element's value or pop with its index.

To modify an element, assign a new value to its index.



In [None]:
paths = ['ML', 'Cloud']
paths.append('Android')
print(len(paths))  # 3
paths.remove('Android')
paths.insert(1, 'Mobile')

# ['ML', 'Mobile', 'Cloud']
print(paths)
paths.append('Python')
paths.pop(-1)  # remove 'Python'

# change 'ML' to 'Machine Learning'
paths[0] = 'Machine Learning'


3
['ML', 'Mobile', 'Cloud']


We can create a new list from a sequence or a range in a single line using a list comprehension.

List comprehensions can be really powerful, but they can also become complex, resulting in code that is hard to read.


In [None]:
even = [x*2 for x in range(1,5)]
print(even)  # [2, 4, 6, 8]

tens = [x for x in range(50) if x % 10 == 0]
print(tens)  # [0, 10, 20, 30, 40]

[2, 4, 6, 8]
[0, 10, 20, 30, 40]
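
To see what a comprehension replaces, here is an illustrative side-by-side sketch (the name `tens_loop` is made up for this example). Both versions build the same list; the loop is longer but can be easier to read when the logic grows.

```python
# Loop version: build the list step by step.
tens_loop = []
for x in range(50):
    if x % 10 == 0:
        tens_loop.append(x)

# Comprehension version: same result in one line.
tens = [x for x in range(50) if x % 10 == 0]

print(tens_loop == tens)  # True
```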


### Tuples
Tuples are like lists: they can contain elements of any data type. Unlike lists, though, tuples are immutable.

Python uses parentheses () to indicate where a tuple starts and ends.

A good example of a tuple is a function that returns multiple values.

Strings, lists, and tuples are all sequence types.



In [None]:
def get_stat(numbers):
  total = sum(numbers)
  length = len(numbers)
  mean = total / length
  return length, total, mean

stat = get_stat([1, 3, 5, 7])
print(stat)  # (4, 16, 4.0)
print(type(stat))  # tuple

for data in stat:
    print(data)  # element per line


(4, 16, 4.0)
<class 'tuple'>
4
16
4.0
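
A returned tuple can also be unpacked straight into separate variables. The sketch below repeats a compact version of get_stat so that it runs on its own, and it also shows that assigning to a tuple index fails.

```python
def get_stat(numbers):
    total = sum(numbers)
    length = len(numbers)
    return length, total, total / length

# Unpack the returned tuple into separate names.
length, total, mean = get_stat([1, 3, 5, 7])
print(mean)  # 4.0

# Tuples are immutable: assigning to an index raises TypeError.
stat = (length, total, mean)
try:
    stat[0] = 10
    assignment_failed = False
except TypeError as error:
    assignment_failed = True
    print("TypeError:", error)
```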


### Dictionaries
Like lists, dictionaries organize elements into collections. Unlike lists, elements in a dictionary are not accessed by position.

Data inside dictionaries takes the form of **pairs of keys and values**. To get a dictionary value, use its corresponding key.

Whereas a list index must be an integer, a dictionary key can be a string, an integer, a tuple, and more (any hashable type). Dictionaries are written with curly brackets {}.



In [None]:
students = {
    'ml': 500,
    'mobile': 700,
    'cloud': 900
}
print(type(students))     # dict
print(students['cloud'])  # 900

# keys: ['ml', 'mobile', 'cloud']
for key in students.keys():
    # eg: ml:500
    print(key + ': '+ str(students[key]))


<class 'dict'>
900
ml: 500
mobile: 700
cloud: 900


Use a for loop to iterate through the contents of a dictionary (it iterates over the keys implicitly).

To get each key and value together as a tuple, use items.

Besides keys, which returns all keys, use values to get all of the dictionary's values.



In [None]:
file_counts = {"jpg": 10,
               "txt": 14,
               "csv": 2,
               "py": 23}
for extension in file_counts:
    print(extension)  # eg: jpg


for ext, amount in file_counts.items():
    print('{} files .{}'
          .format(amount, ext))


jpg
txt
csv
py
10 files .jpg
14 files .txt
2 files .csv
23 files .py
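
The values method is not used in the cell above, so here is a short illustrative sketch of it with the same data (`total_files` is a name introduced for this example):

```python
file_counts = {"jpg": 10, "txt": 14, "csv": 2, "py": 23}

# values() yields only the values, without the keys.
for amount in file_counts.values():
    print(amount)

# Because values() is iterable, it works with functions like sum.
total_files = sum(file_counts.values())
print(total_files)  # 49
```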


Just like lists, dictionaries are mutable, which means items can be added, removed, or modified.

Set a new value by assigning to its key. Add an item (a key and value pair) by assigning a value to a new key. Delete an item with the del keyword, or delete all items with clear.

Use a dictionary rather than a list when the data should be accessed by key instead of by iterating to find a match.


In [None]:
# point in line y = 2x + 1
point_a = {'x': 2, 'y': 5}
point_a['x'] = 3
point_a['y'] = 7


new_point = {}  # empty dictionary
new_point['z'] = 2
print(len(new_point.keys()))  # 1
del new_point['z']  # remove item
print(new_point)    # {}


new_point = {'x': 0, 'y': 1}
new_point.clear()
print(new_point)    # {}


1
{}
{}
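
The choice between a list and a dictionary can be sketched as follows. This is an illustrative comparison only (`student_list` and `student_dict` are made-up names): with a list of pairs, finding an entry means scanning, while a dictionary gives direct access by key.

```python
# List of (name, count) pairs: finding a name means scanning the list.
student_list = [('ml', 500), ('mobile', 700), ('cloud', 900)]
cloud_count = None
for name, count in student_list:
    if name == 'cloud':
        cloud_count = count
        break

# Dictionary: the key gives direct access.
student_dict = {'ml': 500, 'mobile': 700, 'cloud': 900}
print(cloud_count == student_dict['cloud'])  # True

# get returns a default instead of raising KeyError for a missing key.
print(student_dict.get('web', 0))  # 0
```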


## Regular Expressions (Regex)

A regular expression, also known as a regex or regexp, is essentially a search query for text, expressed as a string pattern.

Regular expressions can be applied in many ways: a wide range of programming languages support them, including Python, and so do command line tools such as grep, sed, and awk. The implementations may vary, but the principles remain the same.



* Regular expressions in Python are usually written as raw strings (r""). A raw string tells the interpreter not to process backslash escapes, so the pattern is passed to the function as is.
* The Match object includes information such as the position of the match in the string and the actual matching text.




In [None]:
import re


result = re.search(r"aza", "plaza")
print(result)
# <re.Match object; span=(2, 5), match='aza'>
print(re.search(r"aza", "maze"))
# None

<re.Match object; span=(2, 5), match='aza'>
None
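
The information held by a Match object can be read with its methods. The following is a minimal sketch using span, start, and group from the standard re module, plus a quick check of what the raw string prefix changes.

```python
import re

result = re.search(r"aza", "plaza")
if result is not None:       # re.search returns None when nothing matches
    print(result.span())     # (2, 5)
    print(result.start())    # 2
    print(result.group())    # aza

# A raw string keeps the backslash as a literal character.
print(len("\n"))   # 1 (a single newline character)
print(len(r"\n"))  # 2 (a backslash followed by "n")
```

Checking for None before calling a method on the result avoids an AttributeError when the pattern is not found.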


* The circumflex (^) matches the beginning of the line. The dot (.) matches any character (except a newline, by default). The option re.IGNORECASE makes the match case-insensitive.

In [None]:
print(re.search(r"^x", "xenon"))
# <re.Match object; span=(0, 1), match='x'>

print(re.search(r"p.ng", "sponge"))
# <re.Match object; span=(1, 5), match='pong'>


<re.Match object; span=(0, 1), match='x'>
<re.Match object; span=(1, 5), match='pong'>
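
The re.IGNORECASE option is not exercised in the cell above. As an illustrative sketch (the sample word is chosen only for this example), the same pattern fails on a capitalized word until the flag is passed as the third argument:

```python
import re

# Without the flag, lowercase "p" does not match uppercase "P".
strict = re.search(r"p.ng", "Pangaea")
print(strict)  # None

# With re.IGNORECASE, letter case is ignored.
relaxed = re.search(r"p.ng", "Pangaea", re.IGNORECASE)
print(relaxed.group())  # Pang
```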


* To match a range of characters, use character classes ([ ]). A circumflex at the start of a class ([^ ]) negates it, matching any character not listed.


In [None]:
print(re.search(r"cloud[a-zA-Z0-9]", "cloud9"))
# <re.Match object; span=(0, 6), match='cloud9'>

print(re.search(r"[^a-zA-Z]", "This is a sentence."))
# <re.Match object; span=(4, 5), match=' '>

<re.Match object; span=(0, 6), match='cloud9'>
<re.Match object; span=(4, 5), match=' '>


* Use the pipe symbol (|) to match one expression or another.

In [None]:
print(re.search(r"cat|dog", "I like cats."))
# <re.Match object; span=(7, 10), match='cat'>

<re.Match object; span=(7, 10), match='cat'>


* The dollar sign ($) matches the end of the line.

In [None]:
print(re.search(r"cat$", "I like cats."))
# None

None


Repeated matches are another regex concept.
* The star (*) matches zero or more occurrences of the item before it, taking as many characters as possible.


In [None]:
print(re.search(r"Py[a-z]*n", "Python Programming"))
# <re.Match object; span=(0, 6), match='Python'>

<re.Match object; span=(0, 6), match='Python'>
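
The "as many characters as possible" behavior is easier to see when the repeated item is a dot. In this illustrative sketch, the character class stops at the space, while the dot keeps going to the last "n" in the text.

```python
import re

text = "Python Programming"

# [a-z]* can only consume lowercase letters, so it stops at the space.
narrow = re.search(r"Py[a-z]*n", text)
print(narrow.group())  # Python

# .* can consume anything, so it runs on to the last "n" in the text.
greedy = re.search(r"Py.*n", text)
print(greedy.group())  # Python Programmin
```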


* The plus (+) character matches one or more occurrences of the character before it.

In [None]:
print(re.search(r"o+l+", "woolly"))
# <re.Match object; span=(1, 5), match='ooll'>

<re.Match object; span=(1, 5), match='ooll'>


* The question mark (?) means either zero or one occurrence of the character before it.

In [None]:
print(re.search(r"p?each", "I like peaches"))
# <re.Match object; span=(7, 12), match='peach'>

<re.Match object; span=(7, 12), match='peach'>


## Managing Files & Bash Scripting

### Managing Files with Python

* The `open` function opens a file and returns a file object.
* To read a file, use the readline method (one line at a time) or the read method (everything at once).
* To ensure that open files are always closed, open them in a block using the `with` keyword.
* By default, `open` uses mode "r" (read-only). Mode "w" (write-only) writes to a file, overwriting it if it already exists. To avoid overwriting, use mode "a" to append, or mode "r+" for reading and writing.



In [None]:
text = """
The itsy bitsy spider climbed up the waterspout.
Down came the rain
and washed the spider out.
"""

with open("spider.txt", "w") as file:
  file.write(text)

In [None]:
with open("spider.txt", "r") as file:
  print(file.read())


The itsy bitsy spider climbed up the waterspout.
Down came the rain
and washed the spider out.
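
The readline method and the append mode are not shown in the cells above. The sketch below uses a hypothetical file name, spider_demo.txt, so that it does not touch spider.txt; it appends a line and then reads the file back in two steps.

```python
# Write a small file first so the example is self-contained.
with open("spider_demo.txt", "w") as file:   # hypothetical file name
    file.write("line one\nline two\n")

# Mode "a" adds to the end instead of overwriting.
with open("spider_demo.txt", "a") as file:
    file.write("line three\n")

with open("spider_demo.txt") as file:        # default mode is "r"
    first = file.readline()   # reads one line, including its newline
    rest = file.read()        # reads everything that is left

print(first.strip())  # line one
print(rest)
```

Because read continues from where readline stopped, `rest` holds only the second and third lines.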



### Bash Scripting

* echo: print information (like environment variable) to standard output
* cat file: shows the content of the file through standard output
* ls: lists the contents of the current directory
* cd directory: change current working directory to the specified one 
* rm: remove file or directory (with specific arguments)
* chmod modifiers files: change permissions for the files according to the provided modifiers
* man: show command documentation

In [None]:
!echo "Hello"

Hello


In [None]:
!ls

sample_data  spider.txt


In [None]:
!cat "spider.txt"


The itsy bitsy spider climbed up the waterspout.
Down came the rain
and washed the spider out.


In [None]:
!mkdir "MyFirstFolder"
!ls

MyFirstFolder  sample_data  spider.txt


In [None]:
!for i in *; do echo "$i"; done

MyFirstFolder
sample_data
spider.txt
