Python RegEx
A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern.

RegEx can be used to check if a string contains the specified search pattern.

RegEx Module
Python has a built-in package called re, which can be used to work with Regular Expressions.
import re

In [3]:
import re
# Check if the string starts with "The" and ends with "Spain":

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)

if x:
  print("YES! We have a match!")
else:
  print("No match")


YES! We have a match!


RegEx Functions
The re module offers a set of functions that allows us to search a string for a match:

Python `re` module functions

| Function   | Description                                                    |
|------------|----------------------------------------------------------------|
| findall    | Returns a list containing all matches                          |
| search     | Returns a Match object if there is a match anywhere in the string |
| split      | Returns a list where the string has been split at each match   |
| sub        | Replaces one or many matches with a string                     |


Metacharacters
| Character | Description                                   | Example        |
|-----------|-----------------------------------------------|----------------|
| `[]`      | A set of characters                          | `[a-m]`        |
| `\`       | Signals a special sequence (can also escape special characters) | `\d` |
| `.`       | Any character (except newline character)     | `he..o`        |
| `^`       | Starts with                                  | `^hello`       |
| `$`       | Ends with                                    | `planet$`      |
| `*`       | Zero or more occurrences                     | `he.*o`        |
| `+`       | One or more occurrences                      | `he.+o`        |
| `?`       | Zero or one occurrences                      | `he.?o`        |
| `{}`      | Exactly the specified number of occurrences  | `he.{2}o`      |
| `\|`       | Either or                                    | `falls\|stays` |
| `()`      | Capture and group                            | `(hello)`      |

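A few of the metacharacters from the table can be tried out together. The snippet below is only a small sketch; the sample strings and variable names are chosen for illustration.

```python
import re

txt = "hello planet"

# "." stands for any single character, so "he..o" matches "hello"
dot_match = re.findall(r"he..o", txt)

# "^" anchors the pattern at the start, "$" at the end of the string
starts = bool(re.search(r"^hello", txt))
ends = bool(re.search(r"planet$", txt))

# "{2}" repeats the preceding "." exactly twice
exact = re.findall(r"he.{2}o", txt)

# "+" needs at least one character between "he" and "o"; "*" also accepts none
plus_match = re.search(r"he.+o", "heo")
star_match = re.search(r"he.*o", "heo")

# "|" matches either alternative
either = re.findall(r"falls|stays", "The rain in Spain falls mainly in the plain")

print(dot_match, starts, ends, exact)        # ['hello'] True True ['hello']
print(plus_match, star_match.group(), either)  # None heo ['falls']
```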

Special Sequences
| Character | Description                                                                                                                                               | Example          |
|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|
| `\A`      | Returns a match if the specified characters are at the beginning of the string                                                                           | `\AThe`          |
| `\b`      | Returns a match where the specified characters are at the beginning or end of a word (use `r` to ensure raw string treatment)                            | `r"\bain"`, `r"ain\b"` |
| `\B`      | Returns a match where the specified characters are present, but NOT at the beginning or end of a word (use `r` to ensure raw string treatment)           | `r"\Bain"`, `r"ain\B"` |
| `\d`      | Returns a match where the string contains digits (numbers from 0-9)                                                                                      | `\d`             |
| `\D`      | Returns a match where the string DOES NOT contain digits                                                                                                 | `\D`             |
| `\s`      | Returns a match where the string contains a white space character                                                                                        | `\s`             |
| `\S`      | Returns a match where the string DOES NOT contain a white space character                                                                                | `\S`             |
| `\w`      | Returns a match where the string contains any word characters (letters, digits, and underscore `_`)                                                      | `\w`             |
| `\W`      | Returns a match where the string DOES NOT contain any word characters                                                                                    | `\W`             |
| `\Z`      | Returns a match if the specified characters are at the end of the string                                                                                 | `Spain\Z`        |


In [4]:
import re

txt = "The rain in Spain"

# Return a match for every non-digit character
x = re.findall(r"\D", txt)  # Use raw string notation for the regex pattern

print(x)  # This will print all non-digit characters found

if x:
    print("Yes, there is at least one match!")
else:
    print("No match")


['T', 'h', 'e', ' ', 'r', 'a', 'i', 'n', ' ', 'i', 'n', ' ', 'S', 'p', 'a', 'i', 'n']
Yes, there is at least one match!


Sets

A set is a set of characters inside a pair of square brackets [] with a special meaning:
| Set         | Description                                                                                      | Example        |
|-------------|--------------------------------------------------------------------------------------------------|----------------|
| `[arn]`     | Returns a match where one of the specified characters (`a`, `r`, or `n`) is present              | `[arn]`        |
| `[a-n]`     | Returns a match for any lowercase character, alphabetically between `a` and `n`                  | `[a-n]`        |
| `[^arn]`    | Returns a match for any character EXCEPT `a`, `r`, and `n`                                       | `[^arn]`       |
| `[0123]`    | Returns a match where any of the specified digits (`0`, `1`, `2`, or `3`) are present            | `[0123]`       |
| `[0-9]`     | Returns a match for any digit between `0` and `9`                                                | `[0-9]`        |
| `[0-5][0-9]`| Returns a match for any two-digit number from `00` to `59`                                       | `[0-5][0-9]`   |
| `[a-zA-Z]`  | Returns a match for any character alphabetically between `a` and `z`, lowercase OR uppercase     | `[a-zA-Z]`     |
| `[+]`       | In sets, special characters like `+`, `*`, `.`, `\|`, `()`, `$`, `{}` have no special meaning. `[+]` means: return a match for any `+` character | `[+]` |

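The sets in the table can be checked with findall(). This is a short sketch with sample strings picked for illustration, not part of the original notebook.

```python
import re

txt = "The rain in Spain"

# [arn]: any single a, r, or n
any_of = re.findall(r"[arn]", txt)

# [^arn]: any character except a, r, or n
none_of = re.findall(r"[^arn]", "rain")

# [0-5][0-9]: a two-digit value from 00 to 59
minutes = re.findall(r"[0-5][0-9]", "8:45")

# Inside a set, "+" is a literal plus sign
plus_signs = re.findall(r"[+]", "2+2=4")

print(any_of)      # ['r', 'a', 'n', 'n', 'a', 'n']
print(none_of)     # ['i']
print(minutes)     # ['45']
print(plus_signs)  # ['+']
```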

In [5]:
# The findall() Function
# The findall() function returns a list containing all matches.
import re

txt = "The rain in Spain"
x = re.findall("ai", txt)
print(x)


['ai', 'ai']


In [6]:
# Return an empty list if no match was found:

import re

txt = "The rain in Spain"
x = re.findall("Portugal", txt)
print(x)

[]


In [7]:
# The search() Function
# The search() function searches the string for a match, and returns a Match object if there is a match.

# If there is more than one match, only the first occurrence of the match will be returned:
# Search for the first white-space character in the string:

import re

txt = "The rain in Spain"
x = re.search(r"\s", txt)  # Use raw string notation for regex patterns

if x:
    print("The first white-space character is located in position:", x.start())
else:
    print("No match found.")


The first white-space character is located in position: 3


In [8]:
import re

txt = "The rain in Spain"
x = re.search("Portugal", txt)
print(x)

None


In [9]:
# The split() Function
# The split() function returns a list where the string has been split at each match:
# Split at each white-space character:

import re

#Split the string at every white-space character:

txt = "The rain in Spain"
x = re.split(r"\s", txt)
print(x)


['The', 'rain', 'in', 'Spain']


In [10]:
import re

#Split the string at the first white-space character:

txt = "The rain in Spain"
x = re.split(r"\s", txt, maxsplit=1)
print(x)

['The', 'rain in Spain']


In [11]:
# The sub() Function
# The sub() function replaces the matches with the text of your choice:
# Replace every white-space character with the number 9:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)
print(x)

The9rain9in9Spain


In [12]:
# Match Object
# A Match Object is an object containing information about the search and the result.
# Note: If there is no match, the value None will be returned, instead of the Match Object.
# Do a search that will return a Match Object:

import re

txt = "The rain in Spain"
x = re.search("ai", txt)
print(x) #this will print an object

<re.Match object; span=(5, 7), match='ai'>


The Match object has properties and methods used to retrieve information about the search, and the result:

.span() returns a tuple containing the start and end positions of the match.
.string returns the string passed into the function
.group() returns the part of the string where there was a match

In [13]:
# Print the position (start- and end-position) of the first match occurrence.

# The regular expression looks for any word that starts with an upper case "S":

import re

txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.span())

(12, 17)


In [17]:
import re

#The string property returns the search string:
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.string)

The rain in Spain


In [18]:
# Print the part of the string where there was a match.

# The regular expression looks for any word that starts with an upper case "S":

import re
txt = "The rain in Spain"
x = re.search(r"\bS\w+", txt)
print(x.group())

Spain
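
The Match object also gives access to capture groups, the `()` metacharacter listed earlier. The sketch below uses a pattern made up for illustration; group(0), or group() with no argument, is the whole match, and group(1), group(2) are the parenthesized parts.

```python
import re

txt = "The rain in Spain"

# Two capture groups: the word before " in " and the word after it
x = re.search(r"(\w+) in (\w+)", txt)

if x:
    print(x.group())    # rain in Spain
    print(x.group(1))   # rain
    print(x.group(2))   # Spain
    print(x.groups())   # ('rain', 'Spain')
    print(x.span(2))    # (12, 17)
```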


What is PIP?
PIP is a package manager for Python packages, or modules if you like.

What is a Package?
A package contains all the files you need for a module.

Modules are Python code libraries you can include in your project.

You can check whether pip is installed by running pip --version from the command line:

C:\Users\Your Name\AppData\Local\Programs\Python\Python36-32\Scripts>pip --version

In [20]:
pip install camelcase

Defaulting to user installation because normal site-packages is not writeable
Collecting camelcase
  Downloading camelcase-0.2.tar.gz (1.3 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: camelcase
  Building wheel for camelcase (setup.py): started
  Building wheel for camelcase (setup.py): finished with status 'done'
  Created wheel for camelcase: filename=camelcase-0.2-py3-none-any.whl size=1778 sha256=59ae0bc37374d4db9e6916925c9e56a4ad34fbcbbd25fa369cd5f77a7bc6101f
  Stored in directory: c:\users\rima debnath\appdata\local\pip\cache\wheels\a7\40\a3\900133dd6de3e10c219659fec4118138db05d778e519c0b2bc
Successfully built camelcase
Installing collected packages: camelcase
Successfully installed camelcase-0.2
Note: you may need to restart the kernel to use updated packages.





In [21]:
import camelcase

c = camelcase.CamelCase()

txt = "lorem ipsum dolor sit amet"

print(c.hump(txt))

#This method capitalizes the first letter of each word.

Lorem Ipsum Dolor Sit Amet


Python Try Except
The try block lets you test a block of code for errors.

The except block lets you handle the error.

The else block lets you execute code when there is no error.

The finally block lets you execute code, regardless of the result of the try- and except blocks.
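
The four blocks can appear in a single statement. The following is a minimal sketch; `safe_divide` is a hypothetical helper written for this illustration, and it records which blocks ran so the order is visible.

```python
def safe_divide(a, b):
    """Hypothetical helper: divide a by b and record which blocks ran."""
    steps = []
    try:
        result = a / b
    except ZeroDivisionError:
        # Runs only when the try block raised ZeroDivisionError
        steps.append("except")
        result = None
    else:
        # Runs only when the try block raised nothing
        steps.append("else")
    finally:
        # Runs in both cases
        steps.append("finally")
    return result, steps

print(safe_divide(10, 2))  # (5.0, ['else', 'finally'])
print(safe_divide(1, 0))   # (None, ['except', 'finally'])
```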

Exception Handling
When an error occurs, or exception as we call it, Python will normally stop and generate an error message.

These exceptions can be handled using the try statement:

#The try block will generate an error, because x is not defined:

try:
  print(x)
except:
  print("An exception occurred")

In [26]:
#Here x is defined, so print(x) succeeds and neither except block runs.
#If x were not defined, the first except block would handle the NameError:
x = 10
try:
  print(x)
except NameError:
  print("Variable x is not defined")
except:
  print("Something else went wrong")

10


In [27]:
try:
  print("Hello")
except:
  print("Something went wrong")
else:
  print("Nothing went wrong")


Hello
Nothing went wrong


In [None]:
# Finally
# The finally block, if specified, will be executed whether or not the try block raises an error.
try:
  print(x)
except:
  print("Something went wrong")
finally:
  print("The 'try except' is finished")

In [35]:
# This can be useful to close objects and clean up resources:
try:
    f = open("demofile.txt", "w")  # Open the file in write mode
    try:
        f.write("Lorem Ipsum")  # Write to the file
    except Exception as e:
        print(f"Something went wrong when writing to the file: {e}")
    finally:
        f.close()  # Ensure the file is closed
except Exception as e:
    print(f"Something went wrong when opening the file: {e}")


Raise an exception
As a Python developer you can choose to throw an exception if a condition occurs.

To throw (or raise) an exception, use the raise keyword.

In [None]:
x = -1  # a negative value, so the check below raises an exception

if x < 0:
  raise Exception("Sorry, no numbers below zero")

The raise keyword is used to raise an exception.

You can define what kind of error to raise, and the text to print to the user.

In [39]:
x = "hello"  # not an integer, so the check below raises a TypeError

if type(x) is not int:
  raise TypeError("Only integers are allowed")

TypeError: Only integers are allowed
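
An exception raised this way can be caught like any other. The sketch below wraps the check in a hypothetical helper named `require_int` (not part of the original notebook) and handles the resulting TypeError.

```python
def require_int(value):
    """Hypothetical helper: return value unchanged, or raise TypeError."""
    if type(value) is not int:
        raise TypeError("Only integers are allowed")
    return value

try:
    require_int("hello")
except TypeError as err:
    # str(err) is the text passed to TypeError(...)
    message = str(err)
    print("Caught:", message)  # Caught: Only integers are allowed
```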

Python User Input
User Input
Python allows for user input.

That means we are able to ask the user for input.

The function differs between Python 3 and Python 2.7.

Python 3 uses the input() function.

Python 2.7 uses the raw_input() function.

The following example asks for the username and, once you have entered it, prints it on the screen:

In [None]:
username = input("Enter username: ")
print("Username is: " + username)
# Enter username:rima
# Username is: rima

Username is: 
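
input() always returns a string, so numeric input has to be converted before use. The sketch below shows one way to do that; `parse_quantity` is a hypothetical helper, and a fixed string stands in for the typed text so the cell runs without waiting for the keyboard.

```python
def parse_quantity(raw):
    """Hypothetical helper: convert typed text to an int, or None if invalid."""
    try:
        return int(raw.strip())
    except ValueError:
        return None

# In an interactive session this would be: raw = input("Enter quantity: ")
raw = " 42 "  # stand-in for what the user typed
quantity = parse_quantity(raw)

print(quantity)               # 42
print(parse_quantity("abc"))  # None
```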


Python String Formatting
F-strings were introduced in Python 3.6 and are now the preferred way of formatting strings.

Before Python 3.6 we had to use the format() method.

F-Strings

An f-string allows you to format selected parts of a string.

To specify a string as an f-string, simply put an f in front of the string literal, like this:

In [3]:
txt = f"The price is 49 dollars"
print(txt)

The price is 49 dollars


In [4]:
# Placeholders and Modifiers
# To format values in an f-string, add placeholders {}. A placeholder can contain variables, operations, functions, and modifiers to format the value.
# Add a placeholder for the price variable:

price = 59
txt = f"The price is {price} dollars"
print(txt)

The price is 59 dollars


In [5]:
# A placeholder can also include a modifier to format the value.

# A modifier is included by adding a colon : followed by a legal formatting type, like .2f which means fixed point number with 2 decimals:
# Display the price with 2 decimals:

price = 59
txt = f"The price is {price:.2f} dollars"
print(txt)

The price is 59.00 dollars


In [6]:
# You can also format a value directly without keeping it in a variable:
# Display the value 95 with 2 decimals:

txt = f"The price is {95:.2f} dollars"
print(txt)

The price is 95.00 dollars


In [7]:
# Perform Operations in F-Strings
# You can perform Python operations inside the placeholders.

# You can do math operations:
# Perform a math operation in the placeholder, and return the result:

txt = f"The price is {20 * 59} dollars"
print(txt)

The price is 1180 dollars


In [8]:
# You can perform math operations on variables:
# Add taxes before displaying the price:

price = 59
tax = 0.25
txt = f"The price is {price + (price * tax)} dollars"
print(txt)

The price is 73.75 dollars


In [9]:
# You can perform if...else statements inside the placeholders:

# Return "Expensive" if the price is over 50, otherwise return "Cheap":

price = 49
txt = f"It is very {'Expensive' if price>50 else 'Cheap'}"

print(txt)

It is very Cheap


In [10]:
# Execute Functions in F-Strings
# You can execute functions inside the placeholder:
# Use the string method upper() to convert a value into upper case letters:

fruit = "apples"
txt = f"I love {fruit.upper()}"
print(txt)

I love APPLES


In [12]:
# The function does not have to be a built-in Python method. You can create your own functions and use them:
# Create a function that converts feet into meters:

def myconverter(x):
  return x * 0.3048

txt = f"The plane is flying at a {myconverter(30000)} meter altitude"
print(txt)

The plane is flying at a 9144.0 meter altitude


In [13]:
# More Modifiers
# At the beginning of this chapter we explained how to use the .2f modifier to format a number into a fixed point number with 2 decimals.
# There are several other modifiers that can be used to format values:
# Use a comma as a thousand separator:

price = 59000
txt = f"The price is {price:,} dollars"
print(txt)

The price is 59,000 dollars
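
Modifiers can be combined in one placeholder. As a small sketch (the variable names are illustrative), the thousand separator, a fixed number of decimals, and a field width can be written together:

```python
price = 59000

# Thousand separator plus two decimals
both = f"The price is {price:,.2f} dollars"

# Same value, right-aligned in a field 12 characters wide
padded = f"[{price:>12,.2f}]"

print(both)    # The price is 59,000.00 dollars
print(padded)  # [   59,000.00]
```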


String format()
Before Python 3.6 we used the format() method to format strings.

The format() method can still be used, but f-strings are faster and the preferred way to format strings.

The next examples on this page demonstrate how to format strings with the format() method.

The format() method also uses curly brackets as placeholders {}, but the syntax is slightly different:

In [14]:
price = 49
txt = "The price is {} dollars"
print(txt.format(price))

The price is 49 dollars


In [15]:
price = 49
txt = "The price is {:.2f} dollars"
print(txt.format(price))

The price is 49.00 dollars


In [16]:
quantity = 3
itemno = 567
price = 49
myorder = "I want {} pieces of item number {} for {:.2f} dollars."
print(myorder.format(quantity, itemno, price))

I want 3 pieces of item number 567 for 49.00 dollars.


In [17]:
# Index Numbers
# You can use index numbers (a number inside the curly brackets {0}) to be sure the values are placed in the correct placeholders:

quantity = 3
itemno = 567
price = 49
myorder = "I want {0} pieces of item number {1} for {2:.2f} dollars."
print(myorder.format(quantity, itemno, price))

I want 3 pieces of item number 567 for 49.00 dollars.


In [18]:
# Also, if you want to refer to the same value more than once, use the index number:

age = 36
name = "John"
txt = "His name is {1}. {1} is {0} years old."
print(txt.format(age, name))

His name is John. John is 36 years old.


In [19]:
# Named Indexes
# You can also use named indexes by entering a name inside the curly brackets {carname}, but then you must use names when you pass the parameter values txt.format(carname = "Ford"):

myorder = "I have a {carname}, it is a {model}."
print(myorder.format(carname = "Ford", model = "Mustang"))

I have a Ford, it is a Mustang.


Formatting Types in Python

| Format Specifier | Description                                                          |
|------------------|----------------------------------------------------------------------|
| `:<`            | Left aligns the result (within the available space).                 |
| `:>`            | Right aligns the result (within the available space).                |
| `:^`            | Center aligns the result (within the available space).               |
| `:=`            | Places the sign to the leftmost position.                            |
| `:+`            | Always shows a sign: plus for positive values, minus for negative ones. |
| `:-`            | Use a minus sign for negative values only.                           |
| `: `            | Inserts a space before positive numbers (and a minus sign before negative numbers). |
| `:,`            | Use a comma as a thousand separator.                                 |
| `:_`            | Use an underscore as a thousand separator.                           |
| `:b`            | Binary format.                                                       |
| `:c`            | Converts the value into the corresponding Unicode character.         |
| `:d`            | Decimal format.                                                     |
| `:e`            | Scientific format, with a lower case `e`.                            |
| `:E`            | Scientific format, with an upper case `E`.                           |
| `:f`            | Fixed-point number format.                                           |
| `:F`            | Fixed-point number format, in uppercase (showing `inf` and `nan` as `INF` and `NAN`). |
| `:g`            | General format.                                                     |
| `:G`            | General format (using an uppercase `E` for scientific notations).    |
| `:o`            | Octal format.                                                       |
| `:x`            | Hexadecimal format, lowercase.                                       |
| `:X`            | Hexadecimal format, uppercase.                                       |
| `:n`            | Number format: like `d` (or `g` for floats), but uses the current locale's number separators. |
| `:%`            | Percentage format.                                                  |


In [20]:
#Use "<" to left-align the value:

txt = f"We have {49:<8} chickens."
print(txt)

We have 49       chickens.


In [21]:
#Use ">" to right-align the value.
#The number 8 sets the available space for the value to 8 characters:

txt = f"We have {49:>8} chickens."
print(txt)

We have       49 chickens.


In [22]:
#Use "^" to center-align the value.
#The number 8 sets the available space for the value to 8 characters:

txt = f"We have {49:^8} chickens."
print(txt)

We have    49    chickens.


In [23]:
#Use "=" to place the plus/minus sign at the leftmost position.
#The number 8 sets the available space for the value to 8 characters:

txt = f"The temperature is {-5:=8} degrees celsius."

print(txt)

The temperature is -      5 degrees celsius.


In [24]:
#Use "+" to always indicate if the number is positive or negative:

txt = f"The temperature is between {-3:+} and {7:+} degrees celsius."

print(txt)

The temperature is between -3 and +7 degrees celsius.


In [26]:
#Use "-" to always indicate if the number is negative (positive numbers are displayed without any sign):

txt = f"The temperature is between {-3:-} and {7:-} degrees celsius."

print(txt)


The temperature is between -3 and 7 degrees celsius.


In [27]:
#Use " " (a space) to insert a space before positive numbers and a minus sign before negative numbers:

txt = f"The temperature is between {-3: } and {7: } degrees celsius."

print(txt)

The temperature is between -3 and  7 degrees celsius.


In [28]:
#Use "," to add a comma as a thousand separator:

txt = f"The universe is {13800000000:,} years old."

print(txt)

The universe is 13,800,000,000 years old.


In [29]:
#Use "_" to add an underscore character as a thousand separator:

txt = f"The universe is {13800000000:_} years old."

print(txt)

The universe is 13_800_000_000 years old.


In [30]:
#Use "b" to convert the number into binary format:

txt = f"The binary version of 5 is {5:b}"

print(txt)

The binary version of 5 is 101


In [31]:
#Use "d" to convert a number, in this case a binary number, into decimal number format:

txt = f"We have {0b101:d} chickens."

print(txt)

We have 5 chickens.


In [32]:
#Use "e" to convert a number into scientific number format (with a lower-case e):

txt = f"We have {5:e} chickens."

print(txt)

We have 5.000000e+00 chickens.


In [33]:
#Use "E" to convert a number into scientific number format (with an upper-case E):

txt = f"We have {5:E} chickens."

print(txt)

We have 5.000000E+00 chickens.


In [34]:
#Use "f" to convert a number into a fixed point number, default with 6 decimals, but use a period followed by a number to specify the number of decimals:

txt = f"The price is {45:.2f} dollars."
print(txt)

#without the ".2" inside the placeholder, this number will be displayed like this:

txt = f"The price is {45:f} dollars."
print(txt)

The price is 45.00 dollars.
The price is 45.000000 dollars.


In [35]:
#Use "F" to convert a number into a fixed point number, but display inf and nan as INF and NAN:

x = float('inf')

txt = f"The price is {x:F} dollars."
print(txt)

#same example, but with a lower case f:

txt = f"The price is {x:f} dollars."
print(txt)


The price is INF dollars.
The price is inf dollars.


In [36]:
#Use "o" to convert the number into octal format:

txt = f"The octal version of 10 is {10:o}"

print(txt)

The octal version of 10 is 12


In [37]:
#Use "x" to convert the number into Hex format:

txt = f"The Hexadecimal version of 255 is {255:x}"

print(txt)

The Hexadecimal version of 255 is ff


In [38]:
#Use "X" to convert the number into upper-case Hex format:

txt = f"The Hexadecimal version of 255 is {255:X}"

print(txt)

The Hexadecimal version of 255 is FF


In [39]:
#Use "%" to convert the number into a percentage format:

txt = f"You scored {0.25:%}"
print(txt)

#Or, without any decimals:

txt = f"You scored {0.25:.0%}"
print(txt)

You scored 25.000000%
You scored 25%
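
The table of formatting types also lists `c`, `g`, and `G`, which have no example cell above. A brief sketch with illustrative values: `c` turns an integer into the matching Unicode character, and `g` switches between fixed-point and scientific notation depending on the size of the number.

```python
# "c" converts an integer code point into its character
char = f"{65:c}"

# "g" keeps small and moderate values in fixed-point notation
small = f"{0.0001234:g}"

# "g" switches to scientific notation for large values (6 significant digits by default)
large = f"{1234567.0:g}"

# "G" does the same with an upper-case E
large_upper = f"{1234567.0:G}"

print(char)         # A
print(small)        # 0.0001234
print(large)        # 1.23457e+06
print(large_upper)  # 1.23457E+06
```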
