***
# Python Alchemy - Volume One
# Chapter 7 - Speaking Python’s Language

- [7.1 Text in Python](#71-text-in-python)
- [7.2 Crafting and Using Strings](#72-crafting-and-using-strings)
- [7.3 String Operations and Methods](#73-string-operations-and-methods)
- [7.4 Python’s String Formatting Methods](#74-pythons-string-formatting-methods)
- [7.5 Multiline Strings and Docstrings](#75-multiline-strings-and-docstrings)
- [7.6 Regular Expressions](#76-regular-expressions)

***

## 7.1 Text in Python

Python works with far more than plain letters and numbers: its strings fully support Unicode, the universal character standard that covers symbols, emojis, and virtually every writing system in use around the world.

In [2]:
print("Hello. World! \U0001F30F")
print("Hello, World! \U0001F40D")
print("नमस्ते Python") # Hindi
print("こんにちは Python") # Japanese
print("Привет Python") # Russian

Hello. World! 🌏
Hello, World! 🐍
नमस्ते Python
こんにちは Python
Привет Python
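
To make the Unicode support a little more concrete, here is a minimal sketch (the variable names are illustrative and not part of the original notebook). It suggests that a Python string counts Unicode code points, while encoding the same text as UTF-8 can yield a different number of bytes.

```python
# Illustrative names; not part of the original notebook.
snake = "\U0001F40D"                 # the snake emoji printed above

code_point = ord(snake)              # integer code point of the character
utf8_bytes = snake.encode("utf-8")   # the same character as UTF-8 bytes

print(len(snake))                    # 1 (one code point)
print(hex(code_point))               # 0x1f40d
print(len(utf8_bytes))               # 4 (bytes in UTF-8)
print(utf8_bytes.decode("utf-8") == snake)  # True: decoding round-trips
```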


## 7.2 Crafting and Using Strings

At its core, a string is simply a sequence of characters, such as letters, numbers, punctuation marks, or spaces.

In [3]:
name = 'Ivaan'
greeting = f"Hello, {name}!"
word = "Python"
print(word[0]) # Output: P
print(word[3]) # Output: h

P
h
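
Because a string is a sequence, indexing extends naturally to negative indices and slices. The sketch below reuses the word variable from the cell above; the other names are illustrative assumptions. It also shows that strings cannot be changed in place.

```python
word = "Python"

last = word[-1]          # negative indices count from the end -> 'n'
head = word[0:2]         # from index 0 up to, but not including, 2 -> 'Py'
tail = word[2:]          # from index 2 to the end -> 'thon'
backwards = word[::-1]   # a step of -1 walks the string in reverse -> 'nohtyP'

# Strings are immutable: assigning to an index raises TypeError.
try:
    word[0] = "J"
except TypeError as error:
    print("Cannot modify in place:", error)

print(last, head, tail, backwards)
```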


Python also supports triple quotes (''' or """), which allow strings to span multiple lines.

In [4]:
paragraph = """Python is a powerful language.
It is beginner-friendly,
yet highly versatile for experts."""
print(paragraph)

Python is a powerful language.
It is beginner-friendly,
yet highly versatile for experts.


#### Formatting Strings

Common escape characters:
\n → newline
\t → tab (indentation)
\" → double quote inside a string
\\ → backslash itself

In [6]:
print("Hello\nWorld") # Newline
print("Name:\tEve") # Tab space
print("She said, \"Python rocks!\"") # Escaping quotes
print("Path: C:\\Users\\Eve") # Escaping backslashes

Hello
World
Name:	Eve
She said, "Python rocks!"
Path: C:\Users\Eve
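
When a string contains many backslashes, a raw string (prefix r) can be easier to read, because backslashes are taken literally rather than as escape sequences. The following is a small sketch with illustrative variable names, not a cell from the original notebook.

```python
escaped_path = "C:\\Users\\Eve"   # each backslash is escaped
raw_path = r"C:\Users\Eve"        # raw string: backslashes are literal

print(raw_path)                   # C:\Users\Eve
print(escaped_path == raw_path)   # True
print(len(raw_path))              # 12
```

Raw strings are the same prefix used for regular expression patterns such as r"\d+".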


## 7.3 String Operations and Methods

Strings in Python come with a rich set of operations and built-in methods that make text handling powerful and flexible.

#### Basic operations

Strings in Python support several fundamental operations that make them versatile for everyday programming tasks.

#### Concatenation (+)

In [7]:
first = "Hello"
second = "World"
result = first + " " + second
print(result) # Output: Hello World

Hello World


#### Repetition (*)

In [8]:
line = "-=" * 5
print(line) # Output: -=-=-=-=-=

-=-=-=-=-=


#### Membership (in, not in)

In [None]:
text = "Python programming"
print("Python" in text) # True
print("Java" not in text) # True

#### Length of String (len())

In [None]:
message = "Hello!"
print(len(message)) # Output: 6

#### Common String Methods

Python offers a comprehensive suite of string methods that significantly streamline the process of text manipulation and analysis.

#### Changing Case

.lower() → converts all characters to lowercase.
.upper() → converts all characters to uppercase.
.title() → capitalizes the first letter of each word.
.capitalize() → capitalizes only the first letter of the string.

In [9]:
text = "python programming"
print(text.lower()) # python programming
print(text.upper()) # PYTHON PROGRAMMING
print(text.title()) # Python Programming
print(text.capitalize()) # Python programming

python programming
PYTHON PROGRAMMING
Python Programming
Python programming


#### Trimming Whitespace

.strip() → removes whitespace (spaces, tabs, newlines) from both ends.
.lstrip() → removes whitespace from the left side.
.rstrip() → removes whitespace from the right side.

In [10]:
raw = " hello world "
print(raw.strip()) # "hello world"
print(raw.lstrip()) # "hello world "
print(raw.rstrip()) # " hello world"

hello world
hello world 
 hello world
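
The stripping methods also handle tabs and newlines, and they accept an optional argument naming the characters to remove. A minimal sketch, with made-up sample strings:

```python
padded = "\t  hello world \n"
cleaned = padded.strip()               # tabs and newlines are whitespace too
print(repr(cleaned))                   # 'hello world'

decorated = "***Title***"
both = decorated.strip("*")            # remove '*' from both ends
left_only = decorated.lstrip("*")      # remove '*' from the left only
right_only = decorated.rstrip("*")     # remove '*' from the right only
print(both, left_only, right_only)     # Title Title*** ***Title
```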


#### Searching and Replacing

.find() → returns the index of the first occurrence of a substring (or -1 if not found).
.replace(old, new) → replaces all occurrences of a substring with another.

In [11]:
text = "I love Python programming"
print(text.find("Python")) # 7
print(text.replace("Python", "Java")) # I love Java programming

7
I love Java programming
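
The sketch below probes a few edge cases of searching and replacing on the same sentence: a missing substring, counting occurrences with .count(), and the optional third argument of .replace() that limits how many replacements are made. The variable names are illustrative.

```python
text = "I love Python programming"

missing = text.find("Java")            # -1: the substring is absent
first_o = text.find("o")               # 3: index of the first 'o'
o_count = text.count("o")              # 3: non-overlapping occurrences of 'o'
one_swap = text.replace("o", "0", 1)   # replace only the first occurrence

print(missing, first_o, o_count)
print(one_swap)                        # I l0ve Python programming
```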


#### Splitting and Joining

.split() → splits a string into a list using a delimiter (by default, any run of whitespace).
.join() → joins the strings in a list (or any iterable) into one string, using the given separator.

In [12]:
sentence = "Python is fun"
words = sentence.split()
print(words) # ['Python', 'is', 'fun']
joined = "-".join(words)
print(joined) # Python-is-fun

['Python', 'is', 'fun']
Python-is-fun
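
Passing an explicit delimiter changes how .split() behaves, which the following sketch tries to illustrate with made-up sample values. Note the difference between the default whitespace split and splitting on a single space.

```python
date = "2024-01-15"
parts = date.split("-")               # ['2024', '01', '15']
year, rest = date.split("-", 1)       # at most one split -> '2024', '01-15'

spaced = "a  b"                       # two spaces between the letters
default_split = spaced.split()        # ['a', 'b']: runs of whitespace collapse
explicit_split = spaced.split(" ")    # ['a', '', 'b']: an empty string appears

rebuilt = "-".join(parts)             # back to '2024-01-15'
print(parts, year, rest, default_split, explicit_split, rebuilt)
```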


#### Checking String Properties

.isdigit() → checks if all characters are digits.
.isalpha() → checks if all characters are letters.
.isalnum() → checks if all characters are letters or digits.
.startswith(sub) → checks if a string starts with a substring.
.endswith(sub) → checks if a string ends with a substring.

In [13]:
data = "Python3"
print(data.isdigit()) # False
print(data.isalpha()) # False
print(data.isalnum()) # True
print(data.startswith("Py")) # True
print(data.endswith("3")) # True

False
False
True
True
True
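
One common use of these checks is validating input before converting it. The helper below, parse_quantity, is hypothetical and only a sketch: it combines .isdigit() with .isascii() because .isdigit() alone also accepts some non-ASCII digit characters (for example the superscript "²") that int() rejects.

```python
def parse_quantity(raw):
    """Hypothetical helper: return raw as an int, or None if it is not plain digits."""
    cleaned = raw.strip()
    # isascii() guards against characters such as "²", for which
    # isdigit() is True but int() raises ValueError.
    if cleaned.isascii() and cleaned.isdigit():
        return int(cleaned)
    return None

print(parse_quantity(" 42 "))   # 42
print(parse_quantity("-42"))    # None: the minus sign is not a digit
print(parse_quantity("3.14"))   # None: the dot is not a digit
```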


## 7.4 Python‚Äôs String Formatting Methods

Formatting solves the limitations of simple string concatenation. While concatenation (+) can join text and variables, it quickly becomes messy, especially when mixing different data types like numbers and strings. For example,

In [None]:
a = 2
b = 5
message = "the sum of numbers " + str(a) + " and " + str(b) + " is " + str(a+b)
print(message)

Here, constructing a meaningful message requires multiple explicit type conversions using str() in order to concatenate integers with string literals.
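
To see why the str() calls are needed, the sketch below (illustrative names only) attempts the concatenation without a conversion and catches the resulting TypeError.

```python
a = 2
b = 5

try:
    message = "the sum is " + (a + b)   # str + int is not allowed
except TypeError as error:
    message = None
    print("TypeError:", error)

fixed = "the sum is " + str(a + b)      # explicit conversion works
print(fixed)                            # the sum is 7
```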

#### F-Strings

f-strings (string literals prefixed with f) provide a concise, expressive, and highly readable mechanism for embedding variables, expressions, and even function calls directly within string literals.

In [14]:
name = "Ivaan"
score = 92.5
status = "passed" if score >= 60 else "failed"

message = f"Student {name} has {status} the exam with a score of {score:.2f}."
print(message)

Student Ivaan has passed the exam with a score of 92.50.


f-strings are not restricted to simple variable substitution; they can also evaluate arbitrary expressions within the curly braces.

In [15]:
price = 120
discount = 15
message = (
    f"Original Price: ₹{price}\n"
    f"Discount: {discount}%\n"
    f"Final Price: ₹{price - (price * discount / 100)}"
)

print(message)

Original Price: ₹120
Discount: 15%
Final Price: ₹102.0


Another example, this time with conditional logic:

In [16]:
score = 78
result = f"Status: {'Pass' if score >= 60 else 'Fail'} (Score: {score})"
print(result)

Status: Pass (Score: 78)


You can even use function calls inside f-strings:

In [17]:
def square(x):
    return x * x

print(f"The square of 7 is {square(7)}")

The square of 7 is 49


f-strings also support format specifiers, making it easy to format numbers, dates, or other values.

In [19]:
pi = 3.1415926535
print(f"Value of pi: {pi:.2f}") # 2 decimal places
print(f"Value of pi: {pi:.4f}") # 4 decimal places

Value of pi: 3.14
Value of pi: 3.1416


Example with alignment and width:

In [21]:
for num in range(1, 6):
    print(f"{num:>3}") # right-align numbers in a field of width 3

  1
  2
  3
  4
  5
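
Beyond right alignment, the format specification covers left and centre alignment, zero padding, thousands separators, and percentages. The following sketch uses made-up values to show a few of these specifiers side by side.

```python
left = f"{'left':<8}|"        # left-align in a field of width 8
centered = f"{'mid':^7}|"     # centre in a field of width 7
padded = f"{42:05d}"          # pad an integer with zeros to width 5
grouped = f"{1234567:,}"      # comma as thousands separator
percent = f"{0.256:.1%}"      # multiply by 100, one decimal, add %

print(left)      # left    |
print(centered)  #   mid  |
print(padded)    # 00042
print(grouped)   # 1,234,567
print(percent)   # 25.6%
```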


#### Alternative Formatting Methods

Before the introduction of f-strings in Python 3.6, developers commonly used the .format() method or the older % operator for string formatting.

#### .format() Method

In [None]:
name = "Ivaan"
age = 25
sentence = "My name is {} and I am {} years old.".format(name, age)

print(sentence)

My name is Ivaan and I am 25 years old.


Alternatively, the .format() method accepts positional indices and keyword arguments:

In [23]:
sentence = "Coordinates: {0}, {1}".format(10, 20)
print(sentence) # Output: Coordinates: 10, 20

sentence = "Name: {name}, Age: {age}".format(name="Bob", age=30)
print(sentence) # Output: Name: Bob, Age: 30

Coordinates: 10, 20
Name: Bob, Age: 30
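
Positional indices can be repeated, and keyword fields can be filled from a dictionary by unpacking it with **. The sketch below uses illustrative values to show both, along with a width and precision inside the braces.

```python
echo = "{0} {1} {0}".format("tick", "tock")     # index 0 is reused

person = {"name": "Bob", "age": 30}             # illustrative data
from_dict = "Name: {name}, Age: {age}".format(**person)

aligned = "{:>6.1f}".format(3.14159)            # width 6, one decimal

print(echo)           # tick tock tick
print(from_dict)      # Name: Bob, Age: 30
print(repr(aligned))  # '   3.1'
```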


You can also format numbers with .format(), using the same format specifiers as f-strings. Here is an example:

In [25]:
pi = 3.1415926535
print("Pi rounded to 2 decimals: {:.2f}".format(pi))

Pi rounded to 2 decimals: 3.14


#### % Formatting

The % operator is Python’s oldest way of formatting strings, inspired by C-style formatting.

%s → string
%d → integer
%f → floating-point number

In [27]:
name = "Charlie"
age = 28

print("My name is %s and I am %d years old." % (name, age))

pi = 3.1415926535
print("Pi rounded to 2 decimals: %.2f" % pi)

My name is Charlie and I am 28 years old.
Pi rounded to 2 decimals: 3.14
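
As a rough side-by-side comparison, the sketch below builds the same sentence with all three approaches. The balance value is made up for illustration.

```python
name = "Charlie"
balance = 1234.5   # illustrative value

with_percent = "%s owes %.2f" % (name, balance)
with_format = "{} owes {:.2f}".format(name, balance)
with_fstring = f"{name} owes {balance:.2f}"

print(with_fstring)                                  # Charlie owes 1234.50
print(with_percent == with_format == with_fstring)   # True
```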


## 7.5 Multiline Strings and Docstrings

Multiline strings are a convenient way to embed extended or structured text, such as multi-paragraph messages, directly within the source code.

#### Multiline Strings

A multiline string enables text to span multiple lines naturally, without requiring explicit newline escape characters (\n).

In [28]:
message = """Python is a powerful language.
It is beginner-friendly,
yet highly versatile for experts."""
print(message)

Python is a powerful language.
It is beginner-friendly,
yet highly versatile for experts.
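
A triple-quoted string still contains ordinary newline characters; the line breaks in the source simply become \n in the value. The sketch below compares the message above with an equivalent string written using explicit escapes (the names explicit and lines are illustrative).

```python
message = """Python is a powerful language.
It is beginner-friendly,
yet highly versatile for experts."""

# The same text built from adjacent literals with explicit newlines.
explicit = (
    "Python is a powerful language.\n"
    "It is beginner-friendly,\n"
    "yet highly versatile for experts."
)

lines = message.splitlines()    # split the text at its line breaks

print(message == explicit)      # True
print(message.count("\n"))      # 2
print(len(lines))               # 3
```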


#### Docstrings

A docstring is a string literal placed as the first statement of a function, class, or module to describe its purpose, behavior, and expected usage. By convention it is written with triple quotes, even when it fits on a single line.

For example, a single-line docstring can document a function’s purpose:

In [29]:
def greet(name):
    """This function greets a person by their name."""
    return f"Hello, {name}!"

Docstrings are accessible through Python’s built-in introspection tools, namely the .__doc__ attribute and the built-in help() function:

In [31]:
print(greet.__doc__)
help(greet)

This function greets a person by their name.
Help on function greet in module __main__:

greet(name)
    This function greets a person by their name.



#### Multiline Docstrings

Multiline docstrings are used to provide comprehensive descriptions of a function, class, or module, encompassing details such as purpose, usage instructions, parameter specifications, and expected return values.

In [36]:
def factorial(n):
    """
    Calculate the factorial of a number n.
    
    Parameters:
    n (int): Non-negative integer
    
    Returns:
    int: Factorial of n
    """
    if n == 0:
        return 1
    
    return n * factorial(n-1)

print(factorial(5)) # Output: 120

120
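
When reading a multiline docstring programmatically, the standard library function inspect.getdoc() returns it with the leading blank line and common indentation removed. The area function below is a hypothetical example written only for this sketch.

```python
import inspect

def area(width, height):
    """
    Return the area of a rectangle.

    Parameters:
    width (float): Width of the rectangle
    height (float): Height of the rectangle
    """
    return width * height

doc = inspect.getdoc(area)       # cleaned-up docstring text
summary = doc.splitlines()[0]    # first line serves as a short summary

print(summary)                   # Return the area of a rectangle.
```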


## 7.6 Regular Expressions

Regular expressions (regex) in Python are powerful tools for pattern matching and text processing.

#### Python ‘re’ Module

The Python re module provides regular expression support, allowing powerful text pattern matching, searching, and manipulation.

#### re.search() vs re.match()

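The functions re.search() and re.match() in Python’s re module are fundamental tools for pattern recognition and text analysis. re.match() looks for a match only at the beginning of the string, while re.search() scans the whole string and returns the first match found anywhere in it. Both return a match object on success and None otherwise.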

In [37]:
import re
text = "Python is a powerful programming language."

# Using re.match()
match_result = re.match("Python", text)
print("re.match() result:", match_result) # Matches because 'Python' is at the start

# Using re.search()
search_result = re.search("powerful", text)
print("re.search() result:", search_result) # Matches because 'powerful' appears later in the string

# Example of a non-match with re.match()
no_match = re.match("powerful", text)
print("re.match() no match:", no_match) # None, because 'powerful' is not at the start

re.match() result: <re.Match object; span=(0, 6), match='Python'>
re.search() result: <re.Match object; span=(12, 20), match='powerful'>
re.match() no match: None
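
Since both functions return either a match object or None, results are usually checked before use. The sketch below shows one way to do that, and adds re.fullmatch(), which succeeds only when the pattern covers the entire string. Variable names are illustrative.

```python
import re

text = "Python is a powerful programming language."

found = re.search("powerful", text)
if found:                          # a match object is truthy; None is falsy
    word = found.group()           # the matched text
    start, end = found.span()      # start and end positions of the match

missing = re.search("Java", text)            # None: no match anywhere
whole = re.fullmatch("Python", "Python")     # pattern covers the whole string
partial = re.fullmatch("Python", text)       # None: text continues after 'Python'

print(word, start, end)            # powerful 12 20
print(missing, partial)            # None None
```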


#### re.findall()

The re.findall() function in Python’s re module is a powerful tool for extracting all non-overlapping occurrences of a specified pattern within a string.

In [38]:
import re
text = "Emails: ivaan@example.com, laisha@test.org, eve@abc.net"
# Using re.findall() to extract all email addresses
all_emails = re.findall(r"\w+@\w+\.\w+", text)
print("re.findall() result:", all_emails)
# Output: ['ivaan@example.com', 'laisha@test.org', 'eve@abc.net']
# Using re.search() to extract only the first email
first_email = re.search(r"\w+@\w+\.\w+", text)
print("re.search() result:", first_email.group())
# Output: 'ivaan@example.com'
# Using re.match() to check at the start of the string
match_email = re.match(r"\w+@\w+\.\w+", text)
print("re.match() result:", match_email)
# Output: None, because the string starts with "Emails: "

re.findall() result: ['ivaan@example.com', 'laisha@test.org', 'eve@abc.net']
re.search() result: ivaan@example.com
re.match() result: None
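
When the pattern contains capturing groups, re.findall() returns tuples of the groups instead of whole matches, and re.finditer() yields match objects that also carry positions. The sketch below applies both to the same text. The pattern is deliberately simplified and would not handle dots or hyphens in real email addresses.

```python
import re

text = "Emails: ivaan@example.com, laisha@test.org, eve@abc.net"

# Two groups: the part before '@' and the domain after it.
pairs = re.findall(r"(\w+)@(\w+\.\w+)", text)
print(pairs)
# [('ivaan', 'example.com'), ('laisha', 'test.org'), ('eve', 'abc.net')]

# finditer() gives match objects, so each match's position is available.
positions = [(m.group(), m.start()) for m in re.finditer(r"\w+@\w+\.\w+", text)]
print(positions[0])   # ('ivaan@example.com', 8)
```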


#### Special Sequences

Special sequences enable the identification of common character classes within strings without the need for elaborate or verbose pattern definitions. For example, \d matches any single digit:

In [None]:
import re
text = "My phone number is 9876543210"
digits = re.findall(r"\d", text)
print(digits) # Output: ['9', '8', '7', '6', '5', '4', '3', '2', '1', '0']
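
Other frequently used special sequences include \w (word characters), \s (whitespace), and the uppercase forms such as \D that match the opposite set. The sketch below tries each on made-up sample text.

```python
import re

text = "Room 42, Floor 7"

digits = re.findall(r"\d", text)        # single digits
words = re.findall(r"\w+", text)        # runs of word characters
pieces = re.split(r"\s+", "a  b\tc")    # split on runs of whitespace
only_digits = re.sub(r"\D", "", text)   # delete every non-digit

print(digits)        # ['4', '2', '7']
print(words)         # ['Room', '42', 'Floor', '7']
print(pieces)        # ['a', 'b', 'c']
print(only_digits)   # 427
```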

#### Quantifiers

Quantifiers specify how many times a character or pattern must occur within a string. For example, \d+ matches one or more consecutive digits:

In [None]:
import re
text = "I have 2 apples and 15 oranges."
numbers = re.findall(r"\d+", text)
print(numbers) # Output: ['2', '15']
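
The main quantifiers are + (one or more), * (zero or more), ? (zero or one), and {m,n} (between m and n repetitions). The following sketch tries each on short made-up strings.

```python
import re

text = "I have 2 apples and 15 oranges."

one_or_more = re.findall(r"\d+", text)                  # + : one or more digits
two_or_more = re.findall(r"\d{2,}", text)               # {2,} : at least two digits
optional_u = re.findall(r"colou?r", "color colour")     # ? : the 'u' is optional
exactly_three = re.findall(r"\d{3}", "12 345 6789")     # {3} : exactly three digits
zero_or_more = re.findall(r"ab*", "a ab abb")           # * : 'a' then any number of 'b'

print(one_or_more)     # ['2', '15']
print(two_or_more)     # ['15']
print(optional_u)      # ['color', 'colour']
print(exactly_three)   # ['345', '678']
print(zero_or_more)    # ['a', 'ab', 'abb']
```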

#### Character Classes

Character classes in regular expressions allow the definition of sets of characters to be matched at a particular position within a string.

In [39]:
import re
text = "User IDs: A123, B456, C789, D_01"

# Match any uppercase letter followed by digits
pattern = r"[A-Z]\d+"
matches = re.findall(pattern, text)
print(matches) # output: ['A123', 'B456', 'C789']

['A123', 'B456', 'C789']
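
In the result above, D_01 is skipped because the underscore is not a digit. The sketch below widens the class to accept it, and also shows a negated class, written with ^ inside the brackets. The variable names are illustrative.

```python
import re

text = "User IDs: A123, B456, C789, D_01"

# [\d_] accepts a digit or an underscore after the uppercase letter.
with_underscore = re.findall(r"[A-Z][\d_]+", text)
print(with_underscore)   # ['A123', 'B456', 'C789', 'D_01']

vowels = re.findall(r"[aeiou]", "Python")        # any one listed character
non_vowels = re.findall(r"[^aeiou]", "Python")   # ^ negates the class
print(vowels)            # ['o']
print(non_vowels)        # ['P', 'y', 't', 'h', 'n']
```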
