#  Format Strings in Python

Format strings let you insert the values of variables into a string, which makes output easier to read. Python offers several ways to format strings:

## String interpolation (f-strings)

Introduced in Python 3.6, f-strings (formatted string literals) are prefixed with `f` and use curly braces `{}` to enclose the variables or expressions to be formatted. For example:

In [36]:
name = "Umair"
age = 20
print(f"My name is {name} and I am {age} years old.")

My name is Umair and I am 20 years old.
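
The f-string above is one of several formatting styles. As a minimal sketch for comparison, the same sentence can also be built with `str.format()` and with the older `%` operator; the variable names `with_fstring`, `with_format`, and `with_percent` are illustrative only.

```python
name = "Umair"
age = 20

# f-string: expressions are written directly inside the braces
with_fstring = f"My name is {name} and I am {age} years old."

# str.format(): empty braces are filled by the arguments in order
with_format = "My name is {} and I am {} years old.".format(name, age)

# %-formatting: %s inserts a string, %d inserts an integer
with_percent = "My name is %s and I am %d years old." % (name, age)

# All three approaches build the same string
print(with_fstring == with_format == with_percent)  # True
```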


## `.title()` in Python

The `.title()` method is used with strings to **capitalize the first letter of each word** and make all other letters lowercase.

**Syntax:**
```python
string.title()
```


In [61]:
# Example of .title() in Python

article_title = "data science basics"
formatted_title = article_title.title()

print("Original:", article_title)
print("Formatted:", formatted_title)


Original: data science basics
Formatted: Data Science Basics
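
Two details of `.title()` are worth checking before relying on it. This sketch uses made-up sample strings: letters after the first in each word are lowercased, and an apostrophe counts as a word boundary, so the letter after it is capitalized too.

```python
# Letters after the first in each word are lowercased
shouting = "dATA sCIENCE bASICS".title()
print(shouting)  # Data Science Basics

# An apostrophe starts a new "word", so the next letter is capitalized
contraction = "they're learning python".title()
print(contraction)  # They'Re Learning Python
```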


# Raw String (r'')

In Python, raw strings are useful for text that contains backslashes, such as Windows file paths and regular expressions. When a string literal is prefixed with the letter `r`, Python treats backslashes as literal characters rather than as the start of escape sequences.

Consider the following examples of regular string and raw string:

Regular string:

In [37]:
regular_string = "C:\new_folder\file.txt"
print("Regular String:", regular_string)

Regular String: C:
ew_folderile.txt


In the variable regular_string, the sequences \n and \f are interpreted as escape sequences: \n becomes a newline character and \f becomes a form feed character. As a result, the file path is printed incorrectly.

Raw string:

In [38]:
raw_string = r"C:\new_folder\file.txt"
print("Raw String:", raw_string)

Raw String: C:\new_folder\file.txt
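
A raw string is only a different way of writing the literal; the resulting string is the same as one written with doubled backslashes. The sketch below compares the two spellings (the variable names are illustrative).

```python
raw_path = r"C:\new_folder\file.txt"
escaped_path = "C:\\new_folder\\file.txt"  # each \\ is one literal backslash

# Both literals produce the same string
print(raw_path == escaped_path)  # True

# The raw string contains no newline character, only a backslash followed by "n"
print("\n" in raw_path)  # False
```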



# String Operations

## Objectives

After completing this part you will be able to:

*   Work with strings
*   Perform operations on strings
*   Manipulate strings using indexing and escape sequences


<h2>Table of Contents</h2>
<div class="alert alert-block alert-info" style="margin-top: 20px">
    <ul>
        <li>
            <a href="#What-are-Strings?">What are Strings?</a>
        </li>
        <li>
            <a href="#Indexing">Indexing</a>
            <ul>
                <li><a href="#Negative-Indexing">Negative Indexing</a></li>
                <li><a href="#Slicing">Slicing</a></li>
                <li><a href="#Stride">Stride</a></li>
                <li><a href="#Concatenate-Strings">Concatenate Strings</a></li>
            </ul>
        </li>
        <li>
            <a href="#Escape-Sequences">Escape Sequences</a>
        </li>
        <li>
            <a href="#String-Manipulation-Operations">String Manipulation Operations</a>
        </li>
        <li>
            <a href="#Quiz-on-Strings">Quiz on Strings</a>
        </li>
    </ul>

</div>

<hr>


In [39]:
# Use single quotation marks for defining string

'The BodyGuard'

'The BodyGuard'

## Indexing


<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/xwKqWxQWBL47h718d3BLzw/IMG1.png" width="600" align="center">


In [40]:
# Print the first element in the string
Name = "The BodyGuard"
print(Name[0])

T


### Negative Indexing


<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/VAdKOVUWpsM7hC7CjWmEdQ/IMG2.png" width="600" align="center">


In [41]:
# Print the last element in the string

print(Name[-1])

d
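
A negative index counts from the end of the string, so index `-1` refers to the same character as index `len(Name) - 1`. A short sketch of that relationship:

```python
Name = "The BodyGuard"

last_index = len(Name) - 1  # 12, because the string has 13 characters

# Both expressions select the final character
print(Name[-1], Name[last_index])  # d d

# -len(Name) is the first character, the same as index 0
print(Name[-len(Name)] == Name[0])  # True
```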


### Slicing
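We can obtain multiple characters from a string using slicing. A slice `Name[start:stop]` begins at index `start` and ends just before index `stop`, so `Name[0:4]` returns the characters at indexes 0 through 3 and `Name[8:12]` returns the characters at indexes 8 through 11: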


<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/Ph9xvvIvaf-krPI-qoaZ2Q/IMG3.png" width="600" align="center">


In [42]:
# Take the slice of the variable Name from index 0 to index 3

Name[0:4]

'The '

In [43]:
# Take the slice of the variable Name from index 8 to index 11

Name[8:12]

'Guar'
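
Either bound of a slice can be left out: an omitted start means the beginning of the string, and an omitted stop means the end. A brief sketch using the same `Name` string (the names `first_word` and `second_word` are illustrative):

```python
Name = "The BodyGuard"

first_word = Name[:3]   # indexes 0, 1, 2
second_word = Name[4:]  # index 4 through the end

print(first_word)   # The
print(second_word)  # BodyGuard

# The length of a slice is stop - start
print(len(Name[8:12]))  # 4
```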

### Stride


We can also input a stride value as follows, with the '2' indicating that we are selecting every second character:


<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/SaORQ3bZLArgeix_9-jjRQ/IMG4.png" width="600" align="center">


In [44]:
# Select every second character, starting at index 0

e = 'clocrkr1e1c1t'
e[::2]

'correct'
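
The stride can be combined with a start and a stop, and it can be negative. A small sketch with the `Name` string used earlier:

```python
Name = "The BodyGuard"

# Every second character of the first five characters ("The B")
print(Name[0:5:2])  # TeB

# A stride of -1 walks the string backwards
print(Name[::-1])  # drauGydoB ehT
```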

### Concatenate Strings


We can concatenate, or combine, strings by using the addition operator (`+`). The result is a new string that is a combination of both:


In [45]:
# Concatenate two strings

statement = Name + " is the best album"
statement

'The BodyGuard is the best album'
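
Concatenation does not modify the original strings, and `+` only joins strings with other strings. This sketch shows both points; the `rating` value is invented purely for illustration.

```python
Name = "The BodyGuard"
statement = Name + " is the best album"

# The original string is unchanged
print(Name)  # The BodyGuard

# Numbers must be converted with str() before concatenating
rating = 5  # hypothetical value, for illustration only
review = Name + " gets " + str(rating) + " stars"
print(review)  # The BodyGuard gets 5 stars
```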

## String Manipulation Operations


There are many string operation methods in Python that can be used to manipulate the data. We are going to use some basic string operations on the data.


Let's try with the method <code>upper</code>; this method converts lower case characters to upper case characters:


In [46]:
# Convert all the characters in string to upper case

a = "Thriller is the sixth studio album"
print("before upper:", a)
b = a.upper()
print("After upper:", b)

before upper: Thriller is the sixth studio album
After upper: THRILLER IS THE SIXTH STUDIO ALBUM


The method <code>replace</code> replaces a segment of the string, i.e. a substring, with a new string. The first argument is the part of the string we would like to change, and the second argument is what we would like to replace it with. The result is a new string with the segment changed:


In [47]:
# Replace the old substring with the new substring if the old one is found in the string

a = "The BodyGuard is the best album"
b = a.replace('BodyGuard', 'Janet')
b

'The Janet is the best album'
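
`replace` changes every occurrence of the substring unless an optional third argument limits the count. A small sketch with a made-up sentence:

```python
text = "the best of the best"

# By default, every occurrence is replaced
print(text.replace("best", "worst"))  # the worst of the worst

# A third argument limits the number of replacements
print(text.replace("best", "worst", 1))  # the worst of the best

# The original string is left untouched
print(text)  # the best of the best
```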

The method <code>find</code> finds a substring. The argument is the substring you would like to find, and the output is the index at which its first occurrence begins. We can find the substring <code>he</code> or <code>Guard</code>.


<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/eK6pP3xD4kLWk2vri9KA5A/IMG5.png" width="600" align="center">


In [48]:
# Find the substring in the string. The output is the index of the first character of its first occurrence

name = "The BodyGuard"
name.find('he')

1

If the substring is not in the string, the output is -1. For example, 'Jasdfasdasdf' is not a substring of "The BodyGuard":


In [49]:
# If the substring cannot be found, find() returns -1

name.find('Jasdfasdasdf')

-1
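
Because `find` signals a miss with `-1` rather than an error, it helps to compare the result against `-1` before using it as an index. A minimal sketch, which also looks up the `Guard` substring mentioned above (the `position` variable is illustrative):

```python
name = "The BodyGuard"

# 'Guard' starts at index 8
print(name.find("Guard"))  # 8

# Compare against -1 before using the result as an index
position = name.find("Janet")
if position == -1:
    print("Substring not found")
else:
    print("Found at index", position)

# The in operator answers the yes/no question directly
print("Body" in name)  # True
```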

The method <code>split</code> splits the string at the specified separator and returns a list.

**Syntax**

<code>string.split(separator, maxsplit)</code>

**Parameters**
- separator (optional): This is the delimiter at which the string will be split. If not provided, the default separator is any whitespace.
- maxsplit (optional): This specifies the maximum number of splits to perform. If not provided, there is no limit on the number of splits.

**Return Value**:

The method returns a list of substrings.


In [50]:
# Split the string into a list of substrings
name = "The BodyGuard"
split_string = name.split()
split_string

['The', 'BodyGuard']
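
The two optional parameters can be seen in a short sketch. The comma-separated sample string is made up for illustration.

```python
colors = "red,green,blue,yellow"  # illustrative sample data

# Split at every comma
print(colors.split(","))  # ['red', 'green', 'blue', 'yellow']

# A maxsplit of 2 performs at most two splits, leaving the rest intact
print(colors.split(",", 2))  # ['red', 'green', 'blue,yellow']
```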

## RegEx

In Python, RegEx (short for Regular Expression) is a tool for matching and handling strings. 
Python provides a built-in module called <code>re</code>, which allows you to work with regular expressions. 
This module provides several functions, including <code>search</code>, <code>split</code>, <code>findall</code>, and <code>sub</code>. 
First, import the <code>re</code> module:


In [51]:
import re

The search() function searches for a specified pattern within a string and returns a match object for the first occurrence, or None if there is no match. Here is an example that uses the search() function to look for the word "Body" in the string "The BodyGuard is the best album".


In [52]:
s1 = "The BodyGuard is the best album"

# Define the pattern to search for
pattern = r"Body"

# Use the search() function to search for the pattern in the string
result = re.search(pattern, s1)

# Check if a match was found
if result:
    print("Match found!")
else:
    print("Match not found.")


Match found!
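
The object returned by `search()` also records where the match was found. A brief sketch using the same string:

```python
import re

s1 = "The BodyGuard is the best album"
result = re.search(r"Body", s1)

# span() returns the (start, end) indexes of the match
print(result.span())  # (4, 8)

# Slicing the original string with the span gives the matched text
start, end = result.span()
print(s1[start:end])  # Body

# When nothing matches, search() returns None
print(re.search(r"Janet", s1))  # None
```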


Regular expressions (RegEx) are patterns used to match and manipulate strings of text. There are several special sequences in RegEx that can be used to match specific characters or patterns.

| Special Sequence | Meaning | Example |
| ---------------- | ------- | ------- |
| `\d` | Matches any digit character (0-9) | `"123"` matches `\d\d\d` |
| `\D` | Matches any non-digit character | `"hello"` matches `\D\D\D\D\D` |
| `\w` | Matches any word character (a-z, A-Z, 0-9, and _) | `"hello_world"` matches `\w\w\w\w\w\w\w\w\w\w\w` |
| `\W` | Matches any non-word character | `"@#$%"` matches `\W\W\W\W` |
| `\s` | Matches any whitespace character (space, tab, newline, etc.) | `"hello world"` matches `\w\w\w\w\w\s\w\w\w\w\w` |
| `\S` | Matches any non-whitespace character | `"hello_world"` matches `\S\S\S\S\S\S\S\S\S\S\S` |
| `\b` | Matches the boundary between a word character and a non-word character | `\bcat\b` matches "cat" in "The cat sat on the mat" |
| `\B` | Matches any position that is not a word boundary | `\Bcat\B` matches "cat" in "education" but not in "The cat sat on the mat" |
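
Several of these special sequences can be checked directly in Python. The sketch below uses `re.fullmatch()`, which succeeds only when the entire string fits the pattern, and `re.search()` for the two boundary sequences; the test word "education" is chosen for illustration because "cat" sits in the middle of it.

```python
import re

# fullmatch() succeeds only if the whole string fits the pattern
print(bool(re.fullmatch(r"\d\d\d", "123")))        # True
print(bool(re.fullmatch(r"\D\D\D\D\D", "hello")))  # True
print(bool(re.fullmatch(r"\W\W\W\W", "@#$%")))     # True

# \b requires a word boundary on each side of "cat"
print(bool(re.search(r"\bcat\b", "The cat sat on the mat")))  # True

# \B requires that there is no word boundary on either side
print(bool(re.search(r"\Bcat\B", "education")))               # True
print(bool(re.search(r"\Bcat\B", "The cat sat on the mat")))  # False
```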


Special Sequence Examples:

A simple example of using the <code>\d</code> special sequence in a regular expression pattern with Python code:


In [53]:
pattern = r"\d\d\d\d\d\d\d\d\d\d"  # Matches any ten consecutive digits
text = "My Phone number is 1234567890"
match = re.search(pattern, text)

if match:
    print("Phone number found:", match.group())
else:
    print("No match")

Phone number found: 1234567890


The match.group() method is used in Python's re module to retrieve the part of the string where the regular expression pattern matched. Here's a detailed explanation:

**Purpose**
- Extract Matched Text: match.group() returns the exact substring that matched the pattern.
 
**Usage**
- When you use functions like re.search() or re.match(), they return a match object if the pattern is found. You can then use match.group() to get the matched text.

Here `match.group()` retrieves the substring 1234567890 from the text, which is the part that matched the pattern.
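
`match.group()` can also return parts of a match when the pattern contains parentheses (capturing groups). The sketch below uses a made-up text and pattern to show this, along with the check for `None` that is needed before calling `group()`.

```python
import re

text = "Call 555-1234 today"  # hypothetical sample text
match = re.search(r"(\d\d\d)-(\d\d\d\d)", text)

if match:
    print(match.group())   # 555-1234 (the whole match)
    print(match.group(1))  # 555 (first parenthesized group)
    print(match.group(2))  # 1234 (second parenthesized group)

# search() returns None when nothing matches, so calling
# .group() without a check would raise AttributeError
no_match = re.search(r"\d", "no digits here")
print(no_match is None)  # True
```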


A simple example of using the <code>\W</code> special sequence in a regular expression pattern with Python code:


In [54]:
pattern = r"\W"  # Matches any non-word character
text = "Hello, world!"
matches = re.findall(pattern, text)

print("Matches:", matches)

Matches: [',', ' ', '!']


The regular expression pattern is defined as r"\W", which uses the \W special sequence to match any character that is not a word character (a-z, A-Z, 0-9, or _). The string being searched is "Hello, world!", so the matches are the comma, the space, and the exclamation mark.


In [55]:
s2 = "The BodyGuard is the best album of 'Whitney Houston'."


# Use the findall() function to find all occurrences of "st" in the string
result = re.findall("st", s2)

# Print out the list of matched substrings
print(result)


['st', 'st']
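
`findall()` returns every non-overlapping match as a list of strings. Combining the literal `st` with `\w*` is one way to return the whole words that contain it rather than just the two letters; this pattern is an illustration, not part of the original exercise.

```python
import re

s2 = "The BodyGuard is the best album of 'Whitney Houston'."

# \w* on each side extends the match to the surrounding word characters
words_with_st = re.findall(r"\w*st\w*", s2)
print(words_with_st)  # ['best', 'Houston']

# When nothing matches, findall() returns an empty list
print(re.findall(r"\d", s2))  # []
```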


The <code>split()</code> function of the <code>re</code> module splits a string into a list of substrings based on a specified pattern.


In [56]:
# Use the split function to split the string by the "\s"
split_array = re.split(r"\s", s2)

# The split_array contains all the substrings, split by whitespace characters
print(split_array)

['The', 'BodyGuard', 'is', 'the', 'best', 'album', 'of', "'Whitney", "Houston'."]


Here's a detailed explanation of <code>re.split(r"\s", s2)</code>:

- **re.split**: This function splits a string at each occurrence of a pattern.
- **r"\s"**: This is a regular expression pattern that matches any whitespace character (spaces, tabs, newlines, etc.).
- **s2**: This is the string that you want to split.
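
Because `\s` matches one whitespace character at a time, consecutive whitespace produces empty strings in the result. A short sketch with a made-up string shows the difference between `\s` and `\s+`:

```python
import re

messy = "one \ttwo\n\nthree"  # a space plus a tab, then two newlines

# \s splits at every single whitespace character, leaving empty strings
print(re.split(r"\s", messy))  # ['one', '', 'two', '', 'three']

# \s+ treats each run of whitespace as one separator
print(re.split(r"\s+", messy))  # ['one', 'two', 'three']
```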


The <code>sub</code> function of a regular expression in Python is used to replace all occurrences of a pattern in a string with a specified replacement.


In [57]:
# Define the regular expression pattern to search for
pattern = r"Whitney Houston"

# Define the replacement string
replacement = "legend"

# Use the sub function to replace the pattern with the replacement string
new_string = re.sub(pattern, replacement, s2, flags=re.IGNORECASE)

# The new_string contains the original string with the pattern replaced by the replacement string
print(new_string) 

The BodyGuard is the best album of 'legend'.
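
`re.sub()` accepts an optional `count` argument, and the related `re.subn()` function also reports how many replacements were made. A brief sketch on the same string, replacing `st` with `ST` purely for illustration:

```python
import re

s2 = "The BodyGuard is the best album of 'Whitney Houston'."

# count=1 replaces only the first occurrence of the pattern
first_only = re.sub(r"st", "ST", s2, count=1)
print(first_only)  # The BodyGuard is the beST album of 'Whitney Houston'.

# subn() returns the new string and the number of replacements
new_string, replacements = re.subn(r"st", "ST", s2)
print(new_string)    # The BodyGuard is the beST album of 'Whitney HouSTon'.
print(replacements)  # 2
```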


**Question 6**

A content management system needs to standardize how article titles are displayed. If a writer submits the title "data science basics" stored in the variable `article_title`, which method would transform it to "Data Science Basics"?

**Answer:** `article_title.title()`, as shown in the `.title()` section above.

# RegEx Summary Table (Detective Tools Edition)


| Function   | What it does                          | Detective Gadget       |
|------------|---------------------------------------|------------------------|
| `search()` | Finds the first match                 | 🔍 Magnifying glass    |
| `findall()`| Finds all matches                     | 🛒 Shopping cart       |
| `split()`  | Cuts text at the pattern              | ✂️ Scissors            |
| `sub()`    | Replaces text with something new      | 🎭 Disguise tool       |


<div style="text-align: center;">
  <h2>Module 1 of Data Science AI (Coursera) Complete✅</h2>
</div>
