## String Manipulation II

#### Hemant Thapa

Regular Expressions

Python's re module supports complex string searching and manipulation with regular expressions, which makes it well suited to pattern matching and complex replacements.

In [1]:
import re
from string import Template

##### 1. Searching with Regular Expressions

In [2]:
string = """
With its mountainous landscape, it is no coincidence that on average it is Scotland that receives the most annual rain in the UK.
The wettest parts of the UK are generally in mountainous regions, with the Western Highlands prone to high levels of rain. 
Here, rainfall can be 3,000 millimeters per year. However, the East of Scotland can see levels as low as 800 millimeters. 
This is often due to rainfall from the Atlantic weather systems coming in from the West and as these systems move east, rain deposits reduce.
"""

In [3]:
match = re.search("Atlantic", string)
if match:
    print("We have a match!")
else:
    print("No Match")

We have a match!
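
A match object carries more than a yes/no answer. The sketch below uses a shortened copy of the passage so that it runs on its own; the names `text`, `match`, and `amounts` are illustrative and not part of the original notebook. It reads the matched text and its position, then collects every rainfall figure with `re.findall`.

```python
import re

# Shortened copy of the passage, so the example is self-contained.
text = (
    "Here, rainfall can be 3,000 millimeters per year. "
    "However, the East of Scotland can see levels as low as 800 millimeters."
)

# re.search returns a match object for the first occurrence, or None.
match = re.search(r"scotland", text, flags=re.IGNORECASE)
if match:
    print(match.group(0))  # the text that matched
    print(match.span())    # (start, end) indices of the match

# re.findall returns every non-overlapping match. This pattern accepts
# digits with optional thousands separators, followed by " millimeters".
amounts = re.findall(r"\d{1,3}(?:,\d{3})*(?= millimeters)", text)
print(amounts)
```

The lookahead `(?= millimeters)` checks for the unit without including it in the result, so `amounts` holds only the numbers.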


##### 2. Splitting with Regular Expressions

In [4]:
word_split = re.split(r"\s", string[0:53])

In [5]:
word_split

['',
 'With',
 'its',
 'mountainous',
 'landscape,',
 'it',
 'is',
 'no',
 'coincidence']
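
The empty string at the start of the result comes from the newline that opens the passage: `\s` splits on every single whitespace character. A minimal sketch of the alternatives follows, using a made-up sample string (`text` is an assumption, chosen to contain a double space and a tab).

```python
import re

text = "\nWith its  mountainous\tlandscape, it is no coincidence"

# r"\s" splits on each single whitespace character, so a leading newline
# or a run of spaces produces empty strings in the result.
single = re.split(r"\s", text)

# r"\s+" treats a run of whitespace as one separator; strip() first to
# avoid an empty string at either end.
collapsed = re.split(r"\s+", text.strip())

# str.split() with no arguments gives the same words without a regex.
plain = text.split()

print(single)
print(collapsed)
print(collapsed == plain)
```

For plain whitespace splitting, `str.split()` is usually enough; `re.split` becomes useful when the separator is a real pattern, such as commas or semicolons with optional spaces.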

##### 3. Replacing with Regular Expressions

In [6]:
statement = "Scotland has a cold weather"

In [7]:
new_word = re.sub("S", "$", statement)
new_word_2 = re.sub("a", "@", new_word)
print(new_word_2)

$cotl@nd h@s @ cold we@ther


In [8]:
def replace_multiple(char):
    # Characters without an entry in the mapping are returned unchanged.
    replacements = {"S": "$", "a": "@", "o": "0"}
    return replacements.get(char, char)

In [9]:
# The pattern "[Sa]" only matches "S" and "a", so "o" is never passed to replace_multiple.
new_statement = re.sub("[Sa]", lambda match: replace_multiple(match.group(0)), statement)
print(new_statement)

$cotl@nd h@s @ cold we@ther


##### 4. Count Occurrences of a Substring

In [10]:
print(statement)

Scotland has a cold weather


In [11]:
count = statement.count("a")
print(count)

4
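
`str.count` works on literal substrings and counts non-overlapping occurrences. The sketch below contrasts it with a pattern-based count; the sample sentence and the variable names are illustrative choices.

```python
import re

text = "The rain in Spain falls mainly in the plain"

# str.count counts non-overlapping occurrences of a literal substring.
literal_count = text.count("ain")

# len(re.findall(...)) counts pattern matches, here any vowel.
vowel_count = len(re.findall(r"[aeiou]", text, flags=re.IGNORECASE))

# Non-overlapping: "aaaa" contains "aa" at indices 0, 1 and 2,
# but count() reports only the two that do not overlap.
overlap_demo = "aaaa".count("aa")

print(literal_count, vowel_count, overlap_demo)
```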


##### 5. Find All Occurrences of a Substring

In [12]:
def find_all(sub, string):
    start = 0
    while start < len(string):
        start = string.find(sub, start)
        if start == -1:
            return
        yield start
        # Advance by one so overlapping occurrences are also found.
        start += 1

In [13]:
indices = list(find_all('in', "The rain in Spain falls mainly in the plain"))
print(indices)

[6, 9, 15, 26, 31, 41]
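
The same indices can be obtained with `re.finditer`, shown here as an alternative sketch rather than a replacement for the generator above. `re.escape` is used so that a substring containing regex metacharacters is treated literally.

```python
import re

text = "The rain in Spain falls mainly in the plain"

# re.finditer yields one match object per non-overlapping occurrence.
indices = [m.start() for m in re.finditer(re.escape("in"), text)]
print(indices)

# A zero-width lookahead also reports overlapping occurrences.
overlapping = [m.start() for m in re.finditer(r"(?=aa)", "aaaa")]
print(overlapping)
```

Note the difference in behavior: plain `re.finditer` skips overlapping occurrences, while the `find_all` generator (which advances by one character) and the lookahead form both report them.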


##### 6. String Interpolation / Template Strings

In [14]:
t = Template('Hello, $name, $greeting')
string = t.substitute(name="Harry", greeting="How you doing?")
print(string)

Hello, Harry, How you doing?
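
`substitute` requires a value for every placeholder. The following sketch shows what happens when one is missing, and two details of the template syntax; the `price` template is a made-up example.

```python
from string import Template

t = Template("Hello, $name, $greeting")

# substitute() raises KeyError when a placeholder has no value.
try:
    t.substitute(name="Harry")
except KeyError as exc:
    print("Missing placeholder:", exc)

# safe_substitute() leaves unknown placeholders in place instead.
partial = t.safe_substitute(name="Harry")
print(partial)

# "$$" escapes a literal dollar sign, and ${item} delimits a placeholder
# that is followed directly by other word characters.
price = Template("${item}s cost $$${amount}").substitute(item="apple", amount=3)
print(price)
```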


##### 7. Encoding and Decoding Strings

In [15]:
string = "Coding in Python and JavaScript is interesting"
# "ignore" drops any character ASCII cannot represent; this text is pure ASCII, so nothing is lost
encode = string.encode("ascii", "ignore")
encode

b'Coding in Python and JavaScript is interesting'

In [16]:
decode = encode.decode()
decode

'Coding in Python and JavaScript is interesting'
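
The second argument to `encode` is the error handler, and its effect only shows when the text contains characters the target encoding cannot represent. A short sketch with an illustrative non-ASCII string (`text` is an assumption, not from the cells above):

```python
text = "Pythön costs €5"

# "ignore" silently drops characters that ASCII cannot represent.
ignored = text.encode("ascii", "ignore")

# "replace" substitutes a question mark for each such character.
replaced = text.encode("ascii", "replace")

# The default handler, "strict", raises UnicodeEncodeError instead.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("strict failed:", exc.reason)

print(ignored)
print(replaced)
```

Because `"ignore"` loses data without warning, it is worth using only when dropping those characters is acceptable.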

In [17]:
# Write a file that contains non-ASCII characters
with open('example.txt', 'w', encoding='utf-8') as file:
    file.write("Hello, world! 🌍")

# Read the file back
with open('example.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    print(content)

Hello, world! 🌍


##### 8. Working with Unicode Data

In [18]:
string = "Pythön!"
# Encode to UTF-8
print(string.encode("utf-8"))
# Decode back to a string
print(string.encode("utf-8").decode("utf-8"))

b'Pyth\xc3\xb6n!'
Pythön!
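
Two points follow from the output above: "ö" takes two bytes in UTF-8, and the same visible character can be stored as different code point sequences. The sketch below illustrates both with the standard `unicodedata` module; the variable names are illustrative.

```python
import unicodedata

composed = "Pyth\u00f6n"     # "ö" as a single code point
decomposed = "Pytho\u0308n"  # "o" followed by a combining diaeresis

# The two strings look the same but compare unequal.
same_before = composed == decomposed

# Normalizing to NFC composes the pair into the single code point.
nfc = unicodedata.normalize("NFC", decomposed)
same_after = nfc == composed

# Character count and UTF-8 byte count differ for non-ASCII text.
char_count = len(composed)
byte_count = len(composed.encode("utf-8"))

print(same_before, same_after, char_count, byte_count)
```

Normalizing text before comparing or hashing it avoids mismatches between visually identical strings.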


##### 9. Working with Bytes

In [19]:
byte_data = b'This is bytes'
print(byte_data) 

b'This is bytes'


In [20]:
# String to bytes
string = "Hello World"
byte_data = string.encode("utf-8")
print(byte_data)

b'Hello World'


In [21]:
# Bytes to string
string = byte_data.decode("utf-8")
print(string)

Hello World
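
A bytes object behaves like an immutable sequence of integers, which differs from a string in a few ways. A minimal sketch, with illustrative variable names:

```python
data = "Hello World".encode("utf-8")

# Indexing a bytes object gives an integer; slicing gives bytes.
first = data[0]
head = data[:5]

# bytes is immutable; bytearray is the mutable counterpart.
buffer = bytearray(data)
buffer[0] = ord("J")

# hex() and bytes.fromhex() convert to and from a hexadecimal string.
hex_text = head.hex()
round_trip = bytes.fromhex(hex_text)

print(first, head, bytes(buffer), hex_text, round_trip)
```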


##### References: 

https://docs.python.org/3/library/functions.html