 # String objects

 ## Creating string objects

In [1]:
s = "hello world"
print(s)
print(type(s))

hello world
<class 'str'>


 ## Converting objects to strings

 Numbers can be converted to strings using the str() function, for example:

In [2]:
print(str(1))
print(type(str(1)))
print(str(1.0))
print(type(str(1.0)))

1
<class 'str'>
1.0
<class 'str'>
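
 As a small sketch beyond numbers: str() also accepts other built-in objects, and int() and float() go the other way by parsing a string back into a number. The values below are made up for illustration.

```python
# str() works on other built-in objects too, not only on numbers
as_text = [str(True), str(None), str([1, 2, 3])]
print(as_text)  # ['True', 'None', '[1, 2, 3]']

# the reverse direction: int() and float() parse strings back into numbers
parsed_int = int("42")
parsed_float = float("1.5")
print(parsed_int + 1)    # 43
print(parsed_float * 2)  # 3.0
```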



 ## String slicing

 Strings can be sliced using the [start:end] syntax, where start is included and end is excluded. Negative indices count from the end of the string, for example:

In [3]:
print(s[0:5])  # hello
print(s[:5])  # hello
print(s[6:])  # world
print(s[-1])  # d
print(s[:-2])  # hello wor
print(s[-5:-1])  # worl

hello
hello
world
d
hello wor
worl



A second colon introduces a step value: `s[start:end:step]`. For example, `s[::2]` means "every second character".
A negative step walks through the string backwards, so `s[::-1]` reverses it:

In [4]:
print(s[::2])  # hlowrd, i.e. every second character
print(s[1::2])  # el ol, i.e. every second character starting from the index 1 (second character)
print(s[::-1])  # dlrow olleh, i.e. the reversed string
print(s[-3::-1])  # row olleh, i.e. the reversed string starting from the third-to-last character

hlowrd
el ol
dlrow olleh
row olleh
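
 The slicing rules can be checked with a short sketch. It assumes the same example string as above and shows that the end index is excluded and that slices, unlike single indices, tolerate out-of-range bounds.

```python
s = "hello world"

# the end index is exclusive: s[0:5] covers indices 0, 1, 2, 3 and 4
first_word = s[0:5]
print(first_word, len(first_word))  # hello 5

# splitting at the same index gives two pieces that rebuild the original
rebuilt = s[:5] + s[5:]
print(rebuilt == s)  # True

# slices tolerate out-of-range bounds, while a single index does not
tail = s[6:100]
print(tail)  # world
try:
    s[100]
except IndexError as err:
    print("IndexError:", err)
```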


 ## Concatenating strings

 Strings can be concatenated using the + operator, for example:

In [5]:
s2 = "!"
print(s + s2)

hello world!


 Numbers stored in variables can also be included in strings, but they must first be converted with str(), for example:

In [6]:
x = 3
print("x is equal to " + str(x))

x is equal to 3
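
 To see why the str() call is needed, the following sketch tries the concatenation without it. Python refuses to add a string and an integer and raises a TypeError, which is caught here so that the cell still runs.

```python
x = 3
try:
    message = "x is equal to " + x  # str + int is not allowed
except TypeError as err:
    print("TypeError:", err)
    message = "x is equal to " + str(x)  # convert explicitly instead

print(message)  # x is equal to 3
```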


 ## String formatting

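 It is more elegant to use string formatting in such cases. An f-string (note the f before the opening quote) inserts the value of any expression placed in curly braces, for example: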

In [7]:
print(f"x is equal to {x}")

x is equal to 3


 Floats can be rounded for display by adding a format specifier after a colon inside the curly braces, for example .2f for two decimal places:

In [8]:
y = 3.14159265359
print(f"y is equal to {y:.2f}")

y is equal to 3.14
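
 A few more format specifiers, as a sketch. The variables n and share are made-up values for illustration; the specifiers themselves are part of Python's standard format mini-language.

```python
y = 3.14159265359
n = 1234567.891  # hypothetical large number
share = 0.256    # hypothetical fraction

four_places = f"{y:.4f}"     # four decimal places
with_commas = f"{n:,.2f}"    # thousands separator and two decimal places
as_percent = f"{share:.1%}"  # multiply by 100 and append a percent sign
padded = f"{y:8.2f}"         # right-aligned in a field that is 8 characters wide

print(four_places)   # 3.1416
print(with_commas)   # 1,234,567.89
print(as_percent)    # 25.6%
print(repr(padded))  # '    3.14'
```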


 ## Exercise: Manipulate and format strings

 ### Tasks:

 1. Create a variable called jedi_name with your favorite Jedi's name.

 2. Create a variable called sith_name with your favorite Sith's name.

 3. Concatenate jedi_name and sith_name with " vs. " in between and print the result.

 4. Use an f-string to print a message that includes the jedi_name and sith_name in a Star Wars duel context.

 5. BONUS: use the capitalize() method to capitalize the first letter of jedi_name and sith_name before concatenating them.

 6. BONUS: calculate the length of the *sith_name* and print it on the screen.

 7. BONUS: print the ratio of the length of *sith_name* to the length of *jedi_name* on the screen, rounded to 2 decimal places.

 ### Solution:

In [9]:
# 1. Create a variable called jedi_name with your favorite Jedi's name.
jedi_name = "Luke Skywalker"

# 2. Create a variable called sith_name with your favorite Sith's name.
sith_name = "Darth Vader"

# 3. Concatenate jedi_name and sith_name with " vs. " in between and print the result.
print(jedi_name + " vs. " + sith_name)

# 4. Use an f-string to print a message that includes the jedi_name and sith_name in a Star Wars duel context.
print(f"{jedi_name} vs. {sith_name} in a Star Wars duel!")

# 5. BONUS: use the capitalize() method to capitalize the first letter of jedi_name and sith_name before concatenating them.
# note: capitalize() also lowercases all remaining letters, hence "Luke skywalker"
print(jedi_name.capitalize() + " vs. " + sith_name.capitalize())

# 6. BONUS: calculate the length of the *sith_name* and print it on the screen.
print(len(sith_name))

# 7. BONUS: print the ratio of the length of *sith_name* to the length of *jedi_name* on the screen, rounded to 2 decimal places.
print(round(len(sith_name) / len(jedi_name), 2))

# using f-string for the rounding
print(f"{len(sith_name) / len(jedi_name):.2f}")

Luke Skywalker vs. Darth Vader
Luke Skywalker vs. Darth Vader in a Star Wars duel!
Luke skywalker vs. Darth vader
11
0.79
0.79


 ## Creating very long strings

 Very long strings can be created using triple quotes, for example:

In [10]:
s3 = """This is a very long string
that spans multiple lines
and can be used to create
documentation"""
print(s3)

This is a very long string
that spans multiple lines
and can be used to create
documentation


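 Alternatively, you can wrap several adjacent string literals in parentheses (). Python joins them into one string without inserting line breaks, for example:

In [11]:
s3 = ("This is a very long string that spans multiple lines "
      "and can be used to create documentation")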
print(s3)

This is a very long string that spans multiple lines and can be used to create documentation
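
 The two approaches do not produce the same string. As a sketch with shortened, made-up text: a triple-quoted string keeps the line breaks as newline characters, whereas adjacent literals inside parentheses are joined exactly as written.

```python
multi = """first line
second line"""

joined = (
    "first line "
    "second line"
)

# repr() makes the hidden newline character visible
print(repr(multi))   # 'first line\nsecond line'
print(repr(joined))  # 'first line second line'

line_count = len(multi.splitlines())
print(line_count)  # 2
```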


 ## Working with file paths

 File paths are often stored in strings, for example:

In [12]:
path = "C:/Users/Thomas/Documents/Python Scripts"
print(path)

C:/Users/Thomas/Documents/Python Scripts


 However, some operating systems (like Windows) use backslashes (\) instead of forward slashes (/) to separate folders in a file path. This causes problems in ordinary strings, because Python treats a backslash as the start of an escape sequence (for example, \n is a newline). One solution is to use raw strings, marked by an r before the opening quote, for example:

In [13]:
path = r"C:\Users\Thomas\Documents\Python Scripts"
print(path)

C:\Users\Thomas\Documents\Python Scripts


 Alternatively, you can use double backslashes, for example:

In [14]:
path = "C:\\Users\\Thomas\\Documents\\Python Scripts"
print(path)

C:\Users\Thomas\Documents\Python Scripts
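
 A further option, not covered above, is the pathlib module from the standard library, which treats a path as an object rather than as plain text. The sketch below uses PureWindowsPath so that it behaves the same on every operating system; the file name analysis.py is hypothetical.

```python
from pathlib import PureWindowsPath

# PureWindowsPath only manipulates the text of the path; it never touches the disk
path = PureWindowsPath(r"C:\Users\Thomas\Documents\Python Scripts")

print(path.name)        # Python Scripts
print(path.parent)      # C:\Users\Thomas\Documents
print(path.as_posix())  # C:/Users/Thomas/Documents/Python Scripts

# the / operator appends a component; "analysis.py" is a made-up file name
script = path / "analysis.py"
print(script)  # C:\Users\Thomas\Documents\Python Scripts\analysis.py
```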


 ## Special characters / escape characters

 A backslash 'escapes' the character that follows it: the pair is read as one special character rather than literally. For example, \n is a newline, \t is a tab, and \\ is a single backslash:

In [15]:
print("hello\nworld")
print("hello\tworld")
print("hello\\world")

hello
world
hello	world
hello\world
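
 One way to convince yourself that an escape sequence is a single character is to compare lengths. This sketch contrasts an ordinary string with a raw string containing the same typed characters.

```python
text = "hello\nworld"  # \n is interpreted as one newline character
raw = r"hello\nworld"  # raw string: backslash and n stay separate

print(len(text))  # 11
print(len(raw))   # 12

# repr() shows the escape sequence instead of printing a line break
print(repr(text))  # 'hello\nworld'

# a backslash also lets you put a double quote inside a double-quoted string
quoted = "She said \"hi\""
print(quoted)  # She said "hi"
```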


 ## String methods and chaining

 Generally, most objects have methods that can be used to perform operations on the object. Methods are called using the dot notation. Strings also have specialized methods, for example:

In [16]:
s = "hello world"
print(s.upper())
print(s.capitalize())
print(s.replace("l", "L"))
print(s.split(" "))

HELLO WORLD
Hello world
heLLo worLd
['hello', 'world']


 Methods can be chained, for example:

In [17]:
print(s.upper().capitalize().replace("l", "L").split(" "))

['HeLLo', 'worLd']
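
 Chaining works because strings are immutable: each method returns a new string and leaves the original untouched, so the next method in the chain operates on the previous result. A brief sketch, which also shows that the order of the methods matters:

```python
s = "hello world"
shouted = s.upper()

print(s)        # hello world  (the original is unchanged)
print(shouted)  # HELLO WORLD

# order matters in a chain: each method works on the previous result
first = s.capitalize().upper()
second = s.upper().capitalize()
print(first)   # HELLO WORLD
print(second)  # Hello world
```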


 ## Breaking long lines of code

 Long lines of code can be broken into multiple lines using parentheses, for example:

In [18]:
# fmt: off
s4 = (
    s.upper()
    .capitalize()
    .replace("l", "L")
    .split(" ")
)
print(s4)
# fmt: on

['HeLLo', 'worLd']

 For comparison, the same chain written on a single line gives the identical result:


In [19]:
s4 = s.upper().capitalize().replace("l", "L").split(" ")
print(s4)

['HeLLo', 'worLd']


 <blockquote><b>&#x1F517; Data Preparation &amp; Analysis</b>

 * very often, data is stored as strings in *DataFrame* objects

 * string methods can then be used to clean text data (e.g. remove special characters or trailing whitespace)

 * for example, harmonize the spelling by enforcing lowercase (Munich vs. munich)

 * in other cases, string methods can be used to extract information from strings (e.g. extract the domain from an email address)

 * if numbers are provided with a comma as a decimal separator, the comma can be replaced with a dot to convert the string to a float

 * similarly, if numbers are provided with a unit (e.g. 1000 kWh), the unit can be removed to convert the string to a float

 * many functions and methods accept strings as arguments

 </blockquote>
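
 The cleaning steps listed above can be sketched on individual strings. All values below are hypothetical examples; in a DataFrame the same operations would be applied to a whole column.

```python
# hypothetical raw values, as they might appear in a text column
city = "  Munich "
email = "thomas@example.com"
price = "3,14"
energy = "1000 kWh"

clean_city = city.strip().lower()                     # remove whitespace, enforce lowercase
domain = email.split("@")[1]                          # keep the part after the @ sign
price_value = float(price.replace(",", "."))          # decimal comma to decimal point
energy_value = float(energy.replace("kWh", "").strip())  # drop the unit, then convert

print(clean_city)    # munich
print(domain)        # example.com
print(price_value)   # 3.14
print(energy_value)  # 1000.0
```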

 ## Exercise: String methods and method chaining

 ### Tasks:

 1. create a variable called *s* with the value "Python is great" and print it on the screen

 2. use the upper() method to convert *s* to uppercase and print it on the screen

 3. use the capitalize() method to capitalize *s* and print it on the screen

 4. use the replace() method to replace "great" with "awesome" in *s* and print it on the screen

 5. first use replace() to replace "great" with "awesome" and then use upper() to convert *s* to uppercase and print it on the screen

 6. BONUS: use the split() method to split *s* into a list of words and print it on the screen

 7. BONUS: use the join() method to join the list of words into a single string separated by a comma and print it on the screen

 8. BONUS: remove any leading or trailing whitespace from the string " Python is great " and print it on the screen

 ### Solution:

In [20]:
# 1. create a variable called *s* with the value "Python is great" and print it on the screen
s = "Python is great"
print(s)

# 2. use the upper() method to convert *s* to uppercase and print it on the screen
print(s.upper())

# 3. use the capitalize() method to capitalize *s* and print it on the screen
print(s.capitalize())

# 4. use the replace() method to replace "great" with "awesome" in *s* and print it on the screen
print(s.replace("great", "awesome"))

# 5. first use replace() to replace "great" with "awesome" and then use upper() to convert *s* to uppercase and print it on the screen
print(s.replace("great", "awesome").upper())

# 6. BONUS: use the split() method to split *s* into a list of words and print it on the screen
print(s.split())

# 7. BONUS: use the join() method to join the list of words into a single string separated by a comma and print it on the screen
print(", ".join(s.split()))

# 8. BONUS: remove any leading or trailing whitespace from the string " Python is great " and print it on the screen
print(" Python is great ".strip())

Python is great
PYTHON IS GREAT
Python is great
Python is awesome
PYTHON IS AWESOME
['Python', 'is', 'great']
Python, is, great
Python is great
