Q: If we wanted to split a string every time there is an upper case letter, how could we do that?

A: We need to break the problem into steps, and it's a bit more involved than you might think:

- Initialize an empty list (split_strings) to hold the result.
- Set another variable (current_word) to an empty string to collect the characters of the current word.
- Iterate through each character in the input string.
- For each character, check whether it is an uppercase letter using the isupper() method.
- If the character is uppercase and a word is already being formed (i.e., current_word is not empty), append the current word to split_strings and start a new word with the current uppercase character.
- If the character is uppercase but current_word is still empty (this happens when the string starts with an uppercase letter), just start the word with that character.
- If the character is not uppercase, add the character to current_word.
- After the loop finishes, current_word still holds the final segment, which has not been appended to the list yet. So check whether current_word is non-empty and, if it is, append it to the list.
- Print the list of split strings.

In [18]:
s = "Here is a Sentence with Uppercase letters"

In [14]:
# we need something to contain the parts and the current word we're looking at
split_strings = []
current_word = ""

In [15]:
type(current_word)

str

In [16]:
for char in s: # for each character
    if char.isupper(): # if that character is uppercase
        if current_word: # this means: if current_word is not empty
            split_strings.append(current_word) # add the finished word (everything up to, but not including, this character)
            current_word = char # reset current_word to the current character
        else: # but if we don't have a current_word yet
            current_word += char # start the word with this uppercase character
    else: # and if the character is not uppercase at all
        current_word += char # just add the character to current_word

if current_word: # if there's a last part of the string after the last uppercase, add it
    split_strings.append(current_word)

print(split_strings) # print what we have


['Here is a ', 'Sentence with ', 'Uppercase letters']
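If you need this more than once, the same loop can be wrapped in a function. The following is only a sketch: the name split_before_uppercase is made up for illustration, and the two uppercase branches of the loop are folded into one check, which should give the same result.

```python
def split_before_uppercase(text):
    """Split text in front of every uppercase letter (hypothetical helper name)."""
    parts = []
    current = ""
    for char in text:
        # An uppercase letter closes the word collected so far, if there is one.
        if char.isupper() and current:
            parts.append(current)
            current = ""
        # Every character, uppercase or not, goes into the word being built.
        current += char
    # Whatever is left after the loop is the final segment.
    if current:
        parts.append(current)
    return parts


print(split_before_uppercase("Here is a Sentence with Uppercase letters"))
```

Returning the list instead of printing it inside the function makes the result easy to reuse. An empty input string simply gives an empty list.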


There are shorter ways to do it by importing a module, e.g., regular expressions via Python's built-in re module.

In [19]:
import re

parts = re.findall(r'[A-Z][^A-Z]*', s)
print(parts)

['Here is a ', 'Sentence with ', 'Uppercase letters']


[A-Z]: This part of the regular expression matches any single uppercase letter from A to Z. It is a character set containing the 26 ASCII uppercase letters, so accented uppercase letters such as É are not included.

[^A-Z]*: The ^ symbol inside the square brackets negates the character set, so [^A-Z] matches any character that is not an uppercase letter. The * quantifier means that it matches zero or more occurrences of the preceding character set.

[A-Z][^A-Z]*: This regular expression pattern matches a sequence of characters that starts with an uppercase letter ([A-Z]) followed by zero or more characters that are not uppercase letters ([^A-Z]*). Each match therefore runs from one uppercase letter up to, but not including, the next one.
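One difference from the loop is worth knowing: because every match has to begin with an uppercase letter, findall skips any text that comes before the first uppercase letter, whereas the loop keeps it. If that leading text matters, one possible alternative is re.split with a lookahead. This is a sketch, not the only way to do it; the helper name split_on_upper is made up here, and splitting on an empty match assumes Python 3.7 or newer.

```python
import re


def split_on_upper(text):
    """Split before each A-Z letter, keeping any leading lowercase part."""
    # (?=[A-Z]) is a zero-width lookahead: it marks the position just before
    # an uppercase letter without consuming the letter itself.
    parts = re.split(r'(?=[A-Z])', text)
    # A string that starts with an uppercase letter yields an empty first
    # item, so empty strings are filtered out.
    return [part for part in parts if part]


sample = "some text Before a Capital"
print(re.findall(r'[A-Z][^A-Z]*', sample))  # leading "some text " is dropped
print(split_on_upper(sample))               # leading "some text " is kept
```

For the sentence s used above, both approaches give the same three parts, since it starts with an uppercase letter.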