# String Parsing Interview Exercises

Each exercise includes the question and a sample input string to work with.

## Question 1
Extract the substring between the first and last occurrence of character `c` in a given string `s`.

In [1]:
# Sample input for Question 1
sample_1 = """abxcdefgcwxcz"""

left_index = sample_1.index('c')
right_index = sample_1.rindex('c')
sample_1[left_index + 1: right_index]

'defgcwx'

## Question 2
Get the file extension from a file path string.

In [4]:
# Sample input for Question 2
sample_2 = """/home/dev/project/file.py"""

# Isolate the file name first, then take everything after its last dot
file_name = sample_2[sample_2.rindex('/') + 1:]
file_name[file_name.rindex('.') + 1:]

'py'

## Question 3
Extract query parameters from a URL string into a dictionary.

In [None]:
# Sample input for Question 3
sample_3 = """https://example.com/page?param1=foo&param2=bar"""

q_index = sample_3.index('?')
query_string = sample_3[q_index + 1:]
params = dict(pair.split('=', 1) for pair in query_string.split('&'))
params

{'param1': 'foo', 'param2': 'bar'}

## Question 4
Parse the timestamp, log level, and message from a log entry like `'2025-06-14 17:53:22 - ERROR - Something happened'`.

In [10]:
# Sample input for Question 4
sample_4 = """2025-06-14 17:53:22 - ERROR - Something happened"""
# Split on the ' - ' delimiter so hyphens inside the date or the message are left alone
components = sample_4.split(' - ', 2)
[c.strip() for c in components]


['2025-06-14 17:53:22', 'ERROR', 'Something happened']

## Question 7
Split a multiline string into lines and remove any empty lines.

In [11]:
# Sample input for Question 7
sample_7 = """line1

line2
line3

"""

[line for line in sample_7.split('\n') if line.strip()]

['line1', 'line2', 'line3']

## Question 8
Remove all HTML tags from a string containing HTML.

In [None]:
# Sample input for Question 8
sample_8 = """<div>Hello</div><p>World!</p>"""
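
One possible approach is a regular expression that drops anything shaped like `<...>`. This is only a sketch: the helper name `strip_tags` is made up for illustration, and a regex is a heuristic that will misbehave on malformed markup or on `>` inside attribute values, where a real HTML parser would be the safer choice.

```python
import re

def strip_tags(html):
    # Remove every <...> run; heuristic only, not a full HTML parser.
    return re.sub(r'<[^>]+>', '', html)

print(strip_tags("<div>Hello</div><p>World!</p>"))  # HelloWorld!
```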

## Question 9
Split a Windows file path `'C:\Users\Dev\Documents\file.txt'` into its components.

In [14]:
# Sample input for Question 9
sample_9 = r"""C:\Users\Dev\Documents\file.txt"""
components = sample_9.split('\\')
components

['C:', 'Users', 'Dev', 'Documents', 'file.txt']

## Question 10
Convert a `snake_case` string to `camelCase`.

In [None]:
# Sample input for Question 10
sample_10 = """snake_case_example"""
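
A minimal sketch of one way to do this: split on underscores, keep the first word as it is, and capitalize the rest. The function name `snake_to_camel` is hypothetical, and the sketch assumes the input has no leading, trailing, or doubled underscores.

```python
def snake_to_camel(name):
    # First word stays lowercase; every later word gets a capital first letter.
    head, *rest = name.split('_')
    return head + ''.join(word.capitalize() for word in rest)

print(snake_to_camel("snake_case_example"))  # snakeCaseExample
```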


## Question 11
Convert a `CamelCase` string to `snake_case`.

In [None]:
# Sample input for Question 11
sample_11 = """CamelCaseExample"""
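
One common sketch uses a zero-width regex match to insert an underscore before each uppercase letter that is not at the start of the string. The name `camel_to_snake` is made up here, and the approach assumes there are no acronym runs: `HTTPServer` would become `h_t_t_p_server`.

```python
import re

def camel_to_snake(name):
    # Insert '_' before every uppercase letter except the first character, then lowercase.
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()

print(camel_to_snake("CamelCaseExample"))  # camel_case_example
```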

## Question 12
Count the number of words in a sentence.

In [None]:
# Sample input for Question 12
sample_12 = """This is a sample sentence for word count"""

## Question 13
Check if a string is a palindrome, ignoring spaces and punctuation.

In [None]:
def is_palindrome(s):
    cleaned = ''.join(c.lower() for c in s if c.isalnum())
    return cleaned == cleaned[::-1]

# Example usage
sample_13 = "A man a plan a canal Panama"
print(is_palindrome(sample_13))  # Output: True

True

## Question 14
Extract all integers from a string containing mixed text.

In [19]:
# Sample input for Question 14
sample_14 = """abc123def456ghi789"""
import re

# Match runs of digits so each number stays whole, then convert to int
[int(n) for n in re.findall(r'\d+', sample_14)]

[123, 456, 789]

## Question 15
Separate letters and digits in a mixed alphanumeric string.

In [None]:
# Sample input for Question 15
sample_15 = """A1B2C3D4"""
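
A possible solution, assuming "separate" means collecting the letters and the digits into two strings while preserving their order. The helper name `split_letters_digits` is hypothetical.

```python
def split_letters_digits(text):
    # Two passes keep the code simple; any other character is ignored.
    letters = ''.join(c for c in text if c.isalpha())
    digits = ''.join(c for c in text if c.isdigit())
    return letters, digits

print(split_letters_digits("A1B2C3D4"))  # ('ABCD', '1234')
```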

## Question 16
Parse a query string like `'a=1&b=2&c=3'` into a dictionary.

In [21]:
# Sample input for Question 16
sample_16 = """a=1&b=2&c=3"""
result = {k: v for k, v in (item.split('=', 1) for item in sample_16.split('&'))}
result

{'a': '1', 'b': '2', 'c': '3'}


## Question 17
Extract year, month, and day from a date string in `YYYY-MM-DD` format.

In [27]:
# Sample input for Question 17
import datetime

sample_17 = """2025-06-14"""
d = datetime.datetime.strptime(sample_17, "%Y-%m-%d")
d.year, d.month, d.day

(2025, 6, 14)

## Question 18
Parse hours, minutes, and seconds from a time string `HH:MM:SS`.

In [30]:
# Sample input for Question 18
sample_18 = """12:34:56"""
d = datetime.datetime.strptime(sample_18, "%H:%M:%S")
d.hour, d.minute, d.second


(12, 34, 56)

In [28]:
import time
help(time.strptime)

Help on built-in function strptime in module time:

strptime(...)
    strptime(string, format) -> struct_time

    Parse a string to a time tuple according to a format specification.
    See the library reference manual for formatting codes (same as
    strftime()).

    Commonly used format codes:

    %Y  Year with century as a decimal number.
    %m  Month as a decimal number [01,12].
    %d  Day of the month as a decimal number [01,31].
    %H  Hour (24-hour clock) as a decimal number [00,23].
    %M  Minute as a decimal number [00,59].
    %S  Second as a decimal number [00,61].
    %z  Time zone offset from UTC.
    %a  Locale's abbreviated weekday name.
    %A  Locale's full weekday name.
    %b  Locale's abbreviated month name.
    %B  Locale's full month name.
    %c  Locale's appropriate date and time representation.
    %I  Hour (12-hour clock) as a decimal number [01,12].
    %p  Locale's equivalent of either AM or PM.

    Other codes may be available on your platform.  Se

## Question 19
Parse key-value pairs from a string `'key1:value1,key2:value2'` into a dictionary.

In [31]:
# Sample input for Question 19
sample_19 = """key1:value1,key2:value2"""
{k:v for k,v in (pair.split(':') for pair in sample_19.split(','))}


{'key1': 'value1', 'key2': 'value2'}

## Question 21
Collapse multiple spaces into a single space in a string.

In [32]:
import re

# Sample input for Question 21
sample_21 = """This   has    multiple     spaces"""
# Match runs of two or more spaces only; r'\s+' would also collapse tabs and newlines
re.sub(r' {2,}', ' ', sample_21)

'This has multiple spaces'

## Question 22
Remove all non-alphanumeric characters from a string.

In [34]:
# Sample input for Question 22
sample_22 = """Hello! Welcome @2025"""
re.sub(r'[^a-zA-Z0-9]', '', sample_22)

'HelloWelcome2025'

## Question 23
Replace tabs with four spaces in a multiline text string.

In [33]:
# Sample input for Question 23
sample_23 = """This	is	a	tabbed	text"""
sample_23.replace('\t', '    ')

'This    is    a    tabbed    text'

## Question 24
Extract the heading level and text from a Markdown header like `'## Heading'`.

In [None]:
# Sample input for Question 24
sample_24 = """## Sample Heading"""
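
One way to sketch this is a regex that captures the run of `#` characters and the text after it. The function name `parse_header` is made up, and the sketch assumes ATX-style headers of level 1 to 6 with at least one space after the hashes.

```python
import re

def parse_header(line):
    # Group 1 is the run of '#', group 2 is the heading text.
    match = re.match(r'^(#{1,6})\s+(.*)$', line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2).strip()

print(parse_header("## Sample Heading"))  # (2, 'Sample Heading')
```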

## Question 25
Parse a string `'name=John Doe; age=30; location=USA'` into a dictionary.

In [35]:
# Sample input for Question 25
sample_25 = """name=John Doe; age=30; location=USA"""

{k:v for k,v in (pair.strip().split('=') for pair in sample_25.split(';'))}

{'name': 'John Doe', 'age': '30', 'location': 'USA'}

## Question 26
Reverse the order of words in a sentence.

In [37]:
# Sample input for Question 26
sample_26 = """Hello world from ChatGPT"""
' '.join(sample_26.split(' ')[::-1])

'ChatGPT from world Hello'

## Question 27
Reverse the characters in a string.

In [None]:
# Sample input for Question 27
sample_27 = """HelloWorld"""

## Question 28
Find the longest substring without repeating characters in a given string.

In [54]:
# Sample input for Question 28
from itertools import combinations
sample_28 = """abcabcbb"""
max_sub = 0
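# Brute force: test every substring sample_28[a:b] for repeated characters
for a, b in combinations(range(len(sample_28) + 1), 2):
    window = sample_28[a:b]  # avoid shadowing the built-in `slice`
    if len(window) == len(set(window)):
        max_sub = max(max_sub, b - a)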
max_sub

3

In [None]:
# Dict approach

sample_28 = "abcabcbb"

max_len = 0
start = 0
seen = {}

for i, char in enumerate(sample_28):
    if char in seen and seen[char] >= start:
        start = seen[char] + 1
    seen[char] = i
    max_len = max(max_len, i - start + 1)

print(max_len)

3



## Question 30
Normalize mixed path separators `'/usr/bin;C:\Windows\System32'` to UNIX separators.

In [None]:
# Sample input for Question 30
# Raw string so that \W and \S are not treated as (invalid) escape sequences
sample_30 = r"""/usr/bin;C:\Windows\System32"""
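
A minimal sketch, assuming "normalize" only means turning backslashes into forward slashes. Whether the `;` list separator or the `C:` drive prefix should also change is left open by the question. The name `to_unix_separators` is hypothetical.

```python
def to_unix_separators(path):
    # Only the separator character changes; drive letters and ';' are left as-is.
    return path.replace('\\', '/')

mixed = r'/usr/bin;C:\Windows\System32'
print(to_unix_separators(mixed))  # /usr/bin;C:/Windows/System32
```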

## Question 31
Detect and remove HTML comments like `<!-- comment -->` from a string.

In [None]:
# Sample input for Question 31
sample_31 = """<!-- hidden comment -->Visible text<!-- another -->"""
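
One possible sketch uses a non-greedy regex so each comment is matched separately, with `re.DOTALL` so comments spanning several lines are covered too. The helper name `strip_comments` is made up, and as with any regex over HTML this is a heuristic rather than a parser.

```python
import re

COMMENT_RE = re.compile(r'<!--.*?-->', re.DOTALL)

def strip_comments(html):
    # Non-greedy match stops at the first closing '-->' after each '<!--'.
    return COMMENT_RE.sub('', html)

text = "<!-- hidden comment -->Visible text<!-- another -->"
print(bool(COMMENT_RE.search(text)))  # True: at least one comment detected
print(strip_comments(text))           # Visible text
```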

## Question 32
Validate an email address's basic structure and extract the username and domain.

In [None]:
# Sample input for Question 32
sample_32 = """user@example.com"""
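
A rough sketch of a structural check: exactly one `@`, a non-empty username, and a domain containing at least one dot. The pattern and the name `split_email` are illustrative assumptions and fall far short of full address validation.

```python
import re

# One '@', no whitespace, and a dot somewhere in the domain part.
EMAIL_RE = re.compile(r'^([^@\s]+)@([^@\s]+\.[^@\s]+)$')

def split_email(address):
    match = EMAIL_RE.match(address)
    return match.groups() if match else None

print(split_email("user@example.com"))  # ('user', 'example.com')
```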

## Question 33
Extract the version number from a filename like `'file_v1.2.3.txt'`.

In [None]:
# Sample input for Question 33
sample_33 = """file_v1.2.3.txt"""
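
One possible sketch, assuming the version is always introduced by a lowercase `v` and consists of dot-separated integers. The function name `extract_version` is hypothetical.

```python
import re

def extract_version(filename):
    # 'v' followed by digits, then any number of '.digits' groups.
    match = re.search(r'v(\d+(?:\.\d+)*)', filename)
    return match.group(1) if match else None

print(extract_version("file_v1.2.3.txt"))  # 1.2.3
```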

## Question 34
Extract the domain name from a URL.

In [None]:
# Sample input for Question 34
sample_34 = """https://sub.example.co.uk/path"""
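
A short sketch using the standard library's `urllib.parse.urlparse`, assuming "domain name" means the full host name. Reducing it to the registrable domain (`example.co.uk`) would need a public-suffix list, which is not attempted here.

```python
from urllib.parse import urlparse

url = "https://sub.example.co.uk/path"
# .hostname is the host part of the URL, lowercased and without any port.
host = urlparse(url).hostname
print(host)  # sub.example.co.uk
```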

## Question 35
Split a socket address `'127.0.0.1:8080'` into host and port.

In [None]:
# Sample input for Question 35
sample_35 = """127.0.0.1:8080"""
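
A minimal sketch that splits on the last colon and converts the port to an integer. The name `split_host_port` is made up, and bracketed IPv6 addresses such as `[::1]:80` are not handled.

```python
def split_host_port(address):
    # rpartition splits on the LAST ':' so the port is always the final piece.
    host, _, port = address.rpartition(':')
    return host, int(port)

print(split_host_port("127.0.0.1:8080"))  # ('127.0.0.1', 8080)
```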

## Question 36
Unescape a quoted string containing escaped quotes.

In [None]:
# Sample input for Question 36
# Raw string so the backslashes actually remain in the sample text
sample_36 = r"""He said, \"Hello\""""
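
A sketch of the simplest case, assuming the only escape sequence present is backslash followed by a double quote. The input is written as a raw string so the backslashes really are part of the text. Other escapes (`\\`, `\n`, ...) would need a fuller approach.

```python
escaped = r'He said, \"Hello\"'

# Replace the two-character sequence backslash + quote with a bare quote.
unescaped = escaped.replace('\\"', '"')
print(unescaped)  # He said, "Hello"
```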

## Question 37
Parse a CSV line with quoted fields containing commas correctly.

In [None]:
# Sample input for Question 37
sample_37 = """field1,"field, with, commas",field3"""
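
Rather than splitting on commas by hand, one option is the standard library's `csv` module, which already understands quoted fields. `csv.reader` accepts any iterable of lines, so a one-element list is enough for this sketch.

```python
import csv

line = 'field1,"field, with, commas",field3'

# csv.reader yields one list of fields per input line.
fields = next(csv.reader([line]))
print(fields)  # ['field1', 'field, with, commas', 'field3']
```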

## Question 38
Dedent a multiline text to the minimal indentation level.

In [None]:
# Sample input for Question 38
sample_38 = """    indented line1
        indented line2
    indented line3"""
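
The standard library's `textwrap.dedent` removes the whitespace prefix common to all lines, which matches what the question asks for. A short sketch with the sample text:

```python
import textwrap

text = """    indented line1
        indented line2
    indented line3"""

# The common 4-space margin is removed; the extra indent on line 2 is kept.
dedented = textwrap.dedent(text)
print(dedented)
```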

## Question 39
Detect whether a string uses `\n`, `\r\n`, or `\r` for line endings.

In [None]:
# Sample input for Question 39
sample_39 = """Line1
Line2
Line3
Line4"""
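
One possible sketch checks for `\r\n` first, since that sequence contains both of the other candidates. The name `detect_line_ending` is hypothetical, and the sketch assumes the text uses a single style throughout. Note that a triple-quoted literal like the sample above always contains `\n`, because Python normalizes newlines in source code.

```python
def detect_line_ending(text):
    # Order matters: '\r\n' must be tested before its two components.
    if '\r\n' in text:
        return '\r\n'
    if '\n' in text:
        return '\n'
    if '\r' in text:
        return '\r'
    return None  # single-line text

print(repr(detect_line_ending("Line1\r\nLine2")))  # '\r\n'
```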

## Question 41
Find all overlapping occurrences of a substring in a string.

In [None]:
# Sample input for Question 41
sample_41 = """abababa"""
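
`str.count` and `re.findall` skip overlapping matches, so a sketch of the usual workaround is to restart the search one character after each hit. The function name `find_overlapping` and the search term `'aba'` are assumptions, since the question does not name a substring.

```python
def find_overlapping(text, sub):
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Advance by one, not by len(sub), so overlapping matches are found.
        start = text.find(sub, start + 1)
    return positions

print(find_overlapping("abababa", "aba"))  # [0, 2, 4]
```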

## Question 42
Split a hyphenated string into individual words.

In [None]:
# Sample input for Question 42
sample_42 = """word1-word2-word3"""

## Question 43
Parse a string representation of a list `"[1,2,3]"` into a Python list of integers.

In [None]:
# Sample input for Question 43
sample_43 = """[1,2,3,4]"""
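
A sketch using `ast.literal_eval`, which evaluates Python literals without executing arbitrary code. The helper name `parse_int_list` and the extra type check are illustrative additions.

```python
import ast

def parse_int_list(text):
    # literal_eval only accepts literals (lists, numbers, strings, ...), never calls.
    value = ast.literal_eval(text)
    if not isinstance(value, list) or not all(isinstance(v, int) for v in value):
        raise ValueError("expected a list of integers")
    return value

print(parse_int_list("[1,2,3,4]"))  # [1, 2, 3, 4]
```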

## Question 44
Extract the content inside the outermost parentheses in a string.

In [None]:
# Sample input for Question 44
sample_44 = """func(arg1, arg2)"""
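
A minimal sketch: take everything between the first `(` and the last `)`. The name `outermost_parens` is made up, and the sketch assumes there is a single outer pair. An input like `f(a) + g(b)` would need a depth counter instead.

```python
def outermost_parens(text):
    start = text.find('(')
    end = text.rfind(')')
    if start == -1 or end < start:
        return None  # no well-ordered pair of parentheses
    return text[start + 1:end]

print(outermost_parens("func(arg1, arg2)"))  # arg1, arg2
```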

## Question 45
Mask all digits in a string with `*`.

In [None]:
# Sample input for Question 45
sample_45 = """My phone number is 123-456-7890"""

## Question 46
Convert a string to title case (capitalize each word).

In [None]:
# Sample input for Question 46
sample_46 = """this is title case"""

## Question 47
Convert tabs to spaces and wrap lines at 80 characters.

In [None]:
# Sample input for Question 47
sample_47 = """	Indented with tabs and then some text that is really long and should maybe be wrapped"""
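
One possible sketch combines `str.expandtabs` with `textwrap.fill`. The tab size of 4 and the name `expand_and_wrap` are assumptions. Note that `textwrap.fill` treats its input as a single paragraph, so multi-paragraph text would need to be wrapped piece by piece.

```python
import textwrap

def expand_and_wrap(text, tab_size=4, width=80):
    # Expand tabs first so the wrap width is measured in real columns.
    expanded = text.expandtabs(tab_size)
    return textwrap.fill(expanded, width=width)

line = "\tIndented with tabs and then some text that is really long and should maybe be wrapped"
wrapped = expand_and_wrap(line)
print(wrapped)
```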

## Question 48
Compress consecutive repeated characters in a string to a single character.

In [None]:
# Sample input for Question 48
sample_48 = """aaabbbccc"""
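
A sketch using `itertools.groupby`, which groups consecutive equal items; keeping one key per group collapses each run to a single character.

```python
from itertools import groupby

def squeeze(text):
    # groupby yields (key, group) for each run of identical characters.
    return ''.join(key for key, _ in groupby(text))

print(squeeze("aaabbbccc"))  # abc
```

The name `squeeze` is hypothetical. Non-adjacent repeats are kept, so `"aabaa"` becomes `"aba"`.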

## Question 49
Find the index of the Nth occurrence of a substring in a string.

In [None]:
# Sample input for Question 49
sample_49 = """find the nth occurrence of a substring in a string"""
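
One way to sketch this is to call `str.find` repeatedly, each time starting just past the previous hit. The name `find_nth`, the 1-based `n`, and the `-1` return value for "not found" are all assumptions made for illustration.

```python
def find_nth(text, sub, n):
    index = -1
    for _ in range(n):
        # Search again starting one character after the previous match.
        index = text.find(sub, index + 1)
        if index == -1:
            return -1
    return index

sample = "find the nth occurrence of a substring in a string"
print(find_nth(sample, "in", 1))  # 1
```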

## Question 50
Parse key-value pairs separated by newlines into a dictionary.

In [None]:
# Sample input for Question 50
sample_50 = """key1=value1
key2=value2
key3=value3"""
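
A minimal sketch, assuming each line has the form `key=value` and that lines without `=` can be skipped. The helper name `parse_lines` is hypothetical.

```python
def parse_lines(text):
    result = {}
    for line in text.splitlines():
        if '=' not in line:
            continue  # skip blank or malformed lines
        # maxsplit=1 keeps any '=' inside the value intact.
        key, value = line.split('=', 1)
        result[key.strip()] = value.strip()
    return result

print(parse_lines("key1=value1\nkey2=value2\nkey3=value3"))
# {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
```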