# Character Strings In Python

---

**Table of Contents**<a id='toc0_'></a>    
- [String In Python](#toc1_)    
- [Searching For Substrings](#toc2_)    
- [Constructing Related Strings](#toc3_)    
- [Testing Boolean Conditions](#toc4_)    
- [Splitting And Joining Strings](#toc5_)    
- [String Formatting](#toc6_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=2
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

---

## <a id='toc1_'></a>String In Python [&#8593;](#toc0_)

- Sequence of characters that come from some alphabet
- Using the built-in `str` class
- Based on *Unicode Character Set*
  - Each character is a Unicode code point (up to `U+10FFFF`), not a fixed 16-bit value
  - The first 128 code points coincide with the 7-bit ASCII Character Set (Latin Alphabet, Numerals, and Symbols)
- Particularly important in most programming applications
  - Text is often used for input and output
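
The points above can be illustrated with a minimal sketch. The sample text and variable names are made up for illustration; only built-in functions (`len`, `ord`, `chr`) and `str.encode` are used.

```python
# Sketch: a str is a sequence of Unicode code points (sample values are illustrative)
import sys

word = "na\u00efve \u2615"                  # "naive" with a diaeresis, a space, a hot-beverage symbol

# len() counts code points, not bytes
num_chars = len(word)                       # 7 code points
num_bytes = len(word.encode("utf-8"))       # 10 bytes once encoded as UTF-8

# ord() and chr() convert between a character and its code point
code_point_a = ord("A")                     # 65, the same value as in ASCII
char_from_code = chr(0x2615)                # the hot-beverage symbol

# The largest code point a str can hold
max_code_point = sys.maxunicode             # 0x10FFFF

print(num_chars, num_bytes, code_point_a, hex(max_code_point))
```

The difference between `num_chars` and `num_bytes` shows that the number of bytes needed to store a string depends on the encoding, not on the `str` object itself.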

## <a id='toc2_'></a>Searching For Substrings [&#8593;](#toc0_)

Syntax|Description
:-|:-
`SUB in st`|Determine if a given pattern `SUB` occurs as a substring of string `st`
`st.count(SUB[, start, end])`|Return the number of non-overlapping occurrences of `SUB`
`st.find(SUB[, start, end])`|Return the index starting the *leftmost* occurrence of `SUB`; else `-1`
`st.index(SUB[, start, end])`|Same as `st.find()` but raise `ValueError` if not found
`st.rfind(SUB[, start, end])`|Return the index starting the *rightmost* occurrence of `SUB`; else `-1`
`st.rindex(SUB[, start, end])`|Same as `st.rfind()`, but raise `ValueError` if not found

In [1]:
# Using substring search methods for strings
# ------------------------------------------

GREETING: str = "Hello my friend! How are you doing this morning, my friend?"
SUB: str = "friend"
header: str = f"GREETING = \"{GREETING}\"; SUB = \"{SUB}\"\n"
header += f"{'-' * (len(header) - 1)}"
print(f"{header}")

print(f"\"{SUB}\" in GREETING: {SUB in GREETING}")
print(f"GREETING.count(SUB): {GREETING.count(SUB)}")
print(f"GREETING.find(SUB): {GREETING.find(SUB)}")
print(f"GREETING.index(SUB): {GREETING.index(SUB)}")
print(f"GREETING.rfind(SUB): {GREETING.rfind(SUB)}")
print(f"GREETING.rindex(SUB): {GREETING.rindex(SUB)}")

GREETING = "Hello my friend! How are you doing this morning, my friend?"; SUB = "friend"
----------------------------------------------------------------------------------------
"friend" in GREETING: True
GREETING.count(SUB): 2
GREETING.find(SUB): 9
GREETING.index(SUB): 9
GREETING.rfind(SUB): 52
GREETING.rindex(SUB): 52
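
The optional `start` and `end` arguments and the difference between `find()` and `index()` are not exercised by the cell above. The following sketch shows them under the same sample sentence; the lowercase variable names and the `"enemy"` pattern are illustrative only.

```python
# Sketch: optional start/end arguments and the find()/index() difference
text = "Hello my friend! How are you doing this morning, my friend?"
sub = "friend"

# Continue searching after the first hit by passing a start index
first = text.find(sub)
second = text.find(sub, first + 1)

# The match must fit entirely inside text[start:end]
limited = text.find(sub, 0, 12)

# index() raises ValueError where find() would return -1
missing = False
try:
    text.index("enemy")
except ValueError:
    missing = True

# count() does not count overlapping occurrences
overlap_count = "aaaa".count("aa")

print(first, second, limited, missing, overlap_count)
```

Choosing `find()` suits code that checks the result against `-1`, while `index()` suits code that prefers an exception when the pattern is absent.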


## <a id='toc3_'></a>Constructing Related Strings [&#8593;](#toc0_)

- Strings in Python are *immutable*
  - A given string cannot be modified in any way
  - Methods only return a newly constructed string that is closely related to an existing one
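
A minimal sketch of what immutability means in practice, using illustrative variable names: item assignment fails, and a method such as `upper()` leaves the original object untouched.

```python
# Sketch: strings cannot be changed in place
original = "a brand new day"

# Item assignment is not supported by str and raises TypeError
assignment_failed = False
try:
    original[0] = "A"
except TypeError:
    assignment_failed = True

# Methods return a new string; the original keeps its value
shouted = original.upper()

print(assignment_failed)   # True
print(original)            # a brand new day
print(shouted)             # A BRAND NEW DAY
```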

Syntax|Description
:-|:-
`st.replace(old, new)`|Return a copy of `st` with all occurrences of `old` replaced by `new`
`st.capitalize()`|Return a copy of `st` with its first character in uppercase and the remaining characters in lowercase
`st.title()`|Return a copy of `st` with the first character of each word in uppercase and the remaining characters in lowercase
`st.upper()`|Return a copy of `st` with all alphabetic characters in uppercase
`st.lower()`|Return a copy of `st` with all alphabetic characters in lowercase
`st.center(width)`|Return a copy of `st`, padded to `width`, centered among *spaces*
`st.ljust(width)`|Return a copy of `st`, padded to `width` with trailing *spaces*
`st.rjust(width)`|Return a copy of `st`, padded to `width` with leading *spaces*
`st.zfill(width)`|Return a copy of `st`, padded to `width` with leading *zeros*
`st.strip()`|Return a copy of `st`, with leading and trailing whitespace removed
`st.lstrip()`|Return a copy of `st`, with leading whitespace removed
`st.rstrip()`|Return a copy of `st`, with trailing whitespace removed

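- Several of these methods accept optional parameters, e.g. a fill character for `center()`, `ljust()`, and `rjust()`, or the set of characters to remove for `strip()`, `lstrip()`, and `rstrip()`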

In [2]:
# Constructing related strings
# ----------------------------

ST: str = "   a brand new day   "
header = f"ST = {ST}\n"
header += f"{'-' * (len(header) - 1)}"
print(f"{header}")

print(f"ST.replace(\"day\", \"night\"): {ST.replace('day', 'night')}")
print(f"ST.capitalize(): {ST.capitalize()}")
print(f"ST.title(): {ST.title()}")
print(f"ST.upper(): {ST.upper()}")
print(f"ST.lower(): {ST.lower()}")
print(f"ST.center(50, '-'): {ST.center(50, '-')}")
print(f"ST.ljust(50, '-'): {ST.ljust(50, '-')}")
print(f"ST.rjust(50, '-'): {ST.rjust(50, '-')}")
print(f"ST.zfill(50): {ST.zfill(50)}")
print(f"ST.strip(): {ST.strip()}")
print(f"ST.lstrip(): {ST.lstrip()}")
print(f"ST.rstrip(): {ST.rstrip()}")

ST =    a brand new day   
--------------------------
ST.replace("day", "night"):    a brand new night   
ST.capitalize():    a brand new day   
ST.title():    A Brand New Day   
ST.upper():    A BRAND NEW DAY   
ST.lower():    a brand new day   
ST.center(50, '-'): --------------   a brand new day   ---------------
ST.ljust(50, '-'):    a brand new day   -----------------------------
ST.rjust(50, '-'): -----------------------------   a brand new day   
ST.zfill(50): 00000000000000000000000000000   a brand new day   
ST.strip(): a brand new day
ST.lstrip(): a brand new day   
ST.rstrip():    a brand new day
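
A few behaviours of these methods are easy to miss, so here is a short sketch with made-up sample values: the optional `count` argument of `replace()`, the optional character set of `strip()`, the way `capitalize()` lowercases the rest of the string, and the sign handling of `zfill()`.

```python
# Sketch: optional arguments and less obvious behaviours (sample values are illustrative)

# replace() takes an optional maximum number of replacements
replaced_twice = "a-b-c-d".replace("-", "+", 2)

# strip() takes an optional set of characters to remove from both ends
stripped = "--=hello=--".strip("-=")

# capitalize() also lowercases everything after the first character
capitalized = "hELLO wORLD".capitalize()

# center() takes an optional fill character
centered = "7".center(5, "*")

# zfill() keeps a leading sign in front of the zeros
zero_filled = "-42".zfill(6)

print(replaced_twice)   # a+b+c-d
print(stripped)         # hello
print(capitalized)      # Hello world
print(centered)         # **7**
print(zero_filled)      # -00042
```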


## <a id='toc4_'></a>Testing Boolean Conditions [&#8593;](#toc0_)

Syntax|Description
:-|:-
`st.startswith(pattern)`|Return `True` if `pattern` is a *prefix* of string `st`
`st.endswith(pattern)`|Return `True` if `pattern` is a *suffix* of string `st`
`st.isspace()`|Return `True` if all characters of nonempty string are whitespace
`st.isalpha()`|Return `True` if all characters of nonempty string are alphabetic
`st.islower()`|Return `True` if there are one or more alphabetic characters, all of which are lowercased
`st.isupper()`|Return `True` if there are one or more alphabetic characters, all of which are uppercased
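`st.isdigit()`|Return `True` if all characters of nonempty string are digits: the decimal characters accepted by `st.isdecimal()` plus digit-like characters such as superscripts (e.g., `²`)
`st.isdecimal()`|Return `True` if all characters of nonempty string are decimal characters, i.e. `0–9` and their Unicode equivalents in other scripts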
`st.isnumeric()`|Return `True` if all characters of nonempty string are numeric Unicode characters (e.g., `0–9`, equivalents, fraction characters)
`st.isalnum()`|Return `True` if all characters of nonempty string are either alphabetic or numeric (as per above definitions)

In [3]:
# Testing boolean conditions
# --------------------------

passage: str = "and thus, it has been so since then..."
header = f"passage = {passage}\n"
header += f"{'-' * (len(header) - 1)}"
print(f"{header}")

print(f"passage.startswith('And'): {passage.startswith('And')}")
print(f"passage.endswith('.'): {passage.endswith('.')}")
print(f"passage.isspace(): {passage.isspace()}")
print(f"passage.isalpha(): {passage.isalpha()}")
print(f"passage.islower(): {passage.islower()}")
print(f"passage.isupper(): {passage.isupper()}")
print()

passage = passage.replace(".", "").replace(",", "")
header = f"passage = {passage}\n"
header += f"{'-' * (len(header) - 1)}"
print(f"{header}")

print(f"passage.isdigit(): {passage.isdigit()}")
print(f"passage.isdecimal(): {passage.isdecimal()}")
print(f"passage.isnumeric(): {passage.isnumeric()}")
print(f"passage.isalnum(): {passage.isalnum()}")


passage = and thus, it has been so since then...
------------------------------------------------
passage.startswith('And'): False
passage.endswith('.'): True
passage.isspace(): False
passage.isalpha(): False
passage.islower(): True
passage.isupper(): False

passage = and thus it has been so since then
--------------------------------------------
passage.isdigit(): False
passage.isdecimal(): False
passage.isnumeric(): False
passage.isalnum(): False
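
The cell above returns `False` for all three numeric tests, so it does not show how they differ. The sketch below compares `isdecimal()`, `isdigit()`, and `isnumeric()` on a handful of illustrative samples (the labels and sample characters are chosen for this example, written as escapes so the code points are unambiguous).

```python
# Sketch: isdecimal() is the strictest test, isnumeric() the most permissive
samples = {
    "ascii digits": "42",
    "arabic-indic three": "\u0663",
    "superscript two": "\u00b2",
    "vulgar fraction one half": "\u00bd",
    "signed number": "-1",
}

# Map each label to (isdecimal, isdigit, isnumeric)
results = {
    label: (text.isdecimal(), text.isdigit(), text.isnumeric())
    for label, text in samples.items()
}

for label, flags in results.items():
    print(f"{label:<26} isdecimal={flags[0]!s:<5} isdigit={flags[1]!s:<5} isnumeric={flags[2]}")
```

Note that a sign or a decimal point makes all three tests fail, so none of them is a substitute for trying `int()` or `float()` on the text.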


## <a id='toc5_'></a>Splitting And Joining Strings [&#8593;](#toc0_)

- *Splitting* - Take an existing string and determine a decomposition based upon a given separating pattern
- *Joining* - Compose a sequence of strings together using a delimiter to separate each pair

Syntax|Description
:-|:-
`sep.join(strings)`|Return the composition of the given sequence of strings, inserting `sep` as delimiter between each pair. Used to assemble a string from a series of pieces
`st.splitlines()`|Return a list of substrings of `st`, as delimited by newlines
`st.split(sep, count)`|Return a list of substrings of `st`, as delimited by the first `count` occurrences of `sep` (the parameter is named `maxsplit` in Python). If `count` is not specified, split on all occurrences. If `sep` is not specified, use runs of whitespace as delimiter
`st.rsplit(sep, count)`|Similar to `st.split()` but using the rightmost occurrences of `sep`
`st.partition(sep)`|Return `(head, sep, tail)` such that `st == head + sep + tail`, using leftmost occurrence of `sep`, if any; else return `(st, '', '')`
`st.rpartition(sep)`|Return `(head, sep, tail)` such that `st == head + sep + tail`, using rightmost occurrence of `sep`, if any; else return `('', '', st)`

In [4]:
# Splitting and joining strings
# -----------------------------

from keyword import kwlist
from typing import Tuple

print(f"', '.join(kwlist):\n{', '.join(kwlist)}")
print()

PARAGRAPH: str = "Good morning!\nToday, we will discuss about the weather.\nThank you!"
print(f"PARAGRAPH.splitlines():\n{PARAGRAPH.splitlines()}")
print()
print(f"PARAGRAPH.split():\n{PARAGRAPH.split()}")
print()
print(f"PARAGRAPH.rsplit():\n{PARAGRAPH.rsplit()}")
print()

parts: Tuple[str, str, str] = PARAGRAPH.partition("\n")
print(f"PARAGRAPH.partition('\\n'):\n{parts}")
print()

parts = PARAGRAPH.rpartition("\n")
print(f"PARAGRAPH.rpartition('\\n'):\n{parts}")


', '.join(kwlist):
False, None, True, and, as, assert, async, await, break, class, continue, def, del, elif, else, except, finally, for, from, global, if, import, in, is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield

PARAGRAPH.splitlines():
['Good morning!', 'Today, we will discuss about the weather.', 'Thank you!']

PARAGRAPH.split():
['Good', 'morning!', 'Today,', 'we', 'will', 'discuss', 'about', 'the', 'weather.', 'Thank', 'you!']

PARAGRAPH.rsplit():
['Good', 'morning!', 'Today,', 'we', 'will', 'discuss', 'about', 'the', 'weather.', 'Thank', 'you!']

PARAGRAPH.partition('\n'):
('Good morning!', '\n', 'Today, we will discuss about the weather.\nThank you!')

PARAGRAPH.rpartition('\n'):
('Good morning!\nToday, we will discuss about the weather.', '\n', 'Thank you!')
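
Because the cell above calls `split()` and `rsplit()` without arguments, their results are identical. The following sketch, with an illustrative comma-separated sample, shows where they diverge once a split limit is given, and what `partition()` returns when the separator is absent.

```python
# Sketch: split limits, explicit separators, and partition() without a match
record = "a,b,c,d"

# With a limit of 2, split() works from the left and rsplit() from the right
left_limited = record.split(",", 2)
right_limited = record.rsplit(",", 2)

# No argument: runs of whitespace are one delimiter; explicit " ": every space counts
by_whitespace = " a  b ".split()
by_single_space = " a  b ".split(" ")

# partition()/rpartition() always return a 3-tuple, even without a match
no_match_left = "no-separator".partition(":")
no_match_right = "no-separator".rpartition(":")

# join() reverses a split on the same separator
round_trip = ",".join(record.split(","))

print(left_limited)      # ['a', 'b', 'c,d']
print(right_limited)     # ['a,b', 'c', 'd']
print(by_whitespace)     # ['a', 'b']
print(by_single_space)   # ['', 'a', '', 'b', '']
print(no_match_left)     # ('no-separator', '', '')
print(no_match_right)    # ('', '', 'no-separator')
```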


## <a id='toc6_'></a>String Formatting [&#8593;](#toc0_)

- Strings in Python can be formatted using an `f`-string or the `.format()` method
- The pairs of curly braces in the formatting string are the placeholders for fields that will be substituted into the result
- Annotations inside the braces can pad an argument to a particular width, using a choice of fill character and justification mode
  - By default, space is used as a fill character
  - An implied `<` character dictates left-justification for strings (numbers are right-justified by default)
  - An explicit `>` character dictates right-justification
  - An explicit `^` character dictates center-justification

In [5]:
# String Formatting: Right-justify (>) with *, center (^) with -
# --------------------------------------------------------------

VAL: str = "hello"

print("{:*>20}".format(VAL))
print(f"{VAL:-^20}")

***************hello
-------hello--------
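
To complement the cell above, here is a small sketch of the default justification and the three explicit modes side by side. The width of 9 and the sample values are arbitrary choices for illustration.

```python
# Sketch: default and explicit justification within a field of width 9
word = "hello"
number = 42

# Without an explicit mode, strings go left and numbers go right
default_str = f"{word:9}"
default_num = f"{number:9}"

# Explicit justification modes (space is the default fill character)
left = f"{word:<9}"
right = f"{word:>9}"
centered = f"{word:^9}"

# A fill character goes in front of the justification character
dotted = "{:.^9}".format(word)

# Brackets make the padding visible
for label, value in [("default str", default_str), ("default num", default_num),
                     ("left", left), ("right", right),
                     ("centered", centered), ("dotted", dotted)]:
    print(f"{label:<12}[{value}]")
```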


- A number will be padded with zeros rather than spaces if its width description is prefaced with a zero

In [6]:
# String Formatting: Pad numbers with 0
# -------------------------------------

YEAR: int = 2023
MONTH: int = 3
DAY: int = 15

print(f"{YEAR}/{MONTH:02}/{DAY:02}")

2023/03/15


- Integers can be converted to *binary*, *octal*, or *hexadecimal* by respectively adding the character `b`, `o`, or `x` as a *suffix* to the annotation

In [7]:
# String Formatting: Converting integers to bin, octal, or hex
# ------------------------------------------------------------

MY_INT: int = 1965

print(f"MY_INT: {MY_INT}")
print(f"MY_INT in binary: {MY_INT:b}")
print(f"MY_INT in octal: {MY_INT:o}")
print(f"MY_INT in hexadecimal: {MY_INT:x}")

MY_INT: 1965
MY_INT in binary: 11110101101
MY_INT in octal: 3655
MY_INT in hexadecimal: 7ad
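
The base suffixes can be combined with other parts of the annotation. The sketch below is an illustration using the same value `1965`; it adds the `#` flag for a base prefix, the uppercase `X` suffix, and zero-padding to a fixed width, then converts one result back with `int()`.

```python
# Sketch: base suffixes combined with '#', uppercase hex, and zero-padding
value = 1965

with_hex_prefix = f"{value:#x}"      # '#' adds the 0x prefix
with_oct_prefix = f"{value:#o}"      # '#' adds the 0o prefix
upper_hex = f"{value:X}"             # uppercase hexadecimal digits
padded_binary = f"{5:08b}"           # binary, zero-padded to 8 digits

# int() with an explicit base reverses the conversion
round_trip = int(f"{value:x}", 16)

print(with_hex_prefix)   # 0x7ad
print(with_oct_prefix)   # 0o3655
print(upper_hex)         # 7AD
print(padded_binary)     # 00000101
print(round_trip)        # 1965
```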


- Displayed precision of a floating-point number is specified with a decimal point and the subsequent number of desired digits
- Suffix `f` for floating-point representation or `e` for scientific notation

In [8]:
# String Formatting: Formatting Floating-Points
# ---------------------------------------------

MY_FLOAT: float = 3.1415

print(f"MY_FLOAT: {MY_FLOAT:.3f}")
print(f"MY_FLOAT in scientific notation: {MY_FLOAT:.3e}")

MY_FLOAT: 3.142
MY_FLOAT in scientific notation: 3.142e+00
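
Precision can also be combined with a field width, which is useful for aligning columns of numbers. The following sketch uses made-up values and an arbitrary width of 10.

```python
# Sketch: combining width and precision for floating-point values
readings = [3.1415, 27.5, 1234.56789]

# Width 10, two digits after the decimal point, right-justified by default
fixed_rows = [f"{reading:10.2f}" for reading in readings]

# Same width in scientific notation with three digits after the decimal point
scientific_rows = [f"{reading:10.3e}" for reading in readings]

for fixed, scientific in zip(fixed_rows, scientific_rows):
    print(f"[{fixed}] [{scientific}]")
```

Each formatted value occupies exactly 10 characters here, so the decimal points line up when the rows are printed one under another.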
