## Day 03

# Strings

A string is a sequence of characters that can be written directly as a literal or stored in a variable. Text is represented by the string data type: any value written between single, double, or triple quotes is a string.

### For example, declaring a string in Python:

In [7]:
# Declaring a string variable
string = "This is a python string"
print(string)

This is a python string


In [9]:
print(type(string))

<class 'str'>


In [13]:
multiline_string = '''Hello Everyone this is day 03 
of 30 days of python learning,
'''
print(multiline_string)

Hello Everyone this is day 03 
of 30 days of python learning,



## String len()


The len() function returns the length of a string, that is, the number of chars in it. It is valid to have a string of zero characters, written just as '', called the "empty string". The length of the empty string is 0. The len() function is used throughout Python - it returns the length of many data types (strings, lists, dicts and other containers), with string just a first example.

In [17]:
s = 'Python'
len(s)

6

In [21]:
len('') # empty string

0
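As a quick sketch of the points above (the sample values are made up for illustration), note that spaces and special characters each count as one character, and that len() accepts other containers too:

```python
# Spaces and escape characters each count as one character
print(len('hi there'))   # 8
print(len('a\nb'))       # 3: 'a', newline, 'b'

# len() also works on other containers, such as lists
scores = [19, 34, 22]    # sample list, made up for this example
print(len(scores))       # 3
```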

## Convert Between Int and String
The formal name of the string type in Python is "str". The str() function serves to convert many values to a string form. This code computes the str form of the number 123:

In [26]:
str(123)

'123'

Looking carefully at the values, 123 is a number, while '123' is a string of length 3, made of the three chars '1', '2' and '3'.
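The conversion also works in the other direction with the built-in int() function. The snippet below is a minimal sketch; the variable names are chosen just for illustration:

```python
# Convert a string of digits to an int with the built-in int()
text = '123'
number = int(text)
print(number + 1)          # arithmetic works on the int: 124

# Convert back to a string with str()
round_trip = str(number)
print(round_trip == text)  # True

# int() raises ValueError when the text is not a valid integer
try:
    int('12a')
except ValueError as err:
    print('Not a number:', err)
```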

## String Indexing [ ]
Chars in a string are numbered with zero-based indexing, so the first char is at index 0, the next index 1, and the last char is at index len-1. Access the individual characters using square brackets, e.g. s[0] is the first char.

| Index | Letter |
|-------|--------|
|   0   |   P    |
|   1   |   y    |
|   2   |   t    |
|   3   |   h    |
|   4   |   o    |
|   5   |   n    |


In [32]:
s = 'Python'

In [34]:
len(s)

6

In [36]:
s[0]   # access char at index 0

'P'

In [38]:
s[1]

'y'

In [40]:
s[4]

'o'

In [44]:
s[6]  #IndexError: string index out of range

IndexError: string index out of range
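To stay inside the valid range, the last index can be computed as len(s) - 1. A small sketch of this, also showing the negative-index shortcut and a loop over every valid index:

```python
s = 'Python'

# The last valid index is len(s) - 1
last = s[len(s) - 1]
print(last)      # n

# Negative indexes count from the end, so -1 is also the last char
print(s[-1])     # n

# Visit every valid index, 0 through len(s) - 1
for i in range(len(s)):
    print(i, s[i])
```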

## String +
The + operator combines (aka "concatenates") two strings to make a bigger string. This creates a new string to represent the result, leaving the original strings unchanged.

In [47]:
s1 = 'Hello'
s2 = 'There'
s3 = s1+' '+s2
print(s3)

Hello There


In [49]:
print(s1)

Hello


The + operator only concatenates strings with other strings; it cannot, for example, join a string and an int. Call the str() function to make a string out of the int first, and then concatenation works.

In [56]:
'Number:' + 6 

TypeError: can only concatenate str (not "int") to str

In [58]:
'Number:' + str(6)

'Number:6'
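Here is a short sketch of building up a string step by step with + and +=. The names `count`, `label` and `message` are just illustrative:

```python
count = 6
label = 'Number: ' + str(count)   # str() turns the int into '6' first
print(label)                      # Number: 6

# += makes a new, longer string and stores it back in the variable
message = 'Hello'
message += ' '
message += 'There'
print(message)                    # Hello There
```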

## String Functions
Here are the most commonly used string functions.

### String in
The in operator checks, True or False, if something appears anywhere in a string. In this and other string comparisons, characters must match exactly, so 'a' matches 'a' but does not match 'A'. (Mnemonic: this is the same word "in" as used in the for-loop.)



In [64]:
'b' in 'abcd'

True

In [66]:
'B' in 'abcd'

False

In [74]:
'aa' in 'iiiaaiibb' # test string can be any length

True

In [76]:
'bbb' in 'iiaaabba'

False

In [78]:
'' in 'abcd'  # the empty string is found in every string

True
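In practice, `in` usually appears inside an if-statement. The sketch below uses a made-up `email` value, and lowercases the text first when the match should ignore case:

```python
email = 'Alice@Example.com'   # sample value, made up for this example

has_at = '@' in email
if has_at:
    print('looks like an email address')

# "not in" is the negated form
no_spaces = ' ' not in email
print(no_spaces)                             # True

# Matching is case-sensitive, so lowercase the text to ignore case
exact_match = 'example' in email             # False: 'E' is not 'e'
loose_match = 'example' in email.lower()     # True
print(exact_match, loose_match)
```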

### Character Tests: s.isalpha() s.isdigit() s.isspace()

The characters that make up a string can be divided into several categories or "character classes":

### Character Classes in Strings

#### Alphabetic Characters
- **Examples:** `'abcXYZ'`
- Used to write words.
- **Divided into:**
  - Uppercase versions (e.g., `'A'`, `'B'`, `'C'`, etc.).
  - Lowercase versions (e.g., `'a'`, `'b'`, `'c'`, etc.).
- The details depend on the particular Unicode alphabet.

#### Digit Characters
- **Examples:** `'0'`, `'1'`, ..., `'9'`
- Used to write numbers.

#### Space Characters
- **Examples:** 
  - Space (`' '`)
  - Newline (`'\n'`)
  - Tab (`'\t'`)

#### Miscellaneous Characters
- These are neither alphabetic, digit, nor space characters.
- **Examples:** `$`, `^`, `<`, etc.

---

### Test Functions for Character Classes

These methods return `True` if the string `s` is not empty and all of its characters belong to the respective class:

- **`s.isalpha()`**
  - Returns `True` for alphabetic "word" characters like `'abcXYZ'`.
  - Applies to alphabetic characters in other Unicode alphabets too, e.g., `'Σ'`.

- **`s.isdigit()`**
  - Returns `True` if all characters in `s` are digits (`'0..9'`).

- **`s.isspace()`**
  - Returns `True` for whitespace characters, e.g., space, tab, newline.

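- **`s.isupper()` / `s.islower()`**
  - `s.isupper()` - Returns `True` if `s` contains at least one alphabetic character and all of its alphabetic characters are uppercase.
  - `s.islower()` - Returns `True` if `s` contains at least one alphabetic character and all of its alphabetic characters are lowercase.
  - Characters like `'4'` and `'$'` do not have uppercase/lowercase versions, so a string made only of such characters returns `False` for both.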


### Example

In [100]:
'a'.isalpha()

True

In [102]:
'$'.isalpha()

False

In [104]:
'a'.islower()

True

In [106]:
'a'.isupper()

False

In [108]:
'7'.isdigit()

True

In [110]:
'a'.isdigit()

False

In [112]:
' '.isspace()

True
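The tests above use single characters, but the same methods work on longer strings. The following sketch (with a made-up `text` value) shows a few multi-character cases and a loop that counts the digits in a string:

```python
# Every character must belong to the class
print('abc'.isalpha())      # True
print('abc123'.isalpha())   # False: the digits are not alphabetic
print('2024'.isdigit())     # True
print(''.isalpha())         # False: the empty string never passes

# isupper() only looks at characters that have a case
print('HELLO4'.isupper())   # True
print('Hello4'.isupper())   # False

# Count the digit characters in a string
text = 'Room 101, Floor 3'  # sample text, made up for this example
digit_count = 0
for ch in text:
    if ch.isdigit():
        digit_count += 1
print(digit_count)          # 4
```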

## Change Case s.upper() s.lower()

s.lower() - returns a new version of s where each char is converted to its lowercase form, so 'A' becomes 'a'. s.upper() does the same in the other direction, so 'a' becomes 'A'. Chars like '$' are returned unchanged. The original s is unchanged - a good example of strings being immutable.

In [116]:
s = 'Python'
s.lower()

'python'

In [118]:
s.upper()

'PYTHON'

In [120]:
print(s)

Python
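A common use of these methods is comparing two strings while ignoring case. A minimal sketch, with variable names chosen for illustration:

```python
a = 'Python'
b = 'PYTHON'

same_exact = a == b                        # False: the case differs
same_ignoring_case = a.lower() == b.lower()  # True: both become 'python'
print(same_exact, same_ignoring_case)
```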


### Testing String Start and End: `s.startswith()` and `s.endswith()`

These convenient methods return `True` or `False` depending on what appears at one end of a string. They are especially useful for checking prefixes or suffixes, such as verifying whether a filename ends with `.html`.

**Style Note:** These are examples of well-named methods, which make the code that uses them highly readable.

---

#### **`s.startswith(x)`**
- **Description:** Returns `True` if the string `s` starts with the substring `x`.
- **Example:**
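  ```python
  s = "example.html"
  print(s.startswith("example"))  # True
  print(s.startswith("html"))     # False
  ```

#### **`s.endswith(x)`**
- **Description:** Returns `True` if the string `s` ends with the substring `x`.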


In [124]:
s = "example.html"
print(s.endswith(".html"))  # True
print(s.endswith("example")) # False


True
False


In [126]:
'Python'.startswith('Py')

True

In [128]:
'Python'.startswith('Px')

False

In [130]:
'resume.html'.endswith('.html')

True
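As a sketch of the filename use case mentioned above, the loop below keeps only the names ending in `.html`. The `filenames` list is made up for illustration. Both methods also accept a tuple of candidates:

```python
filenames = ['index.html', 'notes.txt', 'resume.html', 'photo.png']  # sample data

html_files = []
for name in filenames:
    if name.endswith('.html'):
        html_files.append(name)
print(html_files)    # ['index.html', 'resume.html']

# A tuple checks several suffixes at once
is_image = 'photo.png'.endswith(('.png', '.jpg'))
print(is_image)      # True
```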

### Searching Strings with `s.find()` and Related Functions

#### **`s.find(x)`**
- **Description:** Searches the string `s` from left to right for the substring `x`.  
  - Returns the **index** (integer) of the first occurrence of `x`.
  - Returns `-1` if `x` is not found.
- **Use Case:** Determines where a substring first appears in a string.

#### **Examples:**



In [134]:
s = 'Python'

# Using `in`
print('y' in s)       # Output: True
print('xx' in s)      # Output: False

# Using `s.find()`
print(s.find('y'))    # Output: 1
print(s.find('xx'))   # Output: -1


True
False
1
-1
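Since s.find() returns -1 when nothing is found, it is a good idea to check for -1 before using the index. The sketch below uses a made-up `email` value and slicing (covered later in this section) to split a string around the found position. It also shows the optional second argument, which is the index where the search starts:

```python
email = 'sally@example.com'   # sample value, made up for this example

at = email.find('@')
if at != -1:                  # only use the index if '@' was found
    user = email[:at]         # everything before the '@'
    domain = email[at + 1:]   # everything after the '@'
    print(user)               # sally
    print(domain)             # example.com

# The optional second argument is the index where the search starts
s = 'banana'
first = s.find('an')              # 1
second = s.find('an', first + 1)  # 3
print(first, second)
```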


### Removing Whitespace: `s.strip()`

#### **`s.strip()`**
- **Description:** Returns a version of the string `s` with all leading and trailing **whitespace characters** removed.  
  - Whitespace includes spaces (`' '`), tabs (`'\t'`), and newlines (`'\n'`).
- **Use Case:** Commonly used to clean up strings read from files, user input, or other sources where unintended leading or trailing whitespace might be present.

---

#### **Example Usage:**
```python
# Removing whitespace
s = '   hi there  \n'
print(s.strip())  # Output: hi there

# No leading or trailing whitespace
s = 'Python'
print(s.strip())  # Output: Python (unchanged)
```


In [139]:
'  hi there  \n'.strip()

'hi there'

In [143]:
hi = '       hello everyone   '
hi.strip()

'hello everyone'
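Two related methods, s.lstrip() and s.rstrip(), remove whitespace from only the left or only the right end. The sketch below uses repr() so the remaining whitespace is visible:

```python
s = '   hi there  \n'

left_only = s.lstrip()     # removes leading whitespace only
right_only = s.rstrip()    # removes trailing whitespace only
both = s.strip()           # removes both ends

print(repr(left_only))     # 'hi there  \n'
print(repr(right_only))    # '   hi there'
print(repr(both))          # 'hi there' (the inner space is kept)

# Passing a string of characters strips those instead of whitespace
title = '--title--'.strip('-')
print(title)               # title
```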

### Replacing Substrings: `s.replace()`

#### **`s.replace(old, new)`**
- **Description:** Returns a version of the string `s` where all occurrences of the substring `old` are replaced with the substring `new`.
  - Replaces **every instance** of `old` in `s`, without considering word boundaries.
  - If `new` is an empty string (`''`), all occurrences of `old` are effectively removed from `s`.

---

#### **Example Usage:**
```python
# Basic replacement
s = 'this is it'
print(s.replace('is', 'xxx'))  # Output: thxxx xxx it

# Removing a substring
print(s.replace('is', ''))     # Output: th  it (two spaces remain)
```


In [148]:
'this is it'.replace('is','why')

'thwhy why it'

In [150]:
'this is it'.replace('is','')

'th  it'
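s.replace() also takes an optional third argument that limits how many occurrences are replaced. A small sketch, which also shows that matching is case-sensitive:

```python
s = 'this is it'

# Replace only the first occurrence
first_only = s.replace('is', 'xxx', 1)
print(first_only)    # thxxx is it

# Matching is case-sensitive, so 'IS' is not found and s comes back as-is
no_match = s.replace('IS', 'xxx')
print(no_match)      # this is it
```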

In [152]:
greet = "Hello everyone, What's up there"
greet.replace('Hello', 'Hey')

"Hey everyone, What's up there"

### Working with Immutable Strings: `x = change(x)`

Strings in Python are **immutable**, meaning that once a string is created, its characters cannot be changed directly. Instead, any operation that appears to "modify" a string actually creates a **new string**. To apply changes, you must **explicitly store the result** in a variable, often following the `x = change(x)` pattern.

---

#### **Understanding Immutability**
Consider the following example:
```python
s = 'Hello'
print(s.upper())  # Computes the uppercase form of 'Hello'
# Output: HELLO

print(s)          # The original string is unchanged
# Output: Hello
```


In [155]:
s = 'Hello everyone'
s.upper()

'HELLO EVERYONE'

In [157]:
s

'Hello everyone'

In [159]:
s = 'Hello'

# Step 1: Convert to uppercase
s = s.upper()

# Step 2: Add an exclamation mark
s = s + '!'

print(s)
# Output: HELLO!


HELLO!
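Because each string method returns a new string, calls can also be chained in a single line. The sketch below (with a made-up `raw` value) shows a chain, followed by the common mistake of forgetting to store the result:

```python
raw = '  Hello World \n'    # sample value, made up for this example

# Each method returns a new string, so the calls can be chained
cleaned = raw.strip().lower().replace(' ', '_')
print(cleaned)              # hello_world

# Common mistake: calling the method without storing the result
s = 'Hello'
s.upper()                   # the new string is computed, then thrown away
print(s)                    # Hello

s = s.upper()               # x = change(x) keeps the result
print(s)                    # HELLO
```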


### Backslash Special Characters in Strings

In Python, a **backslash (`\`)** in a string literal "escapes" certain special characters, allowing them to be included or interpreted in the string. These escaped characters represent various formatting or control sequences.

---

#### **Common Backslash Escapes**
| Escape Sequence | Description            | Resulting Character |
|-----------------|------------------------|---------------------|
| `\'`            | A single quote         | `'`                 |
| `\"`            | A double quote         | `"`                 |
| `\\`            | A backslash            | `\`                 |
| `\n`            | A newline (line break) | (new line)          |

---

#### **Example with `\n` (Newline):**
```python
a = 'First line\nSecond line\nThird line\n'
print(a)
# Output:
# First line
# Second line
# Third line
```


In [167]:
a = 'First line\nSecond line\nThird line\n'
print(a)

First line
Second line
Third line



In [171]:
greet = 'Hello \nEveryone\n'
print(greet)

Hello 
Everyone
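A few more escapes in action, as a minimal sketch. Note that an escape like `\n` counts as a single character. The raw-string prefix `r'...'` is standard Python syntax that turns off backslash escapes. The `path` value is made up for illustration:

```python
# An escape sequence is one character in the string
newline_len = len('a\nb')
print(newline_len)          # 3

# \' puts a single quote inside a single-quoted string
quote = 'It\'s here'
print(quote)                # It's here

# \\ produces one real backslash
path = 'C:\\new\\table'     # sample path, made up for this example
print(path)                 # C:\new\table

# A raw string (r prefix) leaves backslashes alone
raw_path = r'C:\new\table'
print(path == raw_path)     # True
```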



### Format Strings in Python

A **format string** is a modern and convenient way to embed variables, expressions, or function calls directly into a string. Format strings are denoted by a leading **`f`** before the opening quote, e.g., `f'...'`.

---

#### **Basic Syntax**
- Curly braces `{}` hold an expression, such as a variable or function call.
- The result of evaluating the expression is inserted into the string at that position.

---

#### **Examples:**
```python
# Accessing a variable
name = 'Sally'
print(f'Name: {name}')
# Output: Name: Sally

# Calling a function within the format string
scores = [19, 34, 22]
print(f'Max score: {max(scores)}')
# Output: Max score: 34

# Using multiple expressions
age = 25
print(f'{name} is {age} years old.')
# Output: Sally is 25 years old.
```

To include a literal curly brace in a format string, double it: `{{` or `}}`.


In [174]:
print(f'A curly brace: {{')
# Output: A curly brace: {


A curly brace: {


In [178]:
# Arithmetic in format strings
x, y = 5, 10
print(f'{x} + {y} = {x + y}')
# Output: 5 + 10 = 15

# String methods
greeting = 'hello'
print(f'Uppercase: {greeting.upper()}')
# Output: Uppercase: HELLO

# Multiple function calls
scores = [19, 34, 22]  # Define the scores list
print(f'Sum of scores: {sum(scores)} and max: {max(scores)}')
# Output: Sum of scores: 75 and max: 34


5 + 10 = 15
Uppercase: HELLO
Sum of scores: 75 and max: 34
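One practical benefit of format strings is that values are converted to text automatically, so no str() call is needed as with + concatenation. A minimal sketch comparing the two (variable names are illustrative):

```python
count = 6

joined = 'Number: ' + str(count)   # concatenation needs str()
formatted = f'Number: {count}'     # the format string converts the int itself

print(joined)                      # Number: 6
print(formatted)                   # Number: 6
print(joined == formatted)         # True
```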


### String Formatting: Floating Point Precision

When working with floating-point numbers in Python, default formatting often produces more digits than necessary (15+ digits). To make the output more readable, the **format string** syntax allows specifying the desired number of digits.

---

#### **Basic Syntax for Floating Point Formatting**
1. **`:.<digits>`**: Limits the output to the specified number of **significant digits**, counted from the first non-zero digit.
2. **`:.<digits>f`**: Shows exactly the specified number of digits after the decimal point.

---

#### **Examples:**
```python
# Default floating-point output
x = 2 / 3
print(f'Default: {x}')
# Output: Default: 0.6666666666666666

# Limit to 4 significant digits
print(f'Limited to 4 digits: {x:.4}')
# Output: Limited to 4 digits: 0.6667

# Fixed to exactly 4 decimal places
print(f'Fixed to 4 decimal places: {x:.4f}')
# Output: Fixed to 4 decimal places: 0.6667
```


In [181]:
# At most 4 significant digits (trailing zeros are not added)
print(f'{0.025:.4}')
# Output: 0.025

# Exactly 4 decimal places
print(f'{0.025:.4f}')
# Output: 0.0250

# Another example
value = 123.45678
print(f'Value with 2 decimal places: {value:.2f}')
# Output: Value with 2 decimal places: 123.46


0.025
0.0250
Value with 2 decimal places: 123.46
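Keep in mind that formatting only produces text; the number itself is not changed. The sketch below (with a made-up `price` value) shows this, and also how a width can be placed before the precision to line numbers up:

```python
price = 3.14159           # sample value, made up for this example

text = f'{price:.2f}'     # formatting produces a new string
print(text)               # 3.14
print(type(text))         # <class 'str'>
print(price)              # 3.14159 (the float itself is unchanged)

# A width before the precision pads the number on the left
padded = f'{price:8.2f}|'
print(padded)             #     3.14|
```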


### String Formatting: Hexadecimal, Binary, and Thousand Separators

Using Python's **format strings**, integers can be formatted in different number systems or with added separators for readability. Here are some common formatting options:

---

#### **Hexadecimal and Binary Formatting**
1. **`:x`**: Converts the integer to **hexadecimal** (base-16).
2. **`:b`**: Converts the integer to **binary** (base-2).

---

#### **Examples:**
```python
# Define an integer
n = 215

# Default (decimal) representation
print(f'Decimal: {n}')
# Output: Decimal: 215

# Hexadecimal representation
print(f'Hexadecimal: {n:x}')
# Output: Hexadecimal: d7

# Binary representation
print(f'Binary: {n:b}')
# Output: Binary: 11010111
```

#### **Thousand Separators**
- **`:,`**: Inserts a comma between each group of three digits.


In [184]:
n = 1_000_000  # Underscores in numbers are ignored and improve code readability

# Add thousand separators
print(f'With commas: {n:,}')
# Output: With commas: 1,000,000


With commas: 1,000,000
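These options can be combined with the other format codes. The sketch below uses standard format-spec features (`,` with `.2f`, the `#` prefix flag, and zero-padding); the sample numbers are made up for illustration:

```python
amount = 1234567.891      # sample value, made up for this example

# Thousand separators together with 2 decimal places
with_commas = f'{amount:,.2f}'
print(with_commas)        # 1,234,567.89

# '#' adds the 0x prefix to hexadecimal output
hex_text = f'{255:#x}'
print(hex_text)           # 0xff

# Pad binary output with zeros to a width of 8
bin_text = f'{5:08b}'
print(bin_text)           # 00000101
```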


### String Slicing in Python

String slicing is a way to extract parts (substrings) of a string using a range of indices. Python strings are **indexed** and support both **positive** and **negative indexing**, enabling flexible slicing options.

---

#### **Basic Slicing Syntax**
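```python
s[start:end:step]
```
- `start`: index where the slice begins (included; defaults to the start of the string).
- `end`: index where the slice stops (not included; defaults to the end of the string).
- `step`: how many characters to move forward each time (defaults to 1).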


In [189]:
# Define a string
s = "Hello, World!"

In [193]:
# Slice with positive indices
print(s[0:5])   # 'Hello'
print(s[:5])    # 'Hello' (start defaults to 0)
print(s[7:])    # 'World!' (end defaults to the length of the string)

Hello
Hello
World!
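The negative indexing and step mentioned above can be sketched with the same string:

```python
s = "Hello, World!"

# Negative indices count from the end of the string
last_six = s[-6:]       # the last 6 characters
all_but_last = s[:-1]   # everything except the last character
print(last_six)         # World!
print(all_but_last)     # Hello, World

# The step picks every n-th character
every_other = s[::2]
print(every_other)      # Hlo ol!

# A step of -1 walks backwards, reversing the string
backwards = s[::-1]
print(backwards)        # !dlroW ,olleH

# Unlike indexing, a slice that runs past the end is simply cut short
clipped = s[7:100]
print(clipped)          # World!
```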
