# 03 — Strings & Type Conversion

Goal: Get very comfortable with strings and converting between basic types.

Why this matters for AI / data:
- Most raw data (CSV, JSON, logs) starts as **strings**
- You constantly convert between `str` ↔ `int` ↔ `float` ↔ `bool`
- Text preprocessing, file paths, config values, etc. are all string-heavy


## 1. What Is a String?

A **string** (`str`) is a sequence of characters.

- Single quotes: `'hello'`
- Double quotes: `"hello"`
- Triple quotes for multi-line: `"""long string"""`

Strings are:
- **Ordered**
- **Immutable** (you cannot modify them in place)
- Indexable and sliceable like lists


In [1]:
s1 = "hello"
s2 = 'world'
s3 = """multi
line
string"""

print(s1, type(s1))
print(s2, type(s2))
print(s3)

hello <class 'str'>
world <class 'str'>
multi
line
string


## 2. Indexing & Slicing Strings

Strings behave like sequences of characters:

- `s[0]` → first character
- `s[-1]` → last character
- `s[start:stop]`
- `s[start:stop:step]`

Remember: strings are **immutable**, so these return **new strings**.


In [2]:
text = "neural"

print("text[0]:", text[0])
print("text[1:4]:", text[1:4])
print("text[-1]:", text[-1])
print("text[::2]:", text[::2])  # every second character


text[0]: n
text[1:4]: eur
text[-1]: l
text[::2]: nua
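
Two more slicing behaviours are worth knowing. The sketch below reuses `text` from the cell above; the names `reversed_text` and `clamped` are only illustrative. A negative step walks the string backwards, and slice bounds that run past the end are clamped rather than raising an error, whereas plain indexing out of range raises `IndexError`.

```python
text = "neural"

reversed_text = text[::-1]   # negative step walks backwards
clamped = text[2:100]        # stop is clamped to len(text)

print("text[::-1]:", reversed_text)
print("text[2:100]:", clamped)

# Indexing (unlike slicing) must stay in range
try:
    text[100]
except IndexError as e:
    print("IndexError:", e)
```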


## 3. Immutability

You can't change a string in place:

```python
s = "cat"
s[0] = "b"   # TypeError: strings do not support item assignment
```
Instead, you create new strings:
```python
s = "cat"
s = "b" + s[1:]   # "bat"
```

In [3]:
s = "cat"
try:
    s[0] = "b"
except TypeError as e:
    print("Strings are immutable:", e)

s2 = "b" + s[1:]
print("New string:", s2)

Strings are immutable: 'str' object does not support item assignment
New string: bat
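
The same rule applies to string methods: they never change the original, they return a new string. A minimal sketch (the variable names are illustrative):

```python
s = "cat"

upper_s = s.upper()              # returns a new string
replaced = s.replace("c", "b")   # also a new string

print("s is unchanged:", s)
print("upper_s:", upper_s)
print("replaced:", replaced)

# To "update" s, rebind the name to the new string
s = s.replace("c", "b")
print("s after rebinding:", s)
```

Forgetting to rebind the name (writing `s.strip()` on its own line, for example) is a common source of "nothing happened" bugs.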


## 4. Common String Methods

You'll use these constantly:

- Case:
  - `s.lower()`
  - `s.upper()`
  - `s.title()`
- Whitespace:
  - `s.strip()` (removes leading/trailing whitespace: spaces, tabs, newlines)
- Search / check:
  - `s.startswith(prefix)`
  - `s.endswith(suffix)`
  - `sub in s`
- Splitting & joining:
  - `s.split()` → string → list of substrings
  - `" ".join(list_of_strings)` → list → string


In [4]:
s = "   Neural Networks From Scratch   "

print("Original:", repr(s))
print("strip():", repr(s.strip()))
print("lower():", s.lower())
print("upper():", s.upper())

print("Starts with 'Neural'?", s.strip().startswith("Neural"))
print("Ends with 'Scratch'?", s.strip().endswith("Scratch"))

# Split and join
words = s.strip().split()  # split on whitespace
print("Words:", words)

joined = " | ".join(words)
print("Joined with pipes:", joined)


Original: '   Neural Networks From Scratch   '
strip(): 'Neural Networks From Scratch'
lower():    neural networks from scratch   
upper():    NEURAL NETWORKS FROM SCRATCH   
Starts with 'Neural'? True
Ends with 'Scratch'? True
Words: ['Neural', 'Networks', 'From', 'Scratch']
Joined with pipes: Neural | Networks | From | Scratch
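
The cell above does not exercise `title()`, the `in` check, or `split()` with an explicit separator. The following sketch covers those, using made-up example strings; it also shows how `split()` and `split(" ")` differ when the text contains repeated spaces.

```python
s = "neural networks from scratch"

titled = s.title()
print("title():", titled)
print("'net' in s?", "net" in s)   # substring check
print("'Net' in s?", "Net" in s)   # the check is case-sensitive

# split() with no argument collapses runs of whitespace;
# split(" ") splits on every single space and keeps empty strings
messy = "deep  learning"           # two spaces
print(messy.split())
print(messy.split(" "))

# An explicit separator is handy for delimited text
line = "42,0.001,True,mnist"
fields = line.split(",")
print("fields:", fields)
print("round trip ok?", ",".join(fields) == line)
```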


## 5. f-Strings — Modern String Formatting

f-strings let you embed expressions directly in strings:

```python
name = "Joe"
score = 0.87
msg = f"{name} scored {score:.2f}"
```
Advantages:
- Readable
- Fast
- Great for logging / debugging / printing metrics

In [5]:
name = "Joe"
epoch = 5
loss = 0.03456

msg = f"Epoch {epoch}: loss = {loss:.4f} (by {name})"
print(msg)

# You can put expressions inside too
lr = 0.001
print(f"Learning rate × 10 = {lr * 10}")

Epoch 5: loss = 0.0346 (by Joe)
Learning rate × 10 = 0.01
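
The part after the colon in `{loss:.4f}` is a format specifier. A few more specifiers that tend to be useful when printing metrics are sketched below; the values and variable names are invented for illustration, and the `=` form assumes Python 3.8 or newer.

```python
accuracy = 0.91234
n_samples = 1234567
epoch = 5

pct = f"{accuracy:.1%}"       # percentage with 1 decimal place
grouped = f"{n_samples:,}"    # thousands separator
padded = f"{epoch:03d}"       # zero-padded to width 3
aligned = f"[{'loss':>8}]"    # right-aligned in 8 columns
debug = f"{accuracy=}"        # Python 3.8+: prints name and value

for line in (pct, grouped, padded, aligned, debug):
    print(line)
```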


## 6. Type Conversion (Casting)

Converting between types is essential when reading data.

Core functions:

- `int(x)`   → convert to integer
- `float(x)` → convert to float
- `str(x)`   → convert to string
- `bool(x)`  → convert to boolean (truthiness rules)

Many conversions are straightforward, but some are surprising.


In [6]:
# String to int
s = "42"
n = int(s)
print(s, "->", n, type(n))

# String to float
s2 = "3.14"
f = float(s2)
print(s2, "->", f, type(f))

# Number to string
x = 123
sx = str(x)
print(x, "->", sx, type(sx))

# Beware invalid conversions
for bad in ["3.14", "abc"]:
    try:
        print("int(", bad, ") ->", int(bad))
    except ValueError as e:
        print("Cannot convert:", repr(bad), "->", e)


42 -> 42 <class 'int'>
3.14 -> 3.14 <class 'float'>
123 -> 123 <class 'str'>
Cannot convert: '3.14' -> invalid literal for int() with base 10: '3.14'
Cannot convert: 'abc' -> invalid literal for int() with base 10: 'abc'
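
Since `int("3.14")` fails, a decimal string has to go through `float()` first. The sketch below shows that route and two details that often surprise people: `int()` truncates toward zero instead of rounding, and both `int()` and `float()` ignore surrounding whitespace.

```python
# int() tolerates surrounding whitespace
n = int("  42\n")
print(n)

# A decimal string must go through float() first;
# int() then truncates toward zero (it does not round)
truncated = int(float("3.99"))
negative = int(float("-3.99"))
print(truncated, negative)

# Use round() if rounding is what you want
rounded = round(float("3.99"))
print(rounded)
```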


### `bool()` Can Be Surprising

`bool(x)` uses truthiness rules:

- `bool(0)` → `False`
- `bool(0.0)` → `False`
- `bool("")` → `False`
- `bool([])` → `False`
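- Other falsy values include `None` and the other empty containers (`()`, `{}`, `set()`)
- Nearly everything else, including any non-empty string such as `"False"`, is `True`

This matters when parsing configuration flags from text: `bool("False")` is `True`, so compare the text itself rather than calling `bool()` on it.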


In [7]:
print("bool(0):", bool(0))
print("bool(1):", bool(1))
print("bool(''):", bool(""))
print("bool('False'):", bool("False"))  # this is True!
print("bool([]):", bool([]))
print("bool([0]):", bool([0]))


bool(0): False
bool(1): True
bool(''): False
bool('False'): True
bool([]): False
bool([0]): True
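
As a quick check of the rules, the sketch below groups some built-in values by truthiness. The two list names are illustrative. Note that every non-empty string is truthy, including `"0"` and a single space.

```python
falsy_values = [None, 0, 0.0, "", [], (), {}, set()]
truthy_values = ["0", " ", "False", [0], (0,), {"a": 0}]

all_falsy = all(not bool(v) for v in falsy_values)
all_truthy = all(bool(v) for v in truthy_values)

print("all falsy? ", all_falsy)
print("all truthy?", all_truthy)
```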


## 7. Tiny Data-Flavoured Example

Imagine we read a CSV line and everything starts as strings:

```python
row = ["42", "0.001", "True", "mnist"]
```
We often need to convert:
- ID → int
- learning rate → float 
- flag → bool (carefully)
- name → keep as str

In [8]:
row = ["42", "0.001", "True", "mnist"]

id_str, lr_str, flag_str, dataset = row

id_val = int(id_str)
lr = float(lr_str)

# Simple (but naive) bool parse:
is_active = (flag_str == "True")

print("id:", id_val, type(id_val))
print("lr:", lr, type(lr))
print("is_active:", is_active, type(is_active))
print("dataset:", dataset, type(dataset))

id: 42 <class 'int'>
lr: 0.001 <class 'float'>
is_active: True <class 'bool'>
dataset: mnist <class 'str'>
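
When rows have many columns, one possible pattern is to pair each column with a converter function and apply them in a loop. This is only a sketch: `names`, `converters`, and `parse_flag` are hypothetical helpers invented here, and `parse_flag` keeps the same naive rule as the cell above.

```python
row = ["42", "0.001", "True", "mnist"]

def parse_flag(s):
    # Naive: only the exact text "True" counts as true
    return s == "True"

# One name and one converter per column, in column order
names = ["id", "lr", "is_active", "dataset"]
converters = [int, float, parse_flag, str]

record = {
    name: convert(value)
    for name, convert, value in zip(names, converters, row)
}
print(record)
```

Keeping the converters in a list makes the column types visible in one place, at the cost of having to keep `names`, `converters`, and the row in the same order.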


# 03 — Exercises (Strings & Type Conversion)

### Exercise 1 — Indexing & Slicing

Given:

```python
s = "NeuralNet"
```
Predict:

- s[0]
- s[-1]
- s[0:6]
- s[::2]

Then check in a code cell.

In [9]:
# Exercise 1
s = "NeuralNet"

### Exercise 2 — Normalising Text

Given:
```python
raw = "   HeLLo  WoRLd   "
```
Use string methods to transform this into exactly:

```python
"Hello World!"
```
(no extra spaces, each word capitalised, a single space between words)

Adding the `!` is an optional extra challenge.

In [10]:
# Exercise 2
raw = "   HeLLo  WoRLd   "

### Exercise 3 — Splitting & Joining

Given:
```python
sentence = "deep learning is fun"
```
- Split it into a list of words

- Then join the words with " | " to get:

```python
"deep | learning | is | fun"
```

In [11]:
# Exercise 3
sentence = "deep learning is fun"


### Exercise 4 — Safe Conversion

You have:
```python
values = ["10", "3.5", "abc", "7"]
```

Write code that loops over `values` and:

- converts each entry to `int` if possible

- skips the ones that cannot be converted

Expected result: the list `[10, 7]`.

(Hint: use `try` / `except`.)

In [13]:
# Exercise 4
values = ["10", "3.5", "abc", "7"]

### Exercise 5 — Boolean from String

Implement a function:
```python
def str_to_bool(s):
```
Rules:

- "true", "1", "yes" (any case) → True

- "false", "0", "no" (any case) → False

- anything else → raise ValueError


In [16]:
# Exercise 5
def str_to_bool(s):
    return True  # placeholder: replace with your implementation