# Python Fundamentals (Data Engineering Focus)

These notes summarize key **Python Fundamentals** essential for data engineering.


## 1. Core Execution and Syntax

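- **Interpreted Language:** Python is interpreted: source code runs through the interpreter at runtime, with no separate build step that produces a native executable.
- **Execution Process:** When a script runs, Python first compiles it into **bytecode** (an intermediate representation), then the **Python Virtual Machine (PVM)** executes that bytecode instruction by instruction.
- **Indentation:** Indentation is **part of the syntax** in Python: it defines which statements belong to a block, and inconsistent indentation raises an `IndentationError`.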
- **Comments:**
  - **Single-line:** `# comment`
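  - **Multi-line:** Python has no dedicated multi-line comment syntax. Use consecutive `#` lines, or a triple-quoted string (`''' ... '''` or `""" ... """`), which is a string literal that Python evaluates and discards.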
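The compile-then-execute process described above can be observed from inside Python. The following is a minimal sketch using the standard-library `dis` module; the function `add` is a made-up example, and the exact instructions printed vary between Python versions.

```python
import dis


def add(a, b):
    """Hypothetical example function used only to inspect its bytecode."""
    return a + b


# Every function carries a code object holding its compiled bytecode.
code_obj = add.__code__
bytecode = code_obj.co_code  # raw bytecode as a bytes object

print(type(bytecode).__name__, len(bytecode))

# dis.dis prints a human-readable listing of the instructions the PVM runs.
dis.dis(add)
```

The listing is mainly useful for building intuition; day-to-day data engineering code rarely needs to inspect bytecode directly.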


In [None]:
# Single-line comment

"""
This triple-quoted string acts as a multi-line comment.
It is really an unused string literal; the same syntax is used for docstrings.
"""

def greet(name: str) -> str:
    """Return a greeting message."""
    return f"Hello, {name}!"

print(greet("Data Engineer"))


## 2. Variables and Data Handling

- **Variables:** Names that refer to values (objects); a variable needs no type declaration and can be rebound to a value of a different type.
- **Assignment:** Uses `=`
  - Multiple assignments: `x, y, z = 10, 5, 20`
  - Same value assignment: `x = y = z = 20`
- **Type Casting:** Converting one data type to another.
  - **Explicit:** `int("10")`
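  - **Implicit:** Python automatically converts between numeric types in some operations (e.g., `int + float -> float`). It does not convert between strings and numbers: `"1" + 1` raises a `TypeError`.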
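Explicit casting can fail on messy input, which is common in raw data. The sketch below shows one possible way to handle that; `to_int` is a hypothetical helper written for illustration, not a built-in, and returning a default on failure is an assumed policy.

```python
def to_int(value, default=None):
    """Hypothetical helper: cast to int, returning `default` when casting fails."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # TypeError: value is None or another unsupported type.
        # ValueError: value is a string that is not a valid integer.
        return default


raw_values = ["42", " 7 ", "3.9", "", None, "abc"]
cleaned = [to_int(v) for v in raw_values]
print(cleaned)  # [42, 7, None, None, None, None]

# Strings and numbers are never mixed implicitly.
try:
    "1" + 1
except TypeError as exc:
    print("TypeError:", exc)
```

Note that `int("3.9")` fails rather than truncating; converting such a value would need `int(float("3.9"))`, which is a separate decision about how to treat decimals.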


In [None]:
# Variable assignment
x = 10
name = "Anshu"

# Multiple assignments
a, b, c = 10, 5, 20

# Same value assignment
p = q = r = 20

print(x, name)
print(a, b, c)
print(p, q, r)

# Type casting
num_str = "123"
num_int = int(num_str)  # explicit

num_float = 10.5
result = num_int + num_float  # implicit casting (int + float -> float)

print("num_int:", num_int, type(num_int))
print("result:", result, type(result))
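One detail of same-value assignment is worth a short sketch: `p = q = r = 20` binds all three names to a single object. With immutable values such as integers this is harmless, but with a mutable value such as a list, a change made through one name is visible through the others. The variable names below are illustrative only.

```python
# Immutable value: rebinding one name does not affect the other.
x = y = 20
x += 1
print(x, y)  # 21 20

# Mutable value: both names refer to the same list object.
first = second = []
first.append("row_1")
print(second)           # ['row_1']
print(first is second)  # True

# Separate lists need separate list literals.
left, right = [], []
left.append("row_1")
print(right)  # []
```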


## 3. The `print()` Function

- **Basic Function:** `print()` displays output.
- **Separators:** `sep=` adds a delimiter between items.
- **Multi-line Print:** Use triple quotes for multi-line strings.
- **Escape Characters:** A backslash (`\`) starts an escape sequence: `\"` and `\\` produce a literal quote or backslash, while `\n` and `\t` produce a newline or tab.
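Beyond `sep`, `print()` also accepts `end` and `file` arguments. The sketch below writes comma-separated lines into an in-memory buffer and compares an escaped path with a raw string; the column names and the path are made-up sample values.

```python
import io

# file= redirects output; here it goes to an in-memory text buffer.
buffer = io.StringIO()
print("id", "name", "amount", sep=",", file=buffer)
print(1, "widget", 9.5, sep=",", file=buffer)

csv_text = buffer.getvalue()
print(csv_text, end="")  # end="" avoids adding a second trailing newline

# A raw string (r"...") disables escape processing, which helps with
# Windows-style paths where \n or \t would otherwise be interpreted.
escaped_path = "C:\\data\\new_table.csv"
raw_path = r"C:\data\new_table.csv"
print(escaped_path == raw_path)  # True
```

For real CSV output the standard-library `csv` module is usually a better fit, since it handles quoting of values that contain the delimiter.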


In [None]:
print("A", "B", "C", sep=" | ")

multi_line = """This is line 1
This is line 2
This is line 3"""
print(multi_line)

quote = "He said, \"Python is awesome!\""
print(quote)


## 4. String Mastery

- Strings are **immutable sequences** of characters: they support indexing and slicing like lists but cannot be modified in place.
- **Indexing** starts at **0**.
- **Slicing:** `[start:stop]` where `stop` is **not included**.
- **Negative indices:** count from the end, so `s[-1]` is the last character and `s[-3:]` is the last three.
- **f-strings:** embed variables using `{}`.
- Common methods:
  - `upper()`, `lower()`
  - `replace()`
  - `split()`
  - `startswith()`, `endswith()`
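As a small illustration of these points, the sketch below parses a file name into its parts and shows that string methods return new strings instead of changing the original. The file name and its `dataset_year_month` layout are assumptions made for the example.

```python
filename = "orders_2024_01.csv"

# rpartition splits on the last "." into (stem, separator, extension).
stem, _, extension = filename.rpartition(".")
dataset, year, month = stem.split("_")
print(dataset, year, month, extension)  # orders 2024 01 csv

# Slicing with positive and negative positions.
print(filename[:6])   # orders
print(filename[-4:])  # .csv

# Strings are immutable: item assignment raises TypeError.
try:
    filename[0] = "O"
except TypeError as exc:
    print("TypeError:", exc)

# Methods such as replace() return a new string; the original is unchanged.
renamed = filename.replace("orders", "sales")
print(renamed)   # sales_2024_01.csv
print(filename)  # orders_2024_01.csv
```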


In [None]:
s = "data_engineering.csv"

# Indexing
print("First char:", s[0])

# Slicing
print("Slice [0:4]:", s[0:4])  # 'data'

# Negative slicing
print("Last 3 chars:", s[-3:])

# f-string
file_type = s.split(".")[-1]
print(f"File '{s}' is of type: {file_type}")

# Common methods
print("Upper:", s.upper())
print("Replace:", s.replace("_", " "))
print("Split:", s.split("_"))
print("Starts with 'data':", s.startswith("data"))
print("Ends with '.csv':", s.endswith(".csv"))


## 5. Control Flow (Conditionals and Loops)

### Conditionals
- Use `if`, `elif`, `else` for decision-making.
- Use `==` for comparison (single `=` is assignment).

### Loops
- **For loop:** iterate over sequences or ranges.
- **While loop:** run while condition is true (be careful about infinite loops).
- **break:** exit loop immediately.
- **continue:** skip to next iteration.
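The sketch below combines `continue` and `break` in a `for` loop, and bounds a `while` loop with a counter so it cannot run forever. The file list, the `"STOP"` sentinel, and the retry limit are invented for illustration.

```python
files = ["a.csv", "notes.txt", "b.csv", "STOP", "c.csv"]
processed = []

for name in files:
    if name == "STOP":
        break  # leave the loop entirely; "c.csv" is never reached
    if not name.endswith(".csv"):
        continue  # skip this item and move on to the next one
    processed.append(name)

print(processed)  # ['a.csv', 'b.csv']

# A counter guarantees the while loop terminates.
attempts = 0
max_attempts = 3
while attempts < max_attempts:
    attempts += 1
    print("attempt", attempts)

print("total attempts:", attempts)  # total attempts: 3
```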


In [None]:
# Conditionals
x = 10
if x == 10:
    print("x is 10")
elif x > 10:
    print("x is greater than 10")
else:
    print("x is less than 10")

# For loop (common in data engineering: iterate through tables/files)
files = ["a.csv", "b.json", "c.csv", "d.parquet"]
for f in files:
    if f.endswith(".csv"):
        print("CSV file:", f)

# While loop
i = 0
while i < 5:
    if i == 2:
        i += 1
        continue  # skip 2
    if i == 4:
        break  # stop early
    print("i =", i)
    i += 1


---
## End of Notebook

You can expand this notebook by adding:
- Functions & OOP examples
- File handling (CSV/JSON)
- Exception handling (`try/except`)
- Basic pandas workflows
