# Chapter 1: Pythonic Thinking

This notebook covers essential Python concepts with comprehensive code examples and explanations.

---

## Item 1: Know Which Version of Python You're Using

### Why Version Matters

Python has two major versions: Python 2 (end-of-life January 1, 2020) and Python 3 (actively maintained). Always use Python 3 for new projects.

### Python 2 vs Python 3 Comparison Table

| Feature | Python 2 | Python 3 |
|---------|----------|----------|
| Support Status | Ended Jan 1, 2020 | Actively maintained |
| Library Support | Limited compatibility | Full compatibility |
| Migration Tools | 2to3, six | Native features |
| Future Proof | No updates | Regular improvements |
| Print Statement | `print "hello"` | `print("hello")` |
| Division | `5/2 = 2` | `5/2 = 2.5` |
| Unicode | Separate unicode type | Strings are Unicode by default |

**Recommendation**: Use Python 3 for all projects

### Checking Your Python Version

In [None]:
import sys

# Check Python version at runtime
print("Version Info:", sys.version_info)
print("Full Version:", sys.version)
print("\nMajor Version:", sys.version_info.major)
print("Minor Version:", sys.version_info.minor)
print("Micro Version:", sys.version_info.micro)

In [None]:
# Conditional code based on Python version
if sys.version_info.major >= 3:
    print("✓ You're using Python 3!")
else:
    print("⚠ Warning: Python 2 is no longer supported!")
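
When a script depends on features from a specific release, checking only the major version is often not enough. A minimal sketch of a stricter guard follows, assuming a hypothetical minimum of Python 3.6; `MIN_VERSION` and `meets_minimum` are illustrative names, not part of the standard library.

```python
import sys

# Hypothetical minimum version, chosen only for illustration.
MIN_VERSION = (3, 6)


def meets_minimum(required, current=None):
    """Return True if `current` (default: this interpreter) is new enough."""
    if current is None:
        current = sys.version_info
    # Compare only as many fields as `required` specifies.
    return tuple(current[:len(required)]) >= tuple(required)


if not meets_minimum(MIN_VERSION):
    raise RuntimeError(
        f"Python {MIN_VERSION[0]}.{MIN_VERSION[1]}+ is required, "
        f"found {sys.version_info.major}.{sys.version_info.minor}"
    )
print("Interpreter version is new enough")
```

Comparing tuples works because `sys.version_info` orders its fields from most to least significant.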

### Key Differences Example: Division Behavior

In [None]:
# Python 3 division behavior
print("Regular division (/):", 5 / 2)      # Returns float: 2.5
print("Floor division (//):", 5 // 2)     # Returns int: 2
print("Modulo (%):", 5 % 2)               # Returns remainder: 1

# Note: In Python 2, 5/2 would return 2 (integer division)
# In Python 3, you must explicitly use // for floor division
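
Floor division rounds toward negative infinity rather than toward zero, which only becomes visible with negative operands. The short sketch below uses arbitrary values to show how `//`, `int()`, and `divmod()` differ.

```python
# Floor division rounds down (toward negative infinity).
floor_positive = 7 // 2        # 3
floor_negative = -7 // 2       # -4, not -3

# int() truncates toward zero instead.
truncated = int(-7 / 2)        # -3

# divmod() returns the floor quotient and the matching remainder.
quotient, remainder = divmod(-7, 2)

print(floor_positive, floor_negative, truncated)
print(quotient, remainder)     # -4 1
```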

### Key Takeaways

- Python 3 is the most up-to-date and well-supported version
- Always verify the Python version on your system
- Avoid Python 2 as it's no longer maintained

---

## Item 2: Follow the PEP 8 Style Guide

### What is PEP 8?

PEP 8 is the official style guide for Python code. Following it makes your code:
- More readable
- Easier to maintain
- Consistent with community standards

### Whitespace Rules

| Rule | Guideline | Example |
|------|-----------|----------|
| Indentation | 4 spaces per level | `def func():` |
| Line Length | 79 characters max | Break long lines |
| Function Spacing | 2 blank lines | Between functions |
| Method Spacing | 1 blank line | Between methods |
| Assignment | Space around `=` (but not for keyword arguments or unannotated defaults) | `x = 5` |
| Operators | Space around operators | `x + y` |
| Commas | Space after comma | `[1, 2, 3]` |

In [None]:
# GOOD: Proper spacing and indentation
def calculate_area(length, width):
    """Calculate rectangle area."""
    area = length * width
    return area


def calculate_volume(length, width, height):
    """Calculate box volume."""
    volume = length * width * height
    return volume


# Test the functions
print("Area:", calculate_area(5, 3))
print("Volume:", calculate_volume(5, 3, 2))

In [None]:
# BAD: Poor spacing (for comparison - don't do this!)
def bad_function(x,y):
    result=x+y  # No spaces around operators
    return result

# GOOD: Proper spacing
def good_function(x, y):
    result = x + y  # Spaces around operators
    return result

print("Bad function:", bad_function(3, 4))
print("Good function:", good_function(3, 4))
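
One exception to the spacing rule for `=` is worth seeing in code: PEP 8 asks for no spaces around `=` in keyword arguments and unannotated default values. The function below is a hypothetical example written only to illustrate that rule.

```python
# Hypothetical function used only to illustrate the spacing rule.
def connect(host, port=8080, timeout=30):
    """Return a description of the connection settings."""
    return f"{host}:{port} (timeout={timeout}s)"


# Assignment statements take spaces around "=" ...
summary = connect("localhost", port=5432, timeout=10)
# ... but keyword arguments and unannotated defaults do not.
print(summary)
```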

### Naming Conventions

| Element | Convention | Example |
|---------|-----------|----------|
| Functions | lowercase_underscore | `calculate_total()` |
| Variables | lowercase_underscore | `user_name` |
| Classes | CapitalizedWord | `DataProcessor` |
| Constants | ALL_CAPS | `MAX_SIZE` |
| Protected | _leading_underscore | `_internal` |
| Private | __double_underscore | `__private` |

In [None]:
# Naming examples demonstrating different conventions

# Functions and variables: lowercase_underscore
def calculate_total_price(item_price, quantity):
    """Calculate total price for multiple items."""
    total_price = item_price * quantity
    return total_price


# Classes: CapitalizedWord (PascalCase)
class ShoppingCart:
    """A shopping cart that holds items."""
    
    def __init__(self):
        self._items = []          # Protected attribute
        self.__total = 0          # Private attribute
    
    def add_item(self, item, price):
        """Add an item to the cart."""
        self._items.append((item, price))
        self.__total += price
    
    def get_total(self):
        """Get the cart total."""
        return self.__total


# Constants: ALL_CAPS
MAX_CART_ITEMS = 100
DEFAULT_DISCOUNT = 0.1
SALES_TAX_RATE = 0.08

# Example usage
cart = ShoppingCart()
cart.add_item("Apple", 1.50)
cart.add_item("Banana", 0.75)

print(f"Cart total: ${cart.get_total():.2f}")
print(f"Max items allowed: {MAX_CART_ITEMS}")
print(f"Sales tax rate: {SALES_TAX_RATE * 100}%")

### More Naming Examples

In [None]:
# Example showing different naming conventions in action

# Constants (configuration values)
DATABASE_URL = "postgresql://localhost/mydb"
API_TIMEOUT = 30
MAX_RETRIES = 3

# Class with various attribute types
class UserAccount:
    """Represents a user account."""
    
    # Class variable (shared across instances)
    total_accounts = 0
    
    def __init__(self, username, email):
        # Public attributes
        self.username = username
        self.email = email
        
        # Protected attribute (internal use, but accessible)
        self._created_at = "2025-01-01"
        
        # Private attribute (name mangling applied)
        self.__password_hash = "secret_hash"
        
        UserAccount.total_accounts += 1
    
    def get_account_info(self):
        """Return account information."""
        return f"User: {self.username}, Email: {self.email}"
    
    def _internal_method(self):
        """Protected method (by convention, for internal use)."""
        return "This is for internal use"
    
    def __private_method(self):
        """Private by convention (name mangled, but still reachable)."""
        return "This is name-mangled, not truly private"


# Usage
user1 = UserAccount("john_doe", "john@example.com")
user2 = UserAccount("jane_smith", "jane@example.com")

print(user1.get_account_info())
print(f"Total accounts created: {UserAccount.total_accounts}")

# Accessing different attribute types
print(f"Public attribute: {user1.username}")
print(f"Protected attribute: {user1._created_at}")  # Accessible but not recommended
# print(user1.__password_hash)  # This would raise AttributeError!
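
Name mangling discourages outside access, but it does not prevent it. The following sketch uses a hypothetical `Account` class to show where a double-underscore attribute is actually stored.

```python
class Account:
    """Hypothetical class used only to show name mangling."""

    def __init__(self):
        self.__token = "abc123"   # Stored as _Account__token


account = Account()

# The unmangled name does not exist on the instance.
print(hasattr(account, "__token"))    # False

# The mangled name is still reachable from outside the class.
print(account._Account__token)        # abc123
print(list(vars(account)))            # ['_Account__token']
```

Because the attribute remains reachable, double underscores are best treated as a way to avoid accidental name clashes in subclasses, not as a security boundary.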

### Expressions and Statements Best Practices

In [None]:
# GOOD: Use inline negation (`is not` compares object identity)
a = [1, 2]
b = [1, 2]
if a is not b:
    print("Different objects")

# BAD: Negation of positive expression (don't do this!)
# if not a is b:
#     print("Different objects")
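
The most common place for `is` and `is not` is a comparison against `None`, which PEP 8 recommends over `==`. A small sketch, assuming a hypothetical helper named `describe`:

```python
def describe(value=None):
    """Hypothetical helper: tell 'no value' apart from falsy values."""
    if value is None:
        return "no value given"
    return f"got {value!r}"


print(describe())      # no value given
print(describe(0))     # got 0
print(describe(""))    # got ''

result = describe([])
if result is not None:   # Preferred over: if not result is None
    print(result)        # got []
```

Testing `value is None` keeps falsy but meaningful inputs such as `0` or `""` from being mistaken for a missing value.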

In [None]:
# GOOD: Check for empty containers using truthiness
my_list = []
if not my_list:
    print("List is empty")

# BAD: Checking length explicitly (unnecessarily verbose)
# if len(my_list) == 0:
#     print("List is empty")

In [None]:
# GOOD: Check for non-empty containers
my_list = [1, 2, 3]
if my_list:
    print("List has items")
    print(f"Items: {my_list}")

# Demonstrating truthiness with different container types
empty_dict = {}
filled_dict = {"key": "value"}
empty_string = ""
filled_string = "hello"

print(f"Empty dict is falsy: {not empty_dict}")
print(f"Filled dict is truthy: {bool(filled_dict)}")
print(f"Empty string is falsy: {not empty_string}")
print(f"Filled string is truthy: {bool(filled_string)}")

### Import Organization

**Order of imports:**
1. Standard library modules
2. Third-party modules
3. Your own modules

Each section should be in alphabetical order and separated by a blank line.

In [None]:
# GOOD import organization

# Standard library imports (alphabetical)
import os
import sys
from collections import defaultdict
from datetime import datetime

# Third-party imports (would go here if available)
# import numpy as np
# import pandas as pd
# import requests

# Your own module imports
# from mypackage import mymodule
# from myproject.utils import helper_function

print("Imports organized correctly!")
print(f"Current directory: {os.getcwd()}")
print(f"Python version: {sys.version_info.major}.{sys.version_info.minor}")

### Line Length and Continuation

In [None]:
# GOOD: Breaking long lines with parentheses (preferred)
long_variable_name = (
    "This is a very long string that would exceed "
    "the 79 character limit if written on one line"
)

# GOOD: Function calls with many arguments
def complex_function(arg1, arg2, arg3, arg4, arg5):
    return arg1 + arg2 + arg3 + arg4 + arg5

result = complex_function(
    arg1=10,
    arg2=20,
    arg3=30,
    arg4=40,
    arg5=50
)

print(long_variable_name)
print(f"Result: {result}")

In [None]:
# GOOD: List comprehensions and long expressions
long_list_comp = [
    item * 2
    for item in range(10)
    if item % 2 == 0
]

# GOOD: Dictionary with many key-value pairs
configuration = {
    "database_url": "postgresql://localhost/mydb",
    "api_key": "your_api_key_here",
    "timeout": 30,
    "max_retries": 3,
    "debug_mode": False,
}

print(f"Even numbers doubled: {long_list_comp}")
print(f"Config keys: {list(configuration.keys())}")

### Key Takeaways

- Always follow PEP 8 style guide
- Consistent style facilitates collaboration
- Use a linter such as Pylint to flag style violations and a formatter such as Black to apply consistent formatting automatically
- Well-formatted code is easier to maintain
- Readability counts!

---

## Item 3: Know the Differences Between bytes and str

### bytes vs str Overview

| Type | Contains | Example | Usage |
|------|----------|---------|-------|
| bytes | Raw 8-bit values | `b'hello'` | Binary data, files |
| str | Unicode code points | `'hello'` | Text data |
| Conversion | `.encode()` / `.decode()` | - | Between types |

### Understanding bytes

In [None]:
# bytes contain raw 8-bit values
a = b'h\x65llo'  # \x65 is hexadecimal for 'e'
print("Bytes as list:", list(a))  # Shows numeric values
print("Bytes repr:", a)           # Shows bytes representation

# Each element is an integer (0-255)
for byte in a:
    print(f"Byte value: {byte}, Character: {chr(byte)}")

In [None]:
# More bytes examples
binary_data = bytes([72, 101, 108, 108, 111])  # ASCII values for "Hello"
print("From integers:", binary_data)

# Creating bytes from a bytes literal
text_bytes = b'Python 3'
print("Text bytes:", text_bytes)
print("Length:", len(text_bytes))
print("First byte:", text_bytes[0])  # Returns integer

### Understanding str

In [None]:
# str contains Unicode code points
a = 'a\u0300 propos'  # \u0300 is a combining grave accent
print("String as list:", list(a))  # Shows individual characters
print("String repr:", a)           # Shows the rendered string

# Unicode examples
unicode_string = 'Hello 世界 🌍'  # Mixed scripts and emoji
print("\nUnicode string:", unicode_string)
print("Length (code points):", len(unicode_string))
for char in unicode_string:
    print(f"Character: '{char}', Unicode: U+{ord(char):04X}")
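
Because `str` stores code points, two strings that render identically can still compare unequal. The sketch below uses the standard `unicodedata` module to normalize the combining-accent string from the cell above; the variable names are illustrative.

```python
import unicodedata

decomposed = 'a\u0300 propos'   # 'a' followed by a combining grave accent
composed = unicodedata.normalize('NFC', decomposed)

print(len(decomposed), len(composed))   # 9 8
print(decomposed == composed)           # False
print(composed == '\u00e0 propos')      # True

# NFD splits the precomposed character back into base + combining mark.
round_trip = unicodedata.normalize('NFD', composed)
print(round_trip == decomposed)         # True
```

Normalizing both sides to the same form before comparing is one way to avoid surprises with user-supplied text.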

### Converting Between Types

In [None]:
# str to bytes: use encode()
text = 'hello'
data = text.encode('utf-8')
print(f"String: {text!r}")
print(f"Bytes: {data!r}")
print(f"Type: {type(data)}")

# bytes to str: use decode()
decoded = data.decode('utf-8')
print(f"\nDecoded: {decoded!r}")
print(f"Type: {type(decoded)}")

# Verify they're equal
assert text == decoded
print("\n✓ Encoding and decoding are symmetric")

In [None]:
# Helper functions to ensure correct types
def to_str(bytes_or_str):
    """Convert bytes or str to str."""
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of str


def to_bytes(bytes_or_str):
    """Convert bytes or str to bytes."""
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value  # Instance of bytes


# Test the helper functions
print("to_str tests:")
print(repr(to_str(b'foo')))
print(repr(to_str('bar')))

print("\nto_bytes tests:")
print(repr(to_bytes(b'foo')))
print(repr(to_bytes('bar')))

### Different Encodings Example

In [None]:
# Demonstrating different encodings
text = 'こんにちは'  # "Hello" in Japanese

# UTF-8 encoding (variable-length, 1-4 bytes)
utf8_bytes = text.encode('utf-8')
print(f"UTF-8: {utf8_bytes}")
print(f"UTF-8 length: {len(utf8_bytes)} bytes")

# UTF-16 encoding (variable-length, 2 or 4 bytes per code point;
# the 'utf-16' codec also prepends a 2-byte byte order mark)
utf16_bytes = text.encode('utf-16')
print(f"\nUTF-16: {utf16_bytes}")
print(f"UTF-16 length: {len(utf16_bytes)} bytes")

# ASCII would fail for non-ASCII characters
try:
    ascii_bytes = text.encode('ascii')
except UnicodeEncodeError as e:
    print(f"\nASCII encoding failed: {e}")

# But ASCII works for simple English text
english = "Hello"
ascii_bytes = english.encode('ascii')
print(f"\nASCII works for English: {ascii_bytes}")
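
The byte counts above are easier to interpret once the byte order mark is accounted for. A brief sketch comparing the `utf-16` codec with the explicit `utf-16-le` variant:

```python
text = 'こんにちは'   # 5 code points

utf8 = text.encode('utf-8')
utf16 = text.encode('utf-16')        # Includes a 2-byte byte order mark
utf16_le = text.encode('utf-16-le')  # Explicit byte order, no BOM

print(len(utf8), len(utf16), len(utf16_le))   # 15 12 10

# Code points outside the Basic Multilingual Plane need 4 bytes in UTF-16.
globe_utf16 = '🌍'.encode('utf-16-le')
globe_utf8 = '🌍'.encode('utf-8')
print(len(globe_utf16), len(globe_utf8))      # 4 4
```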

### Common Gotchas: Incompatibility Between bytes and str

In [None]:
# GOTCHA 1: Can't concatenate bytes and str
print("✓ bytes + bytes:", b'one' + b'two')
print("✓ str + str:", 'one' + 'two')

try:
    result = b'one' + 'two'
except TypeError as e:
    print(f"\n✗ bytes + str fails: {e}")

try:
    result = 'one' + b'two'
except TypeError as e:
    print(f"✗ str + bytes fails: {e}")

In [None]:
# GOTCHA 2: Can't compare bytes and str
print("✓ bytes comparison:", b'red' > b'blue')
print("✓ str comparison:", 'red' > 'blue')

try:
    result = 'red' > b'blue'
except TypeError as e:
    print(f"\n✗ str > bytes fails: {e}")

# Equality always returns False (no exception)
print(f"\n✗ bytes == str: {b'foo' == 'foo'}  (always False!)")

In [None]:
# GOTCHA 3: Format strings behave differently
print("✓ bytes formatting:", b'red %s' % b'blue')
print("✓ str formatting:", 'red %s' % 'blue')

try:
    result = b'red %s' % 'blue'
except TypeError as e:
    print(f"\n✗ bytes format with str: {e}")

# Formatting bytes into a str with %s inserts the repr of the bytes object
result = 'red %s' % b'blue'
print(f"\n⚠ str format with bytes: {result!r}  (includes b'' prefix!)")

### File Operations with bytes and str

In [None]:
import os
import tempfile

# Create a temporary directory for our examples
temp_dir = tempfile.mkdtemp()
binary_file = os.path.join(temp_dir, 'data.bin')
text_file = os.path.join(temp_dir, 'data.txt')

# Writing binary data (mode 'wb')
with open(binary_file, 'wb') as f:
    f.write(b'\xf1\xf2\xf3\xf4\xf5')

print("✓ Binary file written successfully")

# Reading binary data (mode 'rb')
with open(binary_file, 'rb') as f:
    data = f.read()

print(f"✓ Binary data read: {data}")
assert data == b'\xf1\xf2\xf3\xf4\xf5'

In [None]:
# Writing text with specific encoding
with open(text_file, 'w', encoding='utf-8') as f:
    f.write('Hello 世界')

print("✓ Text file written with UTF-8 encoding")

# Reading text with specific encoding
with open(text_file, 'r', encoding='utf-8') as f:
    text = f.read()

print(f"✓ Text data read: {text}")

# Cleanup
import shutil
shutil.rmtree(temp_dir)
print("✓ Temporary files cleaned up")
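
To see why the file mode and encoding matter, it helps to open binary data in text mode on purpose. The sketch below is self-contained and uses its own temporary directory; the file name is arbitrary.

```python
import os
import tempfile

error_message = None

with tempfile.TemporaryDirectory() as temp_dir:
    path = os.path.join(temp_dir, 'data.bin')
    with open(path, 'wb') as f:
        f.write(b'\xf1\xf2\xf3\xf4\xf5')

    # Text mode tries to decode the bytes and fails for invalid UTF-8.
    try:
        with open(path, 'r', encoding='utf-8') as f:
            f.read()
    except UnicodeDecodeError as e:
        error_message = str(e)
        print(f"Text mode failed: {error_message}")

    # The same bytes are valid in a single-byte encoding such as cp1252.
    with open(path, 'r', encoding='cp1252') as f:
        as_cp1252 = f.read()
    print(as_cp1252)   # ñòóôõ
```

The same bytes produce either an error or plausible-looking text depending on the encoding, which is why passing `encoding` explicitly is safer than relying on the platform default.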

### The Unicode Sandwich Pattern

In [None]:
# Unicode Sandwich: Decode at boundaries, use str internally, encode at boundaries

def process_data(input_bytes):
    """
    Demonstrates the Unicode Sandwich pattern.
    
    1. Decode bytes to str at input boundary
    2. Process as str internally
    3. Encode str to bytes at output boundary
    """
    # Decode at input boundary
    text = input_bytes.decode('utf-8')
    print(f"1. Decoded input: {text!r}")
    
    # Process as str (all string operations)
    text = text.upper()
    text = text.replace('HELLO', 'HI')
    print(f"2. Processed: {text!r}")
    
    # Encode at output boundary
    output_bytes = text.encode('utf-8')
    print(f"3. Encoded output: {output_bytes!r}")
    
    return output_bytes


# Example usage
input_data = b'hello world'
output_data = process_data(input_data)
print(f"\nFinal result: {output_data}")
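
The decode step at the input boundary is also where malformed data shows up. A minimal sketch of the options, assuming input that was actually encoded as Latin-1:

```python
raw = b'caf\xe9'   # "café" encoded as Latin-1, not valid UTF-8

strict_failed = False
try:
    raw.decode('utf-8')
except UnicodeDecodeError as e:
    strict_failed = True
    print(f"Strict decoding failed: {e}")

# errors='replace' substitutes U+FFFD for undecodable bytes.
lossy = raw.decode('utf-8', errors='replace')
print(repr(lossy))

# If the real encoding is known, decoding with it avoids data loss.
exact = raw.decode('latin-1')
print(exact)   # café
```

Whether to fail, replace, or use a different codec depends on the application; the point is that the decision is made once, at the boundary.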

### Key Takeaways

- `bytes` contains sequences of 8-bit values, `str` contains Unicode code points
- Use helper functions to ensure correct types
- `bytes` and `str` can't be mixed with operators like `>`, `+`, and `%` (these raise `TypeError`), and `==` between them always evaluates to `False`
- Always use binary mode (`'rb'` or `'wb'`) for binary data
- Always specify encoding when reading/writing text files
- Follow the Unicode Sandwich pattern: decode early, process as str, encode late

---

## Item 4: Prefer Interpolated F-Strings Over C-style Format Strings and str.format

### Evolution of String Formatting in Python

| Method | Introduced | Status | Recommendation |
|--------|-----------|--------|----------------|
| C-style (%) | Python 1.0 | Legacy | Avoid |
| str.format() | Python 2.6 / 3.0 | Verbose | Avoid |
| F-strings | Python 3.6 | Modern | Use this |

### Problem 1: Type Conversion Errors with C-Style Formatting

In [None]:
# C-style formatting example
key = 'my_var'
value = 1.234

# This works
formatted = '%-10s = %.2f' % (key, value)
print(f"Correct order: {formatted}")

# But swapping the order causes an error
try:
    wrong_order = '%-10s = %.2f' % (value, key)
except TypeError as e:
    print(f"\nError with swapped values: {e}")

# The format string also needs to match
try:
    wrong_format = '%.2f = %-10s' % (key, value)
except TypeError as e:
    print(f"Error with swapped format: {e}")

### Problem 2: Readability with Complex Modifications

In [None]:
# Pantry inventory example
pantry = [
    ('avocados', 1.25),
    ('bananas', 2.5),
    ('cherries', 15),
]

# Simple version (without modifications)
print("Simple formatting:")
for i, (item, count) in enumerate(pantry):
    print('#%d: %-10s = %.2f' % (i, item, count))

In [None]:
# Complex version (with inline modifications)
print("\nWith modifications (harder to read):")
for i, (item, count) in enumerate(pantry):
    print('#%d: %-10s = %d' % (
        i + 1,              # Add 1 to index
        item.title(),       # Capitalize item name
        round(count)))      # Round the count

# The tuple becomes very long and splits across multiple lines
print("\n✗ This style is hard to read!")

### Problem 3: Repetition with Multiple References

In [None]:
# Using the same value multiple times
template = '%s loves food. See %s cook.'
name = 'Max'

# Must repeat the value
formatted = template % (name, name)
print(formatted)

# Easy to forget or make mistakes
name = 'brad'
formatted = template % (name.title(), name.title())  # Error-prone!
print(formatted)

### Dictionary Formatting (Partial Solution)

In [None]:
# Dictionary formatting reduces some problems
key = 'my_var'
value = 1.234

# Can swap order in the dictionary without errors
old_way = '%-10s = %.2f' % (key, value)
new_way = '%(key)-10s = %(value).2f' % {'key': key, 'value': value}
reordered = '%(key)-10s = %(value).2f' % {'value': value, 'key': key}

assert old_way == new_way == reordered
print(f"All equal: {old_way}")

In [None]:
# Solves repetition problem
template = '%(name)s loves food. See %(name)s cook.'
formatted = template % {'name': 'Max'}
print(formatted)

# But introduces verbosity
soup = 'lentil'
formatted = "Today's soup is %(soup)s." % {'soup': soup}
print(formatted)
print("\n✗ Very verbose with redundant keys!")

### The format() Built-in and the str.format() Method

In [None]:
# The format() built-in function
a = 1234.5678
formatted = format(a, ',.2f')
print(f"Formatted number: {formatted}")

b = 'my string'
formatted = format(b, '^20s')
print(f"Centered: '*{formatted}*'")

In [None]:
# The str.format() method
key = 'my_var'
value = 1.234

# Basic usage with positional arguments
formatted = '{} = {}'.format(key, value)
print(formatted)

# With format specifiers
formatted = '{:<10} = {:.2f}'.format(key, value)
print(formatted)

# Can reference positions multiple times
formatted = '{0} loves food. See {0} cook.'.format('Max')
print(formatted)

### F-Strings: The Modern Solution

In [None]:
# F-strings are concise and readable
key = 'my_var'
value = 1.234

formatted = f'{key} = {value}'
print(formatted)

# With format specifiers
formatted = f'{key!r:<10} = {value:.2f}'
print(formatted)

# Comparison of all methods
f_string  = f'{key:<10} = {value:.2f}'
c_tuple   = '%-10s = %.2f' % (key, value)
str_args  = '{:<10} = {:.2f}'.format(key, value)
str_kw    = '{key:<10} = {value:.2f}'.format(key=key, value=value)
c_dict    = '%(key)-10s = %(value).2f' % {'key': key, 'value': value}

assert c_tuple == c_dict == f_string == str_args == str_kw
print(f"\n✓ F-string is the shortest: {f_string}")

### F-Strings with Expressions

In [None]:
# F-strings allow full Python expressions
pantry = [
    ('avocados', 1.25),
    ('bananas', 2.5),
    ('cherries', 15),
]

print("F-string with inline expressions:")
for i, (item, count) in enumerate(pantry):
    # All modifications inline - clear and concise!
    print(f'#{i+1}: {item.title():<10s} = {round(count)}')

In [None]:
# Complex expressions in f-strings
import math

x = 5
y = 10

# Mathematical operations
print(f'Sum: {x + y}')
print(f'Product: {x * y}')
print(f'Square root of {x}: {math.sqrt(x):.2f}')

# Conditional expressions
print(f'x is {"even" if x % 2 == 0 else "odd"}')

# Method calls and list operations
words = ['hello', 'world']
print(f'Joined: {", ".join(words)}')
print(f'Uppercase: {[w.upper() for w in words]}')

### Common Format Specifiers

| Specifier | Description | Example | Output |
|-----------|-------------|---------|--------|
| `:.2f` | Float with 2 decimal places | `f'{3.14159:.2f}'` | `'3.14'` |
| `:<10` | Left align, width 10 | `f'{"hi":<10}'` | `'hi        '` |
| `:>10` | Right align, width 10 | `f'{"hi":>10}'` | `'        hi'` |
| `:^10` | Center align, width 10 | `f'{"hi":^10}'` | `'    hi    '` |
| `:,` | Thousands separator | `f'{1234567:,}'` | `'1,234,567'` |
| `!r` | Apply `repr()` | `f'{"hi"!r}'` | `"'hi'"` |
| `!s` | Apply `str()` | `f'{obj!s}'` | `str(obj)` |
| `!a` | Apply `ascii()` | `f'{obj!a}'` | `ascii(obj)` |

In [None]:
# Format specifier examples
value = 1234.5678
text = "hello"

print("Number formatting:")
print(f"2 decimals: {value:.2f}")
print(f"Thousands: {value:,.2f}")
print(f"Percentage: {0.1234:.1%}")
print(f"Scientific: {value:.2e}")

print("\nString formatting:")
print(f"Left align:  '{text:<10}'")
print(f"Right align: '{text:>10}'")
print(f"Center:      '{text:^10}'")
print(f"Repr:        {text!r}")
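
The same mini-language covers integers and dates as well as floats and strings. The sketch below uses arbitrary sample values to show a few more specifiers.

```python
from datetime import date

count = 255

padded = f"{count:05d}"        # Zero-padded to width 5
hex_plain = f"{count:x}"       # Lowercase hexadecimal
hex_prefixed = f"{count:#x}"   # Hexadecimal with 0x prefix
binary = f"{count:08b}"        # Binary, zero-padded to width 8
signed = f"{count:+d}"         # Always show the sign
grouped = f"{1234567:_}"       # Underscore as digit separator

print(padded, hex_plain, hex_prefixed, binary, signed, grouped)

# Objects can define their own format codes; dates accept strftime codes.
day = date(2025, 1, 1)
iso_day = f"{day:%Y-%m-%d}"
print(iso_day)   # 2025-01-01
```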

### Dynamic Format Specifiers

In [None]:
# Format specifiers can be variables
places = 3
number = 1.23456

print(f'My number is {number:.{places}f}')

# Dynamic width
width = 15
text = "Python"
print(f'Centered: |{text:^{width}}|')

# Both dynamic
precision = 4
field_width = 12
value = 123.456789
print(f'Custom: |{value:{field_width}.{precision}f}|')

### Key Takeaways

- C-style format strings suffer from gotchas and verbosity
- str.format() is better but still repetitive
- F-strings are succinct, powerful, and Pythonic
- F-strings allow arbitrary Python expressions
- Always prefer f-strings for string formatting in modern Python