# <font color="#418FDE" size="6.5" uppercase>**Pythonic Solutions**</font>

>Last update: 20260102.
    
By the end of this lecture, you will be able to:
- Refactor algorithm implementations to use idiomatic Python constructs without sacrificing clarity or performance. 
- Add effective tests and simple benchmarks to validate correctness and performance of algorithmic code. 
- Evaluate and improve the readability and structure of Python algorithm implementations. 


## **1. Pythonic Algorithm Patterns**

### **1.1. Safe Comprehension Patterns**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_01_01.jpg?v=1767350711" width="250">



>* Use comprehensions for clear, simple data pipelines
>* Each clause should mirror an obvious standalone step

>* Avoid overly complex comprehensions with tangled logic
>* Keep heavy branching in helpers, preserving readability

>* Use generators to save memory on large datasets
>* Prefer simple, profiled comprehensions over excessive laziness
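
The advice to keep heavy branching in helpers can be sketched as follows. This is a minimal illustration, not a prescribed pattern: the readings, the `None` marker for missing samples, and the helper name `classify_reading` are all assumptions invented for the example.

```python
# Hypothetical sensor readings in Fahrenheit; None marks a missing sample.
readings_fahrenheit = [72.0, None, -999.0, 98.6, 250.0, 31.0]

# Tangled version: nested conditional expressions hide three separate decisions.
labels_tangled = [
    "missing" if r is None
    else ("invalid" if r < -40.0 or r > 212.0 else ("freezing" if r <= 32.0 else "ok"))
    for r in readings_fahrenheit
]


def classify_reading(reading):
    """Return a label for one reading; each branch is one obvious standalone step."""
    if reading is None:
        return "missing"
    if not -40.0 <= reading <= 212.0:
        return "invalid"
    if reading <= 32.0:
        return "freezing"
    return "ok"


# Clear version: the comprehension only maps, and the helper owns the branching.
labels_clear = [classify_reading(r) for r in readings_fahrenheit]

print(labels_clear)
print("Same labels from both versions:", labels_tangled == labels_clear)
```

Both versions produce the same labels, but only the helper can be read, named, and tested on its own.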



In [None]:
#@title Python Code - Safe Comprehension Patterns

# Demonstrate safe comprehension patterns with clear filtering and transformation steps.
# Compare comprehension with explicit loop for readability and correctness.
# Show when to switch from list comprehension to generator expression safely.

# No pip install is needed; the standard library is sufficient.

# Define raw sensor readings in Fahrenheit degrees with some invalid values.
raw_readings_fahrenheit = [72.0, 75.5, -999.0, 120.0, 2000.0, 68.0]

# Define a helper function that validates realistic Fahrenheit temperature ranges.
def is_valid_fahrenheit(reading):
    return -40.0 <= reading <= 212.0

# Define a helper function that converts Fahrenheit readings to Celsius degrees safely.
def to_celsius(reading):
    return (reading - 32.0) * 5.0 / 9.0

# Use a clear list comprehension for filtering and transforming valid readings.
clean_celsius_list = [to_celsius(r) for r in raw_readings_fahrenheit if is_valid_fahrenheit(r)]

# Use an equivalent explicit loop to highlight the same safe transformation steps.
clean_celsius_loop = []
for reading in raw_readings_fahrenheit:
    if is_valid_fahrenheit(reading):
        clean_celsius_loop.append(to_celsius(reading))

# Use a generator expression for one-pass processing without storing every converted value.
clean_celsius_generator = (to_celsius(r) for r in raw_readings_fahrenheit if is_valid_fahrenheit(r))

# Print both list results to confirm comprehension and loop produce identical outputs.
print("List from comprehension and loop:", clean_celsius_list, clean_celsius_loop)

# Print the average by summing the generator in one pass; the count comes from the list.
print("Average Celsius from generator:", sum(clean_celsius_generator) / len(clean_celsius_list))



### **1.2. Itertools and Collections Essentials**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_01_02.jpg?v=1767350729" width="250">



>* Use itertools and collections instead of custom loops
>* Match problems to counting, grouping, combining tools

>* Use iterator tools for memory-efficient data pipelines
>* Lazy transformations keep code clear and performant

>* Use specialized containers for counting and grouping
>* They reduce boilerplate, bugs, and clarify intent
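
A few more containers and iterator tools cover the grouping, counting, and windowing cases mentioned above. The sketch below assumes a made-up list of `(user, miles)` trip records; only `defaultdict`, `Counter`, `deque`, and `islice` from the standard library are used.

```python
from collections import Counter, defaultdict, deque
from itertools import islice

# Hypothetical (user, miles) trip records, invented for illustration.
trips = [
    ("alice", 12.0), ("bob", 7.5), ("alice", 5.0),
    ("carol", 3.0), ("bob", 2.5), ("alice", 1.5),
]

# Grouping: defaultdict(list) removes the "if key not in dict" boilerplate.
miles_by_user = defaultdict(list)
for user, miles in trips:
    miles_by_user[user].append(miles)

# Counting: most_common(1) returns the single most frequent key with its count.
top_user, top_count = Counter(user for user, _ in trips).most_common(1)[0]

# Windowing: deque(maxlen=3) silently drops the oldest value once it is full.
recent_miles = deque(maxlen=3)
for _, miles in trips:
    recent_miles.append(miles)

# Lazy slicing: islice yields the first two records without slicing a copy.
first_two = list(islice(trips, 2))

print("Miles by user:", dict(miles_by_user))
print("Most active user:", top_user, "with", top_count, "trips")
print("Three most recent distances:", list(recent_miles))
print("First two records:", first_two)
```

Each tool replaces a small hand-written loop whose bookkeeping (missing keys, manual counters, trimming a list) is a common source of bugs.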



In [None]:
#@title Python Code - Itertools and Collections Essentials

# Demonstrate itertools and collections for simple log style data processing.
# Show counting, grouping, and merging using standard library tools.
# Keep everything beginner friendly and clearly commented.
# No pip install is needed; only the standard library is used.

# Import itertools and collections modules from Python standard library.
from itertools import groupby, chain

# Import Counter and namedtuple for counting and structured records.
from collections import Counter, namedtuple

# Define a simple Event record with user, day, and miles driven fields.
Event = namedtuple("Event", ["user", "day", "miles"])

# Create two small sorted event streams representing separate log files.
stream_one = [
    Event("alice", "2024-01-01", 12.0),
    Event("bob", "2024-01-01", 7.5),
]

# Create another sorted stream with additional driving events.
stream_two = [
    Event("alice", "2024-01-02", 5.0),
    Event("carol", "2024-01-01", 3.0),
]

# Chain both streams lazily with itertools.chain (concatenation without copying, not a sorted merge).
merged_stream = chain(stream_one, stream_two)

# Build a Counter of how many events each user has generated.
user_event_counts = Counter(event.user for event in merged_stream)

# Print the per user event counts in a readable format.
print("Events per user:")
print(user_event_counts)

# Recreate merged_stream because previous iteration exhausted the iterator.
merged_stream = chain(stream_one, stream_two)

# Sort merged events by day to prepare for grouping by date.
sorted_by_day = sorted(merged_stream, key=lambda event: event.day)

# Group events by day using itertools.groupby for daily summaries.
print("\nMiles driven per day:")
for day, day_events in groupby(sorted_by_day, key=lambda event: event.day):

    # Sum miles for each grouped day using a generator expression.
    total_miles = sum(event.miles for event in day_events)

    # Print the day and the total miles driven that day.
    print(day, "total miles:", total_miles)

# Finally print a short confirmation that processing has completed successfully.
print("\nLog processing complete using itertools and collections.")



### **1.3. Pragmatic Performance Choices**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_01_03.jpg?v=1767350759" width="250">



>* Balance concise Python style with real workloads
>* Consider data scale, loops, and system constraints

>* Optimize tight, high-volume code paths carefully
>* Elsewhere, favor clarity, maintainability, and readability

>* Prefer built-ins and libraries for heavy work
>* Restructure or limit computation when constraints appear
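
One common way to "restructure" a hot path is to change the data structure rather than micro-tune the loop. The sketch below is an assumed example, not taken from the lecture: it counts how many candidate IDs appear in a collection of known IDs, first with a list lookup and then with a set built once up front. Function names and sizes are hypothetical.

```python
import time


def count_known_ids_list(candidate_ids, known_ids):
    """Count candidates found in known_ids; each lookup scans the list linearly."""
    return sum(1 for candidate in candidate_ids if candidate in known_ids)


def count_known_ids_set(candidate_ids, known_ids):
    """Same count, but a set built once makes each lookup O(1) on average."""
    known_set = set(known_ids)
    return sum(1 for candidate in candidate_ids if candidate in known_set)


# Illustrative sizes: 2,000 known even IDs and 3,000 candidates.
known_ids = list(range(0, 4_000, 2))
candidate_ids = list(range(3_000))

match_counts = {}
for func in (count_known_ids_list, count_known_ids_set):
    start = time.perf_counter()
    match_counts[func.__name__] = func(candidate_ids, known_ids)
    elapsed = time.perf_counter() - start
    print(func.__name__, "matches:", match_counts[func.__name__], "seconds:", round(elapsed, 6))
```

Both functions return the same count. The timings depend on the machine, so treat a single run as a hint and measure properly before drawing conclusions.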



In [None]:
#@title Python Code - Pragmatic Performance Choices

# Demonstrate pragmatic performance choices with simple list processing example.
# Compare naive list building versus generator based streaming approach.
# Show timing differences for critical path operations clearly.
# No pip install is needed; only the standard library is used.

# Import required standard library modules for timing measurements.
import time

# Define a function using list comprehension building full list eagerly.
def sum_squares_list(limit):
    # Build full list then sum squares of distances in feet.
    distances_feet = [i * 3 for i in range(limit)]
    return sum(d * d for d in distances_feet)

# Define a function using generator expression streaming values lazily.
def sum_squares_generator(limit):
    # Stream distances in feet without storing full list in memory.
    return sum((i * 3) * (i * 3) for i in range(limit))

# Define a helper function measuring execution time for provided function.
def measure_time(func, limit):
    # Record start time using high resolution performance counter.
    start = time.perf_counter()
    result = func(limit)
    # Record end time and compute elapsed seconds for function call.
    elapsed = time.perf_counter() - start
    return result, elapsed

# Choose small and large sizes representing different workload scales.
small_limit = 10_000
large_limit = 2_000_000

# Measure list based approach for small workload size.
small_list_result, small_list_time = measure_time(sum_squares_list, small_limit)

# Measure generator based approach for small workload size.
small_gen_result, small_gen_time = measure_time(sum_squares_generator, small_limit)

# Measure list based approach for large workload size.
large_list_result, large_list_time = measure_time(sum_squares_list, large_limit)

# Measure generator based approach for large workload size.
large_gen_result, large_gen_time = measure_time(sum_squares_generator, large_limit)

# Print summary showing correctness agreement between both approaches.
print("Results equal small:", small_list_result == small_gen_result)

# Print summary showing correctness agreement for large workload size.
print("Results equal large:", large_list_result == large_gen_result)

# Print timing comparison for small workload where clarity usually dominates.
print("Small list seconds:", round(small_list_time, 6))

# Print timing comparison for small workload generator based streaming.
print("Small generator seconds:", round(small_gen_time, 6))

# Print timing comparison for large workload list based approach.
print("Large list seconds:", round(large_list_time, 6))

# Print timing comparison for large workload generator based approach.
print("Large generator seconds:", round(large_gen_time, 6))

# Print simple conclusion highlighting which approach wins for large workload.
print("Faster large workload:", "list" if large_list_time < large_gen_time else "generator")



## **2. Testing and Timing**

### **2.1. pytest for algorithms**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_02_01.jpg?v=1767350781" width="250">



>* Use pytest for small, focused algorithm tests
>* Cover core behavior, edge cases, and invalid inputs

>* Parametrized tests explore many input cases efficiently
>* They keep algorithm tests compact, readable, extendable

>* Use pytest to test robustness and failures
>* Build a living specification covering normal and edge cases
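
The sketch below shows how such tests might look as plain `assert`-based functions, which pytest collects when their names start with `test_`. It stays standard-library only so it runs anywhere. `max_subarray_sum_checked` is a hypothetical variant of Kadane's algorithm with an added guard for empty input, and the case table is an assumption made for illustration.

```python
def max_subarray_sum_checked(numbers):
    """Kadane's algorithm with an explicit guard for empty input (hypothetical variant)."""
    if not numbers:
        raise ValueError("numbers must contain at least one value")
    best_ending_here = best_overall = numbers[0]
    for value in numbers[1:]:
        best_ending_here = max(value, best_ending_here + value)
        best_overall = max(best_overall, best_ending_here)
    return best_overall


# (input, expected) pairs; with pytest installed these could feed pytest.mark.parametrize.
CASES = [
    ([1, -2, 3, 4, -1], 7),   # typical mixed input
    ([-5, -2, -3], -2),       # all negative: best is the largest single value
    ([4, -1, 2, 1], 6),       # whole list is the best subarray
    ([42], 42),               # single-element edge case
]


def test_max_subarray_sum_known_cases():
    for numbers, expected in CASES:
        assert max_subarray_sum_checked(numbers) == expected, numbers


def test_max_subarray_sum_rejects_empty_input():
    # With pytest installed, `with pytest.raises(ValueError):` states this more directly.
    try:
        max_subarray_sum_checked([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")


# pytest would discover and run the test_* functions; here they are called directly.
test_max_subarray_sum_known_cases()
test_max_subarray_sum_rejects_empty_input()
print("All checks passed.")
```

Unlike a helper that prints PASS or FAIL, a failing `assert` stops the run with a traceback, which is what lets a test runner report the failure.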



In [None]:
#@title Python Code - pytest for algorithms

# Demonstrate simple pytest style tests for algorithmic functions.
# Show parametrized style checks without external pytest dependency.
# Run tests manually and display clear pass or fail messages.

# In a real project you would install pytest with: pip install pytest.

# Define a simple algorithmic function for Fahrenheit to Celsius conversion.
def fahrenheit_to_celsius(temperature_fahrenheit_value):
    return (temperature_fahrenheit_value - 32.0) * 5.0 / 9.0

# Define another algorithmic function for maximum subarray sum calculation.
def max_subarray_sum(number_list_input_values):
    best_sum_current_value = number_list_input_values[0]
    best_sum_global_value = number_list_input_values[0]
    for current_number_value in number_list_input_values[1:]:
        best_sum_current_value = max(current_number_value, best_sum_current_value + current_number_value)
        best_sum_global_value = max(best_sum_global_value, best_sum_current_value)
    return best_sum_global_value

# Define helper function that mimics simple pytest style assertion behavior.
def check_equal_values(received_value_result, expected_value_result, message_label_text):
    if received_value_result == expected_value_result:
        print("PASS:", message_label_text, "received", received_value_result)
    else:
        print("FAIL:", message_label_text, "received", received_value_result, "expected", expected_value_result)

# Define parametrized style tests for Fahrenheit conversion algorithmic function.
def test_fahrenheit_to_celsius_parametrized_cases():
    test_cases_temperature_values = [(32.0, 0.0), (212.0, 100.0), (68.0, 20.0)]
    for input_fahrenheit_value, expected_celsius_value in test_cases_temperature_values:
        label_text_message = f"fahrenheit_to_celsius({input_fahrenheit_value}) correctness check"
        result_celsius_value = round(fahrenheit_to_celsius(input_fahrenheit_value), 2)
        check_equal_values(result_celsius_value, expected_celsius_value, label_text_message)

# Define parametrized style tests for maximum subarray sum algorithmic function.
def test_max_subarray_sum_parametrized_cases():
    test_cases_subarrays_values = [([1, -2, 3, 4, -1], 7), ([-5, -2, -3], -2), ([4, -1, 2, 1], 6)]
    for input_array_values, expected_sum_value in test_cases_subarrays_values:
        label_text_message = f"max_subarray_sum({input_array_values}) correctness check"
        result_sum_value = max_subarray_sum(input_array_values)
        check_equal_values(result_sum_value, expected_sum_value, label_text_message)

# Run both parametrized style test groups and display a completion message.
if __name__ == "__main__":
    test_fahrenheit_to_celsius_parametrized_cases()
    test_max_subarray_sum_parametrized_cases()
    print("All algorithmic style tests finished running; check for FAIL lines above.")



### **2.2. Timing with timeit**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_02_02.jpg?v=1767350798" width="250">



>* Use timeit to measure code speed reliably
>* Run snippets repeatedly for stable, comparable timings

>* Design benchmarks that mirror real-world workloads
>* Repeat timings and separate setup to reduce noise

>* Use timeit results to balance speed and readability
>* Build small benchmark suites to monitor algorithm performance
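
Separating setup from the timed statement can be sketched with `timeit.repeat` and its `setup` and `globals` parameters. The two functions and the sizes below are assumptions chosen for illustration. The input list is built in `setup`, so its construction cost is excluded from the measurement.

```python
import timeit


def squares_with_loop(values):
    squares = []
    for value in values:
        squares.append(value * value)
    return squares


def squares_with_comprehension(values):
    return [value * value for value in values]


COUNT = 1_000    # workload size, an arbitrary illustrative value
REPEATS = 5      # independent timing rounds
CALLS = 200      # calls per round

per_call_seconds = {}
for func in (squares_with_loop, squares_with_comprehension):
    rounds = timeit.repeat(
        stmt="func(values)",                     # only this statement is timed
        setup="values = list(range(COUNT))",     # runs once per round, untimed
        repeat=REPEATS,
        number=CALLS,
        globals={"func": func, "COUNT": COUNT},  # names visible to stmt and setup
    )
    # The minimum round is usually the least disturbed by other system activity.
    per_call_seconds[func.__name__] = min(rounds) / CALLS

for name, seconds in per_call_seconds.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

Passing `globals` avoids the `from __main__ import ...` setup string, which can be convenient when the functions are not defined in `__main__`.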



In [None]:
#@title Python Code - Timing with timeit

# Demonstrate timing small algorithms using the timeit standard library module.
# Compare a loop based sum with a built in sum for performance clarity.
# Show how repetitions reduce noise and give more stable timing measurements.

# No pip install is needed because only the Python standard library is used.

# Import timeit module for precise repeated timing measurements.
import timeit

# Define a simple function using an explicit loop for summing values.
def slow_sum(numbers):
    total = 0
    for value in numbers:
        total += value
    return total

# Define a faster function using Python built in sum for summing values.
def fast_sum(numbers):
    return sum(numbers)

# Prepare a realistic list of distances in miles for timing tests.
numbers = list(range(1, 50001))

# Create a Timer object for the slow_sum function with prepared numbers.
slow_timer = timeit.Timer(stmt="slow_sum(numbers)", setup="from __main__ import slow_sum, numbers")

# Create a Timer object for the fast_sum function with prepared numbers.
fast_timer = timeit.Timer(stmt="fast_sum(numbers)", setup="from __main__ import fast_sum, numbers")

# Run slow_sum timing several times and keep the best observed execution time.
slow_best = min(slow_timer.repeat(repeat=5, number=10))

# Run fast_sum timing several times and keep the best observed execution time.
fast_best = min(fast_timer.repeat(repeat=5, number=10))

# Print measured times showing seconds for ten runs of each implementation.
print("slow_sum best time for ten runs:", round(slow_best, 6), "seconds")

# Print measured times for fast_sum to compare performance against slow_sum implementation.
print("fast_sum best time for ten runs:", round(fast_best, 6), "seconds")

# Print a simple speedup factor showing how many times faster fast_sum performed.
speedup = slow_best / fast_best if fast_best > 0 else float('inf')
print("fast_sum speedup factor compared with slow_sum:", round(speedup, 2), "x")



### **2.3. Catching Performance Regressions**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_02_03.jpg?v=1767350815" width="250">



>* Algorithms can gradually slow down as code changes
>* Turn regular timing into repeatable automated performance tests

>* Create simple benchmarks with realistic input scenarios
>* Compare new timings to baselines to spot slowdowns

>* Make performance checks part of regular testing
>* Use benchmarks and automation to catch slowdowns early
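
One way to make the baseline comparison testable is to keep the comparison logic in small pure functions, separate from the measurement. The sketch below assumes a 25 percent threshold and uses the best of several timing rounds. The helper names are hypothetical, and a real project would load the baseline from a stored file rather than measure it in the same run.

```python
import timeit


def best_seconds(func, *args, repeat=5, number=3):
    """Return the fastest total time observed across several timing rounds."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number))


def slowdown_percent(current_seconds, baseline_seconds):
    """Return how much slower (positive) or faster (negative) current is, in percent."""
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (current_seconds - baseline_seconds) / baseline_seconds * 100.0


def is_regression(current_seconds, baseline_seconds, threshold_percent=25.0):
    """Return True when current exceeds the baseline by more than threshold_percent."""
    return slowdown_percent(current_seconds, baseline_seconds) > threshold_percent


# Deterministic checks of the comparison logic, independent of machine speed.
assert is_regression(1.30, 1.00)
assert not is_regression(1.20, 1.00)
assert not is_regression(0.80, 1.00)


# Hypothetical usage: the "baseline" here is measured in the same run for simplicity.
def sum_of_squares(limit):
    return sum(i * i for i in range(limit))


baseline = best_seconds(sum_of_squares, 50_000)
current = best_seconds(sum_of_squares, 50_000)
print("Slowdown percent:", round(slowdown_percent(current, baseline), 2))
print("Regression detected:", is_regression(current, baseline))
```

Because timings vary between runs and machines, the threshold needs enough slack to avoid false alarms; the value used here is only a starting point.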



In [None]:
#@title Python Code - Catching Performance Regressions

# Demonstrate simple timing baselines for catching algorithm performance regressions.
# Compare original and refactored versions using repeatable timing measurements.
# Show warning when new version becomes significantly slower than baseline.

# No pip install is needed; only the standard library is used.

# Import required standard library modules for timing and statistics.
import statistics
import timeit

# Define original algorithm using straightforward loop based summation approach.
def original_sum_squares(limit_value):
    total_value = 0
    for number_value in range(limit_value):
        total_value += number_value * number_value
    return total_value

# Define refactored algorithm using list comprehension and built in sum.
def refactored_sum_squares(limit_value):
    return sum([number_value * number_value for number_value in range(limit_value)])

# Define helper function measuring average runtime for provided callable and arguments.
def measure_average_runtime(func_object, limit_value, repeat_count=5):
    timer_object = timeit.Timer(lambda: func_object(limit_value))
    runs_seconds = timer_object.repeat(repeat=repeat_count, number=1)
    return statistics.mean(runs_seconds)

# Establish baseline timing using original implementation on representative input size.
baseline_limit = 200000
baseline_seconds = measure_average_runtime(original_sum_squares, baseline_limit)

# Measure timing for refactored implementation using identical input configuration.
refactored_seconds = measure_average_runtime(refactored_sum_squares, baseline_limit)

# Define acceptable slowdown threshold percentage for detecting performance regressions.
slowdown_threshold_percent = 25.0
slowdown_percent = ((refactored_seconds - baseline_seconds) / baseline_seconds) * 100.0

# Print concise summary showing baseline and current timings with slowdown percentage.
print("Baseline seconds:", round(baseline_seconds, 6), "Refactored seconds:", round(refactored_seconds, 6))

# Check slowdown against threshold and print appropriate regression warning or success message.
if slowdown_percent > slowdown_threshold_percent:
    print("Warning regression detected slowdown:", round(slowdown_percent, 2), "percent slower than baseline.")
else:
    print("Performance acceptable change:", round(slowdown_percent, 2), "percent relative to baseline.")



## **3. Readable Algorithm Structure**

### **3.1. Clear Names in Code**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_03_01.jpg?v=1767350837" width="250">



>* Descriptive names make algorithms easier to understand
>* Names should express intent and domain-specific roles

>* Name functions by outcomes, not internal steps
>* Use consistent names so concepts stay aligned

>* Specific, domain-rich names prevent subtle algorithm bugs
>* Deliberate naming creates clearer, future-proof code narratives
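
Unit suffixes are one concrete form of a domain-rich name. In the sketch below, all names and values are assumptions for illustration; the only fixed fact is that one international mile equals 1.609344 kilometers.

```python
KILOMETERS_PER_MILE = 1.609344


def kilometers_to_miles(distance_km):
    """Convert a distance in kilometers to miles."""
    return distance_km / KILOMETERS_PER_MILE


def is_within_daily_limit(trip_distance_miles, daily_limit_miles):
    """Outcome-oriented name: it says what the answer means, not how it is computed."""
    return trip_distance_miles <= daily_limit_miles


trip_distance_km = 8.0
daily_limit_miles = 5.0

# With vague names (d, limit), `d <= limit` would compare kilometers with miles unnoticed.
# The unit suffixes make the mismatch visible, so the conversion step is hard to forget.
trip_distance_miles = kilometers_to_miles(trip_distance_km)
trip_allowed = is_within_daily_limit(trip_distance_miles, daily_limit_miles)

print("Trip distance in miles:", round(trip_distance_miles, 2))
print("Within daily limit:", trip_allowed)
```

Comparing `trip_distance_km` directly with `daily_limit_miles` would have rejected this trip, even though it is under the limit once converted.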



In [None]:
#@title Python Code - Clear Names in Code

# Demonstrate clear variable names improving algorithm readability and understanding.
# Compare confusing names with descriptive names in a simple distance filtering task.
# Show identical results while highlighting readability benefits through printed comparisons.

# No pip install is needed; this runs with standard Python only.

# Define a confusingly named function with unclear variable names.
def bad_filter(xs, t):
    total = []
    for x in xs:
        if x <= t:
            total.append(x)
    return total


# Define a clear function with descriptive names and intent revealing structure.
def filter_short_trips(trip_distances_miles, maximum_allowed_miles):
    short_trips_miles = []
    for single_trip_miles in trip_distances_miles:
        if single_trip_miles <= maximum_allowed_miles:
            short_trips_miles.append(single_trip_miles)
    return short_trips_miles


# Prepare example trip distances in miles for both functions to process.
trip_distances_miles = [1.2, 3.5, 7.0, 10.5, 2.8]


# Define a maximum distance threshold representing short trips in miles.
maximum_short_trip_miles = 5.0


# Compute results using the confusingly named function for comparison.
bad_result = bad_filter(trip_distances_miles, maximum_short_trip_miles)


# Compute results using the clearly named function for better readability.
clear_result = filter_short_trips(trip_distances_miles, maximum_short_trip_miles)


# Print both results showing identical behavior despite naming differences.
print("Bad names result miles:", bad_result)


# Print clear result emphasizing improved understanding from descriptive names.
print("Clear names result miles:", clear_result)



### **3.2. Modular Algorithm Design**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_03_02.jpg?v=1767350858" width="250">



>* Break complex algorithms into small focused steps
>* Use reusable functions to improve clarity and maintenance

>* Top-level functions tell the algorithm’s overall story
>* Lower-level functions isolate specific steps and concerns

>* Modular pieces are easier to test and reuse
>* Stable structure supports safe changes and collaboration
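
The testing benefit is easiest to see when each step is a pure function that returns data instead of printing. The sketch below uses hypothetical function names and sample values: three small steps are checked in isolation, then composed by a top-level function that reads like the algorithm's outline.

```python
def remove_outliers(readings, low, high):
    """Keep only readings inside the inclusive [low, high] range."""
    return [value for value in readings if low <= value <= high]


def fahrenheit_to_celsius_all(readings_fahrenheit):
    """Convert every Fahrenheit reading to Celsius."""
    return [(value - 32.0) * 5.0 / 9.0 for value in readings_fahrenheit]


def summarize(readings):
    """Return min, max, and mean in a dict; reject empty input explicitly."""
    if not readings:
        raise ValueError("cannot summarize an empty list of readings")
    return {"min": min(readings), "max": max(readings), "mean": sum(readings) / len(readings)}


def temperature_summary_celsius(readings_fahrenheit, low=-40.0, high=212.0):
    """Top-level story: clean, convert, summarize. Returns data instead of printing."""
    cleaned = remove_outliers(readings_fahrenheit, low, high)
    return summarize(fahrenheit_to_celsius_all(cleaned))


# Each step can be checked in isolation...
assert remove_outliers([10.0, 500.0, 50.0], 0.0, 100.0) == [10.0, 50.0]
assert fahrenheit_to_celsius_all([32.0, 212.0]) == [0.0, 100.0]
assert summarize([1.0, 3.0]) == {"min": 1.0, "max": 3.0, "mean": 2.0}

# ...and the composed pipeline is checked once, end to end.
pipeline_summary = temperature_summary_celsius([32.0, -999.0, 212.0])
print(pipeline_summary)
```

Keeping `print` calls out of the steps is a design choice: the caller decides how to present results, and the same functions can be reused in tests or other pipelines.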



In [None]:
#@title Python Code - Modular Algorithm Design

# Demonstrate modular algorithm design using simple temperature data processing.
# Show how separate functions clarify each algorithm processing step.
# Print concise results that highlight the modular pipeline structure.

# pip install commands are unnecessary because this script uses only standard libraries.

# Define a function that loads raw Fahrenheit temperatures from a simple list.
def load_fahrenheit_readings():
    return [72.0, 68.5, 75.2, 80.1, 69.8, 77.0, 73.4]

# Define a function that cleans readings by removing impossible extreme temperature values.
def clean_readings(readings):
    cleaned = [value for value in readings if 50.0 <= value <= 120.0]
    return cleaned

# Define a function that converts Fahrenheit readings into Celsius temperature values.
def convert_to_celsius(readings):
    converted = [(value - 32.0) * 5.0 / 9.0 for value in readings]
    return converted

# Define a function that summarizes readings with minimum, maximum, and average values.
def summarize_readings(readings):
    minimum = min(readings)
    maximum = max(readings)
    average = sum(readings) / len(readings)
    return minimum, maximum, average

# Define a function that runs the full modular temperature processing pipeline.
def run_temperature_pipeline():
    raw_readings = load_fahrenheit_readings()
    cleaned_readings = clean_readings(raw_readings)
    celsius_readings = convert_to_celsius(cleaned_readings)
    minimum, maximum, average = summarize_readings(celsius_readings)
    print("Raw Fahrenheit readings:", raw_readings)
    print("Cleaned Fahrenheit readings:", cleaned_readings)
    print("Converted Celsius readings:", [round(value, 1) for value in celsius_readings])
    print("Minimum Celsius temperature:", round(minimum, 1))
    print("Maximum Celsius temperature:", round(maximum, 1))
    print("Average Celsius temperature:", round(average, 1))

# Execute the modular pipeline so the algorithm structure becomes clearly visible.
run_temperature_pipeline()



### **3.3. Commenting and Docstrings**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Master Python Algorithms/Module_10/Lecture_B/image_03_03.jpg?v=1767350878" width="250">



>* Comments explain intent, assumptions, and design choices
>* They add hidden domain knowledge for future maintainers

>* Docstrings describe an algorithm’s purpose and interface
>* They explain behavior details so others can use the code safely

>* Keep comments concise, current, and intent-focused
>* Evolve comments and docstrings as algorithms change
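
Docstring examples written in doctest format can be executed, which helps keep documentation from drifting away from the code. The sketch below is an assumed illustration using a simple binary search. It documents the interface and the sortedness assumption, uses one intent comment, and runs the embedded examples with the standard `doctest` module.

```python
import doctest


def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent.

    Args:
        sorted_values: A list sorted in ascending order. The function does
            not verify this; unsorted input gives an unspecified result.
        target: The value to look for.

    >>> binary_search([1, 3, 5, 7], 5)
    2
    >>> binary_search([1, 3, 5, 7], 4)
    -1
    """
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        # Intent: halve the remaining range each step, so lookups take O(log n) time.
        middle = (low + high) // 2
        if sorted_values[middle] == target:
            return middle
        if sorted_values[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return -1


# Run the examples embedded in the docstring; nothing is printed when they all pass.
doctest.run_docstring_examples(binary_search, {"binary_search": binary_search})
print(binary_search.__doc__.splitlines()[0])
```

The single comment explains why the loop is shaped the way it is; the docstring covers what callers need to know, including the assumption they must satisfy.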



In [None]:
#@title Python Code - Commenting and Docstrings

# Demonstrate helpful comments and docstrings for simple algorithmic Python functions.
# Show how intent comments differ from obvious code behavior explanations.
# Use docstrings to describe inputs, outputs, and important algorithm assumptions clearly.

# pip install commands are unnecessary because this script uses only built in features.

# Define a function that converts miles to feet with clear documentation.
def miles_to_feet(miles_value: float) -> int:
    """Convert miles distance into feet units using fixed conversion factor constant."""

    # Assumption: fractional feet precision is unnecessary for this algorithm.
    FEET_PER_MILE: int = 5280

    # Round to the nearest whole foot; round() without ndigits already returns an int.
    return round(miles_value * FEET_PER_MILE)

# Define a function that sums route segment distances while documenting algorithm behavior.
def total_route_feet(segments_miles: list[float]) -> int:
    """Compute total route length in feet assuming all segment distances are nonnegative."""

    # Guard against negative distances because algorithm assumes forward travel only.
    if any(distance < 0 for distance in segments_miles):
        raise ValueError("Segments must be nonnegative miles values only.")

    # Convert each segment to feet and sum for overall route distance calculation.
    total_feet: int = sum(miles_to_feet(distance) for distance in segments_miles)

    # Return final total feet value representing complete route distance approximation.
    return total_feet

# Define a small helper that prints documentation and example algorithm usage results.
def demonstrate_documentation() -> None:
    """Show docstrings and outputs for distance functions to illustrate documentation usage."""

    # Prepare example route segments representing short city delivery path in miles.
    example_segments: list[float] = [1.2, 0.5, 2.0]

    # Compute total distance in feet using documented algorithmic helper function.
    total_distance_feet: int = total_route_feet(example_segments)

    # Print function docstring to show structured algorithm interface documentation.
    print("miles_to_feet docstring:")
    print(miles_to_feet.__doc__)

    # Print computed result to connect documentation with actual algorithm behavior.
    print("Total route distance in feet:", total_distance_feet)

# Run the demonstration when this cell executes.
demonstrate_documentation()



# <font color="#418FDE" size="6.5" uppercase>**Pythonic Solutions**</font>


In this lecture, you learned to:
- Refactor algorithm implementations to use idiomatic Python constructs without sacrificing clarity or performance. 
- Add effective tests and simple benchmarks to validate correctness and performance of algorithmic code. 
- Evaluate and improve the readability and structure of Python algorithm implementations. 

<font color='yellow'>Congratulations on completing this course!</font>