## Functions and Iterators

A function is defined to deploy an application, with default values for its version and namespace. How can you call this function to deploy the app in the 'staging' namespace while using the default version?

In [2]:
def deploy_app(app_name, version='1.0.0', namespace='default'):
    print(f"Deploying {app_name} v{version} to {namespace} namespace.")

deploy_app('auth-service', namespace="staging")    

Deploying auth-service v1.0.0 to staging namespace.


A script is mapping server names to their respective datacenter locations. What will be the output of this code?

In [3]:
servers = ['web-1', 'db-1', 'app-1', 'cache-1']
locations = ['us-east', 'us-west', 'eu-central']
 
for server, loc in zip(servers, locations):
    print(f"'{server}' is in '{loc}'")

'web-1' is in 'us-east'
'db-1' is in 'us-west'
'app-1' is in 'eu-central'


A script needs to display a numbered list of pending software updates. Which code snippet correctly uses enumerate to produce a numbered list starting with "1." for the first item, and incrementing the number by one for each item of the following list?



updates = ['kernel-patch', 'openssl-fix', 'python-update']

In [5]:
updates = ['kernel-patch', 'openssl-fix', 'python-update']
for i, update in enumerate(updates, start=1):
    print(f"{i}: {update}")

1: kernel-patch
2: openssl-fix
3: python-update


The function below is intended to report on a list of tasks without changing the original list. However, after the function call, the pending_deploys list is empty. What is the cause of this bug, and how should it be fixed?

In [None]:
def report_tasks(task_list):
    print("Tasks to process:")
    while task_list:
        task = task_list.pop(0)
        print(f"- {task}")
    print("Report complete.")
 
pending_deploys = ['deploy-web', 'deploy-db', 'deploy-cache']
report_tasks(pending_deploys)
# This next line unexpectedly prints: "Tasks remaining: []"
print(f"Tasks remaining: {pending_deploys}")

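Cause: The function receives a reference to the same list object as pending_deploys, so each .pop(0) call removes an item from the caller's list. Fix: work on a copy by adding task_list = task_list.copy() as the first line of the function, or iterate with for task in task_list: instead of popping.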
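
A minimal sketch of both non-mutating variants, reusing the names from the example above. The name report_tasks_with_copy is hypothetical and only exists to show the two options side by side; plain iteration is usually the simpler choice.

```python
def report_tasks(task_list):
    """Report on tasks without modifying the caller's list."""
    print("Tasks to process:")
    for task in task_list:  # plain iteration never removes items
        print(f"- {task}")
    print("Report complete.")


def report_tasks_with_copy(task_list):
    """Same report, but consuming a private copy with pop(0)."""
    remaining = task_list.copy()  # shallow copy; the original list is untouched
    print("Tasks to process:")
    while remaining:
        print(f"- {remaining.pop(0)}")
    print("Report complete.")


pending_deploys = ['deploy-web', 'deploy-db', 'deploy-cache']
report_tasks(pending_deploys)
report_tasks_with_copy(pending_deploys)
print(f"Tasks remaining: {pending_deploys}")
```

In both variants the final line should still list all three tasks.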

## OOP in Python

When defining a Python class to represent a server, what is the primary role of the __init__ method?

Answer: It is a method that is automatically called when an instance of the class is created, used to initialize the instance's unique attributes (for example, self.hostname = "web01").

A WebServer class inherits from a generic Server class. Both classes have a shutdown() method. How can the WebServer's shutdown() method first perform its own specific actions and then call the generic shutdown logic from the Server class?

In [None]:
class WebServer(Server):
    def shutdown(self):
        print("Shutting down web server.")
        super().shutdown()

What is the fundamental difference between an instance attribute (e.g., self.hostname) and a class attribute?

In [None]:
class Server:
    # Class attribute
    location = "US-EAST-1"
 
    def __init__(self, hostname):
        # Instance attribute
        self.hostname = hostname

A class attribute is shared by all instances of the class, while an instance attribute is unique to each specific object.
All objects created from the Server class will share the single location attribute. However, each object will have its own distinct hostname attribute, set when the object is created.
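
One related detail that often surprises people: assigning to a class attribute through an instance does not change the shared value. A short sketch, using the Server class from above with illustrative hostnames and an illustrative "EU-WEST-1" value:

```python
class Server:
    location = "US-EAST-1"  # class attribute, shared by all instances

    def __init__(self, hostname):
        self.hostname = hostname  # instance attribute, one per object


web = Server("web01")
db = Server("db01")

# Assigning through an instance creates a new instance attribute
# that shadows the class attribute for that object only.
web.location = "EU-WEST-1"

print(web.location)     # instance attribute on web
print(db.location)      # still falls back to the class attribute
print(Server.location)  # the class attribute itself is unchanged
```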

Two instances of a User class are created. What will be the output of the final print statement?



In [6]:
class User:
    def __init__(self, username):
        self.username = username
        self.is_active = False
 
    def activate(self):
        self.is_active = True
 
user1 = User("admin")
user2 = User("guest")
user1.activate()
 
print(f"{user1.username}: {user1.is_active}, {user2.username}: {user2.is_active}")

admin: True, guest: False


An engineer wrote a class to monitor services. When they create an instance and check its status, they get an AttributeError. What is the cause of this bug?

In [None]:

class ServiceMonitor:
    def __init__(self, service_name):
        name = service_name
        is_alive = False
 
    def check_status(self):
        print(f"Checking status of {self.name}...")
        self.is_alive = True
        return self.is_alive
 
monitor = ServiceMonitor("database")
monitor.check_status()

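The __init__ method creates local variables name and is_alive instead of instance attributes. They are discarded when __init__ returns, so self.name does not exist when check_status() runs. The fix is to assign self.name = service_name and self.is_alive = False.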
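
For reference, a sketch of the class with that fix applied. Only the two assignments in __init__ differ from the buggy version:

```python
class ServiceMonitor:
    def __init__(self, service_name):
        # Binding to self makes these attributes outlive __init__
        self.name = service_name
        self.is_alive = False

    def check_status(self):
        print(f"Checking status of {self.name}...")
        self.is_alive = True
        return self.is_alive


monitor = ServiceMonitor("database")
status = monitor.check_status()
```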

Consider the following class definition. What is the output of the code?

In [7]:
class VM:
    # Class attribute
    hypervisor = "KVM"
 
    def __init__(self, name):
        self.name = name
 
vm1 = VM("web-01")
vm2 = VM("db-01")
 
# The hypervisor for all VMs is upgraded
VM.hypervisor = "Xen"
 
print(f"{vm1.name} on {vm1.hypervisor}, {vm2.name} on {vm2.hypervisor}")

web-01 on Xen, db-01 on Xen


A child class DatabaseServer is created from a parent Server class. What is the correct way to initialize both the parent's attributes (hostname) and the child's specific attribute (db_engine)?

In [None]:
class DatabaseServer(Server):
    def __init__(self, hostname, db_engine):
        super().__init__(hostname)
        self.db_engine = db_engine

## Working with Flexible Arguments

A function is designed to gather system metrics, where some metrics are required and others are optional. What is the value of the extra_metrics dictionary inside the function?

In [None]:
def gather_metrics(hostname, *base_metrics, **extra_metrics):
    # What is the value of extra_metrics?
    pass
 
gather_metrics("web01", "cpu_usage", "mem_usage", disk_io=45.5, network_traffic="200MB/s")

{'disk_io': 45.5, 'network_traffic': '200MB/s'}
**extra_metrics collects all keyword arguments that do not match a named parameter. In this call, disk_io and network_traffic are passed as keyword arguments and are therefore collected into the extra_metrics dictionary. The positional values "cpu_usage" and "mem_usage" go to the base_metrics tuple instead.

Q: A function is defined to accept a variable number of file paths to process. What is the value and data type of the paths variable inside the function?

In [None]:
def process_files(*paths):
    # What is `paths` here?
    pass
 
process_files("/etc/hosts", "/etc/nginx/nginx.conf")

A: It is a tuple containing ("/etc/hosts", "/etc/nginx/nginx.conf"). The * syntax in a function definition gathers all extra positional arguments into a single tuple, here named paths, preserving their order.

Q: A function needs to be called with configuration settings stored in a dictionary. What is the correct syntax to unpack the db_config dictionary into keyword arguments for the connect_to_db function?

In [None]:

def connect_to_db(host, port, user, password):
    print(f"Connecting to {host}:{port} as {user}...")
 
db_config = {'host': 'db.prod', 'port': 5432, 'user': 'admin', 'password': '123'}

A: connect_to_db(**db_config)

Q: An engineer wants to create a flexible function that takes a required command and optional keyword arguments for flags. The code below, however, raises a TypeError. What is the cause of the error?

In [None]:

def run_command(command, **options):
    option_str = " ".join([f"--{k}={v}" for k, v in options.items()])
    print(f"Executing: {command} {option_str}")
 
run_command("deploy", "app-server", verbose=True)

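A: The string "app-server" is passed as a second positional argument, but run_command accepts only one positional parameter (command). **options collects only keyword arguments, so the extra positional value has nowhere to go and Python raises a TypeError.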
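
One possible way to make the call work, sketched under the assumption that extra positional values should be treated as targets of the command. The *targets parameter and the returned string are additions for illustration, not part of the original function:

```python
def run_command(command, *targets, **options):
    """Variant that accepts extra positional arguments as well as keyword flags."""
    option_str = " ".join(f"--{k}={v}" for k, v in options.items())
    # strip() removes the trailing space left when no options are given
    line = " ".join([command, *targets, option_str]).strip()
    print(f"Executing: {line}")
    return line


line = run_command("deploy", "app-server", verbose=True)
```

If positional targets are not wanted, the alternative is to keep the original signature and pass the value by keyword, for example run_command("deploy", target="app-server", verbose=True).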

Q: In what order must the different types of parameters appear in a Python function definition?

A: Standard positional parameters (those without defaults before those with defaults), *args, keyword-only parameters, **kwargs.
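
A small sketch showing every parameter kind from that answer in its required order. The function deploy and all of its parameter names are hypothetical:

```python
def deploy(service, version="1.0.0", *targets, dry_run=False, **labels):
    """Return the arguments grouped by the parameter kind that received them."""
    return {
        "service": service,  # positional-or-keyword, required
        "version": version,  # positional-or-keyword, with a default
        "targets": targets,  # extra positional arguments, as a tuple
        "dry_run": dry_run,  # keyword-only, because it follows *targets
        "labels": labels,    # extra keyword arguments, as a dict
    }


result = deploy("auth", "2.1.0", "web-1", "web-2", dry_run=True, team="platform")
print(result)
```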

Q: A function is called by unpacking a list of values. What will be the output?

In [None]:

def check_health(host, port, timeout):
    print(f"Checking {host} on port {port} with a {timeout}s timeout.")
 
params = ["api.service.local", 443, 10, "extra_value"]
check_health(*params)

A: A TypeError is raised because the params list unpacks into four positional arguments, but check_health accepts only three parameters.

Q: A function is designed to merge multiple configuration dictionaries. The first dictionary provides base settings, and subsequent dictionaries provide overrides. What will be the final value of final_config?

In [8]:

def merge_configs(base_config, *override_configs):
    merged = base_config.copy()
    for config in override_configs:
        merged.update(config)
    return merged
 
config1 = {'user': 'admin', 'retries': 3}
config2 = {'retries': 5, 'timeout': 30}
config3 = {'timeout': 60, 'loglevel': 'debug'}
 
final_config = merge_configs(config1, config2, config3)

In [10]:
print(final_config)

{'user': 'admin', 'retries': 5, 'timeout': 60, 'loglevel': 'debug'}


## Generator Quiz

Question 1:
Consider the following Python code. What will be printed to the console when it is executed?

In [None]:
def generate_reports(count):
    print("Initializing report generator...")
    for i in range(count):
        yield f"Report #{i+1}"
    print("All reports generated.")
 
reports_generator = generate_reports(3)
print("Generator created.")

Generator created.
The body of a generator function does not execute when the function is called. Instead, a generator object is created and returned immediately. The code inside the generate_reports function (including the print statements) will only run when the generator is iterated over (e.g., with next() or a for loop).
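
To see when the body actually runs, the same generator can be advanced by hand. This is a sketch that extends the snippet above with two extra steps:

```python
def generate_reports(count):
    print("Initializing report generator...")
    for i in range(count):
        yield f"Report #{i+1}"
    print("All reports generated.")


reports_generator = generate_reports(3)
print("Generator created.")      # nothing from the body has run yet

first = next(reports_generator)  # the body starts here and pauses at the first yield
rest = list(reports_generator)   # drains the remaining items, then the final print runs
print(first, rest)
```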



Question 2:
What is the output of the following code snippet?

In [1]:
def sequence_generator():
    value = 10
    print(f"Yielding {value}")
    yield value
 
    value += 5
    print(f"Yielding {value}")
    yield value
 
    value *= 2
    print(f"Yielding {value}")
    yield value
 
gen = sequence_generator()
next(gen)
val = next(gen)
print(f"Final value: {val}")

Yielding 10
Yielding 15
Final value: 15


The first next(gen) call starts the generator; it prints "Yielding 10" and yields 10, then pauses. The second next(gen) call resumes execution; value is incremented to 15, "Yielding 15" is printed, and 15 is yielded and assigned to val. The program then prints the final line. The code for the third yield is never reached.

Question 3:
A developer wrote the following script to create pairs of network zones. They expected to see all possible pairs (e.g., "Web:DB", "Web:API", "DB:Web", etc.), but the output was not what they expected. What is the bug, and how should it be fixed in the most efficient way?

In [None]:
def get_zones():
    yield "Web"
    yield "DB"
    yield "API"
 
zones_gen = get_zones()
 
for zone1 in zones_gen:
    for zone2 in zones_gen:
        print(f"{zone1}:{zone2}")

The bug is that the zones_gen generator is exhausted by the inner loop on its first run. The fix is to get a fresh generator for the inner loop by calling the factory function again: for zone2 in get_zones():.

This is the correct answer. Both loops share the single zones_gen iterator. The outer loop takes "Web", then the inner loop consumes the remaining items ("DB" and "API"), so only "Web:DB" and "Web:API" are printed. When the outer loop asks for its next item, zones_gen is already exhausted and the loop ends. Creating a new generator for each loop (for zone1 in get_zones(): and for zone2 in get_zones():) ensures each loop works with a fresh, independent iterator.
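
A sketch of the fixed loops, collecting the pairs into a list so they can be inspected. The itertools.product variant is an alternative that is not part of the original answer; it buffers its input once, so a single generator is enough there. Note that both versions also produce same-zone pairs such as "Web:Web".

```python
from itertools import product


def get_zones():
    yield "Web"
    yield "DB"
    yield "API"


# Fix from the answer: a fresh generator for each loop.
pairs = []
for zone1 in get_zones():
    for zone2 in get_zones():
        pairs.append(f"{zone1}:{zone2}")

# Alternative: product() stores the items of its input before pairing them.
pairs_via_product = [f"{a}:{b}" for a, b in product(get_zones(), repeat=2)]

print(pairs)
```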

Question 4:
What will be the output of this Python script, which attempts to iterate over a generator twice?

In [2]:
def device_ids():
    yield "sensor-A"
    yield "sensor-B"
 
ids = device_ids()
 
print("First pass:")
for device_id in ids:
    print(f"- {device_id}")
 
print("\nSecond pass:")
for device_id in ids:
    print(f"- {device_id}")

First pass:
- sensor-A
- sensor-B

Second pass:


This is the correct answer. The first for loop consumes all the values from the ids generator, exhausting it. When the second for loop attempts to iterate over the same exhausted generator, the generator immediately signals it has no more items (by raising StopIteration internally, which the for loop handles), so the loop's body never executes.

Question 5:
A generator function is defined to yield a sequence of status messages. What is the exact output when the following code is run?

In [3]:
def deployment_status():
    status = "PENDING"
    yield status
 
    status = "IN_PROGRESS"
    yield status
 
    status = "COMPLETED"
    print(f"Final state: {status}")
 
d_status = deployment_status()
print(f"1: {next(d_status)}")
print(f"2: {next(d_status)}")
try:
    next(d_status)
except StopIteration:
    print("3: Deployment finished.")

1: PENDING
2: IN_PROGRESS
Final state: COMPLETED
3: Deployment finished.


This is the correct answer. The first two next() calls consume the first two yielded values. The third next() call resumes the generator. It executes the print(f"Final state: {status}") line, then the function finishes, which causes a StopIteration to be raised. The except block catches this and prints its message.

## Building Lazy Pipelines with Generators

Question 1:
A DevOps engineer needs to process a 50 GB log file to count the number of lines containing the word "FATAL". The engineer's machine has only 8 GB of RAM. Which of the following statements best explains the primary advantage of using a lazy generator pipeline for this task?

Answer: 
The pipeline avoids loading the entire 50 GB file into memory at once. It processes the file line-by-line, keeping memory usage minimal and constant regardless of the file size.

Correct! This is the core benefit of lazy pipelines. By reading and processing the file one line at a time, the memory footprint remains extremely low. The entire 50 GB file is never held in RAM, making it possible to process on a machine with much less memory.
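
A minimal sketch of such a pipeline. The helper names (read_lines, only_matching, count_items) and the tiny temporary file standing in for the 50 GB log are assumptions for illustration; memory stays low as long as individual lines are of reasonable length.

```python
import os
import tempfile


def read_lines(path):
    """Yield lines one at a time; the file object is itself a lazy iterator."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line


def only_matching(lines, needle):
    """Pass through only the lines that contain needle."""
    return (line for line in lines if needle in line)


def count_items(items):
    """Count items without storing them."""
    return sum(1 for _ in items)


# Small stand-in for the large log file (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False, encoding="utf-8") as tmp:
    tmp.write("INFO start\nFATAL disk full\nWARN slow\nFATAL out of memory\n")

fatal_count = count_items(only_matching(read_lines(tmp.name), "FATAL"))
print(fatal_count)
os.remove(tmp.name)
```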

Question 2:
You are given a pipeline of generators to process a list of raw transaction data. What will be the final output of the following script?

In [1]:
def parse_transactions(data):
    print("Parsing...")
    for item in data:
        parts = item.split(':')
        yield (parts[0], int(parts[1]))
 
def filter_high_value(transactions, min_value=100):
    print("Filtering...")
    for _, amount in transactions:
        if amount >= min_value:
            yield amount
 
def apply_fees(amounts, fee_percent=10):
    print("Applying fees...")
    for amount in amounts:
        yield amount - (amount * fee_percent / 100)
 
raw_data = ["TX01:50", "TX02:200", "TX03:150", "TX04:90"]
 
pipeline = apply_fees(filter_high_value(parse_transactions(raw_data)))
 
result = list(pipeline)
print(result)

Applying fees...
Filtering...
Parsing...
[180.0, 135.0]


Correct. The list() constructor pulls items one by one through the entire pipeline. Each generator's body starts only when its first item is requested, so the start-up messages appear from the outermost stage inward: apply_fees starts first and asks filter_high_value for an item, which in turn asks parse_transactions. TX01 (50) and TX04 (90) are below min_value and are filtered out. ('TX02', 200) passes the filter as 200, and apply_fees yields 180.0; likewise 150 becomes 135.0.

Question 3:
The following script is intended to first iterate through all "login" events to perform an action (simulated by the print statement) and then, in a separate step, count the total number of these events. However, the script contains a bug and reports an incorrect count of 0. What is the fundamental cause of this bug, and what is the most suitable solution to fix it while preserving the two-step logic?

In [None]:
def get_login_events(all_events):
    for event in all_events:
        if event.get("type") == "login":
            print(f"Found login for user: {event['user_id']}")
            yield event
 
events = [
    {"user_id": 101, "type": "login"},
    {"user_id": 102, "type": "page_view"},
    {"user_id": 103, "type": "login"},
    {"user_id": 104, "type": "logout"},
    {"user_id": 105, "type": "login"},
]
 
login_stream = get_login_events(events)
 
# Step 1: Perform an action on each login event
for _ in login_stream:
    pass
 
# Step 2: Count the total number of login events
count = len(list(login_stream))
print(f"Total logins: {count}") # Buggy output: Total logins: 0

Cause: The login_stream generator is exhausted after the first for loop. A generator can only be iterated over once.
Solution: Convert the generator to a list at the beginning and perform both operations on that reusable list.



Correct. This choice accurately identifies generator exhaustion as the root cause. The proposed solution is the most practical fix for scenarios requiring multiple passes over the same dataset. It eagerly evaluates the generator into a list, which can then be used repeatedly without issue.
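
A sketch of that fix using the same data. The "Found login" messages now print once, while the list is being built, and both steps then work on the stored list:

```python
def get_login_events(all_events):
    for event in all_events:
        if event.get("type") == "login":
            print(f"Found login for user: {event['user_id']}")
            yield event


events = [
    {"user_id": 101, "type": "login"},
    {"user_id": 102, "type": "page_view"},
    {"user_id": 103, "type": "login"},
    {"user_id": 104, "type": "logout"},
    {"user_id": 105, "type": "login"},
]

# Materialize the generator once so the results can be reused.
login_events = list(get_login_events(events))

# Step 1: Perform an action on each login event
for _ in login_events:
    pass

# Step 2: Count the total number of login events
count = len(login_events)
print(f"Total logins: {count}")
```

This trades memory for reusability, which is reasonable here because the list of login events is small.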

Question 4:
Consider this two-stage pipeline designed to process a sequence of numbers. What is the exact sequence of printed output when this script is executed?

In [2]:
def number_source(n):
    print("SOURCE: Starting")
    for i in range(n):
        print(f"SOURCE: Yielding {i}")
        yield i
    print("SOURCE: Finished")
 
def doubler(items):
    print("DOUBLER: Starting")
    for item in items:
        print(f"DOUBLER: Processing {item}")
        yield item * 2
    print("DOUBLER: Finished")
 
pipeline = doubler(number_source(2))
print("--- Getting first item ---")
print(f"Result: {next(pipeline)}")
print("--- Getting second item ---")
print(f"Result: {next(pipeline)}")

--- Getting first item ---
DOUBLER: Starting
SOURCE: Starting
SOURCE: Yielding 0
DOUBLER: Processing 0
Result: 0
--- Getting second item ---
SOURCE: Yielding 1
DOUBLER: Processing 1
Result: 2


Correct. This output accurately traces the lazy, "pull-based" execution. The call to next(pipeline) pulls a value from doubler, which in turn pulls a value from number_source. The first item 0 is yielded from source, processed by doubler, and returned as 0. The second call to next resumes the process, pulling 1 from source, which is processed by doubler and returned as 2.

Question 5:
The following script is intended to create a lazy pipeline that identifies IP addresses that transferred more than 1000 bytes and then formats them for a report. However, the filter_heavy_hitters function breaks the lazy evaluation model. How should the filter_heavy_hitters function be rewritten to ensure the entire pipeline is lazy and memory-efficient?

In [None]:
def parse_logs(log_lines):
    for line in log_lines:
        parts = line.split()
        if len(parts) == 2:
            yield (parts[0], int(parts[1]))
 
def filter_heavy_hitters(records):
    # This function is not lazy
    results = []
    for ip, byte_count in records:
        if byte_count > 1000:
            results.append(ip)
    return results # Eagerly returns a list
 
def format_for_report(ip_addresses):
    for ip in ip_addresses:
        yield f"ALERT: High traffic from {ip}"
 
# How the pipeline is used:
logs = ["1.1.1.1 500", "2.2.2.2 2500", "3.3.3.3 4000"]
records_gen = parse_logs(logs)
heavy_ips = filter_heavy_hitters(records_gen) # Entire records_gen is consumed here
report_lines = format_for_report(heavy_ips)

def filter_heavy_hitters(records):
    for ip, byte_count in records:
        if byte_count > 1000:
            yield ip


Correct. By replacing the list-building logic with a for loop that yields values one by one, the function is transformed into a generator. This makes it lazy, processing one record at a time and passing it down the pipeline without building an intermediate list in memory.
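
Putting the three stages together, a sketch of the fully lazy pipeline with the sample logs from the question. Nothing is parsed until the first item is requested:

```python
def parse_logs(log_lines):
    for line in log_lines:
        parts = line.split()
        if len(parts) == 2:
            yield (parts[0], int(parts[1]))


def filter_heavy_hitters(records):
    for ip, byte_count in records:
        if byte_count > 1000:
            yield ip  # one IP at a time, no intermediate list


def format_for_report(ip_addresses):
    for ip in ip_addresses:
        yield f"ALERT: High traffic from {ip}"


logs = ["1.1.1.1 500", "2.2.2.2 2500", "3.3.3.3 4000"]
report_lines = format_for_report(filter_heavy_hitters(parse_logs(logs)))

first_alert = next(report_lines)  # pulls only as many records as needed
remaining = list(report_lines)    # drains the rest of the pipeline
print(first_alert, remaining)
```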

Question 6:
An engineer is writing a helper function get_cloud_regions() that fetches a list of available cloud provider regions (for example, 'us-east-1', 'eu-west-2'). The list is small (fewer than 50 items), does not change during the program's execution, and needs to be referenced by multiple other functions throughout the application's lifecycle. Which approach is most suitable for this function and why?

Answer: 
Use a regular function that returns a list because the data is small and needs to be accessed repeatedly. Storing the result in a list is more practical than re-creating an exhausted generator.


Correct. For small, static datasets that need to be used more than once, the eager approach of returning a list (or tuple) is superior. The memory cost is trivial, and the resulting list can be reused, iterated over, and accessed by index freely, which is exactly what the scenario requires.
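
A sketch of the eager approach. The hardcoded tuple stands in for whatever call would really fetch the regions, and the lru_cache decorator is an optional addition (not part of the original answer) so that repeated calls reuse the first result:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_cloud_regions():
    """Return the available regions as a reusable, indexable tuple."""
    # Placeholder data; a real version would query the cloud provider here.
    return ("us-east-1", "eu-west-2")


regions = get_cloud_regions()
print(len(regions), regions[0], "eu-west-2" in regions)

# The result can be iterated any number of times, unlike a generator.
first_pass = list(regions)
second_pass = list(regions)
```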

## Decorator Fundamentals and Best Practices

Question 1:
In DevOps automation, you often need to add logging to many different functions that deploy services, update configurations, or run backups. Which statement best describes the primary advantage of using a decorator for this purpose?

Answer: Decorators add functionality (a "cross-cutting concern" like logging or timing) to multiple functions without modifying each function's source code, thus avoiding code duplication.


Correct. This is the core purpose of a decorator. It allows you to define a common behavior (like logging) in one place and apply it cleanly to any number of functions using the @ syntax, adhering to the "Don't Repeat Yourself" (DRY) principle.

Question 2:
What is the exact output of the following script when it is executed?

In [None]:
import functools
 
def state_change_decorator(func):
    @functools.wraps(func)
    def wrapper():
        print("State: Preparing to execute...")
        func()
        print("State: Execution complete.")
    return wrapper
 
@state_change_decorator
def apply_migrations():
    print("Action: Applying database migrations.")
 
apply_migrations()

State: Preparing to execute...
Action: Applying database migrations.
State: Execution complete.


Correct. The @state_change_decorator syntax means apply_migrations now refers to the wrapper function. When called, wrapper first prints its "before" message, then calls the original apply_migrations function, and finally prints its "after" message.

Question 3:
An engineer wrote a decorator to add a prefix to a function's string result. However, when the script runs, the final line prints Result from main script: None. What is the cause of the bug, and what is the correct fix?

In [None]:
def add_prefix_decorator(func):
    def wrapper(*args, **kwargs):
        original_result = func(*args, **kwargs)
        prefixed_result = f"[PREFIX] {original_result}"
        print(f"Inside wrapper, prefixed result is: {prefixed_result}")
    return wrapper
 
@add_prefix_decorator
def get_hostname(server_id):
    return f"server-{server_id}.prod.local"
 
hostname = get_hostname(101)
print(f"Result from main script: {hostname}")

Answer:

Cause: The wrapper function calculates the prefixed_result but does not return it. Since the wrapper has no explicit return, it implicitly returns None.
Fix: Add return prefixed_result to the end of the wrapper function.



Correct. The wrapper function effectively replaces the original function. If the wrapper doesn't return a value, the caller will receive None. The fix is to ensure the wrapper returns the intended new value.
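
A sketch of the corrected decorator. The @wraps line is an extra good-practice addition beyond the fix described above:

```python
from functools import wraps


def add_prefix_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        original_result = func(*args, **kwargs)
        prefixed_result = f"[PREFIX] {original_result}"
        print(f"Inside wrapper, prefixed result is: {prefixed_result}")
        return prefixed_result  # the missing return from the buggy version
    return wrapper


@add_prefix_decorator
def get_hostname(server_id):
    return f"server-{server_id}.prod.local"


hostname = get_hostname(101)
print(f"Result from main script: {hostname}")
```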


Question 4:
This script uses a dictionary as a dispatch table to execute an action based on a command string. This pattern relies on functions being "first-class citizens." What is the output of the script?

In [None]:
def provision_vm(hostname):
    return f"PROVISION: Virtual machine {hostname} created."
 
def deprovision_vm(hostname):
    return f"DEPROVISION: Virtual machine {hostname} destroyed."
 
def reboot_vm(hostname):
    return f"REBOOT: Virtual machine {hostname} restarting."
 
action_map = {
    "create": provision_vm,
    "destroy": deprovision_vm,
    "restart": reboot_vm,
}
 
command = "create"
target = "db-main-01"
 
if command in action_map:
    # Look up the function and call it
    action_func = action_map[command]
    result = action_func(target)
    print(result)
else:
    print(f"Unknown command: {command}")

PROVISION: Virtual machine db-main-01 created.

Correct. The script looks up the key "create" in action_map, which returns the provision_vm function object. It then calls this function with the argument "db-main-01", and the string returned by that function is printed.

Question 5:
This decorator is designed to log details about a function call, including arguments and the return value. What will be printed to the console when this script is executed?

In [1]:
from functools import wraps
 
def audit_log(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        user = kwargs.get("user", "unknown")
        print(f"AUDIT: User '{user}' is calling '{func.__name__}' with args {args}.")
        value = func(*args, **kwargs)
        print(f"AUDIT: Call finished. Returned: {value}")
        return value
    return wrapper
 
@audit_log
def set_firewall_rule(rule_id, action="BLOCK", *, user="admin"):
    return {"status": "success", "rule_id": rule_id, "action": action}
 
# Function is called here
output = set_firewall_rule(101, user="dev-ops")

AUDIT: User 'dev-ops' is calling 'set_firewall_rule' with args (101,).
AUDIT: Call finished. Returned: {'status': 'success', 'rule_id': 101, 'action': 'BLOCK'}


Correct. The positional argument 101 is captured in args. The keyword argument user="dev-ops" is captured in kwargs, and kwargs.get("user") correctly retrieves it. The decorator prints its logs before and after calling the function, which uses its default value for action.

## Advanced Decorator Practices

Question 1:
What is the primary purpose and benefit of using @functools.wraps when creating a Python decorator?


It copies metadata (such as __name__, __doc__, and __module__) from the original function to the wrapper function and sets __wrapped__, through which inspect.signature() can report the original signature. This is crucial for introspection, documentation, and debugging.

Correct. Without @wraps, calling help() or accessing __name__ on a decorated function would show information about the inner wrapper function, not the original function. @wraps fixes this by making the wrapper "look" like the original function from the outside.

Question 2:
An engineer writes a decorator to log function calls, but forgets to use @functools.wraps. What will be the output when the following script inspects the decorated function's metadata?

In [2]:
def logging_decorator(original_function):
    def wrapper_function(*args, **kwargs):
        """This is the wrapper's docstring."""
        print(f"Calling function: {original_function.__name__}")
        return original_function(*args, **kwargs)
    return wrapper_function
 
@logging_decorator
def get_user_permissions(user_id: int) -> list:
    """Returns a list of permissions for a given user."""
    return ["admin", "editor"]
 
print(f"Function name: {get_user_permissions.__name__}")
print(f"Docstring: {get_user_permissions.__doc__}")

Function name: wrapper_function
Docstring: This is the wrapper's docstring.


Correct. Because @functools.wraps was not used, the decorated get_user_permissions variable now points directly to wrapper_function. Therefore, inspecting its __name__ and __doc__ attributes reveals the metadata of the wrapper, not the original function.
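
For comparison, a sketch of the same decorator with @functools.wraps applied. The wrapper keeps its own code, but its metadata is overwritten with the original function's:

```python
import functools


def logging_decorator(original_function):
    @functools.wraps(original_function)
    def wrapper_function(*args, **kwargs):
        """This is the wrapper's docstring."""
        print(f"Calling function: {original_function.__name__}")
        return original_function(*args, **kwargs)
    return wrapper_function


@logging_decorator
def get_user_permissions(user_id: int) -> list:
    """Returns a list of permissions for a given user."""
    return ["admin", "editor"]


print(f"Function name: {get_user_permissions.__name__}")
print(f"Docstring: {get_user_permissions.__doc__}")
```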

Question 3:
When designing a decorator, you realize you need to pass a configuration value to it at definition time, like so: @retry(attempts=5). Why is an extra layer of nesting (a "decorator factory") required to achieve this?


Because a standard decorator function is only allowed to accept the function it decorates. To accept other arguments, you must call a factory function that returns the actual decorator.

Correct. A plain decorator's signature is decorator(func). The @ syntax provides only that one argument (the function being decorated). The expression @retry(attempts=5) first calls retry(attempts=5), which must return the real decorator. That returned decorator then receives the function to be decorated. This three-layer structure (factory -> decorator -> wrapper) is necessary to handle configuration.
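
A minimal sketch of the three layers for a retry decorator. Only the @retry(attempts=5) usage comes from the question; the exceptions parameter, the flaky_connect function, and the call counter are assumptions for illustration, and the sketch assumes attempts is at least 1.

```python
from functools import wraps


def retry(attempts=3, exceptions=(Exception,)):
    """Factory: receives configuration and returns the real decorator."""
    def decorator(func):
        """Decorator: receives the function being decorated."""
        @wraps(func)
        def wrapper(*args, **kwargs):
            """Wrapper: runs on every call and applies the retry logic."""
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts, let the error propagate
        return wrapper
    return decorator


calls = {"count": 0}


@retry(attempts=5, exceptions=(ConnectionError,))
def flaky_connect():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "connected"


result = flaky_connect()
print(result, calls["count"])
```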

Question 4:
This script uses a configurable decorator require_role to protect a function. Given the decorator and the function call, what is the output?

In [3]:
from functools import wraps
 
CURRENT_USER = {"username": "prod-agent", "role": "viewer"}
 
def require_role(required_role):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if CURRENT_USER.get("role") == required_role:
                print("Permission granted.")
                return func(*args, **kwargs)
            else:
                print(f"Permission denied. Requires '{required_role}'.")
                return None
        return wrapper
    return decorator
 
@require_role(required_role="admin")
def deploy_service(service_name):
    print(f"Deploying service: {service_name}")
    return "SUCCESS"
 
result = deploy_service("user-database")
print(f"Final result: {result}")

Permission denied. Requires 'admin'.
Final result: None


Correct. The @require_role(required_role="admin") call configures the decorator to check for the "admin" role. The wrapper compares this to the CURRENT_USER's role ("viewer") and finds a mismatch. It then prints the denial message and returns None, which is then printed as the final result.

Question 5:
An engineer attempts to write a decorator factory with_context that adds a context dictionary to the decorated function's keyword arguments. However, the implementation is structurally incorrect. Which of the following implementations correctly fixes the structure of the with_context decorator factory?

In [None]:
def with_context(func, context_data):
    def wrapper(*args, **kwargs):
        kwargs['context'] = context_data
        return func(*args, **kwargs)
    return wrapper
 
 
@with_context(context_data={"user_id": 123})
def process_request(request_id, *, context={}):
    print(f"Processing {request_id} for user {context['user_id']}")

def with_context(context_data):
    def decorator(func):
        def wrapper(*args, **kwargs):
            kwargs['context'] = context_data
            return func(*args, **kwargs)
        return wrapper
    return decorator

Correct. This code correctly implements the three-level structure. with_context(context_data) is the factory that accepts configuration. It returns decorator(func), which is the actual decorator that accepts the function. decorator in turn returns wrapper, which contains the runtime logic.

Question 6:
When stacking multiple decorators on a single function, the order in which they are listed matters. In which scenario is the order most critical and likely to cause bugs or unexpected behavior?

When one decorator transforms the return value (e.g., formats it as JSON) and another decorator needs to operate on that transformed value (e.g., logs the JSON string).

Correct. This is the most critical case. If a decorator expects a certain data type (e.g., a dictionary) but the decorator below it in the stack has already converted it to another type (e.g., a JSON string), a TypeError or AttributeError will occur. The execution order (top-down) means the outer decorator acts on the result of the inner decorator.
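
A sketch that makes the effect of ordering visible. The decorators to_json and record_type are hypothetical; record_type simply notes the type of the value it sees:

```python
import json
from functools import wraps

seen_types = []


def to_json(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return json.dumps(func(*args, **kwargs), sort_keys=True)
    return wrapper


def record_type(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        seen_types.append(type(result).__name__)
        return result
    return wrapper


@record_type  # outermost: sees the value after to_json has converted it
@to_json
def status_outer():
    return {"status": "ok"}


@to_json
@record_type  # innermost: sees the original dictionary
def status_inner():
    return {"status": "ok"}


a = status_outer()
b = status_inner()
print(seen_types)
```

Both functions return the same JSON string to the caller; only what record_type observes differs.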

Question 7:
Two decorators, @log_enter and @log_exit, are stacked on a function. Based on the rules of decorator application and execution, what is the exact output printed to the console?

In [4]:
from functools import wraps
 
def log_enter(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print("ENTERING")
        return func(*args, **kwargs)
 
    return wrapper
 
def log_exit(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        print("EXITING")
        return result
 
    return wrapper
 
@log_enter
@log_exit
def perform_task():
    print("  ...TASK RUNNING...")
 
perform_task()

ENTERING
  ...TASK RUNNING...
EXITING


Correct. Execution happens from the outside-in. The @log_enter decorator is outermost, so its wrapper runs first, printing "ENTERING". It then calls the next function in the chain (the one wrapped by @log_exit). The @log_exit wrapper calls the original perform_task, which prints "...TASK RUNNING...". Finally, the execution unwinds: the @log_exit wrapper prints "EXITING", and the call chain completes.

## Python Exception Handling: The Fundamentals

Question 1:
A DevOps engineer writes a Python function to check the status of a service endpoint. What is the exact output returned by the function call check_service_status(80)?

In [2]:
def check_service_status(port):
    print("Initiating check...")
    log = []
    try:
        if port == 80:
            log.append("OK")
        elif port == 503:
            raise ConnectionRefusedError("Service unavailable")
        else:
            log.append("Unknown")
    except ConnectionRefusedError:
        log.append("FAIL")
        print("Connection failed.")
    else:
        log.append("SUCCESS")
        print("Check completed without errors.")
    finally:
        log.append("LOGGED")
        print("Finalizing check.")
    return log

print(check_service_status(80))

Initiating check...
Check completed without errors.
Finalizing check.
['OK', 'SUCCESS', 'LOGGED']


Correct. When port is 80, the if port == 80: condition is met, and "OK" is appended. No exception is raised, so the except block is skipped. Because the try block completed without raising an exception, the else block is executed, appending "SUCCESS" and printing a message. The finally block always executes, appending "LOGGED" and printing its message. The final list returned is ['OK', 'SUCCESS', 'LOGGED'].

Question 2:
A script is being designed to process a list of hostnames. For each hostname, it will connect, download a configuration file, and parse it. The network is known to be unreliable, and some configuration files might be malformed or missing expected keys.

Which of the following best describes the most Pythonic and robust design philosophy for this script?

The EAFP (Easier to Ask for Forgiveness than Permission) approach, where the script directly attempts to connect, download, and parse the file inside a try block, using specific except blocks to handle ConnectionError, FileNotFoundError, or KeyError as they occur.

Correct

Correct. This is the preferred Pythonic approach for this scenario. It results in cleaner, more readable code by focusing the try block on the primary task (the "happy path"). It robustly handles specific, expected errors in separate except blocks, clearly separating the main logic from the error-handling logic.
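
A minimal sketch of the EAFP style described above. The network is simulated with a dictionary, and FAKE_NETWORK, fetch_config, and get_port are hypothetical names chosen for illustration; a real script would replace fetch_config with an actual download.

```python
import json

# Hypothetical stand-in for the network: hostname -> raw config text.
FAKE_NETWORK = {
    "web-1": '{"port": 8080}',
    "db-1": "not valid json",
    "cache-1": '{"timeout": 30}',
}

def fetch_config(hostname):
    """Simulated download; raises ConnectionError for unreachable hosts."""
    try:
        return FAKE_NETWORK[hostname]
    except KeyError:
        raise ConnectionError(f"cannot reach {hostname}") from None

def get_port(hostname):
    try:
        # Happy path only: connect, parse, read the expected key.
        raw = fetch_config(hostname)
        return json.loads(raw)["port"]
    except ConnectionError as exc:
        return f"unreachable ({exc})"
    except json.JSONDecodeError:
        return "malformed config"
    except KeyError as exc:
        return f"missing key {exc}"

results = {h: get_port(h) for h in ["web-1", "db-1", "cache-1", "app-1"]}
for host, outcome in results.items():
    print(host, "->", outcome)
```

Each failure mode gets its own except block, so the try body stays focused on the main task.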

Question 3:
You are reviewing a script that parses structured log data, which is represented as a list of dictionaries. What will be printed to the console when this script is executed?

In [3]:
def summarize_logs(log_entries):
    summary = []
    for entry in log_entries:
        try:
            # The 'user' key is optional
            user = entry.get('user', 'system')
            # The 'event_id' key is mandatory
            event_id = entry['event_id']
            summary.append(f"{event_id}:{user}")
        except KeyError:
            summary.append("ERROR:Missing-Data")
        except (TypeError, AttributeError):
            summary.append("ERROR:Invalid-Entry")
 
    return summary
 
logs = [
    {'event_id': 101, 'user': 'alice'},
    {'event_id': 102},
    None,
    {'user': 'bob'}
]
 
print(summarize_logs(logs))

['101:alice', '102:system', 'ERROR:Invalid-Entry', 'ERROR:Missing-Data']


Correct. The first two entries succeed; the second has no 'user' key, so entry.get('user', 'system') falls back to 'system'. The third entry None causes an AttributeError when the code attempts entry.get(...), as NoneType has no .get method. The except (TypeError, AttributeError) block is triggered. Result: 'ERROR:Invalid-Entry'. The fourth entry {'user': 'bob'} causes a KeyError when the code attempts entry['event_id'], as this key is missing. The except KeyError block is triggered. Result: 'ERROR:Missing-Data'.

Question 4:
A junior developer wrote the following function to retrieve a specific metric from a nested dictionary representing server monitoring data. The function call returns "Could not retrieve metric 'memory' for server 'srv-db-01'." as expected. However, what is the most significant flaw in this function's error handling design?

In [4]:

def get_metric(data, server_id, metric_name):
    try:
        value = data[server_id][metric_name]
        return f"Metric '{metric_name}' on server '{server_id}' is {value}"
    except:
        return f"Could not retrieve metric '{metric_name}' for server '{server_id}'."
 
# Example usage:
server_data = {
    "srv-web-01": {"cpu": 0.75, "memory": 0.5},
    "srv-db-01": {"cpu": 0.4}
}
print(get_metric(server_data, "srv-db-01", "memory"))

Could not retrieve metric 'memory' for server 'srv-db-01'.



Using a bare except: clause is a bad practice because it catches all exceptions, including system-level ones and programming errors, making the code difficult to debug.

Correct

Correct. This is the most critical flaw. A bare except: (equivalent to except BaseException:) catches everything, including SystemExit and KeyboardInterrupt, so it can even swallow a Ctrl+C or a deliberate sys.exit(). More importantly for a developer, it will silently hide unrelated bugs, such as a TypeError if data was accidentally passed as None. The handler should have been specific, like except KeyError:, to only catch the expected error of a missing key.
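
One possible rewrite of get_metric with a narrow handler is sketched below. It keeps the original messages and only changes which exceptions are caught, so an unrelated bug (here, passing None as data) surfaces instead of being hidden.

```python
def get_metric(data, server_id, metric_name):
    try:
        value = data[server_id][metric_name]
    except KeyError:
        # Only the expected failure: an unknown server or metric name.
        return f"Could not retrieve metric '{metric_name}' for server '{server_id}'."
    return f"Metric '{metric_name}' on server '{server_id}' is {value}"

server_data = {
    "srv-web-01": {"cpu": 0.75, "memory": 0.5},
    "srv-db-01": {"cpu": 0.4},
}
print(get_metric(server_data, "srv-db-01", "memory"))

try:
    get_metric(None, "srv-db-01", "memory")
except TypeError as exc:
    # The programming error is no longer swallowed by the handler.
    print(f"Bug surfaced: {exc}")
```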

Question 5:
What is the primary conceptual difference between a ValueError and a TypeError in Python?


ValueError occurs when an operation receives an argument of the right type but an inappropriate value, whereas TypeError occurs when the argument's type itself is invalid for the operation.

Correct

This is the core distinction. For int("abc"), the int() function is given a str (the correct type), but the value "abc" is inappropriate. This is a ValueError. For len(123), the len() function is given an int, which is the wrong type entirely; it expects a sequence or collection. This is a TypeError.
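
The two cases can be checked directly. The helper classify below is a hypothetical name used only for this demonstration.

```python
def classify(operation):
    """Run a zero-argument callable and name the exception it raises, if any."""
    try:
        operation()
    except ValueError:
        return "ValueError"
    except TypeError:
        return "TypeError"
    return "no error"

print(classify(lambda: int("abc")))  # right type (str), inappropriate value
print(classify(lambda: len(123)))    # wrong type: an int has no length
print(classify(lambda: int("42")))   # right type, valid value
```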

## Raising and Defining Custom Exceptions in Python

Question 1:
You are writing a function get_user_profile(user_id) that fetches user data from a remote database. Under which circumstances is it most appropriate to raise an exception rather than return None?


When the database connection times out or is refused, the function should raise ConnectionError.

Correct

Correct. A connection failure is an unexpected, exceptional event that prevents the function from fulfilling its purpose. The function itself cannot resolve this issue. Raising an exception is the correct way to signal this failure to the caller, so it can be logged, retried, or handled appropriately at a higher level.
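
A small sketch of that split, under two assumptions not stated in the quiz: the database is simulated by a hypothetical USERS dictionary, and a user that simply does not exist is treated as a normal result (None) rather than an error.

```python
# Hypothetical in-memory stand-in for the remote database.
USERS = {42: {"name": "alice"}}

def get_user_profile(user_id, db_online=True):
    if not db_online:
        # The function cannot do its job at all: signal it to the caller.
        raise ConnectionError("database connection refused")
    # Assumed convention: a missing user is an expected outcome, not an error.
    return USERS.get(user_id)

print(get_user_profile(42))
print(get_user_profile(7))
try:
    get_user_profile(42, db_online=False)
except ConnectionError as exc:
    print(f"Caller decides what to do: {exc}")
```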

Question 2:
A function is written to validate a deployment configuration. Analyze its behavior with the given inputs. What is the output of this script?

In [5]:
def validate_config(config_data):
    if not isinstance(config_data, dict):
        raise TypeError("Configuration must be a dictionary.")
 
    mem_limit = config_data.get("memory_limit_mb")
    if mem_limit is None:
        raise ValueError("Mandatory key 'memory_limit_mb' is missing.")
 
    if not (isinstance(mem_limit, int) and 256 <= mem_limit <= 4096):
        raise ValueError(f"Memory limit {mem_limit} is outside the allowed range (256-4096).")
 
    print(f"Config valid: {mem_limit}MB")
 
 
configs = [
    {"memory_limit_mb": 1024},
    {"memory_limit_mb": 128},
    {"cpu_cores": 4},
    "memory_limit_mb: 512"
]
 
for config in configs:
    try:
        validate_config(config)
    except (ValueError, TypeError) as e:
        print(f"Error: {e}")

Config valid: 1024MB
Error: Memory limit 128 is outside the allowed range (256-4096).
Error: Mandatory key 'memory_limit_mb' is missing.
Error: Configuration must be a dictionary.


{"memory_limit_mb": 1024}: Passes all checks. Prints "Config valid...". {"memory_limit_mb": 128}: Fails the range check 256 <= 128. A ValueError is raised and caught. {"cpu_cores": 4}: mem_limit is None. A ValueError is raised due to the missing key and caught. "memory_limit_mb: 512": This is a string, not a dictionary. The first isinstance check fails, raising a TypeError which is caught.

Question 3:
Which of the following is the most significant advantage of defining a custom exception hierarchy, such as a base ServiceError with subclasses AuthenticationError and QuotaExceededError?

It allows the calling code to use a single except ServiceError: block to handle all service-related failures, or to use specific except AuthenticationError: blocks for targeted recovery logic.

Correct

This is the primary benefit. A custom hierarchy provides flexibility. Callers can handle errors at different levels of granularity: catch a specific subclass to retry or refresh a token, or catch the base class to perform generic logging and cleanup for any related failure.
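
The hierarchy from the question might look like the sketch below. The call_service and handle functions are hypothetical helpers added to show the two levels of granularity.

```python
class ServiceError(Exception):
    """Base class for all service-related failures."""

class AuthenticationError(ServiceError):
    """Credentials were rejected."""

class QuotaExceededError(ServiceError):
    """The caller ran out of quota."""

def call_service(kind):
    # Hypothetical service call that fails in a chosen way.
    if kind == "auth":
        raise AuthenticationError("token expired")
    if kind == "quota":
        raise QuotaExceededError("limit reached")

def handle(kind):
    try:
        call_service(kind)
    except AuthenticationError:
        return "refresh token and retry"      # targeted recovery
    except ServiceError as exc:
        return f"generic failure: {type(exc).__name__}"  # catch-all for the family
    return "ok"

for kind in ("auth", "quota", "ok"):
    print(kind, "->", handle(kind))
```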

Question 4:
A script uses a custom exception hierarchy to manage errors during a file provisioning process. What is the output of this script?

In [6]:
class ProvisionerError(Exception):
    """Base class for provisioning failures."""
    pass
 
class DiskSpaceError(ProvisionerError):
    """Raised when there is not enough disk space."""
    def __init__(self, required, available):
        super().__init__(f"Not enough disk. Required: {required}GB, Available: {available}GB")
 
class PermissionsError(ProvisionerError):
    """Raised due to file system permission issues."""
    pass
 
def provision_file(size_gb, path):
    if size_gb > 100:
        raise DiskSpaceError(required=size_gb, available=100)
    if "/root/" in path:
        raise PermissionsError(f"Cannot write to protected path: {path}")
    print("Provisioning successful.")
 
try:
    provision_file(size_gb=50, path="/root/data.bin")
except DiskSpaceError as e:
    print(f"Caught Disk Error: {e}")
except ProvisionerError as e:
    print(f"Caught Provisioner Error: {e}")
except Exception:
    print("Caught a generic exception.")

Caught Provisioner Error: Cannot write to protected path: /root/data.bin


Correct. The function raises a PermissionsError. The first except block (except DiskSpaceError) does not match. The second except block (except ProvisionerError) does match, because PermissionsError is a subclass of ProvisionerError. Execution enters this block and prints the message.

Question 5:
You are developing a parser for a custom configuration file format. You've created a special exception to handle syntax errors. What will be printed to the console after the following code executes?

In [7]:
class ConfigSyntaxError(ValueError):
    def __init__(self, message, line_num, text):
        full_msg = f"Syntax error on line {line_num}: {message}"
        super().__init__(full_msg)
        self.line = line_num
        self.text = text
 
def parse_config(lines):
    for i, line in enumerate(lines, 1):
        if "=" not in line:
            raise ConfigSyntaxError("Missing '=' assignment", i, line)
    return "Parsed OK"
 
config_text = ["host=server.local", "port", "timeout=30"]
 
try:
    parse_config(config_text)
except ConfigSyntaxError as e:
    print(e)
    print(f"-> Problematic text: '{e.text}'")

Syntax error on line 2: Missing '=' assignment
-> Problematic text: 'port'


Correct. The parse_config function finds that the second line ("port") does not contain an =. It raises a ConfigSyntaxError, passing the message, line number (2), and the text. The except block catches it. The first print(e) prints the message generated by super().__init__(). The second print statement accesses the custom .text attribute on the exception object.

Question 6:
Analyze the following function signature and its internal logic.



def process_data_files(file_paths):
    # (function implementation)


Which of the following implementations best demonstrates the principle of raising an exception for a true error state versus handling a valid-but-empty edge case gracefully?

if not isinstance(file_paths, list):
    raise TypeError("Input must be a list of file paths.")
if not file_paths:
    print("No files to process.")
    return
# ... process files ...
Correct

This implementation correctly identifies two different situations. An input that is not a list is a contract violation (a true error), which is correctly handled by raising a TypeError. An empty list is a valid edge case that is handled gracefully by printing a message and exiting.

## Resource Management with Context Managers

Question 1:
A DevOps script needs to open a log file, write several status updates to it, and ensure the file is always closed, even if a network error interrupts the script mid-operation.

Which of the following two code blocks represents the best practice for this task in Python, and why?

In [None]:
Block A:

f = open("update.log", "w")
try:
    f.write("Starting...\n")
    # ... network operations that might fail ...
    f.write("Finished.\n")
finally:
    f.close()


Block B:

with open("update.log", "w") as f:
    f.write("Starting...\n")
    # ... network operations that might fail ...
    f.write("Finished.\n")


Block B is better because the with statement is more concise, less error-prone, and guarantees the resource (f) is cleaned up (closed) upon exiting the block for any reason.

Correct

The with statement is the Pythonic best practice for resource management. It encapsulates the try...finally logic, making the code cleaner, more readable, and safer by guaranteeing that the resource's teardown logic (in this case, f.close()) is automatically executed.

Question 2:
A developer writes a custom context manager to temporarily switch the current working directory for a script. However, they notice that if an error occurs during the file operations, the script does not revert to the original directory. What is the primary bug in the temp_directory context manager?

In [None]:
from contextlib import contextmanager
import os
 
@contextmanager
def temp_directory(path):
    original_dir = os.getcwd()
    os.makedirs(path, exist_ok=True)
    os.chdir(path)
    yield
    os.chdir(original_dir)
    os.rmdir(path)
    print(f"Reverted to {original_dir}")
 
try:
    with temp_directory("./local_temp_folder"):
        print(f"Now in: {os.getcwd()}")
        result = 1 / 0  # Simulate an error
except ZeroDivisionError:
    print("An error occurred.")
 
print(f"Final directory: {os.getcwd()}")


The teardown logic (os.chdir(original_dir)) is not placed inside a finally block, so it is skipped when an exception is raised in the with block.

Correct

Correct. This is the critical bug. When the ZeroDivisionError occurs inside the with block, @contextmanager re-raises it inside the generator at the yield statement. Because the yield is not wrapped in try...finally, the exception propagates straight out of the generator and the lines after the yield never run. The fix is to wrap the yield in a try...finally block so the cleanup code is guaranteed to run.
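
A corrected version could look like the following sketch. To keep the demonstration from touching the real working tree, it creates the folder under a temporary directory (base and target are names introduced here); the context manager itself only differs from the original by the try...finally around the yield.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def temp_directory(path):
    original_dir = os.getcwd()
    os.makedirs(path, exist_ok=True)
    os.chdir(path)
    try:
        yield
    finally:
        # Runs whether the with block finished normally or raised.
        os.chdir(original_dir)
        os.rmdir(path)  # only succeeds if the directory is empty

base = tempfile.mkdtemp()
target = os.path.join(base, "local_temp_folder")
start_dir = os.getcwd()

try:
    with temp_directory(target):
        result = 1 / 0  # Simulate an error
except ZeroDivisionError:
    print("An error occurred.")

print(f"Back in original directory: {os.getcwd() == start_dir}")
os.rmdir(base)
```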

Question 3:
Analyze the following custom context manager class designed to handle database transactions. It is designed to suppress IntegrityError (e.g., duplicate key) while letting other errors propagate. What is the exact output of this script?

In [8]:
class DatabaseTransaction:
    def __enter__(self):
        print("BEGIN TRANSACTION")
        return self
 
    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            print("COMMIT")
            return False  # Propagate no exception
        elif exc_type is IntegrityError:
            print("ROLLBACK: Ignoring duplicate key.")
            return True  # Suppress this specific exception
        else:
            print("ROLLBACK: An unexpected error occurred.")
            return False  # Propagate other exceptions
 
 
class IntegrityError(Exception):
    pass
 
 
class NetworkError(Exception):
    pass
 
try:
    with DatabaseTransaction():
        raise NetworkError("Connection lost")
except NetworkError as e:
    print(f"Handled at top level: {e}")

BEGIN TRANSACTION
ROLLBACK: An unexpected error occurred.
Handled at top level: Connection lost


Correct. __enter__ is called, printing "BEGIN TRANSACTION". A NetworkError is raised inside the with block. __exit__ is called. exc_type is NetworkError. It doesn't match IntegrityError, so it falls to the else clause, printing "ROLLBACK: An unexpected error occurred.". __exit__ returns False, so the NetworkError is re-raised and propagated outside the with statement. The outer except NetworkError block catches it and prints the final message.
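
The suppressed path is not exercised by the script above. The reduced sketch below (a hypothetical Transaction class that records events in a list instead of printing) shows what happens when __exit__ returns True. It uses issubclass rather than an identity check, which is a variation on the original that also matches subclasses of IntegrityError.

```python
class IntegrityError(Exception):
    pass

class Transaction:
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("BEGIN")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            self.events.append("COMMIT")
            return False
        self.events.append("ROLLBACK")
        # A true return value tells Python to swallow the exception.
        return issubclass(exc_type, IntegrityError)

tx = Transaction()
with tx:
    raise IntegrityError("duplicate key")

# Execution continues here because the exception was suppressed.
after = "reached"
print(tx.events, after)
```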

Question 4:
When designing a custom context manager, what is the key difference in purpose between implementing it as a class with __enter__/__exit__ methods versus using the @contextlib.contextmanager decorator on a generator?

The class-based approach is generally better for managing complex state or when the setup/teardown logic is substantial, while the decorator is more suitable for simpler, more linear setup/teardown tasks.

Correct

Correct. This accurately describes the trade-off. A class provides a natural way to manage complex state through its attributes (self). For simple cases like temporarily changing a directory or setting an environment variable, the decorator is more concise and requires less boilerplate.
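
For the simple, linear case mentioned above, a generator-based manager can be quite short. The sketch below temporarily sets an environment variable; temp_env and DEMO_MODE are hypothetical names.

```python
import os
from contextlib import contextmanager

@contextmanager
def temp_env(name, value):
    previous = os.environ.get(name)
    os.environ[name] = value          # setup
    try:
        yield
    finally:                          # teardown
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous

with temp_env("DEMO_MODE", "1"):
    print(os.environ["DEMO_MODE"])
print("DEMO_MODE" in os.environ)
```

A class-based version would need __init__, __enter__, and __exit__ to do the same job, which is reasonable overhead only when there is more state to track.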

Question 5:
Considering the following code, what is the purpose of the value returned by the __enter__ method in a class-based context manager?

In [None]:
class ApiSessionManager:
    def __init__(self, api_key):
        self.key = api_key
        self.session_id = None
 
    def __enter__(self):
        print("Starting API session...")
        self.session_id = "xyz-123"
        return self # Returns the instance itself
 
    def __exit__(self, exc_type, exc_value, traceback):
        print("Closing API session...")
        self.session_id = None
 
with ApiSessionManager("my-secret-key") as session:
    print(f"Using session ID: {session.session_id}")

It is the value that gets assigned to the variable specified in the as clause of the with statement.

Correct

Correct. The with ... as var: syntax captures the return value of the __enter__ method in the variable var. In this example, session becomes a reference to the ApiSessionManager instance, allowing access to its attributes like session_id.
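
Returning self is a common choice but not a requirement. In the sketch below, a hypothetical CursorProvider returns a different object from __enter__, and that object, not the manager, is what the as clause binds.

```python
class CursorProvider:
    def __enter__(self):
        # Whatever is returned here is bound by "as".
        return "cursor-object"

    def __exit__(self, exc_type, exc_value, traceback):
        return False

manager = CursorProvider()
with manager as handle:
    bound = handle

print(bound)
print(bound is manager)
```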

## Python Logging: Core Concepts and Mechanics

Question 1:
A DevOps team is developing a long-running Python service to manage cloud infrastructure. For debugging and auditing, they need to record events such as service startup, configuration reloads, and connection failures. Why is using Python's logging module a better practice than using print() statements for this purpose?


logging allows developers to categorize messages by severity (e.g., DEBUG, INFO, WARNING, ERROR), which can be filtered and routed to different destinations, whereas print() treats all messages with the same importance.

Correct

Correct. This is the core advantage. Log levels allow for granular control over log verbosity. In production, a team might only want to see INFO and above, but in a staging environment, they can enable DEBUG logs for deep diagnostics, all without changing the application code. This is impossible to achieve cleanly with print().

Question 2:
Which log level is most appropriate for recording a user's password during a failed login attempt for debugging purposes?

Passwords and other sensitive credentials should never be logged in plain text, regardless of the log level.

Correct

Correct. This is the fundamental security best practice. Logging sensitive information like passwords, API keys, or personal data creates a major security risk. If you need to confirm a value was received, you can log a confirmation like password_provided: True or a salted hash, but never the raw value.
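
A sketch of the confirmation approach, logging only whether a password was supplied. ListHandler, auth_example, and record_failed_login are hypothetical names, and the handler simply collects formatted messages so the result can be inspected.

```python
import logging

class ListHandler(logging.Handler):
    """Collects formatted messages in memory for inspection."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))

auth_logger = logging.getLogger("auth_example")
auth_logger.setLevel(logging.INFO)
auth_logger.propagate = False
capture = ListHandler()
auth_logger.addHandler(capture)

def record_failed_login(username, password):
    # Log the fact that a password arrived, never the password itself.
    auth_logger.warning(
        "Login failed for user=%s password_provided=%s", username, bool(password)
    )

record_failed_login("alice", "s3cret!")
print(capture.messages[0])
```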

Question 3:
You are configuring a logger for a script that monitors system metrics. You want to see only high-severity alerts on the console. What will be printed to the console when this script is executed?

In [None]:
import logging
import sys
 
monitor_logger = logging.getLogger("system.monitor")
monitor_logger.setLevel(logging.INFO)
 
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(logging.ERROR)
console_handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
 
monitor_logger.addHandler(console_handler)
 
monitor_logger.info("CPU usage is at 25%.")
monitor_logger.warning("Memory usage is at 85%.")
monitor_logger.error("Disk space is critically low.")

ERROR:system.monitor:Disk space is critically low.
Correct

Correct. This demonstrates two-stage filtering. The logger's level is INFO, so it accepts INFO, WARNING, and ERROR messages. These messages are passed to the handler. The handler's level is ERROR. It will discard any message with a severity lower than ERROR. Therefore, the INFO and WARNING messages are discarded by the handler, and only the ERROR message is processed and printed.

Question 4:
A junior developer writes a script to log important events. After running it, they see that the audit.log file is created, but it is empty. The expected message "User 'admin' logged in." does not appear in the file. What is the fundamental reason the log message is not written to the file?

In [None]:
import logging
 
def setup_auditing():
    audit_logger = logging.getLogger("audit_trail")
    audit_logger.setLevel(logging.INFO)
 
    audit_handler = logging.FileHandler("audit.log")
    audit_handler.setLevel(logging.INFO)
 
setup_auditing()
 
logger = logging.getLogger("audit_trail")
logger.info("User 'admin' logged in.")

A logger without any handlers attached to it will not pass log records to any destination. The audit_handler was never attached using audit_logger.addHandler().

Correct

Correct. This is the core issue. A logger's job is to pass log records to its handlers. In the code, audit_handler is created (which is why audit.log exists) but never registered with audit_logger. The record propagates up to the root logger, which has no handlers either, and the module's last-resort handler only emits WARNING and above, so the INFO message is discarded.
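
A corrected setup might look like this sketch. The only functional change is the addHandler call; the log_path parameter and the temporary directory are additions made here so the example does not write into the current directory.

```python
import logging
import os
import tempfile

def setup_auditing(log_path):
    audit_logger = logging.getLogger("audit_trail")
    audit_logger.setLevel(logging.INFO)

    audit_handler = logging.FileHandler(log_path)
    audit_handler.setLevel(logging.INFO)
    audit_logger.addHandler(audit_handler)  # the missing step
    return audit_handler

log_path = os.path.join(tempfile.mkdtemp(), "audit.log")
handler = setup_auditing(log_path)

logger = logging.getLogger("audit_trail")
logger.info("User 'admin' logged in.")

# Tidy up so the file can be read back reliably.
logger.removeHandler(handler)
handler.close()
with open(log_path) as f:
    contents = f.read()
print(contents, end="")
```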

Question 5:
In the architecture of Python's logging module, what are the primary, distinct roles of the Logger, Handler, and Formatter components?

Logger: Provides the main entry point for code to emit messages. Handler: Directs log records to a specific destination. Formatter: Defines the string layout of the final log record.

Correct

Correct. This correctly describes the separation of concerns. Logger: The interface the application code uses (logging.getLogger(...)). Handler: Responsible for the output destination (e.g., StreamHandler for console, FileHandler for files). Formatter: Responsible for the appearance of the log message (e.g., adding a timestamp, level name).

## Practical Logging: File Handlers and Structured JSON Output

Question 1:
A service running on a server generates approximately 1 GB of log data per day. A junior engineer suggests logging everything to a single service.log file. What is the primary reason for using a RotatingFileHandler or TimedRotatingFileHandler instead of a single, ever-growing file?


Rotating logs into smaller, manageable chunks (e.g., daily or by size) prevents single files from consuming excessive disk space and makes it much easier to archive, search, or delete old logs.

Correct

Correct. This is the core benefit. A single 30 GB log file is difficult to open, search, or transfer. A collection of 30 daily 1 GB files is far more manageable. It simplifies log lifecycle management (e.g., "delete logs older than 30 days") and prevents a single file from filling a disk partition.

Question 2:
A script is configured to log status messages with size-based rotation. The maxBytes is set very low to demonstrate the rotation behavior. After the script completes, what files will exist and what will be the content of app.log.2?

In [None]:
import logging
import logging.handlers
import os
 
# --- Cleanup for predictable test runs ---
for f in os.listdir('.'):
    if f.startswith('app.log'):
        os.remove(f)
 
# --- Logger Setup ---
logger = logging.getLogger('rotation_test')
logger.setLevel(logging.INFO)
 
formatter = logging.Formatter('%(message)s')
 
handler = logging.handlers.RotatingFileHandler(
    'app.log', maxBytes=50, backupCount=2)
handler.setFormatter(formatter)
logger.addHandler(handler)
 
# --- Logging Calls ---
for i in range(7):
    logger.info(f"Log entry number {i+1}...")

Files app.log, app.log.1, app.log.2. The content of app.log.2 will be Log entry number 3... followed by Log entry number 4....

Correct

Correct. Each formatted record is 22 bytes (21 characters plus a newline), and the handler rotates before writing whenever the current file size plus the new record would reach maxBytes (50). Entries 1 and 2 (44 bytes) go into app.log. Entry 3 would bring the file to 66 bytes, so it triggers rotation: app.log is renamed to app.log.1, and a new app.log is created and gets entry 3. Entry 4 goes into the new app.log. Entry 5 triggers rotation: app.log.1 (containing 1 & 2) is renamed to app.log.2, the current app.log (containing 3 & 4) is renamed to app.log.1, and a new app.log gets entry 5. Entry 6 goes into the new app.log. Entry 7 triggers rotation: app.log.2 (containing 1 & 2) is deleted because backupCount is 2, app.log.1 (containing 3 & 4) is renamed to app.log.2, app.log (containing 5 & 6) is renamed to app.log.1, and a new app.log gets entry 7.
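
The walkthrough can be checked by running the same configuration in a scratch directory and reading the files back. This sketch uses a temporary directory and a hypothetical logger name (rotation_check) so it does not interfere with the original script.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

logger = logging.getLogger("rotation_check")
logger.setLevel(logging.INFO)
logger.propagate = False

handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=50, backupCount=2)
handler.setFormatter(logging.Formatter("%(message)s"))
logger.addHandler(handler)

for i in range(7):
    logger.info(f"Log entry number {i+1}...")

logger.removeHandler(handler)
handler.close()

# Map each file name to the list of lines it contains.
contents = {}
for name in sorted(os.listdir(log_dir)):
    with open(os.path.join(log_dir, name)) as f:
        contents[name] = f.read().splitlines()

for name, lines in contents.items():
    print(name, lines)
```

The result should show entry 7 in app.log, entries 5 and 6 in app.log.1, and entries 3 and 4 in app.log.2.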

Question 3:
A developer is trying to set up a logger that rotates its log file every hour. However, the script crashes at startup with an AttributeError. What is the cause of the AttributeError and how should it be fixed?

In [None]:
import logging
import time
 
logger = logging.getLogger('hourly_reporter')
logger.setLevel(logging.INFO)
 
handler = logging.TimedRotatingFileHandler('reporter.log', when='h', interval=1)
 
logger.addHandler(handler)
logger.info("Reporter starting up.")


The TimedRotatingFileHandler class is not in the top-level logging module. It must be imported from the logging.handlers submodule.

Correct

Correct. This is the precise bug. Specialized handlers like TimedRotatingFileHandler and RotatingFileHandler reside in a submodule. The code should be import logging.handlers and handler = logging.handlers.TimedRotatingFileHandler(...). The AttributeError occurs because Python cannot find TimedRotatingFileHandler as an attribute of the main logging module.
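
The corrected script could be sketched as follows. The import of logging.handlers and the qualified class name are the fix; writing to a temporary directory is an addition made here to keep the example self-contained.

```python
import logging
import logging.handlers  # the submodule must be imported explicitly
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "reporter.log")

logger = logging.getLogger("hourly_reporter")
logger.setLevel(logging.INFO)

handler = logging.handlers.TimedRotatingFileHandler(
    log_path, when="h", interval=1)

logger.addHandler(handler)
logger.info("Reporter starting up.")

logger.removeHandler(handler)
handler.close()
with open(log_path) as f:
    contents = f.read()
print(contents, end="")
```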

Question 4:
Your team is building a microservices-based application where logs from dozens of services are sent to a central log aggregation platform like Splunk or Elasticsearch. Why is structured logging (e.g., in JSON format) strongly preferred over plain-text logging in this environment?


Structured logs are human-readable key-value pairs that can be reliably parsed by machines, allowing for powerful filtering, querying, and aggregation. Plain-text logs require brittle and slow regular expressions to parse.

Correct

This is the primary advantage. With structured logs, fields like level, user_id, or request_id are distinct data points. This allows the aggregation platform to index them for fast, reliable queries. Parsing plain text is inefficient and breaks every time a developer changes the log message format.
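
To make the idea concrete without a third-party dependency, the sketch below uses a hand-written formatter (MinimalJsonFormatter, a hypothetical stand-in for a library such as python-json-logger) that emits one JSON object per record. The point is that a consumer can read fields with a JSON parser instead of a regular expression.

```python
import io
import json
import logging

class MinimalJsonFormatter(logging.Formatter):
    """Illustrative only: serialize a few record fields as JSON."""
    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(MinimalJsonFormatter())

logger = logging.getLogger("structured_example")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.info("User %s logged in", "admin")

# The consumer side: fields are recovered by key, not by pattern matching.
parsed = json.loads(stream.getvalue())
print(parsed["level"], parsed["message"])
```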

Question 5:
You are setting up a JSON logger to include a custom request ID and to rename some default fields for clarity. What is the exact JSON output?

In [1]:

import logging
import sys
from pythonjsonlogger.json import JsonFormatter
 
logger = logging.getLogger('api_gateway')
logger.setLevel(logging.INFO)
logger.handlers.clear()
 
handler = logging.StreamHandler(sys.stdout)
formatter = JsonFormatter(
    '%(levelname) %(name) %(message)s',
    rename_fields={'levelname': 'severity', 'name': 'logger_name'}
)
handler.setFormatter(formatter)
logger.addHandler(handler)
 
logger.info(
    'Incoming request processed',
    extra={'request_id': 'abc-xyz-789', 'method': 'GET'}
)

{"severity": "INFO", "logger_name": "api_gateway", "message": "Incoming request processed", "request_id": "abc-xyz-789", "method": "GET"}


Correct

Correct. The JsonFormatter creates a JSON object where: levelname is renamed to severity and has the value "INFO". name is renamed to logger_name and has the value "api_gateway". message has the value "Incoming request processed". All key-value pairs from the extra dictionary are added as top-level fields.

Question 6:
A developer is trying to add a task_id and worker_id to their JSON logs for better traceability. However, these fields are not appearing in the final log output. Why are the task_id and worker_id fields missing from the JSON output?

In [None]:
import logging
import sys
from pythonjsonlogger.json import JsonFormatter
 
logger = logging.getLogger("worker_pool")
logger.setLevel(logging.INFO)
logger.handlers.clear()
 
handler = logging.StreamHandler(sys.stdout)
formatter = JsonFormatter()
handler.setFormatter(formatter)
logger.addHandler(handler)
 
context_data = {"task_id": "t-456", "worker_id": "w-03"}
 
logger.info("Task completed successfully", context_data)


The context dictionary must be passed as the extra keyword argument, not as a positional argument. The correct call is logger.info('...', extra=context_data).

Correct

The logging methods (info, warning, etc.) interpret the second positional argument as data for string formatting within the message itself (e.g., logger.info("User %s logged in", username)). To pass a dictionary of context to be added to the log record, it must be passed using the extra keyword argument.
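
The difference is visible on the LogRecord itself, independent of any JSON formatter. The sketch below uses only the standard library; CaptureHandler and the logger name worker_pool_example are hypothetical.

```python
import logging

class CaptureHandler(logging.Handler):
    """Keeps the raw LogRecord objects for inspection."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

logger = logging.getLogger("worker_pool_example")
logger.setLevel(logging.INFO)
logger.propagate = False
capture = CaptureHandler()
logger.addHandler(capture)

context_data = {"task_id": "t-456", "worker_id": "w-03"}

# Positional: the dict is stored as formatting arguments for the message.
logger.info("Task completed successfully", context_data)
# Keyword: each key becomes an attribute on the record.
logger.info("Task completed successfully", extra=context_data)

positional, keyword = capture.records
print(hasattr(positional, "task_id"))
print(keyword.task_id, keyword.worker_id)
```

A formatter that builds its output from record attributes can only see the fields in the second call.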

## Declarative and Dynamic Logging Configuration

Question 1:
What is the primary architectural advantage of using a declarative configuration method (like logging.config.dictConfig or fileConfig) over an imperative one (programmatically creating and attaching handlers/formatters)?


It separates the logging configuration (the "what") from the application's business logic (the "how"), allowing the logging setup to be modified without changing the application code.

Correct

Correct. This is the core principle. By externalizing the configuration into a file (like a .ini or .json), an operations team can adjust log levels, change output destinations, or modify formats for a running application just by changing the config file and reloading, without needing a developer to edit and redeploy the Python code.

Question 2:
An application uses the following logging.ini file for its configuration. Given this configuration, what happens when logging.getLogger("app.tasks").info("Task started") is called?

In [None]:
[loggers]
keys=root,tasks
 
[handlers]
keys=console,file
 
[formatters]
keys=simple
 
[logger_root]
level=WARNING
handlers=console
 
[logger_tasks]
level=INFO
handlers=file
qualname=app.tasks
propagate=0
 
[handler_console]
class=StreamHandler
level=WARNING
formatter=simple
args=(sys.stdout,)
 
[handler_file]
class=FileHandler
level=DEBUG
formatter=simple
args=('task.log', 'w')
 
[formatter_simple]
format=%(name)s:%(levelname)s:%(message)s

The message is written only to the task.log file.

Correct

Correct. The app.tasks logger is configured with a level of INFO, so it accepts the message. It sends the message to its configured handler, file, which writes to task.log. The propagate=0 setting stops the message from continuing up to the root logger.
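
The effect of the propagate flag can be reproduced in code. This sketch sets up a parent and child logger imperatively (demo_app and demo_app.tasks are hypothetical names, used instead of the root logger to avoid changing global state) and toggles propagation between two calls.

```python
import logging

class ListHandler(logging.Handler):
    """Collects message text in memory."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

parent = logging.getLogger("demo_app")
parent.propagate = False
parent_seen = ListHandler()
parent.addHandler(parent_seen)

child = logging.getLogger("demo_app.tasks")
child.setLevel(logging.INFO)
child_seen = ListHandler()
child.addHandler(child_seen)

child.propagate = False          # equivalent to propagate=0 in the INI file
child.info("Task started")

child.propagate = True           # now records also travel up to the parent
child.info("Task retried")

print(child_seen.messages)
print(parent_seen.messages)
```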

Question 3:
When using logging.config.dictConfig, what is the purpose of the top-level disable_existing_loggers key?


If True (the default), it disables all loggers that existed before the dictConfig call, unless they or their ancestors are explicitly named in the new configuration. This helps ensure a clean, predictable logging setup.

Correct

Correct. This is the precise function. It's a safety measure to prevent old, imperatively configured loggers (e.g., from a third-party library that was imported earlier) from interfering with the new, declarative setup. Setting it to False is necessary if you intend to merge the new configuration with existing ones.
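
The behavior can be observed through a logger's disabled attribute. In this sketch, third_party_lib is a hypothetical name standing in for a logger created before configuration. Note that dictConfig also resets existing handlers, so this is best tried in a scratch session.

```python
import logging
import logging.config

# A logger that exists before any configuration is applied.
early = logging.getLogger("third_party_lib")

# disable_existing_loggers is omitted, so it defaults to True.
logging.config.dictConfig({"version": 1})
disabled_after_default = early.disabled

# Explicitly opting out leaves pre-existing loggers enabled.
logging.config.dictConfig({"version": 1, "disable_existing_loggers": False})
disabled_after_opt_out = early.disabled

print(disabled_after_default, disabled_after_opt_out)
```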

Question 4:
An application needs to dynamically set the log level based on an environment variable. If DEBUG_MODE is active, all loggers should be set to DEBUG; otherwise, they should use their specified levels. What is the output of this script?

In [2]:
import logging
import logging.config
 
def get_log_config(is_debug):
    config = {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {"simple": {"format": "%(message)s"}},
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "level": "INFO",
                "formatter": "simple"
            }
        },
        "loggers": {
            "service_a": {"level": "WARNING", "handlers": ["console"]},
            "service_b": {"level": "ERROR", "handlers": ["console"]}
        }
    }
    if is_debug:
        for logger_name in config["loggers"]:
            config["loggers"][logger_name]["level"] = "DEBUG"
        config["handlers"]["console"]["level"] = "DEBUG"
 
    return config
 
logging.config.dictConfig(get_log_config(is_debug=True))
logging.getLogger("service_a").debug("Service A detailed trace.")
logging.getLogger("service_b").info("Service B started.")

Service A detailed trace.
Service B started.


Correct. When is_debug=True, the configuration is modified at runtime. The levels for service_a and service_b loggers are both changed to DEBUG, and the console handler's level is also changed to DEBUG. Consequently, both the DEBUG message from service_a and the INFO message from service_b are accepted by their respective loggers and by the handler, and are printed.

Question 5:
Which configuration format, INI-style (fileConfig) or Dictionary-style (dictConfig), offers more flexibility for complex and dynamic logging setups, and why?

Dictionary-style, because it can natively represent complex data types (lists, nested dictionaries), can be constructed programmatically, and easily serialized/deserialized from expressive formats like JSON or YAML.

Correct

Correct. The dictionary schema is superior for complex setups. It allows for non-string values (e.g., for custom filters or handlers that take complex arguments), can be built and modified in Python code before being applied, and maps directly to modern data interchange formats like JSON and YAML.
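As an illustration of that round trip, the sketch below parses a JSON document into a dictionary, adjusts one value in code, and applies it. The JSON content is a made-up example (it reuses the app.tasks logger name from Question 2), not a configuration taken from a real project.

```python
import json
import logging
import logging.config

# Hypothetical JSON document, as it might be stored in a logging.json file.
config_json = """
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {"simple": {"format": "%(name)s:%(levelname)s:%(message)s"}},
  "handlers": {
    "console": {"class": "logging.StreamHandler", "formatter": "simple"}
  },
  "loggers": {
    "app.tasks": {"level": "INFO", "handlers": ["console"], "propagate": false}
  }
}
"""

config = json.loads(config_json)
# The parsed config is an ordinary dict, so it can be adjusted before use.
config["loggers"]["app.tasks"]["level"] = "DEBUG"
logging.config.dictConfig(config)

tasks_logger = logging.getLogger("app.tasks")
tasks_logger.debug("visible because the level was lowered in code")
```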

## File System Interaction and I/O

Question 1:
In modern Python (3.6+), why is using the pathlib module generally preferred over the older os.path module for filesystem path manipulations?


pathlib provides an object-oriented API where paths are objects with methods, making code more readable and expressive compared to the string-based functional approach of os.path.

Correct

Correct. This is the core reason. With pathlib, a path is an object, not just a string. You can call methods directly on it (e.g., p.exists(), p.is_dir()) and use operators like / for joining paths. This leads to cleaner, more intuitive, and less error-prone code than calling separate functions like os.path.exists(p) or os.path.join(p, "dir").
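A side-by-side sketch of the two styles. It uses posixpath (the POSIX implementation behind os.path) and PurePosixPath so the result is the same on any operating system; the directory and file names are illustrative.

```python
import posixpath
from pathlib import PurePosixPath

base = "/opt/deployments"

# Functional style: paths are strings passed through helper functions.
old_style = posixpath.join(base, "artifacts", "app.tar.gz")
old_name = posixpath.basename(old_style)

# Object style: the path is an object, joined with the / operator.
new_style = PurePosixPath(base) / "artifacts" / "app.tar.gz"
new_name = new_style.name

print(old_style, old_name)
print(new_style, new_name)
```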

Question 2:
A DevOps script needs to construct a path to a deployment artifact and print its components. Analyze the following script. What is the exact output of this script?

In [1]:
from pathlib import Path
 
base_dir = Path("/opt/deployments/artifacts")
artifact_file = "app-v2.1.tar.gz"
 
full_path = base_dir / artifact_file
 
print(f"Name: {full_path.name}")
print(f"Parent: {full_path.parent.name}")
print(f"Stem: {full_path.stem}")
print(f"Suffix: {full_path.suffix}")

Name: app-v2.1.tar.gz
Parent: artifacts
Stem: app-v2.1.tar
Suffix: .gz


Correct. .name is the full final path component: app-v2.1.tar.gz. .parent is the path to the directory containing the file (/opt/deployments/artifacts), and .parent.name is its final component: artifacts. .suffix is only the final dot-separated part: .gz. .stem is the name without the final suffix: app-v2.1.tar.
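A related attribute, .suffixes, lists every dot-separated suffix rather than only the last one. The short sketch below shows that for this file name the version number's ".1" is also reported as a suffix, which is worth knowing before relying on it to detect ".tar.gz".

```python
from pathlib import PurePosixPath

artifact = PurePosixPath("/opt/deployments/artifacts/app-v2.1.tar.gz")

last_suffix = artifact.suffix      # only the final suffix
all_suffixes = artifact.suffixes   # every dot-separated part after the first

print(last_suffix)
print(all_suffixes)
```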

Question 3:
A developer writes a function to create a unique lock file for a process. The function should only succeed if the lock file does not already exist, to prevent multiple instances of a process from running. The function, however, never prints "Lock file already exists."; instead, it overwrites any existing file with the new pid. Why does this code not behave as intended?

In [None]:
from pathlib import Path
import os
 
def create_lock_file(pid):
    lock_path = Path(f"process_{pid}.lock")
    try:
        with lock_path.open(mode='w', encoding='utf-8') as f:
            f.write(str(os.getpid()))
        print("Lock file created successfully.")
        return True
    except FileExistsError:
        print("Lock file already exists.")
        return False
 
# Simulate that the file already exists
Path("process_123.lock").touch()
 
create_lock_file(123)


Using mode='w' overwrites an existing file or creates a new one; it never raises a FileExistsError. To achieve the desired behavior, mode='x' (exclusive creation) should be used.

Correct

Correct. This is the bug. The 'w' mode's behavior is to truncate and overwrite any existing file. It will never raise a FileExistsError. The 'x' mode is designed specifically for this "create only if it does not exist" use case and will raise FileExistsError if the path already exists.
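A minimal sketch of the corrected function. As an assumption made for the demo, it takes the lock path as its argument instead of a pid and runs inside a temporary directory so that no real lock files are touched.

```python
import os
import tempfile
from pathlib import Path

def create_lock_file(lock_path):
    """Create lock_path exclusively; return False if it already exists."""
    try:
        # 'x' fails with FileExistsError instead of truncating the file.
        with lock_path.open(mode="x", encoding="utf-8") as f:
            f.write(str(os.getpid()))
        return True
    except FileExistsError:
        return False

# Run in a throwaway directory so no real lock files are touched.
with tempfile.TemporaryDirectory() as tmp:
    lock = Path(tmp) / "process_123.lock"
    first_attempt = create_lock_file(lock)    # file does not exist yet
    second_attempt = create_lock_file(lock)   # file now exists

print(first_attempt, second_attempt)
```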

Question 4:
You need to write a Python script to parse a 10 GB log file and count the occurrences of the word "ERROR". Which method of reading the file is most appropriate for this task to ensure the script does not consume an excessive amount of memory?


for line in file:

Correct

Correct. Iterating directly over the file object (for line in file:) is the most memory-efficient method. Python reads the file line-by-line, loading only one line into memory at a time. This allows the script to process files of any size with a minimal and constant memory footprint.
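The idea can be sketched as follows. The helper name count_errors and the three-line sample file are illustrative; the same loop would be pointed at the real log file.

```python
import tempfile
from pathlib import Path

def count_errors(log_path):
    """Count occurrences of "ERROR" while holding one line at a time."""
    total = 0
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            total += line.count("ERROR")
    return total

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.log"
    sample.write_text(
        "INFO ok\nERROR disk full\nERROR retry, ERROR again\n",
        encoding="utf-8",
    )
    error_count = count_errors(sample)

print(error_count)
```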

Question 5:
A developer is writing a list of server hostnames to a configuration file. Analyze the following code. What will be the final content of the servers.conf file?

In [2]:
from pathlib import Path
 
servers = ["web-01\n", "db-01\n", "cache-01\n"]
config_file = Path("servers.conf")
 
with config_file.open(mode='w', encoding='utf-8') as f:
    f.writelines(servers)
 
with config_file.open(mode='a', encoding='utf-8') as f:
    f.write("monitor-01")

web-01
db-01
cache-01
monitor-01

Correct

Correct. The first with block opens the file in write (w) mode, truncating it. f.writelines(servers) writes each string from the list to the file. Since each string in the servers list already contains a newline character (\n), each server is written on its own line. The second with block opens the file in append (a) mode and f.write("monitor-01") adds "monitor-01" immediately after the last character of the existing content, which was the newline after "cache-01".

## Text Processing with Regular Expressions

Question 1:
A DevOps engineer needs to parse log files where lines can be in slightly different formats, such as ERROR: [auth-service] - Login failed or WARN: [db-service] - Connection slow. Why is using the re module generally more robust for this task than using string methods like str.split() and str.find()?


The re module can define flexible patterns that accommodate variations in whitespace, optional components, and different keywords, whereas string methods rely on fixed, literal substrings.

Correct

Correct. This is the key advantage of regex. A single pattern like r"(ERROR|WARN):\s+\[(.*?)\]" can handle multiple log levels, variable whitespace (\s+), and extract the service name ((.*?)) reliably. Achieving this with str.split() and str.find() would require complex, brittle, and hard-to-maintain if/else logic.
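A short sketch applying that pattern to the two sample lines from the question. Extra spaces were added to the second line on purpose to show that \s+ absorbs them.

```python
import re

LOG_PATTERN = re.compile(r"(ERROR|WARN):\s+\[(.*?)\]")

lines = [
    "ERROR: [auth-service] - Login failed",
    "WARN:    [db-service] - Connection slow",  # extra spaces on purpose
]

parsed = []
for line in lines:
    match = LOG_PATTERN.search(line)
    if match:
        # groups() returns (level, service) for the two capturing groups.
        parsed.append(match.groups())

print(parsed)
```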

Question 2:
A script is written to extract all IP addresses and status codes from a web server access log. What is the output of this script?



In [1]:
import re
 
log_data = """
192.168.1.10 - - [10/Mar/2023:13:55:36 +0000] "GET /api/v1/users HTTP/1.1" 200 512
10.0.0.5 - - [10/Mar/2023:13:56:12 +0000] "POST /api/v1/data HTTP/1.1" 201 1024
172.16.0.2 - - [10/Mar/2023:13:57:01 +0000] "GET /static/style.css HTTP/1.1" 404 2326
"""
 
ip_pattern = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
status_code_pattern = r'"\s(\d{3})\s'
 
ips = re.findall(ip_pattern, log_data)
codes = re.findall(status_code_pattern, log_data)
 
print(f"IPs: {ips}")
print(f"Codes: {codes}")

IPs: ['192.168.1.10', '10.0.0.5', '172.16.0.2']
Codes: ['200', '201', '404']


Correct. The ip_pattern has no capturing groups, so re.findall returns a list of all full strings that match the pattern. The status_code_pattern has one capturing group (\d{3}). re.findall therefore returns a list containing only the strings captured by that group, which are the 3-digit status codes.

Question 3:
A developer wants to extract the content inside an XML-like tag. Their code is returning a much larger string than expected, instead of extracting only the first sender <sender>user1</sender>. What is the cause of the bug?

In [None]:
import re
 
text = "<message><sender>user1</sender><content>hello</content></message><message><sender>user2</sender><content>hi</content></message>"
 
pattern = r"<sender>.*</sender>"
 
match = re.search(pattern, text)
 
if match:
    print(match.group(0))

The * quantifier is greedy by default. It makes .* match as many characters as possible, causing it to match from the first <sender> tag to the very last </sender> tag in the string.

Correct

This is the classic greedy vs. non-greedy problem. The greedy .* keeps matching until the last possible </sender> in the string. The fix is to use a non-greedy (or lazy) quantifier, *?, so the pattern becomes r"<sender>.*?</sender>", which stops at the first </sender>.
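The greedy and lazy forms can be compared side by side. A minimal sketch using the text from the question:

```python
import re

text = (
    "<message><sender>user1</sender><content>hello</content></message>"
    "<message><sender>user2</sender><content>hi</content></message>"
)

# Greedy: runs from the first <sender> to the last </sender>.
greedy = re.search(r"<sender>.*</sender>", text).group(0)

# Lazy: stops at the first </sender> it can reach.
lazy = re.search(r"<sender>.*?</sender>", text).group(0)

print(greedy.count("</sender>"))
print(lazy)
```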

Question 4:
A script is designed to parse key-value pairs from service configuration lines. It uses named capturing groups to make the extracted data easier to work with. What is the output of this script?

In [2]:
import re
 
config_line = "service=nginx; port=443; protocol=https;"
pattern = r"service=(?P<name>\w+);\s*port=(?P<port>\d+);\s*protocol=(?P<proto>\w+);"
 
match = re.search(pattern, config_line)
 
if match:
    data = match.groupdict()
    print(f"{data['name']} is running on port {data['port']} using {data['proto']}")

nginx is running on port 443 using https


Correct. The regex pattern correctly uses named capturing groups (?P<name>...) to capture nginx, 443, and https. The re.search() finds a match, and match.groupdict() returns a dictionary: {'name': 'nginx', 'port': '443', 'proto': 'https'}. The f-string then correctly formats this data.

Question 5:
What is the main advantage of using re.finditer() over re.findall() when you need to process a very large number of matches in a large text file?

re.finditer() returns an iterator that yields match objects one by one, making it highly memory-efficient. re.findall() constructs a full list of all matches in memory at once.

Correct

Correct. This is the key difference. For a file with millions of matches, re.findall() could consume a huge amount of memory by building a giant list. re.finditer() is a lazy approach; it finds one match, yields it for processing, and only then moves on to find the next one, keeping memory usage low and constant.
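A small sketch of the lazy behavior. The log text and pattern are made up for illustration; a second practical difference is that each yielded item is a full match object, so positions such as start() are available.

```python
import re

log_text = "ERROR a\nINFO b\nERROR c\nERROR d\n"

# finditer yields one match object at a time instead of building a list.
matches = re.finditer(r"^ERROR (\w+)$", log_text, flags=re.MULTILINE)

first = next(matches)                 # only the first match is computed here
first_value = first.group(1)
first_start = first.start()
remaining = sum(1 for _ in matches)   # consume the rest without storing them

print(first_value, first_start, remaining)
```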

## Working with Data Formats (JSON, YAML, CSV)

Question 1:
A DevOps team is deciding on a configuration file format for a new application. The configuration will be maintained by engineers of varying technical backgrounds and needs to support comments for documentation. Between JSON and YAML, which is the better choice and why?

YAML, because its syntax is designed for human readability, using indentation for structure and allowing comments (#), which makes it easier to document and manage complex configuration files.

Correct

Correct. YAML's primary design goal is human-friendliness. The ability to add comments, the minimal syntax (no mandatory quotes or braces), and the clear block structure make it an excellent choice for configuration files that humans need to read, understand, and edit.

Question 2:
A script needs to parse a JSON string received from a monitoring API and extract specific details. What is the output of this script?

In [1]:
import json
 
api_response = """
{
    "service_id": "svc-web-01",
    "status": "HEALTHY",
    "metrics": {
        "cpu_percent": 15.5,
        "memory_mb": 1024
    },
    "tags": ["prod", "frontend"],
    "enabled": true
}
"""
 
try:
    data = json.loads(api_response)
    service = data.get("service_id")
    cpu = data["metrics"]["cpu_percent"]
    is_prod = "prod" in data.get("tags", [])
    print(f"{service} CPU: {cpu}%, Is_Prod: {is_prod}")
except (json.JSONDecodeError, KeyError) as e:
    print(f"Error parsing data: {e}")

svc-web-01 CPU: 15.5%, Is_Prod: True


Question 3:
A developer is writing a Python dictionary to a CSV file. The script runs without error, but the output CSV file is missing the header row. What is the cause of the missing header row in status_report.csv?

In [None]:
import csv
from pathlib import Path
 
report_data = [
    {'hostname': 'db-master-01', 'status': 'OK', 'region': 'us-east-1'},
    {'hostname': 'web-worker-05', 'status': 'FAIL', 'region': 'eu-west-1'}
]
field_order = ['hostname', 'region', 'status']
output_path = Path("status_report.csv")
 
# Buggy Code
with output_path.open('w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=field_order)
    # The header row is not being written
    writer.writerows(report_data)

csv.DictWriter does not write headers automatically. The developer must explicitly call the writer.writeheader() method before writing the data rows.

Correct

Correct. This is the precise bug. Unlike csv.writer, which treats all rows as data, csv.DictWriter makes a distinction between the header and data rows. To write the header (based on the fieldnames provided during instantiation), you must make an explicit call to writer.writeheader().
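A sketch of the corrected write. To keep it self-contained, io.StringIO stands in for the file opened in the question; the data and field order are the same.

```python
import csv
import io

report_data = [
    {"hostname": "db-master-01", "status": "OK", "region": "us-east-1"},
    {"hostname": "web-worker-05", "status": "FAIL", "region": "eu-west-1"},
]
field_order = ["hostname", "region", "status"]

# io.StringIO stands in for the real file so the result can be inspected.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=field_order)
writer.writeheader()             # the call missing from the buggy version
writer.writerows(report_data)

csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```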

Question 4:
Your script needs to read a simple CSV file where some fields might be empty. What is the most robust way to read this data into a format that allows you to access columns by name (row['hostname'])?

In [None]:
# servers.csv
hostname,ip_address,status
web-01,1.1.1.1,online
db-01,,offline

Use the csv.DictReader object, which automatically uses the first row as headers and returns each subsequent row as a dictionary.

Correct

Correct. csv.DictReader is designed for exactly this purpose. It reads the header row to determine the keys and then yields each data row as a dictionary (e.g., {'hostname': 'db-01', 'ip_address': '', 'status': 'offline'}). This allows for robust, name-based access to columns.
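A minimal sketch of reading the sample above. io.StringIO stands in for the opened servers.csv file, and the "<no ip>" fallback text is only an illustration of handling an empty field.

```python
import csv
import io

servers_csv = (
    "hostname,ip_address,status\n"
    "web-01,1.1.1.1,online\n"
    "db-01,,offline\n"
)

# io.StringIO stands in for open("servers.csv", newline="").
rows = list(csv.DictReader(io.StringIO(servers_csv)))

for row in rows:
    # Empty fields come back as empty strings, so "or" supplies a fallback.
    print(row["hostname"], row["ip_address"] or "<no ip>", row["status"])
```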

Question 5:
A script serializes a Python dictionary to a JSON string for an API payload. The indent and sort_keys parameters are used to ensure the output is consistent and readable. What is the output printed to the console?

In [2]:
import json
 
payload = {
    "user": "ci-bot",
    "action": "deploy",
    "environment": "staging",
    "commit_hash": "a1b2c3d4"
}
 
json_string = json.dumps(payload, indent=2, sort_keys=True)
print(json_string)

{
  "action": "deploy",
  "commit_hash": "a1b2c3d4",
  "environment": "staging",
  "user": "ci-bot"
}


Correct. The indent=2 argument formats the JSON with newlines and a two-space indent. The sort_keys=True argument ensures that the keys in the output object are sorted alphabetically.

## Environment Variables and Configuration

Question 1:
A DevOps team manages a Python application that is deployed to three different environments: development, staging, and production. Each environment connects to a different database. What is the most robust and conventional method to manage the database connection string for this application?

Store the connection string in an environment variable (for example, DATABASE_URL) that is set differently in each environment's runtime.

Correct

Correct. This is the standard and best practice. It decouples the configuration from the application code, allowing operators to manage connection details per environment without touching the source. It also prevents secrets from being committed to version control.

Question 2:
A script is written to configure a connection to a monitoring service. It needs to read an endpoint URL and an optional timeout value from environment variables. What is the output of this script?

In [1]:
import os
 
# Assume the following environment variable is set before the script runs:
# export SERVICE_ENDPOINT="https://monitor.example.com/api"
 
def get_service_config():
    endpoint = os.getenv("SERVICE_ENDPOINT")
    timeout_str = os.getenv("TIMEOUT_SECONDS", "10") # Default is a string
 
    if not endpoint:
        return "Error: Endpoint not configured."
 
    return f"Connecting to {endpoint} with a timeout of {timeout_str}s."
 
print(get_service_config())


Connecting to https://monitor.example.com/api with a timeout of 10s.

Correct

Correct. os.getenv("SERVICE_ENDPOINT") successfully retrieves the URL. The TIMEOUT_SECONDS variable is not set, so os.getenv returns the provided default value, which is the string "10". The function then constructs and returns the formatted string.

Question 3:
A developer writes a script to configure the number of worker processes based on an environment variable. The script fails with a TypeError. What is the fundamental cause of the TypeError?

In [None]:
import os
 
# Assume this is set: export WORKER_COUNT="4"
 
def start_workers():
    num_workers = os.getenv("WORKER_COUNT")
    if num_workers:
        print(f"Starting {num_workers} workers...")
        for i in range(num_workers):
            print(f"  - Worker {i+1} started.")
    else:
        print("WORKER_COUNT not set, starting 1 worker.")
 
start_workers()

The os.getenv() function always returns a string. The range() function requires an integer argument, but it received the string "4".

Correct

Correct. This is the core issue. All values retrieved from environment variables are strings. The code attempts range("4"), which raises a TypeError. The fix is to explicitly convert the value to an integer: for i in range(int(num_workers)):.
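One way to package the conversion is a small helper that also handles the unset and non-numeric cases. This is a sketch; the helper name get_worker_count and its default of 1 are assumptions chosen to match the script's fallback message.

```python
import os

def get_worker_count(default=1):
    """Return WORKER_COUNT as an int, falling back to default if unset."""
    raw = os.getenv("WORKER_COUNT")
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        # A non-numeric value such as "four" is a configuration error.
        raise ValueError(f"WORKER_COUNT must be an integer, got {raw!r}") from None

os.environ["WORKER_COUNT"] = "4"    # simulates: export WORKER_COUNT="4"
configured = get_worker_count()
for i in range(configured):
    print(f"  - Worker {i + 1} started.")

del os.environ["WORKER_COUNT"]      # simulates the variable being unset
fallback = get_worker_count()
print(fallback)
```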

Question 4:
When should a developer choose to access an environment variable using os.environ['MY_VAR'] versus os.getenv('MY_VAR')?


os.environ[] should be used for mandatory configuration that is essential for the application to run. os.getenv() should be used for optional configuration where a None or default value is acceptable.

Correct

Correct. This is the fundamental design choice. Using os.environ[] for a required variable like a database host will cause the application to fail fast with a KeyError if it's not configured, which is good. Using os.getenv() for an optional setting like a LOG_LEVEL allows the program to proceed with a sensible default.
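The two access styles side by side, as a sketch. DB_HOST and LOG_LEVEL are illustrative variable names; both are removed from the environment first so the example behaves the same in any shell.

```python
import os

# Illustrative names; cleared so the outcome does not depend on the shell.
os.environ.pop("DB_HOST", None)
os.environ.pop("LOG_LEVEL", None)

# Optional setting: fall back to a default.
log_level = os.getenv("LOG_LEVEL", "INFO")

# Mandatory setting: fail fast with a clear message.
try:
    db_host = os.environ["DB_HOST"]
except KeyError as exc:
    db_host = None
    error_message = f"Missing required environment variable: {exc}"

print(log_level)
print(error_message)
```

In a real application the except branch would typically exit or re-raise rather than continue.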

Question 5:
A developer is using a .env file for local development to avoid setting environment variables in their shell. They want to test how the override parameter of load_dotenv works. What is the output of this script?

In [None]:
import os
from dotenv import load_dotenv
 
# Assume a file named ".env" exists with the content:
# DATABASE_URL=postgres://user:pass@dotenv_host/db
 
# Step 1: Set a variable in the script's environment first
os.environ["DATABASE_URL"] = "postgres://user:pass@shell_host/db"
 
# Step 2: Load .env without override
load_dotenv(override=False)
print(f"Without override: {os.getenv('DATABASE_URL')}")
 
# Step 3: Load .env WITH override
load_dotenv(override=True)
print(f"With override: {os.getenv('DATABASE_URL')}")

Without override: postgres://user:pass@shell_host/db
With override: postgres://user:pass@dotenv_host/db

Correct

Correct. In Step 2, load_dotenv(override=False) is called. Since DATABASE_URL already exists in the environment (set in Step 1), it is not overwritten. os.getenv returns the existing "shell_host" value. In Step 3, load_dotenv(override=True) is called. This forces the function to load values from the .env file even if they already exist in the environment. os.getenv now returns the "dotenv_host" value.

## Filesystem and Directory Operations

Question 1:
A backup script needs to create a directory structure like /mnt/backups/2023-10-27/. If the parent directories /mnt/ and /mnt/backups/ might not exist, which method is the most robust and Pythonic way to create this structure?

Path("/mnt/backups/2023-10-27/").mkdir(parents=True, exist_ok=True)

Correct

Correct. This is the ideal method. parents=True ensures that all necessary parent directories (/mnt/backups) are created. exist_ok=True ensures that no error is raised if the directory already exists. This makes the operation idempotent and highly robust.
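A sketch of the idempotent behavior. A temporary directory stands in for /mnt so the example can run without special permissions.

```python
import tempfile
from pathlib import Path

# A temporary directory stands in for /mnt so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "backups" / "2023-10-27"

    target.mkdir(parents=True, exist_ok=True)   # creates backups/ too
    target.mkdir(parents=True, exist_ok=True)   # second call is a no-op
    created = target.is_dir()

    try:
        target.mkdir()                          # defaults: exist_ok=False
        raised = False
    except FileExistsError:
        raised = True

print(created, raised)
```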

Question 2:
A script needs to clean up a temporary directory by deleting all .tmp files but leaving other files untouched. The directory contains a mix of files and one subdirectory. What is printed to the console after the following script executes?

In [2]:
import os
from pathlib import Path
import shutil
 
# --- Setup for demonstration purposes ---
base_dir = Path("test_cleanup")
if base_dir.exists():
    shutil.rmtree(base_dir)
base_dir.mkdir()
(base_dir / "data1.tmp").touch()
(base_dir / "data2.tmp").touch()
(base_dir / "config.ini").touch()
(base_dir / "archive").mkdir()
 
# --- Core logic ---
for item in base_dir.iterdir():
    if item.is_file() and item.suffix == ".tmp":
        item.unlink()
 
# --- Verification ---
remaining_items = sorted([p.name for p in base_dir.iterdir()])
print(remaining_items)
 
# --- Cleanup after demonstration ---
shutil.rmtree(base_dir)

['archive', 'config.ini']


Correct. The for loop iterates through the contents of test_cleanup. data1.tmp: is_file() is true and suffix is .tmp, so item.unlink() is called. data2.tmp: is_file() is true and suffix is .tmp, so item.unlink() is called. config.ini: is_file() is true but suffix is .ini, so it is skipped. archive: is_file() is false, so it is skipped. The remaining items are the directory archive and the file config.ini.

Question 3:
A developer writes a script to remove a directory named old_build. The script works when the directory is empty but fails with an OSError when the directory contains files. Why does the p.rmdir() call fail?

In [None]:
from pathlib import Path
import shutil
 
# --- Setup for demonstration ---
p = Path("old_build")
p.mkdir(exist_ok=True)
(p / "app.bin").touch()
 
# --- Core logic ---
try:
    # This line fails if "old_build" is not empty
    p.rmdir()
    print("Directory 'old_build' removed.")
except OSError as e:
    print(f"Error: {e}")
 
# --- Cleanup after demonstration ---
if p.exists():
    shutil.rmtree(p)

The rmdir() method can only remove empty directories. To recursively delete a directory and all its contents, shutil.rmtree() must be used.

Correct

Correct. This is the fundamental difference between the two functions. Path.rmdir() (and os.rmdir()) is a safe-by-default operation that will only succeed on an empty directory. For a forceful, recursive deletion, the shutil.rmtree() function is required.

Question 4:
A script needs to download a large file, process it, and then ensure the downloaded file is deleted, even if an error occurs during processing. The file does not need to remain saved after processing. What is the most reliable and Pythonic way to manage the lifecycle of this downloaded file?


Use tempfile.NamedTemporaryFile(). Write the downloaded content to this file, process it, and allow the context manager to automatically delete the file upon exiting the with block.

Correct

Correct. This is the ideal solution. NamedTemporaryFile creates a file with a unique, non-colliding name in a secure temporary location. The with statement guarantees that the file is automatically cleaned up (since delete=True is the default) when the block is exited, whether normally or due to an exception.
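A sketch of the cleanup guarantee. The process function is hypothetical and fails on purpose, and a short byte string stands in for the downloaded content.

```python
import tempfile
from pathlib import Path

def process(data):
    """Hypothetical processing step that fails on purpose."""
    raise RuntimeError("processing failed")

temp_name = None
try:
    with tempfile.NamedTemporaryFile() as tmp:
        temp_name = tmp.name
        tmp.write(b"downloaded bytes")   # stands in for the download
        tmp.flush()
        tmp.seek(0)
        process(tmp.read())
except RuntimeError as exc:
    print(f"Processing error: {exc}")

# The file is removed even though the with block exited via an exception.
file_still_exists = Path(temp_name).exists()
print(file_still_exists)
```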

Question 5:
A data processing workflow creates multiple intermediate files within a temporary directory. The script uses a context manager to handle the cleanup. What will the output of this script be?

In [3]:
import tempfile
from pathlib import Path
 
# --- Action ---
final_path_str = ""
with tempfile.TemporaryDirectory() as temp_dir:
    print(f"Created directory: {temp_dir}")
    temp_path_obj = Path(temp_dir)
 
    (temp_path_obj / "step1.dat").touch()
    (temp_path_obj / "step2.dat").touch()
 
    final_path_str = str(temp_path_obj)
 
# --- Verification ---
print(f"Directory {final_path_str} exists after with-block: {Path(final_path_str).exists()}")

Created directory: C:\Users\SHUBHE~1\AppData\Local\Temp\tmpmtlhlgl6
Directory C:\Users\SHUBHE~1\AppData\Local\Temp\tmpmtlhlgl6 exists after with-block: False


Correct. The with tempfile.TemporaryDirectory() ... block creates a temporary directory. All operations inside the block succeed. Upon exiting the with block, the context manager's cleanup logic is triggered, which recursively deletes the temporary directory and all files inside it. Therefore, the final check Path(...).exists() will evaluate to False.

## Running and Managing Subprocesses

Question 1:
A DevOps script needs to delete a Docker container based on an ID provided by a user. Why is subprocess.run(['docker', 'rm', user_provided_id]) significantly safer than os.system(f"docker rm {user_provided_id}")?


subprocess.run() (with default shell=False) passes arguments as a list of tokens, preventing the shell from interpreting metacharacters in the input. This mitigates shell injection vulnerabilities.

Correct

Correct. This is the crucial security benefit. If a user provided an ID like my-container; rm -rf /, os.system() would execute both commands. subprocess.run() passes "my-container; rm -rf /" as a single, literal argument to docker rm, which would safely fail without executing the malicious second command.
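The single-argument behavior can be checked without Docker. In this sketch a small Python child process stands in for docker rm and simply reports the arguments it received; the hostile input uses a harmless echo instead of a destructive command.

```python
import subprocess
import sys

# Hostile input of the kind described above, with a harmless payload.
user_provided_id = "my-container; echo INJECTED"

# A tiny Python child process stands in for "docker rm": it reports the
# number of arguments it received and the first one.
child = "import sys; print(len(sys.argv) - 1); print(sys.argv[1])"
result = subprocess.run(
    [sys.executable, "-c", child, user_provided_id],
    capture_output=True,
    text=True,
    check=True,
)

arg_count, received = result.stdout.splitlines()
print(arg_count, repr(received))
```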

Question 2:
A script is used to check the current version of an installed command-line tool, like pip. What is the most likely output of this script in a standard Python environment?

In [49]:
import subprocess
import sys
 
try:
    command = [sys.executable, "-m", "pip", "--version"]
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        check=True
    )
    version_info = result.stdout.strip()
    print(version_info)
except (subprocess.CalledProcessError, FileNotFoundError) as e:
    print(f"Error: {e}")

pip 25.1 from C:\Users\Shubhesh Swain\anaconda3\Lib\site-packages\pip (python 3.13)



A string containing the pip version information, such as pip 24.3.1 from /path/to/lib/python3.12/site-packages/pip (python 3.12).

Correct

Correct. The command correctly invokes the pip module to get its version. subprocess.run captures the output, text=True decodes it to a string, and strip() removes leading/trailing whitespace. The final print displays this captured string.

Question 3:
A developer writes a script to archive a directory using the tar command. The script seems to run, but the archive is not created when an invalid source directory is given. The script, however, does not report an error. Why does the script proceed and print "Archive process finished..." even though the tar command failed with a non-zero exit code?

In [None]:
import subprocess
 
source_dir = "non_existent_data/"
archive_file = "backup.tar.gz"
command = ["tar", "-czf", archive_file, source_dir]
 
result = subprocess.run(command)
 
print("Archive process finished. Continuing with next steps.")


By default, subprocess.run does not raise an exception for a non-zero exit code. The failure is silent unless the result.returncode is manually checked or check=True is used.

Correct

Correct. This is the fundamental issue. The default behavior of subprocess.run is to run the command and report the outcome in the CompletedProcess object, but it does not treat a non-zero exit code as a Python exception. To make it fail loudly, the developer must add check=True to the call.
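Both behaviors in one sketch. A child Python process that exits with status 3 stands in for the failing tar command, so the example does not depend on tar being installed.

```python
import subprocess
import sys

# A child Python process that exits with status 3 stands in for the
# failing tar command.
failing_command = [sys.executable, "-c", "import sys; sys.exit(3)"]

# Default behavior: no exception, the failure is only in returncode.
result = subprocess.run(failing_command)
silent_returncode = result.returncode

# With check=True the same failure becomes an exception.
try:
    subprocess.run(failing_command, check=True)
    raised_code = None
except subprocess.CalledProcessError as exc:
    raised_code = exc.returncode

print(silent_returncode, raised_code)
```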

Question 4:
A script attempts to delete a Kubernetes secret. If the secret does not exist, the kubectl command will fail with a non-zero exit code and an error message on stderr. The script is designed to handle this specific case gracefully. Assuming the kubectl command fails because the secret does not exist, what is the output?

In [None]:
import subprocess
 
secret_name = "this-is-a-non-existent-secret"
command = ["kubectl", "delete", "secret", secret_name]
 
try:
    subprocess.run(
        command,
        check=True,
        capture_output=True,
        text=True
    )
    print(f"Secret '{secret_name}' deleted successfully.")
except subprocess.CalledProcessError as e:
    if "NotFound" in e.stderr:
        print(f"Info: Secret '{secret_name}' did not exist. No action taken.")
    else:
        print(f"Error deleting secret: {e.stderr}")


Info: Secret 'this-is-a-non-existent-secret' did not exist. No action taken.

Correct

Correct. subprocess.run with check=True raises a CalledProcessError. The except block catches the error. The error message from kubectl for a missing resource includes the word "NotFound". The if "NotFound" in e.stderr: condition evaluates to True, and the corresponding informational message is printed.

Question 5:
What is the key difference between the subprocess.CalledProcessError and FileNotFoundError exceptions when using subprocess.run()?


FileNotFoundError is raised by subprocess.run() if the executable itself cannot be found in the system's PATH. CalledProcessError is raised (if check=True) when the command is found and runs, but exits with a non-zero status code.

Correct

This correctly distinguishes the two failure modes. FileNotFoundError means the command could not even be started. CalledProcessError means the command started, ran, and then reported an error upon exiting.
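A sketch that triggers each outcome. The helper name classify is illustrative, and the executable name no-such-tool-xyz-12345 is assumed not to exist on the system running the example.

```python
import subprocess
import sys

def classify(command):
    """Run command and report which failure mode occurred, if any."""
    try:
        subprocess.run(command, check=True, capture_output=True)
        return "ok"
    except FileNotFoundError:
        return "not started: executable not found"
    except subprocess.CalledProcessError as exc:
        return f"ran but failed: exit code {exc.returncode}"

# The first executable name is assumed not to exist on this system.
missing = classify(["no-such-tool-xyz-12345"])
failed = classify([sys.executable, "-c", "raise SystemExit(2)"])
succeeded = classify([sys.executable, "-c", "pass"])

print(missing, failed, succeeded, sep="\n")
```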

Question 6:
A developer needs to pass a command with multiple arguments, including one with a space, to subprocess.run().

Command to run: aws s3 cp my-local-file.txt "s3://my-bucket/target path/file.txt"

Which list correctly and safely represents this command for subprocess.run()?


['aws', 's3', 'cp', 'my-local-file.txt', 's3://my-bucket/target path/file.txt']

Correct

Correct. This is the correct representation. Each distinct argument to the command, as it would be separated by spaces in a shell (while respecting quotes), becomes a separate string element in the list. The argument containing a space is passed as a single list element, ensuring it is treated as one argument by the aws command.
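For a command line that is already known and trusted, the standard library's shlex.split applies the same shell-style quoting rules and can be used to check a hand-written list. A short sketch:

```python
import shlex

command_line = 'aws s3 cp my-local-file.txt "s3://my-bucket/target path/file.txt"'

# shlex.split applies POSIX shell quoting rules to produce the token list.
args = shlex.split(command_line)
print(args)
```

This is a convenience for fixed strings; untrusted input should still be placed into the list as its own element rather than being spliced into a string first.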

## Making and Inspecting HTTP Requests

Question 1:
A developer needs to interact with a modern REST API that returns JSON data. Why is the requests library generally preferred over Python's built-in urllib.request module for this task?


requests provides a simpler, more human-friendly API, with features like a built-in JSON decoder (.json()) and automatic handling of query strings, which significantly reduces the amount of boilerplate code.

Correct

This is the primary reason. The requests library was designed to be intuitive. Tasks that are complex in urllib (like sending a JSON payload, handling authentication, or managing sessions) are often single-line operations in requests, making the code cleaner and easier to maintain.

Question 2:
A script needs to search a package repository API for packages matching specific criteria. What is the most likely URL that will be printed?

In [None]:
import requests
 
api_url = "https://api.pypackage.org/search"
query_data = {
    "name": "devops-automation",
    "min_version": "2.1",
    "license": "MIT"
}
 
response = requests.get(api_url, params=query_data, timeout=5)
 
print(response.url)


https://api.pypackage.org/search?name=devops-automation&min_version=2.1&license=MIT

Correct

Correct. The requests library correctly serializes the params dictionary into a URL-encoded query string, joining key-value pairs with = and separating pairs with &.
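The same kind of encoding can be reproduced offline with the standard library. This sketch uses urllib.parse.urlencode as a stand-in for what requests does with params; it is not the code requests runs internally. The second call shows how characters such as spaces and & are escaped.

```python
from urllib.parse import urlencode

api_url = "https://api.pypackage.org/search"
query_data = {
    "name": "devops-automation",
    "min_version": "2.1",
    "license": "MIT",
}

# urlencode joins pairs with "=" and "&" and escapes unsafe characters.
full_url = f"{api_url}?{urlencode(query_data)}"
print(full_url)

# A value with a space and an ampersand shows the escaping.
escaped = urlencode({"q": "ci & cd"})
print(escaped)
```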

Question 3:
A developer is trying to create a new user by sending a JSON payload to an API endpoint. The server responds with an error indicating a malformed request, even though the Python dictionary seems correct. What is the most likely cause of the server error?

In [None]:
import requests
import json
 
api_url = "https://api.example.com/v1/users"
new_user_data = {
    "username": "cicd_bot",
    "permissions": ["read", "deploy"]
}
 
response = requests.post(api_url, data=new_user_data)
 
print(response.status_code)

The data parameter form-encodes the dictionary (username=cicd_bot&...). To send a JSON payload, the json parameter should be used instead (json=new_user_data).

Correct

Correct. This is the precise bug. Using data= causes requests to send the payload as application/x-www-form-urlencoded. Modern APIs typically expect application/json. The json parameter is the correct way to handle this; it automatically serializes the dictionary to a JSON string and sets the Content-Type header to application/json.
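The difference between the two request bodies can be approximated without a network call. In this sketch, urlencode and json.dumps from the standard library stand in for the serialization requests performs for data= and json= respectively; the exact bytes requests sends may differ in minor details.

```python
import json
from urllib.parse import urlencode

new_user_data = {"username": "cicd_bot", "permissions": ["read", "deploy"]}

# Roughly what data=new_user_data sends: a form-encoded body, with the
# list flattened into repeated keys.
form_body = urlencode(new_user_data, doseq=True)

# Roughly what json=new_user_data sends: a JSON document.
json_body = json.dumps(new_user_data)

print(form_body)
print(json_body)
```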

Question 4:
A developer is writing a script to fetch an image from a URL. The script runs, but the saved logo.png file is corrupted and cannot be opened. What is the cause of the corrupted image file?

In [None]:
import requests
 
image_url = "https://httpbin.org/image/png"
response = requests.get(image_url)
 
if response.status_code == 200:
    with open("logo.png", "w", encoding="utf-8") as f:
        f.write(response.text)

Image data is binary. The script uses response.text, which tries to decode the binary data as text (often UTF-8), corrupting it. It should use response.content and open the file in binary write mode ('wb').

Correct

Correct. This is the fundamental error. response.text is for text-based content and applies a character encoding. response.content provides the raw, unmodified bytes of the response body. For non-text data like images, PDFs, or zip files, one must use response.content and write it to a file opened in binary mode ('wb').
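
The corruption can be demonstrated without any network access. In this sketch the eight-byte PNG signature stands in for response.content, and the decode/encode round trip approximates what the buggy script does to the data; the temporary file path is only for illustration.

```python
import os
import tempfile

# The first eight bytes of every PNG file; a stand-in for response.content.
raw_bytes = b"\x89PNG\r\n\x1a\n"

# Decoding as text and re-encoding replaces the invalid byte 0x89,
# so the data no longer matches the original.
round_tripped = raw_bytes.decode("utf-8", errors="replace").encode("utf-8")
print(round_tripped == raw_bytes)  # False

# Writing the bytes in binary mode preserves them exactly.
path = os.path.join(tempfile.mkdtemp(), "logo.png")
with open(path, "wb") as f:
    f.write(raw_bytes)

with open(path, "rb") as f:
    saved = f.read()
print(saved == raw_bytes)  # True
```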

Question 5:
What is the key difference between the response.text and response.content attributes of a requests Response object?


response.text returns the response body as a decoded string, while response.content returns the raw response body as bytes.

Correct

Correct. response.text takes the raw bytes and decodes them into a Python string, using the character encoding declared in the Content-Type header or, if none is declared, an encoding that requests guesses. response.content provides the raw, unprocessed bytes, which is essential for handling non-textual data like images or compressed files.
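
The relationship between the two attributes is the ordinary bytes-to-str relationship in Python. As a small illustration, the bytes below stand in for response.content, and each decode call stands in for response.text under a different encoding.

```python
# Stand-in for response.content: the UTF-8 bytes of the word "café".
content = "café".encode("utf-8")

print(content)                    # b'caf\xc3\xa9'
print(content.decode("utf-8"))    # café
print(content.decode("latin-1"))  # cafÃ©
```

The bytes never change; only the choice of encoding determines which string comes out, which is why the wrong encoding produces garbled text.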

## Robust API Interaction: Authentication and Error Handling

Question 1:
When automating interactions with an API, what is the primary purpose of calling response.raise_for_status() immediately after receiving a response?

It checks the response's status code and raises an HTTPError exception if the code indicates a client error (4xx) or server error (5xx), allowing for centralized error handling.

Correct

Correct. This is the core function of raise_for_status(). It turns failed HTTP responses into Python exceptions, allowing you to use a try...except block to handle all kinds of failures cleanly instead of writing if/else statements to check the status code manually. This promotes the "Easier to Ask for Forgiveness than Permission" (EAFP) coding style.
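
The behavior can be tried without contacting a server. The sketch below builds requests.Response objects by hand purely for demonstration (real code receives them from requests.get), and the URL and the check helper are hypothetical.

```python
import requests


def make_response(status_code: int) -> requests.Response:
    """Build a bare Response by hand; for illustration only."""
    resp = requests.Response()
    resp.status_code = status_code
    resp.url = "https://api.example.com/v1/health"  # hypothetical URL
    return resp


def check(status_code: int) -> str:
    resp = make_response(status_code)
    try:
        resp.raise_for_status()  # raises HTTPError for 4xx and 5xx codes
    except requests.exceptions.HTTPError as exc:
        return f"failed with {exc.response.status_code}"
    return "ok"


for code in (200, 404, 503):
    print(code, check(code))
```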

Question 2:
A script attempts to fetch data from a protected API endpoint that requires a Bearer Token for authentication. If the API_TOKEN is expired and the server returns a 401 Unauthorized status code, what will the script print?

In [3]:
import os
import requests
 
# Assume this is set in the environment:
# export API_TOKEN="secret-token-123"
 
api_token = os.getenv("API_TOKEN")
api_url = "https://api.cloudservice.com/v1/data"
 
headers = {
    "Accept": "application/json",
    "Authorization": f"Bearer {api_token}"
}
 
try:
    response = requests.get(api_url, headers=headers, timeout=5)
    response.raise_for_status()
    # Assume the API returns: {"data": ["item1", "item2"]}
    items = response.json()["data"]
    print(f"Successfully retrieved {len(items)} items.")
except requests.exceptions.HTTPError as e:
    if e.response.status_code == 401:
        print("Authentication failed: Invalid token.")
    else:
        print(f"An HTTP error occurred: {e}")

Authentication failed: Invalid token.

Correct

Correct. The server returns a 401 status. response.raise_for_status() raises an HTTPError. The except block catches it. Inside the except block, e.response.status_code is 401, so the if condition is met, and the specific "Authentication failed" message is printed.

Question 3:
A developer is trying to authenticate with an API using Basic Authentication. The script sends the request, but the server responds with a 401 Unauthorized error. What is the bug in this authentication attempt?

In [None]:
import requests
 
username = "devops_user"
password = "a_very_secure_password"
url = "https://api.service.com/v1/status"
 
headers = {
    "username": username,
    "password": password
}
 
response = requests.get(url, headers=headers)
print(response.status_code)


Basic Authentication requires sending credentials as a tuple to the auth parameter, not as custom headers. The correct call is requests.get(url, auth=(username, password)).

Correct

Correct. This is the fundamental bug. requests has a specific parameter, auth, for handling Basic Authentication. It automatically formats the username and password into the required Authorization: Basic header. Sending them as custom headers username and password is incorrect, and the server will not recognize them.
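
For reference, the header that Basic Authentication expects can be built by hand with the standard library. This is a sketch of the header format only; in practice the auth parameter does this work for you.

```python
import base64

username = "devops_user"
password = "a_very_secure_password"

# Basic Authentication joins the credentials with a colon and Base64-encodes them.
token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
headers = {"Authorization": f"Basic {token}"}

print(headers["Authorization"].startswith("Basic "))  # True
```

Base64 is an encoding, not encryption, so these credentials are only as safe as the HTTPS connection that carries them.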

Question 4:
A script needs to connect to a server that is known to be slow to respond. A timeout is configured to prevent the script from hanging. What will be the output of this script?

In [4]:
import requests
 
# This endpoint simulates a 5-second delay before responding.
slow_url = "https://httpbin.org/delay/5"
 
try:
    print("Making request...")
    response = requests.get(slow_url, timeout=(1.0, 3.0))
    print("Request successful.")
except requests.exceptions.ConnectTimeout:
    print("Connection timed out.")
except requests.exceptions.ReadTimeout:
    print("Read timed out.")
except requests.exceptions.HTTPError:
    print("An HTTP error occurred.")

Making request...
Request successful.


Making request...
Read timed out.

Correct

Correct. The connection is established successfully (likely in < 1.0s). The client then waits for the server's response. The server intentionally waits for 5 seconds. Since this is longer than the configured read timeout of 3.0 seconds, the requests library will raise a ReadTimeout exception, which is caught by the corresponding except block.

Question 5:
A developer wants a script to check an API endpoint and print an error message if the response is a server error (5xx). Instead, the script runs the try block to completion and prints Status: 503. Why doesn't the script enter the except clause?

In [None]:
import requests
 
api_url = "https://httpbin.org/status/503"
 
try:
    response = requests.get(api_url)
    print(f"Status: {response.status_code}")
except requests.exceptions.HTTPError as e:
    print(f"Server error detected: {e.response.status_code}")


The requests.get() call itself does not raise an HTTPError on a bad status code. The developer must explicitly call response.raise_for_status() to trigger the exception.

Correct

Correct. This is the bug. By default, requests.get() will happily return a response object for any valid HTTP response, including 4xx and 5xx errors. To turn a bad status code into an exception, you must call response.raise_for_status(). Without this call, the try block completes successfully, and the except block is never entered.

Question 6:
In what scenario is it appropriate to use a tuple for the timeout parameter, like timeout=(3.05, 27), instead of a single float?

When you need to separately control the connection timeout (time to establish a connection) and the read timeout (time to wait for data after connecting).

Correct

Correct. The tuple (connect, read) allows for fine-grained control. This is useful if you expect a server to be quick to connect to, but potentially slow to process and return a large response. For example, timeout=(3.05, 60) allows a quick connection check but gives the server a full minute to generate and send its response.

## Advanced Resilience: API Retry Strategies

Question 1:
A script makes API calls to a third-party service that occasionally fails with a 503 Service Unavailable error during peak hours. Why is it a good practice to implement a retry mechanism for this specific type of error?


Server-side errors like 503 are often transient. A retry loop with a short delay gives the server a chance to recover, making the script more resilient to temporary glitches.

Correct

Correct. This is the core principle of retrying 5xx errors. The problem is likely temporary and on the server's side. Instead of failing immediately, the client can wait a moment and try again, often succeeding on the second or third attempt once the transient issue (like a brief network partition or a service restart) is resolved.
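
A minimal sketch of such a retry loop is shown below. To keep it runnable offline, the API call is replaced by a hypothetical fetch_status callable that returns a status code, and the simulated service fails twice before recovering.

```python
import time
from typing import Callable


def call_with_retries(
    fetch_status: Callable[[], int],
    max_retries: int = 3,
    delay: float = 0.0,
) -> int:
    """Call fetch_status until it returns a non-5xx status or retries run out."""
    status = fetch_status()
    for _ in range(max_retries):
        if status < 500:
            break
        time.sleep(delay)  # give the server a moment to recover
        status = fetch_status()
    return status


# Simulated service: unavailable twice, then healthy.
responses = iter([503, 503, 200])
final_status = call_with_retries(lambda: next(responses))
print(final_status)  # 200
```

The delay defaults to zero only so that the example finishes instantly; a real script would pass a delay of a second or more.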

Question 2:
Which of the following operations is generally not idempotent and should be handled with extreme care when implementing a retry mechanism?


A POST request to /api/transactions with the body {"amount": 100, "description": "Payment"} to append a new transaction.

Correct

Correct. This is a classic non-idempotent operation. If this request is retried due to a temporary network error after the server has already processed it, a duplicate transaction will be created. Such operations require special handling, like using a unique idempotency key in the request header, to prevent duplicate actions.
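
The idempotency-key idea can be sketched with an in-memory stand-in for the server. Everything here is hypothetical: the FakeTransactionServer class, its status codes, and the Idempotency-Key header name (a common convention, but the real name depends on the API).

```python
import uuid


class FakeTransactionServer:
    """Hypothetical in-memory server that deduplicates by idempotency key."""

    def __init__(self) -> None:
        self.transactions: dict[str, dict] = {}

    def post_transaction(self, headers: dict[str, str], body: dict) -> int:
        key = headers["Idempotency-Key"]
        if key in self.transactions:
            return 200  # already processed: do not create a duplicate
        self.transactions[key] = body
        return 201


server = FakeTransactionServer()

# Generate the key once per logical operation and reuse it on every retry.
headers = {"Idempotency-Key": str(uuid.uuid4())}
body = {"amount": 100, "description": "Payment"}

first = server.post_transaction(headers, body)
retry = server.post_transaction(headers, body)  # simulated retry after a lost response
print(first, retry, len(server.transactions))  # 201 200 1
```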

Question 3:
A developer implements an exponential backoff strategy, but they notice the delay between retries is not increasing as expected. What is the flaw in this exponential backoff implementation?

In [None]:
import time
 
max_retries = 4
delay = 1 # Initial delay in seconds
 
for attempt in range(max_retries):
    print(f"Attempt {attempt + 1}. Waiting for {delay}s.")
    time.sleep(delay)
    
print("Process complete.")


The delay variable is not updated inside the loop. It remains 1 for every iteration, resulting in a fixed delay, not an exponential one.

Correct

Correct. This is the bug. For an exponential backoff, the delay variable must be increased after each failed attempt (e.g., delay = delay * 2). Since this line is missing, the script sleeps for the same 1-second interval every time.
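
A corrected version of the loop might look like the following. The sleep call is replaced by recording each delay in a list so that the example finishes instantly and the growth is easy to inspect.

```python
max_retries = 4
delay = 1  # initial delay in seconds
delays = []

for attempt in range(max_retries):
    delays.append(delay)
    # A real script would call time.sleep(delay) here.
    delay *= 2  # double the delay after each attempt

print(delays)  # [1, 2, 4, 8]
```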

Question 4:
When implementing a retry strategy with exponential backoff, what is the primary purpose of adding jitter?

Jitter adds a small, random amount of time to the delay. This prevents multiple clients that failed at the same time from all retrying in perfect synchronization, which could overload the server again.

Correct

Correct. This is the exact purpose of jitter. Without it, if 100 clients fail simultaneously and all have the same exponential backoff logic, they will all retry after 1 second, then 2 seconds, then 4 seconds, etc., creating synchronized waves of traffic. Jitter spreads these retries out over a small time window, smoothing the load on the recovering server.
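
One simple way to add jitter is to add a random offset to each exponential delay. The helper name backoff_with_jitter and its default values are illustrative assumptions; other jitter schemes exist.

```python
import random


def backoff_with_jitter(attempt: int, base: float = 1.0, max_jitter: float = 0.5) -> float:
    """Exponential delay for a zero-based attempt, plus a random offset."""
    return base * (2 ** attempt) + random.uniform(0, max_jitter)


delays = [backoff_with_jitter(attempt) for attempt in range(4)]
print(delays)
```

Each client draws its own random offset, so clients that failed at the same moment end up retrying at slightly different times.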

Question 5:
A developer writes a retry loop for an API call. When the script runs against a URL that consistently returns a 400 Bad Request error, they notice it pointlessly retries three times before stopping. What is the primary logical flaw in this retry strategy?

In [None]:
import requests
import time
 
# This URL will consistently return a 400 error
error_url = "https://httpbin.org/status/400"
max_retries = 3
 
for attempt in range(max_retries):
    print(f"Attempt {attempt + 1}...")
    try:
        response = requests.get(error_url, timeout=5)
        response.raise_for_status()
        print("Success!")
        break
    except requests.exceptions.HTTPError as e:
        print(f"Request failed: {e.response.status_code}. Retrying...")
        time.sleep(1)


The script retries on a 400 Bad Request error. Client errors (4xx) indicate a problem with the request itself, and retrying the same invalid request is pointless.

Correct

Correct. This is the logical bug. The retry logic should only be triggered for transient server-side errors (5xx) or network exceptions. A 400 Bad Request is a client error, meaning the request is flawed. The script should fail immediately and report the error instead of wasting time retrying. A proper implementation would check e.response.status_code and only retry if it's in the 500-599 range.
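
The status check described above can be isolated in a small helper. The name should_retry is a hypothetical choice; in the retry loop, the except block would call it and re-raise when it returns False.

```python
def should_retry(status_code: int) -> bool:
    """Retry only server-side errors; client errors will not fix themselves."""
    return 500 <= status_code <= 599


for code in (400, 404, 500, 503):
    print(code, should_retry(code))
```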

## Python Type Hints: Foundations and Basic Collections

Question 1:
What is the primary benefit of adding type hints (for example, name: str, -> int) to a Python script that will be analyzed by a static type checker like mypy?


They allow for the detection of type-related errors before the script is run, catching potential bugs early in the development cycle.

Correct

Correct. This is the core purpose of static type checking. A tool like mypy can analyze the code and report errors, such as passing a list to a function that expects a str, without ever executing the code. This prevents certain classes of bugs from reaching production.
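
The value of a static check is clearest when the mistake does not crash at runtime. In this illustrative sketch, Python runs the mistyped call without complaint, while a checker such as mypy would report the incompatible argument.

```python
def double(value: int) -> int:
    return value * 2


# Python does not enforce hints at runtime, so this call runs and silently
# returns a string. A static checker such as mypy would flag the argument type.
result = double("ab")
print(result)  # abab
```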

Question 2:
A developer writes a function to process a list of numerical job IDs. When they run a static type checker like mypy, it reports an error for the function call. What is the cause of the static type checking error?

In [None]:
def process_job_ids(ids: list[int]) -> None:
    print(f"Processing {len(ids)} job IDs.")
 
job_names: list[str] = ["job-72", "job-91"]
process_job_ids(job_names)

The function is called with job_names (a list[str]), but it is defined to accept ids of type list[int].

Correct

Correct. This is a direct type mismatch. The function signature explicitly states it requires a list of integers. The static checker sees that a list of strings is being passed instead and flags this as an error, preventing a potential runtime bug if the function were to perform mathematical operations on the IDs.
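
One possible fix is to convert the names to integers before the call. The sketch assumes every name has the form "job-<number>", as in the sample data; names in any other format would need different parsing.

```python
def process_job_ids(ids: list[int]) -> None:
    print(f"Processing {len(ids)} job IDs.")


job_names: list[str] = ["job-72", "job-91"]

# Strip the "job-" prefix and convert the remainder to int before the call.
job_ids: list[int] = [int(name.removeprefix("job-")) for name in job_names]
process_job_ids(job_ids)
```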

Question 3:
A function is designed to take an optional configuration dictionary. Analyze the following code. What will be printed to the console?

In [5]:
from typing import Optional
 
def apply_config(settings: Optional[dict[str, str]]) -> str:
    if settings:
        user = settings.get("USER", "default")
        return f"User set to {user}"
    
    return "No settings provided."
 
# Scenario 1
config_data = {"USER": "admin", "HOST": "prod.server"}
print(apply_config(config_data))
 
# Scenario 2
print(apply_config(None))

User set to admin
No settings provided.


Correct. In Scenario 1, settings is a dictionary, so the if settings: block is executed. settings.get("USER", "default") finds the key and returns "admin". In Scenario 2, settings is None, so the if settings: condition is false, and the function proceeds to return "No settings provided.".
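
One subtlety: an empty dictionary is also falsy, so with if settings: a call such as apply_config({}) takes the "No settings provided." path. If an empty dictionary should count as provided, a variant like the following sketch compares against None explicitly.

```python
from typing import Optional


def apply_config(settings: Optional[dict[str, str]]) -> str:
    # "is not None" distinguishes a missing dictionary from an empty one.
    if settings is not None:
        user = settings.get("USER", "default")
        return f"User set to {user}"
    return "No settings provided."


print(apply_config({}))    # User set to default
print(apply_config(None))  # No settings provided.
```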

Question 4:
A developer writes a function that should accept either a single port number (int) or a list of port numbers (list[int]). The static type checker flags an error on the return statement. What is the cause of the type error on the return statement?

In [None]:
def format_ports(ports: int | list[int]) -> str:
    if isinstance(ports, int):
        return f"Port: {ports}"
    else:
        return f"Ports: {','.join(ports)}"


The str.join() method expects an iterable of strings, but mypy knows that ports is a list[int], so the elements are not strings.

Correct

Correct. This is the precise error. str.join() requires all elements in the iterable to be strings. The static checker sees that the ports list contains integers and flags this incompatibility. The fix is to convert each integer to a string, for example, using a generator expression: ','.join(str(p) for p in ports).

Question 5:
You have a function that processes a user ID, which can be either an integer or a string representation of an integer. What is printed to the console?

In [6]:
def get_user_id(raw_id: str | int) -> int:
    if isinstance(raw_id, str):
        return int(raw_id)
    return raw_id
 
result1 = get_user_id(123)
result2 = get_user_id("456")
 
print(f"{type(result1)} {type(result2)}")

<class 'int'> <class 'int'>


Correct. For result1, get_user_id(123) is called. isinstance(123, str) is false, so the function returns the integer 123 directly. type(result1) is <class 'int'>. For result2, get_user_id("456") is called. isinstance("456", str) is true, so the function returns int("456"), which is the integer 456. type(result2) is <class 'int'>.

Question 6:
A function is defined as def get_item(key: str) -> Optional[str]:. Why is this Optional[str] annotation more precise and useful than -> str | None?


There is no difference. Optional[T] is simply an alias for Union[T, None]. The choice is purely stylistic.

Correct

Correct. Optional[T] is exactly equivalent to Union[T, None] (or T | None in Python 3.10+). It was created as a common shorthand to more clearly express the intent that a value can be "present" or "absent" (None). While the choice is stylistic, Optional is often preferred for its clarity in this specific use case.