You are an expert Python developer specializing in the Databricks environment. Your task is to create a complete Python script to be executed within a Databricks notebook. The script must perform the following operations:
1.	Data Retrieval from SpaceX API:
o	Interact with the SpaceX v3 REST API (https://api.spacexdata.com/v3).
o	Retrieve data from one endpoint whose records contain numerical fields where missing values might occur: 
	Primary: all cores, https://api.spacexdata.com/v3/cores (chosen because fields such as reuse_count and rtls_landings are more likely to illustrate missing numerical values).
	Alternative: all launches, https://api.spacexdata.com/v3/launches
o	Handle potential errors during the API calls (e.g., timeouts, non-200 status codes).
2.	Missing Value Imputation (Mean):
o	Perform mean imputation on the retrieved data (list of dictionaries).
o	Imputation Logic: 
	Identify Numerical Fields: First, automatically identify the keys/fields within the dictionaries that predominantly contain numerical values (int or float). You might need to inspect the first few records or a sample to determine these fields reliably, or iterate through all records checking types.
	Calculate Mean per Field: For each identified numerical field, calculate the mean using only the existing, non-missing (not None) numerical values across all records in the dataset.
	Impute Missing Values: Iterate through the dataset again. For each numerical field, replace any missing values (represented as None) with the pre-calculated mean for that specific field.
	Handle Edge Cases: If a numerical field contains only missing values (or no valid numbers from which to calculate a mean), log a warning and leave the missing values as None. Do not impute 0 in this case.
o	The final result should be the original list of dictionaries, but with missing numerical values replaced by the calculated mean for their respective fields.
3.	Control Parameters and Debugging:
o	Include a variable at the beginning of the script to define the API endpoint URL, making it easily modifiable: 
	API_ENDPOINT_URL = "https://api.spacexdata.com/v3/cores" #(or /launches)
o	Use Python's standard logging module to provide informative output during execution. Configure logging to display messages at the INFO level.
o	Log key messages such as: starting data retrieval, number of records retrieved, starting imputation process, identified numerical fields potentially needing imputation (e.g., ['reuse_count', 'rtls_attempts', ...]), calculated mean for field X, number of missing values imputed for field X, any warnings for fields with no calculable mean, imputation complete, starting upload to httpbin, upload outcome.
4.	Execution Time Measurement:
o	Code Execution Time: Measure the time taken to perform the main operations (data retrieval + imputation). Print this time after the imputation operation is complete.
o	Pipeline Execution Time: Measure the total execution time of the entire script (from the beginning until after the upload to httpbin). Print this total time at the end of the script. Use Python's time module.
5.	Upload Result:
o	Take the resulting imputed list of dictionaries from the imputation operation.
o	Serialize it into JSON format.
o	Make an HTTP POST request to the https://httpbin.org/post endpoint, sending the resulting imputed JSON data in the request body.
o	Verify the response from httpbin.org (e.g., check the status code) and log the outcome of the upload operation.
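
The imputation rules in step 2 can be illustrated on a tiny hand-made dataset. This is only a sketch under stated assumptions: the records below are hypothetical sample data (not real API output), and impute_field is an illustrative helper name, not part of the requested script.

```python
from statistics import mean

# Hypothetical sample records, NOT real SpaceX API output
records = [
    {"serial": "A", "reuse_count": 2, "rtls_landings": None},
    {"serial": "B", "reuse_count": None, "rtls_landings": None},
    {"serial": "C", "reuse_count": 4, "rtls_landings": None},
]

def impute_field(rows, field):
    """Replace None in `field` with the mean of the non-missing numbers.

    Returns the mean that was used, or None if there was nothing to average.
    """
    values = [
        r[field] for r in rows
        if isinstance(r.get(field), (int, float)) and not isinstance(r.get(field), bool)
    ]
    if not values:
        return None  # edge case: no valid numbers, leave the None values in place
    field_mean = mean(values)
    for r in rows:
        if r.get(field) is None:
            r[field] = field_mean
    return field_mean

reuse_mean = impute_field(records, "reuse_count")    # mean of 2 and 4
rtls_mean = impute_field(records, "rtls_landings")   # only missing values
```

Here the missing reuse_count of record B is filled with the mean of the two known values, while rtls_landings stays None everywhere because no mean can be calculated.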


In [0]:
import requests
import logging
import time
import json

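# Configure logging at INFO level.
# force=True (Python 3.8+) replaces any handlers already attached to the root logger,
# which a notebook runtime may have installed before this cell runs.
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', force=True)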

# Control parameter for API endpoint
API_ENDPOINT_URL = "https://api.spacexdata.com/v3/cores"  # Alternative: "https://api.spacexdata.com/v3/launches"

def fetch_data(api_url):
    """Fetch data from the specified SpaceX API endpoint."""
    try:
        logging.info(f"Starting data retrieval from {api_url}")
        response = requests.get(api_url, timeout=10)
        response.raise_for_status()  # Raise an exception for HTTP errors
        data = response.json()
        logging.info(f"Data retrieval successful. Retrieved {len(data)} records.")
        return data
    except requests.exceptions.RequestException as e:
        logging.error(f"Error during data retrieval: {e}")
        return None

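def identify_numerical_fields(data):
    """Identify fields that hold a numerical (int or float) value in at least one record."""
    numerical_fields = set()
    for record in data:
        for key, value in record.items():
            # bool is a subclass of int, so exclude it explicitly;
            # otherwise True/False flags would be averaged as 1/0.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                numerical_fields.add(key)
    numerical_fields = sorted(numerical_fields)  # stable order for logging
    logging.info(f"Identified numerical fields: {numerical_fields}")
    return numerical_fields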

def calculate_mean(data, field):
    """Calculate the mean of non-missing values for a specific field."""
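    # Use only real numbers: None, strings, and booleans (a subclass of int) are skipped.
    valid_values = [
        record[field] for record in data
        if isinstance(record.get(field), (int, float)) and not isinstance(record.get(field), bool)
    ]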
    if valid_values:
        mean_value = sum(valid_values) / len(valid_values)
        logging.info(f"Calculated mean for field '{field}': {mean_value}")
        return mean_value
    else:
        logging.warning(f"No valid values found to calculate mean for field '{field}'. Leaving missing values as None.")
        return None

def impute_missing_values(data, numerical_fields):
    """Perform mean imputation on numerical fields."""
    means = {field: calculate_mean(data, field) for field in numerical_fields}
    imputed_count = {field: 0 for field in numerical_fields}

    for record in data:
        for field in numerical_fields:
            if record.get(field) is None and means[field] is not None:
                record[field] = means[field]
                imputed_count[field] += 1

    for field, count in imputed_count.items():
        logging.info(f"Imputed {count} missing values for field '{field}'")

    logging.info("Imputation process completed.")
    return data

def upload_to_httpbin(data):
    """Serialize data to JSON and upload it to httpbin.org."""
    try:
        json_data = json.dumps(data)
        logging.info("Starting upload to https://httpbin.org/post")
        # Set a timeout so a stalled connection cannot hang the notebook indefinitely
        response = requests.post(
            "https://httpbin.org/post",
            data=json_data,
            headers={"Content-Type": "application/json"},
            timeout=30,
        )
        response.raise_for_status()
        logging.info(f"Upload successful. Response status code: {response.status_code}")
        return response
    except requests.exceptions.RequestException as e:
        logging.error(f"Error during upload to httpbin.org: {e}")
        return None

def main():
    start_time_pipeline = time.time()

    # Step 1: Fetch data from SpaceX API (timed as part of the main operations)
    start_time_main_operations = time.time()
    data = fetch_data(API_ENDPOINT_URL)
    if data is None:
        logging.error("Exiting due to data retrieval failure.")
        return

    # Step 2: Identify numerical fields and perform imputation
    numerical_fields = identify_numerical_fields(data)
    imputed_data = impute_missing_values(data, numerical_fields)
    end_time_main_operations = time.time()
    logging.info(f"Main operations (data retrieval + imputation) took {end_time_main_operations - start_time_main_operations:.2f} seconds.")

    # Step 3: Upload the imputed data to httpbin.org
    upload_response = upload_to_httpbin(imputed_data)
    if upload_response is None:
        logging.error("Upload to httpbin.org failed.")

    end_time_pipeline = time.time()
    logging.info(f"Total pipeline execution time: {end_time_pipeline - start_time_pipeline:.2f} seconds.")

if __name__ == "__main__":
    main()
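
The task asks for fields that predominantly contain numerical values, whereas identify_numerical_fields above accepts a key as soon as a single record holds a number for it. A stricter check could look roughly like the sketch below. The function name find_predominantly_numeric_fields, the threshold parameter, and the sample records are all assumptions made for illustration; they are not part of the SpaceX API or of the script above.

```python
def find_predominantly_numeric_fields(records, threshold=0.5):
    """Return keys whose non-missing values are numeric in more than `threshold` of cases.

    Hypothetical helper: None values are ignored when computing the share,
    and booleans are not counted as numbers even though bool subclasses int.
    """
    numeric = {}  # key -> count of int/float values
    present = {}  # key -> count of non-None values
    for record in records:
        for key, value in record.items():
            if value is None:
                continue
            present[key] = present.get(key, 0) + 1
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                numeric[key] = numeric.get(key, 0) + 1
    return sorted(k for k, n in present.items() if numeric.get(k, 0) / n > threshold)

# Hypothetical sample records, NOT real API output
sample = [
    {"block": 5, "reuse_count": 1, "water_landing": False, "status": "active"},
    {"block": None, "reuse_count": 0, "water_landing": True, "status": "lost"},
    {"block": 4, "reuse_count": None, "water_landing": False, "status": 3},
]
fields = find_predominantly_numeric_fields(sample)
```

With this sample, block and reuse_count qualify, water_landing is rejected because it only holds booleans, and status is rejected because a single stray number does not make the field predominantly numerical.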