<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       BreuschPaganGodfrey Function in Vantage
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial'><b>Introduction</b></p>
<p style = 'font-size:18px;font-family:Arial'><b>BreuschPaganGodfrey</b></p>
<p style = 'font-size:16px;font-family:Arial'>The BreuschPaganGodfrey() function is used to detect the presence of heteroscedasticity in regression analysis. Heteroscedasticity is the situation where the variability of the error term, that is, the difference between the observed values and the predicted values, is not constant across all levels of the independent variables.</p>
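<p style = 'font-size:16px;font-family:Arial'>In symbols, the definition above can be sketched as follows. The notation (a linear model with k independent variables and error term &epsilon;<sub>i</sub>) is introduced here for illustration only and is not taken from the function documentation.</p>

$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i$$

$$\text{homoscedastic: } \operatorname{Var}(\varepsilon_i \mid x_i) = \sigma^2 \ \text{ for all } i, \qquad \text{heteroscedastic: } \operatorname{Var}(\varepsilon_i \mid x_i) = \sigma_i^2 \ \text{ varies with } i$$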

<p style = 'font-size:16px;font-family:Arial'>If heteroscedasticity is present in the data, the ordinary least squares (OLS) estimator still provides unbiased estimates of the regression coefficients, but those estimates are inefficient. In addition, the usual estimated standard errors of the coefficients are biased and can underestimate the true standard errors. As a result, hypothesis tests may be unreliable and lead to incorrect conclusions about the significance of the coefficients.</p>

<p style = 'font-size:16px;font-family:Arial'>Although heteroscedasticity makes OLS estimates inefficient and their standard errors unreliable, it may not be a major concern in every case. For example, variation in the error variance may not be a concern when:</p>
<ul style = 'font-size:16px;font-family:Arial'>
<li style = 'font-size:16px;font-family:Arial'>The variation is small, so the OLS estimator remains unbiased and nearly efficient.</li>
<li style = 'font-size:16px;font-family:Arial'>The sample size is large, so the OLS estimator is more robust to heteroscedasticity.</li>
<li style = 'font-size:16px;font-family:Arial'>The estimates are used for prediction rather than for inference.</li>
</ul>
<p style = 'font-size:16px;font-family:Arial'>BreuschPaganGodfrey involves regressing the squared residuals from the original regression on the independent variables and their squares, and then using this auxiliary regression to test whether the coefficients on those variables are jointly different from zero. Coefficients that are jointly significant indicate that the error variance depends on the independent variables.
BreuschPaganGodfrey is used in econometrics and other fields that rely on regression analysis, typically in conjunction with other diagnostic tests, to assess the assumptions and validity of a regression model.</p>
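<p style = 'font-size:16px;font-family:Arial'>The idea of the auxiliary regression can be sketched in plain Python. This is a minimal illustration, not the Vantage implementation: it assumes the common n&middot;R<sup>2</sup> Lagrange-multiplier form of the statistic, it regresses the squared residuals on the independent variables only (not on their squares), and it uses synthetic data with deterministic alternating-sign "errors" so that the result is reproducible. All helper names (solve_linear, ols_fitted, r_squared, bp_lm_statistic) are hypothetical.</p>

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_fitted(design, y):
    """Return OLS fitted values from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return [sum(b * v for b, v in zip(beta, r)) for r in design]


def r_squared(y, fitted):
    """Coefficient of determination, clamped at 0 to absorb rounding error."""
    mean_y = sum(y) / len(y)
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return max(0.0, 1.0 - ss_resid / ss_total)


def bp_lm_statistic(x_rows, y):
    """n * R^2 from regressing squared OLS residuals on the regressors."""
    design = [[1.0] + list(row) for row in x_rows]  # add intercept column
    fitted = ols_fitted(design, y)
    squared_resid = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    aux_fitted = ols_fitted(design, squared_resid)
    return len(y) * r_squared(squared_resid, aux_fitted)


# Synthetic data (an assumption for illustration, not the house_values table).
xs = [float(i) for i in range(1, 41)]
x_rows = [[x] for x in xs]
signs = [1.0 if i % 2 == 0 else -1.0 for i in range(1, 41)]
y_constant_spread = [2.0 + 3.0 * x + s for x, s in zip(xs, signs)]
y_growing_spread = [2.0 + 3.0 * x + 0.5 * x * s for x, s in zip(xs, signs)]

lm_constant = bp_lm_statistic(x_rows, y_constant_spread)
lm_growing = bp_lm_statistic(x_rows, y_growing_spread)
print(f"LM, constant error spread: {lm_constant:.3f}")
print(f"LM, growing error spread:  {lm_growing:.3f}")
```

<p style = 'font-size:16px;font-family:Arial'>In this sketch the statistic stays near zero when the error spread is constant and becomes large when the spread grows with the independent variable, which is the pattern the test is designed to detect.</p>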

<p style = 'font-size:16px;font-family:Arial'>The following procedure is an example of how to use the BreuschPaganGodfrey() function:</p>
<ul style = 'font-size:16px;font-family:Arial'>
<li style = 'font-size:16px;font-family:Arial'>Use MultivarRegr() to create a regression model and generate residuals.</li>
<li style = 'font-size:16px;font-family:Arial'>Use BreuschPaganGodfrey() on the residuals output by MultivarRegr().</li>
</ul>


<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>1. Initiate a connection to Vantage</b>

<p style = 'font-size:16px;font-family:Arial'>In this section, we import the required libraries and set environment variables and environment paths (if required).</p>

In [None]:
from teradataml import (
    create_context,
    execute_sql,
    load_example_data,
    DataFrame, 
    in_schema,
    TDSeries,
    MultivarRegr,
    BreuschPaganGodfrey,
    db_drop_table,
    db_drop_view,
    remove_context
    )

# Modify the following to match the specific client environment settings
from teradataml.options.display import display  # not included in the import list above
display.max_rows = 5

<hr style="height:1px;border:none;">
<p style = 'font-size:18px;font-family:Arial'><b>1.1 Connect to Vantage</b></p>
<p style = 'font-size:16px;font-family:Arial'>You will be prompted to provide the password. Enter your password, press the Enter key, and then use the down arrow to go to the next cell.</p>

In [None]:
%run -i ../../UseCases/startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)

In [None]:
%%capture
execute_sql('''SET query_band='DEMO=PP_BreuschPaganGodfrey.ipynb;' UPDATE FOR SESSION; ''')

<p style = 'font-size:16px;font-family:Arial'>Run each of the following steps by pressing Shift + Enter.</p>

<hr style='height:1px;border:none;background-color:#00233C;'>

<p style = 'font-size:18px;font-family:Arial;color:#00233c'><b>1.2 Getting Data for This Demo</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Here, we load the house_values time series data set, which is available as example data in the teradataml library, and use it to demonstrate the function.</p>

In [None]:
load_example_data("uaf", ["house_values"])

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>2. Data Exploration</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Create a "Virtual DataFrame" that points to the data set in Vantage. Check the shape of the DataFrame as well as the data types of its columns.</p>

In [None]:
data = DataFrame.from_table("house_values")
print(data.shape)   # (rows, columns)
print(data.dtypes)  # column names and data types
data

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>3. BreuschPaganGodfrey</b>
<p style = 'font-size:16px;font-family:Arial'>After a regression has been run, the BreuschPaganGodfrey() function checks for heteroscedasticity among the residual terms with respect to one or more variables.</p>

<p></p>
<p style = 'font-size:16px;font-family:Arial'>Detailed help can be found by passing the function name to the built-in help function.</p>

In [None]:
help(BreuschPaganGodfrey)

<p style = 'font-size:16px;font-family:Arial'>We first need to convert the data from a DataFrame into a TDSeries, which will be passed to the MultivarRegr function as input.</p>

In [None]:
data_series_df = TDSeries(data=data,
                              id="cityid",
                              row_index="TD_TIMECODE",
                              payload_field=["house_val","salary","mortgage"],
                              payload_content="MULTIVAR_REAL")

<p style = 'font-size:16px;font-family:Arial'>We will create a multivariate regression model. Setting residuals=True makes the model output the residuals that the test needs.</p>

In [None]:
mvr_out = MultivarRegr(data=data_series_df,
                           variables_count=3,
                           weights=False,
                           formula="Y = B0 + B1*X1 + B2*X2",
                           algorithm='QR',
                           coeff_stats=True,
                           model_stats=True,
                           residuals=True)

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We will extract the residuals from the model as a teradataml TDSeries object, and then perform the Breusch-Pagan-Godfrey (BPG) test using that TDSeries as input.</p>

In [None]:
data_series_bg = TDSeries(data=mvr_out.fitresiduals,
                              id="cityid",
                              row_index="ROW_I",
                              row_index_style= "SEQUENCE",
                              payload_field=["RESIDUAL","ACTUAL_VALUE","CALC_VALUE"],
                              payload_content="MULTIVAR_REAL")

In [None]:
uaf_out = BreuschPaganGodfrey(data=data_series_bg,
                                  variables_count=2,
                                  significance_level=0.01)

# Print the result DataFrame.
uaf_out.result

<p></p>
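<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The 'NULL_HYPOTHESIS' value indicates whether heteroscedasticity was detected at the chosen significance level. The null hypothesis of the test is that the error variance is constant (homoscedastic).</p>
<ul style = 'font-size:16px;font-family:Arial;color:#00233C'>
          <li style = 'font-size:16px;font-family:Arial;color:#00233C'><b>ACCEPT</b> means that the null hypothesis is not rejected, so the variance is treated as homoscedastic.</li>
          <li style = 'font-size:16px;font-family:Arial;color:#00233C'><b>REJECT</b> means that the null hypothesis is rejected, so the variance is heteroscedastic.</li>
</ul>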
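<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The decision rule behind these two outcomes can be sketched in plain Python. This is an illustration under textbook assumptions, not the Vantage implementation: the test statistic is assumed to follow a chi-square distribution whose degrees of freedom equal the number of variables in the auxiliary regression, the statistic value 12.5 is made up, and the helper names chi2_sf and bpg_decision are hypothetical.</p>

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from Q(1/2, y) for odd df or Q(1, y) for even df,
    # where Q is the regularized upper incomplete gamma function.
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def bpg_decision(lm_statistic, df, significance_level):
    """Return (p_value, decision) for the null hypothesis of constant variance."""
    p_value = chi2_sf(lm_statistic, df)
    decision = "REJECT" if p_value < significance_level else "ACCEPT"
    return p_value, decision


# Made-up statistic for illustration, with the significance level used above.
p_value, decision = bpg_decision(lm_statistic=12.5, df=2, significance_level=0.01)
print(f"p-value = {p_value:.4f}, NULL_HYPOTHESIS = {decision}")
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>A smaller significance_level makes the test more conservative: the same statistic is less likely to lead to REJECT.</p>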

<hr style="height:2px;border:none;">
<b style = 'font-size:20px;font-family:Arial'>4. Cleanup</b>

<p style = 'font-size:18px;font-family:Arial'><b>Work Tables</b></p>
<p style = 'font-size:16px;font-family:Arial'>The following code will clean up intermediate tables.</p>

In [None]:
db_drop_table("house_values")

In [None]:
remove_context()

<hr style="height:1px;border:none;">

<p style = 'font-size:16px;font-family:Arial'><b>Links:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Teradataml Python reference: <a href = 'https://docs.teradata.com/search/all?query=Python+Package+User+Guide&content-lang=en-US'>here</a></li>
    <li>BreuschPaganGodfrey function reference: <a href = 'https://docs.teradata.com/search/all?query=BreuschPaganGodfrey&content-lang=en-US'>here</a></li>
</ul>

<footer style="padding-bottom:35px; border-bottom:3px solid #91A0Ab">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2025. All Rights Reserved
        </div>
    </div>
</footer>