<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       OneClassSVM and OneClassSVMPredict functions in Vantage
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>Introduction</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>OneClassSVM is a linear support vector machine (SVM) that performs classification analysis on data sets to identify outliers or novelties. The function supports the Classification (loss: hinge) model. During training, all data is assumed to belong to a single class (value 1), so the model does not need a ResponseColumn. OneClassSVMPredict outputs 0 or 1: a value of 0 marks an outlier, and 1 marks a normal observation.<br>This notebook shows how to use the OneClassSVM and OneClassSVMPredict functions available in Vantage.</p>
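
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>To make the 0/1 output convention concrete, here is a minimal conceptual sketch of a linear one-class decision rule in plain Python. The weights, the offset, and the helper names <code>decision_value</code> and <code>predict_label</code> are hypothetical and chosen only for illustration. The sketch does not call Vantage and is not a description of how OneClassSVM is implemented internally.</p>

```python
# Conceptual sketch only: a linear one-class decision rule.
# The weights and offset are made-up values, not output from Vantage.

def decision_value(weights, offset, row):
    """Return the linear score w.x - offset for one observation."""
    return sum(w * x for w, x in zip(weights, row)) - offset


def predict_label(weights, offset, row):
    """Map the score to the 0/1 convention: 1 = normal, 0 = outlier."""
    return 1 if decision_value(weights, offset, row) >= 0 else 0


weights = [0.5, -0.25]  # hypothetical model coefficients
offset = 0.1            # hypothetical threshold

rows = [[1.0, 0.5], [-2.0, 3.0]]
labels = [predict_label(weights, offset, row) for row in rows]
print(labels)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Observations on the non-negative side of the boundary are labeled 1, and everything else is labeled 0, which matches the meaning of the prediction values described above.</p>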

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>1. Initiate a connection to Vantage</b>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>In this section, we import the required libraries and set environment variables and environment paths (if required).</p>

In [None]:
from teradataml import *

# Modify the following to match the specific client environment settings
display.max_rows = 5

<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:18px;font-family:Arial;color:#00233c'><b>1.1 Connect to Vantage</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>You will be prompted to provide the password. Enter your password, press the Enter key, and then use the down arrow to go to the next cell.</p>

In [None]:
%run -i ../../UseCases/startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)

In [None]:
%%capture
execute_sql('''SET query_band='DEMO=PP_OneClassSVM_and_OneClassSVMPredict_Python.ipynb;' UPDATE FOR SESSION; ''')

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Begin running steps with Shift + Enter keys. </p>

<hr style='height:1px;border:none;background-color:#00233C;'>

<p style = 'font-size:18px;font-family:Arial;color:#00233c'><b>1.2 Getting Data for This Demo</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Here, we load a sample data set that ships with the teradataml library and use it to demonstrate the functions.</p>

In [None]:
load_example_data("teradataml", ["cal_housing_ex_raw"])

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Next is an optional step – if you want to see the status of databases/tables created and space used.</p>

In [None]:
%run -i ../../UseCases/run_procedure.py "call space_report();"        # Takes 10 seconds

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>2. Data Exploration</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Create a "Virtual DataFrame" that points to the data set in Vantage. Check the shape of the DataFrame and preview its columns.</p>

In [None]:
tdf = DataFrame.from_table("cal_housing_ex_raw")
print("Shape of the data: ", tdf.shape)
tdf

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Scale the data using the ScaleFit and ScaleTransform functions.</p>

In [None]:
# Compute scaling statistics for the target columns using the 'STD' method.
fit_obj = ScaleFit(data=tdf,
                   target_columns=['MedInc', 'HouseAge', 'AveRooms',
                                   'AveBedrms', 'Population', 'AveOccup',
                                   'Latitude', 'Longitude'],
                   scale_method="STD")

# Apply the fitted scaling to the data, carrying "id" and "MedHouseVal" through unchanged.
transform_obj = ScaleTransform(data=tdf,
                               object=fit_obj.output,
                               accumulate=["id", "MedHouseVal"])
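
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>As a rough illustration of what standard-deviation scaling does to a single column, the sketch below standardizes a small list of made-up values in plain Python. It assumes the population standard deviation and returns zeros for a constant column; both are assumptions made for this example, so check the ScaleFit documentation for the exact formula Vantage applies. The <code>standardize</code> helper is hypothetical and not part of teradataml.</p>

```python
from statistics import fmean, pstdev


def standardize(values):
    """Center values on their mean and divide by the population std deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # Assumption for this sketch: a constant column maps to all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


house_age = [10.0, 20.0, 30.0, 40.0]  # made-up sample values
scaled = standardize(house_age)
print([round(v, 4) for v in scaled])
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>After scaling, the column has mean 0 and standard deviation 1, so features measured in very different units contribute on a comparable scale when the model is trained.</p>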

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Create a OneClassSVM model to find anomalies.<br>For detailed help, pass the function name to the built-in help function.</p>

In [None]:
help(OneClassSVM)

In [None]:
# Train a OneClassSVM model on the scaled data so that it can
# flag anomalous rows. See help(OneClassSVM) for full argument details.
one_class_svm = OneClassSVM(data=transform_obj.result,
                            input_columns=['MedInc', 'HouseAge', 'AveRooms',
                                           'AveBedrms', 'Population', 'AveOccup',
                                           'Latitude', 'Longitude'],
                            local_sgd_iterations=537,
                            batch_size=1,
                            learning_rate='constant',  # keep the step size fixed
                            initial_eta=0.01,          # step size used by the constant schedule
                            lambda1=0.1,               # regularization strength
                            alpha=0.0,                 # elastic net mix; 0.0 selects pure L2
                            momentum=0.0,              # no momentum
                            iter_max=1
                            )
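
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The <code>lambda1</code> and <code>alpha</code> arguments control regularization. The sketch below computes a common textbook form of the elastic net penalty in plain Python to show how the two values interact. The exact formula used by Vantage may differ, and the <code>elastic_net_penalty</code> helper and the weights are hypothetical.</p>

```python
def elastic_net_penalty(weights, lambda1, alpha):
    """Textbook elastic net penalty: a blend of L1 and (half) squared L2 norms.

    alpha = 0.0 gives a pure L2 penalty, alpha = 1.0 a pure L1 penalty.
    This is an illustrative form, not necessarily the one Vantage uses.
    """
    l1_term = sum(abs(w) for w in weights)
    l2_term = 0.5 * sum(w * w for w in weights)
    return lambda1 * (alpha * l1_term + (1.0 - alpha) * l2_term)


weights = [0.5, -0.25]  # made-up coefficients

pure_l2 = elastic_net_penalty(weights, lambda1=0.1, alpha=0.0)
pure_l1 = elastic_net_penalty(weights, lambda1=0.1, alpha=1.0)
print(pure_l2, pure_l1)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>With <code>alpha=0.0</code>, as in the training call above, only the L2 term contributes, and the penalty scales linearly with <code>lambda1</code>.</p>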

In [None]:
# Print the result DataFrame.
one_class_svm.result

In [None]:
one_class_svm.output_data

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Use the OneClassSVMPredict function with the model created above to predict whether each observation is an anomaly.<br>
For detailed help, pass the function name to the built-in help function.</p>

In [None]:
help(OneClassSVMPredict)

In [None]:
OneClassSVMPredict_out = OneClassSVMPredict(newdata=transform_obj.result,
                                            object=one_class_svm.result,
                                            id_column="id"
                                            )

In [None]:
OneClassSVMPredict_out.result

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Check whether the data contains anomalies by filtering the result for rows with a prediction of 0.</p>

In [None]:
# Keep only the rows flagged as outliers (prediction == 0).
predictions = OneClassSVMPredict_out.result
predictions[predictions.prediction == 0]

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The result above lists the ids of the observations that were flagged as anomalies.</p>
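
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Once the flagged rows are on the client, a quick summary is often useful. The sketch below works on a hypothetical list of (id, prediction) pairs in plain Python rather than on the actual prediction output, so the ids and labels are made up for illustration.</p>

```python
# Hypothetical (id, prediction) pairs standing in for the prediction output.
sample_predictions = [(1, 1), (2, 0), (3, 1), (4, 1), (5, 0)]

# Collect the ids labeled 0 (outliers) and compute their share of all rows.
outlier_ids = sorted(row_id for row_id, label in sample_predictions if label == 0)
outlier_share = len(outlier_ids) / len(sample_predictions)

print("Outlier ids:", outlier_ids)
print("Share of rows flagged:", outlier_share)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>For the real output, one option is to bring the filtered teradataml DataFrame to the client with <code>to_pandas()</code> and apply the same logic there.</p>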

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>3. Cleanup</b>

<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:18px;font-family:Arial;color:#00233C'> <b>Databases and Tables </b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following code drops the table created above and closes the connection to Vantage.</p>

In [None]:
db_drop_table("cal_housing_ex_raw")

In [None]:
remove_context()

<hr style="height:1px;border:none;background-color:#00233C;">
<p style = 'font-size:16px;font-family:Arial;color:#00233C'><b>Links:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Teradataml Python reference: <a href = 'https://docs.teradata.com/search/all?query=Python+Package+User+Guide&content-lang=en-US'>here</a></li>
    <li>OneClassSVM function reference: <a href = 'https://docs.teradata.com/search/all?query=OneClassSVM&content-lang=en-US'>here</a></li>
    <li>OneClassSVMPredict function reference: <a href = 'https://docs.teradata.com/search/all?query=OneClassSVMPredict&content-lang=en-US'>here</a></li>
</ul>

<footer style="padding-bottom:35px; background:#f9f9f9; border-bottom:3px solid #00233C">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2025. All Rights Reserved
        </div>
    </div>
</footer>