In [None]:
Oracle AI Data Platform v1.0

Copyright © 2025, Oracle and/or its affiliates.

Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/

# Simple TensorFlow on AI Data Platform Example
 **Training and evaluating a TensorFlow model on an AI Data Platform cluster**
 
 This notebook demonstrates training a binary classification model, using scikit-learn (https://scikit-learn.org/stable/) for preprocessing and evaluation and TensorFlow (https://www.tensorflow.org) for the model itself. It covers:
 
 1. **Create source dataframe**
 2. **Create TensorFlow model**
 3. **Train TensorFlow model**
 4. **Evaluate the model and generate predictions**

 **Prerequisites**

Before you begin, ensure you have:
 - The necessary IAM policies for accessing AI Data Platform. Learn more about permissions.
 - A configured AI Data Platform environment with a compute cluster. Install the requirements file into the cluster libraries; it includes:
   - tensorflow
   - scikit-learn
   - pandas
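
One way to confirm that the libraries are available before running the rest of the notebook is to query the installed package metadata. The following is a minimal sketch, not part of the original sample: the helper name `installed_versions` is hypothetical, and it assumes the packages are installed under the distribution names `tensorflow`, `scikit-learn`, and `pandas` (some environments use variants such as `tensorflow-cpu`).

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names from the prerequisites list (assumed; adjust if your
# requirements file pins a variant such as tensorflow-cpu).
REQUIRED_PACKAGES = ["tensorflow", "scikit-learn", "pandas"]


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED_PACKAGES).items():
    print(f"{name}: {ver if ver else 'NOT INSTALLED'}")
```

If any package is reported as not installed, add it to the cluster libraries and restart the cluster before continuing.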

**Next Steps**

Now that you’ve explored this sample TensorFlow and scikit-learn notebook, consider the following next steps:

1. Enhance Model Performance
 - Fine-tune hyperparameters using GridSearchCV or RandomizedSearchCV in scikit-learn.
 - Experiment with different optimizers, activation functions, or network architectures in TensorFlow.
2. Expand the Dataset
 - Use data augmentation techniques (if working with images) or synthetic data generation to improve model robustness.
 - Explore feature engineering to extract more meaningful insights from your data.
3. Integrate with Other Tools
 - Use MLflow for tracking experiments and model performance.
4. Explore Advanced Topics
 - Implement custom TensorFlow layers and loss functions.
 - Explore transfer learning with pre-trained models like ResNet or BERT, depending on the problem domain.
 - Integrate scikit-learn models with deep learning pipelines (e.g., using scikit-learn’s Pipeline API with TensorFlow embeddings).
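
As an illustration of the hyperparameter tuning step, the sketch below runs `GridSearchCV` over the regularization strength of a scikit-learn `LogisticRegression` inside a `Pipeline`. It is an assumption-laden example rather than part of this sample: the data is synthetic (generated with `make_classification`, not the churn dataframe used in this notebook), and the model, parameter grid, and names `X_demo`, `y_demo`, and `search` are chosen only for demonstration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: NOT the notebook's churn data).
X_demo, y_demo = make_classification(n_samples=200, n_features=6, random_state=42)

# Scaling lives inside the pipeline so it is re-fit on each training fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Candidate values for the inverse regularization strength C (illustrative).
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold cross-validated grid search, scored by accuracy.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_demo, y_demo)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

Keeping the scaler inside the pipeline means each fold is scaled using only its own training portion, which avoids leaking validation data into preprocessing. `RandomizedSearchCV` accepts the same pipeline and may be a better fit when the grid is large.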

In [1]:
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.metrics import classification_report

import pandas as pd

# Example: Load datasets
df1 = pd.DataFrame({
    'customer_id': [101, 102, 103],
    'age': [34, 29, 45],
    'gender': ['Male', 'Female', 'Male'],
    'income': [55000, 60000, 75000],
    'subscription_type': ['Premium', 'Free', 'Premium']
})

df2 = pd.DataFrame({
    'customer_id': [101, 102, 103],
    'total_logins_last_30_days': [15, 3, 20],
    'total_support_tickets': [2, 100, 100],
    'average_session_duration': [12.5, 8.3, 15.0],
    'churned': [0, 1, 1]
})

# Join datasets
df = pd.merge(df1, df2, on="customer_id")

# One-hot encode categorical features
encoder = OneHotEncoder(drop='first', sparse_output=False)
categorical_features = encoder.fit_transform(df[['gender', 'subscription_type']])
categorical_df = pd.DataFrame(categorical_features, columns=encoder.get_feature_names_out())

# Normalize numerical features
numerical_features = df[['age', 'income', 'total_logins_last_30_days', 
                         'total_support_tickets', 'average_session_duration']]
scaler = StandardScaler()
numerical_df = pd.DataFrame(scaler.fit_transform(numerical_features), columns=numerical_features.columns)

# Combine features
X = pd.concat([numerical_df, categorical_df], axis=1)
y = df['churned']

# Train-test split (with only 3 sample rows, the test set holds a single row)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create TensorFlow model
# An explicit Input layer replaces the deprecated input_shape argument on Dense
model = tf.keras.Sequential([
    tf.keras.Input(shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')  # Binary classification
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))

In [1]:
# Evaluate on the held-out test data (the same data used for validation during training)
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)
print(f"Test Accuracy: {test_acc}")


In [1]:
# Predict churn probabilities, then threshold at 0.5 to get 0/1 class labels
y_pred = (model.predict(X_test) > 0.5).astype("int32").ravel()

In [1]:
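# Generate classification report
# labels=[0, 1] keeps both classes in the report even when the small test set
# contains only one of them; without it, target_names can mismatch and raise a ValueError
print(classification_report(y_test, y_pred, labels=[0, 1],
                            target_names=['No Churn', 'Churn'], zero_division=1))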