In [1]:
# for loading the dataset
import seaborn as sns

import logging

# Get the logger for oci.circuit_breaker
oci_circuit_breaker_logger = logging.getLogger("oci.circuit_breaker")

# Set the log level to WARNING (or higher, e.g. ERROR) to suppress INFO logs
oci_circuit_breaker_logger.setLevel(logging.WARNING)
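
As a quick sanity check, the effect of the level change can be inspected with the standard `logging` API. This is only a sketch: it assumes nothing about what the OCI SDK actually logs, and the variable names `info_enabled` and `warning_enabled` are illustrative.

```python
import logging

# Same logger name as in the cell above
logger = logging.getLogger("oci.circuit_breaker")
logger.setLevel(logging.WARNING)

# isEnabledFor reports whether a record at the given level would be processed
info_enabled = logger.isEnabledFor(logging.INFO)
warning_enabled = logger.isEnabledFor(logging.WARNING)

print("INFO enabled:", info_enabled)
print("WARNING enabled:", warning_enabled)
```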

In [2]:
# load the magic extension
%load_ext oci_genai_magics

OCIGenaiMagics extension loaded...
List of magic commands available:
* ask
* ask_code
* ask_data
* clear_history
* show_variables
* show_model_config
* genai_stats
* clear_stats


In [3]:
%show_model_config

Model configuration defined in config.py:
* Model:  meta.llama-3.1-405b-instruct
* Endpoint:  https://inference.generativeai.us-chicago-1.oci.oraclecloud.com
* Temperature:  0.1
* Max_tokens:  1024


In [4]:
%genai_stats

Performance metrics:
* Total requests:  0
* Total input tokens:  0
* Total output tokens:  0


In [5]:
%ask What is a survival model?

A survival model, also known as a survival analysis or time-to-event model, is a type of statistical model used to analyze the time it takes for a specific event to occur. The event of interest can be a customer churning, a machine failing, a patient experiencing a disease recurrence, or even death.

The primary goal of a survival model is to estimate the probability of survival (or non-occurrence of the event) beyond a certain time point, given a set of predictor variables.

Key characteristics of survival models:

1. **Time-to-event data**: The outcome variable is the time it takes for the event to occur.
2. **Censoring**: Some observations may not experience the event during the study period, so their outcome is censored (i.e., we don't know when or if the event will occur).
3. **Non-negative outcome**: Time-to-event data is always non-negative, as time cannot be negative.

Common applications of survival models:

1. **Predicting customer churn**: Estimating the probability of a customer switching to a competitor based on their usage patterns and demographic data.
2. **Reliability engineering**: Modeling the time-to-failure of machines or components to optimize maintenance schedules.
3. **Medical research**: Analyzing the time-to-disease recurrence or death in patients with a specific condition.
4. **Insurance risk assessment**: Estimating the likelihood of a policyholder filing a claim based on their risk profile.

Some popular survival models include:

1. **Kaplan-Meier estimator**: A non-parametric model for estimating the survival function.
2. **Cox proportional hazards model**: A semi-parametric model for estimating the hazard ratio (i.e., the relative risk of the event occurring).
3. **Parametric models**: Such as the Weibull, exponential, and log-normal distributions, which assume a specific distribution for the time-to-event data.

Do you have any specific questions about survival models or their applications?
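
To make the Kaplan-Meier estimator mentioned in the answer above more concrete, here is a minimal pure-Python sketch. It is not taken from the notebook or from any library: the function name `kaplan_meier` and the toy data are hypothetical, and it assumes the usual convention that, at tied times, events are counted before censored subjects leave the risk set.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for each distinct time at which an event occurred.

    durations: time until the event or until censoring, one per subject
    observed:  True if the event occurred, False if the subject was censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # subjects leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: 7 subjects, 2 of them censored (observed=False)
durations = [2, 3, 3, 5, 8, 8, 9]
observed = [True, True, False, True, False, True, True]

km_curve = kaplan_meier(durations, observed)
for t, s in km_curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored subjects never trigger a drop in the curve, but they do shrink the risk set, which is why the later steps are larger than the earlier ones.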

#### Load some data (we choose a well-known dataset)

In [None]:
# load a dataset
titanic = sns.load_dataset("titanic")

In [None]:
%show_variables

In [None]:
%%ask_data
Is the titanic DataFrame in your context?

In [None]:
%%ask_data 
Analyze the titanic dataset I have loaded and provide a detailed report

In [None]:
%%ask_code
Provide the code to compute the correlation matrix for the numerical features in the titanic dataset.
Plot the correlation matrix.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

# Assuming the titanic dataset is already loaded into a pandas DataFrame called 'titanic'

# Select numerical features (include="number" covers every integer and float dtype)
numerical_features = titanic.select_dtypes(include="number").columns

# Compute correlation matrix
correlation_matrix = titanic[numerical_features].corr()

# Print correlation matrix
print(correlation_matrix)

# Plot correlation matrix
plt.figure(figsize=(10, 8))
sns.heatmap(correlation_matrix, annot=True, cmap="coolwarm", square=True)
plt.title("Correlation Matrix for Numerical Features")
plt.show()

In [None]:
%%ask_data
How many records and columns are in the titanic DataFrame?

In [None]:
%%ask_data
List all columns in the loaded titanic dataset that contain null values.

In [None]:
titanic.isnull().sum()
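
The per-column null counts above can be narrowed down to only the affected columns, and the same mask can be used row-wise to locate incomplete records. The sketch below uses a small made-up DataFrame named `sample` as a stand-in for `titanic`, so it runs without downloading anything; the values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the titanic DataFrame (values are made up)
sample = pd.DataFrame(
    {
        "age": [22.0, np.nan, 26.0, 35.0],
        "fare": [7.25, 71.28, 7.92, 53.10],
        "deck": [None, "C", None, "E"],
    }
)

# Per-column null counts, keeping only columns with at least one null
null_counts = sample.isnull().sum()
columns_with_nulls = null_counts[null_counts > 0].sort_values(ascending=False)
print(columns_with_nulls)

# Rows that contain at least one null value in any column
rows_with_nulls = sample[sample.isnull().any(axis=1)]
print(rows_with_nulls)
```

Replacing `sample` with `titanic` should give the corresponding results for the real dataset.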

In [None]:
%ask What is CatBoost? Give a detailed description.

In [None]:
%ask Can you suggest other gradient boosting models?

In [None]:
%%ask_code
How can I find rows with NaN values in the titanic DataFrame?