# Testing functions

In [2]:
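import os
import sys

# Make the parent directory importable so that `utils` resolves from this notebook.
sys.path.append('..')

from IPython.display import display, Markdown

from utils.llm import chat_completion, achat_completion, format_response

# Load API keys from a local .env file before any client is constructed.
from dotenv import load_dotenv
load_dotenv()

from openai import OpenAI, AsyncOpenAI
from anthropic import Anthropic

# The OpenAI clients (sync and async) read OPENAI_API_KEY from the environment.
client = OpenAI()
aclient = AsyncOpenAI()
# Perplexity is called through the OpenAI client by pointing it at a different base URL.
pplx_client = OpenAI(api_key=os.getenv('PPLX_API_KEY'), base_url="https://api.perplexity.ai")

# The Anthropic client reads ANTHROPIC_API_KEY from the environment.
anthropic_client = Anthropic()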
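
The helpers imported from `utils.llm` are not shown in this notebook. As a rough guide to what the later cells rely on, here is a minimal sketch of what a `format_response`-style helper might look like. The name `format_response_sketch` and its body are assumptions for illustration, not the actual implementation in `utils.llm`; the only thing taken from the notebook is that the helper returns a `(text, usage)` pair.

```python
from types import SimpleNamespace


def format_response_sketch(response):
    """Return (text, usage_dict) from an OpenAI-style chat completion object.

    Hypothetical stand-in for utils.llm.format_response, which is not shown here.
    """
    # The assistant text lives on the first choice's message.
    text = response.choices[0].message.content
    usage = response.usage
    # The openai SDK returns pydantic models, which expose model_dump();
    # fall back to vars() for plain objects such as the stub below.
    if hasattr(usage, "model_dump"):
        usage_dict = usage.model_dump()
    else:
        usage_dict = dict(vars(usage))
    return text, usage_dict


# Offline stand-in for a real response, so the sketch runs without an API key.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="hello"))],
    usage=SimpleNamespace(completion_tokens=1, prompt_tokens=2, total_tokens=3),
)
sketch_text, sketch_usage = format_response_sketch(fake_response)
print(sketch_text, sketch_usage)
```

The stub object only mimics the attributes the sketch touches; a real response carries more fields, such as the nested token details visible in the outputs below.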

## Test Synchronous Chat Completion

In [3]:
model_name = "gpt-4o"
o1_model_name = 'o1-preview'
model_name_pplx = "llama-3.1-sonar-huge-128k-online"
anthropic_model_name = "claude-3-5-sonnet-20241022"
temperature = 0.0

system_message = "You are a helpful assistant to an experienced data scientist."
user_message = "Using a table, explain the difference between LightGBM, XGBoost, and CatBoost."

### OpenAI

In [4]:
response = chat_completion(
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message}
    ],
    model_name=model_name,
    temperature=temperature,
    client=client
)

In [5]:
# Unpack into a new name so the raw `response` object survives if this cell is re-run.
response_text, usage = format_response(response)

display(Markdown(f'Usage: {usage}'))
display(Markdown(f'Response: {response_text}'))

Usage: {'completion_tokens': 491, 'prompt_tokens': 42, 'total_tokens': 533, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}

Response: Certainly! Below is a table that outlines the key differences between LightGBM, XGBoost, and CatBoost, which are all popular gradient boosting frameworks:

| Feature/Aspect       | LightGBM                                      | XGBoost                                      | CatBoost                                      |
|----------------------|-----------------------------------------------|----------------------------------------------|-----------------------------------------------|
| **Developer**        | Microsoft                                     | Distributed Machine Learning Community       | Yandex                                        |
| **Language Support** | C++, Python, R, Java, C#, and more            | C++, Python, R, Java, Scala, and more        | Python, R, Java, C++, and more                |
| **Tree Structure**   | Leaf-wise growth with depth limitation        | Level-wise growth                            | Symmetric trees                               |
| **Handling Categorical Features** | Requires preprocessing (e.g., one-hot encoding) | Requires preprocessing (e.g., one-hot encoding) | Native support for categorical features       |
| **Speed**            | Generally faster due to leaf-wise growth      | Slower compared to LightGBM and CatBoost     | Fast, especially with categorical features    |
| **Memory Usage**     | Lower memory usage due to efficient algorithms| Higher memory usage                          | Efficient memory usage                        |
| **Parallelization**  | Efficient parallelization                     | Good parallelization                         | Efficient parallelization                     |
| **Overfitting Control** | Can overfit if not tuned properly          | Good control with regularization parameters  | Good control with built-in regularization     |
| **Missing Values**   | Handles missing values natively               | Handles missing values natively              | Handles missing values natively               |
| **GPU Support**      | Yes                                           | Yes                                          | Yes                                           |
| **Ease of Use**      | Requires careful tuning for optimal performance| Well-documented and widely used              | User-friendly with automatic handling of categorical features |
| **Use Cases**        | Large datasets, high-dimensional data         | General-purpose, tabular data                | Datasets with categorical features, time series|

Each of these frameworks has its strengths and is suited to different types of problems. LightGBM is known for its speed and efficiency, especially with large datasets. XGBoost is a robust and widely-used framework with a strong community and extensive documentation. CatBoost excels in handling categorical features and is user-friendly, making it a great choice for datasets with many categorical variables.

### OpenAI - reasoning models

In [6]:
response_o1 = chat_completion(
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message}
    ],
    model_name=o1_model_name,
    temperature=temperature,
    client=client
)
response_o1, usage_o1 = format_response(response_o1)

display(Markdown(f'Usage: {usage_o1}'))
display(Markdown(f'Response: {response_o1}'))

Usage: {'completion_tokens': 1909, 'prompt_tokens': 49, 'total_tokens': 1958, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 960, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}

Response: Here is a table comparing LightGBM, XGBoost, and CatBoost:

| Feature                      | LightGBM                                                    | XGBoost                                                    | CatBoost                                                    |
|------------------------------|-------------------------------------------------------------|------------------------------------------------------------|-------------------------------------------------------------|
| **Developer/Organization**   | Microsoft                                                   | Distributed (Deep) Machine Learning Community              | Yandex                                                      |
| **Primary Language(s)**      | C++, with APIs for Python, R, etc.                          | C++, with APIs for Python, R, Java, Scala, etc.            | C++, with APIs for Python, R, etc.                          |
| **Key Features**             | - Exclusive Feature Bundling<br>- Gradient-based One-Side Sampling (GOSS)| - Regularization techniques<br>- Parallel processing<br>- Weighted quantile sketch| - Native handling of categorical features<br>- Ordered boosting to reduce overfitting<br>- Support for text data|
| **Speed and Performance**    | - Optimized for speed and memory efficiency<br>- Generally faster on large datasets| - High performance with optimized computing power<br>- Efficient with smaller to medium datasets| - Competitive speed, especially with categorical features<br>- Efficient with default parameter settings|
| **Handling of Categorical Features** | - Supports categorical features by converting them to numerical values internally<br>- May require tuning for optimal results| - Does not natively handle categorical features<br>- Requires preprocessing like one-hot encoding| - Excellent native handling of categorical features<br>- No need for preprocessing|
| **GPU Support**              | Yes                                                        | Yes                                                       | Yes                                                        |
| **Tree Growth Algorithm**    | - Leaf-wise (best-first) growth<br>- Can result in deeper trees and potential overfitting if not regularized| - Level-wise (depth-wise) growth<br>- Balances tree depth and can be more stable| - Symmetric tree growth<br>- Builds balanced trees, which can prevent overfitting|
| **Missing Value Handling**   | - Automatically handles missing values<br>- Learns optimal splits for missing values| - Automatically handles missing values<br>- Assigns default direction in the tree| - Automatically handles missing values<br>- Considers missing values as a separate category|
| **Regularization Techniques**| - Supports L1 and L2 regularization<br>- Provides parameters to control tree complexity| - Supports L1 and L2 regularization<br>- Parameters for tree complexity control<br>- Supports monotonic constraints| - Uses overfitting detector<br>- Supports L2 leaf regularization<br>- Incorporates various parameters to prevent overfitting|
| **Support for Sparse Data**  | - Efficient handling of sparse data<br>- Sparse optimization for features| - Efficient handling of sparse data<br>- Specialized data structures for sparsity| - Supports sparse data<br>- Handles missing and zero values effectively|
| **Ease of Use and API**      | - User-friendly APIs<br>- Similar interface to XGBoost<br>- Flexible with many parameters| - User-friendly APIs<br>- Widely adopted with extensive examples<br>- Supports multiple languages| - User-friendly APIs<br>- Simplifies handling of categorical data<br>- Less need for parameter tuning|
| **Documentation and Community Support** | - Good documentation<br>- Growing community<br>- Active development| - Extensive documentation<br>- Large community<br>- Widely used in competitions| - Comprehensive documentation<br>- Growing community support<br>- Active development|
| **Use Cases**                | - Ideal for large datasets<br>- High-dimensional data<br>- When speed is critical| - General-purpose model<br>- Suitable for a wide range of tasks<br>- Often used as a benchmark| - Datasets with categorical features<br>- When ease of use is important<br>- Tasks with text or categorical data|

**Summary:**

- **LightGBM** is optimized for speed and performance, especially with large datasets. It uses a leaf-wise tree growth strategy that can result in deeper trees and potential overfitting if not properly regularized. It provides support for categorical features and missing values but may require parameter tuning for optimal results.

- **XGBoost** is known for its flexibility and performance, providing extensive options for regularization and tree control. It does not natively handle categorical features, requiring preprocessing steps like one-hot encoding. XGBoost is widely used and has a large community, making it a reliable choice for many machine learning tasks.

- **CatBoost** excels in handling categorical features natively without the need for preprocessing. It uses ordered boosting and symmetric trees to combat overfitting and improve prediction accuracy. CatBoost is user-friendly with less parameter tuning required, making it suitable for datasets with a large number of categorical variables.
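
The usage dictionary for the reasoning model reports `reasoning_tokens` inside `completion_tokens_details`, and those tokens are counted within `completion_tokens`. The sketch below separates the visible answer from the hidden reasoning using the numbers printed above. The helper name `visible_completion_tokens` is hypothetical; the `or {}` guard is there because some providers return `None` for the details field, as the Perplexity output in this notebook does.

```python
# Values copied from the o1-preview usage output above.
usage_o1_example = {
    "completion_tokens": 1909,
    "prompt_tokens": 49,
    "total_tokens": 1958,
    "completion_tokens_details": {"reasoning_tokens": 960},
}


def visible_completion_tokens(usage):
    """Completion tokens minus reasoning tokens (hypothetical helper)."""
    # completion_tokens_details may be None for providers that do not report it.
    details = usage.get("completion_tokens_details") or {}
    return usage["completion_tokens"] - details.get("reasoning_tokens", 0)


visible = visible_completion_tokens(usage_o1_example)
reasoning_share = 960 / usage_o1_example["completion_tokens"]
print(visible)                     # 949 tokens of visible answer
print(f"{reasoning_share:.1%}")    # share of completion tokens spent on reasoning
```

In this run roughly half of the completion tokens went to reasoning that never appears in the response text.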

### Perplexity

In [7]:
response_pplx = chat_completion(
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message}
    ],
    model_name=model_name_pplx,
    temperature=temperature,
    client=pplx_client
)

In [8]:
response_content_pplx, usage_pplx = format_response(response_pplx)

display(Markdown(f'Usage: {usage_pplx}'))
display(Markdown(f'Response: {response_content_pplx}'))
display(Markdown(f'Citations: {response_pplx.citations}'))

Usage: {'completion_tokens': 493, 'prompt_tokens': 31, 'total_tokens': 524, 'completion_tokens_details': None, 'prompt_tokens_details': None}

Response: **Comparison of LightGBM, XGBoost, and CatBoost**

| **Feature** | **LightGBM** | **XGBoost** | **CatBoost** |
| --- | --- | --- | --- |
| **Tree Growth Strategy** | Leaf-wise (depth-first) | Level-wise (breadth-first) | Symmetric (balanced trees) |
| **Handling Categorical Features** | Native support with GOSS and EFB | Native support with one-hot encoding | Specifically designed for categorical features |
| **Handling Missing Values** | Native support | Native support | Native support |
| **Parallelization** | Multi-threaded with OpenMP | Multi-threaded with OpenMP | Single-threaded by default, but can be parallelized |
| **GPU Support** | Yes | Yes | Yes |
| **Prediction Latency** | Generally slower than CatBoost | Generally slower than CatBoost | Fastest in many cases, especially with high concurrent load |
| **Training Time** | Can be faster than CatBoost with GOSS and EFB | Can be slower than LightGBM | Can be slower than LightGBM and XGBoost |
| **Accuracy** | High accuracy, especially with GOSS and EFB | High accuracy | High accuracy, especially with categorical features |
| **Scalability** | Scalable with large datasets | Scalable with large datasets | Scalable with large datasets |

**Key Points:**
- **LightGBM** excels with its leaf-wise tree growth strategy and native support for categorical features and missing values. It is particularly fast with GOSS and EFB.
- **XGBoost** uses a level-wise tree growth strategy and supports one-hot encoding for categorical features. It is known for its regularization techniques and high accuracy.
- **CatBoost** is specifically designed for handling categorical features and uses symmetric tree growth. It often has the fastest prediction latency, especially under high concurrent loads.

**References:**
- [1] https://tech.deliveryhero.com/is-catboost-faster-than-lightgbm-and-xgboost/
- [2] https://arxiv.org/pdf/2305.17094.pdf
- [4] https://www.geeksforgeeks.org/gradientboosting-vs-adaboost-vs-xgboost-vs-catboost-vs-lightgbm/
- [5] https://datamapu.com/posts/classical_ml/gradient_boosting_variants/

Citations: ['https://tech.deliveryhero.com/is-catboost-faster-than-lightgbm-and-xgboost/', 'https://arxiv.org/pdf/2305.17094.pdf', 'https://www.youtube.com/watch?v=R5FB1ZUejXM', 'https://www.geeksforgeeks.org/gradientboosting-vs-adaboost-vs-xgboost-vs-catboost-vs-lightgbm/', 'https://datamapu.com/posts/classical_ml/gradient_boosting_variants/']
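
In the output above, the bracketed markers in the response text ([1], [2], [4], [5]) line up with 1-based positions in the `citations` list, and the third entry is never referenced. A minimal sketch of resolving markers to URLs follows, assuming that 1-based convention holds in general; the function name `cited_sources` is hypothetical.

```python
import re

# Citations list copied from the Perplexity output above.
citations_example = [
    "https://tech.deliveryhero.com/is-catboost-faster-than-lightgbm-and-xgboost/",
    "https://arxiv.org/pdf/2305.17094.pdf",
    "https://www.youtube.com/watch?v=R5FB1ZUejXM",
    "https://www.geeksforgeeks.org/gradientboosting-vs-adaboost-vs-xgboost-vs-catboost-vs-lightgbm/",
    "https://datamapu.com/posts/classical_ml/gradient_boosting_variants/",
]

# Shortened stand-in for the response text; only the markers matter here.
text_example = "References: [1] ... [2] ... [4] ... [5] ..."


def cited_sources(text, citations):
    """Map each [n] marker found in text to citations[n - 1] (assumed convention)."""
    indices = sorted({int(n) for n in re.findall(r"\[(\d+)\]", text)})
    return {i: citations[i - 1] for i in indices if 1 <= i <= len(citations)}


cited = cited_sources(text_example, citations_example)
uncited = [url for i, url in enumerate(citations_example, start=1) if i not in cited]
print(cited)
print(uncited)
```

Listing the uncited entries separately makes it easy to spot sources the model retrieved but did not use in its answer.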

### Anthropic

In [24]:
response_anthropic = anthropic_client.messages.create(
    model=anthropic_model_name,
    max_tokens=8192,
    temperature=temperature,
    system=system_message,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": user_message
                }
            ]
        }
    ]
)

In [25]:
display(Markdown(response_anthropic.content[0].text))

Here's a comparative table explaining the key differences between LightGBM, XGBoost, and CatBoost:

| Feature | LightGBM | XGBoost | CatBoost |
|---------|----------|----------|-----------|
| **Development** | By Microsoft | By DMLC (Distributed Machine Learning Community) | By Yandex |
| **Tree Growth** | Leaf-wise (best-first) growth | Level-wise growth | Symmetric tree growth |
| **Speed** | Fastest among the three | Moderate speed | Moderate to fast |
| **Memory Usage** | Most memory efficient | Higher memory usage | Moderate memory usage |
| **Handling Categorical Features** | Native support with optimal binning | No native support (requires pre-encoding) | Best native support with several encoding methods |
| **Missing Values** | Native handling | Native handling | Native handling |
| **Key Advantage** | Speed and efficiency | Accuracy and wide adoption | Handling categorical features and reducing overfitting |
| **Best Use Case** | Large datasets with limited resources | General-purpose, smaller to medium datasets | Datasets with many categorical features |
| **Hyperparameter Tuning** | More parameters, requires careful tuning | Many parameters, well-documented | Fewer parameters, easier to tune |
| **Risk of Overfitting** | Higher (especially with small datasets) | Moderate | Lower (built-in mechanisms to prevent overfitting) |
| **GPU Support** | Yes | Yes | Yes |
| **Parallel Processing** | Yes | Yes | Yes |
| **Learning Task Support** | Classification, Regression, Ranking | Classification, Regression, Ranking | Classification, Regression |
| **Community & Documentation** | Growing community, good documentation | Largest community, excellent documentation | Smaller community, good documentation |
| **Production Readiness** | Good | Excellent | Good |

Note: The performance of each algorithm can vary significantly depending on the specific use case, data characteristics, and implementation details.

In [26]:
response_anthropic.content[0].citations

In [27]:
response_anthropic.usage

Usage(cache_creation_input_tokens=0, cache_read_input_tokens=0, input_tokens=43, output_tokens=461)
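
The Anthropic `Usage` object names its fields `input_tokens` and `output_tokens` and carries no total, whereas the OpenAI-style dictionaries earlier in this notebook use `prompt_tokens`, `completion_tokens`, and `total_tokens`. To compare providers side by side, the counts can be mapped onto one shape. The helper below is a hypothetical sketch, not part of `utils.llm`, and uses the numbers from the output above.

```python
def anthropic_usage_to_openai_style(input_tokens, output_tokens):
    """Map Anthropic token counts onto OpenAI-style keys (hypothetical helper)."""
    return {
        "prompt_tokens": input_tokens,
        "completion_tokens": output_tokens,
        # Anthropic does not report a total, so it is derived here.
        "total_tokens": input_tokens + output_tokens,
    }


# Counts copied from the Usage(...) output above.
normalized_usage = anthropic_usage_to_openai_style(input_tokens=43, output_tokens=461)
print(normalized_usage)
```

With a live response object, the same call would take `response_anthropic.usage.input_tokens` and `response_anthropic.usage.output_tokens` as arguments.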

## Test Asynchronous Chat Completion

In [11]:
response_async = await achat_completion(
    messages=[
        {"role": "system", "content": system_message},
        {"role": "user", "content": user_message}
    ],
    model_name=model_name,
    temperature=temperature,
    client=aclient
)

In [12]:
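# Unpack into a new name so the raw `response_async` object survives if this cell is re-run.
response_async_text, usage_async = format_response(response_async)

display(Markdown(f'Usage: {usage_async}'))
display(Markdown(f'Response: {response_async_text}'))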

Usage: {'completion_tokens': 600, 'prompt_tokens': 42, 'total_tokens': 642, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}

Response: Certainly! Below is a table that outlines the key differences between LightGBM, XGBoost, and CatBoost, which are all popular gradient boosting frameworks:

| Feature/Aspect       | LightGBM                                      | XGBoost                                      | CatBoost                                      |
|----------------------|-----------------------------------------------|----------------------------------------------|-----------------------------------------------|
| **Developer**        | Microsoft                                     | Distributed Machine Learning Community       | Yandex                                        |
| **Language**         | C++ (with interfaces for Python, R, etc.)     | C++ (with interfaces for Python, R, etc.)    | C++ (with interfaces for Python, R, etc.)     |
| **Tree Structure**   | Leaf-wise growth with depth limitation        | Level-wise growth                            | Symmetric tree (balanced)                     |
| **Handling Categorical Features** | Requires preprocessing (e.g., one-hot encoding) | Requires preprocessing (e.g., one-hot encoding) | Native support for categorical features       |
| **Speed**            | Generally faster due to leaf-wise growth      | Slower compared to LightGBM and CatBoost     | Fast, especially with categorical features    |
| **Memory Usage**     | Lower memory usage due to efficient algorithms| Higher memory usage                          | Efficient memory usage                        |
| **Parallelization**  | Efficient parallelization                     | Good parallelization                         | Efficient parallelization                     |
| **Overfitting Control** | Can overfit if not properly tuned          | Good control with regularization parameters  | Good control with built-in regularization     |
| **Missing Values**   | Handles missing values natively               | Handles missing values natively              | Handles missing values natively               |
| **GPU Support**      | Yes                                           | Yes                                          | Yes                                           |
| **Ease of Use**      | Requires careful tuning for best performance  | Well-documented and widely used              | User-friendly with automatic handling of categorical features |
| **Use Cases**        | Large datasets, high-dimensional data         | General-purpose, widely applicable           | Datasets with categorical features, ease of use |

### Additional Notes:
- **LightGBM**: Known for its speed and efficiency, especially with large datasets. The leaf-wise growth strategy can lead to faster convergence but may require careful tuning to avoid overfitting.
- **XGBoost**: One of the most popular and widely used gradient boosting libraries. It is known for its robust performance and flexibility, with a wide range of parameters for fine-tuning.
- **CatBoost**: Particularly strong in handling categorical features without the need for extensive preprocessing. It is designed to be user-friendly and often requires less parameter tuning compared to the other two.

Each of these frameworks has its strengths and is suitable for different types of problems and datasets. The choice between them often depends on the specific requirements of the task, such as the size of the dataset, the presence of categorical features, and the need for speed versus interpretability.
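
The asynchronous test above awaits a single request. The usual reason to reach for an async client is to run several requests concurrently, which can be sketched with `asyncio.gather`. To keep the example runnable offline, `fake_achat_completion` is a hypothetical stand-in that sleeps instead of calling an API; it does not reflect the real signature of `achat_completion`.

```python
import asyncio


async def fake_achat_completion(prompt, delay=0.01):
    """Hypothetical stand-in for achat_completion; sleeps instead of calling an API."""
    await asyncio.sleep(delay)
    return f"answer to: {prompt}"


async def run_all(prompts):
    # gather starts every coroutine concurrently and returns results in input order.
    return await asyncio.gather(*(fake_achat_completion(p) for p in prompts))


prompts = ["LightGBM", "XGBoost", "CatBoost"]
answers = asyncio.run(run_all(prompts))
print(answers)
```

Inside a notebook an event loop is already running, so `answers = await run_all(prompts)` would replace the `asyncio.run` call, matching the top-level `await` used in the cell above.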