# PandasAI Data Platform Guide

This notebook demonstrates how to use the PandasAI Data Platform features to save, push, load, and pull dataframes. The platform enables seamless collaboration and version control for your data analysis projects.

## Setup

First, set up PandasAI with your API key. You can get a free API key from [pandabi.ai](https://app.pandabi.ai).

In [None]:
import pandasai as pai

pai.api_key.set("your-key-here")
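
A key pasted directly into a notebook is easy to leak when the notebook is shared. As a minimal sketch, the key can be read from an environment variable instead. The variable name `PANDABI_API_KEY` and the helper `read_api_key` are illustrative assumptions, not part of the original notebook:

```python
import os


def read_api_key(var_name="PANDABI_API_KEY"):
    """Return the API key stored in an environment variable.

    The default variable name is an assumption made for this sketch.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```

With this helper in place, the setup call above could become `pai.api_key.set(read_api_key())`.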

## Load Example Datasets

We'll use two example datasets from our data folder:

In [None]:
# Load heart disease and loans datasets
heart_df = pai.read_csv('./data/heart.csv')
loans_df = pai.read_csv('./data/loans_payments.csv')

# Display the first few rows of each dataset.
# A notebook cell only renders its last expression, so print both explicitly.
print("Heart Disease Dataset:")
print(heart_df.head())

print("\nLoans Dataset:")
print(loans_df.head())

## 1. Create Dataframes

The `create()` method allows you to save your dataframes with metadata and column descriptions. This enriches your data with semantic meaning.

In [None]:
# Create heart disease dataset with semantic information
heart = pai.create(
    path="my-team/heart",
    name="Heart Disease Data",
    df=heart_df,
    description="Dataset containing heart disease patient information",
    columns=[
        {"name": "Age", "type": "integer", "description": "Age of the patient in years"},
        {"name": "Sex", "type": "string", "description": "Gender of the patient (M/F)"},
        {"name": "ChestPainType", "type": "string", "description": "Type of chest pain experienced"},
        {"name": "RestingBP", "type": "integer", "description": "Resting blood pressure in mm Hg"},
        {"name": "Cholesterol", "type": "integer", "description": "Serum cholesterol in mg/dl"},
        {"name": "FastingBS", "type": "integer", "description": "Fasting blood sugar > 120 mg/dl (1: true; 0: false)"},
        {"name": "MaxHR", "type": "integer", "description": "Maximum heart rate achieved"},
        {"name": "Oldpeak", "type": "float", "description": "ST depression induced by exercise relative to rest"},
        {"name": "HeartDisease", "type": "integer", "description": "Output class (1: heart disease; 0: normal)"}
    ]
)

# Save loans dataset
loans = pai.create(
    path="my-team/loans",
    name="Loan Payments Data",
    df=loans_df,
    description="Dataset containing loan payment information",
    columns=[
        {"name": "loan_id", "type": "integer", "description": "Unique identifier for each loan"},
        {"name": "amount", "type": "float", "description": "Loan amount in dollars"},
        {"name": "term", "type": "integer", "description": "Loan term in months"},
        {"name": "interest_rate", "type": "float", "description": "Annual interest rate as a percentage"},
        {"name": "payment", "type": "float", "description": "Monthly payment amount"}
    ]
)
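
Column descriptions are only useful if each `name` matches a column that actually exists in the file. The following sketch compares a `columns` list with the header row of a CSV file before `create()` is called. The helper `check_columns` is hypothetical and uses only the standard library:

```python
import csv


def check_columns(csv_path, columns):
    """Compare a columns list with the header row of a CSV file.

    Returns (missing, undocumented): names described in `columns` but
    absent from the file, and names in the file with no description.
    """
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f), [])
    described = [c["name"] for c in columns]
    missing = [name for name in described if name not in header]
    undocumented = [name for name in header if name not in described]
    return missing, undocumented
```

For example, `check_columns('./data/loans_payments.csv', loan_columns)` would return two lists, where `loan_columns` stands for the list passed as `columns` above. An empty first list means every described column was found in the file.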

## 2. Push to Platform

Push your dataframes to make them available to your team. You can optionally specify versions. Replace `my-team-slug` with your organization's unique slug, which you can find under settings/organization in app.pandabi.ai.

In [None]:
# Push datasets to platform
heart.push('my-team-slug/heart')
loans.push('my-team-slug/loans')
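
The push, load, and pull calls all take a path made of an organization slug and a dataset name joined by a slash. The sketch below builds that path in one place so a typo in the slug surfaces early. The helper `dataset_path` is hypothetical, and the rule that each part consists of lowercase letters, digits, and hyphens is an assumption for illustration:

```python
import re

# Assumed format for this sketch: lowercase letters, digits, and hyphens.
_SLUG = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def dataset_path(org_slug, dataset_name):
    """Join an organization slug and a dataset name into 'slug/name'."""
    parts = (("organization slug", org_slug), ("dataset name", dataset_name))
    for label, value in parts:
        if not _SLUG.match(value):
            raise ValueError(f"Invalid {label}: {value!r}")
    return f"{org_slug}/{dataset_name}"
```

For example, `dataset_path("my-team-slug", "heart")` yields the same string used in the cell above, while a value such as `"My Team"` raises a `ValueError`.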

## 3. Load from Platform

Load existing dataframes that you or your team have pushed to the platform. You can load them once and use them across different sessions.

In [None]:
# Load datasets from platform
loaded_heart = pai.load('my-team-slug/heart')
loaded_loans = pai.load('my-team-slug/loans')

## 4. Pull Latest Updates

If you already have the dataset locally, you can ensure you get the latest version by using the `pull()` method.

In [None]:
# Pull latest versions
latest_heart = pai.pull('my-team-slug/heart')
latest_loans = pai.pull('my-team-slug/loans')

## Chat with the pulled datasets

You can then chat with the datasets you just pulled, including those pushed by your teammates.

In [None]:
# Ask questions about both datasets
pai.chat('Relationship between cholesterol and chest pain type', latest_heart)
pai.chat('What is the average interest rate for loans?', latest_loans)