# Tutorial 5: Supported APIs

<div class="alert alert-block alert-info"> <b>Before we get started: </b> 
    <ul style="list-style-type: none;margin: 0;padding: 0;">
        <li>✍️ To run this notebook, you need to have Ponder installed and set up on your machine. If you have not done so already, please refer to our <a href="https://docs.ponder.io/getting_started/quickstart.html">Quickstart guide</a> to get started.</li>
        <li>📖 Otherwise, if you're just interested in browsing through the tutorial, keep reading below!</li>
    </ul>
</div>

In [1]:
import os; os.chdir("..")
import credential
import ponder; ponder.init()
import modin.pandas as pd
import snowflake.connector
snowflake_con = snowflake.connector.connect(
    user=credential.params["user"], password=credential.params["password"],
    account=credential.params["account"], role=credential.params["role"],
    database=credential.params["database"], schema=credential.params["schema"],
    warehouse=credential.params["warehouse"],
)
ponder.configure(default_connection=snowflake_con)



Ponder aims to be a drop-in replacement for pandas. We are working to support as much of the pandas API as possible, but some pandas APIs are not yet supported in Ponder. You can find a full list of the pandas APIs we support [here](https://docs.ponder.io/overviewAPI/dataframes.html).

### What happens when an API is not supported in Ponder?

If you call a function that is not yet supported, you will get a `NotImplementedError`.

In [2]:
df = pd.read_csv("https://raw.githubusercontent.com/ponder-org/ponder-datasets/main/tpch/orders.csv", header=0)
num_df = df.select_dtypes("number")

For example, here we use the `.corr()` method to compute the correlation matrix. This is not currently implemented in Ponder, so running it raises a `NotImplementedError`.

In [3]:
num_df.corr()

NotImplementedError: We don't yet support mismatched index objects

If you run into this error, please send the specific dataset and APIs that you’re using to support@ponder.io so that we can suggest a possible solution.

Note that in such cases, you may be able to rewrite the query so that it uses only the [APIs we support](https://docs.ponder.io/overviewAPI/dataframes.html).

### Workaround: Defaulting to Pandas

Alternatively, you can access the underlying pandas dataframe via the `_to_pandas` helper method. Note that when you call `_to_pandas`, the entire table is pulled from your warehouse into memory, since pandas operates in memory. This can incur a high I/O cost if the table is very large, so use it with care. In this case, the table is very small, so it is fine to default to pandas as a workaround.

In [4]:
num_df.shape

(145, 4)

Our original dataframe is a Modin DataFrame:

In [5]:
type(num_df)

modin.pandas.dataframe.DataFrame

`_to_pandas` returns a pandas DataFrame:

In [6]:
type(num_df._to_pandas())

pandas.core.frame.DataFrame

We can then fall back on pandas's implementation of `.corr()` to get our result.

In [7]:
num_df._to_pandas().corr()

| | O_ORDERKEY | O_CUSTKEY | O_TOTALPRICE | O_SHIPPRIORITY |
|---|---|---|---|---|
| O_ORDERKEY | 1.0 | -0.070673 | 0.16906 | NaN |
| O_CUSTKEY | -0.070673 | 1.0 | 0.131396 | NaN |
| O_TOTALPRICE | 0.16906 | 0.131396 | 1.0 | NaN |
| O_SHIPPRIORITY | NaN | NaN | NaN | NaN |
