# Tutorial 8: Running NumPy In Your Data Warehouse

pandas and NumPy are the two most popular Python data science libraries, used by more than 60% of Python developers. NumPy provides n-dimensional arrays and linear algebra operations in Python, which makes it a fundamental building block of machine learning.

Ponder lets you run NumPy commands directly in your warehouse. This means you can work with the NumPy API to build data and ML pipelines, and let BigQuery take care of scaling and security for you.

Here, we'll show a few examples of Ponder in action with NumPy.

In [1]:
import ponder; ponder.init()
import modin.pandas as pd
from google.cloud import bigquery
from google.cloud.bigquery import dbapi
from google.oauth2 import service_account
import json
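# Load the service-account key and build BigQuery credentials from it.
with open("../credential.json") as f:
    credentials = service_account.Credentials.from_service_account_info(
        json.load(f),
        scopes=["https://www.googleapis.com/auth/bigquery"],
    )
# Wrap a BigQuery client in a DB-API connection that Ponder can use.
bigquery_con = dbapi.Connection(bigquery.Client(credentials=credentials))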
ponder.configure(bigquery_dataset='TEST', default_connection=bigquery_con)

2023-05-05 13:22:55 - Creating session u_QRcDrWTgfQzYOZyE3FP6ilvw_QS7hNu-DYuHZXJL


In [None]:
df = pd.read_sql("TEST.PONDER_CUSTOMER", bigquery_con)

<div class="alert alert-block alert-info"> <b>Note: </b> <span>NumPy support is currently part of Modin's experimental API. Please drop us a note at <a href="mailto:support@ponder.io">support@ponder.io</a> if you run into any issues. Feedback welcome!</span></div>

In [None]:
import modin.config as cfg
cfg.ExperimentalNumPyAPI.put(True)
import modin.numpy as np

In [None]:
arr = df.select_dtypes("number").to_numpy()

The cell above selects the numeric columns of the dataframe and converts them into a Modin NumPy array. We can confirm the type and inspect the result:

In [None]:
type(arr)

In [None]:
arr
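
To make the conversion step concrete without a warehouse connection, here is a minimal local sketch. It uses plain pandas and plain NumPy as stand-ins, on the assumption that the Modin versions behave the same way for these calls. The `toy` dataframe and its column names are hypothetical and are not the contents of `TEST.PONDER_CUSTOMER`. Plain NumPy is imported as `plain_np` so that it does not shadow the notebook's `np`.

```python
import numpy as plain_np
import pandas as plain_pd

# Hypothetical stand-in table: two numeric columns and one text column.
toy = plain_pd.DataFrame({
    "age": [34, 51, 27],
    "balance": [1200.5, 310.0, 88.25],
    "name": ["Ada", "Grace", "Linus"],
})

# select_dtypes("number") drops the text column; to_numpy() returns a 2-D array.
toy_arr = toy.select_dtypes("number").to_numpy()

print(type(toy_arr))
print(toy_arr.shape)  # (3, 2): three rows, two numeric columns
print(toy_arr.dtype)  # float64: the integer column is upcast to match the float column
```

Note that mixed integer and float columns end up sharing one dtype, since an array holds a single element type.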

We can perform reduction operations such as `np.sum` and `np.mean` across the entire matrix:

In [None]:
np.sum(arr)

In [None]:
np.mean(arr)

Or we can perform the reduction along a specific axis:

In [None]:
# mean of each row; keepdims=True keeps the reduced axis, so the result is 2-D with shape (n_rows, 1)
np.mean(arr, axis=-1, keepdims=True)
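
The difference between a whole-matrix reduction and an axis reduction is easiest to see on a small array. The sketch below uses plain NumPy (imported as `plain_np`) as a local stand-in, assuming `modin.numpy` mirrors these semantics. The matrix `m` is made up for illustration.

```python
import numpy as plain_np

m = plain_np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])

total = plain_np.sum(m)          # reduces over every element: 21.0
overall_mean = plain_np.mean(m)  # 21.0 / 6 elements = 3.5

# axis=-1 reduces along the last axis, i.e. across the columns of each row.
row_means_flat = plain_np.mean(m, axis=-1)                # shape (2,)
row_means_kept = plain_np.mean(m, axis=-1, keepdims=True)  # shape (2, 1)

print(total, overall_mean)
print(row_means_flat.shape, row_means_kept.shape)
print(row_means_kept)  # [[2.], [5.]]
```

Keeping the reduced axis as a length-1 dimension is what allows the result to broadcast against the original matrix later on.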

We can also do element-wise matrix operations such as addition of two matrices:

In [None]:
# add the array to a view of itself with the column order reversed
arr + arr[:,::-1]
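
As a small worked example of the slice `[:, ::-1]`, the sketch below uses plain NumPy (imported as `plain_np`) on a made-up matrix, assuming the Modin array follows the same indexing rules.

```python
import numpy as plain_np

m = plain_np.array([[1, 2, 3],
                    [4, 5, 6]])

# ":" keeps every row; "::-1" walks the columns from last to first.
reversed_cols = m[:, ::-1]   # [[3, 2, 1], [6, 5, 4]]

# Element-wise addition pairs each entry with its mirror image in the same row.
mirrored_sum = m + reversed_cols

print(reversed_cols)
print(mirrored_sum)  # [[4, 4, 4], [10, 10, 10]]
```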

Putting the two together, we can combine a reduction with an element-wise operation:

In [None]:
# subtract the mean of each row from every element in that row
arr - np.mean(arr, axis=-1, keepdims=True)
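
This pattern is often called row-centering. The sketch below shows it on a made-up matrix with plain NumPy (imported as `plain_np`) as a stand-in, and checks that each centered row averages to zero. It assumes `modin.numpy` applies the same broadcasting rules.

```python
import numpy as plain_np

m = plain_np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 9.0]])

# Shape (2, 1): one mean per row, kept as a column so it broadcasts across columns.
row_means = plain_np.mean(m, axis=-1, keepdims=True)

# Broadcasting stretches the (2, 1) column of means across all three columns.
centered = m - row_means

print(centered)                          # [[-1, 0, 1], [-2, -1, 3]]
print(plain_np.mean(centered, axis=-1))  # each row now averages to 0
```

Without `keepdims=True` the means would have shape `(2,)`, which does not broadcast against a `(2, 3)` matrix and raises an error.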

NumPy operations that Ponder currently supports include:

- Element-wise matrix operations such as addition, subtraction, multiplication, division, power
- Axis-collapsing or reducing operations such as min, max, sum, product, mean
- Multi-array operations such as maximum or minimum
- And many others, such as where, ravel, and transpose
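
For readers less familiar with some of the operations listed above, here is a short sketch of what `maximum`, `where`, `ravel`, transpose, and element-wise power compute. It runs on plain NumPy (imported as `plain_np`) with made-up inputs. Whether every call behaves identically under `modin.numpy` is an assumption, since that API is experimental.

```python
import numpy as plain_np

a = plain_np.array([[1, 8],
                    [5, 2]])
b = plain_np.array([[4, 3],
                    [5, 7]])

# Multi-array operation: element-wise larger value of a and b.
larger = plain_np.maximum(a, b)       # [[4, 8], [5, 7]]

# where(condition, x, y) picks from x where the condition holds, else from y.
picked = plain_np.where(a > b, a, b)  # same result as maximum here

flat = plain_np.ravel(a)              # flatten row by row: [1, 8, 5, 2]
swapped = plain_np.transpose(a)       # rows become columns: [[1, 5], [8, 2]]
squared = a ** 2                      # element-wise power: [[1, 64], [25, 4]]

print(larger, picked, flat, swapped, squared, sep="\n")
```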