# Introduction to Fugue (10 mins)

[![Slack Status](https://img.shields.io/badge/slack-join_chat-white.svg?logo=slack&style=social)](http://slack.fugue.ai)

[Fugue](https://github.com/fugue-project/fugue/) is an open-source project that aims to simplify distributed computing. The simplest interface is the `transform()` function.

<img src="https://fugue-tutorials.readthedocs.io/_images/fugue_backends.png" align="left" width="700"/>

## Fugue transform()

The simplest way to use Fugue to scale pandas-based code to Spark or Dask is the `transform()` function. In the example below, we’ll train a model using scikit-learn and pandas, and then run the inference in parallel on top of the Spark execution engine.

In [None]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data: the target is exactly y = 1*x_1 + 2*x_2 + 3
X = pd.DataFrame({"x_1": [1, 1, 2, 2], "x_2": [1, 2, 2, 3]})
y = np.dot(X, np.array([1, 2])) + 3
reg = LinearRegression().fit(X, y)

In [None]:
def predict(df: pd.DataFrame, model: LinearRegression) -> pd.DataFrame:
    # Return the input columns plus a new "predicted" column
    return df.assign(predicted=model.predict(df))

input_df = pd.DataFrame({"x_1": [3, 4, 6, 6], "x_2": [3, 3, 6, 6]})
# Plain pandas call, no Fugue involved yet
predict(input_df.copy(), reg)

In [None]:
from fugue import transform

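# No engine is given, so this call runs locally on pandas
result = transform(
    input_df,
    predict,
    schema="*,predicted:double",  # output: all input columns plus predicted (double)
    params=dict(model=reg),  # extra keyword arguments passed to predict
)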
print(type(result))
result.head()
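
The `schema` expression `"*,predicted:double"` describes the output of `predict`: `*` stands for every input column, and `predicted:double` adds one floating-point column. As a rough illustration, the pandas-only sketch below checks a DataFrame against that description by hand. `check_output_schema` is a hypothetical helper written for this tutorial, not part of Fugue, and the two-row data is made up for the example.

```python
import numpy as np
import pandas as pd


def check_output_schema(input_df: pd.DataFrame, output_df: pd.DataFrame) -> bool:
    """Hypothetical helper: does output_df look like "*,predicted:double"?"""
    # "*" means every input column is carried over, in the same order.
    keeps_inputs = list(output_df.columns[:-1]) == list(input_df.columns)
    # "predicted:double" means one extra float64 column named "predicted".
    adds_double = (
        output_df.columns[-1] == "predicted"
        and output_df["predicted"].dtype == np.float64
    )
    return bool(keeps_inputs and adds_double)


input_df = pd.DataFrame({"x_1": [3, 4], "x_2": [3, 3]})
output_df = input_df.assign(predicted=[12.0, 13.0])
print(check_output_schema(input_df, output_df))
```

Distributed engines generally need the output schema up front, which is why `transform()` asks for it explicitly instead of inferring it from the returned data.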

**Spark**

In [None]:
result = transform(
    input_df,
    predict,
    schema="*,predicted:double",
    params=dict(model=reg),
    engine="spark",  # same function, now executed on Spark
)
print(type(result))
result.show()
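
Conceptually, a distributed engine splits the input into partitions, runs the same pandas function on each partition, and combines the results. The sketch below imitates that idea with pandas and NumPy only. It is not how Fugue or Spark is implemented: `run_partitioned` and `predict_linear` are hypothetical names, and the explicit weights `[1, 2]` and intercept `3` stand in for the fitted scikit-learn model, mirroring the formula used to build `y`.

```python
import numpy as np
import pandas as pd


def predict_linear(df: pd.DataFrame, coef: np.ndarray, intercept: float) -> pd.DataFrame:
    # Stand-in for predict(): explicit weights instead of a fitted model.
    features = df[["x_1", "x_2"]].to_numpy()
    return df.assign(predicted=features @ coef + intercept)


def run_partitioned(df: pd.DataFrame, func, num_partitions: int, **params) -> pd.DataFrame:
    # Split the rows into contiguous chunks (assumes a non-empty DataFrame).
    size = max(1, -(-len(df) // num_partitions))  # ceiling division
    chunks = [df.iloc[i:i + size] for i in range(0, len(df), size)]
    # Apply the function to each chunk independently, then stitch the results together.
    return pd.concat([func(chunk, **params) for chunk in chunks])


input_df = pd.DataFrame({"x_1": [3, 4, 6, 6], "x_2": [3, 3, 6, 6]})
params = dict(coef=np.array([1.0, 2.0]), intercept=3.0)

whole = predict_linear(input_df, **params)
partitioned = run_partitioned(input_df, predict_linear, num_partitions=2, **params)
print(partitioned.equals(whole))
```

Because `predict` treats each row independently, the partitioned run gives the same result as a single pass over the whole DataFrame. That property is what makes this kind of function safe to hand to a parallel engine.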

## Equivalent Spark Code
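
For comparison, one way to write the same inference directly against Spark is PySpark's `DataFrame.mapInPandas`. This is a sketch under stated assumptions, not necessarily the code the original notebook used: `make_batch_predictor` and `predict_on_spark` are hypothetical names, `spark` is assumed to be an existing `SparkSession`, and the input columns are assumed to arrive as 64-bit integers.

```python
from typing import Any, Iterator

import pandas as pd


def make_batch_predictor(model: Any):
    """Wrap model.predict in the iterator-of-DataFrames form mapInPandas expects."""

    def predict_batches(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
        # Spark hands each partition over as a stream of pandas DataFrames.
        for batch in batches:
            yield batch.assign(predicted=model.predict(batch))

    return predict_batches


def predict_on_spark(spark, input_df: pd.DataFrame, model: Any):
    """Hypothetical helper: run the inference through Spark's mapInPandas."""
    sdf = spark.createDataFrame(input_df)
    # Unlike the "*" shorthand, Spark needs every output column spelled out.
    return sdf.mapInPandas(
        make_batch_predictor(model),
        schema="x_1 long, x_2 long, predicted double",
    )
```

With a running session, a call such as `predict_on_spark(spark, input_df, reg).show()` would play the role of the `transform(..., engine="spark")` call above. The function has to be rewritten around iterators and a Spark-specific schema string, whereas the Fugue version reuses the original `predict` unchanged.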

In [None]:
## Parallelized Inference Using Fugue Transform

## Principles of Fugue