```{hint}
✨✨✨ **Run this notebook on Google Colab** ✨✨✨

You can [run this notebook yourself with Google Colab](https://colab.research.google.com/github/Eventual-Inc/Daft/blob/main/docs/source/10-min.ipynb)!
```

# 10-minute Quickstart

This is a short introduction to all the main functionality in Daft, geared towards new users.

## What is Daft?
Daft is a distributed query engine built for running ETL, analytics, and ML/AI workloads at scale. Daft is implemented in Rust (fast!) and exposes a familiar Python dataframe API (friendly!). 

In this Quickstart you will learn the basics of Daft’s DataFrame API and the features that set it apart from frameworks like pandas, PySpark, Dask and Ray. You will build a database of dog owners and their fluffy companions and see how you can use Daft to download images from URLs, run an ML classifier and call custom UDFs, all within an interactive DataFrame interface. Woof! 🐶

## When Should I Use Daft?

Daft is the right tool for you if you are working with any of the following:
- **Large datasets** that don't fit into memory or would benefit from parallelization
- **Multimodal data types** such as images, JSON, vector embeddings, and tensors
- **Formats that support data skipping** through automatic partition pruning and stats-based file pruning for filter predicates
- **ML workloads** that would benefit from interactive computation within a DataFrame (via UDFs)

Read more about how Daft compares to other DataFrames in our [FAQ](/faq/dataframe_comparison.rst).

Let's jump in! 🪂

## Install and Import Daft

You can install Daft using `pip`:

In [None]:
%pip install getdaft

And then import Daft and some of its classes which we'll need later on:

In [1]:
import daft
from daft import DataType, udf

## Create your first Daft DataFrame

See also: [API Reference: DataFrame Construction](df-input-output)

To begin, let's create a DataFrame from a dictionary of columns:

In [2]:
import datetime

df = daft.from_pydict(
    {
        "integers": [1, 2, 3, 4],
        "floats": [1.5, 2.5, 3.5, 4.5],
        "bools": [True, True, False, False],
        "strings": ["a", "b", "c", "d"],
        "bytes": [b"a", b"b", b"c", b"d"],
        "dates": [
            datetime.date(1994, 1, 1),
            datetime.date(1994, 1, 2),
            datetime.date(1994, 1, 3),
            datetime.date(1994, 1, 4),
        ],
        "lists": [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]],
        "nulls": [None, None, None, None],
    }
)

df

integers Int64,floats Float64,bools Boolean,strings Utf8,bytes Binary,dates Date,lists List[Int64],nulls Null
1,1.5,True,a,"b""a""",1994-01-01,"[1, 1, 1]",
2,2.5,True,b,"b""b""",1994-01-02,"[2, 2, 2]",
3,3.5,False,c,"b""c""",1994-01-03,"[3, 3, 3]",
4,4.5,False,d,"b""d""",1994-01-04,"[4, 4, 4]",


Nice. If you've worked with DataFrame libraries like pandas, Dask or Spark this should look familiar.

### Multimodal Data Types

Daft is built for multimodal data type support. Daft DataFrames can contain more data types than other DataFrame APIs like pandas, Spark or Dask. Daft columns can contain URLs, images, tensors and Python classes. You'll get to work with some of these data types in a moment.

For a complete list of supported data types see: [API Reference: DataTypes](../api_docs/datatype)

### Data Sources

You can also load DataFrames from other sources, such as:

1. CSV files: {func}`daft.read_csv("s3://bucket/*.csv") <daft.read_csv>`
2. Parquet files: {func}`daft.read_parquet("/path/*.parquet") <daft.read_parquet>`
3. JSON line-delimited files: {func}`daft.read_json("/path/*.json") <daft.read_json>`
4. Files on disk: {func}`daft.from_glob_path("/path/*.jpeg") <daft.from_glob_path>`

Daft automatically supports local paths as well as paths to object storage such as AWS S3:

```
df = daft.read_json("s3://path/to/bucket/file.jsonl")
```

See [User Guide: Integrations](/user_guide/integrations) to learn more about working with other formats like Delta Lake and Iceberg.

Let's read in a Parquet file from a public S3 bucket. Note that this Parquet file is partitioned on the `country` column. This will be important later on.

In [3]:
# Set IO Configurations to use anonymous data access mode
daft.set_planning_config(default_io_config=daft.io.IOConfig(s3=daft.io.S3Config(anonymous=True)))

df = daft.read_parquet("s3://daft-public-data/tutorials/10-min/sample-data-dog-owners-partitioned.pq/**")
df

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean


## Executing and Displaying Data

Daft DataFrames are lazy by default. This means that the contents will not be computed ("materialized") unless you explicitly tell Daft to do so. This is best practice for working with larger-than-memory datasets and parallel/distributed architectures.

The file we have just loaded only has 5 rows. You can materialize the whole DataFrame in memory easily using the {meth}`df.collect() <daft.DataFrame.collect>` method:

In [4]:
df.collect()

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean
Shandra,Shamas,57,1967-01-02,United Kingdom,True
Zaya,Zaphora,40,1984-04-07,United Kingdom,True
Wolfgang,Winter,23,2001-02-12,Germany,
Ernesto,Evergreen,34,1990-04-03,Canada,True
James,Jale,62,1962-03-24,Canada,True


You can also take a look at just the first few rows with the {meth}`df.show() <daft.DataFrame.show>` method:

In [5]:
df.show(3)

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean
Shandra,Shamas,57,1967-01-02,United Kingdom,True
Zaya,Zaphora,40,1984-04-07,United Kingdom,True
Wolfgang,Winter,23,2001-02-12,Germany,


Use `.show` for quick visualisation in an interactive notebook.

## Basic DataFrame Operations

Let's take a look at some of the most common DataFrame operations.

### Selecting Columns

You can **select** specific columns from your DataFrame with the {meth}`df.select() <daft.DataFrame.select>` method:

In [6]:
df.select("first_name", "has_dog").show()

first_name Utf8,has_dog Boolean
Shandra,True
Zaya,True
Wolfgang,
Ernesto,True
James,True


### Excluding Data

You can **limit** the number of rows in a dataframe by calling {meth}`df.limit() <daft.DataFrame.limit>`. Use `limit` and not `show` when you want to return a limited number of rows for further transformation.

In [7]:
df.limit(1).show()

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean
Shandra,Shamas,57,1967-01-02,United Kingdom,True


To **drop** columns from the dataframe, call {meth}`df.exclude() <daft.DataFrame.exclude>`:

In [8]:
df.exclude("DoB").show()

first_name Utf8,last_name Utf8,age Int64,country Utf8,has_dog Boolean
Shandra,Shamas,57,United Kingdom,True
Zaya,Zaphora,40,United Kingdom,True
Wolfgang,Winter,23,Germany,
Ernesto,Evergreen,34,Canada,True
James,Jale,62,Canada,True


### Transforming Columns with Expressions

See: [Expressions](user_guide/expressions.rst)

Expressions are an API for defining computation that needs to happen over your columns.

For example, use the {meth}`daft.col() <daft.col>` expression together with the `with_column` method to create a new column `full_name`, joining the contents of the `last_name` column to the `first_name` column:

In [9]:
df = df.with_column("full_name", daft.col("first_name") + " " + daft.col("last_name"))
df.select("full_name", "age", "country", "has_dog").show()

full_name Utf8,age Int64,country Utf8,has_dog Boolean
Shandra Shamas,57,United Kingdom,True
Zaya Zaphora,40,United Kingdom,True
Wolfgang Winter,23,Germany,
Ernesto Evergreen,34,Canada,True
James Jale,62,Canada,True


Alternatively, you can also run your column transforms using Expressions directly inside your `select` call:

In [17]:
df.select((daft.col("first_name") + " " + daft.col("last_name")), "age", "country").show()

first_name Utf8,age Int64,country Utf8
Shandra Shamas,57,United Kingdom
Zaya Zaphora,40,United Kingdom
Wolfgang Winter,23,Germany
Ernesto Evergreen,34,Canada
James Jale,62,Canada


Some Expression methods are only allowed on certain types and are accessible through "method accessors" (see: [Expression Accessor Properties](expression-accessor-properties)).

For example, the `.dt.year()` expression is only valid on temporal columns such as dates and timestamps.

Below we use an Expression to extract the year from the `DoB` column, which has type `Date`:

In [10]:
df_year = df.with_column("DoB_year", df["DoB"].dt.year())
df_year.show()

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean,full_name Utf8,DoB_year Int32
Shandra,Shamas,57,1967-01-02,United Kingdom,True,Shandra Shamas,1967
Zaya,Zaphora,40,1984-04-07,United Kingdom,True,Zaya Zaphora,1984
Wolfgang,Winter,23,2001-02-12,Germany,,Wolfgang Winter,2001
Ernesto,Evergreen,34,1990-04-03,Canada,True,Ernesto Evergreen,1990
James,Jale,62,1962-03-24,Canada,True,James Jale,1962


## Other DataFrame Operations

### Sorting Data

You can **sort** a dataframe with {meth}`df.sort() <daft.DataFrame.sort>`. Here we sort by `age` in ascending order:

In [11]:
df.sort(df["age"], desc=False).show()

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean,full_name Utf8
Wolfgang,Winter,23,2001-02-12,Germany,,Wolfgang Winter
Ernesto,Evergreen,34,1990-04-03,Canada,True,Ernesto Evergreen
Zaya,Zaphora,40,1984-04-07,United Kingdom,True,Zaya Zaphora
Shandra,Shamas,57,1967-01-02,United Kingdom,True,Shandra Shamas
James,Jale,62,1962-03-24,Canada,True,James Jale


### Grouping and Aggregating Data

You can **group** and **aggregate** your data using the {meth}`df.groupby() <daft.DataFrame.groupby>` method.

Groupby aggregation operations over a dataset happen in 2 phases:

1. Splitting the data into groups based on some criteria using {meth}`df.groupby() <daft.DataFrame.groupby>`
2. Specifying how to aggregate the data for each group using {meth}`GroupedDataFrame.agg() <daft.DataFrame.dataframe.GroupedDataFrame.agg>`

For example:

In [12]:
# select only columns for grouping
grouping_df = df.select(df["country"], df["first_name"].alias("counts"))

# group by the country column and count the number of rows per country
grouping_df.groupby(df["country"]).count().show()

country Utf8,counts UInt64
Canada,2
Germany,1
United Kingdom,2


Note that we can use {meth}`.alias() <daft.Expression.alias>` to quickly rename columns.

### Missing Data

All columns in Daft are "nullable" by default. Unlike other frameworks such as pandas, Daft differentiates between "null" (missing) and "nan" (short for "not a number", a special floating-point value indicating an invalid float).

In [13]:
missing_data_df = daft.from_pydict(
    {
        "floats": [1.5, None, float("nan")],
    }
)
missing_data_df = missing_data_df.with_column("floats_is_null", missing_data_df["floats"].is_null()).with_column(
    "floats_is_nan", missing_data_df["floats"].float.is_nan()
)

missing_data_df.show()

floats Float64,floats_is_null Boolean,floats_is_nan Boolean
1.5,False,False
,True,
,False,True


Let's correct the one missing value in our dataset:

In [14]:
df = df.with_column("has_dog", df["has_dog"].is_null().if_else(True, df["has_dog"]))
df.show()

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean,full_name Utf8
Shandra,Shamas,57,1967-01-02,United Kingdom,True,Shandra Shamas
Zaya,Zaphora,40,1984-04-07,United Kingdom,True,Zaya Zaphora
Wolfgang,Winter,23,2001-02-12,Germany,True,Wolfgang Winter
Ernesto,Evergreen,34,1990-04-03,Canada,True,Ernesto Evergreen
James,Jale,62,1962-03-24,Canada,True,James Jale


### Filtering Data

You can **filter** rows in your DataFrame with a predicate using the {meth}`df.where() <daft.DataFrame.where>` method:

In [16]:
df.where(df["age"] > 35).show()

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean,full_name Utf8
James,Jale,62,1962-03-24,Canada,True,James Jale
Shandra,Shamas,57,1967-01-02,United Kingdom,True,Shandra Shamas
Zaya,Zaphora,40,1984-04-07,United Kingdom,True,Zaya Zaphora


Filtering can unlock powerful optimizations when you are working with partitioned files or tables. Daft will use the predicate to read only the necessary partitions, skipping any data that is not relevant.

For example, our Parquet file is partitioned on the `country` column. This means that queries with a `country` predicate will benefit from query optimization:

In [17]:
df.where(df["country"] == "Canada").show()

first_name Utf8,last_name Utf8,age Int64,DoB Date,country Utf8,has_dog Boolean,full_name Utf8
Ernesto,Evergreen,34,1990-04-03,Canada,True,Ernesto Evergreen
James,Jale,62,1962-03-24,Canada,True,James Jale


Daft only needs to read in 1 file for this query, instead of 3.

## Query Planning

Daft is lazy: computations on your DataFrame are not executed immediately. Instead, Daft creates a `LogicalPlan` which defines the operations that need to happen to materialize the requested result. Think of this LogicalPlan as a recipe. 

You can examine this logical plan using {meth}`df.explain() <daft.DataFrame.explain>`:

In [18]:
df2 = daft.read_parquet("s3://daft-public-data/tutorials/10-min/sample-data-dog-owners-partitioned.pq/**")
df2.where(df2["country"] == "Canada").explain(show_all=True)

== Unoptimized Logical Plan ==

* Filter: col(country) == lit("Canada")
|
* GlobScanOperator
|   Glob paths = [s3://daft-public-data/tutorials/10-min/sample-data-dog-owners-
|     partitioned.pq/**]
|   Coerce int96 timestamp unit = Nanoseconds
|   IO config = S3 config = { Max connections = 8, Retry initial backoff ms = 1000,
|     Connect timeout ms = 30000, Read timeout ms = 30000, Max retries = 25, Retry
|     mode = adaptive, Anonymous = false, Use SSL = true, Verify SSL = true, Check
|     hostname SSL = true, Requester pays = false, Force Virtual Addressing = false },
|     Azure config = { Anonymous = false, Use SSL = true }, GCS config = { Anonymous =
|     false }, HTTP config = { user_agent = daft/0.0.1 }
|   Use multithreading = true
|   File schema = first_name#Utf8, last_name#Utf8, age#Int64, DoB#Date,
|     country#Utf8, has_dog#Boolean
|   Partitioning keys = []
|   Output schema = first_name#Utf8, last_name#Utf8, age#Int64, DoB#Date,
|     country#Utf8, has_dog#Boolean

Because we are filtering our DataFrame on the partition column `country`, Daft can optimize the Logical Plan and save us time and computing resources by only reading a single partition from disk. 

## More Advanced Operations

You've made it half-way! Time to bring in some fluffy beings 🐶

Let's bring all of the elements together to see how you can use Daft to:
- perform more advanced operations like **joins**
- work with **multimodal data** like Python classes, URLs, and Images
- apply **custom User-Defined Functions** to your columns
- **run ML workloads** within your DataFrame

### Merging DataFrames

DataFrames can be joined with {meth}`df.join() <daft.DataFrame.join>`.

Let's use a join to reunite our `owners` with their sweet fluffy companions. We'll create a `dogs` DataFrame from a Python dictionary and then join this to our existing dataframe with the owners data.

In [19]:
df_dogs = daft.from_pydict(
    {
        "urls": [
            "https://live.staticflickr.com/65535/53671838774_03ba68d203_o.jpg",
            "https://live.staticflickr.com/65535/53671700073_2c9441422e_o.jpg",
            "https://live.staticflickr.com/65535/53670606332_1ea5f2ce68_o.jpg",
            "https://live.staticflickr.com/65535/53671838039_b97411a441_o.jpg",
            "https://live.staticflickr.com/65535/53671698613_0230f8af3c_o.jpg",
        ],
        "full_name": [
            "Ernesto Evergreen",
            "James Jale",
            "Wolfgang Winter",
            "Shandra Shamas",
            "Zaya Zaphora",
        ],
        "dog_name": ["Ernie", "Jackie", "Wolfie", "Shaggie", "Zadie"],
    }
)

Let's join and drop some columns to keep the output easy to read:

In [20]:
df_family = df.join(df_dogs, on="full_name").exclude("first_name", "last_name", "DoB", "country", "age")
df_family.show()

has_dog Boolean,full_name Utf8,urls Utf8,dog_name Utf8
True,Ernesto Evergreen,https://live.staticflickr.com/65535/53671838774_03ba68d203_o.jpg,Ernie
True,James Jale,https://live.staticflickr.com/65535/53671700073_2c9441422e_o.jpg,Jackie
True,Wolfgang Winter,https://live.staticflickr.com/65535/53670606332_1ea5f2ce68_o.jpg,Wolfie
True,Shandra Shamas,https://live.staticflickr.com/65535/53671838039_b97411a441_o.jpg,Shaggie
True,Zaya Zaphora,https://live.staticflickr.com/65535/53671698613_0230f8af3c_o.jpg,Zadie


Let's just quickly re-order the columns for easier reading:

In [21]:
df_family = df_family.select("full_name", "has_dog", "dog_name", "urls")
df_family.show()

full_name Utf8,has_dog Boolean,dog_name Utf8,urls Utf8
Ernesto Evergreen,True,Ernie,https://live.staticflickr.com/65535/53671838774_03ba68d203_o.jpg
James Jale,True,Jackie,https://live.staticflickr.com/65535/53671700073_2c9441422e_o.jpg
Wolfgang Winter,True,Wolfie,https://live.staticflickr.com/65535/53670606332_1ea5f2ce68_o.jpg
Shandra Shamas,True,Shaggie,https://live.staticflickr.com/65535/53671838039_b97411a441_o.jpg
Zaya Zaphora,True,Zadie,https://live.staticflickr.com/65535/53671698613_0230f8af3c_o.jpg


### Working with Multimodal Data

Daft is built to work comfortably with multimodal data types, including URLs and Images.

You can use the {meth}`url.download() <daft.Expression.url.download>` expression to download the bytes from a URL. Let's store them in a new column using the `with_column` method:

In [22]:
df_family = df_family.with_column("image_bytes", df_family["urls"].url.download(on_error="null"))
df_family.show()

full_name Utf8,has_dog Boolean,dog_name Utf8,urls Utf8,image_bytes Binary
Ernesto Evergreen,True,Ernie,https://live.staticflickr.com/65535/53671838774_03ba68d203_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""..."
James Jale,True,Jackie,https://live.staticflickr.com/65535/53671700073_2c9441422e_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""..."
Wolfgang Winter,True,Wolfie,https://live.staticflickr.com/65535/53670606332_1ea5f2ce68_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""..."
Shandra Shamas,True,Shaggie,https://live.staticflickr.com/65535/53671838039_b97411a441_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""..."
Zaya Zaphora,True,Zadie,https://live.staticflickr.com/65535/53671698613_0230f8af3c_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""..."


Great! But where's the fluffiness? 🙁

Let's turn the bytes into human-readable images using `image.decode`:

In [23]:
df_family = df_family.with_column("image", daft.col("image_bytes").image.decode())
df_family.show()

full_name Utf8,has_dog Boolean,dog_name Utf8,urls Utf8,image_bytes Binary,image Image[MIXED]
Ernesto Evergreen,True,Ernie,https://live.staticflickr.com/65535/53671838774_03ba68d203_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""...",
James Jale,True,Jackie,https://live.staticflickr.com/65535/53671700073_2c9441422e_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""...",
Wolfgang Winter,True,Wolfie,https://live.staticflickr.com/65535/53670606332_1ea5f2ce68_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""...",
Shandra Shamas,True,Shaggie,https://live.staticflickr.com/65535/53671838039_b97411a441_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""...",
Zaya Zaphora,True,Zadie,https://live.staticflickr.com/65535/53671698613_0230f8af3c_o.jpg,"b""\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01""...",


Woof! 🐶

### User-Defined Functions

See: [UDF User Guide](user_guide/udf)

You can use User-Defined Functions (UDFs) to run computations over multiple rows or columns.

As the final part of this Quickstart, you'll use a pre-trained Machine Learning model to classify our new fluffy friends by breed.

Daft enables you to do all this right within our DataFrame, using UDFs. 

### ML Workloads

We'll define a function that uses a pre-trained PyTorch model: [ResNet50](https://pytorch.org/vision/main/models/generated/torchvision.models.resnet50.html) to classify the dog pictures. We'll pass the contents of the image `urls` column and send the classification predictions to a new column `classify_breed`.

Working with PyTorch adds some complexity but you can just run the cells below to perform the classification.

First, make sure to install and import some extra dependencies:

In [None]:
%pip install validators matplotlib Pillow torch torchvision

In [24]:
# import additional libraries, these are necessary for PyTorch

import torch

Then, go ahead and define your `ClassifyImages` UDF. 

Models are expensive to initialize and load, so we want to do this as few times as possible, and share a model across multiple invocations.

In [25]:
@udf(return_dtype=DataType.fixed_size_list(dtype=DataType.string(), size=2))
class ClassifyImages:
    def __init__(self):
        # Perform expensive initializations - create and load the pre-trained model
        self.model = torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_resnet50", pretrained=True)
        self.utils = torch.hub.load("NVIDIA/DeepLearningExamples:torchhub", "nvidia_convnets_processing_utils")
        self.model.eval().to(torch.device("cpu"))

    def __call__(self, images_urls):
        uris = images_urls.to_pylist()
        batch = torch.cat([self.utils.prepare_input_from_uri(uri) for uri in uris]).to(torch.device("cpu"))

        with torch.no_grad():
            output = torch.nn.functional.softmax(self.model(batch), dim=1)

        results = self.utils.pick_n_best(predictions=output, n=1)
        return [result[0] for result in results]

Nice, now you're all set to call this function on the `urls` column and store the outputs in a new column we'll call `classify_breed`:

In [26]:
classified_images_df = df_family.with_column("classify_breed", ClassifyImages(daft.col("urls")))

classified_images_df.select("dog_name", "image", "classify_breed").show()

dog_name Utf8,image Image[MIXED],classify_breed FixedSizeList[Utf8; 2]
Ernie,,"[boxer, 52.3%]"
Jackie,,"[American Staffordshire terrier, Staffordshire terrier, American pit bull terrier, pit bull terrier, 42.4%]"
Wolfie,,"[collie, 49.6%]"
Shaggie,,"[standard schnauzer, 29.6%]"
Zadie,,"[Rottweiler, 78.6%]"


Nice work!

It looks like our pre-trained model is more familiar with some specific breeds. You could do further work to fine-tune this model to improve performance.

## Writing Data

See: [Writing Data](df-writing-data)

Writing data will execute your DataFrame and write the results out to the specified backend. For example, to write data out to Parquet with {meth}`df.write_parquet() <daft.DataFrame.write_parquet>`:


In [39]:
written_df = df.write_parquet("my-dataframe.parquet")

                                                                   

Note that writing your dataframe is a **blocking** operation that executes your DataFrame. It will return a new `DataFrame` that contains the filepaths to the written data:

In [40]:
written_df

path Utf8
my-dataframe.parquet/36bdcc36-9fec-4be8-b22e-a792cc5c6c4c-0.parquet


## What's Next?

Now that you have a basic sense of Daft's functionality and features, take a look at some of the other resources to help you get the most out of Daft:

- [The Daft User Guide](/user_guide/index.rst) for more information on specific topics
- Hands-on Tutorials in Google Colab on:
   - Image Classification
   - NLP Similarity Search / Vector Embedding
   - Querying Images
   - Image Generation with GPUs


### Contributing
Excited about Daft and want to contribute? Join us [on GitHub](https://github.com/Eventual-Inc/Daft) 🚀