# Wrangling with Data in Pixeltable 101

In this tutorial, we'll guide you through using Pixeltable's intuitive table interface to handle common data wrangling tasks. We'll cover creating tables, populating them with data, querying them with various filters, and even performing basic transformations.

Among other things, Pixeltable simplifies data preparation and management for AI product development. By treating your data as tables with flexible columns, it makes it easy to explore and manipulate that data and to integrate it with other stages of your ML workflow.

### Setting up Your Directory

First, let's install Pixeltable and create a directory to hold our tables:

In [18]:
%pip install -q pixeltable  # Remove this line if already installed

import pixeltable as pxt

# Create a directory for our demonstration
pxt.create_dir("demo_direct", ignore_errors=True)  # Ignore errors if the directory already exists

### Creating and Populating Tables

Pixeltable makes it easy to define tables with typed columns. Let's create a sample table to track information about things with wings and legs:

In [19]:
pxt.drop_table('Int_Table', ignore_errors=True)  # Drop the table if it already exists to avoid a conflict
t = pxt.create_table('Int_Table', {'num_legs': pxt.IntType(nullable=True),
                                   'num_wings': pxt.IntType(),
                                   'name': pxt.StringType(nullable=True)})

# Insert rows (each row is a dictionary).
# Keys omitted from a row are left empty in the nullable columns.
t.insert([{'num_wings': 2, 'name': 'jake'},
          {'num_legs': 3, 'num_wings': 2},
          {'num_legs': 4, 'num_wings': 8, 'name': 'kev'}])


Created table `Int_Table`.
Inserting rows into `Int_Table`: 3 rows [00:00, 853.95 rows/s]
Inserted 3 rows with 0 errors.


UpdateStatus(num_rows=3, num_computed_values=0, num_excs=0, updated_cols=[], cols_with_excs=[])
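
Two of the rows above omit a key, and the query output later in this tutorial shows those cells as empty. As a rough mental model, each inserted dictionary can be thought of as being expanded against the schema, with omitted nullable columns filled with `None`. The sketch below is plain Python, not Pixeltable's implementation: `SCHEMA` and `normalize_row` are hypothetical names used only for illustration, and rejecting a missing value for the non-nullable `num_wings` column is an assumption about how a required column would reasonably behave.

```python
# Plain-Python model of the table schema: column name -> (type, nullable).
# Hypothetical illustration only; this is not Pixeltable code.
SCHEMA = {
    'num_legs': (int, True),
    'num_wings': (int, False),
    'name': (str, True),
}

def normalize_row(row, schema=SCHEMA):
    """Return a full row, filling omitted nullable columns with None."""
    full = {}
    for col, (col_type, nullable) in schema.items():
        value = row.get(col)
        if value is None:
            if not nullable:
                # Assumption: a required column must be present in every row.
                raise ValueError(f"column {col!r} is not nullable")
        elif not isinstance(value, col_type):
            raise TypeError(f"column {col!r} expects {col_type.__name__}")
        full[col] = value
    return full

rows = [normalize_row(r) for r in [
    {'num_wings': 2, 'name': 'jake'},
    {'num_legs': 3, 'num_wings': 2},
    {'num_legs': 4, 'num_wings': 8, 'name': 'kev'},
]]
print(rows[0])
```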

### Viewing Table Data

Pixeltable offers convenient functions to inspect your data:

In [29]:
# Show only the first n rows (e.g., just the first row)
# t.show(1)

# Retrieve all the rows in the table
t.collect()

num_legs,num_wings,name
,2,jake
3.0,2,
4.0,8,kev


### Filtering Data with Queries

Let's use Pixeltable's interface to filter rows based on specific criteria:

In [30]:
# Show rows where num_wings is greater than or equal to 7
t.where(t.num_wings >= 7).show()

num_legs,num_wings,name
4,8,kev


### Transforming Data with Queries

Queries can match a column against several values at once, and they can perform calculations directly on columns, creating new values on the fly:

In [43]:
# Filter on multiple values using 'isin' (every row matches here, since each num_wings value is 2 or 8)
t.where(t.num_wings.isin([2, 8])).collect()

num_legs,num_wings,name
,2,jake
3.0,2,
4.0,8,kev
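
For readers who think in plain Python, the `>=` and `isin` filters shown so far behave much like list comprehensions over the rows. This sketch is only an analogy (Pixeltable evaluates `where` clauses itself rather than iterating in Python), and the `rows` list is a hand-copied stand-in for the table:

```python
# Hand-copied stand-in for the contents of Int_Table (None marks an empty cell).
rows = [
    {'num_legs': None, 'num_wings': 2, 'name': 'jake'},
    {'num_legs': 3, 'num_wings': 2, 'name': None},
    {'num_legs': 4, 'num_wings': 8, 'name': 'kev'},
]

# Analogue of t.where(t.num_wings >= 7)
at_least_seven = [r for r in rows if r['num_wings'] >= 7]

# Analogue of t.where(t.num_wings.isin([2, 8]))
allowed = {2, 8}
two_or_eight = [r for r in rows if r['num_wings'] in allowed]

print(len(at_least_seven), len(two_or_eight))
```

The second filter keeps all three rows because the table contains no `num_wings` value other than 2 or 8.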


In [50]:
# Select computed expressions for the row named 'kev'
t.select(t.num_legs + 4 * 12, t.num_wings * t.num_legs).where(t.name == 'kev').show()

col_0,col_1
52,32
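
The first result is worth checking by hand, because `t.num_legs + 4 * 12` follows Python's usual operator precedence: the multiplication binds first, so the expression is `num_legs + 48`, not `(num_legs + 4) * 12`. A quick check in plain Python with kev's values from the table:

```python
num_legs, num_wings = 4, 8  # kev's row

col_0 = num_legs + 4 * 12       # multiplication first: 4 + 48
col_1 = num_wings * num_legs    # 8 * 4
grouped = (num_legs + 4) * 12   # what explicit parentheses would give instead

print(col_0, col_1, grouped)  # prints: 52 32 96
```

If the parenthesized reading is the one you intend, add the parentheses explicitly in the `select` call.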


## Learn More

This is just a taste of what Pixeltable can do. Dive into our [Documentation](https://pixeltable.readme.io/docs/get-started) and in-depth tutorials to discover how to:

- Seamlessly integrate with your labeling tools
- Experiment with different ML models and track their performance
- Harness the full power of lineage tracking and versioning

See more examples for:

- [Object Detection](https://dash.readme.com/project/pixeltable/v1.0/docs/object-detection-in-videos)
- [RAG Operations](https://dash.readme.com/project/pixeltable/v1.0/docs/rag-operations)
- [Working with OpenAI](https://pixeltable.readme.io/docs/working-with-openai)

