# Introduction to Airweave

Airweave is an open-source platform that lets AI agents search your data sources, such as business apps and databases.
It does this by turning those sources into queryable knowledge bases.

This quickstart guide walks you through the essential steps to get started with Airweave using the Python SDK. There is also an SDK for Node.js; see https://docs.airweave.ai/quickstart

### What you'll learn in this notebook
In this tutorial, you'll learn how to:
- Set up the Airweave Python client
- Create a collection to organize your data sources
- Connect data sources to your collection
- Search across all connected sources with natural language queries

### Prerequisites
Before starting, make sure you have:
- Python 3.11 or higher installed
- An Airweave API key (get one from https://app.airweave.ai or your local instance)
- Access to at least one data source (we'll use a Stripe account with test data as an example)

## Step 1: Installation and Setup

In [16]:
# First, pip install the Airweave SDK (run this once):
%pip install airweave-sdk


Note: you may need to restart the kernel to use updated packages.


Once installed, import the SDK and initialize the client with your API key. You can find the key on app.airweave.ai
or in the dashboard of your locally running instance.
If you're running Airweave locally, change `base_url` to "http://localhost:8001".

In [None]:

from airweave import AirweaveSDK

# Initialize the Airweave client
client = AirweaveSDK(
    api_key="YOUR_API_KEY",  # Get your actual API key from the Airweave dashboard
    base_url="https://api.airweave.ai",  # Use "http://localhost:8001" for local deployment
)
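
A key pasted directly into a notebook is easy to leak when the notebook is shared. One option is to read the settings from the environment instead. The sketch below is only an illustration: the variable names `AIRWEAVE_API_KEY` and `AIRWEAVE_BASE_URL` and the helper `load_airweave_settings` are assumptions made for this example, not names the SDK is known to read on its own.

```python
import os


def load_airweave_settings(env=None):
    """Build keyword arguments for AirweaveSDK(...) from environment variables.

    AIRWEAVE_API_KEY and AIRWEAVE_BASE_URL are hypothetical names chosen for
    this sketch; export them yourself before starting the notebook.
    """
    env = os.environ if env is None else env
    api_key = env.get("AIRWEAVE_API_KEY", "").strip()
    if not api_key:
        raise ValueError("Set AIRWEAVE_API_KEY before creating the client")
    # Default to the hosted API; point this at http://localhost:8001 for a local instance.
    base_url = env.get("AIRWEAVE_BASE_URL", "https://api.airweave.ai").rstrip("/")
    return {"api_key": api_key, "base_url": base_url}


# Usage (needs the SDK installed and the variables set):
# client = AirweaveSDK(**load_airweave_settings())
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching the real environment.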

## Step 2: Create a collection

A collection in Airweave is a searchable container that groups user-specified data sources together.
Think of it as a unified search index that can query multiple data sources
simultaneously. For example, you might create a "Customer Data" collection
that includes data from Stripe, HubSpot, and your internal Postgres database, or a "Productivity Tools" collection 
containing your Notion, Linear, and Microsoft Teams accounts.

You are free to add as many or as few sources to a collection as you like. The important thing to remember is that, in Airweave, a collection is the unit an agent searches.

Let's create your first collection:

In [None]:
# Create a new collection
collection = client.collections.create(
    name="My First Collection"  # Give your collection a descriptive name
)

print(f"✅ Created collection: {collection.readable_id}")
print(f"   Name: {collection.name}")
print(f"   ID: {collection.id}")

✅ Created collection: my-first-collection-ezhgns
   Name: My First Collection
   ID: YOUR_COLLECTION_ID


The collection has been created with:
- a unique readable_id that you'll use to reference it in future operations,
- a name that you can change at any time,
- and a randomly generated version-4 UUID.
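
Because a collection carries both a `readable_id` and a UUID, it is easy to pass one where the other is expected. A small check with Python's standard `uuid` module can tell them apart. The helper name `is_uuid4` is made up for this sketch and is not part of the SDK.

```python
import uuid


def is_uuid4(value):
    """Return True if value parses as a version-4 (random) UUID."""
    try:
        return uuid.UUID(str(value)).version == 4
    except ValueError:
        # Not a well-formed UUID string at all
        return False


print(is_uuid4(uuid.uuid4()))                  # True
print(is_uuid4("my-first-collection-ezhgns"))  # False: a readable_id is not a UUID
```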


## Step 3: Add a source connection

Now that we have a collection, let's connect a data source to it.
Source connections handle authentication and automatically sync data from
your apps and databases into Airweave.

Airweave supports numerous integrations including:
- Business apps like Stripe, GitHub, HubSpot, Notion, Gmail, Linear, etc.
- Document stores like Dropbox, Google Drive, and OneDrive
- Databases like PostgreSQL

In this example, we'll connect to Stripe, but the process is similar for
other data sources.

In [None]:
# Create a source connection to Stripe
# You'll need to replace 'your_stripe_api_key' with an actual Stripe API key
source_connection = client.source_connections.create(
    name="My Stripe Connection",  # A name for this connection
    short_name="stripe",  # The connector type (e.g., 'stripe', 'hubspot', 'postgresql')
    readable_collection_id=collection.readable_id,  # Link to our collection
    authentication={
        "credentials": {
            "api_key": "sk_YOUR_STRIPE_API_KEY"  # Replace with real API key
        }
    }
)

print(f"✅ Created source connection: {source_connection.name}")
print(f"   Status: {source_connection.status}")
print(f"   Type: {source_connection.short_name}")

✅ Created source connection: My Stripe Connection
   Status: active
   Type: stripe
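
Since this guide works with Stripe test data, it can help to guard against pasting a live key by mistake. Stripe's test-mode secret keys start with `sk_test_` and test-mode restricted keys with `rk_test_`. The sketch below checks that prefix and then builds the same `authentication` dictionary used in the cell above. The helper names are made up for this example.

```python
def looks_like_stripe_test_key(api_key):
    """Return True for Stripe test-mode secret or restricted keys."""
    return api_key.startswith(("sk_test_", "rk_test_"))


def stripe_authentication(api_key):
    """Build the authentication payload, refusing anything that is not a test key."""
    if not looks_like_stripe_test_key(api_key):
        raise ValueError("Expected a Stripe test-mode key (sk_test_... or rk_test_...)")
    # Same shape as the authentication argument passed to source_connections.create above
    return {"credentials": {"api_key": api_key}}


# Usage: authentication=stripe_authentication(my_stripe_key)
```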


Once the source connection is established, Airweave will automatically:
1. Validate the credentials
2. Begin syncing data from Stripe
3. Extract and index relevant entities (customers, payments, invoices, etc.)
4. Make the data searchable through natural language queries

The initial sync may take a few minutes depending on the amount of data.
You can check the sync status through the dashboard or API.
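
If you would rather wait for the sync in code than watch the dashboard, a generic polling loop is one way to do it. The sketch below deliberately makes no assumption about which SDK call returns the status: `get_status` is a hypothetical callable that you wire up yourself. Of the status strings, `"active"` comes from the output above, while `"failed"` is a placeholder to adjust to whatever your instance reports.

```python
import time


def wait_for_status(get_status, done=("active",), failed=("failed",),
                    interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call get_status() repeatedly until it returns a value listed in `done`.

    Raises RuntimeError on a value in `failed` and TimeoutError once `timeout`
    seconds have passed. `get_status` is supplied by the caller (hypothetical).
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done:
            return status
        if status in failed:
            raise RuntimeError(f"Sync ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout} seconds")
        sleep(interval)


# Example with a stand-in status source instead of a real API call:
fake_statuses = iter(["syncing", "syncing", "active"])
print(wait_for_status(lambda: next(fake_statuses), sleep=lambda _: None))  # active
```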

## Step 4: Search your collection

With your data sources connected and synced, you or your agent can now search across all the data in the collection using natural language queries.

Airweave uses advanced semantic search to understand the intent of your queries and return relevant results.

Let's try some example searches:

### Example: Search for specific customer information in Stripe

In [15]:
print("\n🔍 Searching for customer payment information...")
results = client.collections.search(
    readable_id=collection.readable_id, query="Payment attempts for concert tickets"
)

# Display the search results
for i, result in enumerate(results.results[:6], 1):  # Show first 6 results
    score = result.get('score', 0)
    payload = result.get('payload', {})
    
    # Extract common fields that exist across different sources
    entity_id = payload.get('entity_id', 'N/A')
    source = payload.get('airweave_system_metadata', {}).get('source_name', 'Unknown')
    entity_type = payload.get('airweave_system_metadata', {}).get('entity_type', 'Unknown')
    
    print(f"\n📍 Result {i} (Score: {score:.2%})")
    print(f"   Source: {source} | Type: {entity_type}")
    print(f"   ID: {entity_id}")
    
    # Display other fields dynamically (excluding metadata)
    for key, value in payload.items():
        if key not in ['entity_id', 'airweave_system_metadata', 'breadcrumbs', 'metadata']:
            if value is not None and value != '':
                print(f"   {key.replace('_', ' ').title()}: {value}")
    
    print("-" * 60)


🔍 Searching for customer payment information...

📍 Result 1 (Score: 88.83%)
   Source: stripe | Type: StripePaymentIntentEntity
   ID: pi_3Q13xgGm1FpXlyE50uRm1zO2
   Amount: 1008
   Currency: eur
   Status: canceled
   Description: Dua Lipa Concert ticket 2025
   Created At: 2024-09-20T10:21:16+00:00
   Customer Id: cus_QpEel2ZOt74xmy
------------------------------------------------------------

📍 Result 2 (Score: 51.82%)
   Source: stripe | Type: StripePaymentIntentEntity
   ID: pi_3Q17MZGm1FpXlyE51ZMPa9cr
   Amount: 1008
   Currency: eur
   Status: canceled
   Description: Dua Lipa Concert ticket 2025
   Created At: 2024-09-20T13:59:11+00:00
   Customer Id: cus_QpEel2ZOt74xmy
------------------------------------------------------------

📍 Result 3 (Score: 47.38%)
   Source: stripe | Type: StripePaymentIntentEntity
   ID: pi_3Q1373Gm1FpXlyE5195Woh3n
   Amount: 1008
   Currency: eur
   Status: canceled
   Description: Dua Lipa Concert ticket 2025
   Created At: 2024-09-20T09:26:53+00:

And there you have it. The search results include:
- Relevant entities from your connected sources
- Scores indicating how relevant each result is to the query
- Metadata about the source and entity type
- The actual content that matches your query
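
Lower-scoring hits are often only loosely related to the query, so an agent may want to drop them before using the results. The sketch below assumes each result is a dictionary with `score` and `payload` keys, as in the display loop above. The 0.5 cut-off and the helper names are arbitrary choices for illustration.

```python
def top_hits(results, min_score=0.5, limit=5):
    """Keep results scoring at least min_score, best first, at most `limit` of them."""
    kept = [r for r in results if r.get("score", 0) >= min_score]
    kept.sort(key=lambda r: r.get("score", 0), reverse=True)
    return kept[:limit]


def summarize(hit):
    """One-line summary using the same payload fields as the display loop."""
    payload = hit.get("payload", {})
    meta = payload.get("airweave_system_metadata", {})
    entity_type = meta.get("entity_type", "Unknown")
    entity_id = payload.get("entity_id", "N/A")
    return f"{entity_type} {entity_id} ({hit.get('score', 0):.2%})"


# Stand-in data shaped like the results printed above
sample = [
    {"score": 0.4738, "payload": {"entity_id": "pi_c"}},
    {"score": 0.8883, "payload": {"entity_id": "pi_a"}},
    {"score": 0.5182, "payload": {"entity_id": "pi_b"}},
]
for hit in top_hits(sample):
    print(summarize(hit))
# Unknown pi_a (88.83%)
# Unknown pi_b (51.82%)
```

With real search output you would pass `results.results` instead of `sample`.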

## Congratulations! 🎉

You've successfully:
- ✅ Set up the Airweave Python client
- ✅ Created a collection
- ✅ Connected data sources
- ✅ Performed searches across your data

### What's Next?

#### Explore More Integrations
- Browse the full list of available connectors and the [search functionality](https://docs.airweave.ai/search) in the [documentation](https://docs.airweave.ai/welcome), then connect more of your data sources.
- Consider becoming a contributor by [adding a custom connector](https://docs.airweave.ai/add-new-source).


#### Join the Community
- GitHub: https://github.com/airweave-ai/airweave
- Discord: https://discord.gg/484HY9Ehxt
- Documentation: https://docs.airweave.ai