# Welcome to Auctus Search! 🎉

If you’re familiar with the basics of Auctus Search and want to take your skills further, you’re in the right place! In this notebook, we’ll explore **advanced search techniques** using the `page` and `size` parameters in the `search_datasets` method. We’ll cover two key use cases:

- **Retrieve many datasets at once** without pagination.
- **Use pagination** for faster retrieval and better control.

Let’s dive in and discover how to fine-tune your dataset searches and visualise the results! 🚕

---

## 🎯 **Goal**

In this notebook, you’ll learn:
- How to use the `page` and `size` parameters in `search_datasets()`.
- Two approaches: retrieving a large number of datasets at once or using pagination for efficiency.
- When to choose each approach based on your needs.

---

**Ready to level up?** Let’s get started!

## Step 1: Import the Library

We start by importing the `AuctusSearch` class, which provides the functionality to interact with the Auctus API.

```python
from auctus_search import AuctusSearch
```

## Step 2: Initialise AuctusSearch

Next, we create an instance of `AuctusSearch` to perform searches and manage datasets.

```python
search = AuctusSearch()
```

## Step 3: Understanding `page` and `size` in `search_datasets`

When searching for datasets, the `search_datasets` method allows you to control how many results are returned and how they are paginated using two key parameters:

- **`size`**: The number of datasets to retrieve per page.
- **`page`**: The page number to retrieve (starting from 1).
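
To make the relationship between the two parameters concrete, here is a minimal sketch that computes which result positions a given `page` and `size` would cover, assuming pages are 1-based as described above. Note that `page_bounds` is a hypothetical helper written for illustration; it is not part of `auctus_search`.

```python
def page_bounds(page: int, size: int) -> tuple[int, int]:
    """Return the 1-based positions of the first and last result on a page.

    Hypothetical helper for illustration only; not part of auctus_search.
    Assumes pages start at 1 and every page holds up to `size` results.
    """
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    first = (page - 1) * size + 1
    last = page * size
    return first, last


print(page_bounds(page=1, size=10))  # (1, 10)
print(page_bounds(page=2, size=10))  # (11, 20)
print(page_bounds(page=1, size=50))  # (1, 50)
```

The last page of a result set may of course hold fewer than `size` datasets, so `last` is an upper bound rather than a guarantee.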

By adjusting these parameters, you can choose between two main approaches:

- **Approach 1: Retrieve a large number of datasets at once**:
  - Set a high `size` (e.g., 50 or 100) and `page=1` to get many datasets in a single request.
  - **Pros**: Fewer API calls, simpler to implement.
  - **Cons**: Slower response times for very large `size`, and you may get more results than you need, which means more scrolling.

- **Approach 2: Use pagination for faster retrieval**:
  - Set a smaller `size` (e.g., 10) and use multiple `page` values to retrieve datasets in chunks.
  - **Pros**: Faster individual requests, better for browsing or when you don’t need all results at once.
  - **Cons**: Requires multiple API calls if you need many datasets.

Let’s explore both approaches with examples.

### Approach 1: Retrieve Many Datasets at Once

In this approach, we set a large `size` to retrieve many datasets in one go. This is useful when you want to see a broad range of results without worrying about pagination.

**Example**: Retrieve up to 50 datasets related to "taxis" in a single request (fewer are returned if fewer than 50 exist).

```python
collection_large = search.search_datasets(search_query="taxis", size=50, page=1)
collection_large.display()
```


This displays up to 50 dataset cards at once. While convenient, keep in mind that a very large `size` might take longer to load. The Auctus API is very fast, so you should be fine in most cases, but pagination exists for the cases where it matters.

When you run the above cell, you’ll see a grid of dataset cards, each representing a dataset related to "taxis". Each card includes:

- **Name**: The name of the dataset.
- **Source**: A link to the dataset's source.
- **Upload Date**: The date when the dataset was uploaded.
- **Description**: A brief overview of the dataset.
- **Type**: The primary type (e.g., Spatial, Tabular) and additional types.
- **Size**: The number of rows and columns in the dataset.
- **Relevancy**: A gauge showing how relevant the dataset is to your search query.

You can interact with these cards by clicking "Select This Dataset" to choose one for further analysis. This is covered in more detail in other examples.
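
If you want to check whether a large `size` is noticeably slower on your connection, a rough timing wrapper is one option. The sketch below uses only the standard library; `timed` and `fake_search` are hypothetical names introduced here, and `fake_search` is a stand-in so the example runs without network access.

```python
import time


def timed(func, *args, **kwargs):
    """Call func with the given arguments and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_search(search_query, size, page):
    """Stand-in for a search call; it only builds a list of placeholder names."""
    return [f"{search_query}-{i}" for i in range(size)]


results, elapsed = timed(fake_search, search_query="taxis", size=50, page=1)
print(f"Fetched {len(results)} results in {elapsed:.4f} s")
```

In the notebook, passing `search.search_datasets` instead of `fake_search` would time the real request. A single measurement is noisy, so treat the number as a rough indication only.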

### Approach 2: Use Pagination for Faster Retrieval

In this approach, we set a smaller `size` and use the `page` parameter to retrieve datasets in smaller, faster chunks. This is ideal when you want to browse through results page by page or when working with limited resources.

**Example**: Retrieve 10 datasets per page and display the first two pages.

```python
# Retrieve and display the first page (datasets 1-10)
collection_page1 = search.search_datasets(search_query="taxis", size=10, page=1)
collection_page1.display()

# Retrieve and display the second page (datasets 11-20)
collection_page2 = search.search_datasets(search_query="taxis", size=10, page=2)
collection_page2.display()
```


Each call to `search_datasets` with a different `page` fetches the next set of up to 10 datasets. Because each request is small, responses come back quickly and large result sets are easier to navigate.

The visualisation is the same as in Approach 1, but the results are split into smaller pages.
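
When you need more than a couple of pages, repeating the call by hand gets tedious. The sketch below wraps the same `search_datasets` call in a loop. `fetch_pages` is a hypothetical helper, not part of `auctus_search`, and it takes a fixed number of pages because this notebook does not cover what the library returns for a page beyond the last result.

```python
def fetch_pages(search, query, size, pages):
    """Fetch consecutive pages and return the resulting collections as a list.

    Hypothetical helper for illustration. `search` is any object exposing
    search_datasets(search_query=..., size=..., page=...), such as the
    AuctusSearch instance created in Step 2.
    """
    collections = []
    for page in range(1, pages + 1):  # pages are 1-based
        collections.append(
            search.search_datasets(search_query=query, size=size, page=page)
        )
    return collections


# Usage in the notebook (requires the `search` object from Step 2):
# for collection in fetch_pages(search, "taxis", size=10, pages=3):
#     collection.display()
```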

## Step 4: Choosing the Right Approach

- **Use Approach 1 (large `size`)** when:
  - You need to see many datasets at once.
  - You prefer fewer API calls.
  - You’re not concerned about slightly longer load times.

- **Use Approach 2 (pagination with smaller `size`)** when:
  - You want faster, more responsive searches.
  - You’re browsing or exploring datasets gradually.
  - You’re working with limited bandwidth or processing power.

Both approaches are useful; choose the one that fits your needs!
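
The trade-off in API calls is simple arithmetic, sketched below. `requests_needed` is a hypothetical helper written for this comparison, not something provided by `auctus_search`.

```python
import math


def requests_needed(total_wanted: int, size: int) -> int:
    """Return how many search requests are needed to cover `total_wanted` results."""
    if total_wanted < 0 or size < 1:
        raise ValueError("total_wanted must be >= 0 and size must be >= 1")
    return math.ceil(total_wanted / size)


print(requests_needed(50, size=50))  # 1  (Approach 1: one large request)
print(requests_needed(50, size=10))  # 5  (Approach 2: five small requests)
print(requests_needed(45, size=10))  # 5  (the last page is only partly filled)
```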

## Alternative: Displaying Results Immediately

You can also display results immediately by setting `display_initial_results=True` in `search_datasets`, regardless of the `page` and `size` settings.

**Example**:

```python
collection = search.search_datasets(search_query="taxis", size=20, page=1, display_initial_results=True)
```

This combines searching and displaying into one step, which can be handy for quick explorations.