# Advanced Filtering and Sorting

This notebook demonstrates advanced query techniques including:
- Multiple filter conditions
- Combining filters with AND logic
- Sorting results
- Pagination with offset and limit

## Prerequisites
Make sure you have completed the `01_basic_search.ipynb` notebook first to understand the basics.

## Setup

In [None]:
# Add parent directory to path for local development
import sys
import os
sys.path.insert(0, os.path.abspath('..'))

In [None]:
from nanohubremote import Session
from nanohubresults import Results

# Initialize session with authentication
# Replace with your actual token from https://nanohub.org/developer
auth_data = {
    "grant_type": "personal_token",
    "token": "YOUR_TOKEN_HERE"
}
session = Session(auth_data, url="https://nanohub.org/api")
results = Results(session)

print("✓ Connected to nanoHUB API")

## Multiple Filter Conditions

You can chain multiple `.filter()` calls to create complex queries. All filters are combined with AND logic.

In this example, we'll search for `2dfets` results where:
1. Fermi energy (`input.Ef`) is strictly between 0.2V and 0.4V, expressed as two filters (`> 0.2` and `< 0.4`)
2. Gate length (`input.Lg`) is greater than 15nm

### Available Filter Operations
- `=` : Equal to
- `!=` : Not equal to
- `>` : Greater than
- `<` : Less than
- `>=` : Greater than or equal to
- `<=` : Less than or equal to
- `like` : Pattern matching
- `in` : Value in list
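
To make the operator semantics concrete, the sketch below evaluates the same eight operators against a local record in plain Python. It is only an illustration and not how nanoHUB evaluates filters on the server: the `matches` helper is hypothetical, the sample record is made up, and treating `like` as an SQL-style pattern (`%` for any run of characters, `_` for a single character) is an assumption.

```python
import operator
import re

# Hypothetical client-side mirror of the comparison operators listed above.
_COMPARATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def _like(value, pattern):
    """SQL-style match (assumed semantics): % = any run, _ = one character."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, str(value), flags=re.DOTALL) is not None


def matches(record, field, op, value):
    """Return True if record[field] satisfies the condition; False if the field is missing."""
    actual = record.get(field)
    if actual is None:
        return False
    if op == "in":
        return actual in value
    if op == "like":
        return _like(actual, value)
    return _COMPARATORS[op](actual, value)


# Made-up record using field names from this notebook.
record = {"input.Ef": 0.3, "input.Lg": 20, "input.Material": "MoS2"}
conditions = [
    ("input.Ef", ">", 0.2),
    ("input.Ef", "<", 0.4),
    ("input.Lg", "in", [15, 20, 25]),
    ("input.Material", "like", "Mo%"),
]

# AND logic: the record passes only if every condition holds.
print(all(matches(record, *cond) for cond in conditions))  # True
```

A helper like this can also double-check downloaded results locally, independent of the server-side filtering.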

In [None]:
print("Building query with multiple filters...")

query = results.query("2dfets", simtool=False) \
    .filter("input.Ef", ">", 0.2) \
    .filter("input.Ef", "<", 0.4) \
    .filter("input.Lg", ">", 15) \
    .select("input.Ef", "input.Lg", "input.temperature", "output.f11") \
    .limit(20)

print("Query built successfully with 3 filter conditions")
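
When the conditions come from a configuration file or user input, the chained `.filter()` calls above can be generated in a loop instead of being written out by hand. The sketch below relies only on `.filter(field, op, value)` returning a query object, as the chaining in this notebook implies. `apply_filters` and `RecordingQuery` are hypothetical names; `RecordingQuery` is a stand-in so the example runs without a nanoHUB session.

```python
def apply_filters(query, conditions):
    """Chain one .filter(field, op, value) call per condition and return the final query."""
    for field, op, value in conditions:
        query = query.filter(field, op, value)
    return query


class RecordingQuery:
    """Hypothetical stand-in for a query object; it only records the filters it receives."""

    def __init__(self, filters=()):
        self.filters = tuple(filters)

    def filter(self, field, op, value):
        # Return a new object so calls can be chained like the real builder.
        return RecordingQuery(self.filters + ((field, op, value),))


conditions = [
    ("input.Ef", ">", 0.2),
    ("input.Ef", "<", 0.4),
    ("input.Lg", ">", 15),
]

demo = apply_filters(RecordingQuery(), conditions)
print(len(demo.filters))  # 3
```

With a live session, the same helper could presumably be called as `apply_filters(results.query("2dfets", simtool=False), conditions)` before adding `.select()` and `.limit()`.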

## Sorting Results

You can sort results by any field using the `.sort()` method. By default, sorting is in ascending order. Use `asc=False` for descending order.

In [None]:
# Add sorting to the query
query = query.sort("input.Ef", asc=False)  # Sort by Fermi energy, highest first

print("Added sorting: Fermi energy (descending)")

## Executing the Query

In [None]:
print("Executing advanced query...\n")
response = query.execute()

print("✓ Query completed")
print(f"  Results found: {len(response.get('results', []))}")
print(f"  Search time: {response.get('searchTime', 0):.4f} seconds")

## Analyzing Results

Let's examine the results to verify our filters and sorting are working correctly.

In [None]:
if response.get('results'):
    print("\nFirst 5 results (sorted by Fermi energy, descending):\n")
    print(f"{'#':<4} {'SQUID':<50} {'Ef (V)':<10} {'Lg (nm)':<10} {'Temp (K)':<10}")
    print("-" * 100)
    
    for i, result in enumerate(response['results'][:5], 1):
        # Truncate long SQUIDs so they fit the 50-character column
        squid = result.get('squid') or 'N/A'
        if len(squid) > 50:
            squid = squid[:47] + '...'
        ef = result.get('input.Ef', 'N/A')
        lg = result.get('input.Lg', 'N/A')
        temp = result.get('input.temperature', 'N/A')

        # !s converts each value to a string first, so None values do not break the alignment spec
        print(f"{i:<4} {squid:<50} {ef!s:<10} {lg!s:<10} {temp!s:<10}")
    
    # Verify filters
    print("\n" + "=" * 100)
    print("Filter Verification:")
    ef_values = [r.get('input.Ef') for r in response['results'] if r.get('input.Ef') is not None]
    lg_values = [r.get('input.Lg') for r in response['results'] if r.get('input.Lg') is not None]
    
    # min() and max() raise ValueError on an empty list, so guard each range check
    if ef_values:
        print(f"  Fermi Energy range: {min(ef_values):.3f}V to {max(ef_values):.3f}V (expected: 0.2-0.4V)")
    if lg_values:
        print(f"  Gate Length range: {min(lg_values):.1f}nm to {max(lg_values):.1f}nm (expected: >15nm)")
    print(f"  Sorting: {'✓ Correct' if ef_values == sorted(ef_values, reverse=True) else '✗ Incorrect'}")
else:
    print("No results found.")
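
The range check above reports only the minimum and maximum. A variant that returns the offending rows can make it easier to see what went wrong. The sketch below is one way to do that; `out_of_range` and `is_sorted` are hypothetical helper names, and the sample rows are made up to resemble the entries of `response['results']`.

```python
def out_of_range(rows, field, low=None, high=None):
    """Return rows whose value for `field` is missing or outside the open interval (low, high)."""
    bad = []
    for row in rows:
        value = row.get(field)
        if value is None:
            bad.append(row)
        elif low is not None and not value > low:
            bad.append(row)
        elif high is not None and not value < high:
            bad.append(row)
    return bad


def is_sorted(rows, field, asc=True):
    """Check that values of `field` are in order; ties are allowed, missing values are skipped."""
    values = [row[field] for row in rows if row.get(field) is not None]
    pairs = list(zip(values, values[1:]))
    if asc:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)


# Made-up rows; the third one deliberately violates the Lg > 15 filter.
sample = [
    {"input.Ef": 0.39, "input.Lg": 20},
    {"input.Ef": 0.30, "input.Lg": 16},
    {"input.Ef": 0.30, "input.Lg": 12},
    {"input.Ef": 0.21, "input.Lg": 18},
]

print(len(out_of_range(sample, "input.Ef", low=0.2, high=0.4)))  # 0
print(len(out_of_range(sample, "input.Lg", low=15)))             # 1
print(is_sorted(sample, "input.Ef", asc=False))                  # True
```

Both helpers accept an empty list without raising, so they can run unconditionally after `query.execute()`.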

## Pagination with Offset

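`.offset(n)` skips the first `n` matching results, and `.limit(m)` caps how many are returned after that. Together they let you fetch a large result set in fixed-size pages, which is useful for implementing pagination in applications. Apply the same filters and sort order on every page so that page boundaries stay consistent.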

In [None]:
# Skip the first 20 matches and fetch the next 10 (results 21-30)
print("Fetching results 21-30 (offset=20, limit=10)...")

query_page2 = results.query("2dfets", simtool=False) \
    .filter("input.Ef", ">", 0.2) \
    .filter("input.Ef", "<", 0.4) \
    .select("input.Ef", "input.Lg") \
    .sort("input.Ef", asc=False) \
    .limit(10) \
    .offset(20)  # Skip first 20 results

response_page2 = query_page2.execute()
print(f"\n✓ Retrieved {len(response_page2.get('results', []))} results starting at offset 20")
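
The relationship between page numbers and offsets is simple arithmetic: with a fixed page size, page `p` (counting from 1) starts at offset `(p - 1) * page_size`. The sketch below turns that into a small generator that keeps requesting pages until one comes back short. `page_offset`, `iter_results`, and `fetch_page` are hypothetical names, and `fake_fetch` stands in for the API by slicing a made-up list the way a server would apply offset and limit.

```python
def page_offset(page, page_size):
    """Offset of a 1-based page number: page 1 -> 0, page 2 -> page_size, and so on."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return (page - 1) * page_size


def iter_results(fetch_page, page_size, max_pages=100):
    """Yield results page by page until a page comes back short or max_pages is reached."""
    for page in range(1, max_pages + 1):
        rows = fetch_page(offset=page_offset(page, page_size), limit=page_size)
        yield from rows
        if len(rows) < page_size:
            break


# Stand-in for the API: 25 made-up rows.
fake_rows = [{"row": i} for i in range(25)]


def fake_fetch(offset, limit):
    return fake_rows[offset:offset + limit]


all_rows = list(iter_results(fake_fetch, page_size=10))
print(page_offset(2, 10), len(all_rows))  # 10 25
```

Against the live API, `fetch_page` would presumably rebuild the query with the same filters and sort, apply `.limit(limit).offset(offset)`, and return `execute().get('results', [])`. The `max_pages` cap is a safety net against unbounded loops.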

## Complex Filter Example

This example combines an equality filter on temperature with an inclusive range (`>=` and `<=`) on Fermi energy, and sorts the results in ascending order:

In [None]:
# Example: find results at T = 300K with 0.25 <= Ef <= 0.35
complex_query = results.query("2dfets", simtool=False) \
    .filter("input.temperature", "=", 300) \
    .filter("input.Ef", ">=", 0.25) \
    .filter("input.Ef", "<=", 0.35) \
    .select("input.Ef", "input.temperature", "input.Material") \
    .sort("input.Ef", asc=True) \
    .limit(10)

complex_response = complex_query.execute()

print(f"Complex query results: {len(complex_response.get('results', []))} found")
if complex_response.get('results'):
    print("\nSample results:")
    for i, r in enumerate(complex_response['results'][:3], 1):
        print(f"  {i}. Ef={r.get('input.Ef')}V, T={r.get('input.temperature')}K, Material={r.get('input.Material')}")

## Summary

In this notebook, you learned:
1. ✓ How to apply multiple filter conditions
2. ✓ How to sort results by any field
3. ✓ How to use offset for pagination
4. ✓ How to verify filter conditions in results

## Best Practices

- **Be specific with filters**: Narrower filters return smaller result sets, which typically means less data to transfer and process
- **Use sorting wisely**: Sort only when needed, as it can increase query time
- **Select only needed fields**: This reduces data transfer and processing time
- **Use pagination**: For large datasets, fetch data in chunks using limit and offset

## Next Steps

- `03_downloading_files.ipynb` - Learn how to save and work with result data
- `04_pagination.ipynb` - Automatic pagination for large datasets