# Exact-Match Queries

Use `query_device()` to match whole field values exactly: brand, generic, and manufacturer names, product codes, and PMA/PMN numbers.

## Setup

In [1]:
from pymaude import MaudeDatabase
import pandas as pd

# Use shared database
# db = MaudeDatabase('notebooks.db', verbose=True)
# db.add_years('2020-2023', tables=['device'], download=True)

db = MaudeDatabase('../analysis/venous_thrombectomy/maude_2008_2025.db', verbose=True)


print("Setup complete!")

Setup complete!


## Exact vs Boolean Search

**Boolean search** (`search_by_device_names`) finds substrings:

In [2]:
db.create_search_index()

# Finds anything containing "pacemaker"
partial = db.search_by_device_names('pacemaker')
print(f"Boolean search 'pacemaker': {len(partial)} events")
print(f"\nGeneric names found (sample):")
print(partial['GENERIC_NAME'].value_counts().head())

Search index already exists and is up to date
Boolean search 'pacemaker': 640629 events

Generic names found (sample):
GENERIC_NAME
PERMANENT PACEMAKER ELECTRODE                       230791
ELECTRODE, PACEMAKER, PERMANENT                      90669
DEFIBRILLATOR/PACEMAKER                              45683
IMPLANTABLE PACEMAKER PULSE GENERATOR                45018
IMPLANTABLE PULSE GENERATOR, PACEMAKER (NON-CRT)     35829
Name: count, dtype: int64


**Exact query** (`query_device`) matches the entire field:

In [3]:
# Only events where GENERIC_NAME equals "Pacemaker" in full (case-insensitive)
exact = db.query_device(generic_name='Pacemaker')
print(f"Exact query (generic_name='Pacemaker'): {len(exact)} events")
print(f"\nUnique generic names:")
print(exact['GENERIC_NAME'].unique())

Exact query (generic_name='Pacemaker'): 11557 events

Unique generic names:
['PACEMAKER']
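
The difference between the two matching modes can be illustrated with plain pandas on a toy DataFrame. This is only a sketch of the semantics, not how `pymaude` implements either method; the sample rows and the `toy` variable are made up for illustration.

```python
import pandas as pd

# Made-up sample rows standing in for the device table
toy = pd.DataFrame({
    "GENERIC_NAME": [
        "PACEMAKER",
        "Pacemaker",
        "PERMANENT PACEMAKER ELECTRODE",
        "DEFIBRILLATOR/PACEMAKER",
        "INFUSION PUMP",
    ]
})

term = "pacemaker"
names = toy["GENERIC_NAME"].str.lower()

# Substring match: the term may appear anywhere in the field
substring_hits = toy[names.str.contains(term, regex=False)]

# Exact match: the whole field must equal the term (ignoring case)
exact_hits = toy[names == term]

print(f"substring: {len(substring_hits)}, exact: {len(exact_hits)}")
```

The substring match keeps every row that mentions the term, while the exact match keeps only rows whose whole field equals it, which is why the two counts above differ so widely.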


## Query by Brand Name

Case-insensitive exact match:

In [4]:
# Query specific insulin pump brand
results = db.query_device(brand_name='670G INSULIN PUMP MMT-1780KL')

print(f"Events: {len(results)}")
print(f"\nManufacturer:")
print(results['MANUFACTURER_D_NAME'].value_counts())
print(f"\nGeneric name:")
print(results['GENERIC_NAME'].value_counts())

Events: 77658

Manufacturer:
MANUFACTURER_D_NAME
MEDTRONIC PUERTO RICO OPERATIONS CO.    77651
MEDTRONIC MINIMED                           7
Name: count, dtype: int64

Generic name:
GENERIC_NAME
ARTIFICIAL PANCREAS DEVICE SYSTEM, SINGLE HORMONAL CONTROL          55112
AUTOMATED INSULIN DOSING DEVICE SYSTEM, SINGLE HORMONAL CONTROL     22545
PUMP, INFUSION, INSULIN, TO BE USED WITH INVASIVE GLUCOSE SENSOR        1
Name: count, dtype: int64


## Query by Generic Name

Find all brands of a device type:

In [5]:
# Events whose generic name is exactly "Pacemaker" (not every pacemaker-related event)
pacemakers = db.query_device(generic_name='Pacemaker')
print(f"Total pacemaker events: {len(pacemakers)}")
print(f"\nTop brands:")
print(pacemakers['BRAND_NAME'].value_counts().head(10))

Total pacemaker events: 11557

Top brands:
BRAND_NAME
EDORA 8 DR-T           1089
ACCOLADE MRI DR        1087
ACCOLADE MRI EL DR     1014
ACCOLADE? MRI EL DR     610
INGENIO                 536
EVIA DR-T               418
ACCOLADE DR             410
ELUNA 8 DR-T PROMRI     407
ACCOLADE? MRI DR        362
ESSENTIO MRI DR         281
Name: count, dtype: int64
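
The list above contains near-duplicate brands such as `ACCOLADE MRI DR` and `ACCOLADE? MRI DR`. The stray `?` is probably a symbol lost in encoding, though that is an assumption the data does not confirm. One possible way to merge such variants before counting is sketched below; the `normalize_brand` helper is hypothetical and not part of `pymaude`, and the counts are copied from the output above.

```python
import re
import pandas as pd

def normalize_brand(name: str) -> str:
    """Drop stray '?' characters and collapse whitespace (hypothetical helper)."""
    cleaned = name.replace("?", "")
    return re.sub(r"\s+", " ", cleaned).strip().upper()

# Counts copied from the brand output above
counts = pd.Series({
    "ACCOLADE MRI DR": 1087,
    "ACCOLADE? MRI DR": 362,
    "ACCOLADE MRI EL DR": 1014,
    "ACCOLADE? MRI EL DR": 610,
})

# Passing a function to groupby applies it to each index label
merged = counts.groupby(normalize_brand).sum()
print(merged)
```

On real results the same helper could be mapped over the `BRAND_NAME` column (after dropping missing values) before calling `value_counts()`. Whether two spellings really refer to the same product is a judgment call that should be checked against the data.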


## Query by Manufacturer

In [6]:
medtronic = db.query_device(manufacturer_name='Medtronic')
print(f"Medtronic events: {len(medtronic)}")
print(f"\nTop device types:")
print(medtronic['GENERIC_NAME'].value_counts().head(5))

Medtronic events: 12383

Top device types:
GENERIC_NAME
PERMANENT PACEMAKER ELECTRODE                                                       3506
DRUG ELUTING PERMANENT RIGHT VENTRICULAR (RV) OR RIGHT ATRIAL (RA) PACEMAKER ELE    1901
IMPLANTABLE CARDIOVERTER DEFIBRILLATOR (NON-CRT)                                    1891
PERMANENT DEFIBRILLATOR ELECTRODES                                                  1227
DRUG ELUTING PERMANENT LEFT VENTRICULAR (LV) PACEMAKER ELECTRODE                     586
Name: count, dtype: int64


## Query by Product Code

Product codes are the FDA's three-letter device classification codes:

In [7]:
# DQY = Catheter, percutaneous
catheters = db.query_device(product_code='DQY')
print(f"Product code DQY events: {len(catheters)}")
print(f"\nTop generic names:")
print(catheters['GENERIC_NAME'].value_counts().head(5))

Product code DQY events: 38567

Top generic names:
GENERIC_NAME
CATHETER, PERCUTANEOUS             18014
CATHETER PERCUTANEOUS               3413
DQY                                 2794
PTA BALLOON DILATATION CATHETER     2154
DQY CATHETER, PERCUTANEOUS          1735
Name: count, dtype: int64


## Combining Parameters (AND Logic)

Multiple parameters are combined with AND:

In [8]:
# Specific device type from a specific manufacturer, from 2022 onward
specific = db.query_device(
    generic_name='Pacemaker',
    manufacturer_name='Boston Scientific Corporation',
    start_date='2022-01-01'
)

print(f"Boston Scientific pacemakers (2022+): {len(specific)} events")
if len(specific) > 0:
    print(f"\nBrands:")
    print(specific['BRAND_NAME'].value_counts().head())

Boston Scientific pacemakers (2022+): 6164 events

Brands:
BRAND_NAME
ACCOLADE MRI DR        979
ACCOLADE MRI EL DR     922
ACCOLADE? MRI EL DR    606
INGENIO                497
ACCOLADE? MRI DR       359
Name: count, dtype: int64
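
AND logic means a row is returned only if every condition holds. In pandas terms this corresponds to combining boolean masks with `&`. The sketch below shows the idea on made-up rows; it is not the library's implementation, and the `EVENT_DATE` column name is a placeholder chosen for illustration.

```python
import pandas as pd

# Made-up rows; EVENT_DATE is a hypothetical column name
toy = pd.DataFrame({
    "GENERIC_NAME": ["PACEMAKER", "PACEMAKER", "PACEMAKER", "INFUSION PUMP"],
    "MANUFACTURER_D_NAME": [
        "BOSTON SCIENTIFIC CORPORATION",
        "BOSTON SCIENTIFIC CORPORATION",
        "MEDTRONIC",
        "BOSTON SCIENTIFIC CORPORATION",
    ],
    "EVENT_DATE": pd.to_datetime(
        ["2021-06-01", "2022-03-15", "2023-01-10", "2022-08-20"]
    ),
})

# Each condition is a boolean mask; & keeps rows where all of them are True
mask = (
    (toy["GENERIC_NAME"].str.lower() == "pacemaker")
    & (toy["MANUFACTURER_D_NAME"].str.lower() == "boston scientific corporation")
    & (toy["EVENT_DATE"] >= pd.Timestamp("2022-01-01"))
)
matches = toy[mask]
print(matches)
```

Every added parameter can only narrow the result, so a combined query never returns more events than any of its conditions would alone.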


But what happens if you search for "Boston Scientific" rather than "Boston Scientific Corporation"?

The comparison below shows why you should use boolean search (`search_by_device_names()`) while exploring the database, and switch to exact search (`query_device()`) only once you know the exact field values:

In [9]:
exact_search_results = db.query_device(
    generic_name='Pacemaker',
    manufacturer_name='Boston Scientific',
    start_date='2022-01-01'
)

boolean_search_results = db.search_by_device_names(
    [['Pacemaker', 'Boston Scientific']],
    start_date='2022-01-01'
)

print(f"Boston Scientific pacemakers by exact search (2022+): {len(exact_search_results)} events")
print(f"Boston Scientific pacemakers by boolean search (2022+): {len(boolean_search_results)} events")


Boston Scientific pacemakers by exact search (2022+): 0 events
Boston Scientific pacemakers by boolean search (2022+): 39149 events
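
A zero-result exact query usually means the field value is spelled differently than expected. One way to find the exact spelling is to list the distinct manufacturer values in the boolean search results, assuming those results include a `MANUFACTURER_D_NAME` column as the `query_device()` results above do. The sketch below shows the idea on made-up rows; the names in `found` are illustrative, not taken from MAUDE.

```python
import pandas as pd

# Made-up rows standing in for boolean search results
found = pd.DataFrame({
    "MANUFACTURER_D_NAME": [
        "BOSTON SCIENTIFIC CORPORATION",
        "BOSTON SCIENTIFIC CORPORATION",
        "BOSTON SCIENTIFIC CORP.",
        "BOSTON SCIENTIFIC CORPORATION",
    ]
})

# Distinct spellings, most frequent first
spellings = found["MANUFACTURER_D_NAME"].value_counts()
print(spellings)

# A partial name never equals a full field value, so an exact match finds nothing
partial_name = "boston scientific"
exact_count = int((found["MANUFACTURER_D_NAME"].str.lower() == partial_name).sum())
print(f"Exact matches for '{partial_name}': {exact_count}")
```

The most frequent spelling is a reasonable candidate to pass to `manufacturer_name`, but less common variants would still be missed by a single exact query.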


## When to Use Each Method

**Use `query_device()` (exact) when:**
- You know the exact brand/generic/manufacturer name
- You need product code or PMA/PMN queries
- You want precise, unambiguous results

**Use `search_by_device_names()` (boolean) when:**
- You're exploring and don't know exact names
- You want all variations (e.g., all devices containing "pump")
- You need complex boolean logic
- You want to compare multiple device categories

**Note**: MAUDE entries are inconsistent. The `brand_name`, `generic_name`, and `manufacturer_d_name` fields may each contain any combination of brand, model, device type, or manufacturer, so boolean search is often more reliable.

## Example: Explore First, Then Query

Use boolean search to discover exact field values, then use exact queries:

In [10]:
# Step 1: Explore
explore = db.search_by_device_names('insulin pump')
print(f"Found {len(explore)} events\n")

print("Top generic names:")
print(explore['GENERIC_NAME'].value_counts().head(3))

# Step 2: Use exact query with discovered value
top_generic = explore['GENERIC_NAME'].value_counts().index[0]
exact_results = db.query_device(generic_name=top_generic)

print(f"\nExact query for '{top_generic}': {len(exact_results)} events")

Found 1850096 events

Top generic names:
GENERIC_NAME
ALTERNATE CONTROLLER ENABLED INFUSION PUMP                          587505
ALTERNATE CONTROLLER ENABLED INSULIN INFUSION PUMP                  286526
PUMP, INFUSION, INSULIN, TO BE USED WITH INVASIVE GLUCOSE SENSOR    242058
Name: count, dtype: int64

Exact query for 'ALTERNATE CONTROLLER ENABLED INFUSION PUMP': 588800 events


## Cleanup

In [11]:
db.close()

## Summary

- `query_device()` uses **exact, case-insensitive matching**
- Query by: `brand_name`, `generic_name`, `manufacturer_name`, `product_code`, `pma_pmn`
- Multiple parameters = **AND logic**
- Use for **precision** when you know exact values
- **Explore with boolean search first**, then refine with exact queries

**Next**: [04_analysis_helpers.ipynb](04_analysis_helpers.ipynb) - Statistical analysis and visualization