# PyEuropePMC QueryBuilder: Comprehensive Interactive Demo

This notebook demonstrates the QueryBuilder API for constructing Europe PMC search queries, from single keywords to grouped boolean expressions with date, citation, and availability filters.

## Table of Contents
1. [Setup & Imports](#setup)
2. [Basic Queries](#basic)
3. [Field-Specific Searches](#fields)
4. [Boolean Logic (AND, OR, NOT)](#boolean)
5. [Date & Citation Filters](#filters)
6. [Content Availability Filters](#content)
7. [Complex Real-World Queries](#complex)
8. [Live Search Integration](#live)
9. [Advanced: Generic field() Method](#advanced)
10. [Tips & Best Practices](#tips)

## 1. Setup & Imports <a name="setup"></a>

First, let's import the necessary components:

In [1]:
from pyeuropepmc import QueryBuilder, SearchClient
import json

# Helper function to print queries nicely
def show_query(description, query_builder):
    query = query_builder.build()
    print(f"\n{'='*60}")
    print(f"📝 {description}")
    print(f"{'='*60}")
    print(f"Query: {query}")
    print(f"{'='*60}")
    return query

print("✅ Imports successful!")

✅ Imports successful!


## 2. Basic Queries <a name="basic"></a>

Let's start with simple keyword searches:

In [2]:
# Simple keyword search
qb = QueryBuilder(validate=False)
query1 = show_query(
    "Simple keyword search",
    qb.keyword("cancer")
)


📝 Simple keyword search
Query: cancer


In [3]:
# Keyword in specific field (title)
qb = QueryBuilder(validate=False)
query2 = show_query(
    "Keyword in title only",
    qb.keyword("CRISPR", field="title")
)


📝 Keyword in title only
Query: TITLE:CRISPR


In [4]:
# Multi-word phrases (automatically quoted)
qb = QueryBuilder(validate=False)
query3 = show_query(
    "Multi-word phrase in abstract",
    qb.keyword("gene editing", field="abstract")
)


📝 Multi-word phrase in abstract
Query: ABSTRACT:"gene editing"


## 3. Field-Specific Searches <a name="fields"></a>

The `field()` method searches a named field such as author, journal, MeSH term, affiliation, or funding agency:

In [5]:
# Author search
qb = QueryBuilder(validate=False)
query = show_query(
    "Search by author",
    qb.field("author", "Smith J")
)


📝 Search by author
Query: AUTH:"Smith J"


In [6]:
# Journal search
qb = QueryBuilder(validate=False)
query = show_query(
    "Search by journal",
    qb.field("journal", "Nature")
)


📝 Search by journal
Query: JOURNAL:Nature


In [7]:
# MeSH term search (Medical Subject Headings)
qb = QueryBuilder(validate=False)
query = show_query(
    "Search by MeSH term",
    qb.field("mesh", "Neoplasms")
)


📝 Search by MeSH term
Query: MESH:Neoplasms


In [8]:
# Affiliation search
qb = QueryBuilder(validate=False)
query = show_query(
    "Search by institution",
    qb.field("affiliation", "university of cambridge")
)


📝 Search by institution
Query: AFF:"university of cambridge"


In [9]:
# Grant agency search
qb = QueryBuilder(validate=False)
query = show_query(
    "Search by funding agency",
    qb.field("grant_agency", "wellcome")
)


📝 Search by funding agency
Query: GRANT_AGENCY:wellcome
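
The outputs above suggest two things happen when a field clause is rendered: the friendly field name is translated to a Europe PMC field code, and multi-word values are wrapped in double quotes. The following is only a sketch of that idea. `FIELD_ALIASES` and `field_clause` are hypothetical names, and the alias table is copied from the printed queries rather than taken from the library.

```python
# Illustrative alias table, copied from the query strings printed above.
# This is NOT the library's own mapping, only a sketch of the idea.
FIELD_ALIASES = {
    "author": "AUTH",
    "affiliation": "AFF",
    "journal": "JOURNAL",
    "mesh": "MESH",
    "grant_agency": "GRANT_AGENCY",
}


def field_clause(name: str, value: str) -> str:
    """Render FIELD:value, quoting values that contain whitespace."""
    code = FIELD_ALIASES.get(name, name.upper())
    needs_quotes = any(ch.isspace() for ch in value)
    rendered = f'"{value}"' if needs_quotes else value
    return f"{code}:{rendered}"


print(field_clause("author", "Smith J"))
print(field_clause("journal", "Nature"))
```

Unknown names fall back to their upper-cased form in this sketch, which is a simplification chosen for the example.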


## 4. Boolean Logic (AND, OR, NOT) <a name="boolean"></a>

Combine search terms with logical operators:

In [10]:
# AND operator - both terms must be present
qb = QueryBuilder(validate=False)
query = show_query(
    "AND operator: cancer AND treatment",
    qb.keyword("cancer").and_().keyword("treatment")
)


📝 AND operator: cancer AND treatment
Query: cancer AND treatment


In [11]:
# OR operator - either term can be present
qb = QueryBuilder(validate=False)
query = show_query(
    "OR operator: cancer OR tumor",
    qb.keyword("cancer").or_().keyword("tumor")
)


📝 OR operator: cancer OR tumor
Query: cancer OR tumor


In [12]:
# NOT operator - exclude term
qb = QueryBuilder(validate=False)
query = show_query(
    "NOT operator: cancer NOT review",
    qb.keyword("cancer").and_().not_().keyword("review")
)


📝 NOT operator: cancer NOT review
Query: cancer AND NOT review


In [13]:
# Chained operators without grouping: no parentheses are emitted,
# so operator precedence is left to the search engine (the next cell shows grouping)
qb = QueryBuilder(validate=False)
query = show_query(
    "Ungrouped chain: cancer OR tumor AND treatment AND NOT review",
    qb.keyword("cancer").or_().keyword("tumor").and_().keyword("treatment").and_().not_().keyword("review")
)


📝 Ungrouped chain: cancer OR tumor AND treatment AND NOT review
Query: cancer OR tumor AND treatment AND NOT review


In [14]:
# Grouping for precedence control
disease_terms = QueryBuilder(validate=False).keyword("cancer").or_().keyword("tumor")

qb = QueryBuilder(validate=False)
query = show_query(
    "Grouped query: (cancer OR tumor) AND treatment",
    qb.group(disease_terms).and_().keyword("treatment")
)


📝 Grouped query: (cancer OR tumor) AND treatment
Query: (cancer OR tumor) AND treatment
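
To see why the parentheses matter, it can help to evaluate both forms against a single hypothetical paper. The sketch below assumes the search backend gives AND higher precedence than OR. Python's own `and`/`or` behave that way, so they serve as a stand-in here; Europe PMC's actual precedence rules should be checked before relying on this.

```python
def ungrouped(cancer: bool, tumor: bool, treatment: bool) -> bool:
    # Parsed as: cancer OR (tumor AND treatment)
    return cancer or tumor and treatment


def grouped(cancer: bool, tumor: bool, treatment: bool) -> bool:
    # Explicit group: (cancer OR tumor) AND treatment
    return (cancer or tumor) and treatment


# A hypothetical paper that mentions cancer but never mentions treatment.
paper = dict(cancer=True, tumor=False, treatment=False)
print(ungrouped(**paper))  # the ungrouped form still matches this paper
print(grouped(**paper))    # the grouped form filters it out
```

Under that assumption, the ungrouped chain lets through every paper that mentions cancer, whether or not it discusses treatment.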


## 5. Date & Citation Filters <a name="filters"></a>

Filter by publication date ranges and citation counts:

In [15]:
# Year range (2020-2023)
qb = QueryBuilder(validate=False)
query = show_query(
    "Publications from 2020 to 2023",
    qb.keyword("cancer").and_().date_range(start_year=2020, end_year=2023)
)


📝 Publications from 2020 to 2023
Query: cancer AND (PUB_YEAR:[2020 TO 2023])


In [16]:
# Open-ended date range (from 2020 onwards)
qb = QueryBuilder(validate=False)
query = show_query(
    "Publications from 2020 onwards",
    qb.keyword("CRISPR").and_().date_range(start_year=2020)
)


📝 Publications from 2020 onwards
Query: CRISPR AND (PUB_YEAR:[2020 TO 2025])
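
The upper bound of 2025 in the output above suggests that an omitted `end_year` is filled in with the year in which the notebook was run. Here is a minimal sketch of that behavior, under that assumption. `year_range_clause` and its `today` parameter are hypothetical and not part of pyeuropepmc.

```python
from datetime import date
from typing import Optional


def year_range_clause(start_year: int, end_year: Optional[int] = None,
                      today: Optional[date] = None) -> str:
    """Render a PUB_YEAR range, defaulting the upper bound to the current year."""
    if end_year is None:
        # Assumed behavior: an open-ended range stops at "this year".
        end_year = (today or date.today()).year
    if end_year < start_year:
        raise ValueError("end_year must not precede start_year")
    return f"(PUB_YEAR:[{start_year} TO {end_year}])"


print(year_range_clause(2020, 2023))
print(year_range_clause(2020, today=date(2025, 6, 1)))
```

If the default really is the current year, the same code produces a different query string next year, so passing an explicit `end_year` may be preferable when a search needs to be reproducible.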


In [17]:
# Specific date range (YYYY-MM-DD format)
qb = QueryBuilder(validate=False)
query = show_query(
    "Specific date range",
    qb.keyword("COVID-19").and_().date_range(start_date="2020-03-01", end_date="2020-12-31")
)


📝 Specific date range
Query: COVID-19 AND (PUB_YEAR:[2020-03-01 TO 2020-12-31])


In [18]:
# Citation count filter (minimum 10 citations)
qb = QueryBuilder(validate=False)
query = show_query(
    "High-impact papers (10+ citations)",
    qb.keyword("machine learning").and_().citation_count(min_count=10)
)


📝 High-impact papers (10+ citations)
Query: "machine learning" AND (CITED:[10 TO *])


In [19]:
# Citation count range
qb = QueryBuilder(validate=False)
query = show_query(
    "Papers with 10-100 citations",
    qb.keyword("genomics").and_().citation_count(min_count=10, max_count=100)
)


📝 Papers with 10-100 citations
Query: genomics AND (CITED:[10 TO 100])
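
The two citation outputs above differ only in the upper bound: a missing `max_count` appears as the wildcard `*`. A small sketch of how such a clause could be rendered follows. `citation_clause` is a hypothetical helper, and defaulting a missing lower bound to 0 is a choice made for this example.

```python
from typing import Optional


def citation_clause(min_count: Optional[int] = None,
                    max_count: Optional[int] = None) -> str:
    """Render a CITED range; a missing upper bound becomes the wildcard '*'."""
    low = 0 if min_count is None else min_count  # assumption: start at zero
    if low < 0:
        raise ValueError("citation counts cannot be negative")
    if max_count is not None and max_count < low:
        raise ValueError("max_count must not be smaller than min_count")
    high = "*" if max_count is None else str(max_count)
    return f"(CITED:[{low} TO {high}])"


print(citation_clause(min_count=10))
print(citation_clause(min_count=10, max_count=100))
```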


## 6. Content Availability Filters <a name="content"></a>

Filter by content availability (open access, PDFs, full text, etc.):

In [20]:
# Open access only
qb = QueryBuilder(validate=False)
query = show_query(
    "Open access papers only",
    qb.keyword("cancer").and_().field("open_access", True)
)


📝 Open access papers only
Query: cancer AND OPEN_ACCESS:y


In [21]:
# Has PDF available
qb = QueryBuilder(validate=False)
query = show_query(
    "Papers with PDF available",
    qb.keyword("immunotherapy").and_().field("has_pdf", True)
)


📝 Papers with PDF available
Query: immunotherapy AND HAS_PDF:y


In [22]:
# Has full text available
qb = QueryBuilder(validate=False)
query = show_query(
    "Papers with full text",
    qb.keyword("proteomics").and_().field("has_text", True)
)


📝 Papers with full text
Query: proteomics AND HAS_TEXT:y


In [23]:
# Has abstract
qb = QueryBuilder(validate=False)
query = show_query(
    "Papers with abstract",
    qb.keyword("neuroscience").and_().field("has_abstract", True)
)


📝 Papers with abstract
Query: neuroscience AND HAS_ABSTRACT:y


In [24]:
# Multiple content filters combined
qb = QueryBuilder(validate=False)
query = show_query(
    "Open access papers with PDF and full text",
    qb.keyword("bioinformatics")
      .and_().field("open_access", True)
      .and_().field("has_pdf", True)
      .and_().field("has_text", True)
)


📝 Open access papers with PDF and full text
Query: bioinformatics AND OPEN_ACCESS:y AND HAS_PDF:y AND HAS_TEXT:y
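
In every output of this section, a Python `True` becomes the value `y`. The sketch below reproduces the combined query above from a plain dictionary of flags. `flag_clause` is a hypothetical helper, and mapping `False` to `n` is an assumption, since the notebook only demonstrates `True`.

```python
def flag_clause(name: str, value: bool) -> str:
    """Render a yes/no availability filter such as OPEN_ACCESS:y."""
    # 'n' for False is assumed; only the True case appears in the notebook.
    return f"{name.upper()}:{'y' if value else 'n'}"


filters = {"open_access": True, "has_pdf": True, "has_text": True}
clauses = ["bioinformatics"] + [flag_clause(k, v) for k, v in filters.items()]
query = " AND ".join(clauses)
print(query)
```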


## 7. Complex Real-World Queries <a name="complex"></a>

Let's build some realistic, complex queries:

In [25]:
# Example 1: Recent high-impact cancer research
# Goal: Find recent open-access cancer papers with significant citations
qb = QueryBuilder(validate=False)
query = show_query(
    "Recent high-impact cancer research (2020+, OA, 10+ citations)",
    qb.keyword("cancer", field="title")
      .and_().date_range(start_year=2020)
      .and_().field("open_access", True)
      .and_().citation_count(min_count=10)
)


📝 Recent high-impact cancer research (2020+, OA, 10+ citations)
Query: TITLE:cancer AND (PUB_YEAR:[2020 TO 2025]) AND OPEN_ACCESS:y AND (CITED:[10 TO *])


In [26]:
# Example 2: CRISPR papers by specific author in top journals
qb = QueryBuilder(validate=False)
query = show_query(
    "CRISPR papers by Smith J in Nature",
    qb.field("author", "Smith J")
      .and_().field("journal", "Nature")
      .and_().keyword("CRISPR", field="title")
)


📝 CRISPR papers by Smith J in Nature
Query: AUTH:"Smith J" AND JOURNAL:Nature AND TITLE:CRISPR


In [27]:
# Example 3: Clinical trials for specific disease with MeSH terms
qb = QueryBuilder(validate=False)
query = show_query(
    "Cancer drug therapy clinical trials (2018-2023, full text)",
    qb.field("mesh", "Neoplasms")
      .and_().field("mesh", "Drug Therapy")
      .and_().keyword("clinical trial")
      .and_().date_range(start_year=2018, end_year=2023)
      .and_().field("has_text", True)
)


📝 Cancer drug therapy clinical trials (2018-2023, full text)
Query: MESH:Neoplasms AND MESH:"Drug Therapy" AND "clinical trial" AND (PUB_YEAR:[2018 TO 2023]) AND HAS_TEXT:y


In [28]:
# Example 4: Multiple authors OR logic with constraints
authors_query = QueryBuilder(validate=False).field("author", "Smith J").or_().field("author", "Doe Jane")

qb = QueryBuilder(validate=False)
query = show_query(
    "Genetics papers by Smith J OR Doe Jane (since 2020, OA)",
    qb.group(authors_query)
      .and_().keyword("genetics")
      .and_().date_range(start_year=2020)
      .and_().field("open_access", True)
)


📝 Genetics papers by Smith J OR Doe Jane (since 2020, OA)
Query: (AUTH:"Smith J" OR AUTH:"Doe Jane") AND genetics AND (PUB_YEAR:[2020 TO 2025]) AND OPEN_ACCESS:y


In [29]:
# Example 5: Disease terms OR logic with treatment focus
disease_query = QueryBuilder(validate=False).field("disease", "cancer").or_().field("disease", "tumor")

qb = QueryBuilder(validate=False)
query = show_query(
    "(Cancer OR Tumor) AND Immunotherapy (2020+, highly cited)",
    qb.group(disease_query)
      .and_().keyword("immunotherapy")
      .and_().date_range(start_year=2020)
      .and_().citation_count(min_count=20)
)


📝 (Cancer OR Tumor) AND Immunotherapy (2020+, highly cited)
Query: (DISEASE:cancer OR DISEASE:tumor) AND immunotherapy AND (PUB_YEAR:[2020 TO 2025]) AND (CITED:[20 TO *])


In [30]:
# Example 6: Excluding reviews and focusing on research articles
qb = QueryBuilder(validate=False)
query = show_query(
    "AI in medicine (research articles, not reviews)",
    qb.keyword("artificial intelligence")
      .and_().keyword("medicine")
      .and_().not_().field("pub_type", "review")
      .and_().date_range(start_year=2021)
      .and_().field("has_text", True)
)


📝 AI in medicine (research articles, not reviews)
Query: "artificial intelligence" AND medicine AND NOT PUB_TYPE:review AND (PUB_YEAR:[2021 TO 2025]) AND HAS_TEXT:y


## 8. Live Search Integration <a name="live"></a>

Now let's actually execute some queries and see results!

**Note:** This requires an internet connection and may take a moment to execute.

In [31]:
from pyeuropepmc.query_builder import get_available_fields
get_available_fields()

['ABBR',
 'ABSTRACT',
 'ACCESSION_ID',
 'ACCESSION_TYPE',
 'ACK_FUND',
 'AFF',
 'ANNOTATION_PROVIDER',
 'ANNOTATION_TYPE',
 'APPENDIX',
 'ARXPR_PUBS',
 'AUTH',
 'AUTHORID',
 'AUTHORID_TYPE',
 'AUTHOR_ROLES',
 'AUTH_COLLECTIVE_LIST',
 'AUTH_CON',
 'AUTH_FIRST',
 'AUTH_LAST',
 'AUTH_MAN',
 'AUTH_MAN_ID',
 'BACK',
 'BACK_NOREF',
 'BODY',
 'BOOK_ID',
 'CASE',
 'CHEBITERM',
 'CHEBITERM_ID',
 'CHEBI_PUBS',
 'CHEM',
 'CHEMBL_PUBS',
 'CITED',
 'CITES',
 'COMP_INT',
 'CONCL',
 'CREATION_DATE',
 'DATA_AVAILABILITY',
 'DISCUSS',
 'DISEASE',
 'DISEASE_ID',
 'DOI',
 'ED',
 'EMBARGO_DATE',
 'EMBL_PUBS',
 'EMBL_ROR_ID',
 'EPMC_AUTH_MAN',
 'ESSN',
 'EXPERIMENTAL_METHOD',
 'EXPERIMENTAL_METHOD_ID',
 'EXT_ID',
 'E_PDATE',
 'FIG',
 'FIRST_IDATE',
 'FIRST_IDATE_D',
 'FIRST_PDATE',
 'FT_CDATE',
 'FT_CDATE_D',
 'FT_ID',
 'FUNDER_INITIATIVE',
 'GENE_PROTEIN',
 'GOTERM',
 'GOTERM_ID',
 'GRANT_AGENCY',
 'GRANT_AGENCY_ID',
 'GRANT_ID',
 'HAS_ABSTRACT',
 'HAS_ARXPR',
 'HAS_BOOK',
 'HAS_CHEBI',
 'HAS_CHEMBL',
 'H

In [32]:
# Build a query for CRISPR papers
# Note: Validation is disabled because Europe PMC syntax differs from PubMed
qb = QueryBuilder(validate=False)
query = (qb.keyword("CRISPR", field="title")
          .and_().date_range(start_year=2020)
          .and_().field("open_access", True)
          .build())

print(f"Query: {query}\n")
print("Searching Europe PMC...\n")

# Execute the search
with SearchClient() as client:
    results = client.search(query, pageSize=5)
    print(results)
    hit_count = results.get("hitCount", 0)

    print(f"✅ Found {hit_count:,} results\n")
    print("=" * 80)

    # Display first 5 results
    if "resultList" in results and "result" in results["resultList"]:
        for i, paper in enumerate(results["resultList"]["result"], 1):
            print(f"\n{i}. {paper.get('title', 'N/A')}")
            print(f"   📝 Authors: {paper.get('authorString', 'N/A')}")
            print(f"   📅 Year: {paper.get('pubYear', 'N/A')}")
            print(f"   📖 Journal: {paper.get('journalTitle', 'N/A')}")
            print(f"   🔗 PMID: {paper.get('pmid', 'N/A')}")
            if 'citedByCount' in paper:
                print(f"   📊 Citations: {paper['citedByCount']}")

Query: TITLE:CRISPR AND (PUB_YEAR:[2020 TO 2025]) AND OPEN_ACCESS:y

Searching Europe PMC...

{'version': '6.9', 'hitCount': 7143, 'nextCursorMark': 'AoIIQD8HgCg1MzUwMzM2OA==', 'nextPageUrl': 'https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=TITLE:CRISPR AND (PUB_YEAR:[2020 TO 2025]) AND OPEN_ACCESS:y&cursorMark=AoIIQD8HgCg1MzUwMzM2OA==&resultType=lite&pageSize=5&format=json', 'request': {'queryString': 'TITLE:CRISPR AND (PUB_YEAR:[2020 TO 2025]) AND OPEN_ACCESS:y', 'resultType': 'lite', 'cursorMark': '%2A', 'pageSize': 5, 'sort': '', 'synonym': False}, 'resultList': {'result': [{'id': '41093884', 'source': 'MED', 'pmid': '41093884', 'pmcid': 'PMC12528686', 'fullTextIdList': {'fullTextId': ['PMC12528686']}, 'doi': '10.1038/s41467-025-64205-4', 'title': 'CRISPR anti-tag-mediated room-temperature RNA detection using CRISPR/Cas13a.', 'authorString': 'Moon J, Zhang J, Guan X, Yang R, Guo C, Schalper KT, Avery L, Banach D, LaSala R, Warrier R, Liu C.', 'journalTitle': 'Nat Comm
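
The raw dictionary above is hard to read, and a response may lack `resultList` entirely. One way to flatten it into printable lines while tolerating missing keys is sketched below. `summarize_results` is a hypothetical helper, and the `sample` dictionary is a hand-written stand-in that only imitates the response shape shown above, not real API output.

```python
from typing import Any, Dict, List


def summarize_results(response: Dict[str, Any]) -> List[str]:
    """Return one 'title (year)' line per hit, tolerating missing keys."""
    hits = response.get("resultList", {}).get("result", [])
    return [f"{h.get('title', 'N/A')} ({h.get('pubYear', 'N/A')})" for h in hits]


# Hand-written stand-in for a response; not real Europe PMC data.
sample = {
    "hitCount": 2,
    "resultList": {"result": [
        {"title": "Example paper A.", "pubYear": "2021"},
        {"title": "Example paper B."},
    ]},
}
for line in summarize_results(sample):
    print(line)
print(summarize_results({"hitCount": 0}))
```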

In [40]:
# Another example: Recent high-impact AI papers
qb = QueryBuilder(validate=False)
query = (qb.keyword("artificial intelligence")
          .and_().keyword("medicine", field="title")
          .and_().date_range(start_year=2023)
          .and_().citation_count(min_count=5)
          .build())

print(f"Query: {query}\n")
print("Searching Europe PMC...\n")

with SearchClient() as client:
    results = client.search(query, pageSize=3)
    hit_count = results.get("hitCount", 0)
    print(results)
    print(f"✅ Found {hit_count:,} results\n")
    print("=" * 80)

    if "resultList" in results and "result" in results["resultList"]:
        for i, paper in enumerate(results["resultList"]["result"], 1):
            print(f"\n{i}. {paper.get('title', 'N/A')}")
            print(f"   📝 Authors: {paper.get('authorString', 'N/A')}")
            print(f"   📅 Year: {paper.get('pubYear', 'N/A')}")
            if 'citedByCount' in paper:
                print(f"   📊 Citations: {paper['citedByCount']}")

Query: "artificial intelligence" AND TITLE:medicine AND (PUB_YEAR:[2023 TO 2025]) AND (CITED:[5 TO *])

Searching Europe PMC...

{'version': '6.9', 'hitCount': 822, 'nextCursorMark': 'AoIIQIs0Wyg1MjYzNzk1Ng==', 'nextPageUrl': 'https://www.ebi.ac.uk/europepmc/webservices/rest/search?query="artificial intelligence" AND TITLE:medicine AND (PUB_YEAR:[2023 TO 2025]) AND (CITED:[5 TO *])&cursorMark=AoIIQIs0Wyg1MjYzNzk1Ng==&resultType=lite&pageSize=3&format=json', 'request': {'queryString': '"artificial intelligence" AND TITLE:medicine AND (PUB_YEAR:[2023 TO 2025]) AND (CITED:[5 TO *])', 'resultType': 'lite', 'cursorMark': '%2A', 'pageSize': 3, 'sort': '', 'synonym': False}, 'resultList': {'result': [{'id': '40140300', 'source': 'MED', 'pmid': '40140300', 'doi': '10.1016/j.disamonth.2025.101882', 'title': 'The integration of artificial intelligence into clinical medicine: Trends, challenges, and future directions.', 'authorString': 'Aravazhi PS, Gunasekaran P, Benjamin NZY, Thai A, Chandrasek
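
Both responses above carry a `nextCursorMark` and a `nextPageUrl`, which indicates cursor-based pagination. The sketch below builds a percent-encoded URL for the following page using only the standard library. The parameter names are copied from the `nextPageUrl` shown above. `next_page_url` is a hypothetical helper, and whether `SearchClient.search` accepts a cursor argument directly is not shown in this notebook.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as it appears in the nextPageUrl field of the responses above.
BASE_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def next_page_url(query: str, cursor_mark: str, page_size: int = 5) -> str:
    """Build a percent-encoded URL for the page that follows cursor_mark."""
    params = {
        "query": query,
        "cursorMark": cursor_mark,
        "resultType": "lite",
        "pageSize": page_size,
        "format": "json",
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = next_page_url("TITLE:CRISPR AND OPEN_ACCESS:y", "AoIIQD8HgCg1MzUwMzM2OA==")
print(url)

# Decoding the URL recovers the original parameters unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

Encoding matters here because the query contains spaces and colons, and the cursor mark ends in `=` characters.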

## 9. Advanced: Generic field() Method <a name="advanced"></a>

The `field()` method provides direct access to all 149+ Europe PMC search fields:

In [34]:
# Using generic field() method
qb = QueryBuilder(validate=False)
query = show_query(
    "Using generic field() method",
    qb.field("disease", "diabetes")
      .and_().field("organism", "homo sapiens")
      .and_().field("open_access", True)
)


📝 Using generic field() method
Query: DISEASE:diabetes AND ORGANISM:"homo sapiens" AND OPEN_ACCESS:y


In [35]:
# Accessing specialized fields
qb = QueryBuilder(validate=False)
query = show_query(
    "Specialized fields: Gene/Protein and ChEBI terms",
    qb.field("gene_protein", "TP53")
      .and_().field("chebiterm", "drug")
      .and_().field("has_pdb", True)  # Has Protein Data Bank cross-references
)


📝 Specialized fields: Gene/Protein and ChEBI terms
Query: GENE_PROTEIN:TP53 AND CHEBITERM:drug AND HAS_PDB:y


In [36]:
# Accession IDs and database cross-references
qb = QueryBuilder(validate=False)
query = show_query(
    "Papers with specific database accessions",
    qb.accession_type("pdb")
      .and_().field("has_uniprot", True)  # Has UniProt cross-references
      .and_().date_range(start_year=2020)
)


📝 Papers with specific database accessions
Query: ACCESSION_TYPE:pdb AND HAS_UNIPROT:y AND (PUB_YEAR:[2020 TO 2025])


In [37]:
# Available field types overview
print("\n" + "="*60)
print("Field Types in QueryBuilder")
print("="*60)
print("\n📚 Content Fields:")
print("  - title, abstract, keyword, mesh")
print("\n👤 Author Fields:")
print("  - author, affiliation, investigator, authorid")
print("\n📰 Publication Fields:")
print("  - journal, issn, pub_type, language")
print("\n🔢 Identifier Fields:")
print("  - pmid, pmcid, doi, ext_id")
print("\n💰 Funding Fields:")
print("  - grant_agency, grant_id, funder_initiative")
print("\n🧬 Biomedical Fields:")
print("  - disease, gene_protein, organism, chemical")
print("  - goterm, chebiterm, experimental_method")
print("\n📅 Date Fields:")
print("  - pub_year, e_pdate, first_pdate, update_date")
print("\n✅ Availability Fields:")
print("  - open_access, has_pdf, has_full_text, has_abstract")
print("  - in_pmc, in_epmc, has_references, has_supplementary")
print("\n🔗 Cross-Reference Fields:")
print("  - has_uniprot, has_pdb, has_embl, has_intact")
print("  - has_chebi, has_chembl, has_omim")
print("\n📖 Section Fields:")
print("  - intro, methods, results, discuss, concl")
print("  - fig, table, suppl, ref")
print("="*60)


Field Types in QueryBuilder

📚 Content Fields:
  - title, abstract, keyword, mesh

👤 Author Fields:
  - author, affiliation, investigator, authorid

📰 Publication Fields:
  - journal, issn, pub_type, language

🔢 Identifier Fields:
  - pmid, pmcid, doi, ext_id

💰 Funding Fields:
  - grant_agency, grant_id, funder_initiative

🧬 Biomedical Fields:
  - disease, gene_protein, organism, chemical
  - goterm, chebiterm, experimental_method

📅 Date Fields:
  - pub_year, e_pdate, first_pdate, update_date

✅ Availability Fields:
  - open_access, has_pdf, has_full_text, has_abstract
  - in_pmc, in_epmc, has_references, has_supplementary

🔗 Cross-Reference Fields:
  - has_uniprot, has_pdb, has_embl, has_intact
  - has_chebi, has_chembl, has_omim

📖 Section Fields:
  - intro, methods, results, discuss, concl
  - fig, table, suppl, ref


## 10. Tips & Best Practices <a name="tips"></a>

In [38]:
print("\n" + "="*70)
print("QueryBuilder Tips & Best Practices")
print("="*70)

print("\n✅ DO:")
print("  1. Use field-specific methods (author, journal, etc.) for clarity")
print("  2. Chain methods for readability")
print("  3. Use grouping for complex OR logic")
print("  4. Add date_range() to focus on recent research")
print("  5. Use open_access(True) for freely available papers")
print("  6. Add citation_count() for high-impact papers")
print("  7. Use has_full_text(True) when you need complete articles")

print("\n❌ DON'T:")
print("  1. Don't start queries with AND, OR, or NOT operators")
print("  2. Don't end queries with operators (call build() after adding terms)")
print("  3. Don't use consecutive operators (e.g., .and_().and_())")
print("  4. Don't forget to call build() to get the query string")

print("\n💡 PRO TIPS:")
print("  1. Start broad, then add filters incrementally")
print("  2. Test queries with small pageSize first")
print("  3. Use MeSH terms for precise medical concept matching")
print("  4. Combine multiple content filters for highest quality results")
print("  5. Use the generic field() method for specialized searches")
print("  6. Group OR terms together before combining with AND")

print("\n📚 Documentation:")
print("  - QueryBuilder API: docs/api/README.md")
print("  - Field reference: See FIELD_METADATA in query_builder.py")
print("  - Examples: examples/08-query-builder/")
print("="*70)


QueryBuilder Tips & Best Practices

✅ DO:
  1. Use field-specific methods (author, journal, etc.) for clarity
  2. Chain methods for readability
  3. Use grouping for complex OR logic
  4. Add date_range() to focus on recent research
  5. Use open_access(True) for freely available papers
  6. Add citation_count() for high-impact papers
  7. Use has_full_text(True) when you need complete articles

❌ DON'T:
  1. Don't start queries with AND, OR, or NOT operators
  2. Don't end queries with operators (call build() after adding terms)
  3. Don't use consecutive operators (e.g., .and_().and_())
  4. Don't forget to call build() to get the query string

💡 PRO TIPS:
  1. Start broad, then add filters incrementally
  2. Test queries with small pageSize first
  3. Use MeSH terms for precise medical concept matching
  4. Combine multiple content filters for highest quality results
  5. Use the generic field() method for specialized searches
  6. Group OR terms together before combining w
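
The operator rules in the DON'T list can be checked mechanically on a tokenized query. The sketch below is a hypothetical checker, not the library's validation logic. It assumes the query has already been split into tokens, with each quoted phrase as a single token, and it permits `AND NOT` and `OR NOT` because the notebook itself chains `.and_().not_()`.

```python
from typing import List

OPERATORS = {"AND", "OR", "NOT"}
# Operator pairs that are legitimate, e.g. the output of .and_().not_()
ALLOWED_PAIRS = {("AND", "NOT"), ("OR", "NOT")}


def operator_problems(tokens: List[str]) -> List[str]:
    """Return a list of operator-placement problems (empty if none)."""
    if not tokens:
        return ["query is empty"]
    problems = []
    if tokens[0] in OPERATORS:
        problems.append(f"starts with {tokens[0]}")
    if tokens[-1] in OPERATORS:
        problems.append(f"ends with {tokens[-1]}")
    for left, right in zip(tokens, tokens[1:]):
        both_operators = left in OPERATORS and right in OPERATORS
        if both_operators and (left, right) not in ALLOWED_PAIRS:
            problems.append(f"consecutive operators: {left} {right}")
    return problems


print(operator_problems(["cancer", "AND", "NOT", "review"]))
print(operator_problems(["cancer", "AND", "AND", "treatment"]))
```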

## Example Workflow: Building a Query Step-by-Step

Let's build a complex query incrementally to see how it evolves:

In [39]:
# Step 1: Start with a basic keyword
qb = QueryBuilder(validate=False)
qb.keyword("cancer")
print(f"Step 1: {qb.build()}")

# Step 2: Add a field constraint (title only)
qb = QueryBuilder(validate=False)
qb.keyword("cancer", field="title")
print(f"Step 2: {qb.build()}")

# Step 3: Add date range (recent papers)
qb = QueryBuilder(validate=False)
qb.keyword("cancer", field="title").and_().date_range(start_year=2020)
print(f"Step 3: {qb.build()}")

# Step 4: Add treatment focus
qb = QueryBuilder(validate=False)
qb.keyword("cancer", field="title").and_().date_range(start_year=2020).and_().keyword("immunotherapy")
print(f"Step 4: {qb.build()}")

# Step 5: Add quality filters (open access + citations)
qb = QueryBuilder(validate=False)
query = (qb.keyword("cancer", field="title")
          .and_().date_range(start_year=2020)
          .and_().keyword("immunotherapy")
          .and_().field("open_access", True)
          .and_().citation_count(min_count=10)
          .build())
print(f"Step 5 (Final): {query}")

print("\n✨ Query built successfully!")

Step 1: cancer
Step 2: TITLE:cancer
Step 3: TITLE:cancer AND (PUB_YEAR:[2020 TO 2025])
Step 4: TITLE:cancer AND (PUB_YEAR:[2020 TO 2025]) AND immunotherapy
Step 5 (Final): TITLE:cancer AND (PUB_YEAR:[2020 TO 2025]) AND immunotherapy AND OPEN_ACCESS:y AND (CITED:[10 TO *])

✨ Query built successfully!
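
The cell above creates a fresh builder at each step. Another way to picture the same progression is as a growing list of clauses joined with `AND`. The sketch below uses plain strings copied from the Step 5 output and does not call QueryBuilder at all. It only illustrates how each added filter extends the previous query.

```python
# Clauses copied from the final query printed above, in the order they appear.
all_clauses = [
    "TITLE:cancer",
    "(PUB_YEAR:[2020 TO 2025])",
    "immunotherapy",
    "OPEN_ACCESS:y",
    "(CITED:[10 TO *])",
]

clauses = []
steps = []
for clause in all_clauses:
    clauses.append(clause)
    # Every intermediate query is the previous one plus a single clause.
    steps.append(" AND ".join(clauses))

for n, text in enumerate(steps, 1):
    print(f"After clause {n}: {text}")
```

Keeping the intermediate strings makes it easy to run each one with a small `pageSize` and watch how the hit count shrinks as filters are added.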


## Summary

You've learned how to:
- ✅ Build simple and complex queries with the fluent API
- ✅ Search specific fields (author, journal, MeSH terms, etc.)
- ✅ Combine terms with boolean operators (AND, OR, NOT)
- ✅ Filter by dates, citations, and content availability
- ✅ Group queries to control operator precedence
- ✅ Execute live searches and display results
- ✅ Use the generic field() method for advanced searches

### Next Steps
- Explore the 149+ available fields in FIELD_METADATA
- Try combining different filters for your research needs
- Check out the FullTextClient for retrieving full article content
- Use the FTPDownloader for bulk data retrieval

**Happy searching! 🔬📚**