# SPARQL Query Examples for GKC

This notebook demonstrates various ways to use the SPARQL query utilities from the Global Knowledge Commons (GKC) package. You'll learn how to:

- Execute simple and complex SPARQL queries against Wikidata and other endpoints
- Parse and transform results into different formats (dict lists, DataFrames, CSV)
- Use convenience functions for quick queries
- Handle errors and edge cases
- Work with custom SPARQL endpoints

Most examples query live data from Wikidata (Example 7 uses DBpedia instead), so results will vary over time. Some queries may take a few seconds to complete.

## Setup: Import Libraries

Let's start by importing the necessary GKC SPARQL utilities. The `SPARQLQuery` class is the main interface for executing queries.

In [None]:
from gkc.sparql import SPARQLQuery, execute_sparql, execute_sparql_to_dataframe, SPARQLError

# Optional: import pandas for enhanced output display
try:
    import pandas as pd
    HAS_PANDAS = True
except ImportError:
    HAS_PANDAS = False
    print("Note: pandas not installed. Some examples will be limited.")

## Example 1: Simple Query Execution

The most basic way to use SPARQL queries: create a `SPARQLQuery` object and call its `query()` method with a SPARQL SELECT query.

This example queries for instances of "federally recognized Native American tribe in the United States" (Q7840353) from Wikidata:

In [None]:
executor = SPARQLQuery()

query = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q7840353 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
LIMIT 5
"""

results = executor.query(query)
print("Raw JSON results:")
display(results)

## Example 2: Execute Query from Wikidata URL

You can share SPARQL queries via Wikidata Query Service URLs (look for the "Share" button in the UI). The `SPARQLQuery` class can parse these URLs and extract the query:


In [None]:
# This is a real Wikidata Query Service URL - you can copy these from the Query Service UI
# This one encodes the same query as Example 1 (instances of Q7840353), limited to 3 results
url = "https://query.wikidata.org/#SELECT%20%3Fitem%20%3FitemLabel%20WHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP31%20wd%3AQ7840353%20.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22.%20%7D%0A%7D%0ALIMIT%203"

executor = SPARQLQuery()
results = executor.query(url)
print("Results from Wikidata URL:")
display(results)
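
To make the URL handling less opaque, here is a minimal sketch of how a share URL can be decoded with the standard library alone. It assumes the query is percent-encoded in the URL fragment (the part after `#`), as in the URL above. `extract_query_from_url` is a hypothetical helper for illustration and is not the GKC implementation.

```python
from urllib.parse import unquote, urlsplit


def extract_query_from_url(url: str) -> str:
    """Return the SPARQL text stored in the fragment of a share URL (hypothetical helper)."""
    fragment = urlsplit(url).fragment  # everything after '#', still percent-encoded
    if not fragment:
        raise ValueError("URL has no fragment containing a query")
    return unquote(fragment)


# Shortened variant of the share URL used above
share_url = (
    "https://query.wikidata.org/#SELECT%20%3Fitem%20%3FitemLabel%20WHERE%20%7B%0A"
    "%20%20%3Fitem%20wdt%3AP31%20wd%3AQ7840353%20.%0A%7D%0ALIMIT%203"
)
decoded_query = extract_query_from_url(share_url)
print(decoded_query)
```

Decoding a URL this way is also a quick check that a shared link contains the query you expect before running it.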

## Example 3: Convert Results to List of Dictionaries

Often it's more convenient to work with results as a list of dictionaries. The `to_dict_list()` method does this conversion automatically:


In [None]:
executor = SPARQLQuery()

query = """
SELECT ?item ?itemLabel ?itemDescription WHERE {
  ?item wdt:P31 wd:Q7840353 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
LIMIT 3
"""

results = executor.to_dict_list(query)

print("Results as list of dicts:")
for i, row in enumerate(results, 1):
    print(f"\n{i}. {row}")
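
For intuition, the sketch below shows what such a conversion involves. It assumes the raw results follow the standard SPARQL JSON results layout (`head.vars` plus `results.bindings`). `bindings_to_dicts` is a hypothetical helper, the payload is made up with placeholder IDs and labels, and the actual `to_dict_list()` implementation may differ.

```python
from typing import Any


def bindings_to_dicts(payload: dict[str, Any]) -> list[dict[str, str]]:
    """Flatten SPARQL JSON results into one plain dict per row (hypothetical helper)."""
    rows = []
    for binding in payload["results"]["bindings"]:
        # Each binding maps a variable name to {"type": ..., "value": ...};
        # variables that are unbound in a row are simply absent.
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


# Made-up payload in the standard result shape (placeholder ID and label)
sample_payload = {
    "head": {"vars": ["item", "itemLabel"]},
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q0000001"},
                "itemLabel": {"type": "literal", "xml:lang": "en", "value": "Example label"},
            }
        ]
    },
}

flat_rows = bindings_to_dicts(sample_payload)
print(flat_rows)
```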

## Example 4: Convert Results to pandas DataFrame

For data analysis, converting results to a pandas DataFrame is often ideal. DataFrame objects are great for filtering, grouping, and visualization:


In [None]:
if HAS_PANDAS:
    executor = SPARQLQuery()

    query = """
    SELECT ?item ?itemLabel ?itemDescription WHERE {
      ?item wdt:P31 wd:Q7840353 .
      SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en" .
      }
    }
    LIMIT 10
    """

    df = executor.to_dataframe(query)

    print("Results as DataFrame:")
    print(df)
    print(f"\nDataFrame shape: {df.shape}")
    print(f"Columns: {list(df.columns)}")
else:
    print("pandas not installed. Install with: pip install pandas")

## Example 5: Export Results to CSV

You can export results directly to CSV format for use in other tools or to save for later analysis:


In [None]:
executor = SPARQLQuery()

query = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q7840353 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
LIMIT 5
"""

# Get CSV data as a string
csv_data = executor.to_csv(query)
print("CSV data:")
print(csv_data)
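
If you already hold the results as a list of dictionaries, a similar export can be sketched with Python's `csv` module. `rows_to_csv` is a hypothetical helper and the rows are placeholders; it is not how `to_csv()` is necessarily implemented. The sketch handles rows that lack an optional variable and values that contain commas.

```python
import csv
import io


def rows_to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize dict rows to CSV text (hypothetical helper)."""
    if not rows:
        return ""
    # Collect column names in first-seen order; some rows may lack optional variables
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty fields
    return buffer.getvalue()


# Placeholder rows for illustration only
export_rows = [
    {"item": "http://www.wikidata.org/entity/Q0000001", "itemLabel": "Example, with comma"},
    {"item": "http://www.wikidata.org/entity/Q0000002"},
]
csv_text = rows_to_csv(export_rows)
print(csv_text)
```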

## Example 6: Convenience Functions

For one-off queries, you can use convenience functions that handle instantiation automatically:


In [None]:
query = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q7840353 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
LIMIT 3
"""

# Use convenience function for one-off queries
results = execute_sparql(query)
print("Using execute_sparql():")
print(results)

# Also works with DataFrames if pandas is available
if HAS_PANDAS:
    df = execute_sparql_to_dataframe(query)
    print("\nUsing execute_sparql_to_dataframe():")
    print(df)
else:
    print("pandas not installed. Install with: pip install pandas")

## Example 7: Using Custom SPARQL Endpoints

While Wikidata is the default, you can query any SPARQL endpoint by specifying a custom endpoint URL:


In [None]:
# Example with DBpedia - a different SPARQL endpoint
custom_endpoint = "https://dbpedia.org/sparql"

executor = SPARQLQuery(endpoint=custom_endpoint)

query = """
SELECT ?resource ?label WHERE {
  ?resource rdf:type dbo:Animal .
  ?resource rdfs:label ?label .
  FILTER(LANG(?label) = 'en')
}
LIMIT 5
"""

try:
    results = executor.query(query)
    print("Results from DBpedia endpoint:")
    print(results)
except Exception as e:
    print("Note: the DBpedia endpoint example requires a working internet connection")
    print(f"Error: {e}")
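
The query above uses the `rdf:`, `rdfs:`, and `dbo:` prefixes without declaring them, which works only where the endpoint predefines them. A more portable habit is to declare prefixes explicitly. The sketch below does that and also shows, under the assumption that the endpoint speaks the standard SPARQL Protocol over HTTP GET, what the underlying request looks like. `build_sparql_request` is a hypothetical helper. It only prepares the request and sends nothing, and it is not how `SPARQLQuery` is necessarily implemented.

```python
from urllib.parse import urlencode
from urllib.request import Request

PREFIXES = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
"""

SELECT_BODY = """\
SELECT ?resource ?label WHERE {
  ?resource rdf:type dbo:Animal .
  ?resource rdfs:label ?label .
  FILTER(LANG(?label) = 'en')
}
LIMIT 5
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Prepare (but do not send) a SPARQL Protocol GET request (hypothetical helper)."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for the JSON results format rather than the endpoint's default
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request("https://dbpedia.org/sparql", PREFIXES + SELECT_BODY)
print(request.get_method(), request.full_url[:60])
```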

## Example 8: Error Handling

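Robust code handles errors gracefully. The GKC SPARQL module raises `SPARQLError` for known failure modes; the two cases below are a malformed query and a URL that is not a Wikidata Query Service URL: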


In [None]:
executor = SPARQLQuery()

# Case 1: Invalid SPARQL syntax
print("Attempting invalid SPARQL query:")
try:
    executor.query("INVALID SPARQL SYNTAX")
except SPARQLError as e:
    print(f"✓ Caught SPARQLError: {e}\n")

# Case 2: Invalid Wikidata URL
print("Attempting invalid Wikidata URL:")
try:
    executor.query("https://example.com/#SELECT%20*")
except SPARQLError as e:
    print(f"✓ Caught SPARQLError: {e}")

## Example 9: Complex Query with Filters and Sorting

SPARQL handles more complex queries as well. This example finds tribes whose member count (P2124) is below 20 and sorts them by member count, largest first:


In [None]:
executor = SPARQLQuery()

query = """
SELECT ?item ?itemLabel ?member_count WHERE {
  ?item wdt:P31 wd:Q7840353 .
  ?item wdt:P2124 ?member_count .
  FILTER(?member_count < 20)
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
ORDER BY DESC(?member_count)
LIMIT 10
"""

try:
    results = executor.to_dict_list(query)
    print(f"Found {len(results)} tribes with member count < 20\n")
    print("Top 5 by member count:")
    for i, row in enumerate(results[:5], 1):
        member_count = row.get('member_count', 'N/A')
        label = row.get('itemLabel', 'N/A')
        print(f"  {i}. {label}: {member_count}")
except Exception as e:
    print(f"Error: {e}")
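
One caveat when post-processing numeric columns: in the standard SPARQL JSON results format, every value arrives as a string. Whether `to_dict_list()` converts numbers is not shown here, so the sketch below assumes it does not and casts before sorting on the client side. `sort_by_member_count` is a hypothetical helper and the rows are placeholders.

```python
def sort_by_member_count(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Sort rows by numeric member_count, largest first (hypothetical helper).

    Rows with a missing or non-numeric count are placed last.
    """
    def sort_key(row: dict[str, str]) -> tuple[int, float]:
        try:
            return (0, -float(row["member_count"]))
        except (KeyError, ValueError):
            return (1, 0.0)

    return sorted(rows, key=sort_key)


# Placeholder rows; note that comparing the raw strings would rank "9" above "15"
count_rows = [
    {"itemLabel": "Example A", "member_count": "9"},
    {"itemLabel": "Example B", "member_count": "15"},
    {"itemLabel": "Example C"},
]
ordered = sort_by_member_count(count_rows)
print([row["itemLabel"] for row in ordered])
```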

## Summary

You've now seen how to:

- **Execute queries** using the `SPARQLQuery` class or convenience functions
- **Run queries from Wikidata URLs** by passing a Query Service URL to `query()` (or parse one with `SPARQLQuery.parse_wikidata_query_url()`)
- **Transform results** into different formats (raw JSON, dict lists, DataFrames, CSV)
- **Use custom endpoints** to query any SPARQL server
- **Handle errors** gracefully with `SPARQLError` exceptions
- **Build complex queries** with filters, sorting, and joins

### Next Steps

- Explore the [Wikidata Query Service](https://query.wikidata.org/) to write your own queries
- Check the [SPARQL documentation](../docs/sparql_quick_reference.md) for query syntax and examples
- See other example notebooks for authentication, item creation, and sitelinks

### Tips

- Start simple and build complexity gradually
- Use the Wikidata UI to test queries before using them in code
- Remember to use `LIMIT` on large queries to keep response times reasonable
- Share queries via Wikidata URLs for reproducibility