# GeoQA Quick Start Guide

This notebook demonstrates the core features of **GeoQA** — a Python package for
geospatial data quality assessment and interactive profiling.

## What you'll learn:
1. Profile a dataset with a single line of code
2. View quality scores and check results
3. Create interactive maps
4. Generate HTML reports

In [None]:
# Install geoqa (uncomment if needed)
# !pip install geoqa

import geoqa
print(f"GeoQA version: {geoqa.__version__}")

## 1. One-Liner Dataset Profiling

Profile any vector dataset with a single function call. GeoQA reads
Shapefile, GeoJSON, GeoPackage, and any other format that GeoPandas can open.

In [None]:
# Profile a shapefile
profile = geoqa.profile(r"../../data/giza_buildings.shp")

# View the summary
profile.summary()

## 2. Quality Score

GeoQA computes an overall quality score (0-100) based on:
- Geometry validity (40%)
- Attribute completeness (30%)
- CRS presence (15%)
- No empty geometries (15%)
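
To make the weighting concrete, here is a minimal sketch of how four component scores could combine into a single 0-100 value. It assumes each component is expressed as a fraction between 0.0 and 1.0 and that the weights are applied as a plain weighted sum; the guide does not spell out GeoQA's internal formula, so `WEIGHTS`, `weighted_quality_score`, and the example values are hypothetical names and numbers used only for illustration.

```python
# Weights taken from the list above. The 0.0-1.0 component fractions and the
# linear combination are assumptions, not GeoQA's documented implementation.
WEIGHTS = {
    "geometry_validity": 0.40,
    "attribute_completeness": 0.30,
    "crs_presence": 0.15,
    "no_empty_geometries": 0.15,
}


def weighted_quality_score(components: dict[str, float]) -> float:
    """Combine component fractions (0.0-1.0) into a 0-100 score."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return 100.0 * sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical dataset: 95% valid geometries, 80% attribute completeness,
# a CRS is defined, and no geometries are empty.
example = {
    "geometry_validity": 0.95,
    "attribute_completeness": 0.80,
    "crs_presence": 1.0,
    "no_empty_geometries": 1.0,
}
score = weighted_quality_score(example)
print(f"Quality Score: {score:.1f}/100")
```

Under these assumptions, geometry validity dominates the result: losing 5% of valid geometries costs 2 points, while losing 5% of attribute completeness costs 1.5.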

In [None]:
print(f"Quality Score: {profile.quality_score:.1f}/100")
print(f"Features: {profile.feature_count}")
print(f"Geometry Type: {profile.geometry_type}")
print(f"CRS: {profile.crs}")

## 3. Quality Checks

View detailed quality check results as a structured DataFrame.

In [None]:
checks = profile.quality_checks()
checks

## 4. Interactive Map

Visualize the dataset on an interactive Folium map, with features that have quality issues highlighted.

In [None]:
# Create interactive map with issue highlighting
m = profile.show_map(highlight_issues=True)
m

## 5. Attribute Statistics

Explore attribute statistics for all columns.

In [None]:
# All attribute statistics
stats = profile.attribute_stats()
stats

## 6. Geometry Statistics

View per-feature geometry measurements.

In [None]:
geom_stats = profile.geometry_stats()
geom_stats.head(10)

## 7. Generate HTML Report

Create a comprehensive, self-contained HTML quality report.

In [None]:
# Generate report
report_path = profile.to_html("giza_buildings_report.html")
print(f"Report saved to: {report_path}")

## 8. Profile Multiple Datasets

Compare quality across different datasets.

In [None]:
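import os

import pandas as pd

data_dir = r"../../data"
# Collect every shapefile in the data directory, sorted for a stable row order
shapefiles = sorted(f for f in os.listdir(data_dir) if f.lower().endswith(".shp"))

results = []
for shp in shapefiles:
    try:
        p = geoqa.profile(os.path.join(data_dir, shp))
        results.append({
            "Dataset": p.name,
            "Features": p.feature_count,
            "Type": p.geometry_type,
            "Quality Score": f"{p.quality_score:.1f}",
        })
    except Exception as e:
        # Record the failure as a row instead of aborting the whole comparison
        results.append({"Dataset": shp, "Error": str(e)})

# One row per dataset; failed datasets carry only the "Error" column
pd.DataFrame(results)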
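
The loop above collects only shapefiles. Since section 1 lists other supported formats, the sketch below shows one way to gather several vector formats from a directory using only the standard library. The helper name `find_vector_files` and the extension set are illustrative assumptions, not part of GeoQA; the temporary directory stands in for a real data folder.

```python
import tempfile
from pathlib import Path

# Extensions to collect; this set is an illustrative assumption.
VECTOR_EXTENSIONS = {".shp", ".geojson", ".gpkg"}


def find_vector_files(data_dir):
    """Return sorted paths in data_dir whose suffix is a known vector extension."""
    return sorted(
        p for p in Path(data_dir).iterdir()
        if p.is_file() and p.suffix.lower() in VECTOR_EXTENSIONS
    )


# Demonstrate on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("roads.SHP", "parcels.geojson", "notes.txt", "roads.dbf"):
        (Path(tmp) / name).touch()
    found = [p.name for p in find_vector_files(tmp)]

print(found)
```

Each returned path could then be passed to `geoqa.profile(...)` in place of the `os.path.join(data_dir, shp)` call. Matching on the lowercased suffix picks up files such as `roads.SHP`, while shapefile sidecars like `.dbf` are skipped.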

---

**GeoQA** — Geospatial Data Quality Assessment & Interactive Profiling

Learn more: [GitHub](https://github.com/geoqa/geoqa) | [Documentation](https://geoqa.readthedocs.io)