# GeoCleanr Demo

This notebook exercises validation, fixing, reporting, and visualization on sample coordinate data.


## Setup

If GeoCleanr is installed in this Jupyter environment, you can skip this step.
Otherwise, the cell below adds the local `src/` directory to `sys.path` so imports resolve.


In [9]:
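from pathlib import Path
import sys

# Assumes the notebook runs from a directory one level below the project root.
project_root = Path("..").resolve()
src_path = project_root / "src"
# Prepend src/ only once so re-running the cell does not add duplicate entries.
if str(src_path) not in sys.path:
    sys.path.insert(0, str(src_path))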


## Imports


In [10]:
from geocleanr import GeometryValidator, CoordinateFixer, ReportBuilder, AsciiHeatmap


## Sample data

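The records below contain common issues: missing values (`None`), out-of-range values, swapped axes, and coordinates stored as strings, either with a hemisphere suffix (`"34.05N"`) or with a decimal comma (`"48,8566"`).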


In [14]:
rows = [
    {"id": 1, "lat": 40.7128, "lon": -74.0060},
    {"id": 2, "lat": None, "lon": -73.9352},
    {"id": 3, "lat": 91.0, "lon": 10.0},
    {"id": 4, "lat": 12.34, "lon": -190.0},
    {"id": 5, "lat": -33.8688, "lon": 151.2093},
    {"id": 6, "lat": "34.05N", "lon": "118.25W"},
    {"id": 7, "lat": 120.0, "lon": 45.0},
    {"id": 8, "lat": "48,8566", "lon": "2,3522"},
    {"id": 9, "lat": None, "lon": None},
]


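## Validate

In the output below, `index` is the zero-based position of the record in `rows`, not its `id` (so `index=1` is the record with `id` 2). A single record can produce more than one issue.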


In [15]:
validator = GeometryValidator()
issues = validator.validate(rows)
print(f"Total issues: {len(issues)}")
for issue in issues:
    print(issue)


Total issues: 9
ValidationIssue(index=1, field='coordinates', message='Missing latitude or longitude')
ValidationIssue(index=2, field='lat', message='Latitude outside bounds')
ValidationIssue(index=3, field='lon', message='Longitude outside bounds')
ValidationIssue(index=5, field='lat', message='Latitude outside bounds')
ValidationIssue(index=5, field='lon', message='Longitude outside bounds')
ValidationIssue(index=6, field='lat', message='Latitude outside bounds')
ValidationIssue(index=7, field='lat', message='Latitude outside bounds')
ValidationIssue(index=7, field='lon', message='Longitude outside bounds')
ValidationIssue(index=8, field='coordinates', message='Missing latitude or longitude')
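
For readers who want to see what a range check of this kind involves, here is a minimal standalone sketch. It is not GeoCleanr's implementation: the function name `check_bounds` and the tuple return shape are made up for illustration, and it assumes numeric (or `None`) values only.

```python
def check_bounds(rows):
    """Return (index, field, message) tuples for missing or out-of-range values."""
    found = []
    for index, row in enumerate(rows):
        lat, lon = row.get("lat"), row.get("lon")
        if lat is None or lon is None:
            found.append((index, "coordinates", "Missing latitude or longitude"))
            continue
        # Valid latitude is [-90, 90]; valid longitude is [-180, 180].
        if not -90.0 <= lat <= 90.0:
            found.append((index, "lat", "Latitude outside bounds"))
        if not -180.0 <= lon <= 180.0:
            found.append((index, "lon", "Longitude outside bounds"))
    return found


sample = [
    {"id": 1, "lat": 40.7128, "lon": -74.0060},
    {"id": 2, "lat": None, "lon": -73.9352},
    {"id": 3, "lat": 91.0, "lon": 10.0},
    {"id": 4, "lat": 12.34, "lon": -190.0},
]
for item in check_bounds(sample):
    print(item)
```

String values such as `"34.05N"` would raise a `TypeError` in this sketch; a real validator has to decide how to report them.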


## Build a report


In [16]:
builder = ReportBuilder(sample_size=5)
summary = builder.build_summary(issues)
report_md = builder.to_markdown(summary)
print(report_md)


# GeoCleanr Validation Report

**Total issues detected:** 9

## Issues by Field
- **coordinates**: 2 issue(s)
- **lat**: 4 issue(s)
- **lon**: 3 issue(s)

## Issues by Message
- `Latitude outside bounds` — 4 occurrence(s)
- `Longitude outside bounds` — 3 occurrence(s)
- `Missing latitude or longitude` — 2 occurrence(s)

## Sample Issues (first 5)
- Row #1 [coordinates]: Missing latitude or longitude
- Row #2 [lat]: Latitude outside bounds
- Row #3 [lon]: Longitude outside bounds
- Row #5 [lat]: Latitude outside bounds
- Row #5 [lon]: Longitude outside bounds
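
The per-field and per-message counts in the report can be cross-checked with `collections.Counter`. The sketch below uses plain tuples transcribed from the validation output above rather than the library's issue objects, so it makes no assumption about their attributes.

```python
from collections import Counter

# (index, field, message) tuples transcribed from the validation output.
issue_tuples = [
    (1, "coordinates", "Missing latitude or longitude"),
    (2, "lat", "Latitude outside bounds"),
    (3, "lon", "Longitude outside bounds"),
    (5, "lat", "Latitude outside bounds"),
    (5, "lon", "Longitude outside bounds"),
    (6, "lat", "Latitude outside bounds"),
    (7, "lat", "Latitude outside bounds"),
    (7, "lon", "Longitude outside bounds"),
    (8, "coordinates", "Missing latitude or longitude"),
]

by_field = Counter(field for _, field, _ in issue_tuples)
by_message = Counter(message for _, _, message in issue_tuples)

for field, count in sorted(by_field.items()):
    print(f"{field}: {count}")
for message, count in sorted(by_message.items()):
    print(f"{message}: {count}")
```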


## Fix records


In [17]:
fixer = CoordinateFixer()
fixed_results = fixer.fix_all(rows)
fixed_rows = [result.record for result in fixed_results]

for original, result in zip(rows, fixed_results):
    if result.fixes:
        print(f"id={original.get('id')} fixes={result.fixes}")


id=2 fixes=['lat_filled']
id=3 fixes=['swapped_axes']
id=4 fixes=['lon_wrapped']
id=6 fixes=['coerced_from_string', 'coerced_from_string']
id=7 fixes=['swapped_axes']
id=8 fixes=['coerced_from_string', 'coerced_from_string']
id=9 fixes=['filled_missing']
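
To make two of these fix types concrete, the sketch below shows one common way to wrap a longitude into range and to coerce coordinate strings into floats. Both helpers (`wrap_longitude`, `parse_coordinate`) are hypothetical names written for this illustration; how `CoordinateFixer` actually performs these steps is not shown in this notebook.

```python
import re


def wrap_longitude(lon):
    """Wrap any longitude into the half-open interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def parse_coordinate(text):
    """Parse strings such as '34.05N', '118.25W', or '48,8566' into a signed float."""
    # Treat a comma as a decimal separator (assumes no thousands separators).
    cleaned = text.strip().upper().replace(",", ".")
    match = re.fullmatch(r"([+-]?\d+(?:\.\d+)?)\s*([NSEW]?)", cleaned)
    if match is None:
        raise ValueError(f"Unrecognized coordinate: {text!r}")
    value = float(match.group(1))
    # South and West are negative by convention.
    if match.group(2) in ("S", "W"):
        value = -value
    return value


print(wrap_longitude(-190.0))       # 170.0
print(parse_coordinate("34.05N"))   # 34.05
print(parse_coordinate("118.25W"))  # -118.25
print(parse_coordinate("48,8566"))  # 48.8566
```

Note that this wrapping maps a longitude of exactly 180 to -180; both describe the same meridian.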


## Validate again after fixes


In [18]:
post_issues = validator.validate(fixed_rows)
print(f"Issues after fixes: {len(post_issues)}")
if post_issues:
    for issue in post_issues:
        print(issue)


Issues after fixes: 0


## ASCII heatmap


In [19]:
heatmap = AsciiHeatmap(rows=6, cols=12)
print(heatmap.render(fixed_rows))


............
..**..*...*.
.........*.*
...*..*.....
...........*
............

Legend: . = none, - = low, + = medium, * = high density
Lat range: [-90.0, 90.0], Lon range: [-180.0, 180.0]
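
A grid like this can be produced by binning each point into a cell over the stated latitude and longitude ranges. The following is a minimal sketch of that idea, not the `AsciiHeatmap` implementation: `bin_points` is a hypothetical helper, it assumes all inputs are already within bounds, and it places north at the top.

```python
def bin_points(points, rows=6, cols=12):
    """Count (lat, lon) points per cell on a rows x cols grid covering the globe."""
    grid = [[0] * cols for _ in range(rows)]
    for lat, lon in points:
        # Row 0 is the northernmost band; column 0 starts at longitude -180.
        # min() keeps the edge values lat=-90 and lon=180 inside the grid.
        r = min(int((90.0 - lat) / 180.0 * rows), rows - 1)
        c = min(int((lon + 180.0) / 360.0 * cols), cols - 1)
        grid[r][c] += 1
    return grid


points = [(40.7128, -74.0060), (-33.8688, 151.2093)]
grid = bin_points(points)
for line in grid:
    print("".join("*" if count else "." for count in line))
```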


## Next steps

- Replace the sample records with your data.
- If your columns are not named `lat` and `lon`, pass `lat_field` and `lon_field`.
- To load a CSV file into `rows`, you can use the standard library:

```python
import csv

# DictReader yields every value as a string; empty cells become "".
with open("your.csv", newline="") as f:
    rows = list(csv.DictReader(f))
```
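
Because `csv.DictReader` returns strings, you may want to convert values and rename columns while loading. The sketch below does this in plain Python; the column names `latitude` and `longitude`, the in-memory CSV text, and the helper `to_float_or_none` are all assumptions made for illustration.

```python
import csv
import io

# Stand-in for a real file; the column names here are hypothetical.
csv_text = "id,latitude,longitude\n1,40.7128,-74.0060\n2,,-73.9352\n"


def to_float_or_none(text):
    """Convert a CSV cell to float, treating blank cells as missing."""
    text = text.strip()
    return float(text) if text else None


rows = [
    {
        "id": int(rec["id"]),
        "lat": to_float_or_none(rec["latitude"]),
        "lon": to_float_or_none(rec["longitude"]),
    }
    for rec in csv.DictReader(io.StringIO(csv_text))
]
print(rows)
```

This conversion raises `ValueError` on strings such as `"34.05N"`, so leave those cells as strings if you want the fixer to handle them.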
