Repository to convert geojson datasets from images to yolo labels

xoryouyou/geojson-to-yolo

Radagast

Question: How many trees are there?

Side quest: How many cars are there?

Goal: Automate the counting of trees.

How: Build machine-learning infrastructure from satellite images, with tree-crown images extracted based on open-data tree registers.

Idea

  • Get a set of information about each tree: its kind, its age, and where it stands

  • Fetch & prepare satellite data

    • sentinel account & info
    • sentinelsat python
    • download tiles containing berlin T32UVU,T32UUU
    • merge R,G,B channels
    • crop berlin border
  • filter tree data

    • remove trees with missing info (null fields)
    • group by genus (pivot table)
    • remove genera with too few samples (< 1000)
  • loop over tree data

    • extract a 64x64-pixel patch around the tree's lat/lon
    • write it to a file named genus_lat_lon_circumference.jpg
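
The last two steps can be sketched in plain Python. This is only an illustration: the helper names `patch_box` and `patch_filename` are hypothetical, the tree position is assumed to be already converted from lat/lon to pixel coordinates, and shifting the window inward at the image border is one possible choice, not necessarily what this repository does.

```python
def patch_box(center_x, center_y, img_width, img_height, size=64):
    """Return (left, top, right, bottom) of a size x size window around a pixel.

    The window is shifted inward when it would cross the image border, so the
    patch always has the full size (assumes the image is at least size x size).
    """
    half = size // 2
    left = min(max(center_x - half, 0), img_width - size)
    top = min(max(center_y - half, 0), img_height - size)
    return left, top, left + size, top + size


def patch_filename(genus, lat, lon, circumference):
    """Build a genus_lat_lon_circumference.jpg file name (6 decimals assumed)."""
    return f"{genus}_{lat:.6f}_{lon:.6f}_{circumference}.jpg"
```

The returned box uses the (left, top, right, bottom) convention that most image libraries accept for cropping.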

Get the Data

Satellite Images

From: daten.berlin.de

Files at https://fbinter.stadt-berlin.de/fb/berlin/service_intern.jsp?id=a_luftbild2019_rgb@senstadt&type=FEED

  • CRS: EPSG:25833
  • Format: zipped ECW
  • License: The data is licensed under Datenlizenz Deutschland - Namensnennung - Version 2.0, available at https://www.govdata.de/dl-de/by-2-0. The source attribution required by (2) of the license is "Geoportal Berlin / [title of the dataset]".

Just throw the URLs below into aria2c or wget:

https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/Mitte.zip
https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/Nord.zip
https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/Nordost.zip
https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/Nordwest.zip
https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/Ost.zip
https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/Sued.zip
https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/Suedost.zip
https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/Suedwest.zip
https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019/West.zip
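
If you prefer to generate the list rather than copy it, a minimal sketch follows. The region names and base URL are taken from the list above; the output file name `urls.txt` and the function names are assumptions.

```python
BASE_URL = "https://fbinter.stadt-berlin.de/fb/atom/DOP/dop20rgb_2019"
REGIONS = [
    "Mitte", "Nord", "Nordost", "Nordwest", "Ost",
    "Sued", "Suedost", "Suedwest", "West",
]


def build_urls(base_url=BASE_URL, regions=REGIONS):
    """Return one .zip download URL per region."""
    return [f"{base_url}/{region}.zip" for region in regions]


def write_url_list(path="urls.txt"):
    """Write the URLs to a text file, one per line (file name is an assumption)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(build_urls()) + "\n")
```

The resulting file can be passed to either downloader with `aria2c -i urls.txt` or `wget -i urls.txt`.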

Get the Cars geojson

Ask Hans Hack (@hnshck) or see http://hanshack.com/

Convert

  • Unpack all .zip files into raw/ecw/

  • Run create_conversion_jobs.sh >> jobs.txt to create a list of conversion jobs

  • Run cat jobs.txt | parallel -j 4 to execute the jobs created above, four at a time

  • (optional) Run convert_to_png.sh to get .png files from the .ecw files

Prepare Dataset

Setup Virtual env

  1. virtualenv env
  2. pip install -r requirements.txt

Add UUIDv4 to cars for indexing

To add a UUIDv4 to each car in the dataset for later addressing, run python src/add_uuid_to_dataset.py
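
For readers who want to see what such a step amounts to, here is a minimal sketch, not the repository's script. It assumes the cars are a GeoJSON FeatureCollection already loaded as a dict and that the ID is stored under a property named `uuid`; both the function name and the property key are assumptions.

```python
import uuid


def add_uuids(feature_collection, key="uuid"):
    """Give every feature a UUIDv4 under properties[key] (key is an assumption).

    Features that already carry the key keep their existing value.
    """
    for feature in feature_collection["features"]:
        # GeoJSON allows "properties": null, so fall back to an empty dict.
        properties = feature.get("properties") or {}
        properties.setdefault(key, str(uuid.uuid4()))
        feature["properties"] = properties
    return feature_collection
```

Keeping an existing ID means the step can be re-run without breaking references created earlier.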

Extract BBox from tile

To get an EPSG:4326 bounding box for each tile, run python src/extract_bbox_for_tile.py

Result:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              13.2358212259,
              52.4090536323
            ],
            [
              13.2365380842,
              52.3910814688
            ],
            [
              13.2659143789,
              52.3915161631
            ],
            [
              13.2652094476,
              52.409488607
            ],
            [
              13.2358212259,
              52.4090536323
            ]
          ]
        ]
      },
      "properties": {
        "image": "dop20rgb_380_5806_2_be_2019.tif",
        "tile": "380-5806"
      }
    ...

Get Mapping of Cars to Tiles

Run python src/get_cars_in_tile.py, which creates a JSON mapping from each image file to the UUIDv4s of the cars it contains:

{
  "dop20rgb_380_5806_2_be_2019.tif": [
    "ac4bd529-128f-4fef-b4c1-74fd0054d4c7",
    "8399861b-0ad1-481c-a0c1-eb8d9731482c",
    ...
  ],
  "dop20rgb_390_5818_2_be_2019.tif": [
    "4dcc247d-6a94-43fd-9b8e-481ad767daa1",
    "586a7893-1451-4e72-b28f-df8980400a0d",
    ...
  ]
}
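
The mapping can be illustrated with a small sketch. It is an approximation under stated assumptions: each tile is treated as the axis-aligned lon/lat bounds of its polygon (the real tile outlines are slightly rotated in EPSG:4326), car geometries are assumed to be Points or Polygons, and the ID is assumed to live in a `uuid` property. The function names are hypothetical.

```python
def ring_bounds(ring):
    """Return (min_lon, min_lat, max_lon, max_lat) of a coordinate ring."""
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return min(lons), min(lats), max(lons), max(lats)


def car_position(geometry):
    """Representative lon/lat: the point itself or the centre of a polygon's bounds."""
    if geometry["type"] == "Point":
        lon, lat = geometry["coordinates"][:2]
        return lon, lat
    min_lon, min_lat, max_lon, max_lat = ring_bounds(geometry["coordinates"][0])
    return (min_lon + max_lon) / 2, (min_lat + max_lat) / 2


def map_cars_to_tiles(tiles, cars, key="uuid"):
    """Map each tile's image name to the IDs of the cars inside its bounds."""
    mapping = {}
    for tile in tiles["features"]:
        min_lon, min_lat, max_lon, max_lat = ring_bounds(
            tile["geometry"]["coordinates"][0]
        )
        ids = []
        for car in cars["features"]:
            lon, lat = car_position(car["geometry"])
            if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                ids.append(car["properties"][key])
        mapping[tile["properties"]["image"]] = ids
    return mapping
```

A car that sits exactly on a shared tile edge ends up in both tiles with this test; an exact point-in-polygon check would be needed to avoid that.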

Create Labels

Now that we have the tile->car mapping and a bounding box for each tile, we can run python src/create_labels.py.

This creates a folder of .txt files, each containing YOLO-format lines of the form <class> <center_x> <center_y> <width> <height>, with all four values normalized to the image size:

0 0.64485 0.83785 0.0029 0.0027
0 0.64675 0.84055 0.0029 0.0027
0 0.64705 0.846 0.0029 0.0028
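
As an illustration of how such a line can be derived, the sketch below normalizes a car's bounds against its tile's bounds. It is not the repository's create_labels.py: it assumes both boxes are given as (min_lon, min_lat, max_lon, max_lat), treats the tile as axis-aligned in lon/lat, and uses the hypothetical name `yolo_line`.

```python
def yolo_line(car_bounds, tile_bounds, class_id=0):
    """Format one YOLO label line from lon/lat bounds.

    Both arguments are (min_lon, min_lat, max_lon, max_lat). The y axis is
    flipped because image rows grow downward while latitude grows upward.
    """
    c_min_lon, c_min_lat, c_max_lon, c_max_lat = car_bounds
    t_min_lon, t_min_lat, t_max_lon, t_max_lat = tile_bounds
    tile_width = t_max_lon - t_min_lon
    tile_height = t_max_lat - t_min_lat

    center_x = ((c_min_lon + c_max_lon) / 2 - t_min_lon) / tile_width
    center_y = (t_max_lat - (c_min_lat + c_max_lat) / 2) / tile_height
    width = (c_max_lon - c_min_lon) / tile_width
    height = (c_max_lat - c_min_lat) / tile_height
    return f"{class_id} {center_x:.5f} {center_y:.5f} {width:.5f} {height:.5f}"
```

For example, `yolo_line((2, 7, 4, 9), (0, 0, 10, 10))` returns `"0 0.30000 0.20000 0.20000 0.20000"`: the car sits near the top of the tile, so its `center_y` is small.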

Happy training!

Optional Tree Stuff

Open data provided by Stadtentwicklung Berlin.

License: Open Data Nutzungsbedingungen NutzIII der Stadtentwicklung Berlin

https://opendata-esri-de.opendata.arcgis.com/datasets/05c3f9d7dea6422b86e30967811bddd7_0
https://opendata.arcgis.com/datasets/05c3f9d7dea6422b86e30967811bddd7_0.geojson

Tree count: 565363

Features:

  • "FID": 1,
  • "StandortNr": "027",
  • "Kennzeich": "05601-Str",
  • "NameNr": "GANDENITZER WEG",
  • "Art_Dtsch": "JAPANISCHE ZIERKIRSCHE 'KANZAN'",
  • "Art_Bot": "PRUNUS SERRULATA 'KANZAN'",
  • "GATTUNG_DEUTSCH": null,
  • "Gattung": "PRUNUS",
  • "Pflanzjahr": "1970",
  • "Standalter": 47.0,
  • "KroneDurch": null,
  • "Stammumfg": 63,
  • "BaumHoehe": 10.0,
  • "BEZIRK": "Reinickendorf",
  • "Eigentuemer": "Land Berlin",
  • "ALK_Nr4st": 27.0,
  • "StrName": "Gandenitzer Weg",
  • "HausNr": null,
  • "Zusatz": null,
  • "Kategorie": "Straßenbaum",
  • "ORIG_FID": 1,
  • "geometry": { "type": "Point", "coordinates": [ 13.364091607000034, 52.59961781100003 ] }
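
The filtering steps from the Idea section (drop trees with null fields, group by genus, drop rare genera) could look roughly like this on the features above. This is a sketch only: which fields count as required is an assumption (here `Gattung` and `Stammumfg`), and `filter_trees` is a hypothetical name.

```python
from collections import Counter


def filter_trees(features, required=("Gattung", "Stammumfg"), min_samples=1000):
    """Drop incomplete trees, then drop genera with fewer than min_samples trees.

    Returns the kept features and the per-genus counts of the complete trees.
    """
    # Keep only trees where every required field is present and not null.
    complete = [
        feature for feature in features
        if all(feature["properties"].get(name) is not None for name in required)
    ]
    # Count trees per genus (the "pivot table" step).
    counts = Counter(feature["properties"]["Gattung"] for feature in complete)
    kept = [
        feature for feature in complete
        if counts[feature["properties"]["Gattung"]] >= min_samples
    ]
    return kept, counts
```

Returning the counts alongside the filtered list makes it easy to inspect which genera were dropped.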
