# Reading Vector Data with Fiona

https://fiona.readthedocs.io/en/stable/manual.html

https://github.com/Toblerity/Fiona


Fiona streams simple feature data to and from GIS formats like GeoPackage and Shapefile.

Fiona can read and write real-world data using multi-layered GIS formats, zipped and in-memory virtual file systems, from files on your hard drive or in cloud storage. This project includes Python modules and a command line interface (CLI).

Fiona depends on GDAL but is different from GDAL's own bindings. Fiona is designed to be highly productive and to make it easy to write code which is easy to read.

## Vectors: Types of Geospatial Data

The Open Geospatial Consortium (OGC) Simple Feature Access Standard defines a common set of geometries for representing two-dimensional geospatial features in vector data. These geometries are widely used in GIS software and data formats to describe the spatial characteristics of features like points, lines, and polygons. The standard specifies the following geometry types:
    
![Geometry Types](artwork/geomtype.png)    
    
### Point

A single location in space, defined by a pair of coordinates (x, y). Points can represent features like trees, streetlights, or the centroids of more complex shapes.

### LineString

A sequence of connected points, forming a line. LineStrings are used to represent linear features like roads, rivers, and paths. LineStrings can be open or closed; a LineString that is closed (its first and last points coincide) and does not cross itself forms a linear ring.

### Polygon

A closed, two-dimensional shape defined by one or more linear rings. The first ring represents the outer boundary of the polygon, while additional rings, if present, represent holes or interior boundaries. Polygons are used to represent areas like buildings, lakes, or administrative boundaries.

### MultiPoint

A collection of one or more Point geometries, which can be used to represent a group of discrete features that share the same attributes.

### MultiLineString

A collection of one or more LineString geometries, which can represent a group of linear features that share the same attributes. Examples include a network of roads or a group of disconnected rivers.

### MultiPolygon

A collection of one or more Polygon geometries, which can represent a group of area features that share the same attributes. Examples include a group of islands or a set of non-contiguous administrative regions.

### GeometryCollection

A collection of any combination of the geometry types listed above. This allows for more complex representations of features that have multiple, distinct geometries with different types.
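
Fiona hands geometries to Python as GeoJSON-like mappings with a `type` and, for everything except GeometryCollection, a `coordinates` entry. The following sketch writes out one such mapping per geometry type as a plain dictionary. All coordinates are made up for illustration, and the helper `is_closed` is a hypothetical name, not part of Fiona.

```python
# GeoJSON-like mappings for the seven Simple Features geometry types.
# All coordinates are invented (x, y) pairs, chosen only for illustration.
geometries = {
    "Point": {"type": "Point", "coordinates": (8.5, 47.4)},
    "LineString": {"type": "LineString", "coordinates": [(0, 0), (1, 1), (2, 1)]},
    "Polygon": {
        "type": "Polygon",
        # first ring = outer boundary, second ring = hole
        "coordinates": [
            [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)],
            [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)],
        ],
    },
    "MultiPoint": {"type": "MultiPoint", "coordinates": [(0, 0), (1, 1)]},
    "MultiLineString": {
        "type": "MultiLineString",
        "coordinates": [[(0, 0), (1, 1)], [(5, 5), (6, 6)]],
    },
    "MultiPolygon": {
        "type": "MultiPolygon",
        # a list of polygons, each of which is a list of rings
        "coordinates": [
            [[(0, 0), (1, 0), (1, 1), (0, 0)]],
            [[(5, 5), (6, 5), (6, 6), (5, 5)]],
        ],
    },
    "GeometryCollection": {
        "type": "GeometryCollection",
        # holds complete geometries instead of bare coordinates
        "geometries": [
            {"type": "Point", "coordinates": (0, 0)},
            {"type": "LineString", "coordinates": [(0, 0), (1, 1)]},
        ],
    },
}


def is_closed(ring):
    """A ring is closed when its first and last coordinates coincide."""
    return ring[0] == ring[-1]


for name, geom in geometries.items():
    print(name, "->", sorted(geom))

polygon_rings = geometries["Polygon"]["coordinates"]
print("all polygon rings closed:", all(is_closed(r) for r in polygon_rings))
```

Note how the nesting depth of `coordinates` grows from Point (one pair) to MultiPolygon (polygons, then rings, then pairs).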


## Popular Vector File Formats

Vector formats are used to represent geospatial data as points, lines, and polygons, which are composed of coordinates defining their shapes and positions. Vector data is distinct from raster data, which represents geospatial information as a grid of pixels or cells, each having a specific value. Vector formats are particularly suited for representing discrete features like roads, buildings, rivers, or administrative boundaries.

Here is an **incomplete** list of some popular vector data formats:

### Shapefile (.shp)

Developed by Esri, this is a widely used and supported format for storing vector data. A Shapefile consists of several files that separately hold the geometry (.shp), the attribute data (.dbf), and the projection information (.prj).

Avoid Shapefiles where you can: http://switchfromshapefile.org/ explains the format's limitations and lists alternatives.

### GeoJSON (.geojson)

A lightweight, text-based format that uses the JSON (JavaScript Object Notation) standard to represent geographic features and their properties. It is especially popular for web-based mapping and geospatial applications.

### KML (.kml) and KMZ (.kmz)

Keyhole Markup Language (KML) is an XML-based format originally developed for Google Earth. It is used to display geographic data in Earth browsers and other compatible applications. KMZ is a compressed version of KML.

### MicroStation DGN (.dgn)

A CAD-based vector format developed by Bentley Systems for their MicroStation software.

### Esri File Geodatabase (.gdb) 

A proprietary data format developed by Esri for storing and managing geospatial data, including vector features, raster data, and attribute tables. File geodatabases offer efficient storage, spatial indexing, and compression options, making them suitable for large datasets and complex applications.

### OGC GeoPackage (.gpkg)

The GeoPackage format is an open, standards-based, platform-independent, and portable data format for geospatial information. Developed by the Open Geospatial Consortium (OGC), it is designed to facilitate the storage, sharing, and management of geospatial data, including vector features, tile matrix sets of imagery and raster maps at various scales, and non-spatial attribute tables.

GeoPackage files use the SQLite database format, with the file extension ".gpkg". SQLite is a widely used, lightweight, and self-contained database engine, which makes GeoPackage files easily portable and accessible across different devices, platforms, and programming languages.

( More info: https://www.geopackage.org/ )
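
Because a GeoPackage is an SQLite database, Python's built-in `sqlite3` module can look inside one without any GIS library. The GeoPackage standard requires a `gpkg_contents` table that lists every data table together with its `data_type` (for example `features` or `tiles`). The sketch below queries that table. To stay self-contained it builds a tiny stand-in database first; that stand-in is not a valid GeoPackage, and the table names `airports` and `hillshade` as well as the helper `list_gpkg_contents` are made up for this example.

```python
import sqlite3
import tempfile
from pathlib import Path


def list_gpkg_contents(path):
    """Return (table_name, data_type) rows from the gpkg_contents table."""
    con = sqlite3.connect(path)
    try:
        return con.execute(
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        ).fetchall()
    finally:
        con.close()


# Build a stand-in database that only mimics the two columns queried above.
# A real GeoPackage has more columns and several other mandatory tables.
demo_path = Path(tempfile.mkdtemp()) / "demo.gpkg"
con = sqlite3.connect(demo_path)
con.execute(
    "CREATE TABLE gpkg_contents (table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL)"
)
con.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?)",
    [("airports", "features"), ("hillshade", "tiles")],
)
con.commit()
con.close()

print(list_gpkg_contents(demo_path))
```

Pointing `list_gpkg_contents` at a real `.gpkg` file should list its layers in the same way, though reading geometries still calls for a library such as Fiona.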


### SpatiaLite (.sqlite)

An extension to the SQLite database format that adds support for geospatial data types and operations. SpatiaLite is similar to the GeoPackage format but follows a different specification.





## Download Sample Vector Data

Let's download some data that we will need later, in the **GeoPackage** format.

**Natural Earth** is a public domain map dataset available at 1:10m, 1:50m, and 1:110 million scales. Featuring tightly integrated vector and raster data, with Natural Earth you can make a variety of visually pleasing, well-crafted maps with cartography or GIS software. More info: https://www.naturalearthdata.com/

Now we download the vector data:

In [2]:
from download import download, unzip

In [3]:
url = "http://naciscdn.org/naturalearth/packages/natural_earth_vector.gpkg.zip"
#url = "https://www.geopython.xyz/geodata/naturalearth/natural_earth_vector.gpkg.zip" # backup-url

download(url, "geodata/ne.gpkg.zip")
unzip("geodata/ne.gpkg.zip", "geodata")

Downloading geodata/ne.gpkg.zip from http://naciscdn.org/naturalearth/packages/natural_earth_vector.gpkg.zip
100% done 	[****************************************************************************************************]
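
The `download` module used above is not part of the Python standard library. If it is not available, the same two steps can be approximated with the standard library alone. This is a minimal sketch: `fetch` and `extract` are hypothetical helper names, there is no progress bar, and no retry or checksum handling.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def fetch(url, target):
    """Stream the resource at `url` into the file `target`."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)  # create e.g. geodata/
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)  # copy in chunks, not all at once
    return target


def extract(zip_path, out_dir):
    """Extract every member of `zip_path` into `out_dir`; return member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()
```

With these helpers, the cell above would become `fetch(url, "geodata/ne.gpkg.zip")` followed by `extract("geodata/ne.gpkg.zip", "geodata")`.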


## Reading a Vector File

In [4]:
import fiona

List supported vector file formats:

In [5]:
fiona.supported_drivers

{'DXF': 'rw',
 'CSV': 'raw',
 'OpenFileGDB': 'raw',
 'ESRIJSON': 'r',
 'ESRI Shapefile': 'raw',
 'FlatGeobuf': 'raw',
 'GeoJSON': 'raw',
 'GeoJSONSeq': 'raw',
 'GPKG': 'raw',
 'GML': 'rw',
 'OGR_GMT': 'rw',
 'GPX': 'rw',
 'Idrisi': 'r',
 'MapInfo File': 'raw',
 'DGN': 'raw',
 'PCIDSK': 'raw',
 'OGR_PDS': 'r',
 'S57': 'r',
 'SQLite': 'raw',
 'TopoJSON': 'r'}

In [6]:
if 'GPKG' in fiona.supported_drivers:
    print("Yay!! GeoPackage is supported...")

Yay!! GeoPackage is supported...
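
The values in `fiona.supported_drivers` are mode flags: `r` for read, `a` for append, and `w` for write. A driver marked `raw` therefore supports all three, while `r` alone means read-only. The sketch below groups drivers by capability. It works on a hand-copied subset of the output above so that it runs without Fiona; `drivers_with` is a hypothetical helper name.

```python
# A few entries copied from the fiona.supported_drivers output above.
drivers = {
    "ESRIJSON": "r",
    "GML": "rw",
    "GPKG": "raw",
    "GeoJSON": "raw",
    "TopoJSON": "r",
}


def drivers_with(mode, table):
    """Return the sorted driver names whose flag string contains `mode`."""
    return sorted(name for name, flags in table.items() if mode in flags)


writable = drivers_with("w", drivers)
appendable = drivers_with("a", drivers)
read_only = sorted(name for name, flags in drivers.items() if flags == "r")

print("writable:  ", writable)
print("appendable:", appendable)
print("read-only: ", read_only)
```

Passing `fiona.supported_drivers` instead of the hand-copied `drivers` dictionary should give the full picture for the installed GDAL build.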


### Layers in GeoPackage 

In a GeoPackage, layers, also known as tables, are used to organize and store different types of geospatial data within the same SQLite database file. Each layer is a container for a specific type of geospatial data and its associated attributes. 


Let's list all available layers of our GeoPackage file:

In [7]:
filename = "geodata/packages/natural_earth_vector.gpkg"  # this is the extracted GeoPackage file

In [8]:
layers = fiona.listlayers(filename)
layers

['ne_10m_admin_0_antarctic_claim_limit_lines',
 'ne_10m_admin_0_antarctic_claims',
 'ne_10m_admin_0_boundary_lines_disputed_areas',
 'ne_10m_admin_0_boundary_lines_land',
 'ne_10m_admin_0_boundary_lines_map_units',
 'ne_10m_admin_0_boundary_lines_maritime_indicator',
 'ne_10m_admin_0_boundary_lines_maritime_indicator_chn',
 'ne_10m_admin_0_countries',
 'ne_10m_admin_0_countries_arg',
 'ne_10m_admin_0_countries_bdg',
 'ne_10m_admin_0_countries_bra',
 'ne_10m_admin_0_countries_chn',
 'ne_10m_admin_0_countries_deu',
 'ne_10m_admin_0_countries_egy',
 'ne_10m_admin_0_countries_esp',
 'ne_10m_admin_0_countries_fra',
 'ne_10m_admin_0_countries_gbr',
 'ne_10m_admin_0_countries_grc',
 'ne_10m_admin_0_countries_idn',
 'ne_10m_admin_0_countries_ind',
 'ne_10m_admin_0_countries_iso',
 'ne_10m_admin_0_countries_isr',
 'ne_10m_admin_0_countries_ita',
 'ne_10m_admin_0_countries_jpn',
 'ne_10m_admin_0_countries_kor',
 'ne_10m_admin_0_countries_lakes',
 'ne_10m_admin_0_countries_mar',
 'ne_10m_admin_0_

### Reading a Layer

Let's use the `ne_10m_airports` layer. It contains airports from around the world.

`10m` stands for a scale of 1:10,000,000. Some data is also available at 50m and 110m (1:50,000,000 and 1:110,000,000).

Note that many other vector file formats (Shapefile, GeoJSON, ...) don't have the concept of multiple layers. Such a file holds a single layer, so you simply omit the `layer` parameter when opening it.

Open the dataset:

In [9]:
c = fiona.open(filename, 'r', layer='ne_10m_airports')

#### Get the schema

In [10]:
c.schema

{'properties': {'scalerank': 'int',
  'featurecla': 'str:80',
  'type': 'str:50',
  'name': 'str:200',
  'abbrev': 'str:4',
  'location': 'str:50',
  'gps_code': 'str:254',
  'iata_code': 'str:254',
  'wikipedia': 'str:254',
  'natlscale': 'float',
  'comments': 'str:254',
  'wikidataid': 'str:254',
  'name_ar': 'str:254',
  'name_bn': 'str:254',
  'name_de': 'str:254',
  'name_en': 'str:254',
  'name_es': 'str:254',
  'name_fr': 'str:254',
  'name_el': 'str:254',
  'name_hi': 'str:254',
  'name_hu': 'str:254',
  'name_id': 'str:254',
  'name_it': 'str:254',
  'name_ja': 'str:254',
  'name_ko': 'str:254',
  'name_nl': 'str:254',
  'name_pl': 'str:254',
  'name_pt': 'str:254',
  'name_ru': 'str:254',
  'name_sv': 'str:254',
  'name_tr': 'str:254',
  'name_vi': 'str:254',
  'name_zh': 'str:254',
  'wdid_score': 'int',
  'ne_id': 'int',
  'name_fa': 'str:99',
  'name_he': 'str:99',
  'name_uk': 'str:112',
  'name_ur': 'str:93',
  'name_zht': 'str:80'},
 'geometry': 'Point'}

#### Attributes / Properties

In [11]:
c.schema["properties"]

{'scalerank': 'int',
 'featurecla': 'str:80',
 'type': 'str:50',
 'name': 'str:200',
 'abbrev': 'str:4',
 'location': 'str:50',
 'gps_code': 'str:254',
 'iata_code': 'str:254',
 'wikipedia': 'str:254',
 'natlscale': 'float',
 'comments': 'str:254',
 'wikidataid': 'str:254',
 'name_ar': 'str:254',
 'name_bn': 'str:254',
 'name_de': 'str:254',
 'name_en': 'str:254',
 'name_es': 'str:254',
 'name_fr': 'str:254',
 'name_el': 'str:254',
 'name_hi': 'str:254',
 'name_hu': 'str:254',
 'name_id': 'str:254',
 'name_it': 'str:254',
 'name_ja': 'str:254',
 'name_ko': 'str:254',
 'name_nl': 'str:254',
 'name_pl': 'str:254',
 'name_pt': 'str:254',
 'name_ru': 'str:254',
 'name_sv': 'str:254',
 'name_tr': 'str:254',
 'name_vi': 'str:254',
 'name_zh': 'str:254',
 'wdid_score': 'int',
 'ne_id': 'int',
 'name_fa': 'str:99',
 'name_he': 'str:99',
 'name_uk': 'str:112',
 'name_ur': 'str:93',
 'name_zht': 'str:80'}
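
Each schema value is a type name, optionally followed by a colon and a field width, so `str:254` is a string field of width 254 while `int` carries no width. The sketch below splits such strings apart. `parse_field_type` is a hypothetical helper, not a Fiona function, and the sample dictionary is a hand-copied subset of the output above.

```python
def parse_field_type(spec):
    """Split a schema type string such as 'str:254' into ('str', 254)."""
    name, _, width = spec.partition(":")
    if not width:
        return name, None  # e.g. 'int' or 'float' without a width
    # Keep anything that is not a plain integer (such as '24.15') as text.
    return name, int(width) if width.isdigit() else width


# Subset of the properties shown above.
properties = {
    "scalerank": "int",
    "name": "str:200",
    "abbrev": "str:4",
    "natlscale": "float",
}

parsed = {field: parse_field_type(spec) for field, spec in properties.items()}
for field, (type_name, width) in parsed.items():
    print(f"{field:10} {type_name:6} width={width}")
```

The same function can be applied to `c.schema["properties"]` while the collection is open.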

#### Get the Geospatial Data Type

In [12]:
c.schema["geometry"]

'Point'

#### Get the first entry

In [13]:
airport = next(iter(c))

In [14]:
airport['properties']['name']

'Sahnewal'

In [15]:
airport['geometry']['type']

'Point'

In [16]:
airport['geometry']['coordinates']

(75.95707224036518, 30.850359856170176)

In [17]:
c.close()

#### Retrieving all data

There are basically two ways to retrieve all features of an open collection `c`:

* Convert to a list, which loads everything into memory (only sensible if the dataset isn't too large):

      alldata = list(c)

* Iterate through the features one by one:

      for element in c:
          ...

In [20]:
# Print the names of the first five airports, then stop reading
with fiona.open(filename, 'r', layer='ne_10m_airports') as c:
    for cnt, airport in enumerate(c):
        if cnt >= 5:
            break
        print(airport['properties']['name'])

Sahnewal
Solapur
Birsa Munda
Ahwaz
Gwalior
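
When only the first few features are needed, `itertools.islice` from the standard library takes them from any iterable without a manual counter and without reading the rest. The sketch below demonstrates this on a stand-in generator, so it runs without the data file. `fake_collection` is a hypothetical name, and its records only imitate the shape of Fiona features, using names that appear in this notebook.

```python
from itertools import islice


def fake_collection():
    """Yield stand-in records shaped like Fiona features, one at a time."""
    names = ["Sahnewal", "Solapur", "Birsa Munda", "Ahwaz", "Gwalior", "Zurich Int'l"]
    for name in names:
        yield {"properties": {"name": name}}


# islice stops after three items; the remaining records are never produced.
first_three = [rec["properties"]["name"] for rec in islice(fake_collection(), 3)]
print(first_three)
```

Since an open Fiona collection is iterable too, `islice(c, 5)` inside a `with fiona.open(...) as c:` block should behave the same way.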


In [24]:
with fiona.open(filename, 'r', layer='ne_10m_airports') as c:
    for airport in c:
        if airport['properties']['iata_code'] == "ZRH":
            #print(airport['properties']['name'])
            #print(airport['geometry']['coordinates'])
            #print(airport['properties']['wikipedia'])
            print(dict(airport['properties']))
            print()
            print(dict(airport['geometry']))

{'scalerank': 3, 'featurecla': 'Airport', 'type': 'major', 'name': "Zurich Int'l", 'abbrev': 'ZRH', 'location': 'terminal', 'gps_code': 'LSZH', 'iata_code': 'ZRH', 'wikipedia': 'http://en.wikipedia.org/wiki/Z%C3%BCrich_Airport', 'natlscale': 75.0, 'comments': None, 'wikidataid': 'Q15114', 'name_ar': 'مطار زيورخ الدولي', 'name_bn': 'জুরিখ বিমানবন্দর', 'name_de': 'Flughafen Zürich', 'name_en': 'Zurich Airport', 'name_es': 'Aeropuerto Internacional de Zúrich', 'name_fr': 'aéroport international de Zurich', 'name_el': None, 'name_hi': None, 'name_hu': 'Zürichi repülőtér', 'name_id': 'Bandar Udara Internasional Zürich', 'name_it': 'aeroporto di Zurigo', 'name_ja': 'チューリッヒ空港', 'name_ko': '취리히 공항', 'name_nl': 'Luchthaven Zürich', 'name_pl': 'Port lotniczy Zurych-Kloten', 'name_pt': 'Aeroporto de Zurique', 'name_ru': 'Цюрих', 'name_sv': 'Zürich flygplats', 'name_tr': 'Zürih Havalimanı', 'name_vi': 'Sân bay Zürich', 'name_zh': '蘇黎世機場', 'wdid_score': 4, 'ne_id': 1159127117, 'name_fa': 'فرودگاه ز
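
The loop above keeps scanning after the match has been found. If a single match is expected, `next` with a generator expression stops at the first hit and returns a default when nothing matches. The following sketch uses stand-in records: only the Zurich entry reflects values from the output above, while "Example A", "Example B" and their codes are invented, and `find_by_iata` is a hypothetical helper.

```python
# Stand-in records shaped like Fiona features (only two properties each).
records = [
    {"properties": {"name": "Example A", "iata_code": "AAA"}},  # invented
    {"properties": {"name": "Zurich Int'l", "iata_code": "ZRH"}},
    {"properties": {"name": "Example B", "iata_code": "BBB"}},  # invented
]


def find_by_iata(features, code):
    """Return the first feature with the given IATA code, or None."""
    return next(
        (f for f in features if f["properties"]["iata_code"] == code),
        None,
    )


match = find_by_iata(records, "ZRH")
print(match["properties"]["name"])
print(find_by_iata(records, "XXX"))  # no such code in the stand-in data
```

The function accepts any iterable of features, so an open collection can be passed in place of `records`.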

In [25]:
with fiona.open(filename, 'r', layer='ne_10m_airports') as c:
    print(c.crs)

EPSG:4326
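
EPSG:4326 identifies WGS 84 geographic coordinates in degrees. The coordinate tuple printed earlier for the first airport is consistent with (x, y) = (longitude, latitude) order, as in GeoJSON. A small sketch that unpacks that tuple and formats it in the more familiar "latitude, longitude" notation (the variable names are chosen for this example only):

```python
# Coordinates of the first airport, copied from the output earlier on.
coordinates = (75.95707224036518, 30.850359856170176)

lon, lat = coordinates  # x is the longitude, y is the latitude

# Positive latitudes lie north of the equator, positive longitudes east
# of the prime meridian.
ns = "N" if lat >= 0 else "S"
ew = "E" if lon >= 0 else "W"
label = f"{abs(lat):.4f}°{ns}, {abs(lon):.4f}°{ew}"
print(label)
```

Swapping the two values is a common source of misplaced points, so it is worth checking the order whenever coordinates are passed to another tool.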


In [26]:
# fiona is already imported; this time we open the countries layer
c = fiona.open(filename, 'r', layer='ne_10m_admin_0_countries')

c.schema

{'properties': {'featurecla': 'str:15',
  'scalerank': 'int',
  'LABELRANK': 'int',
  'SOVEREIGNT': 'str:32',
  'SOV_A3': 'str:3',
  'ADM0_DIF': 'int',
  'LEVEL': 'int',
  'TYPE': 'str:17',
  'TLC': 'str:1',
  'ADMIN': 'str:36',
  'ADM0_A3': 'str:3',
  'GEOU_DIF': 'int',
  'GEOUNIT': 'str:36',
  'GU_A3': 'str:3',
  'SU_DIF': 'int',
  'SUBUNIT': 'str:36',
  'SU_A3': 'str:3',
  'BRK_DIFF': 'int',
  'NAME': 'str:29',
  'NAME_LONG': 'str:36',
  'BRK_A3': 'str:3',
  'BRK_NAME': 'str:32',
  'BRK_GROUP': 'str:17',
  'ABBREV': 'str:16',
  'POSTAL': 'str:4',
  'FORMAL_EN': 'str:52',
  'FORMAL_FR': 'str:35',
  'NAME_CIAWF': 'str:45',
  'NOTE_ADM0': 'str:16',
  'NOTE_BRK': 'str:63',
  'NAME_SORT': 'str:36',
  'NAME_ALT': 'str:19',
  'MAPCOLOR7': 'int',
  'MAPCOLOR8': 'int',
  'MAPCOLOR9': 'int',
  'MAPCOLOR13': 'int',
  'POP_EST': 'float',
  'POP_RANK': 'int',
  'POP_YEAR': 'int',
  'GDP_MD': 'int',
  'GDP_YEAR': 'int',
  'ECONOMY': 'str:26',
  'INCOME_GRP': 'str:23',
  'FIPS_10': 'str:3',
  'ISO

In [27]:
country = next(iter(c))  # first feature of the countries layer

print(country['properties']['NAME'])
print(country['properties']['NAME_ZH'])
print(country['properties']['CONTINENT'])
print(country['properties']['POP_EST'])
print(country['properties']['POP_YEAR'])

c.close()  # release the dataset that was opened without a `with` block

Indonesia
印度尼西亚
Asia
270625568.0
2019


In [28]:
with fiona.open(filename, 'r', layer='ne_10m_admin_0_countries') as c:
    for country in c:
        if country['properties']['NAME'] == "France":
            print(country['properties']['POP_EST'])
            print(country['properties']['POP_YEAR'])   
            print(country['geometry']['type'])
            print(country['geometry']['coordinates'])               

67059887.0
2019
MultiPolygon
[[[(-54.1115266969999, 2.1142704430000663), (-54.13490799999994, 2.1106733200000605), (-54.14813716599994, 2.1143940230000737), (-54.15738724899995, 2.121654562000046), (-54.18854813699994, 2.1613162240001174), (-54.194490925999844, 2.1630732220000937), (-54.210045532999885, 2.161574605000098), (-54.229785929999935, 2.157233785000116), (-54.246477416999966, 2.151601054000082), (-54.263789021999884, 2.148190410000055), (-54.31401851399991, 2.1548566690000683), (-54.32285518499992, 2.1573371380000452), (-54.33505082199994, 2.163900045000034), (-54.360940714999884, 2.188187968000122), (-54.37391149899989, 2.196507873000087), (-54.38052608299998, 2.197696432000015), (-54.38781245999988, 2.1963011680000335), (-54.40254024299989, 2.197127991000059), (-54.4114802659999, 2.1997118120000607), (-54.42708654899994, 2.2071532190000767), (-54.43395951399992, 2.2093236290000675), (-54.47664424699991, 2.21412953800008), (-54.49266394099996, 2.2227078250000147), (-54.51142
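
Printing the raw coordinates of a country produces a very long nested list. The nesting of a MultiPolygon is: a list of polygons, each a list of rings (outer boundary first, then holes), each a list of coordinate tuples. The sketch below summarizes such a structure instead of printing it. The geometry is a small invented example, not the real outline of France, and `summarize_multipolygon` is a hypothetical helper.

```python
# Invented MultiPolygon with two polygons, the second one having a hole.
geometry = {
    "type": "MultiPolygon",
    "coordinates": [
        [  # polygon 1: outer ring only
            [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 0.0)],
        ],
        [  # polygon 2: outer ring plus one hole
            [(10.0, 10.0), (14.0, 10.0), (14.0, 14.0), (10.0, 14.0), (10.0, 10.0)],
            [(11.0, 11.0), (12.0, 11.0), (12.0, 12.0), (11.0, 11.0)],
        ],
    ],
}


def summarize_multipolygon(geom):
    """Count polygons, rings and vertices and compute the bounding box."""
    polygons = geom["coordinates"]
    rings = [ring for polygon in polygons for ring in polygon]
    points = [pt for ring in rings for pt in ring]
    xs = [pt[0] for pt in points]
    ys = [pt[1] for pt in points]
    return {
        "polygons": len(polygons),
        "rings": len(rings),
        "vertices": len(points),
        "bbox": (min(xs), min(ys), max(xs), max(ys)),
    }


summary = summarize_multipolygon(geometry)
print(summary)
```

Applied to `country['geometry']` for a MultiPolygon feature, the same function should give a compact overview in place of the full coordinate dump.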

In [None]:
print(dict(airport['properties']))