# Loading fiboa datasets from Source Cooperative

---

**WARNING:** The Python code of the fiboa CLI is not a public API or library yet. Imports and methods may change at any time. This is a proof of concept and is not meant for production use yet!

---

Make sure that fiboa-cli is installed.  
If it is not, install it via `pip install fiboa-cli`.  
Once done, we can import the library:

In [1]:
from fiboa_cli.util import load_parquet_data

We can load fiboa data from [Source Cooperative](https://beta.source.coop/fiboa/). Here are the direct download links to some data from Austria and Germany:
- Austria: <https://data.source.coop/fiboa/austria/inspire_referenzen_2021.parquet> (EPSG:31287)
- Berlin and Brandenburg, Germany: <https://data.source.coop/fiboa/de-bb/dfbk.parquet> (EPSG:25833)
- Lower Saxony, Germany: <https://data.source.coop/fiboa/de-nds/FB_NDS.parquet> (EPSG:25832)
- North Rhine-Westphalia, Germany: <https://data.source.coop/fiboa/de-nrw/LFK-AKTI_EPSG25832.parquet> (EPSG:25832)
- Schleswig-Holstein, Germany: <https://data.source.coop/fiboa/de-sh/Feldbloecke_2024.parquet> (EPSG:4647)
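
The datasets listed above use four different coordinate reference systems, which is why the geometries have to be reprojected to a common CRS before they can be merged. As a small sketch, the following groups the datasets by the EPSG codes given in the list. The short keys (`at`, `de_bb`, and so on) are labels chosen for illustration only.

```python
from collections import defaultdict

# EPSG codes as stated in the list above; the keys are illustrative labels.
epsg_by_dataset = {
    "at": 31287,
    "de_bb": 25833,
    "de_nds": 25832,
    "de_nrw": 25832,
    "de_sh": 4647,
}

# Group the datasets by coordinate reference system.
datasets_by_epsg = defaultdict(list)
for name, epsg in epsg_by_dataset.items():
    datasets_by_epsg[epsg].append(name)

for epsg, names in sorted(datasets_by_epsg.items()):
    print(f"EPSG:{epsg}: {', '.join(names)}")
```

Only Lower Saxony and North Rhine-Westphalia share a CRS, so a plain concatenation without reprojection would mix incompatible coordinates.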

Below is a simple example of how to load one of the datasets and print an excerpt of the data:

In [2]:
example = load_parquet_data("https://data.source.coop/fiboa/de-sh/Feldbloecke_2024.parquet")
print(example.head())

                                            geometry  \
0  MULTIPOLYGON (((32427203.690 6004239.850, 3242...   
1  MULTIPOLYGON (((32427174.010 6004065.310, 3242...   
2  MULTIPOLYGON (((32426555.520 6004971.950, 3242...   
3  MULTIPOLYGON (((32426680.970 6004743.340, 3242...   
4  MULTIPOLYGON (((32426473.120 6005003.740, 3242...   

     determination_datetime              flik                id    area  \
0 2024-01-01 00:00:00+00:00  DESHLIL080100007  DESHLIL080100007  0.4897   
1 2024-01-01 00:00:00+00:00  DESHLIL080100017  DESHLIL080100017  0.6807   
2 2024-01-01 00:00:00+00:00  DESHLIL080100003  DESHLIL080100003  4.0968   
3 2024-01-01 00:00:00+00:00  DESHLIL080100005  DESHLIL080100005  5.4543   
4 2024-01-01 00:00:00+00:00  DESHLIL080110000  DESHLIL080110000  0.2203   

             hbn  
0  Dauergrünland  
1  Dauergrünland  
2  Dauergrünland  
3  Dauergrünland  
4  Dauergrünland  


We want to load all fiboa datasets we currently have and merge them into a single GeoDataFrame.  
First, let's define a dictionary with all the URLs we can download data from (the Austrian dataset is commented out in this example):

In [3]:
sources = {
#   'at': 'https://data.source.coop/fiboa/austria/inspire_referenzen_2021.parquet',
    'de_bb': 'https://data.source.coop/fiboa/de-bb/dfbk.parquet',
    'de_nds': 'https://data.source.coop/fiboa/de-nds/FB_NDS.parquet',
    'de_nrw': 'https://data.source.coop/fiboa/de-nrw/LFK-AKTI_EPSG25832.parquet',
    'de_sh': 'https://data.source.coop/fiboa/de-sh/Feldbloecke_2024.parquet',
}

Next, we download all files to the local file system once, which simplifies the following steps.  
You can skip this step if you don't want to download the complete files and prefer to work directly on the remote files.

In [4]:
import urllib.request
import os
for key, url in sources.items():
    name = key + ".parquet"
    if not os.path.exists(name):
        urllib.request.urlretrieve(url, name)
    sources[key] = name
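
If a download is interrupted, the loop above can leave a truncated file behind that is then skipped on the next run because it already exists. One possible way around this, sketched here with the standard library only, is to download to a temporary name and rename the file once it is complete. The helper name `download_once` and the `.part` suffix are hypothetical choices for this example and are not part of fiboa-cli.

```python
import os
import urllib.request


def download_once(url, target):
    """Download url to target unless target already exists.

    The data is first written to a temporary ".part" file and renamed
    afterwards, so an interrupted download never leaves a truncated
    file behind under the final name.
    """
    if os.path.exists(target):
        return False  # already downloaded, nothing to do
    partial = target + ".part"
    urllib.request.urlretrieve(url, partial)
    os.replace(partial, target)  # rename only after the download completed
    return True
```

Inside the loop, the call would then look like `download_once(url, name)`.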

We load the data and reproject the geometries to EPSG:4326:

In [5]:
data = {}
for key, url in sources.items():
    data[key] = load_parquet_data(url).to_crs(epsg=4326)

Merge all data into a single GeoDataFrame:

In [6]:
import pandas as pd
merged = pd.concat(data.values(), ignore_index=True)
print("Number of rows: " + str(len(merged)))

Number of rows: 1235364
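
With `ignore_index=True`, the merged result no longer tells you which dataset a row came from. If that matters, one option is to tag each frame with its key before concatenating. The sketch below uses two tiny plain pandas DataFrames with placeholder values as stand-ins for the real GeoDataFrames. The column name `source` is an assumption for this example and is not a fiboa property.

```python
import pandas as pd

# Placeholder frames standing in for the loaded GeoDataFrames.
frames = {
    "de_bb": pd.DataFrame({"id": ["A1", "A2"], "area": [1.5, 2.0]}),
    "de_sh": pd.DataFrame({"id": ["B1"], "area": [0.5]}),
}

# Add the dictionary key as a column so the origin survives the merge.
tagged = [frame.assign(source=key) for key, frame in frames.items()]
merged_with_source = pd.concat(tagged, ignore_index=True)

# Count the rows contributed by each dataset.
rows_per_source = merged_with_source["source"].value_counts().to_dict()
print(rows_per_source)
```

The same pattern should carry over to the `data` dictionary used above, since `assign` and `pd.concat` work on GeoDataFrames as well.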


You can visualize all the polygons using lonboard. For this, you need to install lonboard first: `pip install lonboard`. Afterwards you can run:

In [7]:
from lonboard import viz
viz(merged)

Map(basemap_style=<CartoBasemap.DarkMatter: 'https://basemaps.cartocdn.com/gl/dark-matter-gl-style/style.json'…

Now we can query across all field boundaries in Germany (selected states).  
For example, how large is the largest field?

In [8]:
max_area = merged['area'].max()
print(f"Largest field size: {max_area} ha")

Largest field size: 499.96600341796875 ha
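
The long decimal tail in the output above is typical of a value stored as a 32-bit float: 499.966 cannot be represented exactly, and the nearest 32-bit float is 499.96600341796875. Whether the `area` column is actually stored as a 32-bit float is an assumption here. The sketch reproduces the effect with the standard library and shows how to round the value for display.

```python
import struct

# Round-trip 499.966 through a 32-bit float
# (assumed storage type of the area column).
as_float32 = struct.unpack("f", struct.pack("f", 499.966))[0]
print(as_float32)  # 499.96600341796875

# Format with a fixed number of decimals for display.
print(f"Largest field size: {as_float32:.2f} ha")  # Largest field size: 499.97 ha
```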


And which are the 30 largest fields by area in the selected German states?

In [9]:
indices = merged['area'].nlargest(30).index
largest_fields = merged.loc[indices].sort_values(by='area', ascending=False)
print(largest_fields[['id', 'flik', 'area']])

                       id              flik        area
4253     DEBBLI1464397727  DEBBLI1464397727  499.966003
50116    DEBBLI0272200078  DEBBLI0272200078  437.395203
737      DEBBLI1473412691  DEBBLI1473412691  427.259094
34060    DEBBLI0372301712  DEBBLI0372301712  397.646912
13327    DEBBLI0264008176  DEBBLI0264008176  383.888092
957      DEBBLI1573412975  DEBBLI1573412975  361.886414
37469    DEBBLI0363037869  DEBBLI0363037869  357.405701
88227    DEBBLI0373305506  DEBBLI0373305506  349.223297
29403    DEBBLI0373303862  DEBBLI0373303862     340.306
78177    DEBBLI0264006263  DEBBLI0264006263  339.653015
9375     DEBBLI2072551344  DEBBLI2072551344  325.146515
44372    DEBBLI0372301947  DEBBLI0372301947  324.643402
688      DEBBLI0373301041  DEBBLI0373301041  322.441193
77274    DEBBLI0372301145  DEBBLI0372301145   313.92569
53309    DEBBLI0364300793  DEBBLI0364300793  311.396393
56905    DEBBLI0372301161  DEBBLI0372301161  310.567108
84230    DEBBLI1867426372  DEBBLI1867426372  309

It looks like most of them are in Berlin/Brandenburg (indicated by the `DEBB` prefix in the `flik` column).  
Maybe it's easier to visualize this on a map though:

In [10]:
viz(largest_fields)

Map(basemap_style=<CartoBasemap.DarkMatter: 'https://basemaps.cartocdn.com/gl/dark-matter-gl-style/style.json'…
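
As a complement to the map, the distribution across states can also be counted from the ID prefix. The sketch below uses a handful of IDs copied from the outputs above and assumes that the first four characters (`DE` plus two letters) identify the state. That matches the `DEBB` and `DESH` IDs shown here, but it is not a documented rule.

```python
from collections import Counter

# A few IDs copied from the outputs above.
ids = [
    "DEBBLI1464397727",
    "DEBBLI0272200078",
    "DEBBLI1473412691",
    "DESHLIL080100007",
    "DESHLIL080100017",
]

# Assumption: the first four characters identify the state.
counts = Counter(identifier[:4] for identifier in ids)
print(counts.most_common())  # [('DEBB', 3), ('DESH', 2)]
```

On the GeoDataFrame itself, the equivalent would be `largest_fields['flik'].str[:4].value_counts()`.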