<H1>Load and Clean GeoJSON Data</H1>
<p>Your GeoJSON files contain geographic features such as parks, restaurants, and stores. Each feature has properties (for example a name and a category) and a geometry that holds its coordinates. The main tasks here are loading, inspecting, and preparing the data for later use.

🔹 2.1 Load GeoJSON files

GeoJSON is a format for encoding geographic data structures, including points (like a restaurant), lines (like walking trails), and polygons (like a park boundary).</p>

<p>You can read GeoJSON files using:

    Pandas: useful if you only need the feature properties as a table; it has no geometry type, so coordinates stay plain lists.

    GeoPandas: best for the full geographic structure, including geometry handling.
</p>
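
<p>As a minimal sketch of the Pandas-only route, the snippet below flattens the features of a FeatureCollection with pd.json_normalize and keeps just the properties. The inline geojson_text is a hypothetical stand-in for the contents of a real file, not one of the datasets used in this notebook.</p>

```python
import json

import pandas as pd

# Hypothetical sample standing in for the text of a .geojson file.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "Central Park", "category": "park"},
      "geometry": {"type": "Point", "coordinates": [-73.9654, 40.7829]}
    }
  ]
}
"""

collection = json.loads(geojson_text)

# json_normalize flattens nested dicts into dotted column names,
# e.g. "properties.name" and "geometry.coordinates".
flat = pd.json_normalize(collection["features"])

# Keep only the property columns and strip the "properties." prefix.
props = flat.filter(like="properties.").rename(
    columns=lambda col: col.split(".", 1)[1]
)
print(props)
```

<p>The geometry survives only as a nested list in flat, which is why GeoPandas is the better fit once spatial operations are needed.</p>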


<p> 🔹 2.2 Inspect the structure
    
A typical GeoJSON feature looks like this (coordinates are given as [longitude, latitude]):
{
  "type": "Feature",
  "properties": {
    "name": "Central Park",
    "category": "park"
  },
  "geometry": {
    "type": "Point",
    "coordinates": [-73.9654, 40.7829]
  }
}
</p>
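
<p>A quick way to inspect this structure without any geospatial library is to parse the JSON and summarize the geometry types and property keys. The sketch below reuses the sample feature shown above; the variable names are illustrative only.</p>

```python
import json
from collections import Counter

feature_text = """
{
  "type": "Feature",
  "properties": {"name": "Central Park", "category": "park"},
  "geometry": {"type": "Point", "coordinates": [-73.9654, 40.7829]}
}
"""

# A real file would usually hold many features; one is enough for the sketch.
features = [json.loads(feature_text)]

# Which geometry types occur, and how often?
geometry_types = Counter(f["geometry"]["type"] for f in features)

# Which property keys are present across all features?
property_keys = sorted({key for f in features for key in f["properties"]})

# GeoJSON positions are ordered [longitude, latitude].
lon, lat = features[0]["geometry"]["coordinates"]

print(geometry_types, property_keys, lon, lat)
```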

<p>You’ll want to:

    Extract relevant columns (name, category, etc.).

    Ensure coordinates are in a usable format (for example, separate numeric longitude and latitude columns).

🔹 2.3 Clean the data

    Remove missing or irrelevant entries.

    Normalize text data (e.g., lowercasing category names).

    Check for duplicates.

    Optionally filter by city boundaries or other criteria.

🔹 2.4 Structure for analysis

Prepare the data so that each row corresponds to one location, with clearly separated fields like:

    name

    type or category

    latitude / longitude

    Optional tags or descriptions
</p>
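
<p>The cleaning and structuring steps from 2.3 and 2.4 can be sketched with plain Pandas as follows. The records are hypothetical examples, and the choice of columns that define a duplicate is an assumption you would adapt to your own data.</p>

```python
import pandas as pd

# Hypothetical records, already flattened to one dict per location.
records = [
    {"name": "Central Park", "category": "Park", "coordinates": [-73.9654, 40.7829]},
    {"name": "Central Park", "category": "park ", "coordinates": [-73.9654, 40.7829]},
    {"name": None, "category": "store", "coordinates": [-73.99, 40.75]},
    {"name": "Corner Cafe", "category": "Restaurant", "coordinates": [-73.98, 40.76]},
]
df = pd.DataFrame(records)

# 2.3: remove entries without a name.
df = df.dropna(subset=["name"])

# 2.3: normalize text (trim whitespace, lowercase).
df["category"] = df["category"].str.strip().str.lower()

# 2.4: split [longitude, latitude] into two numeric columns.
coords = pd.DataFrame(
    df["coordinates"].tolist(),
    columns=["longitude", "latitude"],
    index=df.index,
)
df = pd.concat([df.drop(columns="coordinates"), coords], axis=1)

# 2.3: drop duplicates after normalization so "Park" and "park " match.
df = df.drop_duplicates().reset_index(drop=True)
print(df)
```

<p>Deduplicating after normalization matters here: the two Central Park rows only become identical once the category text has been cleaned.</p>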

In [2]:
import geopandas as gpd

In [3]:
gdf = gpd.read_file("raw_data_geojson/gruenflaechen.geojson")

In [4]:
gdf.head()

Unnamed: 0,obj_nr,obj_art,geometry
0,45408,4,"MULTIPOLYGON (((7.58272 51.93164, 7.58271 51.9..."
1,56234,5,"POLYGON ((7.61478 52.00261, 7.61485 52.00252, ..."
2,56232,5,"POLYGON ((7.61558 51.99764, 7.61565 51.99762, ..."
3,56206,5,"MULTIPOLYGON (((7.60867 51.99611, 7.60867 51.9..."
4,18707,1,"POLYGON ((7.72766 51.91591, 7.72781 51.91587, ..."


In [5]:
gdftheater = gpd.read_file("raw_data_geojson/theater.geojson")

In [6]:
gdftheater

Unnamed: 0,NAME,STR_NAME,HSNR,HSNR_ZUS,PLZ,ORT,RECHTSWERT,HOCHWERT,HOMEPAGE,geometry
0,Theater in der Meerwiese,An der Meerwiese,25.0,,48157.0,Münster,407379.0,5760608.0,https://www.stadt-muenster.de/meerwiese/,POINT (7.65115 51.98841)
1,Amateurbühne Münster-Ost,Andreas-Hofer-Straße,13.0,,48145.0,Münster,407399.0,5756985.0,https://amateurbuehne.de/,POINT (7.65242 51.95585)
2,GOP,Bahnhofstraße,20.0,,48143.0,Münster,406140.0,5757178.0,https://www.variete.de/Muenster,POINT (7.63405 51.95737)
3,Theaterlabor im Kulturbahnhof,Bergiusstraße,15.0,,48165.0,Münster,407404.0,5751284.0,https://www.uni-muenster.de/Kustodie/kulturatl...,POINT (7.65403 51.9046)
4,TPZ Münster,Achtermannstraße,24.0,,48143.0,Münster,406021.0,5757093.0,https://www.tpz-muenster.de/,POINT (7.63234 51.95659)
5,Wolfgang Borchert Theater,Am Mittelhafen,10.0,,48155.0,Münster,406543.0,5756479.0,https://www.wolfgang-borchert-theater.de/,POINT (7.64011 51.95115)
6,"Kreativ-Haus, Theaterbühne",Diepenbrockstraße,28.0,,48145.0,Münster,406666.0,5757329.0,https://kreativ-haus.de/,POINT (7.64166 51.95882)
7,Studiobühnen Münster,Domplatz,23.0,a,48143.0,Münster,405453.0,5757760.0,https://www.uni-muenster.de/Studiobuehne/,POINT (7.6239 51.96248)
8,Theater im Pumpenhaus,Gartenstraße,123.0,,48147.0,Münster,406318.0,5758948.0,https://www.pumpenhaus.de/,POINT (7.63616 51.97331)
9,STORNO,Finkenstraße,7.0,,48147.0,Münster,405166.0,5758484.0,https://www.storno.org/start/,POINT (7.61952 51.96894)


In [7]:
gdftheater.NAME

0          Theater in der Meerwiese
1          Amateurbühne Münster-Ost
2                               GOP
3     Theaterlabor im Kulturbahnhof
4                       TPZ Münster
5         Wolfgang Borchert Theater
6        Kreativ-Haus, Theaterbühne
7              Studiobühnen Münster
8             Theater im Pumpenhaus
9                            STORNO
10                  Theater Münster
11             Niederdeutsche Bühne
12                Boulevard Münster
13          Charivari Puppentheater
14            Theater Szenenwechsel
15           Der kleine Bühnenboden
Name: NAME, dtype: object

In [8]:
gdftheater['NAME'].count()

np.int64(16)

In [9]:
gdftheater.columns

Index(['NAME', 'STR_NAME', 'HSNR', 'HSNR_ZUS', 'PLZ', 'ORT', 'RECHTSWERT',
       'HOCHWERT', 'HOMEPAGE', 'geometry'],
      dtype='object')

In [10]:
gdftheater.columns = [x.lower() for x in gdftheater.columns]
gdftheater.columns

Index(['name', 'str_name', 'hsnr', 'hsnr_zus', 'plz', 'ort', 'rechtswert',
       'hochwert', 'homepage', 'geometry'],
      dtype='object')

In [11]:
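# Note: this also renames 'geometry' to 'Geometry'. GeoPandas tracks the active
# geometry column by name, so geospatial methods such as .explore() may need
# gdftheater.set_geometry('Geometry') to be called first.
gdftheater.columns = [x.capitalize() for x in gdftheater.columns]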
gdftheater.columns

Index(['Name', 'Str_name', 'Hsnr', 'Hsnr_zus', 'Plz', 'Ort', 'Rechtswert',
       'Hochwert', 'Homepage', 'Geometry'],
      dtype='object')
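
<p>If the goal is only to tidy the attribute names, one option is to leave the geometry label untouched. The sketch below shows the idea on a plain Pandas DataFrame; the string in the geometry column is just a placeholder for real geometry objects, and the same rename call is assumed to carry over to a GeoDataFrame.</p>

```python
import pandas as pd

# Plain DataFrame stand-in; "geometry" holds a placeholder string here.
df = pd.DataFrame(
    {
        "NAME": ["GOP"],
        "PLZ": [48143.0],
        "geometry": ["POINT (7.63405 51.95737)"],
    }
)

# Capitalize every column label except "geometry".
df = df.rename(columns=lambda col: col if col == "geometry" else col.capitalize())
print(df.columns.tolist())
```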

In [12]:
gdftheater.sort_values(by='Name', inplace=True)
gdftheater.head()

Unnamed: 0,Name,Str_name,Hsnr,Hsnr_zus,Plz,Ort,Rechtswert,Hochwert,Homepage,Geometry
1,Amateurbühne Münster-Ost,Andreas-Hofer-Straße,13.0,,48145.0,Münster,407399.0,5756985.0,https://amateurbuehne.de/,POINT (7.65242 51.95585)
12,Boulevard Münster,Königsstraße,12.0,,48143.0,Münster,405612.0,5757413.0,https://www.boulevard-muenster.de/,POINT (7.62631 51.95939)
13,Charivari Puppentheater,Körnerstraße,3.0,,48151.0,Münster,404931.0,5756907.0,http://charivari-theater.de/,POINT (7.61654 51.95473)
15,Der kleine Bühnenboden,Schillerstraße,48.0,,48155.0,Münster,406623.0,5757011.0,http://www.derkleinebuehnenboden.de/,POINT (7.64112 51.95595)
2,GOP,Bahnhofstraße,20.0,,48143.0,Münster,406140.0,5757178.0,https://www.variete.de/Muenster,POINT (7.63405 51.95737)


In [13]:
gdftheater['Hsnr_zus'].value_counts()

Hsnr_zus
     15
a     1
Name: count, dtype: int64
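
<p>The counts above suggest that most house-number suffixes are blank strings rather than missing values. A minimal sketch for treating blanks as missing is shown below; the sample values are hypothetical.</p>

```python
import pandas as pd

# Hypothetical suffix column with empty, whitespace-only, and missing entries.
suffix = pd.Series(["", "a", " ", None], name="Hsnr_zus")

# Trim whitespace, then mask blank strings so they count as missing.
stripped = suffix.str.strip()
cleaned = stripped.mask(stripped == "")

n_missing = int(cleaned.isna().sum())
print(cleaned.dropna().tolist(), n_missing)
```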

In [40]:
# Convert postal codes from floats (48157.0) to strings ('48157').
gdftheater['Plz'] = gdftheater['Plz'].astype(float).astype(int).astype(str)

In [42]:
gdftheater.dtypes

Name            object
Str_name        object
Hsnr           float64
Hsnr_zus        object
Plz             object
Ort             object
Rechtswert     float64
Hochwert       float64
Homepage        object
Geometry      geometry
dtype: object

In [44]:
print(gdftheater['Plz'].unique())

['48145' '48143' '48151' '48155' '48147' '48153' '48157' '48165']
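
<p>The float-to-int-to-string chain works for this dataset, but it would fail on missing values and would drop leading zeros in postal codes from other regions. A hedged alternative, assuming five-digit German postal codes and using hypothetical sample values, is:</p>

```python
import pandas as pd

# Hypothetical sample: one Münster code, one code with a leading zero, one missing.
plz_raw = pd.Series([48157.0, 1067.0, None], name="Plz")

# Nullable Int64 tolerates missing values; zfill restores leading zeros.
plz = plz_raw.astype("Int64").astype("string").str.zfill(5)
print(plz.tolist())
```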


In [46]:
# Grouping the data.
# A group-by operation splits the data, applies a function, and combines the results.
# 1. The groupby method
gdftheater.groupby(['Plz'])

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x00000134D0B8F750>

In [54]:
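# Group by a single column name (not a one-element list) so that
# get_group() accepts a plain string key without a FutureWarning.
plzGrp = gdftheater.groupby('Plz')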

In [None]:
#plzGrp['Name'].loc['48145.0']

In [66]:
plzGrp.get_group('48145')


Unnamed: 0,Name,Str_name,Hsnr,Hsnr_zus,Plz,Ort,Rechtswert,Hochwert,Homepage,Geometry
1,Amateurbühne Münster-Ost,Andreas-Hofer-Straße,13.0,,48145,Münster,407399.0,5756985.0,https://amateurbuehne.de/,POINT (7.65242 51.95585)
6,"Kreativ-Haus, Theaterbühne",Diepenbrockstraße,28.0,,48145,Münster,406666.0,5757329.0,https://kreativ-haus.de/,POINT (7.64166 51.95882)
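
<p>To illustrate the split, apply, combine pattern beyond get_group, the sketch below counts theaters per postal code and collects their names. It uses four rows copied from the table above; the variable names are illustrative.</p>

```python
import pandas as pd

# Four rows taken from the theater table above.
theaters = pd.DataFrame(
    {
        "Name": [
            "Amateurbühne Münster-Ost",
            "GOP",
            "Kreativ-Haus, Theaterbühne",
            "TPZ Münster",
        ],
        "Plz": ["48145", "48143", "48145", "48143"],
    }
)

# Split: one group per postal code.
by_plz = theaters.groupby("Plz")

# Apply and combine: number of theaters per postal code.
counts = by_plz["Name"].count()

# Apply and combine: list of theater names per postal code.
names_per_plz = by_plz["Name"].agg(list)

# Look at a single group.
group_48145 = by_plz.get_group("48145")
print(counts)
```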


In [None]:
#gdftheater = gdftheater.set_index("name")

In [None]:
#gdftheater["boundary"] = gdftheater.boundary
#gdftheater["boundary"]

In [None]:
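# The geometry column was renamed to 'Geometry' above, so set it as the
# active geometry before calling a geospatial method.
gdftheater.set_geometry('Geometry').explore()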

In [72]:
gdftheater['Geometry']

1     POINT (7.65242 51.95585)
12    POINT (7.62631 51.95939)
13    POINT (7.61654 51.95473)
15    POINT (7.64112 51.95595)
2     POINT (7.63405 51.95737)
6     POINT (7.64166 51.95882)
11    POINT (7.62934 51.96477)
9     POINT (7.61952 51.96894)
7      POINT (7.6239 51.96248)
4     POINT (7.63234 51.95659)
10     POINT (7.62898 51.9649)
14    POINT (7.62568 51.94137)
8     POINT (7.63616 51.97331)
0     POINT (7.65115 51.98841)
3      POINT (7.65403 51.9046)
5     POINT (7.64011 51.95115)
Name: Geometry, dtype: geometry