## GeoJson / Json Intermediate Review


The file [world_cities_fixed.json](../data/world_cities_fixed.json) has thousands of cities, but it is plain json and is not laid out the way GeoJson requires. It doesn't have to be GeoJson, since other libraries can generate points or lines from the provided coordinates, but I want to go over creating a couple of things:

1. A dictionary that organizes each city by time zone
2. A set of GeoJson files where one file has all the cities in a single time zone in a GeoJson feature collection format.

I'm making the individual files so you can see a json file get processed into a GeoJson file. Once they are converted, you can view them easily with [GeoJson.io](https://GeoJson.io) and of course Folium (Folium has other ways to display spatial features aside from GeoJson).
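
To make the difference between the two formats concrete, here is a minimal sketch that turns one city record into a GeoJson point feature. The helper name `city_to_feature` is my own and is not part of the notebook. Note that GeoJson expects coordinates as `[lon, lat]`, as numbers rather than strings.

```python
import json

def city_to_feature(city):
    """Convert one city record (hypothetical helper) into a GeoJson Point feature."""
    return {
        "type": "Feature",
        "properties": {"name": city["city-name"], "time-zone": city["time-zone"]},
        "geometry": {
            "type": "Point",
            # GeoJson order is longitude first, then latitude, both as floats
            "coordinates": [float(city["lon"]), float(city["lat"])],
        },
    }

tokyo = {
    "city-name": "Tokyo",
    "lat": "35.6895",
    "lon": "139.69171",
    "time-zone": "Asia/Tokyo",
}

feature = city_to_feature(tokyo)
print(json.dumps(feature, indent=2))
```

Wrapping a list of these features in `{"type": "FeatureCollection", "features": [...]}` gives something GeoJson.io should be able to display.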

> Notice that the [world_cities_fixed.json](../data/world_cities_fixed.json) file is just a json file that has point data for cities. What it doesn't have is the country in which each city exists. However, we have the world's country polygons! And with some spatial math, we can assign a city to a country or vice versa. This will be a later project or lesson.
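
As a small preview of that "spatial math", here is one common way to test whether a point falls inside a polygon: ray casting. This is only a sketch under simple assumptions (one ring, no holes, flat coordinates), and the function name and the square "country" are made up for illustration. The later lesson may well use a library instead.

```python
def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the closed ring of [lon, lat] pairs?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge from point j to point i straddle the point's latitude?
        if (yi > lat) != (yj > lat):
            # Longitude where that edge crosses the point's latitude
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside   # each crossing to the east flips the state
        j = i
    return inside

# Made-up square "country" from (0, 0) to (10, 10); first and last points match
square = [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]

print(point_in_ring(5, 5, square))    # True
print(point_in_ring(15, 5, square))   # False
```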

## Run Cell Below First

The cell below prints output so it looks organized. It uses a "generator" that "yields" step numbers along with a message and outputs them in a python rich Panel.

In [3]:
from rich.console import Console
from rich.panel import Panel
from rich import print
console = Console()

# Run this first so I can print "Step 1-------, Step 2-----" with print messages to make the output look organized.
def stepper(width=60, start=1):
    step = start
    while True:
        message = yield
        block = f"Step {step} " + "-" * width
        if message:
            block += f"\n   ↳ {message}"
        step += 1
        yield block
        
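def stp(s,msg,content):
    next(s)                       # advance the generator to its `message = yield` line
    console.print(s.send(msg))    # send the message in and print the "Step N -----" header it yields back
    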
    panel_styled = Panel(
        f"[b #555555]Output:[/b #555555] \n\n{content}",
        title=f"[b #FF00FF]{msg}[/b #FF00FF]",
        # subtitle="[dim]Status: Active[/dim]",
        border_style="white",
        title_align="left",
        width=100 # Optional: set a specific width
    )
    console.print(panel_styled)
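
If the two `yield` lines in `stepper` look odd, this stripped-down sketch shows the same pattern on its own. The generator name `labeled_steps` is made up for this example. A bare `yield` pauses and waits for a value to be sent in, and the second `yield` hands the formatted text back.

```python
def labeled_steps(start=1):
    step = start
    while True:
        label = yield                     # pause here until a label is sent in
        yield f"Step {step}: {label}"     # hand the formatted text back to the caller
        step += 1

gen = labeled_steps()

next(gen)                           # run up to the first bare `yield`
first = gen.send("load file")       # resumes, formats, and returns the text
next(gen)                           # move on to the next bare `yield`
second = gen.send("count cities")

print(first)    # Step 1: load file
print(second)   # Step 2: count cities
```

Calling only `next(gen)` each time would alternate between getting `None` and getting a step with no label, which is why the label has to go in through `send`.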

### Json Review(ish)

In [4]:
"""
Understanding our world_cities_fixed file.
"""
s = stepper()             # Create the step generator; stp() advances it

# Below is an example object for a single city

single_city = {
    "city-name": "Tokyo",
    "lat": "35.6895",
    "lon": "139.69171",
    "time-zone": "Asia/Tokyo",
}

stp(s,"Printing a key from object",single_city["city-name"])

# Notice the next example with three cities. It adds commas between the objects, and square brackets to show a list.
cities = [
    {
        "city-name": "Tokyo",
        "lat": "35.6895",
        "lon": "139.69171",
        "time-zone": "Asia/Tokyo",
    },
    {
        "city-name": "Paris",
        "lat": "48.85341",
        "lon": "2.3488",
        "time-zone": "Europe/Paris",
    },
    {
        "city-name": "London",
        "lat": "51.50853",
        "lon": "-0.12574",
        "time-zone": "Europe/London",
    },
]

# Print out the lat, lon for Paris:
output = cities[1]["lat"],cities[1]["lon"]
stp(s,"Printing a key from an object that's in a list",output)

# Format the output (single quotes inside the f-string so it also runs on Python versions before 3.12)
formatted = f"Location for: {cities[1]['city-name']} = lat:{cities[1]['lat']},lon:{cities[1]['lon']}"
stp(s,"Printing a key from an object that's in a list but formatted",formatted)


# Now let's process the three cities with a for loop.
# I cast the "city" to a string so I can print it in my fancy box below.
output = ""
for city in cities:
    output += str(city) +"\n"

stp(s,"Printing each object in a list",output)

# Notice that each object prints out to the screen as a whole object, so let's just print the city-name and time zone

output = ""
for city in cities:
    output += f"{city['city-name']} is in {city['time-zone']}\n"

stp(s,"Printing specific items from each object",output)
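
One thing to notice in the objects above: `lat` and `lon` are stored as strings, not numbers. Before doing anything spatial with them they need to be converted. Here is a minimal sketch, where `parse_coords` is a hypothetical helper of mine that also checks the values are in a sensible range.

```python
def parse_coords(city):
    """Return (lon, lat) as floats, raising ValueError if either is out of range."""
    lat = float(city["lat"])
    lon = float(city["lon"])
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    return lon, lat

paris = {
    "city-name": "Paris",
    "lat": "48.85341",
    "lon": "2.3488",
    "time-zone": "Europe/Paris",
}

lon, lat = parse_coords(paris)
print(lon, lat)   # 2.3488 48.85341
```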



### Processing All of World Cities

In [None]:
# If I want to print all the values from the world_cities_fixed.json file, I can do the following:
from pathlib import Path
import sys
import json

cwd = Path.cwd()

target = cwd.parent / "data" / "world_cities_fixed.json"

exists = target.exists()
if not exists:
    print(f"Error: cannot find file: {target}")
    sys.exit()
    
with open(target) as f:
    data = json.load(f)
    
output = len(data)
stp(s,"Number of Cities",str(output))

output = data[:20]
stp(s,"First 20 Cities",str(output))
    


## Create Files

Remember we are doing the following:

1. A dictionary that organizes each city by time zone
2. A set of GeoJson files where one file has all the cities in a single time zone in a GeoJson feature collection format.

## Create City By Time Zone

This snippet will create a dictionary that stores the cities in a json object where the `time-zone` is the key.

In [2]:
import json

with open("../data/world_cities_fixed.json","r") as f:
    cities = json.load(f)
    
    
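new_city_dict = {}
for city in cities:
    tz = city['time-zone']
    del city['time-zone']            # the time zone becomes the key, so drop it from the city itself
    if tz not in new_city_dict:
        new_city_dict[tz] = []       # first city seen for this time zone
    new_city_dict[tz].append(city)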

with open("../data/world_cities_by_time-zone.json","w") as f:
    json.dump(new_city_dict,f,indent=2)
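
The same grouping can be written with `collections.defaultdict`, which removes the "is the key there yet?" check. This sketch uses a tiny made-up list in place of the real file, and the function name `group_by_time_zone` is my own. Unlike the cell above, it copies each city instead of deleting a key from the original.

```python
from collections import defaultdict

def group_by_time_zone(cities):
    """Group city records by their 'time-zone' value without changing the input."""
    grouped = defaultdict(list)
    for city in cities:
        # Copy every field except the time zone, which becomes the dictionary key
        entry = {k: v for k, v in city.items() if k != "time-zone"}
        grouped[city["time-zone"]].append(entry)
    return dict(grouped)

# Made-up sample standing in for world_cities_fixed.json
sample = [
    {"city-name": "Tokyo", "lat": "35.6895", "lon": "139.69171", "time-zone": "Asia/Tokyo"},
    {"city-name": "Paris", "lat": "48.85341", "lon": "2.3488", "time-zone": "Europe/Paris"},
    {"city-name": "London", "lat": "51.50853", "lon": "-0.12574", "time-zone": "Europe/London"},
]

by_tz = group_by_time_zone(sample)
print(list(by_tz))   # ['Asia/Tokyo', 'Europe/Paris', 'Europe/London']
```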

## Create GeoJson Point Files

- One file per time zone
- Folder will be in `../data/WorldCitiesGeo/....`

In [None]:
## Here we open the file and print the timezone and length of the list it points to

import json

with open("../data/world_cities_by_time-zone.json","r") as f:
    cities = json.load(f)
    
for timezone,citylist in cities.items():
    print(timezone,len(citylist))

Europe/Andorra 10
Asia/Dubai 14
Asia/Kabul 311
America/Antigua 9
America/Anguilla 14
Europe/Tirane 355
Asia/Yerevan 311
Africa/Luanda 38
Antarctica/McMurdo 1
America/Argentina/Buenos_Aires 138
America/Argentina/Cordoba 505
America/Argentina/San_Juan 21
America/Argentina/Salta 135
America/Argentina/Jujuy 24
America/Argentina/Tucuman 17
America/Argentina/Rio_Gallegos 15
America/Argentina/La_Rioja 21
America/Argentina/Mendoza 18
America/Argentina/San_Luis 17
America/Argentina/Ushuaia 3
America/Argentina/Catamarca 56
Pacific/Pago_Pago 14
Europe/Vienna 1691
Australia/Perth 239
Australia/Adelaide 182
Australia/Darwin 16
Australia/Brisbane 239
Australia/Sydney 593
Australia/Melbourne 427
Australia/Hobart 67
Australia/Currie 1
Australia/Broken_Hill 1
America/Aruba 1
Europe/Mariehamn 16
Asia/Baku 173
Europe/Sarajevo 263
America/Barbados 10
Asia/Dhaka 109
Europe/Brussels 546
Africa/Ouagadougou 56
Europe/Sofia 284
Asia/Bahrain 9
Africa/Bujumbura 19
Africa/Porto-Novo 35
America/St_Barthelemy 1
Atl

In [None]:
"""
This cell creates all the geojson files in ../data/WorldCitiesGeo/ 
Notice it uses two data structures:
    geojson = {
        "type": "FeatureCollection", 
        "features": []
    }
  
and

    tempList = []
    
The geojson var holds the overall structure, and tempList holds the point features:

            {
                "type": "Feature",
                "properties": {"name": city["city-name"]},
                "geometry": {
                    "coordinates": [float(city["lon"]), float(city["lat"])],
                    "type": "Point",
                },
            }
After each time zone is finished, the cell writes out its file and starts over with fresh vars.
"""

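import json
import os

# Make sure the output folder exists, otherwise open(..., "w") fails below
os.makedirs("../data/WorldCitiesGeo", exist_ok=True)

for timezone, citylist in cities.items():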
    geojson = {"type": "FeatureCollection", "features": []}
    tempList = []
    print(f"Processing {timezone} ....")
    for city in citylist:
        tempList.append(
            {
                "type": "Feature",
                "properties": {"name": city["city-name"]},
                "geometry": {
                    "coordinates": [float(city["lon"]), float(city["lat"])],
                    "type": "Point",
                },
            }
        )
    geojson['features'] = tempList
    timezone = timezone.replace("/","-")
    with open(f"../data/WorldCitiesGeo/{timezone}.geojson","w") as f:
        json.dump(geojson,f,indent=2)



Processing Europe/Andorra ....
Processing Asia/Dubai ....
Processing Asia/Kabul ....
Processing America/Antigua ....
Processing America/Anguilla ....
Processing Europe/Tirane ....
Processing Asia/Yerevan ....
Processing Africa/Luanda ....
Processing Antarctica/McMurdo ....
Processing America/Argentina/Buenos_Aires ....
Processing America/Argentina/Cordoba ....
Processing America/Argentina/San_Juan ....
Processing America/Argentina/Salta ....
Processing America/Argentina/Jujuy ....
Processing America/Argentina/Tucuman ....
Processing America/Argentina/Rio_Gallegos ....
Processing America/Argentina/La_Rioja ....
Processing America/Argentina/Mendoza ....
Processing America/Argentina/San_Luis ....
Processing America/Argentina/Ushuaia ....
Processing America/Argentina/Catamarca ....
Processing Pacific/Pago_Pago ....
Processing Europe/Vienna ....
Processing Australia/Perth ....
Processing Australia/Adelaide ....
Processing Australia/Darwin ....
Processing Australia/Brisbane ....
Processing A
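
A quick way to check files like these is to read one back and confirm the structure. The sketch below does a round trip in a temporary folder so it doesn't touch `../data/`. The function name `write_time_zone_file` and the one-city sample are made up for this example.

```python
import json
import tempfile
from pathlib import Path

def write_time_zone_file(folder, timezone, citylist):
    """Write one FeatureCollection for a time zone and return the file path."""
    features = [
        {
            "type": "Feature",
            "properties": {"name": city["city-name"]},
            "geometry": {
                "type": "Point",
                "coordinates": [float(city["lon"]), float(city["lat"])],
            },
        }
        for city in citylist
    ]
    # "/" is a path separator, so swap it out before using the zone as a file name
    path = Path(folder) / f"{timezone.replace('/', '-')}.geojson"
    with open(path, "w") as f:
        json.dump({"type": "FeatureCollection", "features": features}, f, indent=2)
    return path

# Made-up sample standing in for one entry of world_cities_by_time-zone.json
sample = [{"city-name": "Tokyo", "lat": "35.6895", "lon": "139.69171"}]

with tempfile.TemporaryDirectory() as tmp:
    out = write_time_zone_file(tmp, "Asia/Tokyo", sample)
    out_name = out.name
    with open(out) as f:
        loaded = json.load(f)      # read the file back in

print(out_name)                                  # Asia-Tokyo.geojson
print(loaded["type"], len(loaded["features"]))   # FeatureCollection 1
```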

## Create GeoJson LineString Files

- Using the following snippet of json, let me create a geojson file that connects all 4 cities with lines as an example.
- I'm viewing my outputs in geojson.io
- I'll do more Folium soon.



In [11]:
from random import choice
from rich import print
import json

colors = ["#ff0000","#00ff00","#0000ff","#ff00ff","#ffff00","#00ffff"]


def makeLineString(start,end):
  # Build one GeoJson LineString feature running from the start city to the end city
  line = {
    "type": "Feature",
    "properties": {
      "from-city":start['city-name'],
      "to-city":end['city-name'],
      # Pick a random color. The list is never emptied, so this works for any
      # number of lines, but colors can repeat.
      "stroke":choice(colors)
      },
    "geometry": {
      "coordinates": [
        [
          float(start['lon']),
          float(start['lat'])
        ],
        [
          float(end['lon']),
          float(end['lat'])
        ]
      ],
      "type": "LineString"
    }
  }
  return line

cities = [
    {
        "city-name": "Tokyo",
        "lat": "35.6895",
        "lon": "139.69171",
        "time-zone": "Asia/Tokyo",
    },
    {
        "city-name": "Paris",
        "lat": "48.85341",
        "lon": "2.3488",
        "time-zone": "Europe/Paris",
    },
    {
        "city-name": "London",
        "lat": "51.50853",
        "lon": "-0.12574",
        "time-zone": "Europe/London",
    },
    {
      "city-name": "New York City",
      "lat": "40.71427",
      "lon": "-74.00597",
      "time-zone":"America/New_York"
    },
]

geojson = {"type": "FeatureCollection", "features": []}
lines = []
for i in range(len(cities)):
  # (i+1) % len(cities) wraps around, so the last city connects back to the first
  line = makeLineString(cities[i],cities[(i+1)%len(cities)])
  lines.append(line)

geojson["features"] = lines
print(geojson)
with open("../data/lineStrings.geojson","w") as f:
  json.dump(geojson,f,indent=2)
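
The cell above makes one two-point LineString per pair of cities. A LineString can also hold more than two positions, so another option is a single feature whose coordinates visit every city in order. A minimal sketch, where `make_path` and the `stops` property are names I made up:

```python
import json

def make_path(cities):
    """Build one LineString feature that passes through every city in list order."""
    return {
        "type": "Feature",
        "properties": {"stops": [c["city-name"] for c in cities]},
        "geometry": {
            "type": "LineString",
            # One [lon, lat] position per city
            "coordinates": [[float(c["lon"]), float(c["lat"])] for c in cities],
        },
    }

route = [
    {"city-name": "Tokyo", "lat": "35.6895", "lon": "139.69171"},
    {"city-name": "Paris", "lat": "48.85341", "lon": "2.3488"},
    {"city-name": "London", "lat": "51.50853", "lon": "-0.12574"},
]

path_geojson = {"type": "FeatureCollection", "features": [make_path(route)]}
print(json.dumps(path_geojson, indent=2))
```

Separate segments let each leg carry its own properties, such as a stroke color. A single path is more compact when the legs don't need to differ.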
  
  
  




