# Open Legal Data - Interactive Map

Open Legal Data is an open data project that aims to make legal data more available to the public. It tackles the fact that most of the information produced by courts in Germany, and in many other countries, isn't accessible or published in a structured format.

The project offers an API through which users can retrieve data on many court decisions in Germany.

Links to the data, the API and the project can be found here:

1) GitHub repository: https://github.com/openlegaldata

2) Project's website: http://openlegaldata.io/

3) API: https://de.openlegaldata.io/


# Notebook's Goal

In this notebook, we'll create an interactive map of Germany that lets us visualize the differences between the German federal states and retrieve separate data for each court.

# Upload Problems

Because GitHub only allows files up to 25 MB to be uploaded, I can't upload all maps in one notebook. I'll split the notebooks while I work on a solution.

### Importing Libraries

In [1]:
import pandas as pd
import geopandas as gpd
import googlemaps
import folium
import json
import altair as alt

In [2]:
cases = pd.read_csv(r"C:\Users\celio\Data Analysis\Projects\Open Legal Data\merged.csv",
                   index_col = "Unnamed: 0")
cases.head()

Unnamed: 0,id,slug,file_number,date,created_date,updated_date,type,ecli,court_id,name,slug.1,city,state,jurisdiction,level_of_appeal
0,328393,bgh-2020-05-07-ix-zb-5619,IX ZB 56/19,2020-05-07,2020-05-29T10:00:15Z,2020-05-29T10:07:14Z,Beschluss,ECLI:DE:BGH:2020:070520BIXZB56.19.0,4,Bundesgerichtshof,bgh,Karlsruhe,Baden-Württemberg,Ordentliche Gerichtsbarkeit,Bundesgericht
1,328192,bverwg-2020-04-22-2-b-5219,2 B 52/19,2020-04-22,2020-05-21T10:00:05Z,2020-05-21T10:06:18Z,Beschluss,ECLI:DE:BVerwG:2020:220420B2B52.19.0,5,Bundesverwaltungsgericht,bverwg,Leipzig,Sachsen,Verwaltungsgerichtsbarkeit,Bundesgericht
2,328242,bgh-2020-04-21-ii-zr-5618,II ZR 56/18,2020-04-21,2020-05-23T10:00:15Z,2020-05-23T10:07:16Z,Urteil,ECLI:DE:BGH:2020:210420UIIZR56.18.0,4,Bundesgerichtshof,bgh,Karlsruhe,Baden-Württemberg,Ordentliche Gerichtsbarkeit,Bundesgericht
3,327286,bverfg-2020-03-25-2-bvr-11320,2 BvR 113/20,2020-03-25,2020-04-17T10:00:22Z,2020-04-17T10:06:52Z,Nichtannahmebeschluss,ECLI:DE:BVerfG:2020:rk20200325.2bvr011320,3,Bundesverfassungsgericht,bverfg,Karlsruhe,Baden-Württemberg,Verfassungsgerichtsbarkeit,Bundesgericht
4,327121,bverfg-2020-03-18-1-bvr-33720,1 BvR 337/20,2020-03-18,2020-04-09T10:00:18Z,2020-04-09T10:08:59Z,Nichtannahmebeschluss,ECLI:DE:BVerfG:2020:rk20200318.1bvr033720,3,Bundesverfassungsgericht,bverfg,Karlsruhe,Baden-Württemberg,Verfassungsgerichtsbarkeit,Bundesgericht


## Complementing the Data With Coordinates

The first step in plotting the map is to find the courts' locations. Only then will it be possible to place them on a geographic representation.

To accomplish this task, we'll use the [Google Maps Geocoding API](https://developers.google.com/maps/documentation/geocoding/start) through the `googlemaps` library.

Using the API requires an API key, as described in the documentation linked above.

In [3]:
# Opens the file containing my API key
with open(r"C:\Users\celio\Data Analysis\Projects\Open Legal Data\maps_api_key.txt") as file: 
    key = file.read()
gmaps = googlemaps.Client(key=key)
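
A hard-coded path ties the notebook to one machine. As a minimal sketch, the key could instead be read from an environment variable, with a file as a fallback. The variable name `GOOGLE_MAPS_API_KEY` and the helper `load_api_key` are assumptions made for illustration; they are not part of the original notebook.

```python
import os
from pathlib import Path


def load_api_key(env_var="GOOGLE_MAPS_API_KEY", fallback_file=None):
    """Return the API key from an environment variable or, failing that, a file.

    Both the variable name and this helper are hypothetical.
    """
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    if fallback_file is not None:
        # strip() removes the trailing newline that editors often add
        return Path(fallback_file).read_text(encoding="utf-8").strip()
    raise RuntimeError(f"No API key found in {env_var} or in a fallback file")
```

With this helper, the client could be created as `googlemaps.Client(key=load_api_key())`.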

### Getting Coordinates

We'll write a function that retrieves the coordinates through the API.

In [4]:
def get_coordinates(a_list):
    
    coordinates = {"court_name":[],"latitude":[],"longitude":[]} # Dictionary that collects the results

    for name in a_list: # Loops through the elements of the list
        coordinates["court_name"].append(name) # Appends the name to the list
        c = gmaps.geocode(name) # Retrieves the geocoding results for the name
        if not c: # If the API does not find the address, it returns an empty list
            lat = pd.NA # In that case, store the pandas null value instead
            lon = pd.NA
        else:
            lat = c[0]["geometry"]["location"].get("lat") # Takes the first result
            lon = c[0]["geometry"]["location"].get("lng")
        coordinates["latitude"].append(lat)
        coordinates["longitude"].append(lon)
        
    return coordinates # Returns the dictionary

In [5]:
coor = get_coordinates(cases["name"].unique()) # Passes the unique court names from the cases DataFrame to the function
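
The lookup in `get_coordinates` depends on the shape of the geocoding response: a list of results, each carrying a `geometry.location` entry with `lat` and `lng`. Below is a minimal sketch of that extraction step, isolated so it can be checked without calling the API. The helper `extract_lat_lng` and the sample response are hypothetical, and the sample only mimics the fields the notebook actually reads.

```python
def extract_lat_lng(results):
    """Return (lat, lng) from a geocoding result list, or (None, None) if empty."""
    if not results:
        return None, None
    location = results[0]["geometry"]["location"]
    return location.get("lat"), location.get("lng")


# Hand-written stand-in for a response; real responses contain more fields.
sample = [{"geometry": {"location": {"lat": 49.0061, "lng": 8.39666}}}]

print(extract_lat_lng(sample))  # coordinates of the first result
print(extract_lat_lng([]))      # no match found
```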

Checking results

In [6]:
coor_df = pd.DataFrame(coor)
coor_df[coor_df["latitude"].isnull()]

Unnamed: 0,court_name,latitude,longitude
139,Schleswig-Holsteinisches Landesverfassungsgericht,,


We can see that the function managed to retrieve coordinates for almost every court. However, the "Schleswig-Holsteinisches Landesverfassungsgericht" could not be found, because the Google Maps search returns no direct result for the court's name.

A [quick Google search](https://www.schleswig-holstein.de/DE/Justiz/LVG/Kontakt/kontakt_node.html;jsessionid=A3A17709F621288091A2C08C43960D38.delivery2-replication), however, shows that the court is located at Brockdorff-Rantzau-Straße 13 in Schleswig, which allows us to look up the coordinates manually.

Below, we input the right coordinates for the court. 

In [7]:
coor_df.loc[coor_df["court_name"]== "Schleswig-Holsteinisches Landesverfassungsgericht","latitude"] = 54.50546
coor_df.loc[coor_df["court_name"]== "Schleswig-Holsteinisches Landesverfassungsgericht", "longitude"] = 9.5326413

We can verify the changes below:

In [8]:
coor_df.isnull().sum()

court_name    0
latitude      0
longitude     0
dtype: int64

We can also save the coordinate data in a separate file.

In [9]:
coor_df.to_csv(r"C:\Users\celio\Data Analysis\Projects\Open Legal Data\coordinates.csv")

### Merging The DataFrames

At this point, we have a main DataFrame containing the data on the cases and courts, and another one containing the coordinates of each court. We can merge them to have everything in one place.

In [10]:
coor_df.head()

Unnamed: 0,court_name,latitude,longitude
0,Bundesgerichtshof,49.0061,8.39666
1,Bundesverwaltungsgericht,51.3332,12.3701
2,Bundesverfassungsgericht,49.0131,8.402
3,Bundesarbeitsgericht,50.9775,11.0142
4,Bundesfinanzhof,48.1492,11.6057


In [11]:
cases = cases.merge(right=coor_df,how = "left",left_on="name",right_on="court_name")
cases.head()

Unnamed: 0,id,slug,file_number,date,created_date,updated_date,type,ecli,court_id,name,slug.1,city,state,jurisdiction,level_of_appeal,court_name,latitude,longitude
0,328393,bgh-2020-05-07-ix-zb-5619,IX ZB 56/19,2020-05-07,2020-05-29T10:00:15Z,2020-05-29T10:07:14Z,Beschluss,ECLI:DE:BGH:2020:070520BIXZB56.19.0,4,Bundesgerichtshof,bgh,Karlsruhe,Baden-Württemberg,Ordentliche Gerichtsbarkeit,Bundesgericht,Bundesgerichtshof,49.0061,8.39666
1,328192,bverwg-2020-04-22-2-b-5219,2 B 52/19,2020-04-22,2020-05-21T10:00:05Z,2020-05-21T10:06:18Z,Beschluss,ECLI:DE:BVerwG:2020:220420B2B52.19.0,5,Bundesverwaltungsgericht,bverwg,Leipzig,Sachsen,Verwaltungsgerichtsbarkeit,Bundesgericht,Bundesverwaltungsgericht,51.3332,12.3701
2,328242,bgh-2020-04-21-ii-zr-5618,II ZR 56/18,2020-04-21,2020-05-23T10:00:15Z,2020-05-23T10:07:16Z,Urteil,ECLI:DE:BGH:2020:210420UIIZR56.18.0,4,Bundesgerichtshof,bgh,Karlsruhe,Baden-Württemberg,Ordentliche Gerichtsbarkeit,Bundesgericht,Bundesgerichtshof,49.0061,8.39666
3,327286,bverfg-2020-03-25-2-bvr-11320,2 BvR 113/20,2020-03-25,2020-04-17T10:00:22Z,2020-04-17T10:06:52Z,Nichtannahmebeschluss,ECLI:DE:BVerfG:2020:rk20200325.2bvr011320,3,Bundesverfassungsgericht,bverfg,Karlsruhe,Baden-Württemberg,Verfassungsgerichtsbarkeit,Bundesgericht,Bundesverfassungsgericht,49.0131,8.402
4,327121,bverfg-2020-03-18-1-bvr-33720,1 BvR 337/20,2020-03-18,2020-04-09T10:00:18Z,2020-04-09T10:08:59Z,Nichtannahmebeschluss,ECLI:DE:BVerfG:2020:rk20200318.1bvr033720,3,Bundesverfassungsgericht,bverfg,Karlsruhe,Baden-Württemberg,Verfassungsgerichtsbarkeit,Bundesgericht,Bundesverfassungsgericht,49.0131,8.402
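
A left merge silently leaves missing coordinates for courts without a match, and a court name appearing twice in the right-hand table would multiply rows. As a minimal sketch, the `validate` and `indicator` options of `merge` can guard against both. The two small frames below are made-up stand-ins for `cases` and `coor_df`.

```python
import pandas as pd

# Made-up stand-ins for `cases` and `coor_df`
cases_demo = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["Bundesgerichtshof", "Bundesgerichtshof", "Bundesfinanzhof"],
})
coor_demo = pd.DataFrame({
    "court_name": ["Bundesgerichtshof"],
    "latitude": [49.0061],
    "longitude": [8.39666],
})

merged = cases_demo.merge(
    coor_demo, how="left", left_on="name", right_on="court_name",
    validate="many_to_one",  # raises MergeError if a court appears twice on the right
    indicator=True,          # adds a "_merge" column: "both" or "left_only"
)

# Courts that found no coordinates in the right-hand table
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].unique()
print(list(unmatched))
```

In this toy example, only `Bundesfinanzhof` ends up in `unmatched`, because it has no row in `coor_demo`.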


# Plotting the Map

Now we have all the data we need to plot the map. To create the map, we will use the folium library.

In [12]:
ger = folium.Map(location = [51.283447, 10.352765], zoom_start = 6)

Now that we have instantiated a map in folium, we can plot the data on it.

The plotting consists of adding markers to the map. The markers are organized by jurisdiction, and every jurisdiction has a different color. A layer control makes it possible to display only the selected jurisdictions.

There are many ways to do this; the tidiest way I came up with uses dictionaries.

In [13]:
charts = dict.fromkeys(cases["court_name"].unique(),"") # Creates a dictionary keyed by the unique court names

for key in charts: # Loops through the keys of the charts dictionary, each one representing a unique court name
    
    freq_count = cases[cases["name"]==key]["type"].value_counts().reset_index() # creates freq_table
    freq_count.columns = ["Case Type","Number of Cases"] # Renames the columns (the chart encodings refer to these names)
    c = alt.Chart(freq_count).mark_bar().encode(
        x= alt.X("Number of Cases",axis = alt.Axis(title = "Number of Cases",titleFontSize=12,
                                                   labelFontSize=12)),
        y= alt.Y("Case Type",sort = "-x"),
        color = "Case Type")
    t = c.mark_text(align="left",
                   baseline="middle",
                   dx=3).encode(text="Number of Cases")
    # Creates an Altair Chart based on the Frequency Count of the Case Types
    charts[key] = (c+t) # Sets the chart as a value to the dictionary's key
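
The frequency table behind each chart is simply a count of decision types per court, ordered from most to least common. For readers less familiar with `value_counts`, the same idea can be sketched with the standard library. The helper name `type_frequencies` and the sample rows are illustrative assumptions.

```python
from collections import Counter


def type_frequencies(rows, court):
    """Count decision types for one court, most common first.

    `rows` is an iterable of (court_name, decision_type) pairs.
    """
    counts = Counter(dtype for name, dtype in rows if name == court)
    return counts.most_common()


# Illustrative rows only, not taken from the full data set
rows = [
    ("Bundesgerichtshof", "Beschluss"),
    ("Bundesgerichtshof", "Urteil"),
    ("Bundesgerichtshof", "Beschluss"),
    ("Bundesverfassungsgericht", "Nichtannahmebeschluss"),
]
print(type_frequencies(rows, "Bundesgerichtshof"))
```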

In [14]:
coordinates = dict.fromkeys(cases["name"].unique(),"") # Instantiates a dictionary to store the coordinates

for c in coordinates: 
    # Selects coordinates and sets them as values for the corresponding dictionary key
    coordinates[c] = cases[cases["name"]==c][["latitude","longitude"]].head(1).values[0]

In [15]:
jurisdictions = dict.fromkeys(cases["jurisdiction"].unique(),"") # Dictionary to store DataFrame Slices

for j in jurisdictions:
    jurisdictions[j] = cases[cases["jurisdiction"]==j] # Slices the DataFrame according to unique jurisdiction

In [16]:
def create_markers(keyword,color):
    # Adds one marker per court of the given jurisdiction to the map `ger`,
    # reusing the `jurisdictions`, `coordinates` and `charts` dictionaries
    fg = folium.FeatureGroup(name = keyword).add_to(ger)
    for key in jurisdictions[keyword]["name"].unique():
        marker = folium.Marker(
            location = coordinates[key],
            icon = folium.Icon(color = color),
            tooltip = key,
            popup = folium.Popup(max_width=500).add_child(
                    folium.VegaLite(data = charts[key],
                                   width =450,
                                   height = "80%"))
                          ).add_to(fg)

# Map With Court Locations

The map is plotted below. In it, every color represents a different jurisdiction. Once rendered, the map might look a bit cluttered, but the layer control in the top right corner can be used to filter which jurisdictions are shown.

In addition, every marker displays a tooltip when the mouse hovers over it. Clicking on a marker displays a chart with some statistics for the court.

In [17]:
color = ["red","darkblue","green","lightblue","purple","orange"] # List of colors, one per jurisdiction
                                                                # This will help visualization
count = 0 # Index into the list of colors

for key in jurisdictions: # Loops through the keys in the jurisdictions dictionary
    
    feature_group = folium.FeatureGroup(name = key).add_to(ger) # Instantiates a FeatureGroup for each unique jurisdiction name
    unique_court_names = jurisdictions[key]["name"].unique() # Gets unique court names for given jurisdiction
    
    for name in unique_court_names: # Loops through the court names
        # Instantiates a marker
        marker = folium.Marker(
            location = coordinates[name], # fetches coordinates from the coordinates dictionary
            icon = folium.Icon(color = color[count]), # sets color according to current count value
            tooltip = name, # Sets tooltip to be the court's name
            popup = folium.Popup(max_width=600).add_child( # popup = folium Popup object
                folium.VegaLite(data = charts[name], #Popup object = VegaLite object, 
                                height = 250,        # Data is fetched from the charts dictionary
                                width = "80%"))
                              ).add_to(feature_group) # Adds marker to the map
        
    count+=1 # Advances count value by 1. This will change the color for the next jurisdiction
    
folium.LayerControl().add_to(ger) # Adds the layer control to the map

ger # Displays the map
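
The loop above indexes the color list with a running counter, which would raise an `IndexError` if the data contained more jurisdictions than colors. Below is a minimal sketch of one way around this, pairing each jurisdiction with a color via `itertools.cycle`. The helper `assign_colors` is hypothetical, and the shortened palette and the jurisdiction names are only example values.

```python
from itertools import cycle


def assign_colors(names, palette):
    """Map each name to a palette color, reusing colors if names outnumber them."""
    return dict(zip(names, cycle(palette)))


# Example values: a palette deliberately shorter than the list of names
palette = ["red", "darkblue", "green"]
names = [
    "Ordentliche Gerichtsbarkeit",
    "Verwaltungsgerichtsbarkeit",
    "Verfassungsgerichtsbarkeit",
    "Arbeitsgerichtsbarkeit",
]
colors = assign_colors(names, palette)
print(colors["Arbeitsgerichtsbarkeit"])  # the fourth name wraps around to the first color
```

Inside the plotting loop, `folium.Icon(color = colors[key])` could then replace the counter-based lookup. Reused colors make jurisdictions harder to tell apart, so extending the palette remains the better option when that is possible.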

# Temporary Conclusion

The map above offers information on every court. Since we're taking a general approach, we chose to display the most common case types for each court. Depending on the analysis, however, the charts could contain other kinds of information, such as the litigation value or the most common circumstances of the cases.

A more extensive analysis would require data on the content of the decisions, but this is outside the scope of this visualization notebook.

Also, please note that due to the upload problems mentioned at the beginning, I had to split this notebook in two. As soon as I find a way to upload files bigger than 25 MB to GitHub, I'll publish the visualization of the map of Germany as a choropleth.