# Letters from 'Old Sepp' - Joseph von Laßberg as a Hub of Early Medieval Studies
The metadata of the surviving letters offer a first point of access to Laßberg's scholarly network. These letters were collected and catalogued in the 1990s in `Harris, Martin: Joseph Maria Christoph Freiherr von Lassberg 1770-1855. Briefinventar und Prosopographie. Mit einer Abhandlung zu Lassbergs Entwicklung zum Altertumsforscher. Die erste geschlossene, wissenschaftlich fundierte Würdigung von Lassbergs Wirken und Werk. Beihefte zum Euphorion Heft 25/C. Heidelberg 1991`. The register compiled there was processed with OCR, converted to a CSV file, and enriched with available authority data from the GND and Wikidata. Finally, person and place registers in TEI-XML were generated from these data. The resulting digital registers not only provide the basis for an edition of the letters but also allow a first data-analytic look at the network. The analysis presented here is a work in progress and preliminary in character.
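
The last step of this pipeline, generating a TEI-XML place register from tabular rows, can be sketched roughly as follows. This is a minimal illustration and not the project's actual conversion script: the function `build_list_place`, the dictionary keys, and the `place/location/geo` nesting are assumptions, and the two sample rows are illustrative values only.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace


def build_list_place(rows):
    """Build a TEI <listPlace> element from dicts with id, name, coordinates.

    Hypothetical helper: the key names are assumptions for this sketch.
    """
    list_place = ET.Element(f"{{{TEI_NS}}}listPlace")
    for row in rows:
        place = ET.SubElement(list_place, f"{{{TEI_NS}}}place")
        place.set(f"{{{XML_NS}}}id", row["id"])
        name = ET.SubElement(place, f"{{{TEI_NS}}}placeName")
        name.text = row["name"]
        # Only emit <location><geo> when coordinates are known
        if row.get("coordinates"):
            location = ET.SubElement(place, f"{{{TEI_NS}}}location")
            geo = ET.SubElement(location, f"{{{TEI_NS}}}geo")
            geo.text = row["coordinates"]
    return list_place


# Illustrative rows, not read from the real register
sample_rows = [
    {"id": "lassberg-place-0001", "name": "Aarau", "coordinates": "47.4,8.05"},
    {"id": "lassberg-place-0004", "name": "Altenau", "coordinates": None},
]
tei_fragment = ET.tostring(build_list_place(sample_rows), encoding="unicode")
print(tei_fragment)
```

The sketch uses the standard library's `xml.etree.ElementTree` so that it runs without extra dependencies; the same element structure could be built with `lxml.etree`.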

## Preparations
For analysis and visualization, this notebook relies on the packages `pandas`, `lxml` (for its `etree` module), `matplotlib`, `seaborn`, `IPython`, and `ipywidgets`, which may need to be installed via `pip install`. While the data analysis built on these packages can be followed on GitHub without running the notebook, the map views have to be run on a local [Jupyter Notebook installation](https://jupyter-tutorial.readthedocs.io/de/stable/notebook/install.html). This requires the package `ipyleaflet` to be installed.
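
Before running the notebook, it can help to check which of these packages are already importable. The following is a small sketch using only the standard library; the list `REQUIRED` and the helper `missing_packages` are introduced here for illustration and are not part of the notebook.

```python
from importlib.util import find_spec

# Import names of the packages used in this notebook. The name passed to
# "pip install" can differ in spelling from the import name.
REQUIRED = ["pandas", "lxml", "matplotlib", "seaborn",
            "IPython", "ipywidgets", "ipyleaflet"]


def missing_packages(names):
    """Return the names for which no importable top-level module is found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```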

In [1]:
# Import of used packages
import pandas as pd # for data analysis
from lxml import etree # for xml transformation
import matplotlib.pyplot as plt # for plotting
import seaborn as sns # for pretty plotting
from IPython.display import Markdown, display # for pretty print
from ipyleaflet import AwesomeIcon, Map, Marker, MarkerCluster, Popup # for mapping
from ipywidgets import HTML # for widgets and popups

# Function for markdown formatted outputs
def printmd(string):
    display(Markdown(string))

# Load main data from csv register
df = pd.read_csv('../../data/register/register.csv', delimiter=';')

# Load and parse place register
tree = etree.parse('../../data/register/lassberg-places.xml')
root = tree.getroot()

# Define a list to hold your data
data = []

# Extract information from each <place> element
for place in root.findall('.//{http://www.tei-c.org/ns/1.0}place'):
    place_id = place.get('{http://www.w3.org/XML/1998/namespace}id')
    name_el = place.find('.//{http://www.tei-c.org/ns/1.0}placeName')
    place_name = name_el.text if name_el is not None else None
    geo = place.find('.//{http://www.tei-c.org/ns/1.0}geo')
    coordinates = geo.text if geo is not None else None
    
    # Append this data to the list
    data.append({'place_id': place_id, 'place_name': place_name, 'coordinates': coordinates})

# Convert the list to a DataFrame
places_df = pd.DataFrame(data)

# Splitting the 'coordinates' column into 'latitude' and 'longitude'
places_df[['latitude', 'longitude']] = places_df['coordinates'].str.split(',', expand=True)

# Convert latitude and longitude to float
places_df['latitude'] = pd.to_numeric(places_df['latitude'], errors='coerce')
places_df['longitude'] = pd.to_numeric(places_df['longitude'], errors='coerce')
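
The place register stores coordinates as a single `lat,lon` string, and some entries carry a placeholder instead of a value. As a sketch of how such strings could be validated one at a time, the hypothetical helper `parse_geo` below returns a pair of NaN values for anything that is not a plausible coordinate pair. It is an alternative illustration, not the method used in the cell above.

```python
import math


def parse_geo(text):
    """Parse a 'lat,lon' string into a (lat, lon) tuple of floats.

    Returns (nan, nan) for missing, malformed, or out-of-range values.
    """
    nan_pair = (math.nan, math.nan)
    if not text:
        return nan_pair
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 2:
        return nan_pair
    try:
        lat, lon = float(parts[0]), float(parts[1])
    except ValueError:
        return nan_pair
    # Reject values outside the valid latitude/longitude ranges
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return nan_pair
    return (lat, lon)


print(parse_geo("47.4,8.05"))
print(parse_geo("-"))
```

Applied to a DataFrame column, a function like this could be used with `Series.apply`; the range check additionally catches swapped or corrupted values that a purely numeric conversion would accept.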

## DataFrame Overview

In [2]:
# Overview of main dataframe
printmd(f"Information of letters-df: \n")
print(df.info())
printmd(f"Head of letters-df: \n")
print(df.head())
printmd(f"Information of place-register-df:  \n")
print(places_df.info())
printmd(f"Head of place-register-df: \n")
print(places_df.head())

Information of letters-df: 


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3265 entries, 0 to 3264
Data columns (total 24 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   ID                        3265 non-null   object 
 1   SENT_FROM_NAME            3265 non-null   object 
 2   SENT_FROM_ID              3265 non-null   object 
 3   RECIVED_BY_NAME           3265 non-null   object 
 4   RECIVED_BY_ID             3265 non-null   object 
 5   Absendeort                3265 non-null   object 
 6   Absendeort_id             3265 non-null   object 
 7   Ankunftsort               0 non-null      float64
 8   Ankunftsort_id            0 non-null      float64
 9   Nummer_Harris             3265 non-null   int64  
 10  Journalnummer             1158 non-null   object 
 11  Aufbewahrungsort          3150 non-null   object 
 12  Aufbewahrungsinstitution  3150 non-null   object 
 13  Signatur                  308 non-null    object 
 14  iiif_man

Head of letters-df: 


                     ID          SENT_FROM_NAME                 SENT_FROM_ID  \
0  lassberg-letter-0783      Joseph von Laßberg  lassberg-correspondent-0373   
1  lassberg-letter-0788      Joseph von Laßberg  lassberg-correspondent-0373   
2  lassberg-letter-0815      Joseph von Laßberg  lassberg-correspondent-0373   
3  lassberg-letter-0883      Joseph von Laßberg  lassberg-correspondent-0373   
4  lassberg-letter-0884  Johann Heinrich Tanner  lassberg-correspondent-0201   

          RECIVED_BY_NAME                RECIVED_BY_ID    Absendeort  \
0  Johann Heinrich Tanner  lassberg-correspondent-0201   Eppishausen   
1  Johann Heinrich Tanner  lassberg-correspondent-0201  Heiligenberg   
2  Johann Heinrich Tanner  lassberg-correspondent-0201   Eppishausen   
3  Johann Heinrich Tanner  lassberg-correspondent-0201   Eppishausen   
4      Joseph von Laßberg  lassberg-correspondent-0373         Aarau   

         Absendeort_id  Ankunftsort  Ankunftsort_id  Nummer_Harris  ...  \
0  lassberg

Information of place-register-df:  


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 207 entries, 0 to 206
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   place_id     207 non-null    object 
 1   place_name   207 non-null    object 
 2   coordinates  204 non-null    object 
 3   latitude     196 non-null    float64
 4   longitude    196 non-null    float64
dtypes: float64(2), object(3)
memory usage: 8.2+ KB
None


Head of place-register-df: 


              place_id   place_name             coordinates   latitude  \
0  lassberg-place-0001        Aarau               47.4,8.05  47.400000   
1  lassberg-place-0002  Ägeri (Zug)      47.136104,8.613887  47.136104   
2  lassberg-place-0003         Alme      51.455833,8.620278  51.455833   
3  lassberg-place-0004      Altenau                       -        NaN   
4  lassberg-place-0005      Altikon  47.56666667,8.78333333  47.566667   

   longitude  
0   8.050000  
1   8.613887  
2   8.620278  
3        NaN  
4   8.783333  


## Data Exploration
### Persons

In [3]:
# Total letters in dataset
total_letters = df.shape[0]
printmd(f"**Total number of letters:** {total_letters}")

# Letters from Lassberg
lassberg_letters = df[df['SENT_FROM_NAME'] == 'Joseph von Laßberg'].shape[0]
printmd(f"**Letters written by Joseph von Laßberg:** {lassberg_letters} ({int(lassberg_letters/total_letters*100)} %)")
printmd(f"**Letters written by others:** {total_letters - lassberg_letters} ({int(100 - (lassberg_letters/total_letters*100))} %)")

# Unique correspondences: distinct IDs among senders and recipients;
# 1 is subtracted below to exclude Laßberg himself
unique_correspondences = pd.concat([df['SENT_FROM_ID'], df['RECIVED_BY_ID']]).drop_duplicates().shape[0]
printmd(f"**Unique correspondences:** {unique_correspondences - 1}")

**Total number of letters:** 3265

**Letters written by Joseph von Laßberg:** 1565 (47 %)

**Letters written by others:** 1700 (52 %)

**Unique correspondences:** 372
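
The two percentages above add up to 99 rather than 100 because `int()` truncates instead of rounding. A short worked example with the counts reported above shows the difference; the variable names `share_lassberg` and `share_others` are introduced here for illustration.

```python
total_letters = 3265
lassberg_letters = 1565
other_letters = total_letters - lassberg_letters

share_lassberg = lassberg_letters / total_letters * 100
share_others = other_letters / total_letters * 100

# int() cuts off the decimals, round() rounds to the nearest integer
print(int(share_lassberg), int(share_others))      # 47 52
print(round(share_lassberg), round(share_others))  # 48 52
print(f"{share_lassberg:.1f} % / {share_others:.1f} %")  # 47.9 % / 52.1 %
```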

In [4]:
# Top 20 correspondents, differentiated by sent and received letters
# Count letters
from_counts = df['SENT_FROM_NAME'].value_counts()
to_counts = df['RECIVED_BY_NAME'].value_counts()

# Combining counts and sorting
total_counts = from_counts.add(to_counts, fill_value=0).sort_values(ascending=False)

# Get top 20 participants
top_20_participants = total_counts.head(20)

# Display 'from', 'to', and total counts for top 20 participants
printmd("**Top 20 participants in correspondence:**\n")
for participant in top_20_participants.index:
    from_count = from_counts.get(participant, 0)
    to_count = to_counts.get(participant, 0)
    total_count = top_20_participants[participant]
    printmd(f"**{participant}** *{from_count}* sent, *{to_count}* recieved, *total: {int(total_count)}*")
    
# Get Median
median = total_counts.median()
printmd(f"**Median number of letters per correspondence: {int(median)}**")

**Top 20 participants in correspondence:**


**Joseph von Laßberg** *1565* sent, *1700* recieved, *total: 3265*

**Elisabeth zu Fürstenberg** *25* sent, *165* recieved, *total: 190*

**Hermann von Liebenau** *6* sent, *137* recieved, *total: 143*

**Johann Adam Pupikofer** *96* sent, *47* recieved, *total: 143*

**Johann Caspar Zellweger** *87* sent, *55* recieved, *total: 142*

**Ludwig Uhland** *50* sent, *70* recieved, *total: 120*

**Christian Wilhelm Sixt von Armin** *85* sent, *2* recieved, *total: 87*

**Gustav Schwab** *44* sent, *42* recieved, *total: 86*

**Jacob Grimm** *33* sent, *47* recieved, *total: 80*

**Bernhard Zeerleder** *75* sent, *0* recieved, *total: 75*

**Ildefons von Arx** *25* sent, *43* recieved, *total: 68*

**Melchior Kirchhofer** *61* sent, *0* recieved, *total: 61*

**Egbert Friedrich von Mülinen** *31* sent, *29* recieved, *total: 60*

**Carl Johann Greith** *38* sent, *17* recieved, *total: 55*

**Philipp Lüthard** *14* sent, *39* recieved, *total: 53*

**Wilhelm Wackernagel** *27* sent, *20* recieved, *total: 47*

**Ludwig von Madroux** *46* sent, *1* recieved, *total: 47*

**Jenny von Droste zu Hülshoff** *46* sent, *0* recieved, *total: 46*

**Johann Leonhard Hug** *1* sent, *44* recieved, *total: 45*

**Karl Egon III. zu Fürstenberg** *7* sent, *38* recieved, *total: 45*

**Median number of letters per correspondence: 2**
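
The counting logic of this cell (sent plus received letters per participant, then the median) can be illustrated without pandas. The sketch below uses invented toy data with placeholder names, not values from the register, and additionally shows the median with the central figure excluded, since every letter in a hub-centred corpus also counts towards the hub.

```python
from collections import Counter
from statistics import median

# Toy (sender, recipient) pairs; names are placeholders, not real data
letters = [
    ("Hub", "A"),
    ("A", "Hub"),
    ("Hub", "B"),
    ("C", "Hub"),
    ("Hub", "A"),
]

sent = Counter(sender for sender, _ in letters)
received = Counter(recipient for _, recipient in letters)
totals = sent + received  # letters sent plus letters received per participant

# Exclude the central figure before computing the median
partner_totals = {name: n for name, n in totals.items() if name != "Hub"}

print(totals.most_common())
print("Median including hub:", median(totals.values()))
print("Median excluding hub:", median(partner_totals.values()))
```

With several hundred participants, as in the register, including or excluding one person is unlikely to shift the median much; the distinction matters mainly for small subsets.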

In [5]:
# Top 20 correspondents ordered by sent letters

# Get top 20 participants
top_20_participants = from_counts.head(20)

# Display 'from', 'to', and total counts for top 20 participants
printmd("**Top 20 participants in correspondence ordered by number of sent letters:**\n")
for participant in top_20_participants.index:
    from_count = from_counts.get(participant, 0)
    to_count = to_counts.get(participant, 0)
    total_count = top_20_participants[participant]
    printmd(f"**{participant}** *{from_count}* sent, *{to_count}* recieved, *{total_count}*")

**Top 20 participants in correspondence ordered by number of sent letters:**


**Joseph von Laßberg** *1565* sent, *1700* recieved, *1565*

**Johann Adam Pupikofer** *96* sent, *47* recieved, *96*

**Johann Caspar Zellweger** *87* sent, *55* recieved, *87*

**Christian Wilhelm Sixt von Armin** *85* sent, *2* recieved, *85*

**Bernhard Zeerleder** *75* sent, *0* recieved, *75*

**Melchior Kirchhofer** *61* sent, *0* recieved, *61*

**Ludwig Uhland** *50* sent, *70* recieved, *50*

**Ludwig von Madroux** *46* sent, *1* recieved, *46*

**Jenny von Droste zu Hülshoff** *46* sent, *0* recieved, *46*

**Gustav Schwab** *44* sent, *42* recieved, *44*

**Carl Johann Greith** *38* sent, *17* recieved, *38*

**Jacob Grimm** *33* sent, *47* recieved, *33*

**Egbert Friedrich von Mülinen** *31* sent, *29* recieved, *31*

**Wilhelm Wackernagel** *27* sent, *20* recieved, *27*

**Franz Karl Grieshaber** *26* sent, *8* recieved, *26*

**Ildefons von Arx** *25* sent, *43* recieved, *25*

**Elisabeth zu Fürstenberg** *25* sent, *165* recieved, *25*

**Emil Braun** *24* sent, *5* recieved, *24*

**Johann Heinrich Nepomuk Schreiber** *22* sent, *0* recieved, *22*

**Karl Anton Friedrich Meinrad Fidelis von Hohenzollern-Sigmaringen** *22* sent, *2* recieved, *22*

In [6]:
# Top 20 correspondents ordered by received letters

# Get top 20 participants
top_20_participants = to_counts.head(20)

# Display 'from', 'to', and total counts for top 20 participants
printmd("**Top 20 participants in correspondence ordered by received letters:**\n")
for participant in top_20_participants.index:
    from_count = from_counts.get(participant, 0)
    to_count = to_counts.get(participant, 0)
    total_count = top_20_participants[participant]
    printmd(f"**{participant}** *{from_count}* sent, *{to_count}* recieved, *{total_count}*")

**Top 20 participants in correspondence ordered by received letters:**


**Joseph von Laßberg** *1565* sent, *1700* recieved, *1700*

**Elisabeth zu Fürstenberg** *25* sent, *165* recieved, *165*

**Hermann von Liebenau** *6* sent, *137* recieved, *137*

**Ludwig Uhland** *50* sent, *70* recieved, *70*

**Johann Caspar Zellweger** *87* sent, *55* recieved, *55*

**Jacob Grimm** *33* sent, *47* recieved, *47*

**Johann Adam Pupikofer** *96* sent, *47* recieved, *47*

**Johann Leonhard Hug** *1* sent, *44* recieved, *44*

**Ildefons von Arx** *25* sent, *43* recieved, *43*

**Gustav Schwab** *44* sent, *42* recieved, *42*

**Philipp Lüthard** *14* sent, *39* recieved, *39*

**Karl Egon III. zu Fürstenberg** *7* sent, *38* recieved, *38*

**Friedrich Carl von und zu Brenken** *1* sent, *35* recieved, *35*

**Unbekannt** *9* sent, *35* recieved, *35*

**Johann Daniel Wilhelm Hartmann** *3* sent, *31* recieved, *31*

**Egbert Friedrich von Mülinen** *31* sent, *29* recieved, *29*

**Joseph Thaddäus von Reischach** *0* sent, *28* recieved, *28*

**Johann Rudolf Wyss** *18* sent, *21* recieved, *21*

**Johann Heinrich Tanner** *21* sent, *21* recieved, *21*

**Franz Simon von Pfaffenhofen** *9* sent, *20* recieved, *20*

### Places

In [7]:
# Count letters per place of dispatch
sent_from_counts = df['Absendeort'].value_counts()

# Unique places
unique_places = df[['Absendeort']].drop_duplicates().shape[0]
printmd(f"**Unique places:** {unique_places}")

# Display the result
print(sent_from_counts.head(50))

**Unique places:** 172

Absendeort
Eppishausen                        628
Meersburg                          487
Unbekannt                          322
Donaueschingen                     158
Stuttgart                          115
Heiligenberg                       114
Bischofszell                        85
Trogen (Appenzell Ausserrhoden)     84
Konstanz                            77
Bern                                69
Tübingen                            68
St. Gallen                          67
Zürich                              65
Stein                               61
Steinegg                            48
München                             38
Freiburg i. Brsg                    36
Aarau                               36
Koblenz                             35
Göttingen                           28
Sigmaringen                         27
Basel                               26
Rastatt                             26
Rosenberg                           25
Baden (Aargau)                      23
Berlin        

In [8]:
# Create interactive map (not displayed on Github)
# merge place register with dataset
df_for_map = pd.merge(df, places_df, left_on='Absendeort_id', right_on='place_id', how='left')

# latitude and longitude were already converted to numeric values
# when places_df was built in the first cell

# Remove rows with missing or invalid coordinates
valid_locations = df_for_map.dropna(subset=['latitude', 'longitude'])

# Create a Map instance
m = Map(center=(50, 10), zoom=4)  # Adjust the center and zoom level

# Create different icons for sent and received letters
icon_sent_from_by_lassberg = AwesomeIcon(
    name = 'fa-paper-plane',
    marker_color='red',
    icon_color='black',
    spin=False
)
icon_sent_from_to_lassberg = AwesomeIcon(
    name = 'fa-paper-plane',
    marker_color='blue',
    icon_color='black',
    spin=False
)

# Create markers and add them to a MarkerCluster
markers = []
for _, row in valid_locations.iterrows():
    message_popup = HTML()
    message_popup.value = f"Letter from {row['SENT_FROM_NAME']} to {row['RECIVED_BY_NAME']} dated {row['Datum']}, Harris: {row['Nummer_Harris']}, ID: {row['ID']}"
    
    # Red icon for letters sent by Laßberg, blue icon for letters sent to him
    if row['SENT_FROM_ID'] == 'lassberg-correspondent-0373':
        marker = Marker(icon=icon_sent_from_by_lassberg, location=(row['latitude'], row['longitude']))
    else:
        marker = Marker(icon=icon_sent_from_to_lassberg, location=(row['latitude'], row['longitude']))
    marker.popup = message_popup
    markers.append(marker)

marker_cluster = MarkerCluster(markers=markers)
m.add_layer(marker_cluster)

# Display the map
printmd("Map of the places of dispatch (blue = letter to Laßberg, red = letter from Laßberg): \n")
m

Map of the places of dispatch (blue = letter to Laßberg, red = letter from Laßberg): 


Map(center=[50, 10], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_out_tex…
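
Because rows without valid coordinates are dropped before plotting, the map shows only part of the corpus. A rough way to quantify this is sketched below with the standard library; the helper `map_coverage`, the toy list of place IDs, and the ID `lassberg-place-9999` are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter


def map_coverage(letter_place_ids, coords_by_place):
    """Return (number of mappable letters, Counter of unmappable place IDs)."""
    mapped = 0
    unmapped = Counter()
    for place_id in letter_place_ids:
        if coords_by_place.get(place_id) is not None:
            mapped += 1
        else:
            unmapped[place_id] += 1
    return mapped, unmapped


# Toy lookup: one place with coordinates, one without
coords_by_place = {
    "lassberg-place-0001": (47.4, 8.05),
    "lassberg-place-0004": None,
}
# Toy list of sender place IDs, one per letter; the last ID is not in the lookup
letter_place_ids = [
    "lassberg-place-0001",
    "lassberg-place-0004",
    "lassberg-place-0001",
    "lassberg-place-9999",
]

mapped, unmapped = map_coverage(letter_place_ids, coords_by_place)
print(f"{mapped} of {len(letter_place_ids)} letters can be placed on the map")
print("Places without coordinates:", dict(unmapped))
```

In the notebook itself, a comparable figure could be obtained by comparing `len(valid_locations)` with `len(df_for_map)`.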