## Kalliope SRU query and extraction of data on Werner Heisenberg's correspondence
Sources: Code for the SRU query and the data parsing adapted from the DNB SRU Tutorial:

https://github.com/deutsche-nationalbibliothek/dnblab/blob/main/DNB_SRU_Tutorial.ipynb 

Kalliope SRU documentation:
https://kalliope-verbund.info/de/support/sru.html

The goal is to extract the data in a form that can then be processed further for a network analysis.

Link to the Heisenberg Nachlass (papers) in Kalliope: https://kalliope-verbund.info/search.html?q=DE-611-BF-73161

In [58]:
# Import necessary libraries

import requests # https://www.w3schools.com/python/module_requests.asp
from lxml import etree # https://www.w3schools.com/Python/ref_module_xml.asp
import pandas as pd # https://pandas.pydata.org/

In [26]:
# Function to send an SRU request to the Kalliope system with a custom query
def kalliope_sru(query):
    base_url = "https://kalliope-verbund.info/sru"
    params = {
        'version': '1.2',
        'operation': 'searchRetrieve',
        'recordSchema': 'mods37',
        'maximumRecords': '1500',   # raising maximumRecords returns more records
        'query': query
    }
    
    r = requests.get(base_url, params=params)
    r.raise_for_status()  # stop on HTTP errors instead of parsing an error page
    mods_content = r.content
    records_mods = etree.fromstring(mods_content)
    
    return records_mods
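
The function above requests up to 1500 records in a single call. SRU 1.2 also defines a `startRecord` parameter for fetching results page by page. Whether Kalliope caps the number of records per request is not stated here, so the following is only a sketch under that assumption. The helper names `page_starts` and `paged_params` are hypothetical, and no request is actually sent.

```python
def page_starts(total, page_size):
    """Return the 1-based startRecord value of every page of results."""
    return range(1, total + 1, page_size)


def paged_params(query, start, page_size):
    """Build SRU 1.2 parameters for one page (same keys as in kalliope_sru)."""
    return {
        "version": "1.2",
        "operation": "searchRetrieve",
        "recordSchema": "mods37",
        "startRecord": str(start),        # SRU counts records from 1
        "maximumRecords": str(page_size),
        "query": query,
    }


# Example: 328 hits fetched in pages of 100 records
starts = list(page_starts(328, 100))
print(starts)
```

Each dictionary returned by `paged_params` could be passed as `params` to `requests.get`, and the records of all pages collected into one list before parsing.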


Search/Retrieve Web Service https://www.loc.gov/standards/sru/companionSpecs/srw.html



In [57]:
# Function to parse a MODS record (SRU response XML) and extract relevant fields
def parse_mods(record):
    ns = {
        'srw': 'http://www.loc.gov/zing/srw/',  # SRW namespace - update link? see https://www.loc.gov/standards/sru/companionSpecs/srw.html
        'mods': 'http://www.loc.gov/mods/v3'    # MODS namespace
    }
    
    # Extract RecordID
    recordIdentifier = record.xpath(".//mods:mods/mods:recordInfo/mods:recordIdentifier", namespaces=ns)
    recordIdentifier = recordIdentifier[0].text if recordIdentifier else "unknown"
    
    # Extract Title
    title = record.xpath(".//mods:mods/mods:titleInfo/mods:title", namespaces=ns)
    title = title[0].text if title else "unknown"
    
    # Extract Date
    date = record.xpath(".//mods:mods/mods:originInfo/mods:dateCreated", namespaces=ns)
    date = date[0].text if date else "unknown"
    
    # Extract Names and Roles
    names = record.xpath(".//mods:mods/mods:name", namespaces=ns)
    senders = []
    receivers = []
    mentioned = []
    
    for name in names:
        name_text = name.xpath(".//mods:namePart/text()", namespaces=ns)
        role_text = name.xpath(".//mods:role/mods:roleTerm[@type='text']/text()", namespaces=ns)
        
        if name_text and role_text:
            name_value = name_text[0]
            role_value = role_text[0].lower()
            
            # Categorize based on role
            if "verfasser" in role_value:  # Adjust to match actual role values
                senders.append(name_value)
            elif "adressat" in role_value:
                receivers.append(name_value)
            elif "erwähnt" in role_value:
                mentioned.append(name_value)
    
    # Extract Genre
    genre_letter = record.xpath(".//mods:mods/mods:genre[text()='Brief']", namespaces=ns)
    genre_letter = genre_letter[0].text if genre_letter else "unknown"
    
    # Return a dictionary to build the DataFrame
    return {
        "recordIdentifier": recordIdentifier,
        "title": title,
        "date": date,
        "senders": " /".join(senders),    # Combine names into a single string
        "receivers": "/ ".join(receivers),
        "mentioned": "/ ".join(mentioned),
        "genre": genre_letter
    }
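
To make the role-based sorting in `parse_mods` easier to follow, here is a minimal sketch that applies the same substring matching to a hand-written MODS fragment. It uses the standard library's `xml.etree.ElementTree` instead of `lxml` so that it runs without extra dependencies. The fragment, the helper name `names_by_role`, and the exact role terms ("Verfasser", "Adressat") are assumptions for illustration; real Kalliope records may be structured differently.

```python
import xml.etree.ElementTree as ET

NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hand-written sample, not an actual Kalliope response
SAMPLE = """
<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:name>
    <mods:namePart>Broglie, Louis de (1892-1987)</mods:namePart>
    <mods:role><mods:roleTerm type="text">Verfasser</mods:roleTerm></mods:role>
  </mods:name>
  <mods:name>
    <mods:namePart>Heisenberg, Werner (1901-1976)</mods:namePart>
    <mods:role><mods:roleTerm type="text">Adressat</mods:roleTerm></mods:role>
  </mods:name>
</mods:mods>
"""

# Substring of the lower-cased role term -> output bucket
ROLE_BUCKETS = {"verfasser": "senders", "adressat": "receivers", "erwähnt": "mentioned"}


def names_by_role(mods_root):
    """Sort the names of one MODS record into senders/receivers/mentioned."""
    buckets = {"senders": [], "receivers": [], "mentioned": []}
    for name in mods_root.findall("mods:name", NS):
        part = name.find("mods:namePart", NS)
        term = name.find("mods:role/mods:roleTerm[@type='text']", NS)
        if part is None or term is None or not part.text or not term.text:
            continue  # skip names without a readable name part or role
        role = term.text.lower()
        for keyword, bucket in ROLE_BUCKETS.items():
            if keyword in role:
                buckets[bucket].append(part.text)
                break
    return buckets


roles = names_by_role(ET.fromstring(SAMPLE))
print(roles)
```

Names whose role term matches none of the keywords are dropped silently, as in `parse_mods`, so it may be worth inspecting which role values actually occur in the data.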


#### Defining the Kalliope SRU interface query
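
The query variants in this section all combine the identifier of the Nachlass with an optional date range. A small helper can assemble such strings and avoids quoting mistakes. `build_query` is a hypothetical name; the index names `ead.archdesc.id`, `ead.unitdate_start` and `ead.unitdate_end` are taken from the queries in this notebook.

```python
def build_query(archdesc_id, start_year=None, end_year=None):
    """Assemble a CQL query for one collection and an optional year range."""
    clauses = [f'ead.archdesc.id="{archdesc_id}"']
    if start_year is not None:
        clauses.append(f"ead.unitdate_start>={start_year}")
    if end_year is not None:
        clauses.append(f"ead.unitdate_end<={end_year}")
    return " AND ".join(clauses)


query_example = build_query("DE-611-BF-73161", 1915, 1935)
print(query_example)
```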



In [59]:
# Define the SRU query string, e.g. to retrieve letters from a specific archive or person

#query = 'ead.archdesc.id="DE-611-BF-73161"'

#query = 'ead.archdesc.id="DE-611-BF-73161" and ead.unitdate_end<1980'

#query = 'ead.archdesc.id="DE-611-BF-73161" AND ead.unitdate_start>=1945 AND ead.unitdate_end<=1950'

query = 'ead.archdesc.id="DE-611-BF-73161" AND ead.unitdate_start>=1915 AND ead.unitdate_end<=1935'

#'ead.archdesc.id'
#query = 'ead.addressee=="Heisenberg"'
records_xml = kalliope_sru(query)

print(f'{len(records_xml.xpath("//srw:record", namespaces={"srw": "http://www.loc.gov/zing/srw/"}))} results found')


328 results found
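
The count printed above is the number of records contained in the response, which is bounded by `maximumRecords`. An SRU 1.2 response also reports the total number of hits in `srw:numberOfRecords`, so the two values can be compared to detect truncation. The sketch below runs on a hand-written response and uses the standard library parser; `hit_counts` is a hypothetical helper, and it is an assumption that the Kalliope response carries this element in the usual place.

```python
import xml.etree.ElementTree as ET

SRW = {"srw": "http://www.loc.gov/zing/srw/"}

# Hand-written sample: the server reports 3 hits but returns only 2 records
SAMPLE_RESPONSE = """
<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">
  <srw:version>1.2</srw:version>
  <srw:numberOfRecords>3</srw:numberOfRecords>
  <srw:records>
    <srw:record/>
    <srw:record/>
  </srw:records>
</srw:searchRetrieveResponse>
"""


def hit_counts(response_root):
    """Return (total hits reported by the server, records in this response)."""
    total = int(response_root.findtext("srw:numberOfRecords", default="0", namespaces=SRW))
    returned = len(response_root.findall(".//srw:record", SRW))
    return total, returned


total, returned = hit_counts(ET.fromstring(SAMPLE_RESPONSE))
if returned < total:
    print(f"Only {returned} of {total} records returned")
```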


In [60]:
# Parse the retrieved XML records into a list of dictionaries and convert it to a DataFrame
records = records_xml.xpath("//srw:record", namespaces={"srw": "http://www.loc.gov/zing/srw/"})
output = [parse_mods(record) for record in records]
df = pd.DataFrame(output)
df


Unnamed: 0,recordIdentifier,title,date,senders,receivers,mentioned,genre
0,DE-611-HS-3493963,Brief von Otto Wolfgang Bechtle von Bechtle-Ve...,1923-11-23,"Bechtle, Otto Wolfgang (1918-2012) /Bechtle-Ve...","Heisenberg, Werner (1901-1976)","Einstein, Albert (1879-1955)/ Clark, Ronald W....",Brief
1,DE-611-HS-3590152,Die Entdeckung der Seele. (Drucktitel),1925-12-18,Unbekannt,,,unknown
2,DE-611-HS-3585335,"Gedicht von Richard Bär, Annemarie Schrödinger...",1927-09,"Bär, Richard",,"Heisenberg, Werner (1901-1976)/ Ehrenhaft, Fel...",unknown
3,DE-611-HS-3586786,Brief von Louis de Broglie an Werner Heisenber...,1931-12-16,"Broglie, Louis de (1892-1987)","Heisenberg, Werner (1901-1976)",,Brief
4,DE-611-HS-3587309,Brief von Peter J. W. Debye von Université de ...,1934-11-10,"Debye, Peter J. W. (1884-1966) /Université de ...","Heisenberg, Werner (1901-1976)",,Brief
...,...,...,...,...,...,...,...
323,DE-611-HS-4157018,Heisenberg S.S. 35 Elektrizitätslehre (Incipit...,1935,"Sandberger, Hedwig [vermutlich]",,,unknown
324,DE-611-BF-87394,"1472. VIII. Sammlungen, 3. DruckschriftenArtik...",1931,,,,unknown
325,DE-611-HS-3878140,Über Energieschwankungen in einem Strahlungsfe...,1931,"Heisenberg, Werner (1901-1976)",,,unknown
326,DE-611-HS-3879721,"Bibljografja Artykulow, Zawartych w tomach I-X...",1935,"Heisenberg, Werner (1901-1976)",,,unknown
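
The `date` column mixes precisions: full dates (`1923-11-23`), year and month (`1927-09`), and bare years (`1931`), plus the placeholder `unknown`. For filtering or sorting it can help to split each value into numeric parts. The following is a minimal sketch; `split_date` is a hypothetical helper, it only covers the three patterns visible in the output above, and it does not check whether month and day are valid calendar values.

```python
import re

# Year, optionally followed by -MM and -DD
DATE_PATTERN = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def split_date(value):
    """Return (year, month, day); parts that are missing become None."""
    match = DATE_PATTERN.match(value.strip()) if isinstance(value, str) else None
    if not match:
        return (None, None, None)  # e.g. "unknown" or an empty cell
    return tuple(int(part) if part else None for part in match.groups())


for raw in ["1923-11-23", "1927-09", "1931", "unknown"]:
    print(raw, "->", split_date(raw))
```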


In [14]:
# Optional: print raw XML for inspection
#print(etree.tostring(records_xml, pretty_print=True).decode())

In [52]:
# Save the full parsed data as a CSV file

# /home/PK/b-kj102/Dokumente/Kalliope/network/files
df.to_csv("../network/files/heisenberg_1915-1935.csv", index=False) # adjust filename according to query

In [61]:
# Keep only the records whose genre is "Brief" (letter)
df_B = df.loc[df["genre"] == "Brief"]
df_B

Unnamed: 0,recordIdentifier,title,date,senders,receivers,mentioned,genre
0,DE-611-HS-3493963,Brief von Otto Wolfgang Bechtle von Bechtle-Ve...,1923-11-23,"Bechtle, Otto Wolfgang (1918-2012) /Bechtle-Ve...","Heisenberg, Werner (1901-1976)","Einstein, Albert (1879-1955)/ Clark, Ronald W....",Brief
3,DE-611-HS-3586786,Brief von Louis de Broglie an Werner Heisenber...,1931-12-16,"Broglie, Louis de (1892-1987)","Heisenberg, Werner (1901-1976)",,Brief
4,DE-611-HS-3587309,Brief von Peter J. W. Debye von Université de ...,1934-11-10,"Debye, Peter J. W. (1884-1966) /Université de ...","Heisenberg, Werner (1901-1976)",,Brief
5,DE-611-HS-3587314,Brief von Peter J. W. Debye an Werner Heisenbe...,1935-03-17,"Debye, Peter J. W. (1884-1966)","Heisenberg, Werner (1901-1976)",,Brief
6,DE-611-HS-3587472,Brief von Peter J. W. Debye von Kaiser-Wilhelm...,1935-05-20,"Debye, Peter J. W. (1884-1966) /Kaiser-Wilhelm...","Heisenberg, Werner (1901-1976)/ Universität Le...",,Brief
...,...,...,...,...,...,...,...
310,DE-611-HS-3723301,Brief von Otto Petersen von Verein Deutscher E...,1934-02-07,"Petersen, Otto (1874-1953) /Verein Deutscher E...","Heisenberg, Werner (1901-1976)","Goerens, Paul",Brief
311,DE-611-HS-3724861,Brief von Ebbe Rasmussen und Unbekannt an Wern...,1935-06-13,"Rasmussen, Ebbe (1901-1959) /Unbekannt","Heisenberg, Werner (1901-1976)","Kramers, Hendrik Anthony (1894-1952)/ Klein, O...",Brief
312,DE-611-HS-3724902,"Brief von Hans Rau an Werner Heisenberg, 04.07...",1934-07-04,"Rau, Hans (1881-)","Heisenberg, Werner (1901-1976)",,Brief
313,DE-611-HS-3724964,Brief von Max Reich von Georg-August-Universit...,1934-02-09,"Reich, Max (1874-1941) /Georg-August-Universit...","Heisenberg, Werner (1901-1976)",,Brief


In [46]:
# Optional: save only letter-related records
df_B.to_csv("../network/files/heisenberg_1915-1935_lettersonly.csv", index=False) # adjust filename according to query

In [62]:
# ToDo: further clean and filter the data (e.g., by sender/recipient, date)
# Next steps: pick the columns date, senders, receivers from df and decide
# how to handle the name separators in order to clearly separate the columns


df_l_selec = df_B[["date","senders", "receivers"]]
df_l_selec

Unnamed: 0,date,senders,receivers
0,1923-11-23,"Bechtle, Otto Wolfgang (1918-2012) /Bechtle-Ve...","Heisenberg, Werner (1901-1976)"
3,1931-12-16,"Broglie, Louis de (1892-1987)","Heisenberg, Werner (1901-1976)"
4,1934-11-10,"Debye, Peter J. W. (1884-1966) /Université de ...","Heisenberg, Werner (1901-1976)"
5,1935-03-17,"Debye, Peter J. W. (1884-1966)","Heisenberg, Werner (1901-1976)"
6,1935-05-20,"Debye, Peter J. W. (1884-1966) /Kaiser-Wilhelm...","Heisenberg, Werner (1901-1976)/ Universität Le..."
...,...,...,...
310,1934-02-07,"Petersen, Otto (1874-1953) /Verein Deutscher E...","Heisenberg, Werner (1901-1976)"
311,1935-06-13,"Rasmussen, Ebbe (1901-1959) /Unbekannt","Heisenberg, Werner (1901-1976)"
312,1934-07-04,"Rau, Hans (1881-)","Heisenberg, Werner (1901-1976)"
313,1934-02-09,"Reich, Max (1874-1941) /Georg-August-Universit...","Heisenberg, Werner (1901-1976)"
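
For a network analysis, each letter has to become one or more sender-to-receiver edges, which means splitting cells that hold several names. In the output above the names are joined with `" /"` (senders) and `"/ "` (receivers), so the sketch below splits on a slash with optional whitespace on either side. It assumes that no name contains a slash itself. `split_names` and `letter_edges` are hypothetical helpers; the sample rows are copied from the table above.

```python
import re
from itertools import product

# A slash with optional whitespace on either side
SEPARATOR = re.compile(r"\s*/\s*")


def split_names(cell):
    """Split one senders/receivers cell into a list of names."""
    if not isinstance(cell, str) or not cell.strip():
        return []  # empty or missing cell
    return [name for name in SEPARATOR.split(cell.strip()) if name]


def letter_edges(rows):
    """Turn (date, senders, receivers) rows into (sender, receiver, date) edges."""
    edges = []
    for date, senders, receivers in rows:
        for sender, receiver in product(split_names(senders), split_names(receivers)):
            edges.append((sender, receiver, date))
    return edges


sample_rows = [
    ("1931-12-16", "Broglie, Louis de (1892-1987)", "Heisenberg, Werner (1901-1976)"),
    ("1935-06-13", "Rasmussen, Ebbe (1901-1959) /Unbekannt", "Heisenberg, Werner (1901-1976)"),
]
edges = letter_edges(sample_rows)
for edge in edges:
    print(edge)
```

With the DataFrame from this notebook, the rows could be supplied as `df_l_selec.itertuples(index=False, name=None)`. A letter with two senders and one receiver yields two edges, which is one possible modelling choice among several.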


In [55]:
# Save the final selection with names and dates to a CSV file

df_l_selec.to_csv("../network/files/heisenberg_namesdates_1915-1935.csv", index=False, sep=";") # adjust filename according to query
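
As a possible next step toward the network analysis, the semicolon-separated file can be read back and the letters per sender-receiver pair counted, giving edge weights. The sketch below uses only the standard library and an in-memory sample with placeholder names ("Sender A" and so on are invented, not taken from the data); `edge_weights` is a hypothetical helper. It treats each cell as one name, so cells holding several names would need to be split first.

```python
import csv
import io
from collections import Counter

# Placeholder data in the same layout as the exported file (sep=";")
SAMPLE_CSV = """date;senders;receivers
1931;Sender A;Receiver X
1934;Sender A;Receiver X
1935;Sender B;Receiver X
"""


def edge_weights(handle):
    """Count how many rows share the same (sender, receiver) pair."""
    reader = csv.DictReader(handle, delimiter=";")
    return Counter((row["senders"], row["receivers"]) for row in reader)


weights = edge_weights(io.StringIO(SAMPLE_CSV))
for (sender, receiver), count in weights.most_common():
    print(sender, "->", receiver, count)
```

For the real file, the `io.StringIO` object would be replaced by `open(path, newline="", encoding="utf-8")` with the path used in the cell above.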