# Postcards

The digital collection [Postcards](https://kuleuven.limo.libis.be/discovery/collectionDiscovery?vid=32KUL_KUL:KULeuven&collectionId=81531489730001488&lang=en) provides a large collection of old Belgian postcards showing, amongst others, village views and a wide range of events. It is published and maintained by [KU Leuven Libraries](https://bib.kuleuven.be/english).

This notebook is the result of the collaboration between KU Leuven and the [University of Alicante](https://www.ua.es/), as part of the [Impact Centre of Competence in digitisation](https://www.digitisation.eu/).

### Introduction
In this example we extract additional information about the location of publication of the records from their titles.

### Getting started

The first thing we need to do is import the libraries (Python packages) required to analyse the data. Our data is a CSV file, consisting of rows and columns. To work with CSV files we use [pandas](https://pandas.pydata.org/), a popular Python package for working with data sets. It provides functions for analysing, cleaning, exploring, and manipulating data.

In [3]:
import pandas as pd

### Transformation to LOD

Note that the domain URL should be updated with a proper domain

Let's now load the CSV file and display the records.

In [4]:
postcards_data = pd.read_csv("../../../20250506_CaD_jnbworkshop/Metadata-exports/Postcards/20230301_Postcards.csv", skiprows=[1])
print(postcards_data)

                 MMS ID                                      Uniform title  \
0      9990136310101488                   Belœil. Gebouwen. Kastelen. Park   
1      9990302540101488                 Beringen. Folklore en volkscultuur   
2      9990544990101488                                   Bilzen. Panorama   
3      9990616780101488                             Blankenberge. Panorama   
4      9990731350101488                              Blankenberge. Zeedijk   
...                 ...                                                ...   
35644  9992683362801488  Antwerpen. Beelden en objecten. Koninklijk Mus...   
35645  9992688280601488                              Antwerpen. Leysstraat   
35646  9992688285001488        Antwerpen. Gebouwen. Algemeen. Den Botaniek   
35647  9992704600301488  Antwerpen. Beelden en objecten. Koninklijk Mus...   
35648  9992713007101488                               Antwerpen. Suikerrui   

                                              Main title Varian
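The `skiprows=[1]` argument used above tells pandas to drop the line at index 1 of the file (counting from 0), that is, the line directly after the header. The sketch below illustrates this on a tiny in-memory CSV. The sample content, including the second line that is skipped, is hypothetical and only assumes that the real export contains a non-data line in that position.

```python
import io
import pandas as pd

# Hypothetical miniature export: header, one non-data line, then two records
sample_csv = io.StringIO(
    "MMS ID,Uniform title\n"
    "Identifier,Title of the record\n"
    "9990136310101488,Belœil. Gebouwen. Kastelen. Park\n"
    "9990302540101488,Beringen. Folklore en volkscultuur\n"
)

# skiprows=[1] skips only the line at index 1; the header (index 0) is kept
sample = pd.read_csv(sample_csv, skiprows=[1])
print(sample)
```

Passing a list skips exactly the listed lines, whereas an integer such as `skiprows=1` would skip that many lines from the top of the file, including the header.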

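Let's now transform the data. In these records the location is the part of the uniform title before the first full stop, so we split the title on `.` and keep the first piece. For testing purposes we only print the first few rows.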

In [6]:
for index, row in postcards_data.iterrows():
    location = row['Uniform title'].split(".")[0]
    print(location)

    if index == 5: # testing purposes
        break

Belœil
Beringen
Bilzen
Blankenberge
Blankenberge
Borgerhout


Let's try to create a new column with the extracted data. 
First we define a function that we will apply to all the rows.

In [9]:
# Function to extract the location from the title
def extract_location(title):
    # Missing titles are read as NaN (a float), so return an empty string for them
    if not isinstance(title, str):
        return ''
    # The location is the text before the first full stop
    return title.split(".")[0]

postcards_data['Location title'] = postcards_data['Uniform title'].apply(extract_location)
print(postcards_data)

                 MMS ID                                      Uniform title  \
0      9990136310101488                   Belœil. Gebouwen. Kastelen. Park   
1      9990302540101488                 Beringen. Folklore en volkscultuur   
2      9990544990101488                                   Bilzen. Panorama   
3      9990616780101488                             Blankenberge. Panorama   
4      9990731350101488                              Blankenberge. Zeedijk   
...                 ...                                                ...   
35644  9992683362801488  Antwerpen. Beelden en objecten. Koninklijk Mus...   
35645  9992688280601488                              Antwerpen. Leysstraat   
35646  9992688285001488        Antwerpen. Gebouwen. Algemeen. Den Botaniek   
35647  9992704600301488  Antwerpen. Beelden en objecten. Koninklijk Mus...   
35648  9992713007101488                               Antwerpen. Suikerrui   

                                              Main title Varian
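As an alternative to applying `extract_location` row by row, the same result can probably be obtained with the vectorised string methods of pandas. The following is a minimal sketch on a small hypothetical sample rather than on the full export; it assumes that missing titles should end up as empty strings, as they do with the function above.

```python
import pandas as pd

# Hypothetical sample with the same column name as the export
sample = pd.DataFrame({
    "Uniform title": [
        "Belœil. Gebouwen. Kastelen. Park",
        "Blankenberge. Zeedijk",
        None,  # a record without a uniform title
    ]
})

# Split on full stops, keep the first piece, and replace missing values by ''
sample["Location title"] = (
    sample["Uniform title"].str.split(".").str[0].fillna("")
)
print(sample["Location title"].tolist())
```

The `.str` accessor leaves missing values untouched, which is why the final `fillna("")` is needed to mirror the behaviour of the function.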

In [10]:
print(postcards_data['Location title'].unique())

['Belœil' 'Beringen' 'Bilzen' ... '' 'Bois-de-Lessines (Lessines)'
 'Soye (Floreffe)']
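The list of unique values contains an empty string and values with a bracketed part such as `Soye (Floreffe)`. The sketch below shows one possible way to count the empty locations and to separate the bracketed part into its own column. It runs on a hypothetical sample, and the column names `place` and `qualifier` are illustrative; what the bracketed part means (for example a larger municipality) is an assumption that should be checked against the cataloguing rules.

```python
import pandas as pd

# Hypothetical sample of extracted locations
locations = pd.Series(
    ["Belœil", "", "Bois-de-Lessines (Lessines)", "Soye (Floreffe)"]
)

# Number of records for which no location could be extracted
print((locations == "").sum())

# Split "Name (Qualifier)" into two columns; values without brackets
# get a missing value in the 'qualifier' column
pattern = r"^(?P<place>[^(]*?)\s*(?:\((?P<qualifier>[^)]*)\))?$"
parts = locations.str.extract(pattern)
print(parts)
```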


In [11]:
print(postcards_data['Location title'].nunique())

1155
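Beyond the number of distinct locations, it can be useful to see how the records are distributed over them. A minimal sketch with `value_counts`, again on a hypothetical sample instead of the real column, could look like this. Empty locations are filtered out first so that they do not appear in the ranking.

```python
import pandas as pd

# Hypothetical sample standing in for postcards_data['Location title']
locations = pd.Series(
    ["Antwerpen", "Blankenberge", "Antwerpen", "", "Blankenberge", "Antwerpen"]
)

# Count the records per location, ignoring rows without a location;
# value_counts sorts the result from most to least frequent
counts = locations[locations != ""].value_counts()
print(counts.head(10))
```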


In [12]:
print(postcards_data['Location title'])

0              Belœil
1            Beringen
2              Bilzen
3        Blankenberge
4        Blankenberge
             ...     
35644       Antwerpen
35645       Antwerpen
35646       Antwerpen
35647       Antwerpen
35648       Antwerpen
Name: Location title, Length: 35649, dtype: object
