# Plotting Newspaper Data
An example of how to use the eLuxemburgensia digital collection and plotting to visually display data.

This project uses Jupyter Notebooks to encapsulate all information regarding the project. The notebook requests the a date range from the user. It then uses those dates to select a list of newspapers published during that time period. The newspapers are then plotted showing their periodicity.

## Requirements
* Python 3.12
* [requests](https://pypi.org/project/requests/): HTTP library to run HTTP requests
* [pandas](https://pandas.pydata.org/): format the output into tabular layout
* [yarl](https://pypi.org/project/yarl/): format the output URL into a clickable URL link

In [1]:
from datetime import datetime
import requests
import pandas as pd

In [2]:
# Request the start date from the user   
while (True):
    inputDate = input("Enter the start date (dd/mm/yyyy):")
    try:
        startDateValue = datetime.strptime(inputDate,'%d/%m/%Y')
        break
    except:
        print("Please enter a valid date in the format dd/mm/yyyy.")
        

Enter the start date (dd/mm/yyyy): 01/07/1880


In [3]:
# Request the end date from the user
while (True):
    inputDate = input("Enter the end date (dd/mm/yyyy):")
    try:
        endDateValue = datetime.strptime(inputDate,'%d/%m/%Y')
        break
    except:
        print("Please enter a valid date in the format dd/mm/yyyy.")

Enter the end date (dd/mm/yyyy): 15/12/1890


In [4]:
# get the BnL eluxembourgensia collection
elux_collection = requests.get("https://viewer.eluxemburgensia.lu/api/viewer2/cms/v2/digitalcollections")
elux_collection = elux_collection.json()

In [5]:
# select only those newspapers published between the start date and end date
print("Newspapers published between " + startDateValue.strftime('%d/%m/%Y') + " - " + endDateValue.strftime('%d/%m/%Y') + ":")

# to display all the rows in the table - otherwise, some rows are hidden
pd.set_option('display.max_rows', None)

filtered_newspapers = []
for newspaper in elux_collection["data"]:
    newspaper_dict = {}
    newspaperstartdate = newspaper["startdate"]
    try:
        newspaperenddate = newspaper["enddate"]
    except:
        newspaperenddate = "9999-12-31"
    if newspaperstartdate <= endDateValue.strftime("%Y-%m-%d") and newspaperenddate >= startDateValue.strftime("%Y-%m-%d"):
        # Newspaper published between the selected dates so get the link to a-z
        azLink = newspaper["az_url"]
        # parse out the docID
        startPosition = azLink.find("docid=alma") + 10
        endPosition = azLink.find("&",startPosition)
        docId = azLink[startPosition:endPosition]
        
        newspaper_dict = {'Title': newspaper["title"],'Start Date': newspaperstartdate, 'End Date': newspaperenddate, 'docId': docId}
        filtered_newspapers.append(newspaper_dict)
        
df = pd.DataFrame(filtered_newspapers, columns=["Title", "Start Date", "End Date", "DocId"])
dfStyler = df.style.set_properties(**{'text-align': 'left'})
dfStyler.set_table_styles([dict(selector='th', props=[('text-align', 'left')])])


Newspapers published between 01/07/1880 - 15/12/1890:


Unnamed: 0,Title,Start Date,End Date,DocId
0,Indépendance luxembourgeoise (L'),1871-10-01,1934-12-31,
1,Luxemburger Volks-Blättchen,1888-09-29,1889-09-29,
2,Mémorial du Grand-Duché de Luxembourg,1814-05-20,9999-12-31,
3,Komm mit mir!,1884-01-15,1884-12-01,
4,Echo (Das),1890-10-18,1897-12-26,
5,Luxemburger Volksbote,1882-02-04,1882-12-24,
6,Kirchlicher Anzeiger für die Diözese Luxemburg,1871-01-15,9999-12-31,
7,Arbeiter (Der),1889-10-05,1890-10-08,
8,Ordo archidioecesis luxembourgensis,1842-01-01,9999-12-31,
9,Obermosel-Zeitung,1881-06-18,1948-04-03,
