# Plot Number of Zenodo Links Over Time

This notebook reads the CSV files in the `download_statistics` folder, counts the records in each file, and plots those counts over time.

## Import Necessary Libraries

We will use `pandas` for data handling and `matplotlib` for plotting.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import os

ModuleNotFoundError: No module named 'pandas'

## List All CSV Files

List all files in the `download_statistics` directory that have a `.csv` extension.

In [None]:
directory = 'download_statistics'
# os.listdir returns names in arbitrary order, so sort them for a stable listing
csv_files = sorted(f for f in os.listdir(directory) if f.endswith('.csv'))
csv_files
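
The listing above accepts any `.csv` file, including ones whose names are not dates. As a minimal sketch, assuming the files are named like `YYYYMMDD.csv` (inferred from the `'%Y%m%d'` format used later in this notebook), the names could be filtered up front. The helper `list_dated_csv_files` and the pattern `DATE_NAME` are hypothetical names introduced for illustration.

```python
import re
from pathlib import Path

# Assumption: statistics files are named with exactly eight digits, e.g. 20240131.csv
DATE_NAME = re.compile(r"\d{8}\.csv")


def list_dated_csv_files(directory):
    """Return the sorted names of files in `directory` that look like YYYYMMDD.csv."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and DATE_NAME.fullmatch(p.name)
    )
```

Because the names start with the date in year-month-day order, sorting them alphabetically also sorts them chronologically.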

## Read Files and Count Records

For each file, parse the date from the file name (expected in `YYYYMMDD` form), count the records, and store both values.

In [None]:
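record_counts = []
dates = []

for file in csv_files:
    # The part of the file name before the first dot encodes the date as YYYYMMDD
    date_str = file.split('.')[0]
    date = pd.to_datetime(date_str, format='%Y%m%d')
    file_path = os.path.join(directory, file)
    # Every row after the header line counts as one record
    df = pd.read_csv(file_path)
    record_counts.append(len(df))
    dates.append(date)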
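
The same two steps, parsing the date and counting rows, can also be sketched with only the standard library, which avoids loading each file into a DataFrame. This is an illustrative alternative rather than the notebook's method. It assumes each file has a single header row and a `YYYYMMDD` name, and the helpers `date_from_filename` and `count_records` are hypothetical.

```python
import csv
from datetime import datetime


def date_from_filename(name):
    """Parse the date from a name like '20240131.csv' (assumed YYYYMMDD before the first dot)."""
    return datetime.strptime(name.split(".")[0], "%Y%m%d")


def count_records(path):
    """Count the data rows in a CSV file, assuming the first non-blank row is a header."""
    with open(path, newline="", encoding="utf-8") as handle:
        # Skip blank lines so they are not counted as records
        rows = sum(1 for row in csv.reader(handle) if row)
    return max(rows - 1, 0)  # subtract the header; an empty file yields 0
```

For a file such as `download_statistics/20240131.csv`, `date_from_filename('20240131.csv')` gives the date for the x-axis and `count_records(...)` gives the value to plot.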

## Create a DataFrame

Store the dates and record counts in a DataFrame for easier plotting.

In [None]:
data = pd.DataFrame({'Date': dates, 'Record Count': record_counts})
data = data.sort_values('Date')
data.head()

## Plot the Data

Generate a line plot showing the number of records over time.

In [None]:
plt.figure(figsize=(10, 6))
plt.plot(data['Date'], data['Record Count'], marker='o')
plt.title('Number of Zenodo Links Over Time')
plt.xlabel('Date')
plt.ylabel('Record Count')
plt.grid(True)
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('zenodo_links_over_time.png')
plt.show()
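
When the series spans many months, the rotated tick labels can get crowded. A possible variation, sketched here with matplotlib's `AutoDateLocator` and `ConciseDateFormatter`, lets matplotlib choose the tick spacing and label format. The function `plot_counts` and its parameters are hypothetical, and the `Agg` backend is selected only so that the sketch runs without a display; that line can be dropped inside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line when running in a notebook
import matplotlib.dates as mdates
import matplotlib.pyplot as plt


def plot_counts(dates, counts, out_path):
    """Plot counts against dates with automatically chosen date ticks and save the figure."""
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.plot(dates, counts, marker="o")

    # Let matplotlib choose tick positions and compact labels for the date axis
    locator = mdates.AutoDateLocator()
    ax.xaxis.set_major_locator(locator)
    ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

    ax.set_title("Number of Zenodo Links Over Time")
    ax.set_xlabel("Date")
    ax.set_ylabel("Record Count")
    ax.grid(True)
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)  # release the figure's memory
    return out_path
```

With the DataFrame built earlier, a call could look like `plot_counts(data['Date'], data['Record Count'], 'zenodo_links_over_time.png')`.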