# Library Database Management

This project builds and manages a library database from two datasets: bibliographic resources and their publishers. The goal is to organize this information in a relational database so that it can be retrieved and maintained efficiently.

## Data Overview

The data consists of two primary entities:

1. `BibliographicResource`: Represents the bibliographic information of various resources.
2. `Publisher`: Represents the publishers of these resources.

Each `BibliographicResource` is linked to its `Publisher` through the publisher's ID.

![Classes Diagram](classes_diagram.jpg)

## Detailed Description of the Data Files

### The `data.csv` File

The `data.csv` file contains bibliographic information with the following columns:

- `id`: A unique identifier for each bibliographic entry, typically formatted as a digital object identifier (DOI).
- `title`: The title of the work, which can range from research papers to datasets.
- `type`: The kind of material the bibliographic entry represents, for example `dataset`.
- `publisher`: The identifier of the resource's publisher, formed from the prefix `crossref:` followed by a number (for example, `crossref:15`). It matches a value in the `id` column of `publishers.csv`.

Here is a snippet of what the data looks like:

| id | title | type | publisher |
|----|-------|------|-----------|
| doi:10.1037/e383822004-001 | Persistence And Change: Standards-Based Systemic... | dataset | crossref:15 |
| doi:10.1037/e531662006-001 | BIA And DOD Schools: Student Achievement And Oth... | dataset | crossref:15 |
| doi:10.1037/e596722007-001 | Drugs, Brains, And Behavior: The Science Of Addi... | dataset | crossref:15 |

### The `publishers.csv` File

The `publishers.csv` file lists publishers with their unique identifiers and names. The `id` column of this file corresponds to the `publisher` column of `data.csv`, which is how each bibliographic entry is associated with its publisher.

| id | name |
|----|------|
| crossref:15 | American Psychological Association (APA) |
| crossref:320 | Association For Computing Machinery (ACM) |
| crossref:263 | Institute Of Electrical And Electronics... |
| crossref:2914 | Jaypee Brothers Medical Publishing |
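Before merging anything, it can be useful to confirm that every `publisher` value in `data.csv` has a matching `id` in `publishers.csv`. The following is a minimal sketch of such a check. It uses the sample rows shown above in place of the real files, and the variable names (`resources`, `publishers`, `orphans`) are illustrative assumptions.

```python
import pandas as pd

# Sample rows copied from the snippets above; the real exercise would
# load data.csv and publishers.csv instead.
resources = pd.DataFrame({
    "id": [
        "doi:10.1037/e383822004-001",
        "doi:10.1037/e531662006-001",
        "doi:10.1037/e596722007-001",
    ],
    "publisher": ["crossref:15", "crossref:15", "crossref:15"],
})
publishers = pd.DataFrame({
    "id": ["crossref:15", "crossref:320", "crossref:263", "crossref:2914"],
})

# True where the resource's publisher value appears among the publisher ids.
has_match = resources["publisher"].isin(publishers["id"])

# Resource ids whose publisher is not listed in publishers.csv.
orphans = resources.loc[~has_match, "id"].tolist()

print(f"{int(has_match.sum())} of {len(resources)} resources have a known publisher")
print("Resources without a known publisher:", orphans)
```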

## Exercise 1: Creating the DataFrames

**Objective**: Use Pandas to create two separate DataFrames, one for `BibliographicResource` and one for `Publisher`, each with a manually created unique ID for every row.

**Instructions**:

1. Read the `data.csv` file into a DataFrame named `bibliographic_df`. This file contains information on bibliographic resources.
2. Read the `publishers.csv` file into a DataFrame named `publishers_df`. This file contains information on publishers.
3. Manually create a unique identifier for each row in both DataFrames, using the pattern `biblio-<index>` for `BibliographicResource` and `publisher-<index>` for `Publisher`, where `<index>` is the row's index.
4. Add these unique identifiers as a new column in each DataFrame.
5. Print the first five rows of each DataFrame to verify the presence of unique IDs.
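One possible way to approach these steps is sketched below. It is not the only valid solution. To stay self-contained, it reads the sample rows from this README through `io.StringIO`; in the actual exercise, `pd.read_csv("data.csv")` and `pd.read_csv("publishers.csv")` would be used instead. The column name `internal_id`, the helper `add_internal_ids`, and the choice to start counting at 0 are assumptions, since the exercise does not specify them.

```python
import io

import pandas as pd

# Sample rows from this README, standing in for data.csv and publishers.csv.
DATA_CSV = '''id,title,type,publisher
doi:10.1037/e383822004-001,"Persistence And Change: Standards-Based Systemic...",dataset,crossref:15
doi:10.1037/e531662006-001,"BIA And DOD Schools: Student Achievement And Oth...",dataset,crossref:15
doi:10.1037/e596722007-001,"Drugs, Brains, And Behavior: The Science Of Addi...",dataset,crossref:15
'''
PUBLISHERS_CSV = '''id,name
crossref:15,American Psychological Association (APA)
crossref:320,Association For Computing Machinery (ACM)
crossref:263,Institute Of Electrical And Electronics...
crossref:2914,Jaypee Brothers Medical Publishing
'''


def add_internal_ids(df, prefix, column="internal_id"):
    """Return a copy of df with a '<prefix>-<position>' id as its first column."""
    out = df.copy()
    # Positions are counted from 0, like a default DataFrame index (an assumption).
    out.insert(0, column, [f"{prefix}-{i}" for i in range(len(out))])
    return out


# Steps 1 and 2: read each file into its own DataFrame.
bibliographic_df = pd.read_csv(io.StringIO(DATA_CSV))
publishers_df = pd.read_csv(io.StringIO(PUBLISHERS_CSV))

# Steps 3 and 4: build the identifiers and add them as a new column.
bibliographic_df = add_internal_ids(bibliographic_df, "biblio")
publishers_df = add_internal_ids(publishers_df, "publisher")

# Step 5: inspect the first rows.
print(bibliographic_df.head())
print(publishers_df.head())
```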

## Exercise 2: Merging DataFrames

**Objective**: Merge the `BibliographicResource` and `Publisher` DataFrames into a single DataFrame, ensuring that the relationship between the resources and publishers is maintained.

**Instructions**:

1. Merge `bibliographic_df` and `publishers_df` by matching the `publisher` column of `bibliographic_df` against the `id` column of `publishers_df`. The resulting DataFrame should contain the bibliographic information along with the name of the publisher.
2. Ensure that every `BibliographicResource` entry in the merged DataFrame includes its `Publisher`'s name.
3. Print the first five rows of the merged DataFrame to verify your data.
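The merge could be sketched as follows, assuming the column layout described earlier (`publisher` in the bibliographic data, `id` and `name` in the publisher data). A left join is used here so that no bibliographic row is dropped, and missing publisher names are counted afterwards. The renamed column `publisher_name` and the small inline DataFrames are assumptions made for illustration.

```python
import pandas as pd

# Small stand-ins for the DataFrames built in Exercise 1.
bibliographic_df = pd.DataFrame({
    "id": [
        "doi:10.1037/e383822004-001",
        "doi:10.1037/e531662006-001",
        "doi:10.1037/e596722007-001",
    ],
    "type": ["dataset", "dataset", "dataset"],
    "publisher": ["crossref:15", "crossref:15", "crossref:15"],
})
publishers_df = pd.DataFrame({
    "id": ["crossref:15", "crossref:320"],
    "name": [
        "American Psychological Association (APA)",
        "Association For Computing Machinery (ACM)",
    ],
})

# Rename the publisher columns so the join key has the same name on both
# sides and the publisher's name does not clash with other columns.
publisher_lookup = publishers_df.rename(
    columns={"id": "publisher", "name": "publisher_name"}
)

merged_df = bibliographic_df.merge(
    publisher_lookup,
    on="publisher",
    how="left",              # keep every bibliographic resource
    validate="many_to_one",  # fail if a publisher id appears twice
)

# Resources whose publisher id had no match end up with a missing name.
missing_names = int(merged_df["publisher_name"].isna().sum())

print(merged_df.head())
print("Rows without a publisher name:", missing_names)
```

If `missing_names` is greater than zero, some `publisher` values in the bibliographic data have no counterpart in the publisher data and need to be investigated before loading the database.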

## Exercise 3: Connecting to SQLite and Creating the Database

**Objective**: Establish a connection to a SQLite database and create the tables `BibliographicResource` and `Publisher` based on the structure of the provided DataFrames and UML diagram.

**Instructions**:

1. Use Python's `sqlite3` library to establish a connection to a new SQLite database called `library.db`.
2. Create a `BibliographicResource` table with appropriate columns based on the `bibliographic_df` DataFrame and the UML diagram.
3. Create a `Publisher` table with appropriate columns based on the `publishers_df` DataFrame and the UML diagram.
4. Close the database connection.
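A minimal sketch of these steps with `sqlite3` is shown below. The column names and types are assumptions derived from the CSV columns described above plus a hypothetical `internal_id` column for the manually created identifiers; they should be adjusted to match the UML diagram.

```python
import sqlite3

# Assumed schema: one column per CSV field, plus the manually created id.
SCHEMA = """
CREATE TABLE IF NOT EXISTS Publisher (
    internal_id TEXT PRIMARY KEY,
    id          TEXT NOT NULL UNIQUE,
    name        TEXT
);
CREATE TABLE IF NOT EXISTS BibliographicResource (
    internal_id TEXT PRIMARY KEY,
    id          TEXT NOT NULL UNIQUE,
    title       TEXT,
    type        TEXT,
    publisher   TEXT REFERENCES Publisher(id)
);
"""

# Step 1: connecting creates library.db if it does not exist yet.
conn = sqlite3.connect("library.db")
try:
    # Steps 2 and 3: create both tables in one script.
    conn.executescript(SCHEMA)
    conn.commit()

    # List the tables that now exist, as a quick check.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
finally:
    # Step 4: always close the connection, even if table creation fails.
    conn.close()

print(tables)
```

Note that SQLite does not enforce the `REFERENCES` clause unless `PRAGMA foreign_keys = ON` is issued on the connection, so here the clause mainly documents the relationship.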
