<a href="https://colab.research.google.com/github/selgebali/Colabs/blob/main/RTGs_OverTime_Plotly.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# README

## Overview
This Python script visualizes the growth of a chosen resource type (e.g., instruments) over time by querying the DataCite API. It fetches the yearly counts of DOIs (Digital Object Identifiers) for that type, plots them as an interactive bar graph, and saves the graph to an HTML file for easy sharing and viewing.

## Features
- Fetches resource-type data from the DataCite API dynamically.
- Extracts yearly growth counts of DOIs.
- Visualizes the data as an interactive bar graph using Plotly.
- Appends visualizations to an HTML file for presentation purposes.

## Requirements
The script imports the following Python libraries. All except `json` must be installed before running it:

- `requests`: For making HTTP requests to the DataCite API.
- `json`: Part of the Python standard library, so no installation is needed. It is imported but not used directly; the response is parsed with `response.json()`.
- `pandas`: For potential data manipulation (though not actively used in the main functions).
- `plotly`: For creating interactive visualizations.

Install the required libraries using pip:
```bash
pip install requests pandas plotly
```

## Script Details

### Key Functions

#### 1. `fetch_api_data(url)`
Fetches data from the given API URL.
- **Input**: `url` (string) - The endpoint URL of the DataCite API.
- **Output**: Parsed JSON data.
- **Logs**: API URL and response status code for debugging.

#### 2. `extract_yearly_counts(meta_data)`
Extracts yearly counts of DOIs from the metadata.
- **Input**: `meta_data` (dict) - The `meta` object from the API response; yearly counts are read from its `created` list.
- **Output**: List of dictionaries, each with a `year` and a `count` key.
- **Logs**: Extracted yearly counts for debugging.
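
As an illustration of the extraction step, here is a minimal sketch that runs the same logic on a hand-written payload. The `sample_meta` dictionary is an assumption that mirrors the values printed in the sample run of this notebook; it is not a live API response, and a real `meta` object may carry additional keys.

```python
def extract_yearly_counts(meta_data):
    """Return [{'year': ..., 'count': ...}, ...] from the 'created' list."""
    return [
        {"year": item["id"], "count": item["count"]}
        for item in meta_data.get("created", [])
    ]

# Hand-written sample (assumption), modeled on the sample run in this notebook
sample_meta = {
    "total": 136,
    "created": [
        {"id": "2024", "count": 108},
        {"id": "2023", "count": 3},
        {"id": "2021", "count": 7},
        {"id": "2020", "count": 1},
        {"id": "2019", "count": 17},
    ],
}

yearly_counts = extract_yearly_counts(sample_meta)

# Optional: sort chronologically instead of relying on the API's ordering
chronological = sorted(yearly_counts, key=lambda item: int(item["year"]))
print(chronological)
```

Sorting by the integer value of the year is one way to avoid depending on the order in which the API happens to return the entries.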

#### 3. `visualize_growth_bar_graph(yearly_counts, html_file)`
Creates a bar graph of DOI growth over time and appends it to an HTML file.
- **Input**:
  - `yearly_counts` (list): List of dictionaries with yearly data.
  - `html_file` (string): Path to the HTML file.
- **Output**: Interactive bar graph displayed in the browser and appended to the HTML file.
- **Logs**: Confirms graph display and HTML file updates.

### Example Usage

#### Parameters
- **Resource Type**: The type of resource to visualize (e.g., "instrument").
- **API URL**: Formed dynamically using the resource type.
- **HTML File**: File name for saving visualizations (set to `resource_growth_graphs.html` in the script).

#### Steps
1. Initialize the HTML file with a basic structure.
2. Fetch API data for the specified resource type.
3. Extract yearly counts and total counts from the API response.
4. Generate and display a bar graph visualizing the growth of the resource.
5. Append the graph to the HTML file.
6. Finalize the HTML file.
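
Steps 1, 5, and 6 above can be sketched with the standard library alone. The helper `write_report` and the placeholder fragments are hypothetical; in the script, each fragment would come from Plotly instead, for example from `pio.to_html(fig, include_plotlyjs='cdn', full_html=False)`, so that the page ends up with a single `<html>`/`<body>` wrapper.

```python
import os
import tempfile

def write_report(path, fragments):
    """Write one HTML page that wraps every fragment in a single <html>/<body>."""
    # Step 1: initialize the file with a basic structure
    with open(path, "w", encoding="utf-8") as f:
        f.write("<html><head><title>Resource Growth Graphs</title></head><body>")
    # Step 5: append each graph fragment
    for fragment in fragments:
        with open(path, "a", encoding="utf-8") as f:
            f.write(fragment)
    # Step 6: finalize the file
    with open(path, "a", encoding="utf-8") as f:
        f.write("</body></html>")

# Placeholder fragments (assumption) standing in for Plotly output
placeholder_fragments = ["<div>graph 1</div>", "<div>graph 2</div>"]
report_path = os.path.join(tempfile.mkdtemp(), "resource_growth_graphs.html")
write_report(report_path, placeholder_fragments)

with open(report_path, encoding="utf-8") as f:
    html = f.read()
print(html)
```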

#### Execution
Run the script as is or modify the `resource_type` variable to visualize different resources:
```python
resource_type = "instrument"
api_url = f"https://api.datacite.org/dois?resource-type-id={resource_type.lower()}"
html_file = "resource_growth_graphs.html"
```
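
If you switch resource types often, a small helper can build the URL for you. The sketch below is hypothetical (`build_api_url` is not part of the script) and assumes that DataCite resource-type IDs are lowercase and hyphenated, so a multi-word name such as "Physical Object" maps to `physical-object`; check the ID against the API before relying on it.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.datacite.org/dois"

def build_api_url(resource_type):
    """Build the DOI query URL for a resource type name (hypothetical helper)."""
    # Assumption: IDs are lowercase, with words joined by hyphens
    resource_type_id = "-".join(resource_type.lower().split())
    query = urlencode({"resource-type-id": resource_type_id})
    return f"{BASE_URL}?{query}"

api_url = build_api_url("instrument")
print(api_url)
print(build_api_url("Physical Object"))
```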

### Output
The script produces:
1. **Bar Graph**: Visualizing resource growth over time.
2. **HTML File**: Interactive graphs saved in `resource_growth_graphs.html`.

## Customization
- **Color Palette**: Modify the `custom_colors` list to update the bar graph colors.
- **Font Styles**: Adjust font size, family, or weight in the `visualize_growth_bar_graph` function for custom appearance.
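
The script colors every bar with the first palette entry (`custom_colors[0]`). If you plot several resource types, one option is to cycle through the palette so each graph gets its own color. The helper `pick_color` below is a hypothetical sketch, not part of the script.

```python
# Palette copied from the script
custom_colors = ['#243B54', '#00B1E2', '#5B88B9', '#46BCAB', '#90D7CD', '#BC2B66']

def pick_color(index, palette=custom_colors):
    """Return a palette color for the given graph index, wrapping around at the end."""
    return palette[index % len(palette)]

# Example: assign a color to each of several resource types
resource_types = ["instrument", "dataset", "software"]
colors = {name: pick_color(i) for i, name in enumerate(resource_types)}
print(colors)
```

The chosen value could then be passed to `marker=dict(color=...)` in place of `custom_colors[0]`.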

## Debugging
The script includes debug logs to trace its execution:
- API URL being fetched.
- Response status code.
- Extracted yearly counts and total counts.
- Confirmation of graph display and HTML file updates.

## Example Output
Sample output for resource type "instrument":
- A bar graph showing the growth of instruments by year.
- Total DOIs counted across all years.
- Saved graphs in `resource_growth_graphs.html`.
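
As a quick sanity check on such output, you can compare the sum of the yearly counts with the reported total. The numbers below are copied from the sample run in this notebook. The two values agree here, but that is not guaranteed in general, for example if the API returns only a subset of years.

```python
# Values copied from the sample run for resource type "instrument"
yearly_counts = [
    {"year": "2024", "count": 108},
    {"year": "2023", "count": 3},
    {"year": "2021", "count": 7},
    {"year": "2020", "count": 1},
    {"year": "2019", "count": 17},
]
total_count = 136

yearly_sum = sum(item["count"] for item in yearly_counts)
if yearly_sum == total_count:
    print(f"Yearly counts add up to the total: {yearly_sum}")
else:
    print(f"Mismatch: yearly sum {yearly_sum} vs. total {total_count}")
```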

## License
This script is provided under the MIT License. Feel free to use and modify it as needed.

## Contributing
Contributions are welcome! Submit issues or pull requests on the repository to suggest improvements or report bugs.



In [None]:
import requests
import json
import pandas as pd
import plotly.graph_objs as go
import plotly.io as pio

# Custom color palette
custom_colors = ['#243B54', '#00B1E2', '#5B88B9', '#46BCAB', '#90D7CD', '#BC2B66']

# Function to fetch data from the API endpoint with dynamic resource type
def fetch_api_data(url):
    print(f"Fetching data from URL: {url}")  # Debug log: API URL being fetched
    response = requests.get(url, timeout=30)  # Send the request; the timeout avoids hanging indefinitely
    response.raise_for_status()  # Raise an error if the request was unsuccessful
    data = response.json()  # Parse the JSON response
    print(f"API response status code: {response.status_code}")  # Debug log: Status code of response
    return data

# Function to extract yearly counts from meta data
def extract_yearly_counts(meta_data):
    yearly_counts = []
    for year_data in meta_data.get('created', []):
        yearly_counts.append({
            'year': year_data['id'],
            'count': year_data['count']
        })
    print(f"Extracted yearly counts: {yearly_counts}")  # Debug log: Yearly counts extracted
    return yearly_counts

# Function to visualize resource type growth over time as a bar graph
def visualize_growth_bar_graph(yearly_counts, html_file):
    years = [data['year'] for data in yearly_counts]
    counts = [data['count'] for data in yearly_counts]
    # The API lists the most recent year first; reverse to get chronological order
    years.reverse()
    counts.reverse()
    # Create bar graph
    bar_trace = go.Bar(
        x=years,
        y=counts,
        marker=dict(color=custom_colors[0]),
        text=counts,  # Add counts as text labels
        textposition='auto',  # Position text labels automatically (inside or on top of bars)
        textfont=dict(
            size=22,  # Increase text size
            color='white',  # Set text color
            family='Arial',  # Font family with bold variant
            weight='bold'
        )
    )

    # Create the Plotly figure
    fig = go.Figure(data=[bar_trace],
                    layout=go.Layout(
                        title={
                            'text': f"{resource_type.title()} Growth Over Time",  # Relies on the module-level resource_type variable
                            'font': dict(size=24, family='Arial', weight='bold')  # Set title font size and type
                        },
                        # Note: the axis title font is nested under `title`, because
                        # the older `titlefont` attribute is deprecated in Plotly
                        xaxis=dict(
                            title=dict(
                                text='Year',
                                font=dict(size=22, family='Arial', weight='bold')  # Set x-axis title font size and type
                            ),
                            tickmode='linear',
                            tickfont=dict(size=22, family='Arial', weight='bold')  # Set x-axis tick labels font size and type
                        ),
                        yaxis=dict(
                            showticklabels=False,  # Hide y-axis tick labels
                            title=dict(
                                text='Number of DOIs',
                                font=dict(size=22, family='Arial', weight='bold')  # Set y-axis title font size and type
                            ),
                            tickfont=dict(size=22, family='Arial')  # Set y-axis tick labels font size and type
                        ),
                        width=2000,
                        height=900,
                        plot_bgcolor='white',
                        paper_bgcolor='white'
                    ))

    fig.show()
    print("Displayed bar graph for resource growth.")  # Debug log: Bar graph displayed

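    # Append to HTML file as a fragment (full_html=False): the surrounding
    # <html>/<body> wrapper is written separately by the calling code
    with open(html_file, 'a') as f:
        f.write(pio.to_html(fig, include_plotlyjs='cdn', full_html=False))
        print(f"Appended bar graph to HTML file: {html_file}")  # Debug log: Graph appended to HTML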

# Example usage
resource_type = "instrument"
api_url = f"https://api.datacite.org/dois?resource-type-id={resource_type.lower()}"

html_file = "resource_growth_graphs.html"

# Initialize HTML file with basic structure
with open(html_file, 'w') as f:
    f.write("<html><head><title>Resource Growth Graphs</title></head><body>")
    print(f"Initialized HTML file: {html_file}")  # Debug log: HTML file initialized

# Fetch API data and extract yearly counts
data = fetch_api_data(api_url)
meta_data = data.get('meta', {})
yearly_counts = extract_yearly_counts(meta_data)
# Extract total count of all time
total_count = meta_data.get('total', 0)
print(f"Total count of all time: {total_count}")  # Debug log: Total count of all time

# Visualize the growth as a bar graph
visualize_growth_bar_graph(yearly_counts, html_file)

# Close the HTML file after writing all content
with open(html_file, 'a') as f:
    f.write("</body></html>")
    print(f"Closed HTML file: {html_file}")  # Debug log: HTML file closed

print(f"Graphs saved in {html_file}")

Initialized HTML file: resource_growth_graphs.html
Fetching data from URL: https://api.datacite.org/dois?resource-type-id=instrument
API response status code: 200
Extracted yearly counts: [{'year': '2024', 'count': 108}, {'year': '2023', 'count': 3}, {'year': '2021', 'count': 7}, {'year': '2020', 'count': 1}, {'year': '2019', 'count': 17}]
Total count of all time: 136


Displayed bar graph for resource growth.
Appended bar graph to HTML file: resource_growth_graphs.html
Closed HTML file: resource_growth_graphs.html
Graphs saved in resource_growth_graphs.html
