## Why is it important to use Web APIs for research?

Web APIs help automate access to research data and metadata. This enables reproducibility, automation of data pipelines, and programmatic interaction with repositories like 4TU.ResearchData.

## REST APIs in a nutshell

A REST API is a web service that uses HTTP methods (GET, POST, etc.) to allow communication between clients and servers. Responses are usually in JSON format, making them easy to parse and reuse.

## 1. REUSE: Search and Download Datasets

### Get 10 datasets published since 2025-01-01 (via `curl`)

In [None]:
!curl "https://data.4tu.nl/v2/articles" | jq

## What is curl?

curl stands for **Client URL**. 

It’s a command-line tool that allows you to transfer data to or from a server using various internet protocols, most commonly HTTP and HTTPS.

It is especially useful for making API requests: you can send GET, POST, PUT, and DELETE requests, upload or download files, send headers or authentication tokens, and more.

## Why curl works for APIs

REST APIs are based on the HTTP protocol, just like websites. When you visit a webpage, your browser sends a GET request and displays the HTML it gets back. When you use curl, you do the same thing, but in your terminal. For example: 

`curl https://data.4tu.nl/v2/articles` sends an HTTP GET request to the 4TU.ResearchData API and prints the response body to the terminal.
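For comparison, the same GET request can be sketched in Python with only the standard library. This is a minimal illustration rather than part of the original notebook: the helper names `build_get_request` and `fetch_json` are made up here, and `fetch_json` needs network access.

```python
import json
import urllib.request

API_URL = "https://data.4tu.nl/v2/articles"  # endpoint taken from the curl example


def build_get_request(url: str) -> urllib.request.Request:
    """Create (but do not send) an HTTP GET request for the given URL."""
    return urllib.request.Request(url, method="GET")


def fetch_json(url: str, timeout: float = 30.0):
    """Send the GET request and decode the JSON body (requires network access)."""
    with urllib.request.urlopen(build_get_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds the request; call fetch_json(API_URL) to actually contact the server.
    request = build_get_request(API_URL)
    print(request.get_method(), request.full_url)
```

Separating "build the request" from "send it" makes it possible to inspect the method and URL without touching the network.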

## Key reasons why curl is used

- Built into most Linux/macOS systems and easily installable on Windows.
- Scriptable: usable in bash scripts, notebooks, and automation.
- Supports headers, query parameters, tokens, POST data, etc.
- Can write output to files (`>`, `-o`, `-O`) or pipe it to processors like `jq`.

In [None]:

!curl "https://data.4tu.nl/v2/articles?limit=2&published_since=2024-07-25" > data.json

In [None]:
!curl "https://data.4tu.nl/v2/articles?limit=2&published_since=2024-07-25" | jq

In [None]:
!curl "https://data.4tu.nl/v2/articles?item_type=3&limit=10&published_since=2025-01-01" | jq

### Get 10 software records published since 2025-01-01 (via `curl`)

In [None]:
!curl "https://data.4tu.nl/v2/articles?item_type=9&limit=10&published_since=2025-01-01" | jq
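The query strings in the cells above differ only in `item_type`, `limit`, and `published_since`. A small Python sketch of how such URLs could be assembled is shown below; the function name `build_articles_url` and the constant names are hypothetical, and the meaning of the `item_type` values (3 for datasets, 9 for software) is inferred from the examples in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.4tu.nl/v2/articles"

# item_type values as used in the curl examples (inferred from this notebook)
ITEM_TYPE_DATASET = 3
ITEM_TYPE_SOFTWARE = 9


def build_articles_url(item_type=None, limit=None, published_since=None):
    """Return the /v2/articles URL with only the parameters that were given."""
    params = {}
    if item_type is not None:
        params["item_type"] = item_type
    if limit is not None:
        params["limit"] = limit
    if published_since is not None:
        params["published_since"] = published_since
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


if __name__ == "__main__":
    print(build_articles_url(ITEM_TYPE_SOFTWARE, limit=10, published_since="2025-01-01"))
```

Letting `urlencode` build the query string avoids quoting mistakes when a value contains spaces or special characters.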

### Save dataset titles and DOIs to file (via `curl`)

In [None]:
!curl "https://data.4tu.nl/v2/articles?item_type=3&limit=10&published_since=2025-01-01" | jq -r '.[] | "* " + .title + " (" + .doi + ")"' > datasets.md

### Exercise: Save dataset title, DOI, and publication date (via `curl`)

In [None]:
!curl "https://data.4tu.nl/v2/articles?item_type=3&limit=10&published_since=2025-01-01" | jq -r '.[] | "* " + .title + " (" + .doi + ") (" + .published_date + ")"' > datasets.md
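The `jq` filter turns each record into one Markdown bullet. A rough Python equivalent of that formatting step is sketched below. The sample records are invented placeholders (not real 4TU.ResearchData entries), and the function name `to_markdown_lines` is hypothetical; only the field names `title`, `doi`, and `published_date` come from the `jq` expressions above.

```python
import json

# Made-up sample standing in for an API response; real records have many more fields.
SAMPLE_RESPONSE = """
[
  {"title": "Example dataset A", "doi": "10.0000/example.a", "published_date": "2025-01-15"},
  {"title": "Example dataset B", "doi": "10.0000/example.b", "published_date": "2025-02-03"}
]
"""


def to_markdown_lines(records, with_date=False):
    """Format each record as '* title (doi)', optionally followed by '(date)'."""
    lines = []
    for record in records:
        line = f"* {record['title']} ({record['doi']})"
        if with_date:
            line += f" ({record['published_date']})"
        lines.append(line)
    return lines


if __name__ == "__main__":
    records = json.loads(SAMPLE_RESPONSE)
    print("\n".join(to_markdown_lines(records, with_date=True)))
```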

## Search Datasets by Keyword

In [None]:
!curl --request POST --header "Content-Type: application/json" --data '{ "search_for": "mechanical engineering" }' https://data.4tu.nl/v2/articles/search | jq

In [None]:
!curl --request POST --header "Content-Type: application/json" --data '{ "search_for": "Nanomechanical String Resonators" }' https://data.4tu.nl/v2/articles/search | jq
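The keyword search is a POST request with a small JSON body. The sketch below shows how the same request could be built in Python with the standard library; `build_search_request` and `search` are hypothetical helper names, and `search` requires network access.

```python
import json
import urllib.request

SEARCH_URL = "https://data.4tu.nl/v2/articles/search"  # endpoint from the curl example


def build_search_request(search_for: str) -> urllib.request.Request:
    """Create (but do not send) a POST request carrying a JSON search body."""
    body = json.dumps({"search_for": search_for}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(search_for: str, timeout: float = 30.0):
    """Send the search request and decode the JSON response (requires network access)."""
    request = build_search_request(search_for)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Using `json.dumps` for the body keeps the quoting correct even when the search term contains quotes or non-ASCII characters.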

## Using a Token to Access Author Info (via `curl`)

#### Create the `.env` file in Binder and paste in the token (for demonstration purposes)

`echo 'API_TOKEN_NEXT="your_token_here"' > ~/.env`

`source ~/.env`

`echo "Token loaded: ${API_TOKEN_NEXT:0:5}..."`

- A second token may be needed for the production server (`data.4tu.nl`). After adding it, run `source ~/.env` again:

`echo 'API_TOKEN_MAIN="your_token_here"' >> ~/.env`

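### Troubleshooting

- Variables loaded with `source` in one notebook cell are not visible in later cells, because each `!` command runs in its own subshell. If the token is not picked up in the notebook, run the commands in a Binder terminal instead.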
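One way around the subshell problem is to read the token from Python. The following is a minimal sketch under stated assumptions: the `.env` file contains simple `KEY="value"` lines like the ones written with `echo` above, and the helper names (`parse_env_text`, `read_env_file`, `get_token`, `mask`) are made up for this example.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse simple KEY="value" lines, ignoring blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def read_env_file(path):
    """Read and parse a .env file from disk."""
    return parse_env_text(Path(path).read_text(encoding="utf-8"))


def get_token(name, env_path=None):
    """Prefer an exported environment variable, then fall back to the .env file."""
    token = os.environ.get(name)
    if token:
        return token
    env_path = Path(env_path) if env_path else Path.home() / ".env"
    if env_path.exists():
        return read_env_file(env_path).get(name)
    return None


def mask(token, visible=5):
    """Show only the first few characters, like ${VAR:0:5} in the shell."""
    return token[:visible] + "..."
```

For example, `get_token("API_TOKEN_MAIN")` would return the production token if it is present in `~/.env`, and `mask(...)` can be used to confirm it loaded without printing the whole value.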

In [None]:
# Requires a token loaded from a sourced .env file (this step can be skipped in a live session, but is worth mentioning)
!curl --request POST --header "Authorization: token ${API_TOKEN_MAIN}" --header "Content-Type: application/json" --data '{ "search": "Leila Iñigo" }' https://data.4tu.nl/v2/account/authors/search | jq > author_info.md

## Upload Datasets (POST Requests)

### Basic Upload

In [None]:
!curl -X POST https://next.data.4tu.nl/v2/account/articles --header "Authorization: token ${API_TOKEN_NEXT}" --header "Content-Type: application/json" --data '{ "title": "Example dataset" }' | jq

### Upload with Author Metadata

In [None]:
!curl -X POST https://next.data.4tu.nl/v2/account/articles --header "Authorization: token ${API_TOKEN_NEXT}" --header "Content-Type: application/json" --data '{ "title": "Example dataset", "authors": [{ "first_name": "John", "full_name": "John Doe", "last_name": "Doe", "orcid_id": "0000-0003-4324-5350" }] }' | jq
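Hand-writing nested JSON inside a shell string is error-prone. As an illustration, the same authenticated request could be assembled in Python as sketched below. The function name `build_create_request` is hypothetical; the endpoint, the `Authorization: token ...` header format, and the payload fields are copied from the curl examples above.

```python
import json
import urllib.request

CREATE_URL = "https://next.data.4tu.nl/v2/account/articles"  # endpoint from the curl examples


def build_create_request(token, title, authors=None):
    """Create (but do not send) an authenticated POST request with dataset metadata."""
    payload = {"title": title}
    if authors:
        payload["authors"] = authors
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    author = {
        "first_name": "John",
        "full_name": "John Doe",
        "last_name": "Doe",
        "orcid_id": "0000-0003-4324-5350",
    }
    # "dummy-token" is a placeholder; sending the request needs a real token.
    request = build_create_request("dummy-token", "Example dataset", [author])
    print(request.get_method(), request.full_url)
```

Passing the built request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a token.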

### Upload Using YAML Metadata

In [None]:
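# Assumes the Python `yq` (a jq wrapper that prints JSON); with the Go `yq`, use `yq -o=json '.'` instead
!yq '.' example_metadata.yaml | curl -X POST https://next.data.4tu.nl/v2/account/articles -H "Authorization: token ${API_TOKEN_NEXT}" -H "Content-Type: application/json" -d @-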

## Motivation for Using Python

### Get the title, UUID, and publication date of datasets published since April 2025

In [None]:
!curl "https://data.4tu.nl/v2/articles?item_type=3&limit=10&published_since=2025-04-01" | jq -r '.[] | "* " + .title + " (" + .uuid + ") (" + .published_date + ")"' > datasets.md

### Get the description and categories of a single dataset (by UUID)

In [None]:
!curl -s https://data.4tu.nl/v2/articles/fb26fd3f-ba3c-4cf0-8926-14768a256933 | jq -r '"Description: " + .description + "\nCategories: " + (.categories | map(.title) | join(", "))' > datasets_description_categories.md

### Bash Script: Loop Through UUIDs to Collect Metadata

In [8]:
!curl -s "https://data.4tu.nl/v2/articles?published_since=2025-04-01&item_type=3&limit=10" | jq '.[] | {uuid: .uuid}' > article_ids.json
!cat article_ids.json | jq -r '.uuid' | while read -r uuid; do curl -s "https://data.4tu.nl/v2/articles/$uuid" | jq -r '"Description: " + .description + "\nCategories: " + (.categories | map(.title) | join(", "))' >> articles_full_metadata.md; done

### Limitations of Bash Scripts

- Harder to debug or extend
- Tricky to structure or merge data
- Not ideal for large-scale automation

## Using the API with Python

See `get_description_categories_datasets_example.ipynb` for a full example using `requests`.
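Independently of that notebook, the UUID loop from the bash section can be sketched with only the standard library. This is an illustrative outline, not the notebook's actual code: `fetch_record`, `summarize`, and `collect_metadata` are hypothetical names, and `fetch_record` needs network access. The fetch function is passed in as a parameter so the formatting logic can be tried offline with stand-in data.

```python
import json
import urllib.request

BASE_URL = "https://data.4tu.nl/v2/articles"


def fetch_record(uuid, timeout=30.0):
    """Fetch one record by UUID (requires network access)."""
    with urllib.request.urlopen(f"{BASE_URL}/{uuid}", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(record):
    """Build the same 'Description / Categories' text as the jq filter."""
    description = record.get("description") or ""
    categories = ", ".join(c["title"] for c in record.get("categories") or [])
    return f"Description: {description}\nCategories: {categories}"


def collect_metadata(uuids, fetch=fetch_record):
    """Fetch every UUID in turn and return one summary string per record."""
    return [summarize(fetch(uuid)) for uuid in uuids]


if __name__ == "__main__":
    # Offline demo with invented records; replace the lambda with fetch_record for real data.
    fake_records = {
        "uuid-1": {"description": "Demo", "categories": [{"title": "A"}, {"title": "B"}]},
    }
    print("\n".join(collect_metadata(["uuid-1"], fetch=lambda u: fake_records[u])))
```

Because the results come back as ordinary Python lists and dictionaries, they are straightforward to merge, filter, or write out in another format.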

## Bonus: Using `connect4tu` Python Package

You can also use the [connect4tu](https://github.com/leilaicruz/connect4tu) package for a cleaner Python interface to the 4TU API.