# 1) Introduction to API Authentication

Authentication is key to confirming identity and safeguarding sensitive data within APIs. It matters for several reasons:

1. Data Protection: It ensures sensitive information remains secure.
1. Rate Limiting: Authentication enables APIs to impose specific usage limits based on the user's identity.
1. User Management: It allows APIs to deliver personalized data and responses to individual users.

There are several common methods of authentication in APIs:

1. API keys: Simple yet effective unique identifiers for verifying users or applications.
1. OAuth: A more complex authorization framework that lets users grant an application limited access without sharing their password.
1. Basic Auth: This involves sending username and password credentials, encoded in Base64, via the HTTP header.
1. Token authentication: Involves issuing a token after login, which is then used in subsequent request headers for authentication.

## Instructions

In this exercise, we will attempt to access a protected endpoint: `https://api-server.dataquest.io/private/historical_data` of the World Development Indicators API without authentication. This will help us understand the role and importance of authentication in API usage.

1. Import the `requests` library.

1. Define a variable `url` that contains the URL of a protected endpoint of the World Development Indicators API.

1. Use the `requests.get()` function to send a GET request to the API endpoint stored in url. Store the response in a variable named response.

1. Print the status code and the content of the response.

In [1]:
import requests
url = 'https://api-server.dataquest.io/private/historical_data'

response = requests.get(url)

print(response)

<Response [401]>


# 2) Navigating API Documentation

This section is a guide to our API server, which serves content from the World Development Indicators.

Here's what we need to know to use the API effectively:

1. **Base URL:** The primary access point for the API is https://api-server.dataquest.io/economic_data. This URL serves as the foundation to which endpoint paths are appended. For private tables, we will use https://api-server.dataquest.io/private/economic_data.

1. **Endpoints:** The API provides various endpoints, such as /indicators, /countries, /historical_data, /footnote, /country_series, and /series_time. These endpoints represent different data resources accessible through the API.

1. **HTTP Methods:** Our API primarily responds to GET requests, allowing users to retrieve data from the specified endpoints.

1. **Parameters:** Parameters like limit, offset, field, filter_by, sort_by, and sort_desc can be used to refine data retrieval, offering control over the quantity, pagination, and sorting of the data.

1. **Response Formats:** The API delivers responses as JSON-formatted strings, specifically as lists of dictionaries enclosed in quotes. To convert these strings into usable list formats in Python, you can use json.loads(). This approach ensures that the data is both user-friendly and easily parsed for practical utilization.

1. **Rate Limits:** Most APIs impose limits on the number of requests you can make in a certain timeframe to prevent abuse and ensure fair usage.

1. **Error Messages:** Our API is designed to handle various error scenarios effectively, providing clear messages to aid in troubleshooting. Here are some potential error cases and their corresponding messages:

    * Authentication Errors: If a request to a private endpoint lacks proper authentication, the API responds with "Unauthorized" or "Invalid authentication credentials". This occurs when the 'Authorization' header is missing, incorrect, or if the provided credentials do not match our records.

    * Database and Table Validation: When a request is made for a non-existent database or table, the API raises a "Selected database does not exist" or "Selected table does not exist" error. This ensures users are querying valid data sources.

    * Invalid Query Parameters: If query parameters like field, filter_by, or sort_by contain values not present in the database schema, the API responds with "Invalid [parameter] parameter, please ensure that only fields available in schema are specified."

    * Filter Syntax Errors: Incorrect filter syntax, particularly with comparators, triggers an "Incorrect filter comparator, please use one of the following: =, !=, ~" error message.

    * Access Violations: Attempts to access private data without proper authorization, or using the public API for private data, result in "Not authorized" or "Please use public API to retrieve data from a public source" messages.

    * General Exceptions: For other unforeseen errors, the API provides a generic "Unknown error occurred" message. This is a catch-all response for exceptions not explicitly handled by the API.

1. **Authentication:** Access to private endpoints requires authentication, which is supported through **two methods**: `basic authentication` and `token-based authentication`. For basic authentication, send the header `Authorization: Basic <encoded_credentials>`. For token-based authentication, send `Authorization: Bearer <token>`.
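
The base URL, endpoint, parameter, and response-format items above can be tied together in a minimal sketch. The helper names `build_url` and `parse_records` and the sample parameter values are illustrative assumptions, not part of the API, and the sample body is made up to imitate the quoted-string format described under Response Formats.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api-server.dataquest.io/economic_data"

def build_url(endpoint, **params):
    """Append an endpoint path and optional query parameters to the base URL."""
    url = f"{BASE_URL}/{endpoint.lstrip('/')}"
    return f"{url}?{urlencode(params)}" if params else url

def parse_records(body):
    """Decode a response body into a list of dictionaries.

    If the first decode yields a string (JSON wrapped in quotes), decode again.
    """
    data = json.loads(body)
    if isinstance(data, str):
        data = json.loads(data)
    return data

# Illustrative parameter values (assumed, not taken from the API documentation).
print(build_url("/countries", limit=5, offset=10))

# Made-up sample imitating a quoted JSON string returned by the server.
sample_body = json.dumps(json.dumps([{"country_code": "AFE", "1960": None}]))
records = parse_records(sample_body)
print(records[0]["country_code"])
```

With the `requests` library, the same helper could be applied to a live response as `parse_records(response.text)`.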

# 3) Basic Authentication

In Basic Authentication, we encode a username and password, which are then sent in the request headers. Here's how to implement this in Python:

Begin by setting up your credentials. For the API we are working with, the username is 'dq' and the password is 'test'.

```Python 
username = "dq"
password = "test"
```

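Encode these credentials using base64. Note that Base64 is an encoding, not encryption: it only puts the credentials in a header-safe format, and anyone who intercepts the value can decode it. The protection in transit comes from sending the request over HTTPS.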

```Python
import base64
encoded_credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
```
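
To see why Base64 is best treated as an encoding rather than a security measure, the short sketch below encodes the course credentials and then reverses the result without any key. It uses only the standard library and the `dq` / `test` credentials given above.

```python
import base64

username = "dq"
password = "test"

# Encode "username:password" the same way as in the step above.
encoded_credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
print(encoded_credentials)  # ZHE6dGVzdA==

# Anyone who sees the header value can reverse it; no secret is needed.
decoded = base64.b64decode(encoded_credentials).decode()
print(decoded)  # dq:test
```

This is why Basic Authentication should only be used over HTTPS, as the API URLs in this lesson do.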

Include the encoded credentials in the `Authorization` header of your request. This step communicates your authentication details to the API server.

```Python
headers = {
    'Authorization': f'Basic {encoded_credentials}'
}
```

Now, use the requests library to send a GET request to the API with these headers.
    
```Python
import requests
url = "https://api-server.dataquest.io/private/economic_data/FootNotes"
response = requests.get(url, headers=headers)
```

## Instructions

1. Encode your credentials using base64 and assign the result to a variable called encoded_credentials.

1. Write a Python script to send a GET request to the /historical_data endpoint. Include your encoded credentials in the request headers for Basic Authentication.

1. Use `https://api-server.dataquest.io/private/economic_data` as the base URL. Save the server's response in a variable named `response`.

1. Print the status code of the response to verify if your request was successful.

1. If the status code is `200`, indicating a successful request, assign the first 500 characters of the response text to a variable named `response_preview` and print it.

1. Use `response.text` to access the content of the response in text form.

In [2]:
import base64
import requests
username = "dq"
password = "test"
url = 'https://api-server.dataquest.io/private/economic_data/historical_data'

#encode credentials
encoded_credentials = base64.b64encode(f"{username}:{password}".encode()).decode()

#headers
headers = {'Authorization': f'Basic {encoded_credentials}'}


response = requests.get(url, headers=headers)
print(response)

if response.status_code == 200:
    response_preview = response.text[:500]
    print(response_preview)

<Response [200]>
"[{\"country_name\": \"Africa Eastern and Southern\", \"country_code\": \"AFE\", \"indicator_name\": \"Access to clean fuels and technologies for cooking (% of population)\", \"indicator_code\": \"EG.CFT.ACCS.ZS\", \"1960\": null, \"1961\": null, \"1962\": null, \"1963\": null, \"1964\": null, \"1965\": null, \"1966\": null, \"1967\": null, \"1968\": null, \"1969\": null, \"1970\": null, \"1971\": null, \"1972\": null, \"1973\": null, \"1974\": null, \"1975\": null, \"1976\": null, \"1977\": nul


# 4) Environment Variables

We learned that, in Basic Authentication, **credentials are encoded** and sent in the request headers. Base64 encoding is trivially reversible, however, so credentials hardcoded in a script are exposed to anyone who can read the code.

So, how do we protect our credentials in a Basic Authentication scenario? The answer lies in using **environment variables**.

An environment variable is a dynamic "object" containing a value that can be used by one or more software programs in an operating system. You can think of them as secret lockers in your computer where you can keep important information.

In Python, we use the os module to set and get environment variables. Here's how you can **set** an environment variable:

    import os

    os.environ['USERNAME'] = 'your-username'
    os.environ['PASSWORD'] = 'your-password'

And to **retrieve** these credentials in your script:

    username = os.environ.get('USERNAME')
    password = os.environ.get('PASSWORD')
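
In practice the variables are usually set outside the script (in the shell or the deployment configuration) so that the secret never appears in the code. The sketch below reads them and fails early if either is missing. The variable names `API_USERNAME` and `API_PASSWORD` and the helper `load_credentials` are hypothetical; distinct names are used because some systems (Windows, for example) already define `USERNAME`.

```python
import os

def load_credentials(env=None):
    """Return (username, password) read from environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    required = ("API_USERNAME", "API_PASSWORD")  # hypothetical variable names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["API_USERNAME"], env["API_PASSWORD"]

# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"API_USERNAME": "dq", "API_PASSWORD": "test"}
username, password = load_credentials(demo_env)
print(username)
```

Calling `load_credentials()` with no argument reads from the real environment.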

Alternatively, storing your Basic Authentication credentials in a **separate configuration file** is another common approach, provided the file is kept out of version control. This is useful when managing multiple credentials or configurations. Your Python script can read these credentials as needed, keeping them out of your main codebase.

For instance, your credentials.json file might look like this:

    {
        "username": "your-username",
        "password": "your-password"
    }

And in your Python script, you could use:

    import json

    # Reading credentials from a file
    with open('credentials.json', 'r') as file:
        credentials = json.load(file)
        username = credentials['username']
        password = credentials['password']


Let's consider some risks associated with exposed credentials in Basic Authentication:

1. **Unauthorized Access**: Exposed credentials can lead to unauthorized access to APIs, potentially leading to data breaches.

1. **Data Manipulation**: If someone else gets hold of your credentials, they could manipulate the data you are accessing or modifying through the API.

1. **Service Abuse**: Misuse of your credentials could result in the abuse of the API services, affecting your application's functionality and possibly incurring additional costs.

1. **Reputation Damage**: Misuse of credentials under your name could negatively impact your or your organization's reputation.

# 5) Troubleshooting: Handling Authentication Errors

When working with APIs, it's important to handle situations where authentication fails. These failures are usually indicated by specific HTTP status codes, part of the HTTP (HyperText Transfer Protocol) response sent by servers to indicate the status of a web request.

Here are some common HTTP status codes related to authentication errors:

`400 Bad Request`: This status code indicates that the request sent to the server was incorrect or corrupted and the server couldn't understand it. It often points to issues such as improperly formatted request syntax, invalid request message parameters, or deceptive request routing.

`401 Unauthorized`: This means the request lacks valid authentication credentials for the target resource. In other words, our credentials were missing or invalid, and we've been denied access.

`403 Forbidden`: This means the server understood the request, but it refuses to authorize it. This status is similar to 401 (Unauthorized), but re-authenticating with the same credentials will not help: the client does not have permission to access the resource.

`429 Too Many Requests`: This means the user has sent too many requests in a given amount of time ("rate limiting"). In this case, we've exceeded our quota and need to slow down.

In Python, we can elegantly handle these errors using a `try/except` block. This approach allows us to catch the error and respond appropriately, rather than having the entire program crash. Here's an example of how this can be implemented:


In [5]:
import requests

import base64

username = 'dq'
password = 'tests'

# Encode credentials using base64
encoded_credentials = base64.b64encode(f"{username}:{password}".encode()).decode()

# Set up headers for Basic Authentication
headers = {
    'Authorization': f'Basic {encoded_credentials}'
}

url = 'https://api-server.dataquest.io/private/economic_data/footnotes'

try:
    response = requests.get(url, headers=headers)
    response.raise_for_status()
except requests.exceptions.HTTPError as err:
    print(f"HTTP error occurred: {err}")  # Handling specific HTTP error

HTTP error occurred: 401 Client Error: Unauthorized for url: https://api-server.dataquest.io/private/economic_data/footnotes


In the code above, we use the `raise_for_status()` method of the response object to raise an exception if the HTTP request returned an error status. We then catch this exception in the `except` block and print a custom error message. Since we entered the wrong password (`tests` instead of `test`), the server responds with `401 Unauthorized`.
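
To turn the status codes discussed in this section into friendlier messages, a small lookup helper can be used. This is a minimal sketch: the name `describe_status` and the wording of the hints are illustrative assumptions, not something the API provides.

```python
# Short hints for the status codes covered in this section (wording is illustrative).
STATUS_HINTS = {
    400: "Bad request: check the URL and query parameters.",
    401: "Unauthorized: check the username and password.",
    403: "Forbidden: these credentials lack permission for this resource.",
    429: "Too many requests: slow down and retry later.",
}

def describe_status(status_code):
    """Return a readable hint for an HTTP status code."""
    return STATUS_HINTS.get(status_code, f"Unexpected HTTP status: {status_code}")

print(describe_status(401))
print(describe_status(500))
```

Inside an `except requests.exceptions.HTTPError as err:` block, the status code is available as `err.response.status_code`, so the helper could be called as `describe_status(err.response.status_code)`.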

## Instructions

In this exercise, you'll send a request with incorrect credentials, then use a `try/except` block to handle the resulting error and print a custom error message.

1. Set a variable named url with the value https://api-server.dataquest.io/private/economic_data/historical_data.

1. Set username and password with incorrect credentials.

1. Encode the credentials using base64 and prepare a dictionary named `headers` for the request. This dictionary should have a key `Authorization` whose value is the word `Basic`, a space, and then the encoded credentials (for example, `f'Basic {encoded_credentials}'`).

1. Write a `try/except` block for the GET request: in the `try` block, use the `requests.get()` method to send a GET request to the `url` using the `headers`. Call the `raise_for_status()` method of the response object to raise an exception for HTTP errors.

1. In the except block, catch the `requests.exceptions.HTTPError` exception and print a custom error message.

In [6]:
import requests
import base64

url = 'https://api-server.dataquest.io/private/economic_data/historical_data'

username = 'dq'
password = 'xxx'

encoded_credentials = base64.b64encode(f"{username}:{password}".encode()).decode()

#headers
headers = {
    'Authorization': f'Basic {encoded_credentials}'
}

try:
    response = requests.get(url, headers=headers)
    response.raise_for_status()
except requests.exceptions.HTTPError as err:
    print(f"HTTP error occurred: {err}")  # Handling specific HTTP error
        

HTTP error occurred: 401 Client Error: Unauthorized for url: https://api-server.dataquest.io/private/economic_data/historical_data


# 6) Understanding API Rate Limits and Quotas

API quotas and rate limits are mechanisms implemented to manage server resources, ensure fair usage among all users, and prevent the API from being overwhelmed by too many requests.

**API Quotas** cap the total number of requests allowed over a longer allowance period, such as a day or a month. For example, an API might limit you to 1000 requests per day. Exceeding this quota means your additional requests within the same day will be denied. This is similar to a public library limiting the total number of books you can borrow.

**API Rate Limits**, on the other hand, restrict how quickly you can send requests within a short window, such as a minute or an hour. This is similar to the library's policy on how often you can renew your borrowed books. For instance, an API might have a rate limit of 100 requests per hour. If you send 101 requests within one hour, the last request will be denied.

When you exceed an API's quota or rate limit, the server typically responds with a `429 Too Many Requests` HTTP status code. This is a clear signal to pause your requests for a while, much like a library flagging your account when you've exceeded your renewal limit.

Assuming our API allows a maximum of `1000` requests per minute, if we exceed this number of requests, then we will have to wait until 1 minute has elapsed before we can make another request. Here's an example in Python, illustrating what might happen when an API's quota or rate limit is exceeded:

```Python
import requests
import time
import base64

# Setting up Basic Authentication
username = 'dq'
password = 'test'
encoded_credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
headers = {'Authorization': f'Basic {encoded_credentials}'}

url = 'https://api-server.dataquest.io/private/economic_data/historical_data'

# Attempting to send multiple requests to demonstrate exceeding the quota or rate limit
for i in range(1001):
    response = requests.get(url, headers=headers)
    print(response.status_code)
    if response.status_code == 429:
        print("Rate limit exceeded. Waiting for 60 seconds before next request.")
        time.sleep(60)  # Waiting for 1 minute before making the next request

```
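
Some servers include a `Retry-After` header in a `429` response to say how long to wait. Whether this API sends that header is an assumption, so the sketch below falls back to a default pause when it is absent. The helper name `seconds_to_wait` is hypothetical.

```python
def seconds_to_wait(headers, default=60):
    """Return the pause suggested by a Retry-After header, else `default`."""
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        # The common form is a whole number of seconds.
        return max(0, int(value))
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch just uses the default.
        return default

print(seconds_to_wait({"Retry-After": "120"}))
print(seconds_to_wait({}))
```

In the loop above, `time.sleep(60)` could then become `time.sleep(seconds_to_wait(response.headers))`.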

# 7) Timing API Requests

Now, we'll learn how to manage our requests to stay within an API's quota and rate limits.

API rate limits control the maximum number of requests that can be made to an API within a specific timeframe. If this limit is exceeded, the API will deny further requests until the window resets.

To ensure we don't exceed the rate limit, we need to manage the timing of our requests effectively. One way to do this is by using the `time.sleep()` function in Python. This function pauses the execution of the program for a specified number of seconds.

Let's say we're working with an API that has a rate limit of `100` requests per minute. To stay within this limit, we could add a delay of 0.6 seconds (60 seconds / 100) between each request. Here's an example:

```Python
import base64
import requests
import time

url = 'https://api-server.dataquest.io/private/economic_data/footnotes'
username = 'dq'
password = 'test'
encoded_credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
headers = {'Authorization': f'Basic {encoded_credentials}'}

for i in range(100):
    response = requests.get(url, headers=headers)
    print(response.status_code)
    time.sleep(0.6)

```
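
A fixed `time.sleep(0.6)` ignores the time each request itself takes. As a possible refinement, the sketch below spaces calls by measuring elapsed time, so it only sleeps for whatever part of the interval remains. The `Throttle` class is a hypothetical helper, not part of `requests` or the API, and the limit of 100 requests per minute is the assumed figure from the example above.

```python
import time

class Throttle:
    """Space successive calls at least `per_seconds / max_requests` apart."""

    def __init__(self, max_requests, per_seconds=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = per_seconds / max_requests
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Sleep only for the part of the interval that has not yet elapsed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now

# 100 requests per 60 seconds gives one request every 0.6 seconds.
print(Throttle(max_requests=100, per_seconds=60).interval)
```

In a request loop, calling `throttle.wait()` before each `requests.get(...)` would take the place of the fixed sleep.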

Timing our requests in this way is especially crucial when we're dealing with large-scale data retrieval tasks. For example, if we need to retrieve data for thousands of indicators from the World Development Indicators API, we'll likely need to make thousands of requests. By adding a delay between each request, we can ensure that we don't exceed the API's rate limit.

Now, let's consider some real-world scenarios where managing rate limits is crucial:

1. **Social Media Analysis**: If you're building a sentiment analysis tool that uses a social media company's API to gather posts, you'll be making a large number of requests. Most companies impose a rate limit, and if you send too many requests too quickly, your access could be temporarily suspended.

1. **Web Scraping**: If you're scraping a website for data, you'll likely be making many requests to the site's server. If you send these requests too quickly, the server might interpret it as a denial-of-service attack and block your IP address.

1. **Financial Data Analysis**: If you're using an API to gather real-time stock market data for analysis, you'll need to manage your requests to avoid exceeding the API's rate limit. Exceeding the limit could result in your access being suspended, which would disrupt your data analysis.

# 8) Dealing with Rate Limits

In our data retrieval journey, we've learned to authenticate our requests using Basic Authentication, time our requests to avoid exceeding rate limits, and handle various potential errors. But what if we still hit the rate limit? Does this mean our data retrieval process has to stop? Not necessarily!

When we exceed the rate limit, the API server responds with a `429 Too Many Requests` status code. Instead of letting this error halt our data retrieval, we can catch this error, pause our script temporarily, and then continue with our requests. This approach respects the API's rate limit while ensuring our data retrieval continues.

```Python

import base64
import requests
import time

url = 'https://api-server.dataquest.io/private/economic_data/footnotes'

username = 'dq'
password = 'test'
encoded_credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
headers = {'Authorization': f'Basic {encoded_credentials}'}

for i in range(101): # Sending 101 requests to demonstrate the rate limit error
    response = requests.get(url, headers=headers)
    if response.status_code == 429:  # If we hit the rate limit
        print("Rate limit exceeded. Waiting for 60 seconds before next request.")
        time.sleep(60)  # Pause the script for 60 seconds
    else:
        print(response.status_code)
    time.sleep(0.6)  # Pause for 0.6 seconds between requests to respect the rate limit

```

In the example above, we're using a for loop to send 101 requests to the API. Notice that we're pausing for 0.6 seconds between each request to respect the API's rate limit of `100` requests per minute. We send 101 requests, one more than the limit, to show how a `429 Too Many Requests` response would be handled; because of the 0.6-second pauses, the loop may well finish without ever receiving one.

When we hit the rate limit, the server responds with a `429` status code. We catch this status code using an if condition and pause our script for 60 seconds before continuing with our requests. This pause allows us to stay within the API's rate limit while ensuring the continuity of our data retrieval process.

In real-world data retrieval tasks, handling rate limit errors gracefully keeps long-running jobs going while respecting the API's usage policies.
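
Note that the loop above moves on to the next iteration after pausing, so the request that received the `429` is never sent again. One way to repeat it is sketched below. The helper `get_with_retry` is hypothetical; it takes any zero-argument callable that returns an object with a `status_code` attribute, which keeps the example runnable without network access.

```python
import time
from types import SimpleNamespace

def get_with_retry(send, max_retries=3, wait_seconds=60, sleep=time.sleep):
    """Call send() until it returns a non-429 response or retries run out."""
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_retries:
            sleep(wait_seconds)  # pause, then repeat the same request
    return response  # still 429 after all retries

# Demonstration with stand-in responses: two 429s, then a 200.
fake_responses = iter([429, 429, 200])
result = get_with_retry(
    lambda: SimpleNamespace(status_code=next(fake_responses)),
    sleep=lambda seconds: None,  # skip the real pause in this demo
)
print(result.status_code)
```

With `requests`, the callable could be `lambda: requests.get(url, headers=headers)`.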

# 9) Understanding API Key Authentication

In a typical scenario where an API supports API key authentication, the process would involve registering or signing up on the platform hosting the API. After registration, you would usually receive an API key, often accessible in your account settings or a dedicated API section. This key is then used to authenticate your requests to the API.

For illustrative purposes, let's consider a hypothetical example using Python's requests library, assuming that an API supports API key authentication.

```Python

import requests

# Hypothetical URL for an API that supports API key authentication
url = 'https://api.example.com/data'
headers = {'Authorization': 'Bearer YOUR_API_KEY'}

# Making a request with the API key
response = requests.get(url, headers=headers)
print(response.json())

```
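
Providers differ in where they expect the key: some use the `Authorization` header as in the example above, while others use a custom header such as `X-API-Key`, so the provider's documentation is the authority. The sketch below shows a small helper that covers both styles; the function name `api_key_headers` and the header choices are illustrative assumptions.

```python
def api_key_headers(api_key, header_name="Authorization", prefix="Bearer "):
    """Build request headers that carry an API key."""
    return {header_name: f"{prefix}{api_key}"}

# Same style as the example above.
print(api_key_headers("YOUR_API_KEY"))

# Custom-header style used by some providers (hypothetical here).
print(api_key_headers("YOUR_API_KEY", header_name="X-API-Key", prefix=""))
```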