<!--- screen_index='889.0' type='text' --->

# Optional Query Parameters and Data Filtering
## <p>Learn how to use optional query parameters to refine your API data requests.</p>

## Casey Bates

<!--- screen_index='889.1' type='text' experimental='False' error_okay='False' sequence='1' --->

# Introduction to Query Parameters in APIs

Welcome to this lesson on **Optional Query Parameters and Data Filtering**! In an earlier lesson, we covered the basics of APIs and the Python `requests` library, including how to send HTTP requests, retrieve data from APIs, and handle errors. This foundational knowledge sets the stage for our next topic: optional query parameters in API interactions, a concept crucial not only in APIs but also in the world of Large Language Models (LLMs) like ChatGPT.

On this screen, we'll explore the significance of query parameters and data filtering techniques, focusing on their applications in APIs and LLMs. We'll start by understanding the role of optional query parameters in refining API requests to retrieve specific data sets. This concept mirrors the way LLMs process and analyze vast datasets, often encompassing billions of parameters, to understand and generate human-like text based on specific contexts or user queries.

### Large Language Models and Data Filtering:

LLMs like ChatGPT are trained on extensive datasets to grasp the nuances of human language and generate relevant responses. When you prompt ChatGPT with a question, such as 'Explain the theory of relativity,' it draws on patterns learned from that training data to produce a response that aligns with your request. This process is loosely analogous to using query parameters in an API to extract a subset of data from a larger dataset. Just as you might use a `filter_by` parameter (as we will see on the next screen) in an API to get data about a specific country or topic, ChatGPT uses its trained parameters to generate a context-specific response. In both cases, the underlying concept involves selecting the most relevant information from a large pool of data to meet the specific needs of the query.

Understanding how query parameters work in API interactions provides valuable insights into the principles underlying LLMs' data processing and response generation. As we explore these concepts, you'll gain a deeper appreciation for the complexity and power of both APIs and LLMs in handling and interpreting vast amounts of information.

<!--- screen_index='889.1.5' type='code' experimental='False' error_okay='False' sequence='2' --->

# Further Exploration of Query Parameters in APIs

Having introduced the concept of query parameters and their relevance in both API interactions and Large Language Models, let's now take a closer look at how these parameters work in a more familiar context.

We will focus on the [World Bank's Development Indicators](https://datatopics.worldbank.org/world-development-indicators/), a comprehensive database containing detailed global development data for over 200 countries, some dating back more than 50 years. To enhance our interaction with this extensive resource, we have built a dedicated side server featuring our own API (`https://api-server.dataquest.io/economic_data`). This setup will provide streamlined access to the database, allowing us to efficiently utilize these valuable indicators in our coursework.

Imagine you're at a restaurant, and you order a burger. However, you don't just want any burger, you want it with extra cheese, no pickles, and a side of sweet potato fries instead of regular fries. In this scenario, the extra cheese, no pickles, and sweet potato fries are all specific instructions or parameters to modify your 'request' for a burger. 


<center>
<img src="https://s3.amazonaws.com/dq-content/889/1.1a-m889.svg">
</center>



In the world of APIs, we often want to do something similar. Rather than retrieving all the data an API offers, we might want only a subset of that data. This is where optional query parameters come in. They allow us to specify or filter the data we want from an API, much like adding specific instructions to our burger order.

For instance, to filter the data so that it only includes countries in Sub-Saharan Africa, we add a query parameter to the URL, like `https://api-server.dataquest.io/economic_data/countries?filter_by=region=Sub-Saharan%20Africa`.

The World Development Indicators API supports these parameters, enabling refined searches. It has several endpoints, including `/countries`, `/indicators`, `/country_series`, `/series_time`, `/footnotes`, and `/historical-data`.



To illustrate, sending a GET request to the API without query parameters looks like this:

```python
import requests

response = requests.get("https://api-server.dataquest.io/economic_data/countries")
data = response.json()
```

The `data` variable now holds a list of all countries in the database. However, if we are specifically interested in countries within a certain region, such as Sub-Saharan Africa, we need to utilize query parameters to refine our request. It's important to understand that not every API will accept the same query parameters, which is why consulting the API's documentation is essential. In our case, our API supports the `filter_by` parameter, which allows for more targeted searches.

We can modify our request to include this parameter:

```python
response = requests.get("https://api-server.dataquest.io/economic_data/countries?filter_by=region=Sub-Saharan%20Africa")
data = response.json()
```

Here, the `filter_by=region=Sub-Saharan%20Africa` segment in the URL is a query parameter where:

- `?` is a delimiter that marks the beginning of the query string. It separates the path of the URL from the parameters that are being passed.
- `filter_by` indicates the type of filtering we are applying.
- `region` is a specific field in the API's database that we want to filter by. In this context, `region` refers to the geographical area of the countries.
- The first `=` sign following `filter_by` is used to assign the filtering criteria (`region` in this case), and the second `=` sign assigns the specific value (`Sub-Saharan Africa`) to the `region` field.
- `%20` is URL encoding for a space character, necessary because URLs cannot contain actual space characters. However, when composing a GET request in an editor or a tool, you don't need to manually type `%20` for spaces; it is typically handled automatically by the software.
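As a minimal sketch of the encoding step described above, Python's standard-library `urllib.parse.quote` can percent-encode the region name before it is placed in the URL. The variable names are illustrative rather than part of the lesson's API, and no request is sent.

```python
from urllib.parse import quote, urlsplit, parse_qs

# Base endpoint from the lesson; the variable names are illustrative.
base_url = "https://api-server.dataquest.io/economic_data/countries"
region = "Sub-Saharan Africa"

# quote() replaces the space with %20 and leaves letters and hyphens unchanged.
url = f"{base_url}?filter_by=region={quote(region)}"
print(url)

# Splitting the URL again shows one query parameter and its decoded value.
query = parse_qs(urlsplit(url).query)
print(query)
```

Everything after the first `=` is treated as the value of `filter_by`, which is why `region=Sub-Saharan Africa` comes back as a single string.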

Now, the `data` variable holds a list of countries specifically in the Sub-Saharan Africa region. This demonstrates how query parameters can be effectively used, when supported by an API, to refine data requests according to our requirements.

In the upcoming exercises, you'll explore in detail API interactions using the World Development Indicators API. Your journey will guide you through refining data requests and query parameters, handling errors, and optimizing data retrieval.

Let's get started!

## Instructions

In this exercise, you'll practice using query parameters to retrieve specific data from the World Development Indicators API. We have already imported the `requests` library, so you don't need to import it again.

You will need to:

1. Send a GET request to the World Development Indicators API's `/countries` endpoint.
	- Use the base url: `"https://api-server.dataquest.io/economic_data"`

2. Use the `filter_by` query parameter to filter the data for countries in the region of `South Asia`.

3. Retrieve the data and store it in a variable called `region_south_asia`.

4. Print the `region_south_asia` variable to observe the filtered data.

## Hint

- Make sure to include the necessary query parameter in the URL to filter the data for 'South Asia'.
- The `requests` library is already imported; use its `get()` function to send the GET request.

In [0]:
## Display

import requests

## Answer

response = requests.get("https://api-server.dataquest.io/economic_data/countries?filter_by=region=South Asia")
region_south_asia = response.json()
print(region_south_asia)

In [0]:
## Custom check










def __CUSTOM_CHECK__(check_info):
    ac = AnswerChecker(check_info)
    # Check that region_south_asia holds the response as a string
    ac.add(CheckStringVariable("region_south_asia"))

    return ac.check_answer()


<!--- screen_index='889.2' type='code' experimental='False' error_okay='False' sequence='3' --->

# Multiple Query Parameters

On the previous screen, we learned about query parameters and their role in refining data requests to APIs. Let's continue applying this concept in a practical scenario, where we require specific data from the World Development Indicators API, particularly from the `/indicators` endpoint. This endpoint has various fields like `topic`, `indicator_name`, `series_code`, `source`, `periodicity`, and `long_name`.

Imagine we need to analyze indicators related to the **topic** "Environment: Agricultural production" and also want to restrict the results to the **series_code** "AG.AGR.TRAC.NO". To achieve this, we can use multiple query parameters in our API request to filter the data more precisely.

For instance, a GET request to the `/indicators` endpoint with two query parameters would look like this:

```python
import requests

response = requests.get("https://api-server.dataquest.io/economic_data/indicators?filter_by=topic=Environment:%20Agricultural%20production&filter_by=series_code=AG.AGR.TRAC.NO")
data_str = response.json()
print(data_str)
```
Below is a snippet of our output:
```text
"[{\"series_code\": \"AG.AGR.TRAC.NO\", \"topic\": \"Environment: Agricultural production\", \"indicator_name\": \"Agricultural machinery, tractors\", \"short_definition\": null, \"long_definition\": \"Agricultural machinery refers to the number of wheel and crawler tractors (excluding garden tractors) in use in agriculture at the end of the calendar year specified or during the first quarter of the following year.\", \"unit_of_measure\": null}]"
```
Note that for titles like 'Economic Policy & Debt', which contain an ampersand (`&`), we write `%26` instead. This is the URL encoding for an ampersand, and it ensures the API interprets the character as part of the value. The `&` before the second `filter_by`, by contrast, is the character that separates one query parameter from the next. This precise structuring of parameters directs the API to return data specifically relevant to the chosen topic and series code.
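The difference between an encoded ampersand and a separating ampersand can be sketched with the standard library. The filter values below reuse names from this lesson, but combining them this way is only an illustration; no request is sent, so whether the API returns data for this pair is not checked.

```python
from urllib.parse import quote, urlsplit, parse_qsl

base_url = "https://api-server.dataquest.io/economic_data/indicators"

# Illustrative filters; the first value contains a literal ampersand.
filters = ["topic=Economic Policy & Debt", "periodicity=Biennial"]

# Encode each filter (keeping "=" readable), then join the parameters with "&".
query = "&".join("filter_by=" + quote(f, safe="=") for f in filters)
url = f"{base_url}?{query}"
print(url)

# Parsing the query string back shows two separate filter_by parameters.
pairs = parse_qsl(urlsplit(url).query)
print(pairs)
```

Because the ampersand inside the topic is written as `%26`, only the unencoded `&` is treated as a boundary between parameters.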

Looking at the `data_str` variable, we can see that it's a string format that represents a JSON array. This JSON string needs to be parsed into a Python object (specifically a list of dictionaries) to allow for data manipulation and extraction. Here's how we can approach it:

```python
import json
# Parse the JSON string into a Python list of dictionaries
data = json.loads(data_str)
```
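One plausible reason `response.json()` returns a string here is that the payload was JSON-encoded twice. That is an assumption about this API's behavior, not something the lesson states. The sketch below simulates such a payload offline, with a shortened, illustrative record, to show why two decoding steps are needed.

```python
import json

# Illustrative record shaped like the output above (fields shortened).
records = [{"series_code": "AG.AGR.TRAC.NO",
            "indicator_name": "Agricultural machinery, tractors"}]

# Simulate a response body that was JSON-encoded twice (assumed behavior).
body = json.dumps(json.dumps(records))

first_pass = json.loads(body)         # comparable to response.json(): a str
second_pass = json.loads(first_pass)  # the actual list of dictionaries

print(type(first_pass).__name__)
print(type(second_pass).__name__)
print(second_pass[0]["series_code"])
```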
Once parsed, we can access the information in the same way as we would with a list of dictionaries. For example, if we want to extract details about "Agricultural machinery, tractors", we can loop through the list and find the relevant dictionary.

```python
for indicator in data:
    if indicator['indicator_name'] == "Agricultural machinery, tractors":
        # Extract and print the desired information
        print(f"Topic Name: {indicator['topic']}")
        print(f"Indicator Name: {indicator['indicator_name']}")
        print(f"Long Definition: {indicator['long_definition']}")
        # Assuming we only need the first match, we can break the loop after finding it
        break
```
This code filters through the returned data to locate the indicator "Agricultural machinery, tractors" and extract relevant details about it. It demonstrates how we can combine the use of query parameters with Python's data processing capabilities to efficiently obtain the exact data we need.

Recall our previous prompt to ChatGPT, "*Explain the theory of relativity*," where we used a single query. In scenarios requiring multiple query parameters, we're looking for more than one specific filter in our API request. This is similar to how a more complex ChatGPT prompt might require the model to integrate and filter multiple pieces of information. For instance, if we asked ChatGPT about the impact of the theory of relativity on modern physics and space exploration, the AI would need to filter through its dataset for relevant information on both topics, similar to using **multiple** `filter_by` parameters in an API.

By mastering the use of multiple query parameters and Python for data manipulation, we are equipped to make precise and efficient API requests, a valuable skill in data analysis and API interaction.

## Instructions

In this exercise, you'll construct a GET request to the World Development Indicators API using multiple query parameters to filter the data. You'll analyze the returned data to confirm that the query parameters worked as expected. We have imported both `requests` and `json` libraries for you.


1. Send a GET request to the World Development Indicators API at the `/indicators` endpoint. Include query parameters to filter the data so that it only includes the topic `Health: Risk factors` (`topic=Health: Risk factors`) for indicators collected or updated once every two years (`periodicity=Biennial`). Store the response in a variable called `response`.
	- Use the base url: `https://api-server.dataquest.io/economic_data`

2. Convert the response to JSON format and store the result in a variable named `topic_str`.

3. Parse the JSON string (the `topic_str` variable) into a Python list of dictionaries. Store it in a variable called `topic`.

4. Print the `series_code` and `indicator_name` from the first row of your `topic` variable.

## Hint

- To filter data by multiple criteria, make sure your GET request's URL contains the necessary `filter_by` query parameters.
- Convert the API response using the `.json()` method and parse it with `json.loads()`.
- To access and display specific indicator information, use the keys to pull data from the first entry in your parsed JSON list.

In [0]:
## Display

import requests
import json

## Answer

response = requests.get("https://api-server.dataquest.io/economic_data/indicators?filter_by=topic=Health: Risk factors&filter_by=periodicity=Biennial")
topic_str = response.json()
topic = json.loads(topic_str)
for row in topic:
    print(f"Series Code: {row['series_code']}")
    print(f"Indicator Name: {row['indicator_name']}")
    break

In [0]:
## Custom check










def __CUSTOM_CHECK__(check_info):
    ac = AnswerChecker(check_info)
    # Check that topic holds the parsed list of dictionaries
    ac.add(CheckListVariable("topic"))
    return ac.check_answer()

<!--- screen_index='889.3' type='code' experimental='False' error_okay='False' sequence='4' --->

# Handling Invalid Parameter Values

In our exploration of API data requests, particularly with the World Development Indicators API’s `/indicators` endpoint, an essential aspect to grasp is the impact of using incorrect or unsupported fields in our queries. This is a common issue that can lead to errors or unanticipated responses from the API.

<center>
<img src="https://s3.amazonaws.com/dq-content/889/3.1-m889.svg">
</center>


To illustrate, let’s consider an example where we mistakenly use an unsupported field in our request to the `/indicators` endpoint. Suppose we make a request using the field `indicator_topic`, which does not exist in this API:

```python
import requests

response = requests.get("https://api-server.dataquest.io/economic_data/indicators?filter_by=indicator_topic=Health: Risk factors")
data_str = response.json()
print(data_str)

```

```text
{'detail': 'Invalid filter_by parameter, please ensure that only fields available in schema are specified.'}
```

The above request includes an invalid parameter (`indicator_topic=Health: Risk factors`). When we run this code, we receive an error message instead of the data we expected.

It's crucial to understand how to handle situations like these because they're common when working with APIs. APIs can have complex structures and specific requirements, and it's easy to make a mistake when crafting our requests.

LLMs behave in a similar way. They are trained on a finite set of data, so they can struggle or give unexpected responses when queried with prompts that fall outside of it. For example, asking an LLM like ChatGPT a question based on a false premise, like the collaboration between Einstein and Newton, can lead to incorrect or nonsensical answers. This is because the model's training doesn't include data to handle such historically inaccurate scenarios. We have to align queries with the model's training scope for accurate responses.

When we encounter an error, our first step should be to check the API's documentation to understand the valid parameters and their expected values. If the error persists, we can use the error message to troubleshoot the issue. In many cases, the error message will provide clues about what went wrong and how to fix it.
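One way to act on an error message in code is to check the decoded response before using it. The sketch below assumes that error responses are dictionaries with a `detail` key, as in the output shown earlier on this screen. The helper name `explain_response` is hypothetical, and the check runs on a hard-coded copy of that output rather than a live request.

```python
def explain_response(result):
    """Return a short description of a decoded API response."""
    # Assumption: errors arrive as a dict containing a "detail" message.
    if isinstance(result, dict) and "detail" in result:
        return f"API error: {result['detail']}"
    return f"Received data of type {type(result).__name__}"

# Copy of the error output shown above.
error_response = {
    "detail": "Invalid filter_by parameter, please ensure that only "
              "fields available in schema are specified."
}

message = explain_response(error_response)
print(message)
```

Other APIs may signal errors differently, for example through the HTTP status code alone, so the check needs to match what the API's documentation describes.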

Now, let's put this into practice.

## Instructions

In this exercise, you'll experience firsthand what happens when an invalid query parameter is used in an API request. You'll send a GET request to the World Development Indicators API with an invalid query parameter, observe the response, and use it to troubleshoot the issue.

1. Send a GET request to the World Development Indicators API at the `/indicators` endpoint. Include a `filter_by` query parameter that uses the invalid field `indicator_period` in an attempt to only include data collected or updated once every two years (`indicator_period=Biennial`).
	- Use `https://api-server.dataquest.io/economic_data` as the base URL.
2. Convert the response to JSON format and store the result in a variable named `invalid_data_str`.
3. Print the `invalid_data_str` variable to observe the response.

## Hint

- To initiate an API call, use `requests.get()` with the URL `https://api-server.dataquest.io/economic_data/indicators` and include the incorrect `indicator_period=Biennial` filter.
- To capture the response, apply the `.json()` method which helps in identifying possible errors in the query.
- To observe the problematic response, print the `invalid_data_str` after the JSON conversion to better understand the API's feedback.

In [0]:
## Display

import requests
import json

## Answer

response = requests.get("https://api-server.dataquest.io/economic_data/indicators?filter_by=indicator_period=Biennial")
invalid_data_str = response.json()
print(invalid_data_str)

In [0]:
## Custom check










def __CUSTOM_CHECK__(check_info):
    ac = AnswerChecker(check_info)
    ac.add(CheckDictionaryVariable("invalid_data_str"))
    return ac.check_answer()

<!--- screen_index='889.4' type='text' experimental='False' error_okay='False' sequence='5' --->

# Error Handling in API Requests

When working with APIs, encountering errors is a common occurrence. These errors can arise from various causes, such as incorrect URLs, invalid or missing parameters, or server-side issues.

Recall from earlier that making a request to an API is like asking for a specific dish in a restaurant. Now, imagine that you're in a restaurant and you ask for a dish that's not on the menu. What happens? The waiter will likely tell you that the dish you're asking for isn't available, right?

Similarly, when we make a request to an API, we may not always get the response we expect. Sometimes, we might get an error message instead. As in our previous exercise, our API responded with the message `{'detail': 'Invalid filter_by parameter, please ensure that only fields available in schema are specified.'}`. This message indicates that something was incorrect in the request. However, not all APIs are equipped with such descriptive responses. In this section, we will learn how to manage these errors effectively.

In Python programming, these errors can be managed using `try/except` blocks. When code within a `try` block encounters an error, Python shifts to the `except` block, allowing the programmer to handle the error or provide alternative instructions. This mechanism ensures that the program can gracefully manage unexpected situations or errors during execution.

<center>
<img src="https://s3.amazonaws.com/dq-content/889/4.1a-m889.svg">
</center>

In the context of API requests, we can use a try/except block to handle potential errors like this:

```python
import requests

try:
    response = requests.get('api-server.dataquest.io/economic_data/historical_data?filter_by=country_code=PAK&indicator_code=SP.POP.TOTL&year=2017')
    data = response.json()
    print(data)
except Exception as e:
    print("An error occurred with the request:", e)
```

```text
An error occurred with the request: Invalid URL 'api-server.dataquest.io/economic_data/historical_data?filter_by=country_code=PAK&indicator_code=SP.POP.TOTL&year=2017': No scheme supplied. Perhaps you meant https://api-server.dataquest.io/economic_data/historical_data?filter_by=country_code=PAK&indicator_code=SP.POP.TOTL&year=2017?
```

In the above code, we've enclosed our request in a try block. If the request fails for any reason, Python will execute the code in the except block and print an error message. Do you understand why this API request failed? Read the output to try and understand.


Like handling errors in API requests, managing inaccurate prompts in **Large Language Models (LLMs)** involves interpreting and responding to the information within their training limits. When an LLM receives a prompt that is unclear or based on incorrect information, it uses its trained algorithms to decipher and respond to the query as accurately as possible. However, if the prompt is outside the scope of its training or based on false premises, the response may be less accurate or relevant. This situation is comparable to an API generating an error when receiving a request with invalid parameters. Both scenarios emphasize the need for precise and accurate input to achieve the desired output or response.

Learning to interpret error messages is an invaluable skill. Messages like "Invalid URL: No scheme supplied" serve as clues, guiding you to identify and resolve the issue. Although it might seem challenging at first, understanding these messages is a key part of troubleshooting and will become more intuitive with practice. Let’s continue to develop these skills together.
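Catching `Exception` works, but the `requests` library also defines more specific exception classes that make the cause easier to report. The following is a minimal sketch, not the course's prescribed approach: the helper name `fetch_json` and the 10-second timeout are assumptions chosen for illustration.

```python
import requests

def fetch_json(url, params=None, timeout=10):
    """Return the decoded JSON body, or None if the request fails."""
    try:
        response = requests.get(url, params=params, timeout=timeout)
        response.raise_for_status()  # raise HTTPError for 4xx/5xx statuses
        return response.json()
    except requests.exceptions.MissingSchema as e:
        print("The URL is missing 'http://' or 'https://':", e)
    except requests.exceptions.HTTPError as e:
        print("The server returned an error status:", e)
    except requests.exceptions.RequestException as e:
        print("The request could not be completed:", e)
    return None

# The scheme is deliberately left out, so this fails before any data is sent.
result = fetch_json("api-server.dataquest.io/economic_data/countries")
print(result)
```

The more specific `except` clauses come first because `MissingSchema` and `HTTPError` are both subclasses of `RequestException`; listing the general class first would catch everything.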

<!--- screen_index='889.5' type='code' experimental='False' error_okay='False' sequence='6' --->

# Pagination in API Requests

So far, we've learned how to use optional query parameters to filter our API data requests and handle errors when they arise. Now, let's explore another critical aspect of API usage: **pagination**.

**Pagination** in APIs is a technique used to divide the data into smaller, manageable segments or pages. This becomes crucial when dealing with APIs that contain large volumes of data. Without pagination, a request to an API might attempt to fetch all available data at once, which can be overwhelming and inefficient for both the client and the server.

By implementing pagination, we can control the volume of data received in each request. This is achieved by specifying the amount of data per 'page' and which 'page' or segment of data to retrieve.

While some APIs use `page` and `per_page` parameters for pagination, our side API server employs `limit` and `offset`. Here's how they work:

- `limit`: This parameter sets the number of records to return in a single response. It's akin to defining the 'size' of each page.
- `offset`: This parameter determines the starting point in the dataset, effectively 'skipping' a specified number of records before returning data.

For instance, if we want to access only 5 records from the World Development Indicators API, skipping the first 3 records, our request would look like this:

```python
import requests

parameters = {
    "limit": 5,   
    "offset": 3  
}
response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
data = response.json()

```
Notice the additional argument `params` in our request. Let's break it down:

- `params=parameters`: Here, `params` is an argument of the `requests.get()` function. It lets you pass query parameters to the URL in the form of a dictionary. The variable `parameters` contains these query parameters.
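To see roughly what `params=parameters` does to the URL, the sketch below builds the same query string with the standard library's `urlencode`. It approximates the final URL that `requests` would produce for this simple dictionary, and no request is sent.

```python
from urllib.parse import urlencode

base_url = "https://api-server.dataquest.io/economic_data/indicators"
parameters = {"limit": 5, "offset": 3}

# urlencode() joins the key-value pairs with "&"; "?" separates them from the path.
url = f"{base_url}?{urlencode(parameters)}"
print(url)
```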


In this example, the API will return records starting from the 4th record and include a total of 5 records. In other words, it will return the 4th to the 8th record in the dataset. Such a method of pagination is particularly effective for methodically processing or browsing through large datasets.
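The effect of `limit` and `offset` can be reproduced offline with list slicing. This is only a local illustration of the idea, using made-up record labels; the helper name `paginate` is hypothetical and the API itself is not called.

```python
# Twenty made-up records labelled "record 1" to "record 20".
records = [f"record {n}" for n in range(1, 21)]

def paginate(items, limit, offset):
    """Skip `offset` items, then return at most `limit` items."""
    return items[offset:offset + limit]

page = paginate(records, limit=5, offset=3)
print(page)
```

With `limit=5` and `offset=3`, the slice holds the 4th through 8th records, matching the description above.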

Keep in mind that not all APIs use the same parameter names for pagination, and some APIs may not support pagination at all. For example, some APIs might use `startIndex` and `maxResults`, while others could use `cursor` for more complex pagination schemes. Always refer to the API documentation to understand the specific pagination parameters and their usage.

In **Large Language Models** (LLMs), managing large datasets or responses can be analogous to the concept of pagination in APIs. LLMs, when processing extensive data or long texts, often need to segment this information into smaller, manageable parts for effective processing and response generation. This is similar to `pagination` in APIs, where data is divided into pages or segments to avoid overwhelming requests and to ensure efficient data retrieval. In LLMs, techniques akin to `limit` and `offset` might be employed to handle lengthy inputs or to generate responses in a structured manner, ensuring that the model stays focused and relevant to the prompt without being overloaded with information. This segmentation approach in both APIs and LLMs highlights the importance of structuring data processing to enhance efficiency and effectiveness.

## Instructions

In this exercise, you'll practice using pagination in API requests. You will work with the World Development Indicators API to understand the impact of pagination on the data retrieved.

1. Create a dictionary named `parameters` with keys `limit` and `offset`, that will fetch the first 10 countries in our API.

2. Using the `parameters` dictionary, send a GET request to the World Development Indicators API at the `/countries` endpoint. 
	- Ensure you include pagination by passing the `parameters` dictionary in the request.
	- Use `"https://api-server.dataquest.io/economic_data"` as the base URL.

3. Convert the response to a Python list and store the result in a variable named `data_with_pagination`.
	- Note that in this case, you will directly load the JSON response using `json.loads(response.json())`.

4. Print the length of `data_with_pagination` to observe the number of records returned. This will help you understand the impact of pagination on the data you receive from the API.

## Hint

- To paginate the request so that it returns the first 10 countries, add `limit` and `offset` to the `parameters` dictionary with values `10` and `0`, respectively.
- To see pagination in action, print the length of `data_with_pagination`.

In [0]:
## Display

import requests
import json

## Answer

parameters = {
    "limit": 10,
    "offset": 0
}
response = requests.get("https://api-server.dataquest.io/economic_data/countries", params=parameters)
data_with_pagination = json.loads(response.json())
print(len(data_with_pagination))

In [0]:
## Custom check










def __CUSTOM_CHECK__(check_info):
    ac = AnswerChecker(check_info)
    data_list=['data_with_pagination']
    for data in data_list:
        ac.add(CheckListVariable(data))
    return ac.check_answer()

<!--- screen_index='889.6' type='code' experimental='False' error_okay='False' sequence='7' --->

# Implementing Pagination

On the previous screen, we introduced the concept of pagination in API data requests. Much like reading a book page by page, pagination allows us to systematically retrieve API data in segments, which is particularly beneficial when dealing with large datasets. It enables us to break down the data into manageable chunks, avoiding the inefficiency and overwhelming nature of trying to process everything at once.

For instance, in a typical API response supporting pagination, the structure might look like this:

```json
{
    "page": 1,
    "per_page": 10,
    "total": 100,
    "total_pages": 10,
    "data": [...]
}
```

In this example, `total` represents the total number of records available, and `total_pages` indicates the number of pages, calculated based on the `per_page` parameter.
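The relationship between these fields is simple arithmetic, sketched below with the numbers from the example response. The helper `page_to_offset` is a hypothetical name; it shows how a 1-based page number would map onto an `offset` value for APIs that paginate by offset instead.

```python
import math

total = 100
per_page = 10

# Round up so that a partially filled last page still counts as a page.
total_pages = math.ceil(total / per_page)

def page_to_offset(page, per_page):
    """Convert a 1-based page number to a zero-based offset."""
    return (page - 1) * per_page

print(total_pages)
print(page_to_offset(3, per_page))
```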

When working with the World Development Indicators API, which has a substantial number of records, we use pagination parameters such as `limit` and `offset` for efficient data navigation. The `limit` parameter defines the number of rows to be returned in each request, while `offset` determines the starting point for data retrieval.

Here’s an example of how we can apply these parameters:

```python
import json
import requests

parameters = {
    "limit": 10,
    "offset": 0
}

response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
data_str = response.json()
data = json.loads(data_str)

print("Records on this page:", len(data))
print("Topic of first record:", data[0].get("topic", []))
```
```text
Records on this page: 10
Topic of first record: Environment: Agricultural production
```

In this code, `limit` is set to `10`, so our API response will include 10 records. The `offset` is `0`, corresponding to the first page of data.

Note that `data[0].get("topic", [])` means: "Get the value associated with the key `topic` from the first dictionary in `data`. If the key `topic` does not exist, return an empty list instead."
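To walk through every page with `limit` and `offset`, a common pattern is to increase the offset by the limit until a page comes back empty. The sketch below uses a stand-in function, `fetch_page`, in place of a real request, and assumes that asking for records past the end of the dataset yields an empty list. Both the function and that assumption are illustrative and not confirmed for this API.

```python
def fetch_page(limit, offset):
    """Stand-in for an API call: one page of a made-up 23-record dataset."""
    dataset = list(range(1, 24))
    return dataset[offset:offset + limit]

def fetch_all(limit):
    """Collect every record by advancing the offset one page at a time."""
    records = []
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        if not page:  # an empty page means there is nothing left
            break
        records.extend(page)
        offset += limit
    return records

all_records = fetch_all(limit=10)
print(len(all_records))
```

With 23 records and a limit of 10, the loop requests offsets 0, 10, 20, and 30, and stops when the last request returns nothing.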

While the focus here is on `limit` and `offset`, it's important to note that different APIs might use various pagination methods. For instance:

### Cursor-Based Pagination

Example with cursor-based pagination:

```python
parameters = {
    "cursor": "abc123"
}
```

The API provides a cursor to fetch the next set of records.
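A cursor-based loop can be sketched as follows. The response shape, with `data` and `next_cursor` keys, and the `fetch` stand-in are hypothetical; real APIs name these fields differently, so the documentation remains the reference.

```python
# Made-up responses keyed by the cursor that requests them.
PAGES = {
    None: {"data": [1, 2, 3], "next_cursor": "abc123"},
    "abc123": {"data": [4, 5], "next_cursor": None},
}

def fetch(cursor=None):
    """Stand-in for an API call that accepts an optional cursor."""
    return PAGES[cursor]

items = []
cursor = None
while True:
    page = fetch(cursor)
    items.extend(page["data"])
    cursor = page["next_cursor"]  # the server tells us where to resume
    if cursor is None:            # no cursor means this was the last page
        break

print(items)
```

Unlike `offset`, the client never computes the cursor itself; it only passes back the value received with the previous page.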

### Keyset Pagination

Example with keyset pagination:

```python
parameters = {
    "start_after": "2021-01-01T12:00:00Z"
}
```

Here, `start_after` holds the sort-key value of the last record from the previous fetch (a timestamp in this example), so the next request resumes right after it.

Always check the API documentation to understand its specific pagination scheme. This will help you avoid unexpected behavior or incomplete data retrieval. Now, let's proceed to the exercise to implement what we've learned about pagination.

## Instructions

In this exercise, you'll practice implementing pagination in API requests. You'll send a GET request to the World Development Indicators API with pagination parameters, count the records on the returned page, and then extract a value from a specific record.

1. Create a dictionary named `parameters` with keys `limit` and `offset`, setting their values to `10` and `0` respectively.

2. Using the `parameters` dictionary, send a GET request to the `/indicators` endpoint of the World Development Indicators API. Convert the response to JSON format and store the result in a variable named `indicator_page_str`.
	- Use `"https://api-server.dataquest.io/economic_data"` as the base URL.

3. Parse the JSON string (`indicator_page_str`) into a Python list of dictionaries and store it in a variable called `indicator_page`.

4. Store the number of records in a variable called `indicator_len_records`.

5. Get the `indicator_name` of the fourth record in the `indicator_page` variable and store it in a variable called `fourth_indicator_name`.

6. Print both `indicator_len_records` and `fourth_indicator_name` to view the results.

## Hint

- To apply pagination, configure the `limit` and `offset` values within the `parameters` dictionary before making the GET request.
- Convert the returned JSON string to a Python list of dictionaries for easy data manipulation using `json.loads(json_string)`, replacing `json_string` with `indicator_page_str`.
- To get the number of records, use the `len()` function.
- Access the `indicator_name` of the fourth entry using `indicator_page[fourth_entry_index].get("indicator_name", [])`. Remember that in Python, indexing starts at zero.

In [0]:
## Display

import requests
import json 

## Answer

parameters = {
    "limit": 10,  
    "offset": 0  
}
response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
indicator_page_str = response.json()
indicator_page = json.loads(indicator_page_str)
indicator_len_records = len(indicator_page)
fourth_indicator_name = indicator_page[3].get("indicator_name", [])
print(indicator_len_records)
print(fourth_indicator_name)

In [0]:
## Custom check










def __CUSTOM_CHECK__(check_info):
    ac = AnswerChecker(check_info)
    ac.add(CheckListVariable("indicator_page"))
    ac.add(CheckIsNumericVariable("indicator_len_records"))
    ac.add(CheckStringVariable("fourth_indicator_name"))
    return ac.check_answer()

<!--- screen_index='889.7' type='code' experimental='False' error_okay='False' sequence='8' --->

# Optimizing Pagination

In our previous discussions, we've explored the fundamentals of pagination in API requests. As we progress, it becomes crucial to focus on **optimized pagination**. This is about strategically fetching data in a way that minimizes the number of requests and reduces unnecessary data retrieval, thereby enhancing efficiency.

Imagine you're at a buffet. You have a plate to fill, but there's a catch: the plate can only hold a certain amount of food at a time. You could go back and forth, picking up a little bit of everything on each trip, but that would be time-consuming. It's more efficient to fill your plate with the dishes you want most first, and go back only if you still need more. This is essentially what optimized pagination does: it gets you the data you need in as few requests as possible.


<center>
<img src="https://s3.amazonaws.com/dq-content/889/7.1a-m889.svg">
</center>



Optimized pagination in API interactions is primarily about making judicious choices regarding the `limit` and `offset` parameters. These parameters are instrumental in controlling both the volume and the specific segment of data you retrieve in each request.


For instance, let's consider an API that houses `100` records. If your requirement is to access only the first `30` records, optimization can be achieved by setting the limit to `30` and the offset to `0`. This method efficiently retrieves all the needed records in a single request, rather than multiple requests for smaller chunks of data.

Here's how this can be practically implemented:

```python
import requests
import json

# Setting optimized pagination parameters
parameters = {
    "limit": 30,  # Number of records to retrieve
    "offset": 0   # Starting point of the data retrieval
}

# Executing a GET request with optimized pagination
response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
data_str = response.json()   # This API returns a JSON-encoded string
data = json.loads(data_str)  # Parse that string into a list of records

# Displaying a record to verify the data
print(data[0])  # Assuming 'data' is a list of records
```

In this example, the `limit` parameter is utilized to specify the exact number of records we intend to retrieve, while `offset` helps us define the starting point of our data retrieval. By fetching 30 records in one go, we significantly streamline our data retrieval process.
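To make the saving concrete, here is a small sketch that counts how many requests a given `limit` implies. The `requests_needed` helper is a hypothetical name introduced for illustration; it is plain arithmetic and does not call the API.

```python
import math


def requests_needed(records_wanted, limit):
    """Number of requests required to fetch records_wanted records."""
    return math.ceil(records_wanted / limit)


print(requests_needed(30, 10))  # 3
print(requests_needed(30, 30))  # 1
```

Fetching the same 30 records with a `limit` of `10` takes three round trips, while a `limit` of `30` takes one.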


It's important to remember that the scope for optimization can differ from one API to another. Certain APIs might impose fixed limits on data retrieval or employ different methods for data navigation. Therefore, it's always recommended to refer to the API documentation for precise information on pagination capabilities and any existing limitations.
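When an API does cap the page size, the same idea still applies: request as much as allowed each time. The sketch below assumes a hypothetical cap of `20` records per request and builds one `parameters` dictionary per request; the `plan_requests` helper and the cap value are illustrative, not taken from any API's documentation.

```python
def plan_requests(records_wanted, max_limit, start_offset=0):
    """Return one parameters dict per request, never exceeding max_limit."""
    plans = []
    offset = start_offset
    remaining = records_wanted
    while remaining > 0:
        limit = min(remaining, max_limit)
        plans.append({"limit": limit, "offset": offset})
        offset += limit
        remaining -= limit
    return plans


for parameters in plan_requests(records_wanted=30, max_limit=20):
    print(parameters)
# {'limit': 20, 'offset': 0}
# {'limit': 10, 'offset': 20}
```

Each dictionary could then be passed as `params=` to `requests.get()`, in order.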

Equipped with this knowledge of optimized pagination, let's proceed to the next exercise where we'll apply these concepts.

## Instructions

In this exercise, you will put the concept of optimized pagination into practice. You will construct two GET requests to the `/indicators` endpoint of the World Development Indicators API using optimized pagination parameters.

2. Create a dictionary named `parameters` with keys `limit` and `offset`. Set their initial values to `50` and `0` respectively, to retrieve the first `50` records.

3. Use the `parameters` dictionary to send a GET request to the `/indicators` endpoint. Convert the response to JSON format and store it in a variable named `data_page_1`.
	- Make sure to parse the JSON string into a Python list of dictionaries before saving it to the `data_page_1` variable.

4. Print the number of records in the response (this should match `limit`) and the first `topic` in the JSON data.

5. Modify the `offset` value in the `parameters` dictionary to `50` to access the next `50` records.

6. Make a second GET request to the `/indicators` endpoint using the updated `parameters`. Parse the JSON response into a Python list and store it in a variable named `data_page_2`.

7. Print the first `topic` in `data_page_2` to confirm that you have successfully fetched the next set of data.

## Hint

- To set up optimized pagination, determine the `limit` and `offset` values best suited for the data you aim to retrieve.
- Use the `len()` function to count the records in your response; the count should match `limit`.
- Use the `parameters["offset"] = new_offset` to modify the value of the key "offset".
- To observe the impact of pagination, compare the first topic retrieved from the initial and subsequent API calls.

In [0]:
## Display

import requests
import json

## Answer

parameters = {
    "limit": 50,  
    "offset": 0   
}
response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
data_page_1 = json.loads(response.json())
print("Limit (total records):", len(data_page_1))
print("First topic:", data_page_1[0].get("topic", []))
parameters["offset"] = 50
response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
data_page_2 = json.loads(response.json())
print("New topic:", data_page_2[0].get("topic", []))

In [0]:
## Custom check










def __CUSTOM_CHECK__(check_info):
    ac = AnswerChecker(check_info)
    ac.add(CheckListVariable("data_page_2"))
    return ac.check_answer()

<!--- screen_index='889.8' type='code' experimental='False' error_okay='False' sequence='9' --->

# Combining Query Parameters and Pagination

On the previous screens, we learned how to refine our API data requests using optional query parameters and how to manage large amounts of data using pagination. But what happens when we need to do both at the same time? Can we combine query parameters and pagination to fetch specific subsets of data efficiently? The answer is yes, and that's exactly what we're going to cover on this screen.

Consider a scenario where you're at a large library with thousands of books. You're looking for books on a specific topic, say, climate change (this is where query parameters come in). However, the books are arranged in such a way that only a certain number can be displayed at a time (this is where pagination comes in). In order to find what you're looking for, you'd need to use both the topic filter (query parameter) and the book display limit (pagination). 

The same applies to APIs. Sometimes we need to both refine our data requests (using query parameters) and manage large amounts of data (using pagination) at the same time. And just like in the library scenario, we can combine query parameters and pagination in our API requests.

Let's see how we can implement this in Python using the requests library:

```python
import requests
import json


parameters = {
    "filter_by":"currency_unit=Euro,income_group=High income",
    "limit": 5,
    "offset": 0
}


response = requests.get("https://api-server.dataquest.io/economic_data/countries", params=parameters)
data_str = response.json()
data = json.loads(data_str)

for record in data:
    print(record.get("country_code"))
```

```text
AND
AUT
BEL
CYP
DEU
```

In this example, we've combined multiple query parameters and pagination into a single API request to retrieve specific data. The request aims to fetch countries that meet two criteria: those using the `Euro` as their currency unit (`currency_unit=Euro`) and those classified as high-income nations (`income_group=High income`). These filters are combined in the `filter_by` parameter, separated by a comma, which allows for multiple filtering conditions in one request.

For pagination, the `limit` is set to `5`, and the `offset` is `0`. This means the request will return the first `5` records from the dataset, starting from the very beginning. Essentially, the API is asked to return the first five countries that match both of our specified criteria: using the Euro and being high-income nations.
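If you build requests like this often, the `filter_by` string can be assembled from a dictionary instead of typed by hand. The sketch below assumes the `key=value` pairs are simply joined with commas, as in the example above; `build_filter_by` is a hypothetical helper, and it would not work for values that themselves contain a comma or an equals sign.

```python
def build_filter_by(filters):
    """Join key/value pairs as 'key=value', separated by commas."""
    return ",".join(f"{key}={value}" for key, value in filters.items())


filters = {"currency_unit": "Euro", "income_group": "High income"}
parameters = {
    "filter_by": build_filter_by(filters),
    "limit": 5,   # pagination: records per request
    "offset": 0,  # pagination: starting position
}

print(parameters["filter_by"])  # currency_unit=Euro,income_group=High income
```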

After receiving the response, the script converts the JSON response into a Python list and prints the country codes for the first 5 records. This approach efficiently filters and paginates data, showcasing how to extract targeted information from an API.

By combining query parameters and pagination, we can fetch specific subsets of data more efficiently, saving us a lot of time and resources, especially when dealing with large datasets.
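To go beyond the first page of filtered results, one common pattern is to keep the filter fixed and advance `offset` until a page comes back short. The sketch below uses a hypothetical in-memory `fetch` function and made-up country codes in place of the real API, so only the paging logic is on display.

```python
# Hypothetical stand-in for the API: 12 fake records that already match
# the filter.
MATCHING = [{"country_code": f"C{i:02d}"} for i in range(12)]


def fetch(parameters):
    """Simulate a filtered, limit/offset-paginated response."""
    start = parameters["offset"]
    return MATCHING[start:start + parameters["limit"]]


parameters = {
    "filter_by": "currency_unit=Euro,income_group=High income",
    "limit": 5,
    "offset": 0,
}

codes = []
while True:
    page = fetch(parameters)
    codes.extend(record.get("country_code") for record in page)
    if len(page) < parameters["limit"]:
        break  # a short (or empty) page means nothing is left
    parameters["offset"] += parameters["limit"]

print(len(codes))  # 12
```

With a real endpoint, `fetch` would wrap the `requests.get()` and `json.loads()` calls shown earlier.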

## Instructions

In this exercise, you'll practice combining query parameters and pagination to fetch specific subsets of data from the World Development Indicators API more efficiently. You'll construct a GET request using both query parameters and pagination, and analyze the returned data to confirm that the request fetched the desired subset of data efficiently.

2. Set the query parameters to filter the data by `region` and `income_group`. Use the values "Europe & Central Asia" for `region` and "Upper middle income" for `income_group`.

3. Set the pagination parameters to request the first 5 records. Use the keys `limit` and `offset` for this.

4. Combine the query parameters and pagination parameters into a single dictionary called `parameters`.

5. Send a GET request to the `/countries` endpoint of the World Development Indicators API using the `parameters` dictionary. Store the response in a variable called `response`.
	- Use `https://api-server.dataquest.io/economic_data` as the base URL.

6. Convert the response data to JSON format and store it in a variable called `data_combined_str`.  

7. Parse the JSON response into a Python list and store it in a variable named `data_combined`.

8. Loop through the `data_combined` variable. Inside the loop:
	- Assign the name of each country to a variable called `country_name` using the `table_name` field.
	- Print the `country_name` variable.

## Hint

- To filter data effectively, specify the `region` and `income_group` criteria in the query parameters.
- To control the data volume, configure the `limit` and `offset` pagination parameters for the response.
- To integrate both methods, combine query parameters and pagination into a single `parameters` dictionary.
- To evaluate the efficiency of your combined request, examine the `table_name` field of the countries in `data_combined`.

In [0]:
## Display

import requests
import json

## Answer

parameters = {
    "filter_by": "region=Europe & Central Asia,income_group=Upper middle income",
    "limit": 5,
    "offset": 0
}
response = requests.get("https://api-server.dataquest.io/economic_data/countries", params=parameters)
data_combined_str = response.json()
data_combined = json.loads(data_combined_str)
for record in data_combined:
    country_name = record.get("table_name")
    print(country_name)

In [0]:
## Custom check










def __CUSTOM_CHECK__(check_info):
    ac = AnswerChecker(check_info)
    ac.add(CheckListVariable("data_combined"))
    ac.add(CheckVariable("country_name"))
    return ac.check_answer()

<!--- screen_index='889.9' type='text' experimental='False' error_okay='False' sequence='10' --->

# Pagination with Page and Page_Number in API Requests

On previous screens, we explored the use of `limit` and `offset` for pagination and how to apply these parameters in API requests. Different APIs, however, may take different approaches to pagination. This screen introduces another common method used by many APIs: the use of `page` and `page_number`.

<center>
<img src="https://s3.amazonaws.com/dq-content/889/5.1-m889.svg">
</center>


This method simplifies the process of navigating through large datasets. While `limit` and `offset` directly control the number of records returned and the starting point in the dataset, `page` and `page_number` provide a more intuitive way to access data, akin to flipping through pages of a book.


Let's put these new concepts into action, assuming our API (the World Development Indicators API) supports `page` and `page_number` for pagination.

Here's how we can adapt our request to the World Development Indicators API with these pagination parameters:

```python
import requests


# Set base URL and endpoint
base_url = "https://api-server.dataquest.io/economic_data"
endpoint = "/historical_data"

# Set query parameters with pagination
parameters = {
    "country_code": "IND",
    "indicator_code": "SP.POP.TOTL",
    "from_year": 2000,
    "to_year": 2020,
    "page": 1,         # Current page number
    "page_number": 50  # Number of records per page
}

# Send GET request
response = requests.get(base_url + endpoint, params=parameters)
data = response.json()

# Print first 5 records
print(data["records"][:5])

# Handling total number of records
total_records = data["total"]
print(f"Total records available: {total_records}")

```

In this script, we've set up an API request to fetch specific historical data:

- The `country_code` parameter is set to "IND", targeting data related to India.
- The `indicator_code` is "SP.POP.TOTL", the indicator code for total population.
- The `from_year` and `to_year` parameters specify the time range for the data, from the year 2000 to 2020.

For pagination, the script uses two key parameters:

- `page`: This parameter indicates the current page number within the dataset. In this example, it's set to 1, meaning the first page of the dataset is being accessed.
- `page_number`: This parameter specifies the number of records to display per page. Here, it's set to 50, indicating that each page will contain 50 records.

This pagination method, using `page` and `page_number`, offers a user-friendly approach to accessing large datasets. It simplifies data retrieval by abstracting away the need for calculating offsets (as required in `limit` and `offset` pagination). Instead, you directly specify which page number to access and how many records that page should contain.
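The offset calculation that page-based pagination hides is simple arithmetic. As a sketch, assuming pages are numbered from 1 as in the example above, a page number and page size translate to `limit` and `offset` like this (`page_to_offset` is a hypothetical helper name):

```python
def page_to_offset(page, per_page):
    """Translate a 1-based page number into limit/offset parameters."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {"limit": per_page, "offset": (page - 1) * per_page}


print(page_to_offset(page=1, per_page=50))  # {'limit': 50, 'offset': 0}
print(page_to_offset(page=3, per_page=50))  # {'limit': 50, 'offset': 100}
```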

After sending the GET request, the script prints the first 5 records. It then retrieves and prints the total number of records available for the query, stored under the key `total` in the response. This is a common feature in APIs providing paginated data, where the total count of available records across all pages is given, offering a comprehensive view of the data's scope. Such information is useful for understanding the dataset's extent and for planning subsequent data retrieval and navigation through the pages.
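One way to use the total is to work out how many pages exist and prepare the remaining requests up front. The sketch below uses a hypothetical first-page response with a made-up `total` of `120`; the `records` list is shortened for brevity, and the response shape is assumed to match the example above.

```python
import math

# Hypothetical first-page response (records truncated for brevity)
first_page = {"records": [{"year": 2000}, {"year": 2001}], "total": 120}
per_page = 50

# Round up so a partially filled last page is still counted
pages = math.ceil(first_page["total"] / per_page)

# Page 1 is already fetched; build parameters for the pages that remain
remaining_requests = [
    {"page": page, "page_number": per_page} for page in range(2, pages + 1)
]

print(pages)  # 3
print(remaining_requests)
# [{'page': 2, 'page_number': 50}, {'page': 3, 'page_number': 50}]
```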

<!--- screen_index='889.10' type='text' experimental='False' error_okay='False' sequence='11' --->

# Review

Great work! In this lesson, we explored the power of APIs, starting with basic `GET` requests to the World Development Indicators API and learning how to use optional `query parameters` for efficient and targeted data retrieval.

We began by understanding the role of optional `query parameters` in refining API `requests`, using the analogy of specifying details when ordering food at a restaurant. This included an example of filtering data for specific countries in Sub-Saharan Africa using the World Development Indicators API. We then advanced to leveraging `multiple filters` in an API request to obtain more precise data, and explored parsing `JSON` strings into Python objects for thorough data analysis.

We also addressed the common issue of using incorrect query parameters, using examples from the World Development Indicators API to illustrate potential errors and the importance of consulting API documentation for correct parameter usage. This led us to the concept of `error handling` in API interactions, where we compared API requests to restaurant orders and introduced Python's `try/except` blocks as a method for managing errors and ensuring continued program execution.

Finally, we focused on `pagination` as a vital technique for handling large datasets from APIs. We discussed the significance of pagination parameters like `limit` and `offset`, drawing a parallel to reading a book page by page. We emphasized the need for `optimizing pagination` for efficient data fetching, similar to making smart choices at a buffet. We then **integrated** query parameters with pagination techniques to handle extensive datasets, and introduced an alternative pagination strategy using `page` and `page_number` for easier navigation through large datasets, illustrated with the World Development Indicators API.

In the next lesson, we'll shift our focus to authentication and rate limits.

<!--- screen_index='889.11' type='takeaways' experimental='False' error_okay='False' sequence='12' --->

# Takeaways

## Syntax 

- Sending a GET request to the World Development Indicators API without any query parameters:

    ```
    import requests
    response = requests.get("https://api-server.dataquest.io/economic_data/countries")
    data = response.json()
    ```
    
- Sending a GET request to the API with a `filter_by` query parameter to refine the data:

    ```
    response = requests.get("https://api-server.dataquest.io/economic_data/countries?filter_by=region=Sub-Saharan%20Africa")
    data = response.json()
    ```
   
- Adding multiple query parameters to an API GET request:

    ```
    response = requests.get("https://api-server.dataquest.io/economic_data/indicators?filter_by=topic=Environment:%20Agricultural%20production&filter_by=series_code=AG.AGR.TRAC.NO")
    data_str = response.json()
    ```
    
- Parsing a JSON response to a Python data structure:

    ```
    import json
    data = json.loads(data_str)
    ```
    
- Searching within a list of dictionaries for specific data:

    ```
    for indicator in data:
        if indicator['indicator_name'] == "Agricultural machinery, tractors":
            print(f"Indicator Name: {indicator['indicator_name']}")
            break  # Stop at the first match
    ```
    
- An example of a GET request with an invalid query parameter:

    ```
    response = requests.get("https://api-server.dataquest.io/economic_data/indicators?filter_by=indicator_topic=Health: Risk factors")
    invalid_data_str = response.json()
    print(invalid_data_str)
    ```
    
- Implementing a try/except block in API requests:
	
    ```
    try:  
        response = requests.get("https://api-server.dataquest.io/...")
        data = response.json()
    except Exception as e:
        print("An error occurred with the request:", e)
    ```
    
- Sending a GET request with pagination to fetch a specific segment of data:

    ```
    parameters = {
        "limit": 5,
        "offset": 3
    }
    response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
    data = response.json()
    ```
    
- Defining pagination parameters in an API request:

    ```
    parameters = {
        "limit": 10,
        "offset": 0
    }
    response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
    data_page_str = response.json()
    data_page = json.loads(data_page_str)
    ```

- Setting up and executing a GET request using optimized pagination:

    ```
    parameters = {
        "limit": 30,
        "offset": 0
    }
    response = requests.get("https://api-server.dataquest.io/economic_data/indicators", params=parameters)
    data = json.loads(response.json())
    print(data[0])
    ```
    
- Crafting a GET request with combined query parameters and pagination:

    ```
    parameters = {
        "filter_by": "currency_unit=Euro,income_group=High income",
        "limit": 5,
        "offset": 0
    }
    response = requests.get("https://api-server.dataquest.io/economic_data/countries", params=parameters)
    data = json.loads(response.json())
    for record in data:
        print(record.get("country_code", []))
    ```

- Setting up a paginated API request with `page` and `page_number`:

    ```
    parameters = {
        "page": 1,
        "page_number": 50
    }
    response = requests.get("https://api-server.dataquest.io/economic_data/historical_data", params=parameters)
    data = response.json()
    ```
    
    
    
    
## Concepts


- **Optional Query Parameters and Filters**: Customize API data retrieval like choosing pizza toppings, using filters to refine searches like using specific sieves for precision.

- **URL Encoding and JSON Parsing**: URL encoding ensures accurate interpretation of special characters, akin to a secret code. JSON parsing translates data into a Python-friendly format, much like translating a foreign language.

- **Handling API Errors and Documentation**: Error messages are clues for troubleshooting, similar to solving a detective case. API documentation is the essential guide, akin to a cookbook for API queries.

- **Error Handling Techniques**: Using try/except in Python adds safety nets to code, preventing crashes. Interpreting error messages is key for issue resolution, like diagnosing car problems based on dashboard indicators.

- **Pagination and Data Retrieval Efficiency**: Managing data like reading a book one chapter at a time. Selecting the right limit and offset for data retrieval is akin to planning an efficient road trip.

- **Combined Strategies for Data Management**: Filtering and pagination together optimize data retrieval, like a librarian organizing books. Page-based pagination enhances user experience, similar to a book's table of contents.


## Resources

- [API Query Parameters and Filters](https://www.moesif.com/blog/technical/api-design/REST-API-Design-Filtering-Sorting-and-Pagination/)
- [Understanding URL Encoding and JSON Parsing](https://realpython.com/python-json/)