# Example Workbook for Using APIs to Download Data from Organizations

This Jupyter Notebook demonstrates how to use APIs to download data from a specific organization and provider. The example uses the search functionality from the original website: [Saudi Open Data](https://open.data.gov.sa/en/publishers).

## Workflow Overview

1. **Import Necessary Modules**: Import the required functions from the `open_ksa` module.
2. **Search for Organizations**: Use the search function to find organizations based on a keyword.
3. **Retrieve Organization ID**: Extract the organization ID from the search results.
4. **Get Organization Resources**: Fetch all dataset IDs and other resources associated with the organization.
5. **Download Data**: Download the datasets to a specified directory.

### Steps 1 and 2

Here we import the module and search for organizations by keyword. This is the discovery step: the search returns every organization whose name matches the term.


In [3]:
# Import the open_ksa package, which provides the functions used in this workbook
import open_ksa as ok

# Example: search for organizations that match a keyword
orgs = ok.organizations(search="saudi")

# Note: orgs holds the full JSON response returned by the API


| publisherID                          | slug                                | Datasets |
--------------------------------------------------------------------------------------
| 8ac3c65d-196b-4c20-84a0-a8f7df8eb97d | saudi_arabian_airlines_organization | 147      |
| 4ae6e154-831c-4b91-af10-08d1712b9457 | saudi_post                          | 130      |
| d69f01cd-ef1b-47e4-ad23-8e58a8d5a468 | saudi-ports-authority               | 97       |
| 31112e94-7ee6-4a3f-9f5e-7e847508fa9f | saudi_railways_organization         | 83       |
| 8355d923-079a-4637-92f6-01af9c2b2ab2 | saudi-red-crescent-authority        | 65       |
| ec06e501-f8a2-43a8-a207-81315c7da202 | saudi-press-agency                  | 51       |
| 784467c1-0949-4120-b969-e4ffbd037bfa | saudi_customs_authority             | 38       |
| 1005ea7c-0848-44ee-b193-ea9be821ec14 | saudi-geological-survey             | 36       |
| c6fedddd-8103-400c-a073-6a26d878baf2 | saudi_industrial_development_fund   | 33       |
| 83a1712a-7c
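
The table above lists a slug for each publisher, so one alternative to picking an entry by position is to look it up by slug. The following is a minimal sketch, not part of `open_ksa`: `find_publisher_id` is a hypothetical helper, and it assumes each entry in `orgs['content']` carries a `slug` key alongside `publisherID` (the `slug` key name is taken from the table's column header and may differ in the actual JSON).

```python
# Sample shaped like the search result used in this notebook.
# Assumption: entries expose a "slug" key, as the table's column header suggests.
sample_orgs = {
    "content": [
        {
            "publisherID": "8ac3c65d-196b-4c20-84a0-a8f7df8eb97d",
            "slug": "saudi_arabian_airlines_organization",
        },
        {
            "publisherID": "ec06e501-f8a2-43a8-a207-81315c7da202",
            "slug": "saudi-press-agency",
        },
    ]
}


def find_publisher_id(orgs, slug):
    """Return the publisherID of the entry whose slug matches, or None."""
    for entry in orgs.get("content", []):
        if entry.get("slug") == slug:
            return entry.get("publisherID")
    return None


press_agency_id = find_publisher_id(sample_orgs, "saudi-press-agency")
print(press_agency_id)
```

With a real search result, the call would look like `find_publisher_id(orgs, "saudi-press-agency")`. A lookup by slug does not depend on the order in which the API returns results.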

### Steps 3 - 5

Once you have found the organization you want to pull data from, assign its ID, look up all of its associated resources, and download them to a local folder.

The folder path is relative to the working directory where `python` is running, and the allowed extensions specify which file types to download.
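Because the output folder may not exist yet, it can help to create it before downloading. The sketch below builds the folder name the same way this workbook does (trimmed, spaces replaced with underscores, lowercased) and creates it with `pathlib`. `make_output_dir` is a hypothetical helper, not an `open_ksa` function, and the organization name in the demonstration is a placeholder.

```python
import tempfile
from pathlib import Path


def make_output_dir(organization_name, base="opendata"):
    """Build the folder name from the organization name and create it if missing."""
    folder = organization_name.strip().replace(" ", "_").lower()
    path = Path(base) / folder
    # parents=True creates the base folder too; exist_ok=True makes reruns safe
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demonstration in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    out = make_output_dir("Example Organization", base=tmp)
    print(out.name)  # example_organization
```

In the workbook, the result could be passed on as `location = str(make_output_dir(resources['organization_name']))`.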

#### Variables

- **`dataset_ids`**: List of dataset IDs retrieved from the organization.
- **`ks`**: Publisher ID of the selected organization, taken from the search results.
- **`location`**: Directory path where the datasets will be saved.
- **`organization_id`**: Organization ID (same as `ks`).
- **`orgs`**: Dictionary containing search results for organizations.
- **`resources`**: Dictionary containing the organization's ID, name, and dataset IDs.

This workbook provides a structured approach to interact with the API, search for organizations, retrieve relevant data, and download it for further analysis.


In [7]:
# Pick an organization from the search results and read its publisher ID.
# Here we take the entry at index 5; change the index (0 to N) to select a different organization.
ks = orgs['content'][5]['publisherID']
# The publisher ID is now obtained programmatically. You can also set ks directly to an ID string
# of your choosing, or change the search term and adjust the index to match the organization you want.
resources = ok.get_org_resources(org_id=ks)
# Collect all of the dataset IDs for the organization
dataset_ids = resources['dataset_ids']
# The response also carries the organization ID, which is the same value as ks.
# The variable is named ks for 'King Saud University'.
organization_id = resources['organization_id']
# Build the output path from the organization name (trimmed, lowercased, spaces replaced with underscores)
location = f"opendata/{resources['organization_name'].strip().replace(' ', '_').lower()}"
# Download the CSV resources for all of the organization's datasets
ok.get_dataset_resources(dataset_ids=dataset_ids,
                         allowed_exts=["csv"],
                         # Change output_dir to write to a different directory
                         # Note: you may have to create the directory first
                         output_dir=location
                         )
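
After the download finishes, a quick way to check what arrived is to count the files in the output folder by extension. This is a minimal sketch using only the standard library. `count_files_by_extension` is a hypothetical helper, and it searches subfolders as well because the layout that `get_dataset_resources` writes is not described here.

```python
from collections import Counter
from pathlib import Path


def count_files_by_extension(directory):
    """Count files under `directory` (recursively), keyed by lowercase extension."""
    counts = Counter()
    for path in Path(directory).rglob("*"):
        if path.is_file():
            # Files without an extension are counted under the empty string
            counts[path.suffix.lower().lstrip(".")] += 1
    return dict(counts)
```

Calling `count_files_by_extension(location)` after the cell above should report only `csv` entries if `allowed_exts=["csv"]` was respected.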
