
# <font color="orange"> Project:  The Data Collection Phase </font>

This project has several goals. The primary objective is to prepare you for next week's learning objective: Use Python to **programmatically** access data made available via Internet APIs.

Following this project you should be able to:
- Recognize JSON and CSV data.
- Read JSON data.
- Read API documentation and use an API based on it.
- Know how to format URLs, including query strings, to access specific API data that you want.
- Understand the concept of different API endpoints within the same overall API.
- Think about how you can use JSON data returned from API calls to solve a problem.

If you feel comfortable accessing API data and reading the resulting JSON data, then you are ready to use Python to programmatically access API data and use that data to solve a problem.
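
As a warm-up for recognizing and reading the two formats, here is a minimal sketch using only the Python standard library. The JSON text mirrors an error response from the Dog API used in Part I; the CSV rows and the helper name `looks_like_json` are invented for illustration.

```python
import csv
import io
import json

# JSON: name/value pairs inside braces (mirrors a Dog API error response).
json_text = (
    '{"status": "error", '
    '"message": "Breed not found (master breed does not exist)", '
    '"code": 404}'
)

# CSV: a header row, then one comma-separated record per line.
# These rows are made up for illustration.
csv_text = "breed,image_count\nhound,3\nbeagle,2\n"


def looks_like_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


record = json.loads(json_text)                       # becomes a dict
rows = list(csv.DictReader(io.StringIO(csv_text)))   # becomes a list of dicts

print(record["status"], record["code"])
print(rows[0]["breed"], rows[0]["image_count"])
```

Note that `json.loads` converts the number `404` to an `int`, while `csv.DictReader` leaves every value as a string.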


## Part I - Study Documentation and Use a Simple API

- Investigate and use the Dog API
  - https://dog.ceo/dog-api/

- Test the API endpoints that achieve the following, and copy them into a new code cell below:
  - List all dog breeds
  - Return a random image of a dog from the entire dataset of all dog images.
    - Copy/paste the JSON that is returned by this endpoint.
  - Return the list of all images for a **specific breed**.
    - How many images are in the dataset for this breed?
  - Return the list of all images for a **specific sub-breed**.
    - HINT:  Look at the list of all breeds to find sub-breeds.
  - Return a JSON response with a **random** image of a **specific breed**. Copy/paste the resulting JSON below.
  - Copy and paste into the code cell below the JSON returned by ANY API call that gives `"status": "error"` in the result.

**NOTE**: If possible, avoid using examples that are explicitly highlighted on the *Dog API* site.
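
The endpoint patterns in the task list can also be assembled in Python. The sketch below is one possible approach, not part of the Dog API itself: the helper names `breed_images_url`, `is_error`, and `fetch_json` are hypothetical, and lowercasing the breed name is an assumption based on breed names appearing in lowercase in the breed list.

```python
import json
from urllib.request import urlopen

DOG_API = "https://dog.ceo/api"


def breed_images_url(breed, sub_breed=None, random=False):
    """Build the images URL for a breed, optionally for one sub-breed."""
    # Assumption: the API expects lowercase breed names.
    parts = [DOG_API, "breed", breed.lower()]
    if sub_breed:
        parts.append(sub_breed.lower())
    parts.append("images")
    if random:
        parts.append("random")   # a single random image instead of the full list
    return "/".join(parts)


def is_error(payload):
    """Return True when a decoded response reports an error status."""
    return payload.get("status") == "error"


def fetch_json(url):
    """Download a URL and decode its JSON body (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


print(breed_images_url("hound"))
print(breed_images_url("hound", "afghan"))
print(breed_images_url("pitbull", random=True))
```

Calling `fetch_json(breed_images_url("hound"))` would then return the decoded response, and `is_error` can be used to check it before reading the rest of the data.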

In [None]:
# All dog breeds
https://dog.ceo/api/breeds/list/all

# Random image of a dog
https://dog.ceo/api/breeds/image/random

# List of all images for a specific breed
https://dog.ceo/api/breed/hound/images

# List of all images for specific sub-breed
https://dog.ceo/api/breed/hound/afghan/images

# A RANDOM image for a specific breed
https://dog.ceo/api/breed/pitbull/images/random

# Error status
https://dog.ceo/api/breed/basset/list
{
  "status": "error",
  "message": "Breed not found (master breed does not exist)",
  "code": 404
}


## Part II - Investigate and Use a New API Provided by the National Park Service

You will investigate the [**National Park Service API**](https://www.nps.gov/subjects/developer/api-documentation.htm#/).

- Register for an API key.
- Peruse the documentation above.
- What is the `base URL` that will be included in all NPS API calls?
- Try the `activities` endpoint. How many categories of activities are returned by this API call?
- Which API endpoint can list all the `parks` in their database?
  - Write down the URL for this API endpoint.
  - Execute that API call.
  - How many `JSON name/value pairs` are contained within the **top-level JSON object** that is returned by the call?
  - How many parks are in the database, based on this API call?  
  - How many JSON `park objects` are returned by the API call that you just executed?
  - How can you modify the `query string` to have `100 park objects` returned in the JSON?  Execute that API call to verify.
- Execute the `parks` API call for a specific national park of your choice.
  - Write down the API call
  - Copy and paste the values that have this information about the park:
    - Full name of the park
    - Description of the park
    - The state(s) that the park is in.
    - Current weather information about the park.
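
The query-string questions above can be explored with a short sketch like the following. The helper `nps_url` is hypothetical, and the sample response is a hand-written, heavily trimmed stand-in (a real response contains more keys and many more park objects); the `data` key name is an assumption about the response shape that should be checked against an actual call.

```python
import json
from urllib.parse import quote, urlencode

NPS_BASE = "https://developer.nps.gov/api/v1"


def nps_url(endpoint, api_key, **params):
    """Build an NPS API URL; keyword arguments become the query string."""
    query = {**params, "api_key": api_key}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{NPS_BASE}/{endpoint}?{urlencode(query, quote_via=quote)}"


# Hand-written stand-in for a parks response, not real API output.
sample_text = (
    '{"total": "471", "limit": "50", '
    '"data": [{"fullName": "Acadia National Park", "states": "ME"}]}'
)
response = json.loads(sample_text)

top_level_pairs = len(response)         # name/value pairs in the top-level object
total_parks = int(response["total"])    # parks in the database
returned = len(response["data"])        # park objects in this response

print(nps_url("parks", "YOUR_API_KEY", limit=100))
print(nps_url("parks", "YOUR_API_KEY", limit=50, q="Acadia National Park"))
print(top_level_pairs, total_parks, returned)
```

Keeping the key in a variable rather than pasting it into each URL also makes it easier to avoid publishing it by accident.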

In [None]:
# Base URL
https://developer.nps.gov/api/v1

# activities endpoint
https://developer.nps.gov/api/v1/activities?api_key=[YOUR_API_KEY]

# parks endpoint
https://developer.nps.gov/api/v1/parks?api_key=[YOUR_API_KEY]

# Top-level object's keys (4 name/value pairs)
"total", "limit", "start", "data"

# Total number of parks
"471"

# Total park objects returned by this call
"limit": "50"

# How can I get 100 park objects returned?
Set the limit query parameter to 100
https://developer.nps.gov/api/v1/parks?limit=100&api_key=[YOUR_API_KEY]

# parks call for a specific park
https://developer.nps.gov/api/v1/parks?limit=50&q=Acadia%20National%20Park&api_key=[YOUR_API_KEY]
"fullName": "Acadia National Park",
"description": "Acadia National Park protects the natural beauty of the highest rocky headlands along the Atlantic coastline of the United States, an abundance of habitats, and a rich cultural heritage. At 4 million visits a year, it's one of the top 10 most-visited national parks in the United States. Visitors enjoy 27 miles of historic motor roads, 158 miles of hiking trails, and 45 miles of carriage roads."
"states": "ME"
"weatherInfo": "Located on Mount Desert Island in Maine, Acadia experiences all four seasons. Summer temperatures range from 45-90F (7-30C). Fall temperatures range from 30-70F (-1-21C). Typically the first frost is in mid-October and first snowfall begins in November and can continue through April with an average accumulation of 73 inches (185 cm). Winter temperatures range from 14-35F (-10 - 2C). Spring temperatures range from 30-70F (-1-21C).",


## Part III - Use a NY Times API

You will investigate the [**NY Times Books API**](https://developer.nytimes.com/docs/books-product/1/overview).

- Register for a [NY Times API key](https://developer.nytimes.com/get-started) if you don't already have one. See the `Data Collection - Accessing Internet Datasets` Workshop for detailed instructions.
- Peruse the documentation above and look at the API calls.
- On the Books API documentation page, look for `Example Calls` to see how to invoke endpoints from this API. In particular, take note of the `base URL` that will be included in all the API calls.
  - What is the base URL that will be used for all Books API endpoints? Copy that to the code cell below.
- Many endpoints display information on NY Times Best Seller lists. The `lists/names.json` endpoint displays all Best Seller categories (i.e., names). What URL displays the Best Seller list names? How many lists are there?
- Which endpoint displays the Top 5 books for all the Best Seller lists?
  - Use the `published_date` query parameter to display the top 5 Best Sellers for each category beginning on the first day of this month. What is the URL?
  - How many results are returned?
- Which endpoint can you use to determine the Best Sellers for the "Trade Fiction Paperback" category for the current month?
  - What is the URL that displays this?
  - How many results are returned?
- Use the `reviews.json` endpoint to show all book reviews for the author *Jon Meacham*.
  - Copy/paste ONLY THE URL, not the JSON object that contains the answer, into the code cell below.
  - How many results are returned?
  - Try a different author. Paste the URL AND the number of results returned for that author below.
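
For the date-based questions, the first day of the current month can be computed rather than typed by hand. The following is a sketch under stated assumptions: the helpers `first_of_month` and `books_url` are hypothetical, and the base URL is taken from the `/svc/books/v3` prefix shared by the endpoint URLs recorded in the answer cell below.

```python
from datetime import date
from urllib.parse import quote, urlencode

BOOKS_BASE = "https://api.nytimes.com/svc/books/v3"


def first_of_month(today=None):
    """Return the first day of the month as YYYY-MM-DD."""
    today = today or date.today()
    return today.replace(day=1).isoformat()


def books_url(endpoint, api_key, params=None):
    """Build a Books API URL from an endpoint and a dict of query parameters."""
    query = dict(params or {})      # copy so the caller's dict is not modified
    query["api-key"] = api_key
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{BOOKS_BASE}/{endpoint}?{urlencode(query, quote_via=quote)}"


overview = books_url(
    "lists/full-overview.json",
    "YOUR_API_KEY",
    {"published_date": first_of_month(date(2023, 8, 17))},
)
reviews = books_url("reviews.json", "YOUR_API_KEY", {"author": "Jon Meacham"})

print(overview)
print(reviews)
```

A dict is used for the parameters because names such as `api-key` contain a hyphen and so cannot be passed as Python keyword arguments.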



In [None]:
# Copy the base URL here
https://api.nytimes.com/svc/books/v3

# The list of Best Seller categories
https://api.nytimes.com/svc/books/v3/lists/names.json?api-key=[YOUR_API_KEY]
"num_results": 59,

# List the top 5 books for all the Best Sellers lists for this month.
https://api.nytimes.com/svc/books/v3/lists/full-overview.json?published_date=2023-08-01&api-key=[YOUR_API_KEY]
"num_results": 230,

# List the Best Sellers for the Trade Fiction Paperback category for the first day of this month
https://api.nytimes.com/svc/books/v3/lists.json?list=trade%20fiction%20paperback&published-date=2023-08-01&api-key=[YOUR_API_KEY]
"num_results": 15,

# The URL to show book reviews for Jon Meacham
https://api.nytimes.com/svc/books/v3/reviews.json?author=Jon%20Meacham&api-key=[YOUR_API_KEY]
"num_results": 7,

# The URL to show book reviews for an author of your choice
https://api.nytimes.com/svc/books/v3/reviews.json?author=Toni%20Morrison&api-key=[YOUR_API_KEY]
"num_results": 15,

## Part IV (OPTIONAL) - Think About How to Solve a Problem using a Stock API

You will investigate the [**Alpha Vantage Stock API**](https://www.alphavantage.co/documentation/).

- Register for an [Alpha Vantage API key](https://www.alphavantage.co/support/#api-key).
- Peruse the documentation above but be sure to look at the API calls in the `Data Collection - Accessing Datasets (Lecture)` examples for Alpha Vantage.
- What is the `base URL` that will be included in all the API calls?
- Use the appropriate API endpoint to determine the **stock symbols** for both Tesla and Nvidia corporations.
  - Copy/paste in a code cell below the JSON object that contains the answer.
- Use the `Quote Endpoint`  to determine the most current **stock value** for both Tesla and Nvidia corporations.
  - Write down the URL of the API call that gives you this information.
  - What is the JSON `name/value` pair that has the current stock price on the most recent day of stock trading? What is the last trading day for this stock price?
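
One way to approach these two lookups in Python is sketched below. The helper `av_url` is hypothetical. The function names `SYMBOL_SEARCH` and `GLOBAL_QUOTE` and the numbered key names reflect the Alpha Vantage documentation as I understand it and should be verified against a real response; the price in the sample is invented.

```python
from urllib.parse import urlencode

AV_BASE = "https://www.alphavantage.co/query"


def av_url(function, api_key, **params):
    """Build an Alpha Vantage URL for the given API function."""
    query = {"function": function, **params, "apikey": api_key}
    return f"{AV_BASE}?{urlencode(query)}"


search_url = av_url("SYMBOL_SEARCH", "YOUR_API_KEY", keywords="tesla")
quote_url = av_url("GLOBAL_QUOTE", "YOUR_API_KEY", symbol="TSLA")

# Hand-written stand-in for a quote response; the values are made up.
sample_quote = {
    "Global Quote": {
        "01. symbol": "TSLA",
        "05. price": "100.00",
        "07. latest trading day": "2022-11-04",
    }
}

quote = sample_quote["Global Quote"]
price = float(quote["05. price"])            # values arrive as strings
last_day = quote["07. latest trading day"]

print(search_url)
print(quote_url)
print(price, last_day)
```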

### Solve a Problem Using the Alpha Vantage API
I want to know how much a specific stock price went up (or down) from Friday, December 31, 2021 through Friday, November 4, 2022. You may use any company that you'd like for this. Some example symbols: `Ford=F`, `Apple=AAPL`, `Nvidia=NVDA`.

- How can the `TIME_SERIES_WEEKLY` Alpha Vantage API endpoint be used to solve this problem?
  - Show the specific URL that you would use for this problem.
  - What specific JSON data could you use to solve the problem?  How?
    - Write down the specific `name/value` pairs AND the `names` of the JSON objects where you obtained the data to solve the problem.

- Is it possible to download an appropriate CSV file that also has this data?  What is the specific URL that you would use?
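
A possible way to work through this problem is sketched below, assuming the weekly response is keyed by week-ending date under `"Weekly Time Series"` with a `"4. close"` value per week (verify this against a real response). The helpers `weekly_url` and `percent_change` are hypothetical, and the closing prices in the sample are invented, so the result is illustrative only.

```python
from urllib.parse import urlencode

AV_BASE = "https://www.alphavantage.co/query"


def weekly_url(symbol, api_key, as_csv=False):
    """Build the weekly time series URL, optionally requesting CSV output."""
    query = {"function": "TIME_SERIES_WEEKLY", "symbol": symbol}
    if as_csv:
        query["datatype"] = "csv"
    query["apikey"] = api_key
    return f"{AV_BASE}?{urlencode(query)}"


def percent_change(weekly, start_date, end_date):
    """Percent change in closing price between two week-ending dates."""
    start = float(weekly[start_date]["4. close"])
    end = float(weekly[end_date]["4. close"])
    return (end - start) / start * 100


# Hand-written stand-in for a response; the closing prices are made up.
sample = {
    "Weekly Time Series": {
        "2022-11-04": {"4. close": "12.00"},
        "2021-12-31": {"4. close": "16.00"},
    }
}

change = percent_change(sample["Weekly Time Series"], "2021-12-31", "2022-11-04")

print(weekly_url("F", "YOUR_API_KEY"))
print(weekly_url("F", "YOUR_API_KEY", as_csv=True))
print(change)
```

With real data, a negative result means the stock closed lower on the end date than on the start date.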

In [None]:
# Base URL
https://www.alphavantage.co/query

# Stock symbol query

# Stock quote

# Solve a problem

# Download the CSV
