## Goals:

Using the Udacity catalogue, I want to be able to view how nanodegrees are rated.


### Tasks
- Retrieve the catalogue data from the Udacity catalogue API
- Store the catalogue data locally so we can analyse it later
- Retrieve ratings data from the ratings API
- Store the ratings data locally so we can analyse it later
- Merge catalogue and ratings data into `processed.json`


### Import dependencies

In [None]:
import requests
import json

### Task

We will fetch the list of all courses from the catalogue API.
Then we will save this data to `./data/raw_data.json`.

In [None]:
source = "https://catalog-api.udacity.com/v1/catalog?locale=en-us"

print("fetching catalogue data")
r = requests.get(source)
r.raise_for_status()  # stop on an HTTP error instead of saving an error body

data = r.json()

# the ./data directory must already exist
with open("./data/raw_data.json", "w") as text_file:
    json.dump(data, text_file, indent=2)
    print("done writing data")
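
The write step above fails if `./data` does not exist yet. A minimal sketch of a helper that creates the folder first is shown below. `save_json` is a hypothetical name, not part of the notebook, and the example writes to a temporary directory so it does not touch the project files.

```python
import json
import tempfile
from pathlib import Path


def save_json(data, path, indent=2):
    """Write `data` as JSON to `path`, creating parent directories if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as f:
        json.dump(data, f, indent=indent)
    return target


# Usage: write into a throwaway directory and read the file back
with tempfile.TemporaryDirectory() as tmp:
    saved = save_json({"degrees": []}, Path(tmp) / "data" / "raw_data.json")
    round_trip = json.loads(saved.read_text(encoding="utf-8"))
```

In the notebook this could be called as `save_json(data, "./data/raw_data.json")` after the request succeeds.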

### Task

Load `raw_data.json` and clean it: some values are empty, and we want to drop degrees that have no "level" or "title".

In [None]:
with open("./data/raw_data.json") as data_file:
    json_data = json.load(data_file)

# keep only the fields we need; skip degrees with a missing or empty "level" or "title"
degrees = [{
    "affiliates": degree.get("affiliates", ""),
    "key": degree.get("key", ""),
    "title": degree.get("title", ""),
    "level": degree.get("level", ""),
    "num_of_projects": len(degree.get("projects") or []),
    "tags": degree.get("tags"),
    } for degree in json_data["degrees"] if degree.get("level") and degree.get("title")
]

with open("./data/degrees.json", "w") as out_file:
    json.dump(degrees, out_file)
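
To check the filtering rule without touching the real catalogue file, the same idea can be written as a small function and run on made-up records. This is a sketch only: `clean_degrees` is a hypothetical name, the sample keys and titles are invented, and only a subset of the fields is kept for brevity.

```python
def clean_degrees(raw_degrees):
    """Return trimmed records for degrees that have a non-empty level and title."""
    cleaned = []
    for degree in raw_degrees:
        # skip records where "level" or "title" is missing or empty
        if not degree.get("level") or not degree.get("title"):
            continue
        cleaned.append({
            "key": degree.get("key", ""),
            "title": degree["title"],
            "level": degree["level"],
            "num_of_projects": len(degree.get("projects") or []),
        })
    return cleaned


# made-up sample records, not real catalogue data
sample = [
    {"key": "nd000", "title": "Example Degree", "level": "beginner", "projects": [{}, {}]},
    {"key": "nd001", "title": "", "level": "advanced"},
    {"key": "nd002", "title": "No Level"},
]
result = clean_degrees(sample)
```

Only the first sample record survives, because the second has an empty title and the third has no level.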

### Task
We want to query the ratings API to see what it provides.

In [None]:
def fetch_reviews_for_key(key):
    """Fetch the ratings summary for one degree key from the ratings API."""
    source = f"https://ratings-api.udacity.com/api/v1/reviews?node={key}&limit=5000&page=1"
    r = requests.get(source)
    return r.json()

`fetch_reviews_for_key("nd0044")`

Response data:
```
{
  "average_rating": 4.536585365853658,
  "count": 164,
  "stats": [
    {
      "rating": 5,
      "count": 109,
      "percentage": 66.46341463414635,
      "_id": 5
    },
    {
      "rating": 4,
      "count": 43,
      "percentage": 26.21951219512195,
      "_id": 4
    },
    {
      "rating": 3,
      "count": 7,
      "percentage": 4.2682926829268295,
      "_id": 3
    },
    {
      "rating": 2,
      "count": 1,
      "percentage": 0.6097560975609756,
      "_id": 2
    },
    {
      "rating": 1,
      "count": 4,
      "percentage": 2.4390243902439024,
      "_id": 1
    }
  ]
}
```
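
In this sample, `average_rating` is the count-weighted mean of the `stats` entries. The sketch below recomputes it from the counts shown above. `weighted_average` is a hypothetical helper, and whether the API always computes the average this way is an assumption based on this one response.

```python
def weighted_average(stats):
    """Return the count-weighted mean rating, or None if there are no ratings."""
    total = sum(entry["count"] for entry in stats)
    if total == 0:
        return None
    return sum(entry["rating"] * entry["count"] for entry in stats) / total


# rating/count pairs copied from the sample response
stats = [
    {"rating": 5, "count": 109},
    {"rating": 4, "count": 43},
    {"rating": 3, "count": 7},
    {"rating": 2, "count": 1},
    {"rating": 1, "count": 4},
]
total_count = sum(entry["count"] for entry in stats)
recomputed = weighted_average(stats)
```

Here the counts sum to 164 and the weighted sum of ratings is 744, so the recomputed value is 744 / 164, which agrees with the reported `average_rating`.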

### Task

For each degree in `degrees.json`, we fetch its ratings from the ratings API and write the merged record to `processed.json`.

In [None]:
with open("./data/degrees.json", "r") as data_file:
    degrees = json.load(data_file)

print("start fetching reviews")

output_data = []
for degree in degrees:
    key = degree.get("key")
    data = fetch_reviews_for_key(key)

    print("key: " + key)

    # field names follow the sample response above
    ratings = {
        "average_rating": data["average_rating"],
        "count": data["count"],
        "stats": data["stats"],
    }

    # merge the catalogue fields with the ratings fields
    output_data.append({**degree, **ratings})

print("finished fetching reviews")
with open("./data/processed.json", "w") as output:
    json.dump(output_data, output, indent=2)
    print("job has been finished")
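
The merge step can also be isolated into a function that tolerates a response with missing fields, for example a degree that has no reviews yet. This is a sketch under assumptions: `merge_degree_with_ratings` is a hypothetical name, and the defaults for missing fields (`None`, `0`, empty list) are a choice made here, not behaviour documented by the API.

```python
def merge_degree_with_ratings(degree, data):
    """Combine one catalogue record with the ratings summary for its key."""
    ratings = {
        "average_rating": data.get("average_rating"),  # None if absent
        "count": data.get("count", 0),
        "stats": data.get("stats", []),
    }
    # later keys win, so ratings fields overwrite same-named degree fields
    return {**degree, **ratings}


# made-up catalogue record paired with a trimmed version of the sample response
degree = {"key": "nd0044", "title": "Example Degree", "level": "intermediate"}
response = {"average_rating": 4.536585365853658, "count": 164, "stats": []}

merged = merge_degree_with_ratings(degree, response)
merged_empty = merge_degree_with_ratings(degree, {})
```

With this shape, the loop body reduces to `output_data.append(merge_degree_with_ratings(degree, data))`.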

### Todo
Normalise the `level` values to lowercase:
- `sorted_df.loc[sorted_df["level"].str.contains("Advanced"), "level"] = "advanced"`
- `sorted_df.loc[sorted_df["level"].str.contains("Intermediate"), "level"] = "intermediate"`
- `sorted_df.loc[sorted_df["level"].str.contains("Beginner"), "level"] = "beginner"`
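
Since `sorted_df` is not defined anywhere in this notebook, here is a plain-Python sketch of the same clean-up that works directly on the list of degree dictionaries. `normalise_level` and `LEVELS` are hypothetical names. Unlike the case-sensitive `str.contains` calls above, this version matches regardless of case.

```python
LEVELS = ("advanced", "intermediate", "beginner")


def normalise_level(level):
    """Map a free-form level string to one of the known lowercase levels."""
    lowered = (level or "").lower()
    for name in LEVELS:
        if name in lowered:
            return name
    # unknown values are returned lowercased rather than dropped
    return lowered


# made-up records for illustration
records = [{"level": "Advanced"}, {"level": "Beginner"}, {"level": "intermediate"}]
for record in records:
    record["level"] = normalise_level(record["level"])

levels = [record["level"] for record in records]
```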