## Instructions {-}

1. You may talk to a friend and discuss the questions and potential directions for solving them. However, you must write your own solutions and code separately, not as a group activity.

2. Write your code in the *Code* cells and your answer in the *Markdown* cells of the Jupyter notebook. Ensure that the solution is written neatly enough to understand and grade.

3. Use [Quarto](https://quarto.org/docs/output-formats/html-basics.html) to print the *.ipynb* file as HTML. You will need to open the command prompt, navigate to the directory containing the file, and use the command: `quarto render filename.ipynb --to html`. Submit the HTML file.

4. The assignment is worth 100 points, and is due on **14th October 2024 at 11:59 pm**. 

5. **Five points are for properly formatting the assignment**. The breakdown is as follows:
- Must be an HTML file rendered using Quarto (2 pts).
- There aren’t excessively long outputs of extraneous information (e.g. no printouts of entire data frames without good reason, there aren’t long printouts of which iteration a loop is on, there aren’t long sections of commented-out code, etc.) (1 pt)
- Final answers of each question are written in Markdown cells (1 pt).
- There is no piece of unnecessary / redundant code, and no unnecessary / redundant text (1 pt)

## List comprehension
USA's GDP per capita from 1960 to 2021 is given by the tuple `T` in the code cell below. The values are arranged in ascending order of the year, i.e., the first value is for 1960, the second value is for 1961, and so on.

### 
Use list comprehension to create a list of the gaps between consecutive entries in `T`, i.e., the increase in GDP per capita with respect to the previous year. The list with gaps should look like: `[60, 177, ...]`. Let the name of this list be `GDP_increase`.

*(4 points)*

In [7]:
GDP_increase = [T[i+1] - T[i] for i in range(len(T) - 1)]
print(GDP_increase)

[60, 177, 131, 199, 254, 318, 190, 360, 336, 202, 375, 485, 632, 500, 575, 791, 861, 1112, 1109, 901, 1401, 458, 1110, 1577, 1116, 834, 968, 1378, 1440, 1032, 453, 1077, 968, 1308, 996, 1277, 1491, 1395, 1661, 1815, 804, 864, 1492, 2235, 2398, 2179, 1748, 520, -1375, 1456, 1415, 1718, 1507, 1833, 1639, 1104, 2048, 2890, 2290, -2067, 6260]
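The same gaps can also be computed without index arithmetic by pairing each value with its successor using `zip()`. The sketch below runs on a short made-up tuple (`T_sample` is hypothetical and stands in for the real `T` defined in the notebook):

```python
# Hypothetical sample values; the real tuple T is defined in the notebook.
T_sample = (3007, 3067, 3244, 3375)

# Pair each entry with the next one and subtract.
gaps_by_zip = [later - earlier for earlier, later in zip(T_sample, T_sample[1:])]

# Index-based version for comparison.
gaps_by_index = [T_sample[i + 1] - T_sample[i] for i in range(len(T_sample) - 1)]

print(gaps_by_zip)                    # [60, 177, 131]
print(gaps_by_zip == gaps_by_index)   # True
```

Both forms produce one fewer element than the input, since each gap needs two consecutive values.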


### 
Use `GDP_increase` to find the maximum gap size, i.e., the maximum increase in GDP per capita.

*(1 point)*

In [8]:
max(GDP_increase)

6260

### 
Use list comprehension with `GDP_increase` to find the percentage of gaps that have size greater than $1000.

*(3 points)*

In [9]:
gaps = [gap for gap in GDP_increase if gap > 1000]
print(round(100*len(gaps)/len(GDP_increase), 2), "%")

52.46 %
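An alternative sketch of the percentage calculation: a comparison inside the comprehension yields `True` / `False` values, and `sum()` counts the `True` entries. The list `sample_gaps` is hypothetical and is not the real `GDP_increase`.

```python
# Hypothetical gaps, not the real GDP_increase list.
sample_gaps = [60, 1112, 901, 1401, -1375, 2235, 458, 1577]

# Each comparison is True or False; sum() treats True as 1 and False as 0.
n_large = sum([gap > 1000 for gap in sample_gaps])
pct_large = 100 * n_large / len(sample_gaps)

print(f"{pct_large:.2f}%")   # 50.00%
```

Formatting with an f-string also avoids the space that `print(value, "%")` places before the percent sign.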


### 

Use list comprehension with `GDP_increase` to print the list of years in which the GDP per capita increase was more than $2000.

**Hint:** The [`enumerate()`](https://docs.python.org/3/library/functions.html#enumerate) function may help.

*(4 points)*

In [10]:
years = [1961 + i for i, gap in enumerate(GDP_increase) if gap > 2000]
print(years)

[2004, 2005, 2006, 2017, 2018, 2019, 2021]
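`enumerate()` accepts a `start` argument, so the counter can run over the years directly instead of adding 1961 to an index. A minimal sketch on made-up numbers (`sample_gaps` is hypothetical; its first entry is assumed to be the increase recorded for 1961):

```python
# Hypothetical gaps; the first one is assumed to belong to the year 1961.
sample_gaps = [60, 2177, 131, 2199]

# start=1961 makes enumerate() yield (1961, gap), (1962, gap), ...
years_over_2000 = [
    year for year, gap in enumerate(sample_gaps, start=1961) if gap > 2000
]

print(years_over_2000)   # [1962, 1964]
```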


### 
Use list comprehension to:

1. Create a list that consists of the difference between the maximum and minimum GDP per capita values for each of the 5-year periods starting from 1976, i.e., for the periods 1976-1980, 1981-1985, 1986-1990, ..., 2016-2020.

2. Find the five-year period in which the difference *(between the maximum and minimum GDP per capita values)* was the least.

*(4 + 2 points)*

In [16]:
five_year_diff = [max(T[i:i+5]) - min(T[i:i+5]) for i in range(16, len(T) - 1, 5)]
print(five_year_diff)
min_difference_idx = five_year_diff.index(min(five_year_diff))
min_difference_period = (1976 + min_difference_idx * 5, 1976 + (min_difference_idx + 1) * 5 - 1)
print(min_difference_period)

[3983, 4261, 4818, 4349, 6362, 6989, 2349, 6697, 7228]
(2006, 2010)
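One way to keep each difference attached to its period is to build `(first_year, last_year, spread)` tuples and let `min()` pick the smallest spread through its `key` argument. This is only a sketch on invented values: `sample` and `start_year` are hypothetical, and the sample tuple is assumed to start at 1976 rather than 1960.

```python
# Hypothetical values for the ten years 1976-1985.
start_year = 1976
sample = (10, 12, 15, 14, 20, 21, 22, 22, 23, 24)

# One (first_year, last_year, spread) tuple per complete 5-year window.
windows = [
    (start_year + i, start_year + i + 4, max(sample[i:i + 5]) - min(sample[i:i + 5]))
    for i in range(0, len(sample) - 4, 5)
]

# min() with a key compares the windows by their spread (third element).
narrowest = min(windows, key=lambda w: w[2])

print(windows)         # [(1976, 1980, 10), (1981, 1985, 3)]
print(narrowest[:2])   # (1981, 1985)
```

Stopping the range at `len(sample) - 4` ensures that only windows with a full five values are included.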


## Nested list-comprehension 
Below is the list consisting of the majors / minors of students of the course STAT303-1 Fall 2023. This data is a list of lists, where each sub-list *(smaller list within the outer larger list)* consists of the majors / minors of a student. Most of the students have majors / minors in one or more of these four areas:

1. Math / Statistics / Computer Science

2. Humanities / Communications

3. Social Sciences / Education

4. Physical Sciences / Natural Sciences / Engineering

There are some students having majors / minors in other areas as well.

Use list comprehension for all the questions below.

In [17]:
majors_minors =  [['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Humanities / Communications',  'Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Humanities / Communications',  'Social Sciences / Education',  'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Social Sciences / Education'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social 
Sciences / Education'], ['Physical Sciences / Natural Sciences / Engineering'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Cognitive Science'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Music'], ['Social Sciences / Education'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Data Science'], ['Social Sciences / Education'], ['Math / Statistics / Computer Science', 'jazz'], ['Humanities / Communications', 'Social Sciences / Education'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Humanities / Communications',  'Social Sciences / Education',  'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education'], ['Math / Statistics / Computer Science'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / 
Engineering',  'Math / Statistics / Computer Science'], ['Social Sciences / Education'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Econ'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education', ''], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Humanities / Communications',  'Social Sciences / Education',  'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Humanities / Communications', 'Social 
Sciences / Education'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education'], ['Social Sciences / Education'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Humanities / Communications', 'Social Sciences / Education'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education'], ['Humanities / Communications'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Social Sciences / Education'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education',  'Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Humanities / Communications', 'Social Sciences / Education'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Humanities / Communications',  'Social Sciences / Education',  'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Social Sciences / Education', 'Math / Statistics / 
Computer Science'], ['Humanities / Communications'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Social Sciences / Education',  'Physical Sciences / Natural Sciences / Engineering'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science', 'Music'], ['Physical Sciences / Natural Sciences / Engineering'], ['Humanities / Communications'], ['Physical Sciences / Natural Sciences / Engineering'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education', 'Math / Statistics / Computer Science'], ['Humanities / Communications', 'Math / Statistics / Computer Science'], ['Physical Sciences / Natural Sciences / Engineering'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science'], ['Social Sciences / Education'], ['Physical Sciences / Natural Sciences / Engineering'], ['Physical Sciences / Natural Sciences / Engineering',  'Math / Statistics / Computer Science'], ['Math / Statistics / Computer Science']]

### 
How many students have a major / minor in any three of the above-mentioned four areas?

*(1 point)*

In [18]:
three_areas = [student for student in majors_minors if len(student) >= 3]
print(len(three_areas))

6
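Counting sub-lists of length three or more treats every entry as one of the four areas. A stricter check, sketched below under the assumption that the four areas are spelled exactly as they appear in the data, intersects each student's entries with a set of the four area names. `FOUR_AREAS` and `sample_students` are hypothetical names introduced only for this illustration.

```python
# The four named areas, spelled as they appear in the data.
FOUR_AREAS = {
    'Math / Statistics / Computer Science',
    'Humanities / Communications',
    'Social Sciences / Education',
    'Physical Sciences / Natural Sciences / Engineering',
}

# Hypothetical students; the second has three entries but only one named area.
sample_students = [
    ['Humanities / Communications', 'Social Sciences / Education',
     'Math / Statistics / Computer Science'],
    ['Math / Statistics / Computer Science', 'Music', 'jazz'],
    ['Social Sciences / Education'],
]

# Keep students whose entries cover exactly three of the four named areas.
three_of_four = [s for s in sample_students if len(FOUR_AREAS & set(s)) == 3]

print(len(three_of_four))   # 1
```

On this sample a plain `len(student) >= 3` test would count two students, because it also accepts the one who lists `'Music'` and `'jazz'`.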


### 
How many students have *Math / Statistics / Computer Science* as an area of their major / minor? 

**Hint:** Nested list comprehension

*(4 points)*

In [19]:
math_stat_cs = [student for student in majors_minors if 'Math / Statistics / Computer Science' in student]
print(len(math_stat_cs))

105
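The hint mentions a nested list comprehension. A sketch of that variant flattens the list of lists with two `for` clauses and keeps only the matching entries. It gives the number of students only if no student lists the same area twice, which is an assumption here; `sample_students` is hypothetical.

```python
target = 'Math / Statistics / Computer Science'

# Hypothetical students, not the real majors_minors list.
sample_students = [
    ['Humanities / Communications', 'Math / Statistics / Computer Science'],
    ['Math / Statistics / Computer Science'],
    ['Social Sciences / Education'],
]

# Outer loop over students, inner loop over each student's areas.
matches = [area for student in sample_students for area in student if area == target]

print(len(matches))   # 2
```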


### 
How many students have *Math / Statistics / Computer Science* as the **only area** of their major / minor? 

**Hint:** Nested list comprehension

*(5 points)*

In [20]:
only_math = [student for student in majors_minors if 'Math / Statistics / Computer Science' in student and len(student) == 1]
print(len(only_math))

44


### 
How many students have both *Math / Statistics / Computer Science* and *Social Sciences / Education* among the areas of their major / minor? 

**Hint:** The in-built function [`all()`](https://docs.python.org/3/library/functions.html#all) may be useful.

*(6 points)*

In [21]:
math_social = [student for student in majors_minors if 'Math / Statistics / Computer Science' in student and "Social Sciences / Education" in student]
print(len(math_social))

34
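Following the hint, the two membership tests can be folded into a single `all()` call over a list of required areas, which extends to any number of areas. The names `required` and `sample_students` below are hypothetical and used only for illustration.

```python
required = ['Math / Statistics / Computer Science', 'Social Sciences / Education']

# Hypothetical students, not the real majors_minors list.
sample_students = [
    ['Social Sciences / Education', 'Math / Statistics / Computer Science'],
    ['Math / Statistics / Computer Science'],
    ['Humanities / Communications', 'Social Sciences / Education',
     'Math / Statistics / Computer Science'],
]

# all() is True only when every required area appears in the student's list.
both = [s for s in sample_students if all(area in s for area in required)]

print(len(both))   # 2
```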


## Dictionary
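The code cell below defines a dictionary containing the nutrition information of drinks at Starbucks. Each key is a drink name, and each value is a list of dictionaries with the keys `'Nutrition_type'` and `'value'`.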

In [22]:
starbucks_drinks_nutrition={'Cool Lime Starbucks Refreshers™ Beverage': [{'Nutrition_type': 'Calories', 'value': 45}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 11}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Strawberry Acai Starbucks Refreshers™ Beverage': [{'Nutrition_type': 'Calories', 'value': 80}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 18}, {'Nutrition_type': 'Fiber', 'value': 1}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Very Berry Hibiscus Starbucks Refreshers™ Beverage': [{'Nutrition_type': 'Calories', 'value': 60}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 14}, {'Nutrition_type': 'Fiber', 'value': 1}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Evolution Fresh™ Organic Ginger Limeade': [{'Nutrition_type': 'Calories', 'value': 110}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 28}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 5}], 'Iced Coffee': [{'Nutrition_type': 'Calories', 'value': 5}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 5}], 'Iced Espresso Classics - Vanilla Latte': [{'Nutrition_type': 'Calories', 'value': 130}, {'Nutrition_type': 'Fat', 'value': 2.5}, {'Nutrition_type': 'Carb', 'value': 21}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 5}, {'Nutrition_type': 'Sodium', 'value': 65}], 'Iced Espresso Classics - Caffe Mocha': [{'Nutrition_type': 'Calories', 'value': 140}, {'Nutrition_type': 'Fat', 'value': 2.5}, {'Nutrition_type': 'Carb', 'value': 23}, {'Nutrition_type': 
'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 5}, {'Nutrition_type': 'Sodium', 'value': 90}], 'Iced Espresso Classics - Caramel Macchiato': [{'Nutrition_type': 'Calories', 'value': 130}, {'Nutrition_type': 'Fat', 'value': 2.5}, {'Nutrition_type': 'Carb', 'value': 21}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 5}, {'Nutrition_type': 'Sodium', 'value': 65}], 'Shaken Sweet Tea': [{'Nutrition_type': 'Calories', 'value': 80}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 19}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Berry Blossom White': [{'Nutrition_type': 'Calories', 'value': 60}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 15}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Black Mango': [{'Nutrition_type': 'Calories', 'value': 150}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 38}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 15}], 'Tazo® Bottled Black with Lemon': [{'Nutrition_type': 'Calories', 'value': 140}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 35}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Brambleberry': [{'Nutrition_type': 'Calories', 'value': 140}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 35}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 15}], 'Tazo® Bottled Giant Peach': [{'Nutrition_type': 'Calories', 'value': 150}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 
37}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 15}], 'Tazo® Bottled Iced Passion': [{'Nutrition_type': 'Calories', 'value': 70}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 17}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Lemon Ginger': [{'Nutrition_type': 'Calories', 'value': 120}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 31}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Organic Black Lemonade': [{'Nutrition_type': 'Calories', 'value': 140}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 35}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Organic Iced Black Tea': [{'Nutrition_type': 'Calories', 'value': 60}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 15}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Organic Iced Green Tea': [{'Nutrition_type': 'Calories', 'value': 120}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 31}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Plum Pomegranate': [{'Nutrition_type': 'Calories', 'value': 140}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 35}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Tazo® Bottled Tazoberry': [{'Nutrition_type': 'Calories', 'value': 150}, {'Nutrition_type': 'Fat', 'value': 
0.0}, {'Nutrition_type': 'Carb', 'value': 38}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 15}], 'Tazo® Bottled White Cranberry': [{'Nutrition_type': 'Calories', 'value': 140}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 35}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Teavana® Shaken Iced Black Tea': [{'Nutrition_type': 'Calories', 'value': 30}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 8}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 5}], 'Teavana® Shaken Iced Black Tea Lemonade': [{'Nutrition_type': 'Calories', 'value': 70}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 17}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Teavana® Shaken Iced Green Tea': [{'Nutrition_type': 'Calories', 'value': 30}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 8}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 5}], 'Teavana® Shaken Iced Green Tea Lemonade': [{'Nutrition_type': 'Calories', 'value': 70}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 17}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 5}], 'Teavana® Shaken Iced Passion Tango™ Tea': [{'Nutrition_type': 'Calories', 'value': 30}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 8}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 5}], 'Teavana® Shaken Iced Passion Tango™ Tea Lemonade': 
[{'Nutrition_type': 'Calories', 'value': 90}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 24}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Teavana® Shaken Iced Peach Green Tea': [{'Nutrition_type': 'Calories', 'value': 60}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 15}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Starbucks Refreshers™ Raspberry Pomegranate': [{'Nutrition_type': 'Calories', 'value': 90}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 27}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Starbucks Refreshers™ Strawberry Lemonade': [{'Nutrition_type': 'Calories', 'value': 90}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 27}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Starbucks® Doubleshot Protein Dark Chocolate': [{'Nutrition_type': 'Calories', 'value': 210}, {'Nutrition_type': 'Fat', 'value': 2.5}, {'Nutrition_type': 'Carb', 'value': 33}, {'Nutrition_type': 'Fiber', 'value': 2}, {'Nutrition_type': 'Protein', 'value': 20}, {'Nutrition_type': 'Sodium', 'value': 115}], 'Starbucks® Doubleshot Protein Vanilla': [{'Nutrition_type': 'Calories', 'value': 200}, {'Nutrition_type': 'Fat', 'value': 2.5}, {'Nutrition_type': 'Carb', 'value': 34}, {'Nutrition_type': 'Fiber', 'value': 2}, {'Nutrition_type': 'Protein', 'value': 20}, {'Nutrition_type': 'Sodium', 'value': 120}], 'Starbucks® Iced Coffee Caramel': [{'Nutrition_type': 'Calories', 'value': 60}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 13}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 
'value': 1}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Starbucks® Iced Coffee Light Sweetened': [{'Nutrition_type': 'Calories', 'value': 50}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 11}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Starbucks® Iced Coffee Unsweetened': [{'Nutrition_type': 'Calories', 'value': 10}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 2}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Blonde Roast': [{'Nutrition_type': 'Calories', 'value': 5}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Clover® Brewed Coffee': [{'Nutrition_type': 'Calories', 'value': 10}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Decaf Pike Place® Roast': [{'Nutrition_type': 'Calories', 'value': 5}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Featured Dark Roast': [{'Nutrition_type': 'Calories', 'value': 5}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Nariño 70 Cold Brew': [{'Nutrition_type': 'Calories', 'value': 5}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, 
{'Nutrition_type': 'Sodium', 'value': 15}], 'Nariño 70 Cold Brew with Milk': [{'Nutrition_type': 'Calories', 'value': 0}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Nitro Cold Brew': [{'Nutrition_type': 'Calories', 'value': 5}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 0}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Nitro Cold Brew with Sweet Cream': [{'Nutrition_type': 'Calories', 'value': 70}, {'Nutrition_type': 'Fat', 'value': 5.0}, {'Nutrition_type': 'Carb', 'value': 5}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 20}], 'Pike Place® Roast': [{'Nutrition_type': 'Calories', 'value': 5}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 0}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 10}], 'Vanilla Sweet Cream Cold Brew': [{'Nutrition_type': 'Calories', 'value': 110}, {'Nutrition_type': 'Fat', 'value': 6.0}, {'Nutrition_type': 'Carb', 'value': 14}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 1}, {'Nutrition_type': 'Sodium', 'value': 25}], 'Hot Chocolate': [{'Nutrition_type': 'Calories', 'value': 320}, {'Nutrition_type': 'Fat', 'value': 9.0}, {'Nutrition_type': 'Carb', 'value': 47}, {'Nutrition_type': 'Fiber', 'value': 4}, {'Nutrition_type': 'Protein', 'value': 14}, {'Nutrition_type': 'Sodium', 'value': 160}], 'Starbucks® Signature Hot Chocolate': [{'Nutrition_type': 'Calories', 'value': 430}, {'Nutrition_type': 'Fat', 'value': 26.0}, {'Nutrition_type': 'Carb', 'value': 45}, {'Nutrition_type': 'Fiber', 'value': 5}, {'Nutrition_type': 'Protein', 'value': 12}, 
{'Nutrition_type': 'Sodium', 'value': 115}], 'Caffè Latte': [{'Nutrition_type': 'Calories', 'value': 190}, {'Nutrition_type': 'Fat', 'value': 7.0}, {'Nutrition_type': 'Carb', 'value': 19}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 13}, {'Nutrition_type': 'Sodium', 'value': 170}], 'Caffè Mocha': [{'Nutrition_type': 'Calories', 'value': 290}, {'Nutrition_type': 'Fat', 'value': 8.0}, {'Nutrition_type': 'Carb', 'value': 42}, {'Nutrition_type': 'Fiber', 'value': 4}, {'Nutrition_type': 'Protein', 'value': 13}, {'Nutrition_type': 'Sodium', 'value': 140}], 'Cappuccino': [{'Nutrition_type': 'Calories', 'value': 120}, {'Nutrition_type': 'Fat', 'value': 4.0}, {'Nutrition_type': 'Carb', 'value': 12}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 8}, {'Nutrition_type': 'Sodium', 'value': 100}], 'Caramel Macchiato': [{'Nutrition_type': 'Calories', 'value': 250}, {'Nutrition_type': 'Fat', 'value': 7.0}, {'Nutrition_type': 'Carb', 'value': 35}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 10}, {'Nutrition_type': 'Sodium', 'value': 150}], 'Cinnamon Dolce Latte': [{'Nutrition_type': 'Calories', 'value': 260}, {'Nutrition_type': 'Fat', 'value': 6.0}, {'Nutrition_type': 'Carb', 'value': 40}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 11}, {'Nutrition_type': 'Sodium', 'value': 150}], 'Coconutmilk Mocha Macchiato': [{'Nutrition_type': 'Calories', 'value': 250}, {'Nutrition_type': 'Fat', 'value': 9.0}, {'Nutrition_type': 'Carb', 'value': 32}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 12}, {'Nutrition_type': 'Sodium', 'value': 180}], 'Flat White': [{'Nutrition_type': 'Calories', 'value': 180}, {'Nutrition_type': 'Fat', 'value': 7.0}, {'Nutrition_type': 'Carb', 'value': 18}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 12}, {'Nutrition_type': 'Sodium', 'value': 160}],
'Iced Caffè Latte': [{'Nutrition_type': 'Calories', 'value': 130}, {'Nutrition_type': 'Fat', 'value': 4.5}, {'Nutrition_type': 'Carb', 'value': 13}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 8}, {'Nutrition_type': 'Sodium', 'value': 115}], 'Iced Caffè Mocha': [{'Nutrition_type': 'Calories', 'value': 230}, {'Nutrition_type': 'Fat', 'value': 6.0}, {'Nutrition_type': 'Carb', 'value': 36}, {'Nutrition_type': 'Fiber', 'value': 4}, {'Nutrition_type': 'Protein', 'value': 9}, {'Nutrition_type': 'Sodium', 'value': 90}], 'Iced Caramel Macchiato': [{'Nutrition_type': 'Calories', 'value': 250}, {'Nutrition_type': 'Fat', 'value': 7.0}, {'Nutrition_type': 'Carb', 'value': 37}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 10}, {'Nutrition_type': 'Sodium', 'value': 150}], 'Iced Cinnamon Dolce Latte': [{'Nutrition_type': 'Calories', 'value': 200}, {'Nutrition_type': 'Fat', 'value': 4.0}, {'Nutrition_type': 'Carb', 'value': 34}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 7}, {'Nutrition_type': 'Sodium', 'value': 95}], 'Iced Coconutmilk Mocha Macchiato': [{'Nutrition_type': 'Calories', 'value': 260}, {'Nutrition_type': 'Fat', 'value': 9.0}, {'Nutrition_type': 'Carb', 'value': 34}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 11}, {'Nutrition_type': 'Sodium', 'value': 180}], 'Iced Vanilla Latte': [{'Nutrition_type': 'Calories', 'value': 190}, {'Nutrition_type': 'Fat', 'value': 4.0}, {'Nutrition_type': 'Carb', 'value': 30}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 7}, {'Nutrition_type': 'Sodium', 'value': 100}], 'Iced White Chocolate Mocha': [{'Nutrition_type': 'Calories', 'value': 300}, {'Nutrition_type': 'Fat', 'value': 8.0}, {'Nutrition_type': 'Carb', 'value': 47}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 10}, {'Nutrition_type': 'Sodium', 'value': 190}],
'Latte Macchiato': [{'Nutrition_type': 'Calories', 'value': 190}, {'Nutrition_type': 'Fat', 'value': 7.0}, {'Nutrition_type': 'Carb', 'value': 19}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 12}, {'Nutrition_type': 'Sodium', 'value': 160}], 'Starbucks Doubleshot® on Ice Beverage': [{'Nutrition_type': 'Calories', 'value': 45}, {'Nutrition_type': 'Fat', 'value': 1.0}, {'Nutrition_type': 'Carb', 'value': 5}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 3}, {'Nutrition_type': 'Sodium', 'value': 40}], 'Vanilla Latte': [{'Nutrition_type': 'Calories', 'value': 250}, {'Nutrition_type': 'Fat', 'value': 6.0}, {'Nutrition_type': 'Carb', 'value': 37}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 12}, {'Nutrition_type': 'Sodium', 'value': 150}], 'White Chocolate Mocha': [{'Nutrition_type': 'Calories', 'value': 360}, {'Nutrition_type': 'Fat', 'value': 11.0}, {'Nutrition_type': 'Carb', 'value': 53}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 14}, {'Nutrition_type': 'Sodium', 'value': 240}], 'Cinnamon Dolce Frappuccino® Blended Coffee': [{'Nutrition_type': 'Calories', 'value': 350}, {'Nutrition_type': 'Fat', 'value': 4.5}, {'Nutrition_type': 'Carb', 'value': 64}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 15}, {'Nutrition_type': 'Sodium', 'value': 0}], 'Coffee Light Frappuccino® Blended Coffee': [{'Nutrition_type': 'Calories', 'value': 110}, {'Nutrition_type': 'Fat', 'value': 0.0}, {'Nutrition_type': 'Carb', 'value': 24}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 3}, {'Nutrition_type': 'Sodium', 'value': 200}], 'Mocha Frappuccino® Blended Coffee': [{'Nutrition_type': 'Calories', 'value': 280}, {'Nutrition_type': 'Fat', 'value': 2.5}, {'Nutrition_type': 'Carb', 'value': 60}, {'Nutrition_type': 'Fiber', 'value': 2}, {'Nutrition_type': 'Protein', 'value': 4},
{'Nutrition_type': 'Sodium', 'value': 220}], 'Mocha Light Frappuccino® Blended Coffee': [{'Nutrition_type': 'Calories', 'value': 140}, {'Nutrition_type': 'Fat', 'value': 0.5}, {'Nutrition_type': 'Carb', 'value': 28}, {'Nutrition_type': 'Fiber', 'value': 1}, {'Nutrition_type': 'Protein', 'value': 4}, {'Nutrition_type': 'Sodium', 'value': 180}], 'Cinnamon Dolce Crème': [{'Nutrition_type': 'Calories', 'value': 200}, {'Nutrition_type': 'Fat', 'value': 6.0}, {'Nutrition_type': 'Carb', 'value': 28}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 10}, {'Nutrition_type': 'Sodium', 'value': 135}], 'Vanilla Crème': [{'Nutrition_type': 'Calories', 'value': 200}, {'Nutrition_type': 'Fat', 'value': 6.0}, {'Nutrition_type': 'Carb', 'value': 28}, {'Nutrition_type': 'Fiber', 'value': 0}, {'Nutrition_type': 'Protein', 'value': 10}, {'Nutrition_type': 'Sodium', 'value': 135}], 'Chocolate Smoothie': [{'Nutrition_type': 'Calories', 'value': 320}, {'Nutrition_type': 'Fat', 'value': 5.0}, {'Nutrition_type': 'Carb', 'value': 53}, {'Nutrition_type': 'Fiber', 'value': 8}, {'Nutrition_type': 'Protein', 'value': 20}, {'Nutrition_type': 'Sodium', 'value': 170}], 'Strawberry Smoothie': [{'Nutrition_type': 'Calories', 'value': 300}, {'Nutrition_type': 'Fat', 'value': 2.0}, {'Nutrition_type': 'Carb', 'value': 60}, {'Nutrition_type': 'Fiber', 'value': 7}, {'Nutrition_type': 'Protein', 'value': 16}, {'Nutrition_type': 'Sodium', 'value': 130}]}

### 
What is the nested data structure of the object `starbucks_drinks_nutrition`? For example, a nested data structure could be a list of dictionaries whose values are tuples of dictionaries.

*(1 point)*

In [23]:
print(type(starbucks_drinks_nutrition))
print(type(starbucks_drinks_nutrition['Cool Lime Starbucks Refreshers™ Beverage']))
print(type(starbucks_drinks_nutrition['Cool Lime Starbucks Refreshers™ Beverage'][0]))


<class 'dict'>
<class 'list'>
<class 'dict'>


`starbucks_drinks_nutrition` is a dictionary whose keys are drink names and whose values are lists of dictionaries. Each inner dictionary has two keys: `Nutrition_type`, which maps to a string, and `value`, which maps to a number.

### 
Use dictionary-comprehension to print the name and `carb` content of all drinks that have a `carb` content of more than 50 units.

**Hint:** It will be a nested dictionary comprehension.

*(6 points)*

In [27]:
high_carb_drinks = {
    drink: value[2]["value"] for drink, value in starbucks_drinks_nutrition.items() if value[2]["value"] > 50
}
high_carb_drinks

{'White Chocolate Mocha': 53,
 'Cinnamon Dolce Frappuccino® Blended Coffee': 64,
 'Mocha Frappuccino® Blended Coffee': 60,
 'Chocolate Smoothie': 53,
 'Strawberry Smoothie': 60}
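
The comprehension above relies on `Carb` always being the third entry of each drink's list. A minimal sketch of an alternative that looks the nutrient up by its `Nutrition_type` is given below. `sample_drinks` and `high_carb_sample` are hypothetical names, and `sample_drinks` is only a three-drink excerpt of `starbucks_drinks_nutrition`, trimmed to three nutrients per drink so that the example is self-contained.

```python
# Hypothetical excerpt of starbucks_drinks_nutrition (values copied from the data above).
sample_drinks = {
    "Cappuccino": [
        {"Nutrition_type": "Calories", "value": 120},
        {"Nutrition_type": "Fat", "value": 4.0},
        {"Nutrition_type": "Carb", "value": 12},
    ],
    "White Chocolate Mocha": [
        {"Nutrition_type": "Calories", "value": 360},
        {"Nutrition_type": "Fat", "value": 11.0},
        {"Nutrition_type": "Carb", "value": 53},
    ],
    "Strawberry Smoothie": [
        {"Nutrition_type": "Calories", "value": 300},
        {"Nutrition_type": "Fat", "value": 2.0},
        {"Nutrition_type": "Carb", "value": 60},
    ],
}

# Outer loop: drinks; inner loop: the nutrient dictionaries of each drink.
# Only the 'Carb' entry is kept, and only when it exceeds 50 units.
high_carb_sample = {
    drink: nutrient["value"]
    for drink, nutrients in sample_drinks.items()
    for nutrient in nutrients
    if nutrient["Nutrition_type"] == "Carb" and nutrient["value"] > 50
}
print(high_carb_sample)
```

Because the filter matches on the nutrient name, the result does not change if the order of the nutrient entries differs between drinks.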

## Ted talks

### 
Read the [data](https://raw.githubusercontent.com/cwkenwaysun/TEDmap/master/data/TED_Talks.json) on TED talks from 2006 to 2017.

*(1 point)*

In [28]:

tedtalks_data = pd.read_json('https://raw.githubusercontent.com/cwkenwaysun/TEDmap/master/data/TED_Talks.json')
print(tedtalks_data.head())


URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1045)>
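
The `URLError` above is raised while the SSL certificate of the server is being verified, so no data reaches pandas. One possible workaround, assuming the JSON file has been saved by other means (for example from a browser), is to parse its text directly. The sketch below uses a hypothetical two-record string, `sample_json`, in place of the real file; the actual dataset has more columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for the text of a locally saved TED_Talks.json file.
sample_json = (
    '[{"headline": "First talk", "views": 100},'
    ' {"headline": "Second talk", "views": 250}]'
)

# pd.read_json accepts a file-like object, so the text is wrapped in StringIO.
sample_talks = pd.read_json(io.StringIO(sample_json))
print(sample_talks.shape)
```

With a file saved on disk, passing the file path to `pd.read_json` in place of the `StringIO` object should work the same way.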

### 
Find the number of talks in the dataset.

*(1 point)*

In [29]:
tedtalks_data.count()

NameError: name 'tedtalks_data' is not defined

### 
Find the `headline`, `speaker` and `year_filmed` of the talk with the highest number of views.

*(4 points)*

In [None]:
sorted_ted_talks = tedtalks_data.sort_values(by = "views", ascending = False)
print(sorted_ted_talks.iloc[0:1, [2, 1, 7]])
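
Selecting columns by position, as in `[2, 1, 7]`, depends on the column order of the file. A minimal sketch of a label-based alternative follows, assuming the columns are named `headline`, `speaker`, `year_filmed` and `views` as in the question. `toy_talks` is a made-up data frame, not the TED data.

```python
import pandas as pd

# Made-up data with the column names used in the question.
toy_talks = pd.DataFrame({
    "speaker": ["A. Speaker", "B. Speaker", "C. Speaker"],
    "headline": ["First talk", "Second talk", "Third talk"],
    "year_filmed": [2006, 2012, 2017],
    "views": [1500, 98000, 4200],
})

# idxmax returns the index label of the row with the most views;
# .loc then picks the requested columns by name.
top_row = toy_talks.loc[
    toy_talks["views"].idxmax(), ["headline", "speaker", "year_filmed"]
]
print(top_row)
```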

### 
Do the majority of talks have fewer views than the average number of views for a talk? Justify your answer.

*(3 points)*

**Hint:** Print summary statistics for questions (4) and (5).

In [None]:
tedtalks_data.describe()


### 
Do at least 25% of the talks have more views than the average number of views for a talk? Justify your answer.

*(3 points)*
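
Both questions about the average can be approached by comparing the `mean` row of `describe()` with the `50%` and `75%` rows: a median below the mean implies that at least half of the talks have fewer views than the mean, and a 75th percentile below the mean implies that at most 25% of the talks exceed it. The sketch below illustrates the comparison on a made-up, right-skewed series `toy_views`; it does not use the TED data.

```python
import pandas as pd

# Made-up view counts with one very popular talk (right-skewed).
toy_views = pd.Series([100, 200, 300, 400, 9000])

mean_views = toy_views.mean()
median_views = toy_views.median()
q75_views = toy_views.quantile(0.75)

# Exact shares, computed directly rather than read off the quartiles.
share_below_mean = (toy_views < mean_views).mean()
share_above_mean = (toy_views > mean_views).mean()

print(mean_views, median_views, q75_views)
print(share_below_mean, share_above_mean)
```

In this toy series the single large value pulls the mean above both the median and the 75th percentile, so most values lie below the mean and fewer than 25% lie above it.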

### 
The last column of the dataset consists of votes obtained by the talk under different categories, such as *Funny, Confusing, Fascinating, etc.* For each category, create a new column in the dataset that contains the votes obtained by the TED talk in that category. Print the first 5 rows of the updated dataset.

*(8 points)*
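
A minimal sketch of one way to spread the votes into columns is given below. It assumes, without having inspected the real file, that each cell of the last column is a list of dictionaries with the keys `name` and `count`; the column name `rates` and the data frame `toy_talks` are hypothetical.

```python
import pandas as pd

# Hypothetical layout of the votes column: a list of {"name", "count"} dictionaries.
toy_talks = pd.DataFrame({
    "headline": ["First talk", "Second talk"],
    "rates": [
        [{"name": "Funny", "count": 10}, {"name": "Confusing", "count": 2}],
        [{"name": "Funny", "count": 1}, {"name": "Confusing", "count": 7}],
    ],
})

# Turn each cell into a {category: votes} dictionary, then build one column per category.
votes = pd.DataFrame(
    [{item["name"]: item["count"] for item in cell} for cell in toy_talks["rates"]],
    index=toy_talks.index,
)

# Attach the new columns to the original data.
toy_talks_wide = pd.concat([toy_talks, votes], axis=1)
print(toy_talks_wide.head())
```

If the real cells are stored as strings rather than lists, they would need to be parsed first, for example with `json.loads`.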

### 
With the data created in (a), find the `headline` of the talk that received the highest number of votes as *Confusing*.

*(4 points)*

### 
With the data created in (a), find the `headline` and the `year` of the talk that received the highest percentage of votes in the *Fascinating* category. 

$$\text{Percentage of } \textit{Fascinating} \text{ votes for a TED talk} = \frac{\text{Number of votes in the } \textit{Fascinating} \text{ category}}{\text{Total votes in all categories}}$$

*(7 points)*
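
The ratio defined above can be computed row by row once every category has its own column. The sketch below uses a made-up data frame `toy_votes` with three categories; the column name `year_filmed` is assumed from the earlier question, and the real dataset has more categories.

```python
import pandas as pd

# Made-up votes; one column per category.
toy_votes = pd.DataFrame({
    "headline": ["First talk", "Second talk", "Third talk"],
    "year_filmed": [2006, 2012, 2017],
    "Funny": [10, 1, 5],
    "Confusing": [2, 7, 5],
    "Fascinating": [8, 2, 30],
})
category_columns = ["Funny", "Confusing", "Fascinating"]

# Fascinating votes divided by the row-wise total over all categories.
toy_votes["fascinating_share"] = (
    toy_votes["Fascinating"] / toy_votes[category_columns].sum(axis=1)
)

# Row with the largest share, reduced to the two requested fields.
best = toy_votes.loc[
    toy_votes["fascinating_share"].idxmax(), ["headline", "year_filmed"]
]
print(best)
```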

## University rankings

### 
Download the data set “univ.txt” and read it with Python.

*(1 point)*

In [2]:
import pandas as pd

univ_txt = pd.read_csv("/Users/vaibhavrangan/Downloads/Stat_303-1/Data/univ.txt", sep="\t")
print(univ_txt)

                                         Name             Location   Rank  \
0                        Princeton University        Princeton, NJ    1.0   
1                          Harvard University        Cambridge, MA    2.0   
2                       University of Chicago          Chicago, IL    3.0   
3                             Yale University        New Haven, CT    3.0   
4                         Columbia University         New York, NY    5.0   
..                                        ...                  ...    ...   
226    University of Massachusetts--Dartmouth  North Dartmouth, MA  220.0   
227         University of Missouri--St. Louis        St. Louis, MO  220.0   
228  University of North Carolina--Greensboro       Greensboro, NC  220.0   
229        University of Southern Mississippi      Hattiesburg, MS  220.0   
230                     Utah State University            Logan, UT  220.0   

     Tuition and fees  Undergrad Enrollment  
0             45320.0        

### 
Find summary statistics of the data. Based on the statistics, answer the next four questions.

*(1 point)*

In [3]:
univ_txt.describe()

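|       | Rank       | Tuition and fees | Undergrad Enrollment |
|-------|------------|------------------|----------------------|
| count | 231.0      | 231.0            | 231.0                |
| mean  | 113.982684 | 33769.246753     | 14946.619048         |
| std   | 65.995518  | 10756.733516     | 10569.664095         |
| min   | 1.0        | 5300.0           | 1001.0               |
| 25%   | 56.0       | 25693.0          | 6238.5               |
| 50%   | 111.0      | 31608.0          | 12949.0              |
| 75%   | 171.0      | 42721.0          | 22145.5              |
| max   | 220.0      | 55056.0          | 54513.0              |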


### 
How many universities are there in the data set?

*(1 point)*

In [4]:
univ_txt.count()

Name                    231
Location                231
Rank                    231
Tuition and fees        231
Undergrad Enrollment    231
dtype: int64

### 
Estimate the maximum `Tuition and fees` among universities that are in the bottom 25% when ranked by total tuition and fees.

*(2 points)*

In [5]:
import numpy as np

bottom_25 = np.percentile(univ_txt["Tuition and fees"], 25)
affordable_unis = univ_txt.loc[univ_txt["Tuition and fees"] <= bottom_25]

print(affordable_unis["Tuition and fees"].max())

25673.0


### 
How many universities share the ranking of 220? (If `s` universities share the same rank, say `r`, then the next lower rank is `r+s`, and all the ranks in between `r` and `r+s` are dropped)

*(4 points)*

In [27]:
print(len(univ_txt[univ_txt["Rank"] == 220]))

12


### 
Can you find the mean `Tuition and fees` for an undergrad student in the US from the summary statistics? Justify your answer.

*(3 points)*

No, we can't. The summary statistics give us the unweighted mean of the tuition of all universities, without accounting for the number of students in each school who actually pay the tuition. We would need the number of students at each university to find an average amount of tuition paid by an undergrad student in the US.
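
The distinction can be written out as follows, using notation introduced here for illustration only: $t_i$ is the tuition at university $i$, $n_i$ is its undergraduate enrollment, and $N$ is the number of universities.

$$\bar{t}_{\text{university}} = \frac{1}{N}\sum_{i=1}^{N} t_i, \qquad \bar{t}_{\text{student}} = \frac{\sum_{i=1}^{N} n_i \, t_i}{\sum_{i=1}^{N} n_i}$$

The summary statistics report only the first quantity. The two agree only in special cases, such as when every university has the same enrollment.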

### 
Find the average `Tuition and fees` for an undergrad student in the US.

*(5 points)*

In [30]:
univ_txt["total_tuition"] = univ_txt["Tuition and fees"] * univ_txt["Undergrad Enrollment"]
print(univ_txt["total_tuition"].sum() / univ_txt["Undergrad Enrollment"].sum())

30845.32903298868
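
The same enrollment-weighted mean can also be computed without adding a column to the data frame, for example with `np.average` and its `weights` argument. The sketch below uses a made-up two-university data frame, `toy_univ`, to show how far the weighted and unweighted means can differ.

```python
import numpy as np
import pandas as pd

# Made-up data: a large, cheap university and a small, expensive one.
toy_univ = pd.DataFrame({
    "Tuition and fees": [10000.0, 50000.0],
    "Undergrad Enrollment": [9000, 1000],
})

# Unweighted mean: every university counts equally.
unweighted_mean = toy_univ["Tuition and fees"].mean()

# Weighted mean: every student counts equally.
weighted_mean = np.average(
    toy_univ["Tuition and fees"], weights=toy_univ["Undergrad Enrollment"]
)
print(unweighted_mean, weighted_mean)
```

Here most students attend the cheaper university, so the per-student average is well below the per-university average.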


## File formats
Consider the file formats *csv*, *JSON*, and *txt*. Mention one advantage and one disadvantage of each format over the other two formats.

*(2+2+2 = 6 points)*
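
One difference between the formats that can be checked directly is how they handle nested data. The sketch below, which uses only the standard library and a made-up `record` based on the Starbucks data earlier in the assignment, writes the same record to JSON and to csv and reads it back.

```python
import csv
import io
import json

# One drink with a nested list of nutrient dictionaries.
record = {
    "drink": "Cappuccino",
    "nutrition": [{"Nutrition_type": "Carb", "value": 12}],
}

# JSON keeps the nested list of dictionaries intact on a round trip.
json_round_trip = json.loads(json.dumps(record))

# csv stores every field as text, so the nested list comes back as a string.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["drink", "nutrition"])
writer.writeheader()
writer.writerow(record)
buffer.seek(0)
csv_round_trip = next(csv.DictReader(buffer))

print(type(json_round_trip["nutrition"]))
print(type(csv_round_trip["nutrition"]))
```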