In [1]:
# Import pandas and assign it to the alias `pd`
import pandas as pd

# NumPy (Numerical Python) is often imported alongside pandas;
# import it and assign it to the alias `np`
import numpy as np

# Load matplotlib for plotting later in the workshop
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
unemployment = pd.read_csv('../data/country_total.csv')

## 🥊 Challenge 1
**Setup**  
We previously imported unemployment data into `pandas` using the `read_csv` function and a relative file path. `read_csv` also allows us to import data using a URL as the file path. 

A .csv file with data on world countries and their abbreviations is located at the following URL:

[https://raw.githubusercontent.com/dlab-berkeley/Python-Data-Wrangling/main/data/countries.csv](https://raw.githubusercontent.com/dlab-berkeley/Python-Data-Wrangling/main/data/countries.csv)

We can load that data directly as follows:

In [3]:
countries_url = 'https://raw.githubusercontent.com/dlab-berkeley/Python-Data-Wrangling/main/data/countries.csv'
countries = pd.read_csv(countries_url)
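
`read_csv` takes a local path, a URL, or a file-like object as its first argument, which is why the same call works for both `unemployment` and `countries`. The sketch below shows the file-like case with an in-memory buffer, so it runs without a network connection. The two-row `csv_text` sample and the name `sample` are made up for illustration and are not part of the workshop data.

```python
import io

import pandas as pd

# Hypothetical two-row CSV standing in for a real file (not the workshop data)
csv_text = "country,latitude,longitude\nat,47.5,14.5\nbe,50.5,4.5\n"

# read_csv accepts a path, a URL, or any file-like object with a read() method
sample = pd.read_csv(io.StringIO(csv_text))
print(sample)
```

Whichever kind of source is used, the result is an ordinary DataFrame.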

**Challenge**  
Whenever we open a new DataFrame, it's important to get a basic understanding of its structure.

Using the methods and attributes we just discussed, **answer the following questions** about `countries`:

1. What columns does `countries` contain?
2. How many rows and columns does it contain?
3. What are the minimum and maximum values of the columns with numerical data?

<details><summary><a>Click for hint</a></summary>
Hint: consider using <code>.columns</code>, <code>.shape</code>, and <code>.describe()</code> here.
</details>
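
If the hint is not enough, here is a minimal sketch of the same three tools on a small made-up DataFrame. The name `toy` and its values are assumptions for illustration only. Try it, then apply the same calls to `countries`.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the countries file
toy = pd.DataFrame({
    "city": ["Oslo", "Rome", "Lima"],
    "latitude": [59.9, 41.9, -12.0],
    "longitude": [10.8, 12.5, -77.0],
})

print(toy.columns)  # column labels (an attribute, so no parentheses)
print(toy.shape)    # (number of rows, number of columns)

# describe() summarizes only the numeric columns by default
summary = toy.describe()
print(summary.loc[["min", "max"]])
```

Note that `.columns` and `.shape` are attributes, while `.describe()` is a method and needs parentheses.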

In [4]:
# YOUR CODE HERE

countries.columns

Index(['country', 'google_country_code', 'country_group', 'name_en', 'name_fr',
       'name_de', 'latitude', 'longitude'],
      dtype='object')

In [5]:
# YOUR CODE HERE

countries.shape

(30, 8)

In [6]:
# YOUR CODE HERE

countries.describe()

| | latitude | longitude |
|---|---|---|
| count | 30.0 | 30.0 |
| mean | 49.092609 | 14.324579 |
| std | 7.956624 | 11.25701 |
| min | 35.129141 | -8.239122 |
| 25% | 43.230916 | 6.979186 |
| 50% | 49.238087 | 14.941462 |
| 75% | 54.0904 | 23.35169 |
| max | 64.950159 | 35.439795 |


## 🥊 Challenge 2: Indexing with `.loc`

Let's get a little practice with the `.loc` operator.  

<span style="color:purple">Select rows 10 through 20, then compute their average `latitude`.</span>
<details>
    <summary><a>Click for Hint</a></summary>
    This can be done using <code>.loc</code> and <code>.mean()</code>, all in one line of code: <code>countries.loc[{your row selection}, {your column selection}].mean()</code>
</details>

In [7]:
# YOUR CODE HERE

countries.loc[10:20, 'latitude'].mean()

49.852639057272725
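
One detail worth noticing: `.loc` slices by label and includes both endpoints, so `10:20` selects 11 rows here, not 10. The sketch below contrasts this with position-based `.iloc` on a made-up DataFrame. The name `toy` and its `value` column are assumptions for illustration.

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..29
toy = pd.DataFrame({"value": range(30)})

by_label = toy.loc[10:20, "value"]      # labels 10 through 20, both included
by_position = toy.iloc[10:20]["value"]  # positions 10 through 19, end excluded

print(len(by_label), len(by_position))
```

With a default integer index, labels and positions coincide, so the two slices differ only in whether the end of the range is included.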

## 🥊 Challenge 3: Boolean Indexing

Let's push our boolean indexing skills a little further with a challenge problem.
1. Find the average longitude of the countries in our data and assign it to the variable `average_long`.
2. Find the countries that have above-average longitude.
<details>
    <summary><a>Click for Hint</a></summary>
    Compute the average longitude of the data: <code>countries['longitude'].mean()</code> and save that to a variable <code>average_long</code>. Then, you can use that variable to create a boolean mask for indexing: <code>countries['longitude'] > average_long</code>
</details>
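
The pattern in the hint works in two steps: a comparison on a column yields a boolean Series (the mask), and indexing the DataFrame with that mask keeps only the rows where it is `True`. Here is a minimal sketch on made-up data. The names `toy`, `avg`, `mask`, and `above` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the countries file
toy = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "score": [1.0, 2.0, 3.0, 6.0],
})

avg = toy["score"].mean()  # mean of the column

# Comparing a column to a scalar gives one True/False value per row
mask = toy["score"] > avg

# Indexing with the mask keeps only the rows where it is True
above = toy[mask]
print(above)
```

Because the comparison is strict (`>`), a row whose value equals the average is excluded.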

In [8]:
# YOUR CODE HERE

average_long = countries['longitude'].mean()

In [9]:
# YOUR CODE HERE

countries[countries['longitude'] > average_long]

| | country | google_country_code | country_group | name_en | name_fr | name_de | latitude | longitude |
|---|---|---|---|---|---|---|---|---|
| 2 | bg | BG | eu | Bulgaria | Bulgarie | Bulgarien | 42.725674 | 25.482322 |
| 3 | hr | HR | non-eu | Croatia | Croatie | Kroatien | 44.746643 | 15.340844 |
| 4 | cy | CY | eu | Cyprus | Chypre | Zypern | 35.129141 | 33.428682 |
| 5 | cz | CZ | eu | Czech Republic | République tchèque | Tschechische Republik | 49.803531 | 15.474998 |
| 7 | ee | EE | eu | Estonia | Estonie | Estland | 58.592469 | 25.80695 |
| 8 | fi | FI | eu | Finland | Finlande | Finnland | 64.950159 | 26.067564 |
| 11 | gr | GR | eu | Greece | Grèce | Griechenland | 39.698467 | 21.577256 |
| 12 | hu | HU | eu | Hungary | Hongrie | Ungarn | 47.161163 | 19.504265 |
| 15 | lv | LV | eu | Latvia | Lettonie | Lettland | 56.880117 | 24.606555 |
| 16 | lt | LT | eu | Lithuania | Lituanie | Litauen | 55.173687 | 23.943168 |
