<a href="https://colab.research.google.com/github/krauseannelize/nb-py-ms-exercises/blob/sprint03/notebooks/s03_pandas_foundation/36_exercises_pandas_foundations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 36 | Exercises - Pandas Foundations (Building on the Basics)

💡 **Tip:** In future versions of `Pandas`, integer keys like `series[1]` will always be treated as labels, not positions.  To avoid ambiguity, use:

- `.iloc[pos]` → position‑based access  
- `.loc[label]` → label‑based access  
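
The difference shows up most clearly when a Series has integer labels that do not match the positions. A minimal sketch, using a made-up Series `s` that is not part of the exercises:

```python
import pandas as pd

# A Series whose integer labels are deliberately out of positional order
s = pd.Series([10, 20, 30], index=[2, 0, 1])

first_by_position = s.iloc[0]  # first element, whatever its label is
value_by_label = s.loc[0]      # element whose label is 0

print(first_by_position)  # 10
print(value_by_label)     # 20
```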

## DataFrame Indexing and slicing with `.loc[]`

`Pandas` `.loc[]` is label-based: select rows and columns by their labels or by boolean conditions. It’s concise, readable, and inclusive on slice endpoints.

```python
# basic syntax
df.loc[row_labels, column_labels]
```

- `row_labels`: Labels, label slices, lists, boolean masks
- `column_labels`: Single label, list of labels, label slices

In [None]:
import pandas as pd

data = {
    'Band': ['Black Sabbath', 'Iron Maiden', 'Metallica', 'Slayer',
             'Megadeth'],
    'Country': ['UK', 'UK', 'USA', 'USA', 'USA'],
    'Year Formed': [1968, 1975, 1981, 1981, 1983],
    'Genre': ['Heavy Metal', 'Heavy Metal', 'Thrash Metal',
              'Thrash Metal', 'Thrash Metal'],
    }
df = pd.DataFrame(data, index=['BS', 'IM', 'MT', 'SL', 'MG'])

# Access rows using labels
print(df.loc['IM'])

Band           Iron Maiden
Country                 UK
Year Formed           1975
Genre          Heavy Metal
Name: IM, dtype: object


In [None]:
# Access the band and genre for "MT" and "SL"
print(df.loc[["MT", "SL"], ["Band", "Genre"]])

         Band         Genre
MT  Metallica  Thrash Metal
SL     Slayer  Thrash Metal


In [None]:
# Access rows from 'IM' up to and including 'MG'
print(df.loc['IM':'MG'])

           Band Country  Year Formed         Genre
IM  Iron Maiden      UK         1975   Heavy Metal
MT    Metallica     USA         1981  Thrash Metal
SL       Slayer     USA         1981  Thrash Metal
MG     Megadeth     USA         1983  Thrash Metal


### Common `.loc[]` selections

| Type | Example | Description |
| --- | --- | --- |
| **One cell** | `.loc['MT', 'Genre']` | Extracts genre of band in row with index 'MT' |
| **One row** | `.loc['SL']` | Extracts all details of band in row with index 'SL' |
| **All rows starting from a row** | `.loc['MT':]` | Extracts all rows starting from band with index 'MT' |
| **All rows up to a row** | `.loc[:'SL']` | Extracts all rows from the first band up to band with index 'SL' |
| **Multiple consecutive rows** | `.loc['IM':'MG']` | Extracts rows from 'IM' to 'MG', inclusive. |
| **One column** | `.loc[:, 'Genre']` | Extracts all values from 'Genre' column |
| **Multiple columns** | `.loc[:, ['Band', 'Genre']]` | Extracts all values from 'Band' and 'Genre' columns |
| **Multiple consecutive columns** | `.loc[:, 'Country':'Genre']` | Extracts all values from columns 'Country' to 'Genre' |

### Conditional Filtering with `.loc[]`

Use `.loc[]` to filter rows based on conditions. Combine multiple conditions with logical operators like `&` (AND), `|` (OR), and `~` (NOT). Wrap each condition in parentheses.

In [None]:
# Find bands formed after 1975 and based in the USA
filtered_bands = df.loc[(df['Year Formed'] > 1975) & (df['Country'] == 'USA')]
print(filtered_bands)

         Band Country  Year Formed         Genre
MT  Metallica     USA         1981  Thrash Metal
SL     Slayer     USA         1981  Thrash Metal
MG   Megadeth     USA         1983  Thrash Metal
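
The `|` and `~` operators follow the same pattern as `&`. The sketch below rebuilds the same five-band DataFrame so it can run on its own; the variable names `uk_or_early` and `not_thrash` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Band': ['Black Sabbath', 'Iron Maiden', 'Metallica', 'Slayer',
                 'Megadeth'],
        'Country': ['UK', 'UK', 'USA', 'USA', 'USA'],
        'Year Formed': [1968, 1975, 1981, 1981, 1983],
        'Genre': ['Heavy Metal', 'Heavy Metal', 'Thrash Metal',
                  'Thrash Metal', 'Thrash Metal'],
    },
    index=['BS', 'IM', 'MT', 'SL', 'MG'],
)

# OR: bands from the UK, or formed before 1982
uk_or_early = df.loc[(df['Country'] == 'UK') | (df['Year Formed'] < 1982)]

# NOT: every band whose genre is not Thrash Metal
not_thrash = df.loc[~(df['Genre'] == 'Thrash Metal')]

print(uk_or_early.index.tolist())  # ['BS', 'IM', 'MT', 'SL']
print(not_thrash.index.tolist())   # ['BS', 'IM']
```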


## DataFrame Indexing and slicing with `.iloc[]`

`Pandas` `.iloc[]` is position-based: select rows and columns by their integer positions. Row/column indexing starts at 0. Slices are end-exclusive.

```python
# basic syntax
df.iloc[row_index, column_index]
```

In [None]:
import pandas as pd

data = {
    'Band': ['Black Sabbath', 'Iron Maiden', 'Metallica', 'Slayer',
             'Megadeth'],
    'Country': ['UK', 'UK', 'USA', 'USA', 'USA'],
    'Year Formed': [1968, 1975, 1981, 1981, 1983],
    'Genre': ['Heavy Metal', 'Heavy Metal', 'Thrash Metal',
              'Thrash Metal', 'Thrash Metal'],
    }
df = pd.DataFrame(data, index=['BS', 'IM', 'MT', 'SL', 'MG'])

# Access genre of band in 3rd row (index 2), 4th column (index 3)
print(df.iloc[2, 3])

Thrash Metal


In [None]:
# Get all details of the band in 5th row (index 4)
print(df.iloc[4])

Band               Megadeth
Country                 USA
Year Formed            1983
Genre          Thrash Metal
Name: MG, dtype: object


In [None]:
# Access rows 2 to 4 (excludes index 5)
print(df.iloc[2:5])

         Band Country  Year Formed         Genre
MT  Metallica     USA         1981  Thrash Metal
SL     Slayer     USA         1981  Thrash Metal
MG   Megadeth     USA         1983  Thrash Metal


### Common `.iloc[]` selections

| Type | Example | Description |
| --- | --- | --- |
| **Single cell** | `df.iloc[2, 3]` | Cell at row 2, column 3 |
| **Single row** | `df.iloc[4]` | All values in row 4 |
| **Single column** | `df.iloc[:, 3]` | All values in column 3 |
| **Row slice** | `df.iloc[2:5]` | Rows 2 to 4 (excludes 5) |
| **Column slice** | `df.iloc[:, 1:3]` | Columns 1 to 2 (excludes 3) |
| **Specific rows/columns** | `df.iloc[[0, 2, 4], [0, 2]]` | Rows 0, 2, 4 and columns 0, 2 |
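
The example DataFrame has five rows, so valid row positions run from 0 to 4. The sketch below selects by lists of positions and also shows negative positions, which count from the end as they do for Python lists; the names `picked` and `last_row` are illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Band': ['Black Sabbath', 'Iron Maiden', 'Metallica', 'Slayer',
                 'Megadeth'],
        'Country': ['UK', 'UK', 'USA', 'USA', 'USA'],
        'Year Formed': [1968, 1975, 1981, 1981, 1983],
    },
    index=['BS', 'IM', 'MT', 'SL', 'MG'],
)

# Rows at positions 0, 2, 4 and columns at positions 0, 2
picked = df.iloc[[0, 2, 4], [0, 2]]

# Negative positions count from the end: -1 is the last row
last_row = df.iloc[-1]

print(picked.index.tolist())    # ['BS', 'MT', 'MG']
print(picked.columns.tolist())  # ['Band', 'Year Formed']
print(last_row['Band'])         # Megadeth
```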

## Comparing `.loc[]` with `.iloc[]`

| Feature | `.loc[]` | `.iloc[]` |
| --- | --- | --- |
| **Indexing type** | Labels or names | Integer positions |
| **Use case** | Meaningful labels | Numerical slicing |
| **Missing key error** | `KeyError` | `IndexError` |
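
A small sketch of these differences side by side, using a made-up three-row DataFrame. It also shows the slicing behaviour described earlier: label slices include the end label, position slices exclude the end position.

```python
import pandas as pd

df = pd.DataFrame(
    {'Band': ['Iron Maiden', 'Metallica', 'Slayer']},
    index=['IM', 'MT', 'SL'],
)

# Label slice: both endpoints are included
by_label = df.loc['IM':'MT']   # rows IM and MT

# Position slice: the end position is excluded
by_position = df.iloc[0:1]     # row at position 0 only

# A label that does not exist raises KeyError
try:
    df.loc['XX']
except KeyError:
    loc_error = 'KeyError'

# A position beyond the last row raises IndexError
try:
    df.iloc[10]
except IndexError:
    iloc_error = 'IndexError'

print(len(by_label), len(by_position))  # 2 1
print(loc_error, iloc_error)            # KeyError IndexError
```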

## Adding, Dropping, and Modifying Columns

### Adding New Columns

You can easily add new columns to a **DataFrame** by assigning values directly. These can be constants, lists, or calculations based on existing columns.

In [25]:
import pandas as pd

data = {
    'Band': [
        'Black Sabbath', 'Iron Maiden', 'Metallica',
        'Slayer', 'Megadeth', 'Angra',
        ],
    'Country': [
        'UK', 'UK', 'USA',
        'USA', 'USA', 'Brazil',
        ],
    'Year Formed': [
        1968, 1975, 1981,
        1981, 1983, 1991,
        ],
    'Genre': [
        'Heavy Metal', 'Heavy Metal',
        'Thrash Metal', 'Thrash Metal',
        'Thrash Metal', 'Power Metal',
        ],
    }
df = pd.DataFrame(data)
print(f"The original DataFrame:\n{df}")

# adding a new column for years the band has been active
df["Years Active"] = 2023 - df["Year Formed"]
print(f"\nDataFrame after adding column:\n{df}")

The original DataFrame:
            Band Country  Year Formed         Genre
0  Black Sabbath      UK         1968   Heavy Metal
1    Iron Maiden      UK         1975   Heavy Metal
2      Metallica     USA         1981  Thrash Metal
3         Slayer     USA         1981  Thrash Metal
4       Megadeth     USA         1983  Thrash Metal
5          Angra  Brazil         1991   Power Metal

DataFrame after adding column:
            Band Country  Year Formed         Genre  Years Active
0  Black Sabbath      UK         1968   Heavy Metal            55
1    Iron Maiden      UK         1975   Heavy Metal            48
2      Metallica     USA         1981  Thrash Metal            42
3         Slayer     USA         1981  Thrash Metal            42
4       Megadeth     USA         1983  Thrash Metal            40
5          Angra  Brazil         1991   Power Metal            32
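
Assigning a constant or a list works the same way as assigning a calculation. A minimal sketch on a shortened DataFrame; the column names `Dataset` and `Listing Order` and their values are hypothetical and only serve as illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Band': ['Black Sabbath', 'Iron Maiden', 'Metallica'],
    'Year Formed': [1968, 1975, 1981],
})

# Constant: the same value is repeated for every row
df['Dataset'] = 'metal_bands'  # hypothetical column

# List: one value per row, in row order; its length must match the row count
df['Listing Order'] = [1, 2, 3]  # hypothetical column

print(df)
```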


### Updating Specific Values

You can update values conditionally using `.loc[]`. For example, Black Sabbath stopped touring in 2017, so we need to correct their "Years Active":

In [26]:
# \ is a line continuation character that lets you split long lines across multiple lines

df.loc[df["Band"] == "Black Sabbath", "Years Active"] = 2017 \
    - df.loc[df["Band"] == "Black Sabbath", "Year Formed"]

print(f"\nDataFrame after updating specific value:\n{df}")


DataFrame after updating specific value:
            Band Country  Year Formed         Genre  Years Active
0  Black Sabbath      UK         1968   Heavy Metal            49
1    Iron Maiden      UK         1975   Heavy Metal            48
2      Metallica     USA         1981  Thrash Metal            42
3         Slayer     USA         1981  Thrash Metal            42
4       Megadeth     USA         1983  Thrash Metal            40
5          Angra  Brazil         1991   Power Metal            32


### Creating a Boolean Column

You can also create a new column with `True` or `False` values based on a condition:

In [27]:
# Check if the band is still active
df["is_active"] = (2023 - df["Years Active"]) == df["Year Formed"]
print(f"\nDataFrame after adding boolean column:\n{df}")


DataFrame after adding boolean column:
            Band Country  Year Formed         Genre  Years Active  is_active
0  Black Sabbath      UK         1968   Heavy Metal            49      False
1    Iron Maiden      UK         1975   Heavy Metal            48       True
2      Metallica     USA         1981  Thrash Metal            42       True
3         Slayer     USA         1981  Thrash Metal            42       True
4       Megadeth     USA         1983  Thrash Metal            40       True
5          Angra  Brazil         1991   Power Metal            32       True


### Modifying Columns

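You can overwrite an entire column by assigning a new list, **Series**, or calculation, which is useful when you want to recalculate a column across all rows. To change only some rows, combine a boolean mask with `.loc[]`; the cell below increments "Years Active" only for bands that are still active.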

In [28]:
# Increment "Years Active" by 1 only for bands where "is_active" is True
df.loc[df["is_active"], "Years Active"] += 1

print(f"DataFrame after modifying column:\n{df}")

DataFrame after modifying column:
            Band Country  Year Formed         Genre  Years Active  is_active
0  Black Sabbath      UK         1968   Heavy Metal            49      False
1    Iron Maiden      UK         1975   Heavy Metal            49       True
2      Metallica     USA         1981  Thrash Metal            43       True
3         Slayer     USA         1981  Thrash Metal            43       True
4       Megadeth     USA         1983  Thrash Metal            41       True
5          Angra  Brazil         1991   Power Metal            33       True


### Dropping Columns

To remove a column from a **DataFrame**, use `.drop()`:

```python
# basic syntax
df.drop(labels, axis=0, inplace=False)
```

- `labels`: Name(s) of rows or columns to drop.
- `axis`: 0 for rows (default), 1 for columns.
- `inplace`:
  - `False` (default): Returns a new **DataFrame** with specified rows or columns removed.
  - `True`: modifies the original **DataFrame**.

In [29]:
# Remove the "Year Formed" column
df = df.drop("Year Formed", axis=1)
print(f"DataFrame after dropping 'Year Formed':\n{df}")

DataFrame after dropping 'Year Formed':
            Band Country         Genre  Years Active  is_active
0  Black Sabbath      UK   Heavy Metal            49      False
1    Iron Maiden      UK   Heavy Metal            49       True
2      Metallica     USA  Thrash Metal            43       True
3         Slayer     USA  Thrash Metal            43       True
4       Megadeth     USA  Thrash Metal            41       True
5          Angra  Brazil   Power Metal            33       True
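
`.drop()` can also remove rows, and the `columns=` keyword is an alternative to passing `axis=1`. A minimal sketch on a shortened DataFrame; the names `without_first` and `band_only` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'Band': ['Black Sabbath', 'Iron Maiden', 'Angra'],
    'Country': ['UK', 'UK', 'Brazil'],
    'Genre': ['Heavy Metal', 'Heavy Metal', 'Power Metal'],
})

# Drop a row by its index label (axis=0 is the default)
without_first = df.drop(0)

# Drop several columns at once; columns=[...] is equivalent to axis=1
band_only = df.drop(columns=['Country', 'Genre'])

# Neither call used inplace=True, so df itself is unchanged
print(without_first.index.tolist())  # [1, 2]
print(band_only.columns.tolist())    # ['Band']
print(df.shape)                      # (3, 3)
```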


## Aggregating Values

Pandas makes it easy to summarize your data using built-in aggregation methods. These help you spot patterns, compare metrics, and clean up messy datasets. Here’s a quick reference of common methods and when to use them:

| Method | What It Does | When to Use |
| --- | --- |--- |
| `.sum()` | Adds up all values in a column | Totals, like cumulative durations or combined scores |
| `.mean()` | Calculates the average of column values | Spotting trends or benchmarks |
| `.median()` | Finds the middle value of a column | Great for skewed data with outliers |
| `.min()` | Finds the smallest value in a column | Identify the lowest data point |
| `.max()` | Finds the largest value in a column | Spot the record-breakers |
| `.copy()` | Creates a safe duplicate of your data | When experimenting or modifying without risks |

In [32]:
import pandas as pd

data = {
    "Band": ["Black Sabbath", "Iron Maiden", "Metallica",
             "Slayer", "Megadeth", "Angra"],
    "Country": ["UK", "UK", "USA", "USA", "USA", "Brazil"],
    "Year Formed": [1968, 1975, 1981, 1981, 1983, 1991],
    "Genre": ["Heavy Metal", "Heavy Metal", "Thrash Metal",
              "Thrash Metal", "Thrash Metal", "Power Metal"],
    "Years Active": [49, 49, 43, 43, 41, 33],
    "is_active": [False, True, True, True, True, True]
}
df = pd.DataFrame(data)

# Calculate the total years active for all bands
total_years_active = df["Years Active"].sum()
print(f"Total Years Active: {total_years_active}")

Total Years Active: 258


In [33]:
# Calculate the average years active
average_years_active = df["Years Active"].mean()
print(f"Average Years Active: {average_years_active}")

Average Years Active: 43.0


In [34]:
# Calculate the median years active
median_years_active = df["Years Active"].median()
print(f"Median Years Active: {median_years_active}")

Median Years Active: 43.0


In [35]:
# Find the shortest and longest years active
min_years_active = df["Years Active"].min()
max_years_active = df["Years Active"].max()

print(f"Shortest Years Active: {min_years_active}")
print(f"Longest Years Active: {max_years_active}")

Shortest Years Active: 33
Longest Years Active: 49
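
When several of these statistics are needed at once, `.agg()` accepts a list of method names and returns them in a single Series. This is a sketch of one possible shortcut, not a replacement for the individual calls above; the name `summary` is illustrative:

```python
import pandas as pd

years_active = pd.Series([49, 49, 43, 43, 41, 33], name="Years Active")

# One call, several aggregations; the result is indexed by method name
summary = years_active.agg(["sum", "mean", "median", "min", "max"])
print(summary)
```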


## Working Without Wrecking

Before making changes to your data, it’s smart to create a backup. The `.copy()` method lets you duplicate your **DataFrame** so you can experiment freely without affecting the original.

In [37]:
print(f"Original DataFrame:\n{df}\n")

# Create a copy of the DataFrame
df_copy = df.copy()

# Modify the copy without changing the original
df_copy["Years Active"] += 1
print(f"Modified Copy:\n{df_copy}")

Original DataFrame:
            Band Country  Year Formed         Genre  Years Active  is_active
0  Black Sabbath      UK         1968   Heavy Metal            49      False
1    Iron Maiden      UK         1975   Heavy Metal            49       True
2      Metallica     USA         1981  Thrash Metal            43       True
3         Slayer     USA         1981  Thrash Metal            43       True
4       Megadeth     USA         1983  Thrash Metal            41       True
5          Angra  Brazil         1991   Power Metal            33       True

Modified Copy:
            Band Country  Year Formed         Genre  Years Active  is_active
0  Black Sabbath      UK         1968   Heavy Metal            50      False
1    Iron Maiden      UK         1975   Heavy Metal            50       True
2      Metallica     USA         1981  Thrash Metal            44       True
3         Slayer     USA         1981  Thrash Metal            44       True
4       Megadeth     USA         1983  Thrash Metal            42       True
5          Angra  Brazil         1991   Power Metal            34       True
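
Plain assignment is not a backup: it only gives the same DataFrame a second name. The sketch below contrasts the two on a made-up two-row DataFrame; `alias` and `backup` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "Band": ["Black Sabbath", "Metallica"],
    "Years Active": [49, 43],
})

alias = df          # same object under a second name, not a backup
backup = df.copy()  # independent duplicate

# Changing the DataFrame through the alias changes df as well
alias.loc[0, "Years Active"] = 50

print(alias is df)                    # True
print(df.loc[0, "Years Active"])      # 50
print(backup.loc[0, "Years Active"])  # 49
```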

## Sorting Data

### Sorting Rows by Column Values

The `.sort_values()` method allows you to sort rows in your **DataFrame** based on the values in one or more columns.

```python
# basic syntax
df.sort_values(by="column_name", ascending=True, inplace=False)
```

- `by`: The column name, or list of column names, to sort by
- `ascending`: `True` (default) sorts in ascending order; set `ascending=False` for descending order
- `inplace`:  If `True`, modifies the original **DataFrame**. If `False` (default), returns a new **DataFrame**.

In [41]:
import pandas as pd

data = {
    "Band": ["Black Sabbath", "Iron Maiden", "Metallica",
             "Slayer", "Megadeth", "Angra"],
    "Years Active": [49, 49, 43, 43, 41, 33]
}
df = pd.DataFrame(data)
print(f"Unsorted DataFrame:\n{df}\n")

# Sort bands by "Years Active"
sorted_df = df.sort_values(by="Years Active")
print(f"DataFrame sorted by 'Years Active':\n{sorted_df}")

Unsorted DataFrame:
            Band  Years Active
0  Black Sabbath            49
1    Iron Maiden            49
2      Metallica            43
3         Slayer            43
4       Megadeth            41
5          Angra            33

DataFrame sorted by 'Years Active':
            Band  Years Active
5          Angra            33
4       Megadeth            41
3         Slayer            43
2      Metallica            43
0  Black Sabbath            49
1    Iron Maiden            49
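
Several bands share the same "Years Active" value, so a second sort column can be used to break ties predictably. A sketch assuming ties should be resolved alphabetically by band name:

```python
import pandas as pd

df = pd.DataFrame({
    "Band": ["Black Sabbath", "Iron Maiden", "Metallica",
             "Slayer", "Megadeth", "Angra"],
    "Years Active": [49, 49, 43, 43, 41, 33],
})

# Longest-running bands first; ties are ordered alphabetically by band name
sorted_df = df.sort_values(
    by=["Years Active", "Band"],
    ascending=[False, True],
)

print(sorted_df["Band"].tolist())
# ['Black Sabbath', 'Iron Maiden', 'Metallica', 'Slayer', 'Megadeth', 'Angra']
```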


## Resetting Row Indices

Sorting a **DataFrame** can leave the row indices out of order. Use `.reset_index()` to restore a clean, sequential index.

```python
# basic syntax
df.reset_index(drop=False, inplace=False)
```

- `drop=True`: Removes the old index
- `inplace=True`: Applies the change directly to the original **DataFrame**

In [42]:
reset_df = sorted_df.reset_index()
print(f"DataFrame with indices reset:\n{reset_df}")

DataFrame with indices reset:
   index           Band  Years Active
0      5          Angra            33
1      4       Megadeth            41
2      3         Slayer            43
3      2      Metallica            43
4      0  Black Sabbath            49
5      1    Iron Maiden            49


By default, the old index is kept as a new column. If you don’t need it, drop it:

In [43]:
# Reset index and remove the old one
reset_df = sorted_df.reset_index(drop=True)
print(f"DataFrame with old indices dropped:\n{reset_df}")

DataFrame with old indices dropped:
            Band  Years Active
0          Angra            33
1       Megadeth            41
2         Slayer            43
3      Metallica            43
4  Black Sabbath            49
5    Iron Maiden            49
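
Because both methods return a new **DataFrame** by default, sorting and resetting can be chained in one expression. A minimal sketch; `ranked` is an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({
    "Band": ["Black Sabbath", "Iron Maiden", "Metallica",
             "Slayer", "Megadeth", "Angra"],
    "Years Active": [49, 49, 43, 43, 41, 33],
})

# Sort descending, then replace the shuffled index with 0..n-1
ranked = df.sort_values(by="Years Active", ascending=False).reset_index(drop=True)

print(ranked.index.tolist())  # [0, 1, 2, 3, 4, 5]
print(df.index.tolist())      # [0, 1, 2, 3, 4, 5] (original is untouched)
```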


## Exercise 1

Use `loc` and conditional filtering to select the rows from the DataFrame `df` where the `Country` is 'UK'.

- Create a boolean mask (a Series of True/False values) by checking which rows in the `Country` column have the value 'UK'.
- Use the `loc` accessor with the boolean mask to select only the rows where the mask is `True`.
- Print the resulting **DataFrame**.

```python
data = {
    'Band': [
        'Black Sabbath', 'Iron Maiden', 'Metallica',
        'Slayer', 'Megadeth', 'Angra',
        ],
    'Country': [
        'UK', 'UK', 'USA',
        'USA', 'USA', 'Brazil',
        ],
    'Year Formed': [
        1968, 1975, 1981,
        1981, 1983, 1991,
        ],
    'Genre': [
        'Heavy Metal', 'Heavy Metal',
        'Thrash Metal', 'Thrash Metal',
        'Thrash Metal', 'Power Metal',
        ],
    }
```

In [None]:
import pandas as pd

data = {
    'Band': [
        'Black Sabbath', 'Iron Maiden', 'Metallica',
        'Slayer', 'Megadeth', 'Angra',
        ],
    'Country': [
        'UK', 'UK', 'USA',
        'USA', 'USA', 'Brazil',
        ],
    'Year Formed': [
        1968, 1975, 1981,
        1981, 1983, 1991,
        ],
    'Genre': [
        'Heavy Metal', 'Heavy Metal',
        'Thrash Metal', 'Thrash Metal',
        'Thrash Metal', 'Power Metal',
        ],
    }

df = pd.DataFrame(data)

# Create mask
mask = df['Country'] == 'UK'

# Filter rows using loc
uk_bands = df.loc[mask]

print(uk_bands)

            Band Country  Year Formed        Genre
0  Black Sabbath      UK         1968  Heavy Metal
1    Iron Maiden      UK         1975  Heavy Metal


## Exercise 2

We want to find the OGs of **US Heavy Metal** in our dataset. Help us out.

- Use `.loc[]` with **multiple conditions** to filter and display all bands:
  - From the **USA**.
  - Formed **before 1982**.
- Use `.iloc[]` with slicing to display the first **three rows** and their **first two columns**.

```python
data = {
    "Band": [
        "Pantera", "Sepultura", "Dream Theater",
        "Anthrax", "Death", "Exodus",
        "Judas Priest", "Testament"
    ],
    "Country": [
        "USA", "Brazil", "USA",
        "USA", "USA", "USA",
        "UK", "USA"
    ],
    "Year Formed": [
        1981, 1984, 1985,
        1981, 1983, 1979,
        1969, 1983
    ],
    "Genre": [
        "Groove Metal", "Thrash Metal", "Progressive Metal",
        "Thrash Metal", "Death Metal", "Thrash Metal",
        "Heavy Metal", "Thrash Metal"
    ]
}
```


In [None]:
import pandas as pd

data = {
    "Band": [
        "Pantera", "Sepultura", "Dream Theater",
        "Anthrax", "Death", "Exodus",
        "Judas Priest", "Testament"
    ],
    "Country": [
        "USA", "Brazil", "USA",
        "USA", "USA", "USA",
        "UK", "USA"
    ],
    "Year Formed": [
        1981, 1984, 1985,
        1981, 1983, 1979,
        1969, 1983
    ],
    "Genre": [
        "Groove Metal", "Thrash Metal", "Progressive Metal",
        "Thrash Metal", "Death Metal", "Thrash Metal",
        "Heavy Metal", "Thrash Metal"
    ]
}

df = pd.DataFrame(data)

# filter bands from the USA formed before 1982
filtered_bands = df.loc[(df['Year Formed'] < 1982) & (df['Country'] == 'USA')]
print(filtered_bands)
print() # line break

# display the first three rows and their first two columns
print(filtered_bands.iloc[:3, :2])

      Band Country  Year Formed         Genre
0  Pantera     USA         1981  Groove Metal
3  Anthrax     USA         1981  Thrash Metal
5   Exodus     USA         1979  Thrash Metal

      Band Country
0  Pantera     USA
3  Anthrax     USA
5   Exodus     USA
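
After filtering, the surviving rows keep their original labels (0, 3, 5 here), so positions and labels no longer line up. The sketch below, on a shortened version of the dataset, shows why `.iloc[]` is the right tool for "the first three rows"; the variable names are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "Band": ["Pantera", "Sepultura", "Dream Theater",
             "Anthrax", "Death", "Exodus"],
    "Country": ["USA", "Brazil", "USA", "USA", "USA", "USA"],
    "Year Formed": [1981, 1984, 1985, 1981, 1983, 1979],
})

filtered = df.loc[(df["Year Formed"] < 1982) & (df["Country"] == "USA")]
print(filtered.index.tolist())  # [0, 3, 5]

# The second row of the filtered result sits at position 1 but has label 3
second_by_position = filtered.iloc[1]["Band"]
same_row_by_label = filtered.loc[3, "Band"]

print(second_by_position)  # Anthrax
print(same_row_by_label)   # Anthrax
```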


## Exercise 3

Let’s refine the **heavy metal bands** dataset to focus on specific data points and add calculated insights. Your tasks are:

1. Remove the `Country` and `Genre` columns from the dataset.
2. Add a new column named `Albums per Decade`, calculated as the number of albums divided by the number of decades since the band was formed.
3. Modify the `Albums per Decade` column by doubling its values to reflect hypothetical re-releases.
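
To make the arithmetic in tasks 2 and 3 concrete, here is a worked calculation for a single band in plain Python. The helper `albums_per_decade` is hypothetical, and the reference year 2023 is an assumption (the same constant used for "Years Active" earlier in this notebook):

```python
def albums_per_decade(albums, year_formed, reference_year=2023):
    """Albums divided by the number of decades since formation."""
    decades = (reference_year - year_formed) / 10
    return albums / decades


# Judas Priest: 18 albums, formed 1969 -> 54 years = 5.4 decades
rate = albums_per_decade(18, 1969)
doubled = rate * 2

print(round(rate, 2))     # 3.33
print(round(doubled, 2))  # 6.67
```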

```python
data = {
    'Band': [
        'Pantera', 'Sepultura', 'Dream Theater', 'Anthrax', 'Death',
        'Exodus', 'Judas Priest', 'Testament',
        ],
    'Country': [
        'USA', 'Brazil', 'USA', 'USA', 'USA', 'USA',
        'UK', 'USA',
        ],
    'Year Formed': [
        1981, 1984, 1985, 1981, 1983, 1979, 1969, 1983,
        ],
    'Genre': [
        'Groove Metal', 'Thrash Metal', 'Progressive Metal',
        'Thrash Metal', 'Death Metal', 'Thrash Metal',
        'Heavy Metal', 'Thrash Metal',
        ],
    'Albums Released': [
        9, 15, 14, 11, 7, 10, 18, 12,
        ],
    }
```

In [31]:
import pandas as pd

data = {
    'Band': [
        'Pantera', 'Sepultura', 'Dream Theater', 'Anthrax', 'Death',
        'Exodus', 'Judas Priest', 'Testament',
        ],
    'Country': [
        'USA', 'Brazil', 'USA', 'USA', 'USA', 'USA',
        'UK', 'USA',
        ],
    'Year Formed': [
        1981, 1984, 1985, 1981, 1983, 1979, 1969, 1983,
        ],
    'Genre': [
        'Groove Metal', 'Thrash Metal', 'Progressive Metal',
        'Thrash Metal', 'Death Metal', 'Thrash Metal',
        'Heavy Metal', 'Thrash Metal',
        ],
    'Albums Released': [
        9, 15, 14, 11, 7, 10, 18, 12,
        ],
    }

df = pd.DataFrame(data)
print(f"Original DataFrame:\n{df}\n")

# Remove the "Country" and "Genre" columns
df = df.drop(["Country", "Genre"], axis=1)
print(f"DataFrame after dropping 'Country' and `Genre`:\n{df}\n")

# Add a new column "Albums per Decade"
# Calculate albums per decade using regular division
df["Albums per Decade"] = df["Albums Released"] / ((2023 - df["Year Formed"]) / 10)
print(f"DataFrame after adding 'Albums per Decade' column:\n{df}\n")

# Double the values in the "Albums per Decade" column
df["Albums per Decade"] *= 2
print(f"DataFrame after doubling 'Albums per Decade' column:\n{df}")

Original DataFrame:
            Band Country  Year Formed              Genre  Albums Released
0        Pantera     USA         1981       Groove Metal                9
1      Sepultura  Brazil         1984       Thrash Metal               15
2  Dream Theater     USA         1985  Progressive Metal               14
3        Anthrax     USA         1981       Thrash Metal               11
4          Death     USA         1983        Death Metal                7
5         Exodus     USA         1979       Thrash Metal               10
6   Judas Priest      UK         1969        Heavy Metal               18
7      Testament     USA         1983       Thrash Metal               12

DataFrame after dropping 'Country' and `Genre`:
            Band  Year Formed  Albums Released
0        Pantera         1981                9
1      Sepultura         1984               15
2  Dream Theater         1985               14
3        Anthrax         1981               11
4          Death         1983 

## Exercise 4

Perform a basic statistical analysis of the **Heavy Metal Bands** dataset by calculating the following:

1. **Total Albums Released**: Use `.sum()` to calculate the total number of albums released by all bands.
2. **Average Albums Released**: Use `.mean()` to calculate the average number of albums released.
3. **Maximum Albums Released**: Use `.max()` to find the highest number of albums released by any band.

```python
data = {
    "Band": [ "Pantera", "Sepultura", "Dream Theater", "Anthrax",
                    "Death", "Exodus", "Judas Priest", "Testament" ],
    "Country": [ "USA", "Brazil", "USA", "USA", "USA", "USA", "UK", "USA" ],
    "Year Formed": [ 1981, 1984, 1985, 1981, 1983, 1979, 1969, 1983 ],
    "Genre": [ "Groove Metal", "Thrash Metal", "Progressive Metal",
        "Thrash Metal", "Death Metal", "Thrash Metal",
        "Heavy Metal", "Thrash Metal" ],
    "Albums Released": [ 9, 15, 14, 11, 7, 10, 18, 12 ]
}
```

In [38]:
import pandas as pd

data = {
    "Band": [ "Pantera", "Sepultura", "Dream Theater", "Anthrax",
                    "Death", "Exodus", "Judas Priest", "Testament" ],
    "Country": [ "USA", "Brazil", "USA", "USA", "USA", "USA", "UK", "USA" ],
    "Year Formed": [ 1981, 1984, 1985, 1981, 1983, 1979, 1969, 1983 ],
    "Genre": [ "Groove Metal", "Thrash Metal", "Progressive Metal",
        "Thrash Metal", "Death Metal", "Thrash Metal",
        "Heavy Metal", "Thrash Metal" ],
    "Albums Released": [ 9, 15, 14, 11, 7, 10, 18, 12 ]
}

df = pd.DataFrame(data)

# Calculate the total albums released
total_albums_released = df["Albums Released"].sum()
print(f"Total Albums Released: {total_albums_released}")

# Calculate the average albums released
average_albums_released = df["Albums Released"].mean()
print(f"Average Albums Released: {average_albums_released}")

# Find the maximum albums released
max_albums_released = df["Albums Released"].max()
print(f"Maximum Albums Released: {max_albums_released}")

Total Albums Released: 96
Average Albums Released: 12.0
Maximum Albums Released: 18
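
`.max()` returns the value but not the band it belongs to. One possible follow-up, sketched below, uses `.idxmax()` to get the index label of the largest value and `.loc[]` to look up the matching band; `top_label` and `top_band` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "Band": ["Pantera", "Sepultura", "Dream Theater", "Anthrax",
             "Death", "Exodus", "Judas Priest", "Testament"],
    "Albums Released": [9, 15, 14, 11, 7, 10, 18, 12],
})

# Index label of the row holding the maximum value
top_label = df["Albums Released"].idxmax()

# Look up the band name at that label
top_band = df.loc[top_label, "Band"]

print(top_label)  # 6
print(top_band)   # Judas Priest
```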


## Exercise 5

Let’s organize the Heavy Metal Bands dataset by performing the following operations:

1. **Sort by Albums Released**:
  - Use `.sort_values()` to sort the DataFrame by the `Albums Released` column in descending order.
  - Set `inplace=True` to modify the original DataFrame.

2. **Reindex the DataFrame**:
  - Reset the indices of the DataFrame after sorting using `.reset_index()` with `drop=True`.

```python
data = {
    "Band": [ "Pantera", "Sepultura", "Dream Theater", "Anthrax",
                    "Death", "Exodus", "Judas Priest", "Testament" ],
    "Country": [ "USA", "Brazil", "USA", "USA", "USA", "USA", "UK", "USA" ],
    "Year Formed": [ 1981, 1984, 1985, 1981, 1983, 1979, 1969, 1983 ],
    "Genre": [ "Groove Metal", "Thrash Metal", "Progressive Metal",
        "Thrash Metal", "Death Metal", "Thrash Metal",
        "Heavy Metal", "Thrash Metal" ],
    "Albums Released": [ 9, 15, 14, 11, 7, 10, 18, 12 ]
}
```

In [39]:
import pandas as pd

data = {
    "Band": [ "Pantera", "Sepultura", "Dream Theater", "Anthrax",
                    "Death", "Exodus", "Judas Priest", "Testament" ],
    "Country": [ "USA", "Brazil", "USA", "USA", "USA", "USA", "UK", "USA" ],
    "Year Formed": [ 1981, 1984, 1985, 1981, 1983, 1979, 1969, 1983 ],
    "Genre": [ "Groove Metal", "Thrash Metal", "Progressive Metal",
        "Thrash Metal", "Death Metal", "Thrash Metal",
        "Heavy Metal", "Thrash Metal" ],
    "Albums Released": [ 9, 15, 14, 11, 7, 10, 18, 12 ]
}

df = pd.DataFrame(data)

# Step 1: Sort by "Albums Released" in descending order
df.sort_values(by="Albums Released", ascending=False, inplace=True)
print(f"DataFrame after sorting by 'Albums Released':\n{df}\n")

# Step 2: Reset the index
df.reset_index(drop=True, inplace=True)
print(f"DataFrame after reindexing:\n{df}")

DataFrame after sorting by 'Albums Released':
            Band Country  Year Formed              Genre  Albums Released
6   Judas Priest      UK         1969        Heavy Metal               18
1      Sepultura  Brazil         1984       Thrash Metal               15
2  Dream Theater     USA         1985  Progressive Metal               14
7      Testament     USA         1983       Thrash Metal               12
3        Anthrax     USA         1981       Thrash Metal               11
5         Exodus     USA         1979       Thrash Metal               10
0        Pantera     USA         1981       Groove Metal                9
4          Death     USA         1983        Death Metal                7

DataFrame after reindexing:
            Band Country  Year Formed              Genre  Albums Released
0   Judas Priest      UK         1969        Heavy Metal               18
1      Sepultura  Brazil         1984       Thrash Metal               15
2  Dream Theater     USA         1985