<a href="https://www.kaggle.com/code/matinmahmoodi/pandas-fun-problems-series?scriptVersionId=166184617" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Pandas Fun Problems - Series

Welcome to the first notebook in our Pandas course. It focuses on the Series, one of the core data structures in the Pandas library, and works through a set of practice problems on handling Series. You can access the problem set here:

- [Practice Problems Source](https://www.practiceprobs.com/problemsets/python-pandas/series/)

For another perspective on these problems, I recommend the following YouTube tutorial, which offers clear explanations and demonstrations:

- [YouTube Tutorial](https://www.youtube.com/watch?v=8xUgesdShE8)

The problems here build the foundation for the later topics in this Pandas course. Whether you are a beginner or refreshing your skills, they offer a practical way to learn.

I encourage you to share your solutions, ask questions, and provide feedback in the comments section. Engaging with the community can offer new insights and accelerate your learning process.

### **If you find this notebook helpful, please show your support with an upvote!**


# Q1 - Baby Names Problem


https://www.practiceprobs.com/problemsets/python-pandas/series/baby-names/

You and your spouse decided to let the internet name your next child. You’ve asked the great people of the web to submit their favorite names, and you’ve compiled their submissions into a Series called babynames. Determine how many people voted for the names 'Chad', 'Ruger', and 'Zeltron'.

In [1]:
import pandas as pd
import numpy as np

babynames = pd.Series([
    'Jathonathon', 'Zeltron', 'Ruger', 'Phreddy', 'Ruger', 'Chad', 'Chad',
    'Ruger', 'Ryan', 'Ruger', 'Chad', 'Ryan', 'Phreddy', 'Phreddy', 'Phreddy',
    'Mister', 'Zeltron', 'Ryan', 'Ruger', 'Ruger', 'Jathonathon',
    'Jathonathon', 'Ruger', 'Chad', 'Zeltron'], dtype='string')

## Solution

In [2]:
babynames.value_counts()

Ruger          7
Phreddy        4
Chad           4
Jathonathon    3
Zeltron        3
Ryan           3
Mister         1
Name: count, dtype: Int64

In [3]:
babynames.value_counts().loc[['Chad', 'Ruger', 'Zeltron']]

Chad       4
Ruger      7
Zeltron    3
Name: count, dtype: Int64
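
The `.loc` lookup above raises a `KeyError` if any requested name received no votes. As a minimal sketch, assuming you would rather report such names with a count of zero, `reindex` with `fill_value=0` can stand in for `.loc`. The name `'Bartholomew'` is a hypothetical addition that does not appear in the data.

```python
import pandas as pd

babynames = pd.Series([
    'Jathonathon', 'Zeltron', 'Ruger', 'Phreddy', 'Ruger', 'Chad', 'Chad',
    'Ruger', 'Ryan', 'Ruger', 'Chad', 'Ryan', 'Phreddy', 'Phreddy', 'Phreddy',
    'Mister', 'Zeltron', 'Ryan', 'Ruger', 'Ruger', 'Jathonathon',
    'Jathonathon', 'Ruger', 'Chad', 'Zeltron'], dtype='string')

# 'Bartholomew' is hypothetical: nobody submitted it
wanted = ['Chad', 'Ruger', 'Zeltron', 'Bartholomew']

# reindex keeps the requested order and fills missing names with 0
votes = babynames.value_counts().reindex(wanted, fill_value=0)
print(votes)
```

The result keeps the order of `wanted`, which also makes it convenient when the names should be reported in a fixed order.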

# Q2 - Bees Knees Problem

https://www.practiceprobs.com/problemsets/python-pandas/series/bees-knees/

Given two Series, bees and knees, if the ith value of bees is NaN, double the ith value of knees.


In [4]:
import pandas as pd
import numpy as np

bees = pd.Series([True, True, False, np.nan, True, False, True, np.nan])
knees = pd.Series([5,2,9,1,3,10,5,2], index = [7,0,2,6,3,5,1,4])

print(bees)
# 0     True
# 1     True
# 2    False
# 3      NaN
# 4     True
# 5    False
# 6     True
# 7      NaN
# dtype: object

print(knees)
# 7     5
# 0     2
# 2     9
# 6     1  <-- double this
# 3     3
# 5    10
# 1     5
# 4     2  <-- double this
# dtype: int64

0     True
1     True
2    False
3      NaN
4     True
5    False
6     True
7      NaN
dtype: object
7     5
0     2
2     9
6     1
3     3
5    10
1     5
4     2
dtype: int64


## Solution

In [5]:
knees.loc[bees.isna()]  # Not correct: a boolean Series is aligned to knees by index label, not by position

7    5
3    3
dtype: int64

In [6]:
knees.loc[bees.isna().to_numpy()]  # Correct: a NumPy boolean array carries no index, so it is applied by position

6    1
4    2
dtype: int64

In [7]:
knees.loc[bees.isna().to_numpy()] *= 2
knees

7     5
0     2
2     9
6     2
3     3
5    10
1     5
4     4
dtype: int64
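
The difference between the two masks above comes down to index alignment. The sketch below, which rebuilds the same data, puts both selections side by side and then doubles the values with `np.where` instead of in-place assignment. The variable names `by_label`, `by_position`, and `doubled` are mine, and the `np.where` variant is an alternative I am suggesting, not part of the original problem.

```python
import numpy as np
import pandas as pd

bees = pd.Series([True, True, False, np.nan, True, False, True, np.nan])
knees = pd.Series([5, 2, 9, 1, 3, 10, 5, 2], index=[7, 0, 2, 6, 3, 5, 1, 4])

mask = bees.isna()  # True at labels 3 and 7 (also positions 3 and 7 of bees)

# A boolean Series is matched to knees by label: picks labels 3 and 7
by_label = knees.loc[mask]

# A plain NumPy array has no labels, so it is matched by position:
# positions 3 and 7 of knees, which carry labels 6 and 4
by_position = knees.loc[mask.to_numpy()]

# Non-mutating alternative: choose between doubled and original values
doubled = pd.Series(
    np.where(mask.to_numpy(), knees.to_numpy() * 2, knees.to_numpy()),
    index=knees.index,
)
print(doubled)
```

Because `doubled` is a new Series, `knees` itself stays unchanged, which can be handy when rerunning notebook cells.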

# Q3 - Car Shopping Problem

https://www.practiceprobs.com/problemsets/python-pandas/series/car-shopping/

After accidentally leaving an ice chest of fish and shrimp in your car for a week while you were on vacation, you’re now in the market for a new vehicle 🚗. Your insurance didn’t cover the loss, so you want to make sure you get a good deal on your new car.

Given a Series of car asking_prices and another Series of car fair_prices, determine which cars for sale are a good deal. In other words, identify cars whose asking price is less than their fair price.



In [8]:
import pandas as pd
import numpy as np

asking_prices = pd.Series([5000, 7600, 9000, 8500, 7000], index=['civic', 'civic', 'camry', 'mustang', 'mustang'])
fair_prices = pd.Series([5500, 7500, 7500], index=['civic', 'mustang', 'camry'])

print(asking_prices)
# civic      5000
# civic      7600
# camry      9000
# mustang    8500
# mustang    7000
# dtype: int64

print(fair_prices)
# civic      5500
# mustang    7500
# camry      7500
# dtype: int64

civic      5000
civic      7600
camry      9000
mustang    8500
mustang    7000
dtype: int64
civic      5500
mustang    7500
camry      7500
dtype: int64


## Solution

In [9]:
all_fair_prices = fair_prices.loc[asking_prices.index]
all_fair_prices

civic      5500
civic      5500
camry      7500
mustang    7500
mustang    7500
dtype: int64

In [10]:
asking_prices.loc[asking_prices - all_fair_prices < 0]

civic      5000
mustang    7000
dtype: int64
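
The key step in the solution above is expanding fair_prices so that it has one entry per car for sale. As a sketch of the same idea, assuming `fair_prices` has a unique index (as it does here), `reindex` can do the expansion, and comparing the underlying NumPy arrays sidesteps any label matching on the duplicated index. The names `fair_for_each` and `good_deals` are mine.

```python
import pandas as pd

asking_prices = pd.Series(
    [5000, 7600, 9000, 8500, 7000],
    index=['civic', 'civic', 'camry', 'mustang', 'mustang'],
)
fair_prices = pd.Series([5500, 7500, 7500], index=['civic', 'mustang', 'camry'])

# Look up the fair price for every listed car, repeating models as needed
fair_for_each = fair_prices.reindex(asking_prices.index)

# Compare position by position, then use the boolean array as a mask
is_good_deal = asking_prices.to_numpy() < fair_for_each.to_numpy()
good_deals = asking_prices[is_good_deal]
print(good_deals)
```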

# Q4 - Price Gouging Problem

https://www.practiceprobs.com/problemsets/python-pandas/series/price-gouging/

You suspect your local grocery’s been price gouging the ground beef. You and some friends decide to track the price of ground beef every day for 10 days. You’ve compiled the data into a Series called beef_prices, whose index represents the day of each recording.



For example, beef was priced 3.37 on the first day, 4.64 on the second day, etc.

Determine which day had the biggest price increase from the prior day.


In [11]:
import pandas as pd
import numpy as np

generator = np.random.default_rng(123)
beef_prices = pd.Series(
    data = np.round(generator.uniform(low=3, high=5, size=10), 2),
    index = generator.choice(10, size=10, replace=False)
)

print(beef_prices)
# 4    4.36
# 8    3.11
# 2    3.44
# 0    3.37
# 6    3.35
# 9    4.62
# 3    4.85
# 5    3.55
# 1    4.64
# 7    4.78
# dtype: float64

4    4.36
8    3.11
2    3.44
0    3.37
6    3.35
9    4.62
3    4.85
5    3.55
1    4.64
7    4.78
dtype: float64


## Solution

In [12]:
beef_prices.sort_index(inplace=True)
beef_prices

0    3.37
1    4.64
2    3.44
3    4.85
4    4.36
5    3.55
6    3.35
7    4.78
8    3.11
9    4.62
dtype: float64

In [13]:
beef_prices_prev = beef_prices.shift(periods=1)
beef_prices_diff = beef_prices - beef_prices_prev
beef_prices_diff

0     NaN
1    1.27
2   -1.20
3    1.41
4   -0.49
5   -0.81
6   -0.20
7    1.43
8   -1.67
9    1.51
dtype: float64

In [14]:
beef_prices_diff.idxmax()

9
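
The shift-and-subtract steps above can also be written with `Series.diff`, which computes the change from the previous element directly. The sketch below builds the Series from the values printed earlier (rather than from the random generator) so that it runs on its own, and it avoids `inplace=True` so the original order is preserved. The name `biggest_increase_day` is mine.

```python
import pandas as pd

# Same values as printed above, in their original shuffled order
beef_prices = pd.Series(
    data=[4.36, 3.11, 3.44, 3.37, 3.35, 4.62, 4.85, 3.55, 4.64, 4.78],
    index=[4, 8, 2, 0, 6, 9, 3, 5, 1, 7],
)

# Sort by day, take day-over-day differences, and find the largest one
daily_change = beef_prices.sort_index().diff()
biggest_increase_day = daily_change.idxmax()
print(biggest_increase_day)
```

`idxmax` skips the NaN that `diff` produces for the first day, so no extra cleanup is needed.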

# Q5 - Fair Teams Problem

https://www.practiceprobs.com/problemsets/python-pandas/series/fair-teams/

You’re organizing a competitive rock-skipping league. 6 coaches and 20 players have signed up. Your job is to randomly and fairly determine the teams, assigning players to coaches. Keep in mind that some teams will have three players and some teams will have four players. Given a Series of coaches and a Series of players, create a Series of random coach-to-player mappings. The resulting Series should have coach names in its index and corresponding player names in its values.

In [15]:
import pandas as pd
import numpy as np

coaches = pd.Series(['Aaron', 'Donald', 'Joshua', 'Peter', 'Scott', 'Stephen'], dtype='string')
players = pd.Series(['Asher', 'Connor', 'Elizabeth', 'Emily', 'Ethan', 'Hannah', 'Isabella', 'Isaiah', 'James',
                     'Joshua', 'Julian', 'Layla', 'Leo', 'Madison', 'Mia', 'Oliver', 'Ryan', 'Scarlett', 'William',
                     'Wyatt'], dtype='string')

print(coaches)
# 0      Aaron
# 1     Donald
# 2     Joshua
# 3      Peter
# 4      Scott
# 5    Stephen
# dtype: string

print(players)
# 0         Asher
# 1        Connor
# 2     Elizabeth
# 3         Emily
# 4         Ethan
# 5        Hannah
# 6      Isabella
# 7        Isaiah
# 8         James
# 9        Joshua
# 10       Julian
# 11        Layla
# 12          Leo
# 13      Madison
# 14          Mia
# 15       Oliver
# 16         Ryan
# 17     Scarlett
# 18      William
# 19        Wyatt
# dtype: string

0      Aaron
1     Donald
2     Joshua
3      Peter
4      Scott
5    Stephen
dtype: string
0         Asher
1        Connor
2     Elizabeth
3         Emily
4         Ethan
5        Hannah
6      Isabella
7        Isaiah
8         James
9        Joshua
10       Julian
11        Layla
12          Leo
13      Madison
14          Mia
15       Oliver
16         Ryan
17     Scarlett
18      William
19        Wyatt
dtype: string


## Solution

In [16]:
coaches = coaches.sample(frac=1, random_state=2357)
players = players.sample(frac=1, random_state=7532)

In [17]:
repeats = np.ceil(len(players)/len(coaches)).astype('int64')
repeats

4

In [18]:
pd.concat([coaches] * repeats)

5    Stephen
4      Scott
0      Aaron
2     Joshua
3      Peter
1     Donald
5    Stephen
4      Scott
0      Aaron
2     Joshua
3      Peter
1     Donald
5    Stephen
4      Scott
0      Aaron
2     Joshua
3      Peter
1     Donald
5    Stephen
4      Scott
0      Aaron
2     Joshua
3      Peter
1     Donald
dtype: string

In [19]:
coaches_repeated = pd.concat([coaches] * repeats).head(len(players))
coaches_repeated

5    Stephen
4      Scott
0      Aaron
2     Joshua
3      Peter
1     Donald
5    Stephen
4      Scott
0      Aaron
2     Joshua
3      Peter
1     Donald
5    Stephen
4      Scott
0      Aaron
2     Joshua
3      Peter
1     Donald
5    Stephen
4      Scott
dtype: string

In [20]:
result = players.copy()
result.index = pd.Index(coaches_repeated, name='coach')
result

coach
Stephen       Julian
Scott         Joshua
Aaron      Elizabeth
Joshua         Asher
Peter         Oliver
Donald       William
Stephen        Wyatt
Scott         Isaiah
Aaron          Ethan
Joshua       Madison
Peter            Leo
Donald          Ryan
Stephen     Scarlett
Scott            Mia
Aaron         Connor
Joshua         James
Peter          Emily
Donald        Hannah
Stephen        Layla
Scott       Isabella
dtype: string
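
To confirm that the assignment is fair, it helps to count players per coach. The sketch below is a compact variant of the steps above, assuming `np.resize` as a substitute for the `concat` and `head` combination (it repeats the coach array cyclically until it reaches the requested length), followed by a team-size check. The names `coach_labels`, `teams`, and `team_sizes` are mine, and the exact pairings may differ from the output above.

```python
import numpy as np
import pandas as pd

coaches = pd.Series(['Aaron', 'Donald', 'Joshua', 'Peter', 'Scott', 'Stephen'])
players = pd.Series(['Asher', 'Connor', 'Elizabeth', 'Emily', 'Ethan', 'Hannah',
                     'Isabella', 'Isaiah', 'James', 'Joshua', 'Julian', 'Layla',
                     'Leo', 'Madison', 'Mia', 'Oliver', 'Ryan', 'Scarlett',
                     'William', 'Wyatt'])

# Shuffle both Series, then drop their indexes
shuffled_coaches = coaches.sample(frac=1, random_state=2357).to_numpy()
shuffled_players = players.sample(frac=1, random_state=7532).to_numpy()

# Cycle through the shuffled coaches until there is one label per player
coach_labels = np.resize(shuffled_coaches, len(shuffled_players))

teams = pd.Series(shuffled_players, index=pd.Index(coach_labels, name='coach'))

# Fairness check: every coach should have three or four players
team_sizes = teams.index.value_counts()
print(team_sizes)
```

With 20 players and 6 coaches, the first two coaches in the shuffled order end up with four players and the remaining four with three.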