# 75 pandas Exercises: Exercises 11 to 20

Exercises 11 to 20 from [here](https://www.machinelearningplus.com/python/101-pandas-exercises-python/). Each exercise includes the question, the input and the solution code. Where useful, alternative solutions and comments are added to explain the solution or the pandas functionality behind it.

Requirements: 
+ pandas
+ numpy

Happy Pandasing! 🐼

## Imports

In [3]:
import pandas as pd
import numpy as np # required for some questions

---

## Exercises 

### 🐼 Exercise 11

**How to bin a numeric series to 10 groups of equal size?** Bin the series `ser` into 10 equal deciles and replace the values with the bin name.

Input 

In [4]:
ser = pd.Series(np.random.random(20)) # ser is a series of 20 random numbers

Desired output (first 5 items)


_0    7th_

_1    9th_

_2    7th_

_3    3rd_

_4    8th_


_dtype: category_
_Categories (10, object): [1st < 2nd < 3rd < 4th ... 7th < 8th < 9th < 10th]_

So, Pandas allows discretization in equal-sized buckets (quantiles) through `pd.qcut`. Apparently, we can even directly give a name to the resulting bins. This seems like it!

In [6]:
bins = pd.qcut(ser, q=10, labels=['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th'])
bins.head(15)

0      4th
1      1st
2      4th
3      7th
4     10th
5      7th
6      8th
7      9th
8      8th
9      5th
10    10th
11     1st
12     6th
13     6th
14     3rd
dtype: category
Categories (10, object): [1st < 2nd < 3rd < 4th ... 7th < 8th < 9th < 10th]

So, in `bins`, for each index (matching `ser`'s index, of course), we have the quantile that sample belongs to. We can actually try to see what defines a bin.

In [13]:
bins_ret, bins_limits = pd.qcut(ser, q=10, labels=['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th'], retbins=True)
print(bins_limits)

[0.04155485 0.11576393 0.19091413 0.25799771 0.32604731 0.50903033
 0.7155983  0.84597955 0.87102839 0.93717372 0.95575487]


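`bins_limits` thus holds the 11 edges that delimit the 10 bins. The first bin, for instance, contains all samples between `0.04155485` and `0.11576393` (both ends included, since the lowest value has to land somewhere); every later bin excludes its left edge and includes its right one.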
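To make the link between edges and bins concrete, here is a small sketch on fresh data. The seeded generator `rng` and the names `values`, `deciles` and `edges` are my own choices for illustration, not part of the original exercise. With 20 distinct values and 10 quantile bins, each bin should end up with exactly 2 samples.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is reproducible (an assumption, the exercise uses np.random.random)
rng = np.random.default_rng(0)
values = pd.Series(rng.random(20))

labels = ['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th']
deciles, edges = pd.qcut(values, q=10, labels=labels, retbins=True)

# 10 bins are delimited by 11 edges
print(len(edges))

# Number of samples per decile, in category order
print(deciles.value_counts().sort_index())

# The outer edges coincide with the smallest and largest sample
print(np.isclose(edges[0], values.min()), np.isclose(edges[-1], values.max()))
```

Equal counts per bin are the whole point of `pd.qcut`; `pd.cut`, by contrast, makes bins of equal width, which usually leaves the counts uneven.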

### 🐼 Exercise 12

**Convert a numpy array to a dataframe of given shape.** Reshape the series `ser` into a dataframe with 7 rows and 5 columns.

Input

In [15]:
ser = pd.Series(np.random.randint(1, 10, 35))

Reshaping in pandas is a tricky business (because every row is strongly connected to an index). The easiest way is a bit of back and forth between pandas and numpy: the `.values` attribute exposes the data of a `pd.Series` or `pd.DataFrame` as an `np.array`, which can be reshaped and wrapped in a new `pd.DataFrame`.

In [19]:
reshaped_df = pd.DataFrame(ser.values.reshape(7, 5))
print(reshaped_df.shape)

(7, 5)


_Et voilà!_
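A quick way to see how the values are laid out after the round trip is to use ordered input instead of random numbers. The sketch below is an illustration, not part of the original exercise: it swaps in `np.arange` for the input, uses `.to_numpy()` (the accessor that newer pandas versions recommend over `.values`), and passes `-1` so that numpy infers the number of columns.

```python
import numpy as np
import pandas as pd

# Ordered input 1..35 (an assumption, chosen so the fill order is visible)
ser = pd.Series(np.arange(1, 36))

# -1 lets numpy infer the second dimension: 35 values / 7 rows = 5 columns
df = pd.DataFrame(ser.to_numpy().reshape(7, -1))

print(df.shape)             # (7, 5)
print(df.iloc[0].tolist())  # [1, 2, 3, 4, 5], values fill row by row
```

The reshape only works when the number of values matches the target shape; asking for 7 rows from a series whose length is not a multiple of 7 raises a `ValueError`.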

### 🐼 Exercise 13

**Find the positions of numbers that are multiples of 3 in a `pd.Series`.** 

Input 

In [21]:
ser = pd.Series(np.random.randint(1, 10, 7))