### 11. How to bin a numeric series into 10 groups of equal size?
- Bin the series ser into deciles (10 equal-sized groups) and replace the values with the bin name.

#### pd.qcut: qcut splits the data into bins based on sample quantiles.

In [5]:
import pandas as pd
import numpy as np

# Input

ser = pd.Series(np.random.random(20))
print(ser.head())

0    0.998526
1    0.326848
2    0.693504
3    0.471361
4    0.161573
dtype: float64


In [6]:
# Solution

pd.qcut(ser, q=[0, .10, .20, .3, .4, .5, .6, .7, .8, .9, 1], 
        labels=['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th']).head()

0    10th
1     4th
2     7th
3     6th
4     1st
dtype: category
Categories (10, object): ['1st' < '2nd' < '3rd' < '4th' ... '7th' < '8th' < '9th' < '10th']
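
Because the ten quantile edges are evenly spaced, the same binning can be sketched with an integer `q`. The seeded generator below is an assumption added for reproducibility and is not part of the original exercise.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption, not in the original).
rng = np.random.default_rng(0)
ser = pd.Series(rng.random(20))

# q=10 asks for ten equal-frequency bins; labels name the bins from lowest to highest.
labels = ['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th']
deciles = pd.qcut(ser, q=10, labels=labels)

# With 20 distinct values, each decile should receive exactly two of them.
counts = deciles.value_counts()
print(counts.tolist())
```

Passing the explicit list of edges, as in the solution, is still the way to go when the bins should not be evenly spaced.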

### 12. How to convert a numpy array to a dataframe of given shape? (L1)
- Reshape the series ser into a dataframe with 7 rows and 5 columns

In [7]:
# Input

ser = pd.Series(np.random.randint(1,10,35))

In [8]:
# Solution

df = pd.DataFrame(ser.values.reshape(7,5))
print(df)

   0  1  2  3  4
0  3  3  2  8  6
1  5  2  8  8  4
2  2  7  3  8  2
3  7  7  7  8  5
4  9  7  3  4  8
5  7  1  1  5  2
6  9  6  4  6  7
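
As a small variation, one dimension can be passed as `-1` so that NumPy infers it from the length of the data. The sketch below uses `np.arange` instead of random integers (an assumption made here so the result is predictable).

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random input: the integers 0..34.
ser = pd.Series(np.arange(35))

# -1 tells NumPy to work out the column count: 35 values / 7 rows = 5 columns.
df = pd.DataFrame(ser.to_numpy().reshape(7, -1))

print(df.shape)  # (7, 5)
```

The reshape fills row by row, so the second row starts with the sixth value of the series.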


### 13. How to find the positions of numbers that are multiples of 3 from a series?
- Find the positions of numbers that are multiples of 3 from ser.

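#### numpy.argwhere(a)
- Finds the indices of array elements that are non-zero, grouped by element.
- The solution below uses the closely related np.where, which returns a tuple of index arrays (one per dimension).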

In [18]:
# Input
import pandas as pd
import numpy as np

ser = pd.Series(np.random.randint(1, 10, 7))

In [20]:
# Solution
print(ser)
np.where(ser % 3 == 0)

0    9
1    2
2    4
3    7
4    8
5    8
6    7
dtype: int32


(array([0], dtype=int64),)
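
The heading above names `np.argwhere`, while the solution calls `np.where`. As a minimal sketch with fixed values (chosen here for illustration, not taken from the original run), the two return the same positions in different shapes:

```python
import numpy as np
import pandas as pd

# Fixed values chosen for illustration so the result is reproducible.
ser = pd.Series([9, 2, 4, 6, 8, 3, 7])
mask = (ser % 3 == 0).to_numpy()

# np.where returns a tuple with one index array per dimension.
positions_where = np.where(mask)[0]

# np.argwhere returns one row per match, so flatten it for a 1-D input.
positions_argwhere = np.argwhere(mask).flatten()

print(positions_where.tolist())     # [0, 3, 5]
print(positions_argwhere.tolist())  # [0, 3, 5]
```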

### 14. How to extract items at given positions from a series?
- From ser, extract the items at positions in list pos.

In [21]:
# Input

ser = pd.Series(list('abcdefghijklmnopqrstuvwxyz'))
pos = [0, 4, 8, 14, 20]

In [22]:
# Solution

ser.take(pos)

0     a
4     e
8     i
14    o
20    u
dtype: object

### 15. How to stack two series vertically and horizontally?
- Stack ser1 and ser2 vertically and horizontally (to form a dataframe).

In [23]:
# Input

ser1 = pd.Series(range(5))
ser2 = pd.Series(list('abcde'))

In [37]:
# Solution

# Vertical (Series.append was removed in pandas 2.0; pd.concat gives the same result)
df1 = pd.concat([ser1, ser2], ignore_index=True)
print(df1)

# Horizontal
df2 = pd.concat([ser1, ser2], axis=1)
print(df2)



0    0
1    1
2    2
3    3
4    4
5    a
6    b
7    c
8    d
9    e
dtype: object
   0  1
0  0  a
1  1  b
2  2  c
3  3  d
4  4  e


### 16. How to get the positions of items of series A in another series B?
- Get the positions of items of ser2 in ser1 as a list.

In [38]:
# Input
ser1 = pd.Series([10, 9, 6, 5, 3, 1, 12, 8, 13])
ser2 = pd.Series([1, 3, 10, 13])

In [47]:
# Solution 1

[np.where(i == ser1)[0].tolist()[0] for i in ser2]

# Solution 2

[pd.Index(ser1).get_loc(i) for i in ser2]

[5, 4, 0, 8]
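
Both solutions loop over `ser2` in Python. A vectorised sketch, assuming the values of `ser1` are unique (which `Index.get_indexer` requires), looks up all items in one call:

```python
import pandas as pd

ser1 = pd.Series([10, 9, 6, 5, 3, 1, 12, 8, 13])
ser2 = pd.Series([1, 3, 10, 13])

# get_indexer finds the position of every target value at once
# and reports -1 for values that do not occur in ser1.
positions = pd.Index(ser1).get_indexer(ser2.to_numpy()).tolist()

print(positions)  # [5, 4, 0, 8]
```

Unlike `get_loc`, which raises `KeyError` for a missing item, this variant marks missing items with `-1`.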

### 17. How to compute the mean squared error on a truth and predicted series?
- Compute the mean squared error of truth and pred series.

In [48]:
# Input

truth = pd.Series(range(10))
pred = pd.Series(range(10)) + np.random.random(10)

In [49]:
# Solution

np.mean((truth-pred)**2)

0.28849008079413146
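
Because `pred` is random, the value above changes on every run. A hand-checkable sketch with fixed numbers (chosen here for illustration) shows the same formula step by step:

```python
import pandas as pd

# Small fixed inputs so the arithmetic can be verified by hand.
truth = pd.Series([0.0, 1.0, 2.0, 3.0])
pred = pd.Series([0.5, 1.0, 2.5, 2.0])

# Errors are -0.5, 0, -0.5, 1; their squares are 0.25, 0, 0.25, 1.
squared_errors = (truth - pred) ** 2

# Mean of the squares: (0.25 + 0 + 0.25 + 1) / 4 = 0.375
mse = squared_errors.mean()

print(mse)  # 0.375
```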

### 18. How to convert the first character of each element in a series to uppercase?
- Change the first character of each word in ser to upper case.

In [52]:
# Input

ser = pd.Series(['how', 'to', 'kick', 'ass?'])

In [56]:
# Solution 1
ser.map(lambda x: x.title())

# Solution 2
ser.map(lambda x: x[0].upper() + x[1:])

# Solution 3
pd.Series([i.title() for i in ser])

0     How
1      To
2    Kick
3    Ass?
dtype: object
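
The solutions agree here because every element is a single word. As a sketch with a multi-word element (the sample strings are made up for illustration), the vectorised `.str` accessors show where `title` and `capitalize` differ:

```python
import pandas as pd

# Sample strings chosen for illustration; the last one has two words.
ser = pd.Series(['how', 'to', 'new york'])

# str.title upper-cases the first letter of every word in each element.
titled = ser.str.title()

# str.capitalize upper-cases only the first character of each element.
capitalized = ser.str.capitalize()

print(titled.tolist())       # ['How', 'To', 'New York']
print(capitalized.tolist())  # ['How', 'To', 'New york']
```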

### 19. How to calculate the number of characters in each word in a series?

In [57]:
# Input

ser = pd.Series(['how', 'to', 'kick', 'ass?'])

In [58]:
# Solution

ser.map(lambda x: len(x))

0    3
1    2
2    4
3    4
dtype: int64

### 20. How to compute the difference of differences between consecutive numbers of a series?
- Compute the difference of differences between the consecutive numbers of ser.

In [59]:
# Input

ser = pd.Series([1, 3, 6, 10, 15, 21, 27, 35])

In [64]:
print(ser.diff().tolist())
print(ser.diff().diff().tolist())

[nan, 2.0, 3.0, 4.0, 5.0, 6.0, 6.0, 8.0]
[nan, nan, 1.0, 1.0, 1.0, 1.0, 0.0, 2.0]
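
When the leading `nan` placeholders are not needed, the second-order difference can also be sketched with NumPy's `np.diff` and its `n` argument, which returns only the computed values:

```python
import numpy as np
import pandas as pd

ser = pd.Series([1, 3, 6, 10, 15, 21, 27, 35])

# n=2 applies the difference twice; the result is two elements shorter than the input.
second_diff = np.diff(ser.to_numpy(), n=2)

print(second_diff.tolist())  # [1, 1, 1, 1, 0, 2]
```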


### 21. How to convert a series of date-strings to a timeseries?

In [None]:
# Input

ser = pd.Series(['01 Jan 2010', '02-02-2011', '20120303', '2013/04/04', '2014-05-05', '2015-06-06T12:20'])

In [66]:
# Solution

# The strings use different formats: pandas >= 2.0 needs format='mixed' to parse each one separately
# (on older versions, plain pd.to_datetime(ser) infers the format per element)
pd.to_datetime(ser, format='mixed')

0   2010-01-01 00:00:00
1   2011-02-02 00:00:00
2   2012-03-03 00:00:00
3   2013-04-04 00:00:00
4   2014-05-05 00:00:00
5   2015-06-06 12:20:00
dtype: datetime64[ns]

### 22. How to get the day of month, week number, day of year and day of week from a series of date strings?
- Get the day of month, week number, day of year and day of week from ser.

In [68]:
# Input
ser = pd.Series(['01 Jan 2010', '02-02-2011', '20120303', '2013/04/04', '2014-05-05', '2015-06-06T12:20'])


In [71]:
# Solution
from dateutil.parser import parse
ser_ts = ser.map(lambda x: parse(x))

print(ser_ts)

0   2010-01-01 00:00:00
1   2011-02-02 00:00:00
2   2012-03-03 00:00:00
3   2013-04-04 00:00:00
4   2014-05-05 00:00:00
5   2015-06-06 12:20:00
dtype: datetime64[ns]


In [73]:
# day of month
print("Date: ",ser_ts.dt.day.tolist())

Date:  [1, 2, 3, 4, 5, 6]


In [78]:
# week number (dt.weekofyear is deprecated and was removed in pandas 2.0; use dt.isocalendar().week)
print("Week number: ", ser_ts.dt.isocalendar().week.tolist())

Week number:  [53, 5, 9, 14, 19, 23]



In [79]:
# day of year
print("Day number of year", ser_ts.dt.dayofyear.tolist())

Day number of year [1, 33, 63, 94, 125, 157]


In [89]:
# day of week (dt.day_name() needs no extra import)
print("Day of week: ", ser_ts.dt.day_name().tolist())

Day of week:  ['Friday', 'Wednesday', 'Saturday', 'Thursday', 'Monday', 'Saturday']
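
The four fields can also be collected in one pass. The sketch below assumes pandas 1.1 or later (for `dt.isocalendar()`) and rewrites the dates in ISO format so that parsing is unambiguous; it also adds the numeric day of week, where Monday is 0.

```python
import pandas as pd

# Same six dates as above, written in ISO format (a simplification for this sketch).
ser_ts = pd.Series(pd.to_datetime([
    '2010-01-01', '2011-02-02', '2012-03-03',
    '2013-04-04', '2014-05-05', '2015-06-06',
]))

day_of_month = ser_ts.dt.day.tolist()
week_number = ser_ts.dt.isocalendar().week.tolist()  # ISO 8601 week number
day_of_year = ser_ts.dt.dayofyear.tolist()
day_of_week = ser_ts.dt.dayofweek.tolist()           # Monday=0 ... Sunday=6

print(day_of_month)  # [1, 2, 3, 4, 5, 6]
print(week_number)   # [53, 5, 9, 14, 19, 23]
print(day_of_year)   # [1, 33, 63, 94, 125, 157]
print(day_of_week)   # [4, 2, 5, 3, 0, 5]
```

Week 53 for 1 January 2010 is not a mistake: under ISO 8601 that Friday still belongs to the last week of 2009.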
