### 1. How to import pandas and check the version?

##### solution:

In [1]:
import pandas as pd
print(pd.__version__)

1.4.2


### 2. How to create a series from a list, numpy array and dict?

##### Input:

In [2]:
import numpy as np
mylist = list('abcedfghijklmnopqrstuvwxyz')
myarr = np.arange(26)
mydict = dict(zip(mylist, myarr))

##### solution:

In [3]:
l=pd.Series(mylist)
a=pd.Series(myarr)
d=pd.Series(mydict)

### 3. How to convert the index of a series into a column of a dataframe?

##### Input:

In [6]:
mylist = list('abcedfghijklmnopqrstuvwxyz')
myarr = np.arange(26)
mydict = dict(zip(mylist, myarr))
ser = pd.Series(mydict)

##### solution:

In [10]:
df=ser.to_frame().reset_index()

### 4. How to combine many series to form a dataframe?

##### Input:

In [12]:
import numpy as np
ser1 = pd.Series(list('abcedfghijklmnopqrstuvwxyz'))
ser2 = pd.Series(np.arange(26))

##### solution:

In [15]:
# Concatenate column-wise so each series becomes one column
df=pd.concat([ser1, ser2], axis=1)

### 5. How to assign name to the series’ index?

##### Input:

In [24]:
#Give a name to the series ser calling it ‘alphabets’.

ser = pd.Series(list('abcedfghijklmnopqrstuvwxyz'))

##### solution:

In [25]:
ser=ser.rename_axis('alphabets')

### 6. How to get the items of series A not present in series B?

##### Input:

In [27]:
ser1 = pd.Series([1, 2, 3, 4, 5])
ser2 = pd.Series([4, 5, 6, 7, 8])

##### solution:

In [33]:
ser1[~ser1.isin(ser2)]

0    1
1    2
2    3
dtype: int64

### 7. How to get the items not common to both series A and series B?

##### Input:

In [55]:
ser1 = pd.Series([1, 2, 3, 4, 5])
ser2 = pd.Series([4, 5, 6, 7, 8])

##### solution:

In [56]:
u=pd.Series(np.union1d(ser1, ser2))
i=pd.Series(np.intersect1d(ser1, ser2))
u[~u.isin(i)]

0    1
1    2
2    3
5    6
6    7
7    8
dtype: int64

### 8. How to get the minimum, 25th percentile, median, 75th, and max of a numeric series?

##### Input:

In [52]:
ser = pd.Series(np.random.normal(10, 5, 25))

##### solution:

In [53]:
np.percentile(ser, q=[0, 25, 50, 75, 100])

array([-8.68069443e-03,  7.83023492e+00,  9.42703204e+00,  1.29712770e+01,
        1.91578363e+01])

### 9. How to get frequency counts of unique items of a series?

##### Input:

In [45]:
ser = pd.Series(np.take(list('abcdefgh'), np.random.randint(8, size=30)))

##### solution:

In [49]:
ser.value_counts()

e    8
d    5
f    4
b    4
g    3
a    3
c    2
h    1
dtype: int64

### 10. How to keep only the top 2 most frequent values as they are and replace everything else with ‘Other’?

##### Input:

In [66]:
# Seeded generator so the random draw is reproducible
rng = np.random.RandomState(100)
ser = pd.Series(rng.randint(1, 5, [12]))

##### solution:

In [67]:
ser[~ser.isin(ser.value_counts().index[:2])] = 'Other'

### 11. How to bin a numeric series to 10 groups of equal size?

##### Input:

In [71]:
ser = pd.Series(np.random.random(20))

##### solution:

In [72]:
pd.qcut(ser, q=[0, .10, .20, .3, .4, .5, .6, .7, .8, .9, 1], 
        labels=['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th'])

0      3rd
1      7th
2     10th
3      4th
4      3rd
5     10th
6      7th
7      5th
8      2nd
9      9th
10     6th
11     9th
12     8th
13     2nd
14     1st
15     8th
16     5th
17     4th
18     1st
19     6th
dtype: category
Categories (10, object): ['1st' < '2nd' < '3rd' < '4th' ... '7th' < '8th' < '9th' < '10th']

### 12. How to convert a numpy array to a dataframe of given shape?

##### Input:

In [77]:
#Reshape the series ser into a dataframe with 7 rows and 5 columns

ser = pd.Series(np.random.randint(1, 10, 35))

##### solution:

In [78]:
df=pd.DataFrame(ser.values.reshape(7,5))
print(df)

   0  1  2  3  4
0  4  5  4  5  6
1  2  8  1  6  3
2  1  8  6  6  9
3  9  4  6  3  8
4  9  6  5  1  3
5  9  3  6  6  4
6  4  7  7  8  3


### 13. How to find the positions of numbers that are multiples of 3 from a series?

##### Input:

In [84]:
ser = pd.Series(np.random.randint(1, 10, 7))

##### solution:

In [86]:
np.where(ser % 3 == 0)

(array([1, 3, 5, 6]),)

### 14. How to extract items at given positions from a series?

##### Input:

In [69]:
ser = pd.Series(list('abcdefghijklmnopqrstuvwxyz'))
pos = [0, 4, 8, 14, 20]

##### solution:

In [70]:
# iloc selects by position, regardless of the index labels
ser.iloc[pos]

0     a
4     e
8     i
14    o
20    u
dtype: object

### 15. How to stack two series vertically and horizontally?

##### Input:

##### solution:

In [87]:
#vertical (ser1 and ser2 are reused from exercise 7)
pd.concat([ser1, ser2])

#horizontal
pd.concat([ser1, ser2], axis=1)


   0  1
0  1  4
1  2  5
2  3  6
3  4  7
4  5  8


### 16. How to get the positions of items of series A in another series B?

##### Input:

In [88]:
ser1 = pd.Series([10, 9, 6, 5, 3, 1, 12, 8, 13])
ser2 = pd.Series([1, 3, 10, 13])

##### solution:

In [91]:
[pd.Index(ser1).get_loc(i) for i in ser2]

[5, 4, 0, 8]

### 17. How to compute the mean squared error on a truth and predicted series?

##### Input:

In [92]:
truth = pd.Series(range(10))
pred = pd.Series(range(10)) + np.random.random(10)

##### solution:

In [95]:
print('mse:',np.mean((truth-pred)**2))

mse: 0.31832376151503505


### 18. How to convert the first character of each element in a series to uppercase?

##### Input:

In [96]:
ser = pd.Series(['how', 'to', 'kick', 'ass?'])

##### solution:

In [97]:
ser.str.capitalize()

0     How
1      To
2    Kick
3    Ass?
dtype: object

### 19. How to calculate the number of characters in each word in a series?

##### Input:

In [98]:
ser = pd.Series(['how', 'to', 'kick', 'ass?'])

##### solution:

In [99]:
ser.str.len()

0    3
1    2
2    4
3    4
dtype: int64

### 20. How to compute difference of differences between consecutive numbers of a series?

##### Input:

In [100]:
ser = pd.Series([1, 3, 6, 10, 15, 21, 27, 35])

##### solution:

In [101]:
ser.diff().diff().tolist()

[nan, nan, 1.0, 1.0, 1.0, 1.0, 0.0, 2.0]

### 21. How to convert a series of date-strings to a timeseries?

##### Input:

In [102]:
ser = pd.Series(['01 Jan 2010', '02-02-2011', '20120303', '2013/04/04', '2014-05-05', '2015-06-06T12:20'])

##### solution:

In [103]:
pd.to_datetime(ser)

0   2010-01-01 00:00:00
1   2011-02-02 00:00:00
2   2012-03-03 00:00:00
3   2013-04-04 00:00:00
4   2014-05-05 00:00:00
5   2015-06-06 12:20:00
dtype: datetime64[ns]

### 22. How to get the day of month, week number, day of year and day of week from a series of date strings?

##### Input:

In [125]:
ser = pd.Series(['01 Jan 2010', '02-02-2011', '20120303', '2013/04/04', '2014-05-05', '2015-06-06T12:20'])

##### solution:

In [126]:
from dateutil.parser import parse

ser=ser.map(lambda x: parse(x))

# day of the month
print('Date: ', ser.dt.day.tolist())

# week number
print('Week number: ', ser.dt.isocalendar().week.tolist())

# day of year
print('Day number of year: ', ser.dt.dayofyear.tolist())

# day of week
print('Day of week: ', ser.dt.day_name().tolist())

Date:  [1, 2, 3, 4, 5, 6]
Week number:  [53, 5, 9, 14, 19, 23]
Day number of year:  [1, 33, 63, 94, 125, 157]
Day of week:  ['Friday', 'Wednesday', 'Saturday', 'Thursday', 'Monday', 'Saturday']



### 23. How to convert year-month string to dates corresponding to the 4th day of the month?

##### Input:

##### solution:
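
The input for this exercise is missing from the notebook, so the sketch below assumes a hypothetical series of `YYYY-MM` strings. Parsing with an explicit format gives the first of each month, and adding three days lands on the 4th.

```python
import pandas as pd

# Hypothetical input (assumed, not from the original notebook)
ser = pd.Series(['2010-01', '2011-02', '2012-03'])

# An explicit format parses each string to the 1st of the month;
# adding 3 days moves every date to the 4th.
dates = pd.to_datetime(ser, format='%Y-%m') + pd.Timedelta(days=3)
print(dates)
```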

### 24. How to filter words that contain at least 2 vowels from a series?

##### Input:

##### solution:
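
One possible approach, shown on a made-up word list since no input is given: count vowel matches per word with `str.count` and keep the words with a count of 2 or more.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
ser = pd.Series(['Apple', 'Orange', 'Plan', 'Python', 'Money'])

# Lowercase first so the character class covers both cases,
# then count how many vowels each word contains.
vowel_counts = ser.str.lower().str.count(r'[aeiou]')
result = ser[vowel_counts >= 2]
print(result)
```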

### 25. How to filter valid emails from a series?

##### Input:

##### solution:
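
A minimal sketch on an assumed input. The regular expression is a deliberately simplified pattern chosen for illustration; it is not a full validator and will reject some legitimate addresses.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
emails = pd.Series(['buying books at amazom.com', 'rameses@egypt.com',
                    'matt@t.co', 'narendra@modi.com'])

# Simplified pattern: local part, '@', domain, and a 2-4 letter suffix
pattern = r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}'

# fullmatch requires the whole string to match the pattern
valid = emails[emails.str.fullmatch(pattern)]
print(valid)
```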

### 26. How to get the mean of a series grouped by another series?

##### Input:

##### solution:
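
A sketch with two assumed series of equal length: `Series.groupby` accepts another series as the grouping key and aligns the two on their index.

```python
import pandas as pd

# Hypothetical inputs (assumed for illustration)
fruit = pd.Series(['apple', 'banana', 'apple', 'carrot', 'banana', 'apple'])
weights = pd.Series([1.0, 2.0, 3.0, 4.0, 6.0, 5.0])

# Group the numeric series by the labels held in the other series
means = weights.groupby(fruit).mean()
print(means)
```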

### 27. How to compute the euclidean distance between two series?

##### Input:

##### solution:
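
Assuming two numeric series of the same length (the values here are invented), the Euclidean distance is the square root of the sum of squared element-wise differences.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs (assumed for illustration)
p = pd.Series(range(1, 11))      # 1, 2, ..., 10
q = pd.Series(range(10, 0, -1))  # 10, 9, ..., 1

# Square root of the summed squared differences
dist = np.sqrt(((p - q) ** 2).sum())
print(dist)
```

`np.linalg.norm(p - q)` should give the same value.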

### 28. How to find all the local maxima (or peaks) in a numeric series?

##### Input:

##### solution:
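
One way to sketch this, on an assumed input: treat a value as a peak when it is strictly greater than both neighbours. Under this definition the two end points are never peaks, because the comparison against a missing neighbour is false.

```python
import numpy as np
import pandas as pd

# Hypothetical input (assumed for illustration)
ser = pd.Series([2, 10, 3, 4, 9, 10, 2, 7, 3])

# Compare each value with the previous and the next one
is_peak = (ser > ser.shift(1)) & (ser > ser.shift(-1))

# Positions (not labels) of the peaks
peak_positions = np.flatnonzero(is_peak.to_numpy())
print(peak_positions)
```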

### 29. How to replace missing spaces in a string with the least frequent character?

##### Input:

##### solution:

### 30. How to create a time series of 10 weekends (Saturdays) starting ‘2000-01-01’, with random numbers as values?

##### solution:
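
A possible sketch: `pd.date_range` with the weekly frequency anchored on Saturday (`'W-SAT'`) generates the dates, and 2000-01-01 is itself a Saturday. The seed and the range of the random integers are arbitrary choices made for this example.

```python
import numpy as np
import pandas as pd

# Ten consecutive Saturdays, starting on 2000-01-01
idx = pd.date_range('2000-01-01', periods=10, freq='W-SAT')

# Random integer values; seed and range are arbitrary (assumed)
rng = np.random.default_rng(0)
ser = pd.Series(rng.integers(1, 10, size=10), index=idx)
print(ser)
```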

### 31. How to fill an intermittent time series so all missing dates show up with values of previous non-missing date?

##### Input:

##### solution:
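
A sketch on an assumed series with gaps between dates: `asfreq('D', method='ffill')` reindexes to daily frequency and carries the previous observation forward into each new date.

```python
import pandas as pd

# Hypothetical input with missing calendar days (assumed for illustration)
ser = pd.Series(
    [1, 10, 3, 7],
    index=pd.to_datetime(['2000-01-01', '2000-01-03', '2000-01-06', '2000-01-08']),
)

# Expand to one row per day, filling new dates from the previous known date
filled = ser.asfreq('D', method='ffill')
print(filled)
```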

### 32. How to compute the autocorrelations of a numeric series?

##### Input:

##### solution:

### 33. How to import only every nth row from a csv file to create a dataframe?

##### Input:

##### solution:
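
No file is referenced in the notebook, so this sketch reads from an in-memory CSV built for the example. `read_csv` accepts a callable for `skiprows`; it receives the line number and returns `True` for lines to skip. The step `n` is an assumed value.

```python
import io
import pandas as pd

# Hypothetical CSV content: a header plus ten data rows (assumed)
csv_text = "x,y\n" + "\n".join(f"{i},{i * 10}" for i in range(10))

n = 3  # keep every 3rd data row (assumed)

# Line 0 is the header and is always kept; data rows start at line 1.
df = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=lambda i: i > 0 and (i - 1) % n != 0,
)
print(df)
```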

### 34. How to change column values when importing csv to a dataframe?

##### Input:

##### solution:

### 35. How to create a dataframe with rows as strides from a given series?

##### Input:

##### solution:

### 36. How to import only specified columns from a csv file?

##### Input:

##### solution:
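
A short sketch using an in-memory CSV, since the notebook names no file: the `usecols` argument of `read_csv` restricts parsing to the listed columns.

```python
import io
import pandas as pd

# Hypothetical CSV content (assumed for illustration)
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# Only columns 'a' and 'c' are read; 'b' is never loaded
df = pd.read_csv(io.StringIO(csv_text), usecols=['a', 'c'])
print(df)
```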

### 37. How to get the nrows, ncolumns, datatype, summary stats of each column of a dataframe? Also get the array and list equivalent.

##### Input:

##### solution:

### 38. How to extract the row and column number of a particular cell with given criterion?

##### Input:

##### solution:

### 39. How to rename specific columns in a dataframe?

##### Input:

##### solution:
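
A sketch on an assumed dataframe: `DataFrame.rename` takes a mapping from old to new names and leaves unmentioned columns alone. It returns a new dataframe rather than modifying the original.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'a': [1], 'b': [2], 'c': [3]})

# Only the columns listed in the mapping are renamed
renamed = df.rename(columns={'a': 'alpha', 'c': 'gamma'})
print(renamed.columns.tolist())
```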

### 40. How to check if a dataframe has any missing values?

##### Input:

##### solution:
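
A minimal sketch on an assumed dataframe: `isna()` produces a boolean mask, and `any()` over the underlying array reduces it to a single flag.

```python
import numpy as np
import pandas as pd

# Hypothetical input with one missing value (assumed for illustration)
df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': ['x', 'y', 'z']})

# True if at least one cell anywhere in the dataframe is missing
has_missing = df.isna().to_numpy().any()
print(has_missing)
```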

### 41. How to count the number of missing values in each column?

##### Input:

##### solution:
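
A sketch on an assumed dataframe: summing the boolean mask from `isna()` column by column counts the missing entries in each column.

```python
import numpy as np
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({
    'a': [1.0, np.nan, np.nan],
    'b': ['x', None, 'z'],
    'c': [1, 2, 3],
})

# True counts as 1, so the column sums are the missing-value counts
missing_per_column = df.isna().sum()
print(missing_per_column)
```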

### 42. How to replace missing values of multiple numeric columns with the mean?

##### Input:

##### solution:
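
One way this could look, on an assumed dataframe where the numeric columns to fill are listed by hand: `fillna` accepts a series of per-column fill values, here the column means.

```python
import numpy as np
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({
    'a': [1.0, np.nan, 3.0],
    'b': [np.nan, 4.0, 8.0],
    'c': ['x', None, 'z'],
})

num_cols = ['a', 'b']  # numeric columns to fill (assumed)

# Each column is filled with its own mean; column 'c' is left untouched
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())
print(df)
```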

### 43. How to use apply function on existing columns with global variables as additional arguments?

##### Input:

##### solution:

### 44. How to select a specific column from a dataframe as a dataframe instead of a series?

##### Input:

##### solution:
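
A sketch on an assumed dataframe: indexing with a single label returns a series, while indexing with a one-element list returns a one-column dataframe.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

as_series = df['a']    # single label  -> Series
as_frame = df[['a']]   # list of labels -> DataFrame with one column
print(type(as_series).__name__, type(as_frame).__name__)
```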

### 45. How to change the order of columns of a dataframe?

##### Input:

##### solution:

### 46. How to set the number of rows and columns displayed in the output?

##### Input:

##### solution:

### 47. How to format or suppress scientific notations in a pandas dataframe?

##### Input:

##### solution:

### 48. How to format all the values in a dataframe as percentages?

##### Input:

##### solution:

### 49. How to filter every nth row in a dataframe?

##### Input:

##### solution:
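
A sketch on an assumed dataframe, taking every 5th row starting from the first: positional slicing with a step via `iloc` does the filtering.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'x': range(20)})

n = 5  # step size (assumed)

# Rows at positions 0, n, 2n, ...
every_nth = df.iloc[::n]
print(every_nth)
```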

### 50. How to create a primary key index by combining relevant columns?

##### Input:

##### solution:

### 51. How to get the row number of the nth largest value in a column?

##### Input:

##### solution:

### 52. How to find the position of the nth largest value greater than a given value?

##### Input:

##### solution:

### 53. How to get the last n rows of a dataframe with row sum > 100?

##### Input:

##### solution:

### 54. How to find and cap outliers from a series or dataframe column?

##### Input:

##### solution:

### 55. How to reshape a dataframe to the largest possible square after removing the negative values?

##### Input:

##### solution:

### 56. How to swap two rows of a dataframe?

##### Input:

##### solution:
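
One possible sketch, assuming the rows are identified by position: build an array of row positions, swap the two entries, and reorder the dataframe with `iloc`. The positions 0 and 2 are arbitrary.

```python
import numpy as np
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['w', 'x', 'y', 'z']})

i, j = 0, 2  # positions of the rows to swap (assumed)

# Swap the two positions in the ordering, then reorder the rows
order = np.arange(len(df))
order[[i, j]] = order[[j, i]]
swapped = df.iloc[order].reset_index(drop=True)
print(swapped)
```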

### 57. How to reverse the rows of a dataframe?

##### Input:

##### solution:
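
A sketch on an assumed dataframe: a positional slice with step -1 reverses the row order. Resetting the index afterwards is optional and is included here only to get a fresh 0..n-1 index.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'a': [1, 2, 3], 'b': ['x', 'y', 'z']})

# Reverse the rows, then renumber the index
reversed_df = df.iloc[::-1].reset_index(drop=True)
print(reversed_df)
```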

### 58. How to create one-hot encodings of a categorical variable (dummy variables)?

##### Input:

##### solution:
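
A sketch on an assumed categorical column: `pd.get_dummies` creates one indicator column per distinct value. The element type of the result (boolean or integer) depends on the pandas version, so the example casts to `int` for a uniform display.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'color': ['red', 'green', 'blue', 'green']})

# One indicator column per category, cast to 0/1 integers
dummies = pd.get_dummies(df['color']).astype(int)
print(dummies)
```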

### 59. Which column contains the highest number of row-wise maximum values?

##### Input:

##### solution:

### 60. How to create a new column that contains the row number of nearest column by euclidean distance?

##### Input:

##### solution:

### 61. How to know the maximum possible correlation value of each column against other columns?

##### Input:

##### solution:

### 62. How to create a column containing the minimum by maximum of each row?

##### Input:

##### solution:

### 63. How to create a column that contains the penultimate value in each row?

##### Input:

##### solution:

### 64. How to normalize all columns in a dataframe?

##### Input:

##### solution:
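
"Normalize" can mean several things; the sketch below shows two common readings on an assumed numeric dataframe: z-score standardization (mean 0, standard deviation 1) and min-max scaling to the range [0, 1]. Both rely on pandas aligning the per-column statistics with the columns.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0], 'b': [10.0, 20.0, 40.0, 80.0]})

# Z-score: subtract the column mean, divide by the column standard deviation
standardized = (df - df.mean()) / df.std()

# Min-max: rescale each column so its minimum is 0 and its maximum is 1
minmax = (df - df.min()) / (df.max() - df.min())
print(standardized)
print(minmax)
```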

### 65. How to compute the correlation of each row with the succeeding row?

##### Input:

##### solution:

### 66. How to replace both the diagonals of dataframe with 0?

##### Input:

##### solution:
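
A sketch assuming a square numeric dataframe: `np.fill_diagonal` zeroes the main diagonal in place, and applying it to the `np.fliplr` view of the same array zeroes the anti-diagonal.

```python
import numpy as np
import pandas as pd

# Hypothetical square input holding 1..16 (assumed for illustration)
df = pd.DataFrame(np.arange(1, 17).reshape(4, 4))

# Work on a copy of the values so the original dataframe is unchanged
arr = df.to_numpy().copy()
np.fill_diagonal(arr, 0)             # main diagonal
np.fill_diagonal(np.fliplr(arr), 0)  # anti-diagonal, via a flipped view

result = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(result)
```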

### 67. How to get the particular group of a groupby dataframe by key?

##### Input:

##### solution:
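
A sketch on an assumed dataframe: `get_group` on a groupby object returns the rows belonging to one key.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'fruit': ['apple', 'banana', 'apple'], 'price': [1, 2, 3]})

# Rows whose 'fruit' value is 'apple'
apples = df.groupby('fruit').get_group('apple')
print(apples)
```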

### 68. How to get the n’th largest value of a column when grouped by another column?

##### Input:

##### solution:

### 69. How to compute grouped mean on pandas dataframe and keep the grouped column as another column (not index)?

##### Input:

##### solution:
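
A sketch on an assumed dataframe: passing `as_index=False` to `groupby` keeps the grouping key as a regular column in the result instead of moving it to the index.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({
    'fruit': ['apple', 'banana', 'apple', 'banana'],
    'price': [1.0, 2.0, 3.0, 6.0],
})

# 'fruit' stays a column because of as_index=False
out = df.groupby('fruit', as_index=False)['price'].mean()
print(out)
```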

### 70. How to join two dataframes by 2 columns so they have only the common rows?

##### Input:

##### solution:
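
A sketch with two assumed dataframes that share the key columns `fruit` and `weight`: an inner merge on both keys keeps only the rows whose key pair appears in both inputs.

```python
import pandas as pd

# Hypothetical inputs (assumed for illustration)
left = pd.DataFrame({
    'fruit': ['apple', 'banana', 'orange'],
    'weight': ['high', 'medium', 'low'],
    'price': [10, 20, 30],
})
right = pd.DataFrame({
    'fruit': ['apple', 'banana', 'melon'],
    'weight': ['high', 'low', 'high'],
    'cost': [1, 2, 3],
})

# Only ('apple', 'high') is present in both dataframes
merged = pd.merge(left, right, how='inner', on=['fruit', 'weight'])
print(merged)
```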

### 71. How to remove rows from a dataframe that are present in another dataframe?

##### Input:

##### solution:

### 72. How to get the positions where values of two columns match?

##### Input:

##### solution:

### 73. How to create lags and leads of a column in a dataframe?

##### Input:

##### solution:
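
A sketch on an assumed column: `shift(1)` gives the previous row's value (a lag) and `shift(-1)` gives the next row's value (a lead). The rows with no neighbour receive `NaN`.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'a': [1, 2, 3, 4]})

df['a_lag1'] = df['a'].shift(1)    # value from the previous row
df['a_lead1'] = df['a'].shift(-1)  # value from the next row
print(df)
```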

### 74. How to get the frequency of unique values in the entire dataframe?

##### Input:

##### solution:

### 75. How to split a text column into two separate columns?

##### Input:

##### solution:
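
A sketch on an assumed text column whose two parts are separated by a comma and a space: `str.split` with `expand=True` returns one column per part. The column names are invented for this example.

```python
import pandas as pd

# Hypothetical input (assumed for illustration)
df = pd.DataFrame({'name_city': ['Ann, Oslo', 'Bob, Lima', 'Cy, Rome']})

# expand=True returns a dataframe with one column per split part
parts = df['name_city'].str.split(', ', expand=True)
df['name'] = parts[0]
df['city'] = parts[1]
print(df)
```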